# CPS Workshop: Twitter Scraping
This notebook was written by Vicky Lin and is meant to be used in conjunction with the CPS workshop on June 5, 2020. Social media data collection and analysis are still fairly new and lack a streamlined process, which can make them difficult and time-consuming. For this reason, this notebook uses a third-party Python library to help with the process. This notebook draws from John Simpson's [Introduction to Twitter Scraping for Researchers](https://github.com/ualberta-rcg/twitter_scraping). Special thanks to Lisa Strohschein, John Simpson, Victoria Romanik, and the University of Alberta Department of Sociology.

On the GitHub page for this workshop, there is a Resources folder with extra resources and information on Python and Twitter. If you encounter further trouble after this workshop, this folder may be a good place to start.

For this workshop, you will need:
1. A functioning Python environment on a system where you are authorized to install software, or a Google account that you can use with Google Colab.
2. A Twitter Developer account.
3. A Twitter-created app.
4. A MongoDB instance set up and appropriately configured.

The Python library used in this notebook is called [TwitterAPI](https://github.com/geduldig/TwitterAPI) (no spaces).

## Installation and Authentication
Let's start by installing the TwitterAPI library.

```python
!pip install TwitterAPI
```

Once the TwitterAPI library is installed, we can import it for use throughout the notebook with the following command:

```python
from TwitterAPI import TwitterAPI
```

The rest of this notebook assumes that the two code cells above have run successfully. Skipping them will likely cause errors later on. If you encounter errors, try running these two cells again.
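
If you are unsure whether the installation succeeded, the following sketch checks whether Python can locate the library without importing it. The `is_installed` helper is a hypothetical name introduced here for illustration; it relies only on the standard library.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module can be located, without importing it."""
    return importlib.util.find_spec(module_name) is not None


if is_installed("TwitterAPI"):
    print("TwitterAPI was found; the import cell should work.")
else:
    print("TwitterAPI was not found; try re-running the pip install cell.")
```

In Google Colab, installed packages do not persist between sessions, so a check like this can be useful when you reopen the notebook.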

### A Note on Authentication
Using the Twitter Developer API with this notebook requires a Twitter Developer account. The account has to be requested, and approval may take a few days. Applying requires a regular Twitter account and can be done here: https://developer.twitter.com/en/apply/user

There are two types of authentication that you may want to take advantage of. Here are the main differences:

You will need oAuth1 (User authentication) for the following:
- Post Tweets or other resources;
- Connect to Streaming endpoints;
- Search for users;
- Use any geo endpoints;
- Access Direct Messages or account credentials;
- Retrieve users' email addresses.

You can use oAuth2 (Application-only authentication) for the following:
- Pull user timelines;
- Access friends and followers of any account;
- Access lists resources;
- Search in Tweets;
- Retrieve any user information, excluding the user's email address.
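
As a quick reference, the two lists above can be encoded as a small lookup. This is only a sketch: the short task labels and the `needs_user_auth` helper are hypothetical names introduced here, not part of the TwitterAPI library or of Twitter's documentation.

```python
# Hypothetical task labels summarizing the two lists above.
USER_AUTH_TASKS = {
    "post_tweets",
    "streaming",
    "search_users",
    "geo",
    "direct_messages",
    "user_email",
}
APP_AUTH_TASKS = {
    "user_timelines",
    "friends_followers",
    "lists",
    "search_tweets",
    "user_info",
}


def needs_user_auth(task):
    """Return True if the task requires oAuth1, False if oAuth2 is enough."""
    if task in USER_AUTH_TASKS:
        return True
    if task in APP_AUTH_TASKS:
        return False
    raise ValueError(f"Unknown task label: {task!r}")


print(needs_user_auth("streaming"))      # True
print(needs_user_auth("search_tweets"))  # False
```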

You can use either of these in this notebook, but be aware of which one you are using and what it can do. Both authentication methods require key and token information pasted into the appropriate section of the cell below. This information is generated when you create a profile for an app on the Twitter Developer site.

Paste the required key and token information from the Twitter Developer site into the cell below and run it. You need to run that cell to load your credentials, and then only _one_ of the authorization methods that follow.

```python
API_KEY = ''
API_KEY_SECRET = ''
ACCESS_TOKEN = ''
ACCESS_TOKEN_SECRET = ''
```
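
If you would rather not paste secrets directly into a notebook that may be shared, one alternative is to read them from environment variables. The following is a minimal sketch under stated assumptions: the `TWITTER_`-prefixed variable names and the `load_credentials` helper are hypothetical choices made here, not something the TwitterAPI library requires.

```python
import os

CREDENTIAL_NAMES = [
    "API_KEY",
    "API_KEY_SECRET",
    "ACCESS_TOKEN",
    "ACCESS_TOKEN_SECRET",
]


def load_credentials(env=os.environ):
    """Read credentials from a mapping and report which ones are missing.

    Assumes each value is stored under a "TWITTER_" prefix,
    for example TWITTER_API_KEY (a hypothetical naming convention).
    """
    creds = {name: env.get("TWITTER_" + name, "") for name in CREDENTIAL_NAMES}
    missing = [name for name, value in creds.items() if not value]
    return creds, missing


creds, missing = load_credentials()
if missing:
    print("Missing credentials:", ", ".join(missing))
else:
    API_KEY = creds["API_KEY"]
    API_KEY_SECRET = creds["API_KEY_SECRET"]
    ACCESS_TOKEN = creds["ACCESS_TOKEN"]
    ACCESS_TOKEN_SECRET = creds["ACCESS_TOKEN_SECRET"]
```

Note that oAuth2 only needs `API_KEY` and `API_KEY_SECRET`, so a report of missing access tokens matters only if you plan to use oAuth1.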

### oAuth1 (User Identification)

```python
api = TwitterAPI(API_KEY,
                 API_KEY_SECRET,
                 ACCESS_TOKEN,
                 ACCESS_TOKEN_SECRET)
api.auth
```

If successful, the output should look something like this:
`<requests_oauthlib.oauth1_auth.OAuth1 at 0x107b8bba8>`

### oAuth2 (App Identification)

```python
api = TwitterAPI(API_KEY,
                 API_KEY_SECRET,
                 auth_type='oAuth2')
api.auth
```

If successful, the output should look something like this:
`<TwitterAPI.BearerAuth.BearerAuth at 0x107b9acc0>`
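
Because both methods assign to the same `api` variable, it can be handy to confirm later which one is active. The sketch below infers this from the class name of `api.auth`, based only on the two example outputs shown in this notebook (`OAuth1` and `BearerAuth`); the `auth_kind` helper is a hypothetical name, and other library versions may use different class names.

```python
def auth_kind(api):
    """Describe the active authentication type from the class of api.auth."""
    name = type(api.auth).__name__
    if name == "OAuth1":
        return "oAuth1 (user authentication)"
    if name == "BearerAuth":
        return "oAuth2 (application-only authentication)"
    return "unknown (" + name + ")"


# Run after one of the authentication cells above:
# print(auth_kind(api))
```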