# Extracting current Tweets

## 1. Create options file
Before you can filter for relevant Tweets, you first need to set a few 'option' arguments. These arguments are:
1) **query_params**: How do you want to filter the Tweets? <br>
2) **output_dir**: In what folder do you want to save the extracted Tweets? <br>
3) **output_file_name**: What name do you want to give the output file? <br>
4) **output_csv**: Do you want to save the output file as .csv or as .json (default)? <br>

Define these arguments inside a dictionary and save it as a .json file. To run the script, you only need to provide the path to this 'options' file as an input argument.
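As a minimal sketch, the dictionary can be written to disk with Python's standard `json` module. The helper name `write_options` and the file name `options.json` are illustrative assumptions; the values mirror the example in this section.

```python
import json


def write_options(path, options):
    """Save the options dictionary as a .json file (hypothetical helper)."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(options, f, indent=4)


options = {
    "query_params": {
        "query": "#EurovisionAgain lang:en has:geo -is:retweet",
        "max_results": 100,
        "end_time": "",
        "expansions": "geo.place_id",
        "tweet.fields": "id,author_id,created_at,text,entities,geo",
    },
    "output_dir": "output",
    "output_file_name": "extracted_tweets",
    "output_csv": "False",
}

# Write the file that is later passed to the script
write_options("options.json", options)
```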


In [7]:
https://developer.twitter.com/en/products/twitter-api/academic-research/application-info


# example of options.json
{"query_params": 
     {"query": "#EurovisionAgain lang:en has:geo -is:retweet",
         "max_results": 100,
         "end_time": "",
         "expansions": "geo.place_id",
         "tweet.fields": "id,author_id,created_at,text,entities,geo"
         },
     "output_dir": "output",
     "output_file_name": "extracted_tweets",
     "output_csv": "False"
 }

{'query_params': {'query': '#EurovisionAgain lang:en has:geo -is:retweet',
  'max_results': 100,
  'end_time': '',
  'expansions': 'geo.place_id',
  'tweet.fields': 'id,author_id,created_at,text,entities,geo'},
 'output_dir': 'output',
 'output_file_name': 'extracted_tweets',
 'output_csv': 'False'}

*Note: if an argument is left empty (such as 'end_time'), this argument will not be included in the search filter.*
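The effect of this note can be sketched as a small filter over the query parameters. This is an illustration only; `drop_empty_params` is a hypothetical helper and not necessarily how the script implements it.

```python
def drop_empty_params(query_params):
    """Return a copy of query_params without empty values (hypothetical helper)."""
    return {key: value for key, value in query_params.items()
            if value not in ("", None)}


query_params = {
    "query": "#EurovisionAgain lang:en has:geo -is:retweet",
    "max_results": 100,
    "end_time": "",  # left empty, so it is dropped from the search filter
    "expansions": "geo.place_id",
    "tweet.fields": "id,author_id,created_at,text,entities,geo",
}

search_params = drop_empty_params(query_params)
print(sorted(search_params))
```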

### 1.1. Query parameters
`query_params`

There is an extensive list of arguments you can use to fine-tune how you want to filter the recent Tweets. Some basic options are already listed in the `options.json` file, such as:
* **query**: this is the most important argument, as it specifies what the Tweets will be filtered on (in this case: containing the hashtag '#EurovisionAgain', being written in English, having a geo-location, and not being a retweet)
* **max_results**: how many tweets do you want to extract?
* **end_time**: the newest UTC timestamp up to which Tweets will be provided (format: `YYYY-MM-DDTHH:mm:ssZ`)
* **expansions**: requests additional data objects that relate to the originally returned Tweets (in this case the geo-location of the tweet)
* **tweet.fields**: which specific Tweet fields will be delivered in each returned Tweet object <br>

For extra options have a look at: https://developer.twitter.com/en/docs/twitter-api/tweets/search/api-reference/get-tweets-search-recent
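For `end_time`, a timestamp in the required `YYYY-MM-DDTHH:mm:ssZ` format can be produced with the standard `datetime` module. This is a sketch; `to_api_timestamp` is a hypothetical helper, and it assumes a timezone-aware datetime as input.

```python
from datetime import datetime, timezone


def to_api_timestamp(moment):
    """Format a timezone-aware datetime as YYYY-MM-DDTHH:mm:ssZ in UTC."""
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


end_time = to_api_timestamp(datetime(2021, 5, 23, 21, 18, 18, tzinfo=timezone.utc))
print(end_time)  # 2021-05-23T21:18:18Z
```

The resulting string can be used as the value of `end_time` in `query_params`.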

### 1.2. Output directory
`output_dir`

This is the path to the folder where the output file containing all extracted Tweets will be saved.

For example: in the folder 'output'

### 1.3. Output file name
`output_file_name`

This is the name given to the output file containing all extracted Tweets.

For example: 'extracted_tweets'

### 1.4. Output file type
`output_csv`

When this option is "True", the output file will be saved as a .csv file in the output directory. If this option is "False" (or left empty), the output file will be saved as a .json file.

For example: with the options above, the file will be saved as `output/extracted_tweets.json`
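How the three output options combine into a file path can be sketched as follows. `build_output_path` is a hypothetical helper, and treating only the string "True" as a request for .csv is an assumption based on the description above, not the script's confirmed logic.

```python
import os


def build_output_path(output_dir, output_file_name, output_csv=""):
    """Combine the output options into a file path (hypothetical helper)."""
    # Assumption: only the value "True" selects .csv; anything else means .json
    extension = ".csv" if str(output_csv) == "True" else ".json"
    return os.path.join(output_dir, output_file_name + extension)


print(build_output_path("output", "extracted_tweets", "False"))
```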

## 2. Set Bearer Token
Before you can connect to the Twitter server, you first need to save your Bearer Token as an environment variable.
To do this, open your terminal and run the following command (depending on your operating system):

* **Mac**: `export BEARER_TOKEN="<insert_bearer_token_here>"`
* **Windows**: `SET BEARER_TOKEN=<insert_bearer_token_here>`

If this doesn't work, you can also run the following code inside this notebook/your Python console:

In [1]:
import os
os.environ['BEARER_TOKEN']='<insert_bearer_token_here>'
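To check that the token is actually visible to Python before running the script, a small sketch like the one below can help. `bearer_headers` is a hypothetical helper; it builds the `Authorization: Bearer <token>` header used for bearer-token authentication, and whether the script builds its header in exactly this way is an assumption.

```python
import os


def bearer_headers(env_var="BEARER_TOKEN"):
    """Read the token from the environment and build the request header."""
    token = os.environ.get(env_var)
    if not token:
        raise RuntimeError(f"{env_var} is not set; set it before running the script.")
    return {"Authorization": f"Bearer {token}"}


# Placeholder so the sketch runs on its own; an existing value is left untouched
os.environ.setdefault("BEARER_TOKEN", "<insert_bearer_token_here>")
headers = bearer_headers()
print(list(headers))  # ['Authorization']
```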

## 3. Run script

In [2]:
from current_search import console

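To run the script from your terminal, use the following line: <br> `python current_search.py -o options.json`

Alternatively, call `console` directly from this notebook/your Python console: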

In [3]:
console('options.json')

{'data': [{'entities': {'annotations': [{'start': 7,
      'end': 12,
      'probability': 0.9294,
      'type': 'Person',
      'normalized_text': 'Carola'},
     {'start': 37,
      'end': 42,
      'probability': 0.9475,
      'type': 'Place',
      'normalized_text': 'Sweden'}],
    'urls': [{'start': 192,
      'end': 215,
      'url': 'https://t.co/4EwuOiWtiI',
      'expanded_url': 'https://twitter.com/jrawson/status/1244015578953760768',
      'display_url': 'twitter.com/jrawson/status…'}],
    'hashtags': [{'start': 57, 'end': 68, 'tag': 'eurovision'},
     {'start': 167, 'end': 183, 'tag': 'EurovisionAgain'},
     {'start': 184, 'end': 191, 'tag': 'carola'}]},
   'text': 'Seeing Carola presenting points from Sweden yesterday at #eurovision reminded me what a badass she is 😂 I fucking love this woman ♥ "Let me replant these daisies" 🌼 😂 #EurovisionAgain #carola https://t.co/4EwuOiWtiI',
   'author_id': '4839655199',
   'geo': {'place_id': '26b0db32cfda0432'},
   'id': '1396576

In [4]:
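import pandas as pd

# Run the search and load the returned Tweets into a DataFrame
output = console('options.json')
df = pd.DataFrame(output['data'])

# Map each place id to its full name; fall back to an empty mapping if 'includes' is missing
places = {place['id']: place['full_name']
          for place in output.get('includes', {}).get('places', [])}

# Add the matching place name to every Tweet, or an empty string when none is found
df['Location'] = ""
if 'geo' in df.columns:
    df['Location'] = [places.get(geo.get('place_id'), "") if isinstance(geo, dict) else ""
                      for geo in df['geo']]
df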

| | text | id | geo | created_at | entities | author_id | Location |
|---|---|---|---|---|---|---|---|
| 0 | Seeing Carola presenting points from Sweden ye... | 1396576246537105415 | {'place_id': '26b0db32cfda0432'} | 2021-05-23T21:18:18.000Z | {'hashtags': [{'start': 57, 'end': 68, 'tag': ... | 4839655199 | Walthamstow, London |
| 1 | One of the things that stands out on #Eurovisi... | 1396237735061594113 | {'place_id': '7f15dd80ac78ef40'} | 2021-05-22T22:53:11.000Z | {'hashtags': [{'start': 37, 'end': 53, 'tag': ... | 1260279476870676485 | Bristol, England |
| 2 | Somehow I've never had the opportunity to watc... | 1396210354661871618 | {'place_id': '2a0101ab07454619'} | 2021-05-22T21:04:23.000Z | {'hashtags': [{'start': 48, 'end': 59, 'tag': ... | 170280254 | Nuneaton, England |
| 3 | If Italy win we know what happens. We have a c... | 1396208286991589377 | {'place_id': '3ddb10d62383799a'} | 2021-05-22T20:56:10.000Z | {'hashtags': [{'start': 73, 'end': 80, 'tag': ... | 14341032 | Bedford, England |
| 4 | Looking forward to watching #Eurovision can’t ... | 1396180419259809799 | {'place_id': '1c9ddd0025efc17b'} | 2021-05-22T19:05:26.000Z | {'hashtags': [{'start': 28, 'end': 39, 'tag': ... | 802120822290792448 | Prestwood, South East |
| 5 | It's not #EurovisionAgain it's #eurovision2021... | 1395475083662086148 | {'place_id': '7ae9e2f2ff7a87cd'} | 2021-05-20T20:22:41.000Z | {'hashtags': [{'start': 9, 'end': 25, 'tag': '... | 92016047 | Edinburgh, Scotland |
| 6 | On this day in 2019 #DuncanLaurence writes #hi... | 1395076845851316224 | {'place_id': '03ee62e07f760312'} | 2021-05-19T18:00:13.000Z | {'hashtags': [{'start': 20, 'end': 35, 'tag': ... | 1005708111741628419 | Noordwijkerhout, Nederland |
| 7 | A lot of acts getting their tits out to gain a... | 1394754351273582596 | {'place_id': '3cdad59a91d99400'} | 2021-05-18T20:38:45.000Z | {'hashtags': [{'start': 57, 'end': 72, 'tag': ... | 442708855 | Galway, Ireland |
| 8 | Vote Ireland!!!!!!! And russia too #Eurovision... | 1394739332532084742 | {'place_id': '7807880b17af73d8'} | 2021-05-18T19:39:04.000Z | {'hashtags': [{'start': 35, 'end': 46, 'tag': ... | 162157097 | Guadalajara, Jalisco |
| 9 | #ESC2021 #Eurovision #EurovisionAgain \n\nLET’... | 1394732337477296128 | {'place_id': '0073b76548e5984f'} | 2021-05-18T19:11:16.000Z | {'hashtags': [{'start': 0, 'end': 8, 'tag': 'E... | 75472028 | Sydney, New South Wales |
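The `entities` column holds nested dictionaries rather than plain values. As a sketch of how to read them, the hypothetical helper `hashtag_tags` below pulls the hashtag texts out of one such dictionary; the sample value is copied from the first Tweet in the output of section 3, with the annotations and urls left out.

```python
def hashtag_tags(entities):
    """Return the hashtag texts from one Tweet's 'entities' dictionary."""
    return [hashtag["tag"] for hashtag in entities.get("hashtags", [])]


# 'entities' of the first Tweet in the output (annotations and urls omitted)
entities = {
    "hashtags": [
        {"start": 57, "end": 68, "tag": "eurovision"},
        {"start": 167, "end": 183, "tag": "EurovisionAgain"},
        {"start": 184, "end": 191, "tag": "carola"},
    ]
}

print(hashtag_tags(entities))  # ['eurovision', 'EurovisionAgain', 'carola']
```

On the DataFrame this could be applied per row, for example with `df['entities'].apply(hashtag_tags)`, assuming every row actually contains an `entities` dictionary.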


In [5]:
output['meta']

{'newest_id': '1396576246537105415',
 'oldest_id': '1394609586242396160',
 'result_count': 16}
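The `meta` block can also be used to continue a search further back in time. The sketch below copies the query parameters and adds `until_id`, which, per the API reference linked in section 1.1, restricts results to Tweets with an ID lower than (that is, older than) the given one. `older_page_params` is a hypothetical helper and not part of the script.

```python
def older_page_params(query_params, meta):
    """Copy query_params and add until_id so a follow-up search returns only older Tweets."""
    params = dict(query_params)
    # Only add until_id when the previous search actually returned Tweets
    if meta.get("result_count", 0) > 0:
        params["until_id"] = meta["oldest_id"]
    return params


meta = {
    "newest_id": "1396576246537105415",
    "oldest_id": "1394609586242396160",
    "result_count": 16,
}
query_params = {"query": "#EurovisionAgain lang:en has:geo -is:retweet", "max_results": 100}

next_params = older_page_params(query_params, meta)
print(next_params["until_id"])  # 1394609586242396160
```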