# **Python APIs**

## **Basic APIs**
No authentication required. No inputs requried.
1. https://api.publicapis.org/entries -> Get a list of any or all public APIs currently cataloged in the project.
2. https://cat-fact.herokuapp.com/facts -> Get random cat facts via text message every day.
3. https://api.coindesk.com/v1/bpi/currentprice.json -> View the Bitcoin Price Index (BPI) in real-time.
4. https://www.boredapi.com/api/activity -> Bored is a free API to find something to do by getting suggestions for random activities.
5. https://dog.ceo/api/breeds/image/random -> Cheer yourself up with random dog images.
6. https://ipinfo.io/161.185.160.93/geo -> Get information about a specified IP address, such as geological info, company, and carrier name.
7. https://official-joke-api.appspot.com/random_joke -> Get random jokes. You can also get jokes according to type (e.g., programming jokes only).
8. https://randomuser.me/api/ -> Get information about a random fake user, including gender, name, email, address, etc.
9. https://api.zippopotam.us/us/33162 -> Get information about a specified ZIP code.
10. http://api.open-notify.org -> Get information about the international space station

In [3]:
import requests

# '.get' refers to the type of request: GET, POST, or many others but those two are most common
response = requests.get("https://api.publicapis.org/entries") 
print(response.status_code)

200


**API Status Codes**
Status codes are returned with every request that is made to a web server. Status codes indicate information about what happened with a request. Here are some codes that are relevant to GET requests:

* 200: Everything went okay, and the result has been returned (if any).
* 301: The server is redirecting you to a different endpoint. This can happen when a company switches domain names, or an endpoint name is changed.
* 400: The server thinks you made a bad request. This can happen when you don’t send along the right data, among other things.
* 401: The server thinks you’re not authenticated. Many APIs require login ccredentials, so this happens when you don’t send the right credentials to access an API.
* 403: The resource you’re trying to access is forbidden: you don’t have the right permissions to see it.
* 404: The resource you tried to access wasn’t found on the server.
* 503: The server is not ready to handle the request.

In [4]:
response.text

'{"count":940,"entries":[{"API":"Axolotl","Description":"Collection of axolotl pictures and facts","Auth":"","HTTPS":true,"Cors":"unknown","Link":"https://theaxolotlapi.netlify.app/","Category":"Animals"},{"API":"Cat Facts","Description":"Daily cat facts","Auth":"","HTTPS":true,"Cors":"no","Link":"https://alexwohlbruck.github.io/cat-facts/","Category":"Animals"},{"API":"Cataas","Description":"Cat as a service (cats pictures and gifs)","Auth":"","HTTPS":true,"Cors":"unknown","Link":"https://cataas.com/","Category":"Animals"},{"API":"catAPI","Description":"Random pictures of cats","Auth":"","HTTPS":true,"Cors":"yes","Link":"https://github.com/ThatCopy/catAPI/wiki/Usage","Category":"Animals"},{"API":"Cats","Description":"Pictures of cats from Tumblr","Auth":"apiKey","HTTPS":true,"Cors":"unknown","Link":"https://docs.thecatapi.com/","Category":"Animals"},{"API":"Dog Facts","Description":"Random dog facts","Auth":"","HTTPS":true,"Cors":"unknown","Link":"https://dukengn.github.io/Dog-facts-A

In [5]:
import json 
json_data = json.loads(response.text)
json_formatted = json.dumps(json_data, indent=2)
print(json_formatted)

{
  "count": 940,
  "entries": [
    {
      "API": "Axolotl",
      "Description": "Collection of axolotl pictures and facts",
      "Auth": "",
      "HTTPS": true,
      "Cors": "unknown",
      "Link": "https://theaxolotlapi.netlify.app/",
      "Category": "Animals"
    },
    {
      "API": "Cat Facts",
      "Description": "Daily cat facts",
      "Auth": "",
      "HTTPS": true,
      "Cors": "no",
      "Link": "https://alexwohlbruck.github.io/cat-facts/",
      "Category": "Animals"
    },
    {
      "API": "Cataas",
      "Description": "Cat as a service (cats pictures and gifs)",
      "Auth": "",
      "HTTPS": true,
      "Cors": "unknown",
      "Link": "https://cataas.com/",
      "Category": "Animals"
    },
    {
      "API": "catAPI",
      "Description": "Random pictures of cats",
      "Auth": "",
      "HTTPS": true,
      "Cors": "yes",
      "Link": "https://github.com/ThatCopy/catAPI/wiki/Usage",
      "Category": "Animals"
    },
    {
      "API": "Cats",
  

In [6]:
import pandas as pd
df = pd.DataFrame(columns=['API', 'Description', 'Auth', 'HTTPS', 'Cors', 'Link', 'Category'])
df.set_index('API', inplace=True)

print(f"Total results: {json_data['count']}")
for entry in json_data['entries']:
  df.loc[entry['API']] = [entry['Description'], entry['Auth'], entry['HTTPS'], entry['Cors'], entry['Link'], entry['Category']]

df.to_csv('free_apis.csv')
df.head()

Total results: 940


Unnamed: 0_level_0,Description,Auth,HTTPS,Cors,Link,Category
API,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1
Axolotl,Collection of axolotl pictures and facts,,True,unknown,https://theaxolotlapi.netlify.app/,Animals
Cat Facts,Daily cat facts,,True,no,https://alexwohlbruck.github.io/cat-facts/,Animals
Cataas,Cat as a service (cats pictures and gifs),,True,unknown,https://cataas.com/,Animals
catAPI,Random pictures of cats,,True,yes,https://github.com/ThatCopy/catAPI/wiki/Usage,Animals
Cats,Pictures of cats from Tumblr,apiKey,True,unknown,https://docs.thecatapi.com/,Animals


### Endpoints
A "REST Web Service" is one type of API. 
Web services can have one to many 'endpoints.' Each endpoint provides a different functionality that can be 'called' by the user of that particular API. 

The publicapis.org example above was based on the *entries* endpoint. Check a web service's documentation to see what endpoints are available: https://api.publicapis.org. Based on that documentation, publicapis.org also offers three other endpoints: random, categories, health

In [7]:
# This endpoint returns a randomly selected public api
response = requests.get("https://api.publicapis.org/random") 
print("random: ", json.loads(response.text))

# This endpoint returns all possible public api categories
response = requests.get("https://api.publicapis.org/categories") 
print("categories: ", json.loads(response.text))

# This endpoint returns the current health status of the publicapis.org API
response = requests.get("https://api.publicapis.org/health") 
print("health: ", json.loads(response.text))

random:  {'count': 1, 'entries': [{'API': 'Zoho Books', 'Description': 'Online accounting software, built for your business', 'Auth': 'OAuth', 'HTTPS': True, 'Cors': 'unknown', 'Link': 'https://www.zoho.com/books/api/v3/', 'Category': 'Finance'}]}
categories:  ['Animals', 'Anime', 'Anti-Malware', 'Art & Design', 'Authentication', 'Books', 'Business', 'Calendar', 'Cloud Storage & File Sharing', 'Continuous Integration', 'Cryptocurrency', 'Currency Exchange', 'Data Validation', 'Development', 'Dictionaries', 'Documents & Productivity', 'Environment', 'Events', 'Finance', 'Food & Drink', 'Games & Comics', 'Geocoding', 'Government', 'Health', 'Jobs', 'Machine Learning', 'Music', 'News', 'Open Data', 'Open Source Projects', 'Patent', 'Personality', 'Phone', 'Photography', 'Science & Math', 'Security', 'Shopping', 'Social', 'Sports & Fitness', 'Test Data', 'Text Analysis', 'Tracking', 'Transportation', 'URL Shorteners', 'Vehicle', 'Video', 'Weather']
health:  {'alive': True}


### APIs with querystring (a.k.a. parameter) inputs
No authentication required. Inputs are required
1. https://api.agify.io?name=meelad -> Predict the age of a person based on their name.
2. https://api.genderize.io?name=luc -> Predict the gender of a person based on their name.
3. https://api.nationalize.io?name=nathaniel -> Predict the nationality of a person based on their name.
4. https://datausa.io/api/data?drilldowns=Nation&measures=Population -> Get US public data (e.g., population data, etc.).
5. https://api.ipify.org?format=json -> IPify is a free API that allows you to get your current IP address.
6. http://universities.hipolabs.com/search?country=United+States -> Get a list of universities in a specified country.

In [8]:
# Predict the age of a person based on their name
response = requests.get("https://api.agify.io?name=homer")
print(response.text)

# Predict the gender of a person based on their name
response = requests.get("https://api.genderize.io?name=homer")
print(response.text)

# https://api.nationalize.io?name=nathaniel
response = requests.get("https://api.nationalize.io?name=homer")
print(response.text)

# https://datausa.io/api/data?drilldowns=Nation&measures=Population
response = requests.get("https://datausa.io/api/data?drilldowns=Nation&measures=Population")
print(response.text)

# https://api.ipify.org?format=json
response = requests.get("https://api.ipify.org?format=json")
print(response.text)

# http://universities.hipolabs.com/search?country=United+States
response = requests.get("http://universities.hipolabs.com/search?country=United+States")
print(response.text)

{"name":"homer","age":71,"count":2043}
{"name":"homer","gender":"male","probability":0.97,"count":2075}
{"name":"homer","country":[{"country_id":"PH","probability":0.423783697420402},{"country_id":"US","probability":0.15056758473883325},{"country_id":"IT","probability":0.1322296682437725}]}
{"data":[{"ID Nation":"01000US","Nation":"United States","ID Year":2019,"Year":"2019","Population":328239523,"Slug Nation":"united-states"},{"ID Nation":"01000US","Nation":"United States","ID Year":2018,"Year":"2018","Population":327167439,"Slug Nation":"united-states"},{"ID Nation":"01000US","Nation":"United States","ID Year":2017,"Year":"2017","Population":325719178,"Slug Nation":"united-states"},{"ID Nation":"01000US","Nation":"United States","ID Year":2016,"Year":"2016","Population":323127515,"Slug Nation":"united-states"},{"ID Nation":"01000US","Nation":"United States","ID Year":2015,"Year":"2015","Population":321418821,"Slug Nation":"united-states"},{"ID Nation":"01000US","Nation":"United Stat

### **Case: Data.gov**

In [9]:
# To get meta-data about the datasets available through api.data.gov: https://docs.ckan.org/en/latest/api/index.html
response = requests.get("http://demo.ckan.org/api/3/action/package_list")
json_data = json.loads(response.text)
clean = json.dumps(json_data, indent=2)
print(clean)
print(f"Number of datasets: {len(json_data['result'])}")

{
  "help": "https://demo.ckan.org/api/3/action/help_show?name=package_list",
  "success": true,
  "result": [
    "sample-dataset-1",
    "test_dataset"
  ]
}
Number of datasets: 2


Now let's examine the details of one of the datasets in the list above, extract the url for the CSV dataset, and read the CSV into a DataFrame

In [10]:
response = requests.get("http://demo.ckan.org/api/3/action/package_show?id=who-covid-cases")
json_data = json.loads(response.text)
clean = json.dumps(json_data, indent=2)

# 'Un'-comment these four lines to see how to drill down to the JSON value you need
# print(clean) # prints the formatted result
# print(json_data['result'])  # prints everything inside the 'result' key
# print(json_data['result']['resources']) # prints everything inside the 'resources' key
# print(json_data['result']['resources'][0])  # prints everything inside the first (and only) item in the list '[]'
print(json_data['result']['resources'][0]['url']) # prints the 'url' key inside that item

url = json_data['result']['resources'][0]['url']

df = pd.read_csv(url)
df.tail()

KeyError: 'result'

Not the COVID dataset you were looking for? Try the package_search endpoint:

In [None]:
response = requests.get("http://demo.ckan.org/api/3/action/package_search?q=covid")
json_data = json.loads(response.text)

# Print the number of results
print(f"Number of results: {len(json_data['result']['results'])}")

# Iterate through the results and identify and print the urls
for r in json_data['result']['results']:
  url = r['url']
  if url != None:
    if '.csv' in url:
      try:
        df = pd.read_csv(url)
        df.to_csv(f"{r['name']}.csv")
      except:
        print(url)
    else:
      print(url)

# df = pd.read_csv(url)
# df.tail()

Number of results: 10
https://datosabiertos.bogota.gov.co/api/3/action/datastore_search
https://coronavirusecuador.com/data/
https://github.com/owid/covid-19-data/tree/master/public/data

unknown

https://datosabiertos.bogota.gov.co/api/3/action/datastore_search_sql
https://fsodnastorage.blob.core.windows.net/curated-output-api/DIM_County/DIM_County.csv


## **Key-Based Authentication**
This allows the provider to secure the API/Web Service, but it places the responsibility of security on the rules for issuing keys.
Start by getting a key from https://api.data.gov/signup/. Next, select an API from the available list by department: https://api.data.gov/. We'll use the US Department of Commerce as an example below: https://www.commerce.gov/data-and-reports/developer-resources/commercegov-api#api_basics

### Querystring/Parameter Keys

In [None]:
import requests
import json

key = 'rtJMQxUmwPu9ex3lmQtxL25FbLtfWBCgvUajcRXs'
url = "https://api.commerce.gov/api/news?api_key=" + key
request = requests.get(url)
json_data = json.loads(request.text)
clean_data = json.dumps(json_data, indent=2)
print(f"Number of articles: {len(json_data['data'])}")
print(clean_data)

Number of articles: 50
{
  "jsonapi": {
    "version": "1.0",
    "meta": {
      "links": {
        "self": {
          "href": "http://jsonapi.org/format/1.0/"
        }
      }
    },
    "parsed": true
  },
  "data": [
    {
      "type": "news",
      "id": "48",
      "self": "https://www.commerce.gov/news/press-releases/2017/10/former-secretary-commerce-pritzkers-official-portrait-unveiled",
      "nid": 48,
      "label": "Former Secretary of Commerce Pritzker\u2019s Official Portrait Unveiled",
      "created": 1508360246,
      "updated": 1513969347,
      "href": "https://www.commerce.gov/news/press-releases/2017/10/former-secretary-commerce-pritzkers-official-portrait-unveiled",
      "body": "<p>Today, U.S. Secretary of Commerce Wilbur Ross attended the unveiling of former U.S. Secretary of Commerce Penny Pritzker's official portrait.</p>\n\n<p>\u201cI commend Secretary Pritzker for her dedication to public service.\u201d said Secretary of Commerce Wilbur Ross \u201cForgoi

### Header Keys (e.g. Oauth 2.0 Bearer) and Body Data


---


What is the difference between headers and querystring parameters?
* Headers carry meta-info about the request, parameters carry actual data.
* The non-alphanumeric characters in parameters are automatically un-escape/decoded (e.g. the space " " is converted to "%20"). Not so for headers
* Parameters can be seen by end-users in the UTL (querystring) but headers are hidden to end-users.


---


Example: Azure ML Studio API

In [None]:
# Use this example to call Azure ML Studio Web Service API with custom data

import requests
import json 

data= {
        "Inputs": 
          {
          "input1": 
            {
              "ColumnNames": ["Marital Status", "Gender", "Income", "Children", "Cars", "Age", "Education", "Occupation", "Home Owner", "Commute Distance", "Region"],
              "Values": [[
                          input("What is your marital status? (Married/Single)"), 
                          input("What is your gender? (Female/Male)"), 
                          input("What is your annual income? (e.g. 90000)"), 
                          input("How many children do you have? (enter an integer)"), 
                          input("How many cars do you have? (enter an integer)"), 
                          input("How old are you? (enter an integer)"), 
                          input("What is your level of education? (options: Partial High School, High School, Partial College, Bachelors, Graduate)"), 
                          input("What is your occupation? (options: Clerical, Management, Professional)"), 
                          input("Are you an home owner? (Yes/No)"), 
                          input("What is your commute distance? (options: 0-1 Miles, 1-2 Miles, 2-5 Miles, 5-10 Miles, 10+ Miles)"),
                          input("Where do you live? (options: North America, Pacific, Europe)")
                          ]]
            #  "Values": [[ "Married", "Male", "200000", "3", "3", "40", "Graduate", "Professional", "Yes", "5-10 Miles", "North America" ]]
            },
        },
        "GlobalParameters": 
          {
          }
      }

# Body data must be json-encoded
body = str.encode(json.dumps(data))

url = 'https://ussouthcentral.services.azureml.net/workspaces/ddab2c5151cf4c44ad08dc0a26a7aa17/services/f5bd9d43e84741a5b0536c25ebf6b288/execute?api-version=2.0&details=true'
key = 'j++I0waAENjH+TGR4oynHqUBlSMEwuiUoS0fQ2ZJFYbG73USnTQSsXkIU9kmfUSkjvC9OHU3LQXGMceG+vPdZA=='
headers = {'Content-Type':'application/json', 'Authorization':('Bearer '+ key)}
req = requests.post(url=url, data=body, headers=headers)
result = json.loads(req.text)
prediction = result['Results']['output1']['value']['Values'][0][0]
if prediction == 'Yes':
  formatted = f"\033[1m\033[92m{prediction}\033[0m"
else:
  formatted = f"\033[1m\033[91m{prediction}\033[0m"
print("Will this customer likely purchase a bike?: " + formatted)
print(json.dumps(result, indent=2))

Will this customer likely purchase a bike?: [1m[92mYes[0m
{
  "Results": {
    "output1": {
      "type": "table",
      "value": {
        "ColumnNames": [
          "Scored Labels"
        ],
        "ColumnTypes": [
          "String"
        ],
        "Values": [
          [
            "Yes"
          ]
        ]
      }
    }
  }
}


### Google Places API
https://developers.google.com/maps

In [None]:
# For this to work, you must have an account with billing enabled
key = "***************************************"

import requests

# text search: ~ $32 for the first 100,000 requests; 1000 results per request
url = f"https://maps.googleapis.com/maps/api/place/details/json?query=restaurants%20in%20Sydney&key={key}"
# basic data: free, but you need the place_id
url = f"https://maps.googleapis.com/maps/api/place/details/json?place_id=ChIJN1t_tDeuEmsRUsoyG83frY4&fields=name%2Crating%2Cformatted_phone_number&key={key}"

payload={}
headers={}

response = requests.request("GET", url, headers=headers, data=payload)

print(response.text)

{
   "html_attributions" : [],
   "result" : {
      "formatted_phone_number" : "(02) 9374 4000",
      "name" : "Google Workplace 6",
      "rating" : 4.1
   },
   "status" : "OK"
}



## **Twitter Demo**

### Generate your Twitter Developer Account and Keys

---
First, sign up for a Twitter Developer Account: https://developer.twitter.com/en/apply-for-access. Here is a video tutorial: https://www.youtube.com/watch?v=vlvtqp44xoQ. At the very end, she shows you where to generate the API keys and tokens, but doesn't actually generate them in the video (for obvious reasons). The interface just recently changed, but the process is the same. Look for either a "key-looking" icon or a link to "Keys and Tokens." Generate these for things: 
1. consumer_key
2. consumer_secret
3. access_token_key
4. access_token_secret
5. Bearer token

---


Be sure to copy them to a safe place because, once generated and hidden, you will no longer be able to view them for security reasons. You will use them below.

* Twitter API Python examples: https://github.com/twitterdev/search-tweets-python
* Twitter API v2 Documentation: https://developer.twitter.com/en/docs/twitter-api/early-access

Endpoints for Reading Data
* **Tweet lookup**: Look up Tweets by ID.
* **User lookup**: Look up users by name or ID.
* **Recent search**: Query the most recent seven days of Tweets, and receive full-fidelity responses with this first release of our search Tweets functionality.
* **Timelines**: Returns the Tweets composed by, or mentioning, a specified Twitter account.
* **Filtered stream**: Filter the complete stream of real-time public Tweets.
* **Sampled stream**: Stream a sample of new Tweets as they are published, across ~1% of all public Tweets in real-time.
* **Hide replies**: Hides or unhides a reply to a Tweet.
* **Follows lookup**: Retrieve an account’s followers and who they are following using their user ID.

### Recent Search Basics

In [None]:
# This is the most basic Recent search request; only includes id and text

import requests
import os
import json

bearer_token = 'AAAAAAAAAAAAAAAAAAAAAKIqKQEAAAAArvCY9UR1YF2ItjDVfvxRj%2BacYLI%3Dobk23ATK7g32XQ7CQiotckLl5EkTIfqJQGvIWWSNOUNRxTN0Dc'
headers = {'Authorization':('Bearer '+ bearer_token)}

url = f'https://api.twitter.com/2/tweets/search/recent?query=covid%20autism&max_results=10'

response = requests.request("GET", url, headers=headers)
json_data = json.loads(response.text)
clean_data = json.dumps(json_data, indent=2, sort_keys=True)
print(clean_data)

{
  "data": [
    {
      "id": "1427651416437444614",
      "text": "RT @LmReynoldsteach: @Joyhenderson78 The insinuation that catching covid is better than having autism is really gross coming from fellow ed\u2026"
    },
    {
      "id": "1427648640303128600",
      "text": "@MastroSandi @angela_gauthie Very disturbing.  The ignorance re covid vaccines and Autism.  As Lady Gaga says \u201cyou were born this way\u201d.  All adults who work with children in any capacity should be vaccinated for covid 19. The choice is a new job."
    },
    {
      "id": "1427648411973607432",
      "text": "We take a look at studies on how COVID-19 and social isolation have affected autistic people, how camouflaging and age at diagnosis can shape first impressions, and reactions to our article on the controversial \u201ccost of autism\u201d paper https://t.co/OWbJrwRtYp"
    },
    {
      "id": "1427647228013842441",
      "text": "@JoeS619 @btysonmd @GeorgeFareed2 @orsomedico @PierreKory Unless th

### Recent Search: Expanded Results
There are many other data points we can get besides the tweet text and ID. We can get them by adding additional parameters into the querystring. Each line below includes the objects that we can collect data points about:

In [None]:
url = f'https://api.twitter.com/2/tweets/search/recent?query=covid%20autistic&max_results=10'

url += f'&user.fields=created_at,description,entities,id,location,name,profile_image_url,protected,public_metrics,url,username,verified,withheld'
url += f'&tweet.fields=attachments,author_id,context_annotations,conversation_id,created_at,entities,geo,id,in_reply_to_user_id,lang,possibly_sensitive,public_metrics,referenced_tweets,reply_settings,source,text,withheld'
url += f'&expansions=attachments.poll_ids,attachments.media_keys,author_id,geo.place_id,in_reply_to_user_id,referenced_tweets.id,entities.mentions.username,referenced_tweets.id.author_id'
url += f'&media.fields=duration_ms,height,media_key,preview_image_url,public_metrics,type,url,width'
url += f'&place.fields=contained_within,country,country_code,full_name,geo,id,name,place_type'
url += f'&poll.fields=duration_minutes,end_datetime,id,options,voting_status'

response = requests.request("GET", url, headers=headers)
json_data = json.loads(response.text)
clean_data = json.dumps(json_data, indent=2, sort_keys=True) 
print(clean_data)

{
  "data": [
    {
      "author_id": "1158722677932273665",
      "context_annotations": [
        {
          "domain": {
            "description": "Ongoing News Stories like 'Brexit'",
            "id": "123",
            "name": "Ongoing News Story"
          },
          "entity": {
            "id": "1220701888179359745",
            "name": "COVID-19"
          }
        }
      ],
      "conversation_id": "1427668193645056001",
      "created_at": "2021-08-17T16:26:36.000Z",
      "entities": {
        "urls": [
          {
            "description": "\"My husband then had to physically restrain my son with two members of staff to try to get him to have a test. Callum tried to be brave but he was petrified.\"",
            "display_url": "elizabethjohnston.org/airline-apolog\u2026",
            "end": 130,
            "expanded_url": "https://elizabethjohnston.org/airline-apologizes-after-petrified-autistic-child-forced-to-take-covid-19-test-despite-medical-exemption/",
     

Notice that the results are now broken up into three main categories instead of two:


1.   Data
2.   Results
3.   Meta

Let's print them out:

In [None]:
for item in dict(json_data).items():
  print(item[0])

data
includes
meta


Now let's print each of the sub-categories within each:

In [None]:
for item in dict(json_data).items():
  print(item[0])
  for sub_item in item[1]:
    print(f'  {type(sub_item).__name__}') # Using .__name__ after type() eliminates the tags. e.g. <class 'int'> to int

data
  dict
  dict
  dict
  dict
  dict
  dict
  dict
  dict
  dict
  dict
includes
  str
  str
meta
  str
  str
  str
  str


Let's take a closer look at the strings in this structure

In [None]:
for item in dict(json_data).items():
  print(item[0])
  for sub_item in item[1]:
    if type(sub_item) == dict:
      print(f'  {sub_item.keys()}') # if it is a dictionary (i.e. tweet), print the first 50 characters
    else:
      print(f'  {sub_item}')

data
  dict_keys(['conversation_id', 'author_id', 'public_metrics', 'lang', 'reply_settings', 'context_annotations', 'text', 'possibly_sensitive', 'entities', 'id', 'source', 'created_at'])
  dict_keys(['conversation_id', 'author_id', 'public_metrics', 'lang', 'reply_settings', 'context_annotations', 'in_reply_to_user_id', 'text', 'possibly_sensitive', 'entities', 'id', 'source', 'referenced_tweets', 'created_at'])
  dict_keys(['conversation_id', 'author_id', 'public_metrics', 'lang', 'reply_settings', 'text', 'possibly_sensitive', 'entities', 'id', 'source', 'referenced_tweets', 'created_at'])
  dict_keys(['conversation_id', 'author_id', 'public_metrics', 'lang', 'reply_settings', 'context_annotations', 'text', 'possibly_sensitive', 'entities', 'id', 'source', 'referenced_tweets', 'created_at'])
  dict_keys(['conversation_id', 'author_id', 'public_metrics', 'lang', 'reply_settings', 'context_annotations', 'in_reply_to_user_id', 'text', 'possibly_sensitive', 'entities', 'id', 'source',

The "data" object includes the dictionaries representing each tweet. Notice that some start with the key 'referenced_tweets' whereas the others start with 'id'--indicating the differnce between original tweets (those that start with 'id') and retweets or reply tweets. 

The "includes" object seems to include objects that are related to tweets, but not necessarily in a 1-to-1 relationship. For example, a tweet can refer to many users: the one who tweeted it and others mentioned in the tweet. This is nice because it would allow us to extract each of the data points more easily. But let's keep this simple by extracting. 

The "meta" object is the the same as before.

### Extracting the Information

Let's keep this example simple by extracting the tweet ID, text, retweet count, likes, and any media objects associated with the tweet. To do this, we will just iterate through the 'data' object and leave 'includes' alone (although you would want to scrape this into separate DataFrames and .csv files in practice).

In [None]:
for tweet in json_data['data']:
  print(f"id: \t\t{tweet['id']}")
  print(f"text: \t\t{tweet['text']}")
  print(f"retweets: \t{tweet['public_metrics']['retweet_count']}")
  print(f"likes: \t\t{tweet['public_metrics']['like_count']}")
  print()

id: 		1427668193645056001
text: 		Airline Apologizes After “Petrified” Autistic Child Forced to Take COVID-19 Test Despite Medical Exemption
https://t.co/xxQcr952vA
retweets: 	0
likes: 		1

id: 		1427664714159443970
text: 		@sam_autistic @Renee1up @covid19nz @jacindaardern Coming from the man who thinks covid deaths should be ignored because people die on a daily basis.
retweets: 	0
likes: 		0

id: 		1427662057088892936
text: 		RT @porkironandwine: I received 50 usd today through donations, I was wondering if I can ask for mutual aid as I, my two disabled parents a…
retweets: 	234
likes: 		0

id: 		1427661696244531204
text: 		RT @RoArquette: I’ve received a note from a women who has long term COVID and has been living in her car with her children one whom is auti…
retweets: 	636
likes: 		0

id: 		1427659291553505280
text: 		@sam_autistic @Renee1up @covid19nz @jacindaardern All those people will continue to die. Especially the ones that lacked healthcare. A covid death is a covid death.

In [None]:
for tweet in json_data['data']:
  print(f"id: \t\t{tweet['id']}")
  print(f"text: \t\t{tweet['text']}")
  print(f"retweets: \t{tweet['public_metrics']['retweet_count']}")
  print(f"likes: \t\t{tweet['public_metrics']['like_count']}")
  
  # Use an if statement so an error is not thrown for tweets without media
  media_key = ""  # Reset to empty each time through the loop so that we can use it for a condition later
  if 'attachments' in tweet:
    if 'media_keys' in tweet['attachments']:
      media_key = tweet['attachments']['media_keys'][0]

  if media_key != "":
    # If there is a media key in this tweet, iterate through tweet['includes']['media'] until we find it
    for media in json_data['includes']['media']:
      if media['media_key'] == media_key: # Only if the media_key matches the one we stored
        if media['type'] == 'photo':      # Only if it is a photo; ignore videos
          image_url = media['url']
          print(f"photo: \t\t{image_url}")

  print() # print blank line to separate results


id: 		1427668193645056001
text: 		Airline Apologizes After “Petrified” Autistic Child Forced to Take COVID-19 Test Despite Medical Exemption
https://t.co/xxQcr952vA
retweets: 	0
likes: 		1

id: 		1427664714159443970
text: 		@sam_autistic @Renee1up @covid19nz @jacindaardern Coming from the man who thinks covid deaths should be ignored because people die on a daily basis.
retweets: 	0
likes: 		0

id: 		1427662057088892936
text: 		RT @porkironandwine: I received 50 usd today through donations, I was wondering if I can ask for mutual aid as I, my two disabled parents a…
retweets: 	234
likes: 		0

id: 		1427661696244531204
text: 		RT @RoArquette: I’ve received a note from a women who has long term COVID and has been living in her car with her children one whom is auti…
retweets: 	636
likes: 		0

id: 		1427659291553505280
text: 		@sam_autistic @Renee1up @covid19nz @jacindaardern All those people will continue to die. Especially the ones that lacked healthcare. A covid death is a covid death.

### Iterating through and Storing Results

In [None]:
import requests
import pandas as pd
import json

bearer_token = 'AAAAAAAAAAAAAAAAAAAAAKIqKQEAAAAArvCY9UR1YF2ItjDVfvxRj%2BacYLI%3Dobk23ATK7g32XQ7CQiotckLl5EkTIfqJQGvIWWSNOUNRxTN0Dc'
headers = {'Authorization':('Bearer '+ bearer_token)}

url = f'https://api.twitter.com/2/tweets/search/recent?query=covid%20autistic&max_results=10'
url += f'&user.fields=created_at,description,entities,id,location,name,profile_image_url,protected,public_metrics,url,username,verified,withheld'
url += f'&tweet.fields=attachments,author_id,context_annotations,conversation_id,created_at,entities,geo,id,in_reply_to_user_id,lang,possibly_sensitive,public_metrics,referenced_tweets,reply_settings,source,text,withheld'
url += f'&expansions=attachments.poll_ids,attachments.media_keys,author_id,geo.place_id,in_reply_to_user_id,referenced_tweets.id,entities.mentions.username,referenced_tweets.id.author_id'
url += f'&media.fields=duration_ms,height,media_key,preview_image_url,public_metrics,type,url,width'
url += f'&place.fields=contained_within,country,country_code,full_name,geo,id,name,place_type'
url += f'&poll.fields=duration_minutes,end_datetime,id,options,voting_status'

response = requests.request("GET", url, headers=headers)
json_data = json.loads(response.text)

# Create the empty DataFrame with the columns you want
df = pd.DataFrame(columns=['id', 'retweets', 'likes', 'url', 'text'])
df.set_index('id', inplace=True)

for tweet in json_data['data']:
  media_key = ""  # Reset to empty each time through the loop so that we can use it for a condition later

  # Store the data into variables
  tweet_id = tweet['id']
  retweet_count = tweet['public_metrics']['retweet_count']
  like_count = tweet['public_metrics']['like_count']
  image_url = ""
  text = tweet['text']

  # Find out if there is media
  if 'attachments' in tweet:
    if 'media_keys' in tweet['attachments']:
      media_key = tweet['attachments']['media_keys'][0]

  # If there is a media key in this tweet, iterate through tweet['includes']['media'] until we find it
  if media_key != "":
    for media in json_data['includes']['media']:
      if media['media_key'] == media_key: # Only if the media_key matches the one we stored
        if media['type'] == 'photo':      # Only if it is a photo; ignore videos
          image_url = media['url']        # Store the url in a variable

  # Add the new data to a new record in the DataFrame
  df.loc[tweet_id] = [retweet_count, like_count, image_url, text]

df.head()

KeyError: 'data'

### Pagenation

To paginate through results (i.e. retrieve 10 at a time until n are found), we need to create a loop/iteration, a stopping condition, and use the since_id parameter for the endpoint. Let's create a variable n to indicate how many total results we want as well as a varibale (total_retrieved) to keep track of how many results we have so far. We also need to stop automatically if there are fewer than n tweets. This is a good context to use a "while" loop rather than a "for" loop.

In [None]:
n = 20                            # The total number of tweets we want
max_results = 10                  # The number of tweets to pull per request; must be between 10 and 100
total_retrieved = 0               # To keep track of when to stop
next_token = ""                   # Must be empty on first iteration
search_term = "pandemic%20autistic%20routine"    # To form an advanced query, see here: https://twitter.com/search-advanced?lang=en
since_id = "1425800000000000000"  # The id of the oldest tweet you want to retrieve

# stop when we have n results
while total_retrieved < n:

  # the first time through the loop, we do not need the next_token parameter
  if next_token == "":
    url = f'https://api.twitter.com/2/tweets/search/recent?query={search_term}&max_results={max_results}&since_id={since_id}'
  else:
    url = f'https://api.twitter.com/2/tweets/search/recent?query={search_term}&max_results={max_results}&since_id={since_id}&next_token={next_token}'

  # make the request to the Twitter API Recent Search endpoint
  response = requests.request("GET", url, headers=headers)
  json_data = json.loads(response.text)
  clean_data = json.dumps(json_data, indent=2, sort_keys=True)
  print(clean_data)

  # now that the data is json-formatted, we can easily keep track of how many results have been obtained so far:
  total_retrieved += int(json_data['meta']['result_count'])
  if 'next_token'in json_data['meta']: # Check to see if there are any more pages
    next_token = json_data['meta']['next_token']
  else:
    total_retrieved = n # end the loop

  

{
  "data": [
    {
      "id": "1427525731425431583",
      "text": "Discover how some individuals on the #autism spectrum are coping, and how the pandemic is highlighting the value of autistic people in the #workforce. People on the autism spectrum are navigating pandemic-forced routine disruption.  Learn how! \ud83d\udc49https://t.co/d8ekgtCSV9 https://t.co/VRzTA7ZCk6"
    },
    {
      "id": "1426075466671300608",
      "text": "Not realised how much of my life I spent hyper vigilant and anxious before the pandemic because being autistic means I need a routine, and to be away from people, but I never had one that allowed me to do that and remain financially solvent and creatively fulfilled before."
    }
  ],
  "meta": {
    "newest_id": "1427525731425431583",
    "oldest_id": "1426075466671300608",
    "result_count": 2
  }
}


KeyError: ignored

In [None]:
n = 20                            # The total number of tweets we want
max_results = 10                  # The number of tweets to pull per request; must be between 10 and 100
total_retrieved = 0               # To keep track of when to stop
next_token = ""                   # Must be empty on first iteration
search_term = "covid%20autism"    # To form an advanced query, see here: https://twitter.com/search-advanced?lang=en
since_id = "1371600000000000000"  # The id of the oldest tweet you want to retrieve
df = pd.DataFrame(columns=['Retweets', 'Likes', 'Text'], index=['ID'])

# stop when we have n results
while total_retrieved < n:

  # the first time through the loop, we do not need the next_token parameter
  if next_token == "":
    url = f'https://api.twitter.com/2/tweets/search/recent?query={search_term}&max_results={max_results}&since_id={since_id}'
  else:
    url = f'https://api.twitter.com/2/tweets/search/recent?query={search_term}&max_results={max_results}&since_id={since_id}&next_token={next_token}'

  # These are the extra parameters we will add to the querystring; we won't store them all though; just want you to see what's possible
  url += f'&user.fields=created_at,description,entities,id,location,name,profile_image_url,protected,public_metrics,url,username,verified,withheld'
  url += f'&tweet.fields=attachments,author_id,context_annotations,conversation_id,created_at,entities,geo,id,in_reply_to_user_id,lang,possibly_sensitive,public_metrics,referenced_tweets,reply_settings,source,text,withheld'
  url += f'&expansions=attachments.poll_ids,attachments.media_keys,author_id,geo.place_id,in_reply_to_user_id,referenced_tweets.id,entities.mentions.username,referenced_tweets.id.author_id'
  url += f'&media.fields=duration_ms,height,media_key,preview_image_url,public_metrics,type,url,width'
  url += f'&place.fields=contained_within,country,country_code,full_name,geo,id,name,place_type'
  url += f'&poll.fields=duration_minutes,end_datetime,id,options,voting_status'

  # make the request to the Twitter API Recent Search endpoint
  response = requests.request("GET", url, headers=headers)
  json_data = json.loads(response.text)

  # Create the empty DataFrame with the columns you want
  df = pd.DataFrame(columns=['id', 'retweets', 'likes', 'url', 'text'])
  df.set_index('id', inplace=True)

  for tweet in json_data['data']:
    media_key = ""  # Reset to empty each time through the loop so that we can use it for a condition later

    # Store the data into variables
    tweet_id = tweet['id']
    retweet_count = tweet['public_metrics']['retweet_count']
    like_count = tweet['public_metrics']['like_count']
    image_url = ""
    text = tweet['text']

    # Find out if there is media
    if 'attachments' in tweet:
      if 'media_keys' in tweet['attachments']:
        media_key = tweet['attachments']['media_keys'][0]

    # If there is a media key in this tweet, iterate through tweet['includes']['media'] until we find it
    if media_key != "":
      for media in json_data['includes']['media']:
        if media['media_key'] == media_key: # Only if the media_key matches the one we stored
          if media['type'] == 'photo':      # Only if it is a photo; ignore videos
            image_url = media['url']        # Store the url in a variable

    # Add the new data to a new record in the DataFrame
    df.loc[tweet_id] = [retweet_count, like_count, image_url, text]

  # keep track of how many results have been obtained so far:
  total_retrieved += int(json_data['meta']['result_count'])
  next_token = json_data['meta']['next_token']

df.head()

  

KeyError: 'data'

In [None]:
# In this example, only those tweets with photos/images are stored

n = 100                           # The total number of tweets we want
max_results = 100                 # The number of tweets to pull per request; must be between 10 and 100
total_retrieved = 0               # To keep track of when to stop
next_token = ""                   # Must be empty on first iteration
search_term = "covid"             # To form an advanced query, see here: https://twitter.com/search-advanced?lang=en
since_id = "1371590000000000000"  # The id of the oldest tweet you want to retrieve

# Create the empty DataFrame with the columns you want
df = pd.DataFrame(columns=['id', 'retweets', 'likes', 'url', 'text'])
df.set_index('id', inplace=True)

# stop when we have n results
while total_retrieved < n:

  # the first time through the loop, we do not need the next_token parameter
  if next_token == "":
    url = f'https://api.twitter.com/2/tweets/search/recent?query={search_term}&max_results={max_results}&since_id={since_id}'
  else:
    url = f'https://api.twitter.com/2/tweets/search/recent?query={search_term}&max_results={max_results}&since_id={since_id}&next_token={next_token}'

  # These are the extra parameters we will add to the querystring; we won't store them all though; just want you to see what's possible
  url += f'&tweet.fields=attachments,public_metrics,text'
  url += f'&expansions=attachments.media_keys'
  url += f'&media.fields=media_key,type,url'

  # make the request to the Twitter API Recent Search endpoint
  response = requests.request("GET", url, headers=headers)
  try:  # Just in case we get an error
    json_data = json.loads(response.text)
  except:
    print(response.text)
  

  for tweet in json_data['data']:
    media_key = ""  # Reset to empty each time through the loop so that we can use it for a condition later

    # Store the data into variables
    tweet_id = tweet['id']
    retweet_count = tweet['public_metrics']['retweet_count']
    like_count = tweet['public_metrics']['like_count']
    image_url = ""
    text = tweet['text']

    # Find out if there is media
    if 'attachments' in tweet:
      if 'media_keys' in tweet['attachments']:
        media_key = tweet['attachments']['media_keys'][0]

    # If there is a media key in this tweet, iterate through tweet['includes']['media'] until we find it
    if media_key != "":
      for media in json_data['includes']['media']:
        if media['media_key'] == media_key: # Only if the media_key matches the one we stored
          if media['type'] == 'photo':      # Only if it is a photo; ignore videos
            image_url = media['url']        # Store the url in a variable
            
            # Only iterate if a photo is found
            total_retrieved += 1
            
            # Only add the record in the DataFrame if a photo is found
            df.loc[tweet_id] = [retweet_count, like_count, image_url, text]
            break

  # keep track of where to start next time, but quit if there are no more results
  try:
    next_token = json_data['meta']['next_token']
  except:
    break  

print(f'Number of records:\t{len(df)}')
df.to_csv('twitter.csv')
df.head()

  

Number of records:	101


Unnamed: 0_level_0,retweets,likes,url,text
id,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1
1374114830643294213,1,0,https://pbs.twimg.com/media/ExHWJ_dXMAE6fxD.jpg,"Petro, su esposa y su hija, positivos para Cov..."
1374114830622265353,29,0,https://pbs.twimg.com/media/ExHF5HEWQAYaiWW.jpg,RT @ellyxshwt: personagens que nunca pegariam ...
1374114830529961984,0,0,https://pbs.twimg.com/media/ExHWHY-WYAUalyE.jpg,"This weekend with @UFWF, WCK’s Relief Team pro..."
1374114823013756930,3,0,https://pbs.twimg.com/media/ExHOdhAVEAE-yte.jpg,RT @BrentKleinman: Every Arizonan older than 1...
1374114818026835974,0,0,https://pbs.twimg.com/media/ExHWI4sXIAQdJty.jpg,Volunteered to help out at the covid vaccinati...


## Python-Twitter Package
A lightweight wrapper around the Twitter API which makes it a bit easier to use the Twitter API with some desirable default settings built-in: https://python-twitter.readthedocs.io/en/latest/

In [None]:
!pip install python-twitter

Collecting python-twitter
[?25l  Downloading https://files.pythonhosted.org/packages/b3/a9/2eb36853d8ca49a70482e2332aa5082e09b3180391671101b1612e3aeaf1/python_twitter-3.5-py2.py3-none-any.whl (67kB)
[K     |████▉                           | 10kB 18.8MB/s eta 0:00:01[K     |█████████▊                      | 20kB 23.8MB/s eta 0:00:01[K     |██████████████▋                 | 30kB 28.6MB/s eta 0:00:01[K     |███████████████████▌            | 40kB 20.2MB/s eta 0:00:01[K     |████████████████████████▎       | 51kB 16.7MB/s eta 0:00:01[K     |█████████████████████████████▏  | 61kB 18.8MB/s eta 0:00:01[K     |████████████████████████████████| 71kB 6.7MB/s 
Installing collected packages: python-twitter
Successfully installed python-twitter-3.5


In [None]:
import twitter

# To get these credentials, create a Twitter Developer account here: https://developer.twitter.com/en/apply-for-access.html
# ...and a Twitter project here: https://developer.twitter.com/en/portal/projects-and-apps
# Then go to the dashboard and click the key icon next to your project here: https://developer.twitter.com/en/portal/dashboard
api = twitter.Api(consumer_key="5Fi3OVWC6vzPUbtTA3pNxGbEM", 
                  consumer_secret="7SE0qJep4DRFIYpcwzTgb4QOQxKdP8CVzdKUBVfEMIDa8HTf1V", 
                  access_token_key="1082016277-HW2vd49nrewPWPkcSDXHIJ1yMeSnA4kG3HAcn6C", 
                  access_token_secret="hj8HVX42EdGbgT8EzWAWQ5IgzqZBLAET0CJqmG2i6Czyi", 
                  sleep_on_rate_limit=True, # allows you to get around the rate limit to a certain degree
                  tweet_mode='extended')    # gives you all available information for a tweet

In [None]:
# For more help on formatting search terms: https://twitter.com/search-advanced

results = api.GetSearch(term='covid%20autism', result_type="recent", since='2020-01-01', 
                        count=5, include_entities=True, return_json=True)
results

{'search_metadata': {'completed_in': 0.051,
  'count': 5,
  'max_id': 1354216103505928199,
  'max_id_str': '1354216103505928199',
  'next_results': '?max_id=1354209972092104705&q=covid%2520autism%20since%3A2020-01-01&count=5&include_entities=1&result_type=recent',
  'query': 'covid%2520autism+since%3A2020-01-01',
  'refresh_url': '?since_id=1354216103505928199&q=covid%2520autism%20since%3A2020-01-01&result_type=recent&include_entities=1',
  'since_id': 0,
  'since_id_str': '0'},
 'statuses': [{'contributors': None,
   'coordinates': None,
   'created_at': 'Tue Jan 26 23:54:13 +0000 2021',
   'display_text_range': [0, 46],
   'entities': {'hashtags': [],
    'symbols': [],
    'urls': [],
    'user_mentions': []},
   'favorite_count': 1,
   'favorited': False,
   'full_text': 'will the covid vaccine make my autism go away?',
   'geo': None,
   'id': 1354216103505928199,
   'id_str': '1354216103505928199',
   'in_reply_to_screen_name': None,
   'in_reply_to_status_id': None,
   'in_reply

In [None]:
for k, v in dict(results['search_metadata']).items():
  print(k, v)

for status in results['statuses']:
  print(status['id'], status['created_at'])

completed_in 0.051
max_id 1354216103505928199
max_id_str 1354216103505928199
next_results ?max_id=1354209972092104705&q=covid%2520autism%20since%3A2020-01-01&count=5&include_entities=1&result_type=recent
query covid%2520autism+since%3A2020-01-01
refresh_url ?since_id=1354216103505928199&q=covid%2520autism%20since%3A2020-01-01&result_type=recent&include_entities=1
count 5
since_id 0
since_id_str 0
1354216103505928199 Tue Jan 26 23:54:13 +0000 2021
1354215186970914816 Tue Jan 26 23:50:35 +0000 2021
1354215150119772162 Tue Jan 26 23:50:26 +0000 2021
1354213332325076992 Tue Jan 26 23:43:13 +0000 2021
1354209972092104706 Tue Jan 26 23:29:52 +0000 2021


In [None]:
import urllib.parse as urlparse
from urllib.parse import parse_qs

url = dict(results['search_metadata'])['next_results']
parsed = urlparse.urlparse(url)
max_id = parse_qs(parsed.query)['max_id'][0]
print(max_id)

results = api.GetSearch(term='covid&autism', result_type="recent", since='2020-01-01', 
                        count=5, include_entities=True, return_json=True, max_id=max_id)

for k, v in dict(results['search_metadata']).items():
  print(k, v)
for status in results['statuses']:
  print(status['id'], status['created_at'])

1354209972092104705
completed_in 0.037
max_id 1354209972092104705
max_id_str 1354209972092104705
next_results ?max_id=1354190273283301381&q=covid%26autism%20since%3A2020-01-01&count=5&include_entities=1&result_type=recent
query covid%26autism+since%3A2020-01-01
refresh_url ?since_id=1354209972092104705&q=covid%26autism%20since%3A2020-01-01&result_type=recent&include_entities=1
count 5
since_id 0
since_id_str 0
1354196205258088450 Tue Jan 26 22:35:09 +0000 2021
1354195137472163841 Tue Jan 26 22:30:55 +0000 2021
1354191783064985605 Tue Jan 26 22:17:35 +0000 2021
1354191166237978633 Tue Jan 26 22:15:08 +0000 2021
1354190273283301382 Tue Jan 26 22:11:35 +0000 2021


In [None]:
# Automating pagination through tweet results
# Documentation: https://python-twitter.readthedocs.io/en/latest/twitter.html#twitter.api.Api.GetSearch

import pandas as pd
import json

loc = '/content/drive/MyDrive/Colab Notebooks/class/EMBA693R'

terms = "autism%20covid"  # '%20' is the code for a space
since = "2020-01-01"
batch = 5
count = 20
for start in range(0, count, batch):  # range(start, end, iteration)
  try:
    if start > 0 and result['id'] is not None:
      results = api.GetSearch(term=terms, result_type="recent", since=since, count=batch, return_json=True, include_entities=True, max_id=result['id'])
    else:
      raise
  except:
    results = api.GetSearch(term=terms, result_type="recent", since=since, count=batch, return_json=True, include_entities=True)

  for i, result in enumerate(results['statuses']):
    print(i + start, result['id'], result['created_at'])



0 1354216103505928199 Tue Jan 26 23:54:13 +0000 2021
1 1354215186970914816 Tue Jan 26 23:50:35 +0000 2021
2 1354215150119772162 Tue Jan 26 23:50:26 +0000 2021
3 1354213332325076992 Tue Jan 26 23:43:13 +0000 2021
4 1354209972092104706 Tue Jan 26 23:29:52 +0000 2021
5 1354209972092104706 Tue Jan 26 23:29:52 +0000 2021
6 1354196205258088450 Tue Jan 26 22:35:09 +0000 2021
7 1354195137472163841 Tue Jan 26 22:30:55 +0000 2021
8 1354191783064985605 Tue Jan 26 22:17:35 +0000 2021
9 1354191166237978633 Tue Jan 26 22:15:08 +0000 2021
10 1354191166237978633 Tue Jan 26 22:15:08 +0000 2021
11 1354190273283301382 Tue Jan 26 22:11:35 +0000 2021
12 1354187065672429569 Tue Jan 26 21:58:50 +0000 2021
13 1354187057644466176 Tue Jan 26 21:58:48 +0000 2021
14 1354185019787501570 Tue Jan 26 21:50:42 +0000 2021
15 1354185019787501570 Tue Jan 26 21:50:42 +0000 2021
16 1354184227068194822 Tue Jan 26 21:47:33 +0000 2021
17 1354181364669681667 Tue Jan 26 21:36:11 +0000 2021
18 1354180160107524096 Tue Jan 26 21:3

In [None]:
from monkeylearn import MonkeyLearn

data= {
  "data": [
    "This is a great tool!",
  ]
}



ml = MonkeyLearn('a104e43ab2d241bd36d33d72f79779f8e15bdf43')
data = ["This is a great tool!"]
model_id = 'cl_pi3C7JiL'
result = ml.classifiers.classify(model_id, data)
print(result.body)

[{'text': 'This is a great tool!', 'external_id': None, 'error': False, 'classifications': [{'tag_name': 'Positive', 'tag_id': 122921383, 'confidence': 0.998}]}]


In [None]:
!pip install monkeylearn

Collecting monkeylearn
  Downloading https://files.pythonhosted.org/packages/85/71/402de0a734641f015facd3d6cb24fce13c6b4bf67b6871d5425820c3cccd/monkeylearn-3.5.2.tar.gz
Building wheels for collected packages: monkeylearn
  Building wheel for monkeylearn (setup.py) ... [?25l[?25hdone
  Created wheel for monkeylearn: filename=monkeylearn-3.5.2-cp36-none-any.whl size=16107 sha256=42026715fdca7b5cec66b516e8a37c30f6acae0b909155f289da3794637399dc
  Stored in directory: /root/.cache/pip/wheels/45/d7/ce/6657bd945ba1aa207d91fa86219eb5b82f2b7247a7a520d420
Successfully built monkeylearn
Installing collected packages: monkeylearn
Successfully installed monkeylearn-3.5.2


In [None]:
# RapidAPI example

import requests

url = "https://finance-text-sentiment.p.rapidapi.com/sentiment_finance"

querystring = {"text":"Stilo International (LON:STL) Stock Price Passes Below 200 Day Moving Average"}

headers = {
    'x-rapidapi-key': "cb963ebacfmsh1e17ef70b3abb85p184ec6jsna3e7726518b0",
    'x-rapidapi-host': "finance-text-sentiment.p.rapidapi.com"
    }

response = requests.request("GET", url, headers=headers, params=querystring)

print(response.text)

{"sentiment": "Negative", "sentiment_score": "-0.9781697", "text": "Stilo International (LON:STL) Stock Price Passes Below 200 Day Moving Average"}


In [None]:
import requests

# Register new webhook for earnings
r = requests.post('https://finnhub.io/api/v1/webhook/add?token=c0a6ecv48v6vd3hq1scg', json={'event': 'earnings', 'symbol': 'AAPL'})
res = r.json()
print(res)

webhook_id = res['id']
# List webhook
r = requests.get('https://finnhub.io/api/v1/webhook/list?token=c0a6ecv48v6vd3hq1scg')
res = r.json()
print(res)

{'id': 5008, 's': 'ok'}
[{'id': 5008, 'name': 'earnings', 'symbol': 'AAPL', 'data': '{}'}]
