### **Data Acquisition Through APIs**

**APIs** (Application Programming Interfaces) are interfaces that allow different systems to communicate with each other.

The `requests` library is a Python library well suited to interacting with web APIs.

In [3]:
import requests

#### **API Documentation --> Endpoints:** ####

/facts -- Retrieve and query facts

/users* -- Get user data
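
Each endpoint is a path that is appended to the API's base URL. A minimal sketch of that step, using only the standard library; the helper name `endpoint_url` is an assumption made for illustration and is not part of the API or of `requests`:

```python
from urllib.parse import urljoin

# Host of the cat facts API used in this notebook
BASE_URL = "https://cat-fact.herokuapp.com"


def endpoint_url(path):
    """Join the base URL and an endpoint path into a full request URL."""
    return urljoin(BASE_URL, path)


print(endpoint_url("/facts"))
```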


A **GET** request asks the server to return the requested information in its response.

In [8]:
import requests

response = requests.get('https://cat-fact.herokuapp.com/facts')

print(response)

<Response [200]>


Status codes:
- 2XX: Success
- 3XX: Redirection
- 4XX: Client error (invalid request, resource not found, or the user does not have access)
- 5XX: Server error
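
As a rough sketch of how these families can be told apart in code, the first digit of the status code is enough. The helper `describe_status` is a hypothetical name; `HTTPStatus` comes from Python's standard library and supplies the official reason phrase for known codes:

```python
from http import HTTPStatus


def describe_status(code):
    """Return the code, its reason phrase and its family (illustrative helper)."""
    families = {2: "success", 3: "redirection", 4: "client error", 5: "server error"}
    family = families.get(code // 100, "other")
    try:
        phrase = HTTPStatus(code).phrase
    except ValueError:
        # Codes that are not registered in HTTPStatus
        phrase = "Unknown"
    return f"{code} {phrase} ({family})"


for code in (200, 301, 404, 500):
    print(describe_status(code))
```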

In [9]:
if response:
    print('Response OK')
else:
    print('Response Failed')

Response OK


Requests treats 4XX and 5XX status codes as errors: a `Response` object evaluates to `True` only when its status code is below 400.

In [10]:
print(response.status_code)

200


#### **HEADERS** ####

Headers carry metadata that lets both the client and the server interpret the data being sent and received.

In [11]:
print(response.headers)

{'Server': 'Cowboy', 'Report-To': '{"group":"heroku-nel","max_age":3600,"endpoints":[{"url":"https://nel.heroku.com/reports?ts=1705167628&sid=929419e7-33ea-4e2f-85f0-7d8b7cd5cbd6&s=1rv%2FIUT44u8TL9Xn5XFj8ZhsuiDFIXGpfvozOxsVL4k%3D"}]}', 'Reporting-Endpoints': 'heroku-nel=https://nel.heroku.com/reports?ts=1705167628&sid=929419e7-33ea-4e2f-85f0-7d8b7cd5cbd6&s=1rv%2FIUT44u8TL9Xn5XFj8ZhsuiDFIXGpfvozOxsVL4k%3D', 'Nel': '{"report_to":"heroku-nel","max_age":3600,"success_fraction":0.005,"failure_fraction":0.05,"response_headers":["Via"]}', 'Connection': 'keep-alive', 'X-Powered-By': 'Express', 'Access-Control-Allow-Origin': '*', 'Content-Type': 'application/json; charset=utf-8', 'Content-Length': '1863', 'Etag': 'W/"747-x/u8ZT4YD7H4GhrXTZH8+n4vTKY"', 'Set-Cookie': 'connect.sid=s%3AnjmrcoDwdLJ4FSKyPHfGYSVaDCH32buy.BLSf07noJbwY%2Flm%2FkbVkxggqbv2dh9qCdW3PDNlsA7c; Path=/; HttpOnly', 'Date': 'Sat, 13 Jan 2024 17:40:28 GMT', 'Via': '1.1 vegur'}


In [12]:
response.headers['Content-Type']

'application/json; charset=utf-8'

JSON response
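
A `Content-Type` value like this one combines a media type with optional parameters such as the character set. One possible way to split it, sketched with the standard library's `email.message.Message` rather than anything from `requests`; the helper name `parse_content_type` is made up for this example:

```python
from email.message import Message


def parse_content_type(header_value):
    """Split a Content-Type header into (media type, charset)."""
    msg = Message()
    msg["Content-Type"] = header_value
    return msg.get_content_type(), msg.get_content_charset()


media_type, charset = parse_content_type("application/json; charset=utf-8")
print(media_type, charset)
```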

In [13]:
print(response.text)

[{"status":{"verified":true,"sentCount":1},"_id":"58e008780aac31001185ed05","user":"58e007480aac31001185ecef","text":"Owning a cat can reduce the risk of stroke and heart attack by a third.","__v":0,"source":"user","updatedAt":"2020-08-23T20:20:01.611Z","type":"cat","createdAt":"2018-03-29T20:20:03.844Z","deleted":false,"used":false},{"status":{"verified":true,"sentCount":1},"_id":"58e009390aac31001185ed10","user":"58e007480aac31001185ecef","text":"Most cats are lactose intolerant, and milk can cause painful stomach cramps and diarrhea. It's best to forego the milk and just give your cat the standard: clean, cool drinking water.","__v":0,"source":"user","updatedAt":"2020-08-23T20:20:01.611Z","type":"cat","createdAt":"2018-03-04T21:20:02.979Z","deleted":false,"used":false},{"status":{"verified":true,"sentCount":1},"_id":"588e746706ac2b00110e59ff","user":"588e6e8806ac2b00110e59c3","text":"Domestic cats spend about 70 percent of the day sleeping and 15 percent of the day grooming.","__v":

#### **JSON RESPONSES** ####

- json.loads() -> Convert a string containing JSON data to Python data structures (json.load() does the same for a file object)
- json.dumps() -> The opposite: convert Python data structures to a JSON string
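
A small round trip may make the two directions concrete. The record below is a shortened, illustrative stand-in for one item of the API response, not real output:

```python
import json

# Illustrative record shaped like one item of the response
fact = {"text": "Most cats are lactose intolerant.", "verified": True, "sentCount": 1}

as_text = json.dumps(fact)      # Python dict -> JSON string (True becomes true)
restored = json.loads(as_text)  # JSON string -> Python dict

print(as_text)
print(restored == fact)
```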

In [15]:
import json

response2 = json.loads(response.text)
response2

[{'status': {'verified': True, 'sentCount': 1},
  '_id': '58e008780aac31001185ed05',
  'user': '58e007480aac31001185ecef',
  'text': 'Owning a cat can reduce the risk of stroke and heart attack by a third.',
  '__v': 0,
  'source': 'user',
  'updatedAt': '2020-08-23T20:20:01.611Z',
  'type': 'cat',
  'createdAt': '2018-03-29T20:20:03.844Z',
  'deleted': False,
  'used': False},
 {'status': {'verified': True, 'sentCount': 1},
  '_id': '58e009390aac31001185ed10',
  'user': '58e007480aac31001185ecef',
  'text': "Most cats are lactose intolerant, and milk can cause painful stomach cramps and diarrhea. It's best to forego the milk and just give your cat the standard: clean, cool drinking water.",
  '__v': 0,
  'source': 'user',
  'updatedAt': '2020-08-23T20:20:01.611Z',
  'type': 'cat',
  'createdAt': '2018-03-04T21:20:02.979Z',
  'deleted': False,
  'used': False},
 {'status': {'verified': True, 'sentCount': 1},
  '_id': '588e746706ac2b00110e59ff',
  'user': '588e6e8806ac2b00110e59c3',
  '

Python can now interpret the data structure

In [18]:
for i, fact in enumerate(response2, 1): #enumerate yields each element together with its index (starting at 1 here)
    print(f"Response {i}: ", fact["text"], "\n") #\n is a new line

Response 1:  Owning a cat can reduce the risk of stroke and heart attack by a third. 

Response 2:  Most cats are lactose intolerant, and milk can cause painful stomach cramps and diarrhea. It's best to forego the milk and just give your cat the standard: clean, cool drinking water. 

Response 3:  Domestic cats spend about 70 percent of the day sleeping and 15 percent of the day grooming. 

Response 4:  The frequency of a domestic cat's purr is the same at which muscles and bones repair themselves. 

Response 5:  Cats are the most popular pet in the United States: There are 88 million pet cats and 74 million dogs. 



#### **PARAMETERS** ####

In [19]:
import json

In [None]:
url = " *** "
querystring = {"api_key":" *** "}

response = requests.get(url, params=querystring)

print(response.text)

In [None]:
get_url = json.loads(response.text)
get_url['datos']

In [None]:
url = get_url['datos']

final_text = requests.get(url)
final_text.text

In [None]:
result = json.loads(final_text.text) #json.load() expects a file object, so parse the response body string instead
result[0]
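
The dictionary passed as `params` ends up in the URL as a query string. The sketch below reproduces that encoding with the standard library's `urlencode`, which should behave much like what `requests` does internally; the URL `https://api.example.com/data`, the key values and the helper `build_url` are all hypothetical:

```python
from urllib.parse import urlencode


def build_url(base_url, params):
    """Append URL-encoded query parameters to a base URL."""
    return f"{base_url}?{urlencode(params)}"


# Placeholder URL and parameters, for illustration only
print(build_url("https://api.example.com/data", {"api_key": "my key", "page": 2}))
```

Special characters such as spaces are escaped during encoding, which is why passing a dictionary is usually safer than concatenating the query string by hand.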

#### **TIMEOUT** ####

The timeout is the maximum time, in seconds, that the request waits for a response from the server.

In [20]:
result = requests.get("https://cat-fact.herokuapp.com/facts", timeout = 5)
result

<Response [200]>

In [21]:
from requests.exceptions import Timeout

try: #try to do this
    response = requests.get("https://cat-fact.herokuapp.com/facts", timeout=0.01)

except Timeout: #if an exception occurs
    print('The request timed out')

else: #if no exception occurs
    print('The request did not time out')

The request timed out
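
The same behaviour can be reproduced without internet access. The sketch below deliberately swaps `requests` for the standard library's `urllib` and points it at a local socket that accepts connections but never answers; the helper names `open_silent_server` and `request_timed_out` are assumptions made for this example:

```python
import socket
import urllib.error
import urllib.request


def open_silent_server():
    """Listen on a free local port but never answer, simulating a stalled server."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
    server.listen(1)
    return server


def request_timed_out(url, timeout):
    """Return True if the request gave up because of the timeout."""
    # Ignore any proxy configured in the environment so the call stays local
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(url, timeout=timeout):
            return False
    except socket.timeout:
        # Timeout while waiting for the response
        return True
    except urllib.error.URLError as exc:
        # Timeout while connecting is wrapped in URLError
        return isinstance(exc.reason, socket.timeout)


server = open_silent_server()
port = server.getsockname()[1]
print(request_timed_out(f"http://127.0.0.1:{port}/", timeout=0.5))
server.close()
```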


#### **DECORATOR FUNCTIONS** ####

A decorator is a function that takes another function as an argument and extends or modifies the behavior of that function without changing its internal code

In [23]:
def decorator(f):
    def new_function(): 
        print("Extra functionality")
        f() #original function
    return new_function

@decorator #modifies the behavior of initial_function without changing its internal code
def initial_function():
    print("Initial functionality")

In [24]:
initial_function() #the extended version provided by the decorator is executed

Extra functionality
Initial functionality
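
The decorator above only works for functions that take no arguments, and the decorated function loses its original name. A common variation, sketched here with the hypothetical names `announce` and `add`, forwards any arguments and uses `functools.wraps` to preserve the metadata:

```python
import functools


def announce(func):
    @functools.wraps(func)  # keep func's name and docstring on the wrapper
    def wrapper(*args, **kwargs):
        print(f"Calling {func.__name__}")
        return func(*args, **kwargs)  # forward arguments and return value
    return wrapper


@announce
def add(a, b):
    return a + b


print(add(2, 3))
print(add.__name__)
```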


#### When a request fails, you may want your application to retry the same request ####

I want all requests to https://cat-fact.herokuapp.com/facts to be attempted up to three times

In [42]:
import time #used to pause between attempts


def decorator2(func, retries=3): #func: the function being decorated; retries: maximum number of attempts
    def retry_wrapper(*args, **kwargs): #wrapper that accepts the same arguments as func
        attempts = 0
        while attempts < retries:
            try:
                return func(*args, **kwargs)
            except requests.exceptions.RequestException as e: #any error raised by requests
                print(e) #the error is printed
                time.sleep(2) #wait for 2 seconds before the next attempt
                attempts += 1 #the attempt counter is incremented
        raise RuntimeError(f"{func.__name__} failed after {retries} attempts") #instead of silently returning None

    return retry_wrapper


@decorator2
def get_data(url): #this function is attempted up to three times
    r = requests.get(url)
    return r.text


text = get_data("https://cat-fact.herokuapp.com/facts")
json.loads(text)[0]["text"]

'Owning a cat can reduce the risk of stroke and heart attack by a third.'
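
If the number of attempts or the pause should be configurable per function, one option is a decorator factory. This is a sketch under assumptions, not the approach used above: the names `retry` and `flaky` are hypothetical, and a simulated `ConnectionError` stands in for a real network failure so the example runs offline:

```python
import functools
import time


def retry(retries=3, delay=2, exceptions=(Exception,)):
    """Build a decorator that retries the function when it raises `exceptions`."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(1, retries + 1):
                try:
                    return func(*args, **kwargs)
                except exceptions:
                    if attempt == retries:
                        raise  # out of attempts: propagate the last error
                    time.sleep(delay)  # pause before the next attempt
        return wrapper
    return decorator


calls = {"count": 0}


@retry(retries=3, delay=0, exceptions=(ConnectionError,))
def flaky():
    """Fail twice, then succeed (simulated failure)."""
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("temporary failure")
    return "ok"


print(flaky(), calls["count"])
```

Re-raising the last exception once the attempts are exhausted keeps the failure visible to the caller instead of returning `None`.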

#### **PRACTICE** ####

Using the API of this [web site](https://www.thecocktaildb.com/):

- Make a request and use the parameters of the request to get the drinks starting with the letter a. Then filter for drinks that have alcohol in them. How many results do we get?  
- Using the previous request and filter, i.e., alcoholic drinks starting with the letter a, create a list of cocktails that have the word `juice` in one of their ingredients.

In [56]:
import requests

request = requests.get('https://www.thecocktaildb.com/api/json/v1/1/search.php', params={'f': 'a'}) #f=a: drinks whose name starts with the letter a

response3 = json.loads(request.text)
response3

{'drinks': [{'idDrink': '17222',
   'strDrink': 'A1',
   'strDrinkAlternate': None,
   'strTags': None,
   'strVideo': None,
   'strCategory': 'Cocktail',
   'strIBA': None,
   'strAlcoholic': 'Alcoholic',
   'strGlass': 'Cocktail glass',
   'strInstructions': 'Pour all ingredients into a cocktail shaker, mix and serve over ice into a chilled glass.',
   'strInstructionsES': 'Vierta todos los ingredientes en una coctelera, mezcle y sirva con hielo en un vaso frío.',
   'strInstructionsDE': 'Alle Zutaten in einen Cocktailshaker geben, mischen und über Eis in ein gekühltes Glas servieren.',
   'strInstructionsFR': None,
   'strInstructionsIT': 'Versare tutti gli ingredienti in uno shaker, mescolare e servire con ghiaccio in un bicchiere freddo.',
   'strInstructionsZH-HANS': None,
   'strInstructionsZH-HANT': None,
   'strDrinkThumb': 'https://www.thecocktaildb.com/images/media/drink/2x8thr1504816928.jpg',
   'strIngredient1': 'Gin',
   'strIngredient2': 'Grand Marnier',
   'strIngredien

In [59]:
request = requests.get('https://www.thecocktaildb.com/api/json/v1/1/search.php', params={'f': 'a'}) #f=a: drinks whose name starts with the letter a

response = json.loads(request.text)

drinks_starting_with_a = [
    drink for drink in response['drinks'] 
    if drink['strDrink'][0].lower() == 'a' #get the first character of the name of the drink and convert it to lower case
    ]

for drink in drinks_starting_with_a:
    print(drink['strDrink'])

A1
ABC
Ace
ACID
Adam
AT&T
A. J.
Avalon
Apello
Affair
Abilene
Almeria
Addison
Applecar
Acapulco
Affinity
Aviation
After sex
Applejack
Afterglow
Afternoon
Alexander
Autodafé
Allegheny
Americano


In [61]:
len(drinks_starting_with_a)

25

In [60]:
drinks_starting_with_a_and_alcoholic = [
    drink for drink in response['drinks'] 
    if drink['strDrink'][0].lower() == 'a' and drink.get('strAlcoholic', '').lower() == 'alcoholic'
]

for drink in drinks_starting_with_a_and_alcoholic:
    print(drink['strDrink'])

A1
ABC
Ace
ACID
Adam
AT&T
A. J.
Avalon
Affair
Abilene
Almeria
Addison
Applecar
Acapulco
Affinity
Aviation
After sex
Applejack
Afternoon
Alexander
Autodafé
Allegheny
Americano


In [62]:
len(drinks_starting_with_a_and_alcoholic)

23

Apello and Afterglow are non-alcoholic, so they are filtered out

In [65]:
drinks_with_juice = [
    drink for drink in drinks_starting_with_a_and_alcoholic
    if any(
        ingredient and 'juice' in ingredient.lower() #skip empty (None) ingredient slots
        for ingredient in (drink[f'strIngredient{i}'] for i in range(1, 16)) #strIngredient1 to strIngredient15
    )
]

for drink in drinks_with_juice:
    print(drink['strDrink'])

A1
Adam
A. J.
Avalon
Affair
Abilene
Applecar
Acapulco
Aviation
After sex
Autodafé
Allegheny


---------------