<a href="https://colab.research.google.com/github/ipeirotis/dealing_with_data/blob/master/02-WebAPIs/A-Accessing_Web_APIs_using_Python.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Introduction to Web APIs




## Web APIs (vs. Web Browsers): Two Ways to Interact with the Web

Think of the Internet as a massive digital warehouse filled with valuable data and services. You can interact with this warehouse in two primary ways: through a **web browser** or by using **Web APIs**. The critical difference lies in who (or what) performs the interaction and how.

---

### 1. Web Browser: Human Interaction

* **Who:** A person.
* **How:** You use a browser (e.g., Chrome, Safari) to visit websites like `finance.yahoo.com`.
* **What Happens:** The website sends back visually formatted pages with text, images, buttons, and ads designed for human interaction.
* **Purpose:** To inform or entertain people.
* **Analogy:** Walking into a department store, browsing shelves and price tags manually.


### 2. Web API: Automated Interaction

* **Who:** A computer program or script (e.g., Python script).
* **How:** Your script sends structured requests directly to an "API endpoint" (a dedicated URL).
* **What Happens:** The server responds with structured, raw data (commonly in JSON), optimized for immediate machine processing.
* **Purpose:** Provide data or trigger actions programmatically.
* **Analogy:** Sending a robot directly into the warehouse with a specific shopping list (e.g., "Retrieve price for item ID 12345"), skipping manual browsing entirely.

---


## Types of APIs for Business Analytics: "Read"-"Think"-"Do"

APIs can be categorized based on their core functionality:

### 1. Data Retrieval APIs ("Read" APIs)

These APIs give you access to data from external systems.

* **Live Data Examples:**

  * **Financial:** *Alpha Vantage API* for real-time stock prices.
  * **Social Media:** *Twitter API* for real-time tweet streams.
  * **Logistics:** *Google Maps API* for current traffic data.

* **Historical Data Examples:**

  * **Government Records:** *U.S. National Archives API* for historical data.
  * **Academic Research:** *JSTOR API* for scholarly metadata.
  * **Market Research:** Weather APIs for historical weather analysis.

### 2. Data Processing APIs ("Think" APIs)

These APIs accept your data, analyze or transform it, and return insightful results.

* **Natural Language Processing:** *Google Cloud Natural Language API* analyzes text for sentiment and themes.
* **Image Recognition:** *Google Cloud Vision API* identifies objects, text, or logos in images.
* **Geocoding:** *Mapbox Geocoding API* converts street addresses into geographic coordinates.

### 3. Automation APIs ("Do" APIs)

These APIs automate tasks or trigger actions across applications.

* **Automated Reporting:** *Slack API* can post alerts based on data thresholds (e.g., sales targets).
* **Marketing Automation:** Add contacts automatically to mailing lists with *Mailchimp* or *HubSpot APIs* based on customer feedback.
* **System Integration:** Automate invoicing with accounting APIs like *QuickBooks* when sales occur on platforms like *Shopify*.

---

By mastering APIs, you dramatically expand what you can do as a business analyst: you can automate workflows and handle large amounts of data with far less manual effort.


# Interacting with Web APIs



We are going to learn how to use Web APIs with Python. Using a Web API is similar to calling a function, but the "function" that we call lives on another machine, and we submit its parameters over the web.



## Motivating example: Weather-targeting in ads

Here are a few examples of ad campaigns that leveraged weather information to effectively target customers.

| Brand & Year | Weather Signal | Activation | Business Impact |
| :--- | :--- | :--- | :--- |
| **1. Pantene “Haircast” (USA, 2013)** | Local humidity spikes (bad-hair-day risk) pulled from The Weather Channel. | When women opened the Weather Channel app on a high-humidity day, they saw a “Haircast” forecast plus the exact Pantene SKU to tame frizz, linked to a Walgreens offer. | +24% YoY Pantene sales at Walgreens; +10% baseline lift and displays in 5,500 stores. |
| **2. Campbell’s Soup Watson Ads (USA, 2016)** | Cold or blustery conditions from users’ GPS forecast. | Inside The Weather Company app, an AI banner invited people to “warm up” with recipe ideas; copy, imagery, and calls-to-action changed in real time. | First commercial Watson Ads unit; demonstrated higher engagement vs. static creative. |
| **3. Stella Artois Cidre “Serve Chilled” (UK, 2013)** | +2 °C above the monthly norm AND no rain. | Digital out-of-home billboards lit up only during perfect cider-drinking weather near supermarkets; the programmatic buy paused when the trigger lapsed. | 65.6% YoY sales jump during the campaign and up to 50% media-cost efficiency. |
| **4. Walgreens Allergy Stories (USA, 2023)** | Neighborhood-level pollen counts via IBM Watson Weather. | Served Instagram Stories only when local pollen was “high,” opening with the alert and flashing a coupon for antihistamines. | +276% click-through vs. prior allergy ads; -41% cost per person reached; -64% cost per click. |
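
To make the idea concrete, the Stella Artois trigger from the table (2 °C above the monthly norm and no rain) can be sketched as a simple rule. The function name, parameter names, and sample readings below are hypothetical, and treating "+2 °C" as "at least 2 °C" is an assumption; in a real campaign the inputs would come from a weather API.

```python
def should_show_cider_ad(temp_c, monthly_norm_c, is_raining):
    """Return True when it is at least 2 °C warmer than the monthly norm and dry."""
    warmer_than_usual = temp_c >= monthly_norm_c + 2
    return warmer_than_usual and not is_raining

# Made-up readings, only to show the rule in action
print(should_show_cider_ad(temp_c=21.5, monthly_norm_c=18.0, is_raining=False))  # True
print(should_show_cider_ad(temp_c=21.5, monthly_norm_c=18.0, is_raining=True))   # False
print(should_show_cider_ad(temp_c=19.0, monthly_norm_c=18.0, is_raining=False))  # False
```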



## First Example: GeoIP resolution

We will start with an example that performs "geoIP" resolution: it takes the IP address of a computer and returns its location.

## Our Toolkit: The requests Library and JSON

Before we make our first live API call, let's understand the two key tools we'll be using.

### The `requests` Library
To have our Python script "talk" to a web server, we use a library called `requests`. It simplifies the process of sending HTTP requests, which is the standard way information is exchanged on the web. When you type a URL into your browser, you are sending an HTTP "GET" request. The `requests` library lets us do the same thing in our code.

We first need to `import` it to make its functions available to our script.



In [None]:
# We first import the requests library
import requests

### The Response Object
When we send a request, the server sends back a **Response object**. Think of this object as a container. It holds several pieces of information:
* **Content:** The actual data sent back by the server (e.g., HTML, or in our case, JSON). We can see this as text using `response.text`.
* **Status Code:** A number that tells us if our request was successful. The most famous is `200 OK`. A `404` means "Not Found," and a `403` or `401` often means there's a problem with our API key.
* **A JSON Decoder:** A special function, `.json()`, that directly converts a JSON-formatted response into a Python dictionary, which is much easier to work with.
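
As a rough illustration of how the status codes listed above might be interpreted in code, here is a minimal sketch. The helper name `describe_status` and its messages are hypothetical; they are not part of `requests`.

```python
def describe_status(status_code):
    """Translate an HTTP status code into a short, human-readable hint."""
    if 200 <= status_code < 300:
        return "success"
    if status_code in (401, 403):
        return "not authorized (check the API key)"
    if status_code == 404:
        return "not found (check the URL)"
    return "unexpected status"

# Try the helper on the codes mentioned above
for code in (200, 401, 404):
    print(code, "->", describe_status(code))
```

With a real Response object, the same helper could be called as `describe_status(response.status_code)`.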

In [None]:
# These two "keys" serve as passwords to access these services
# For security reasons, I share the keys on the class Slack
ipstack_api_key = 'KEY ON SLACK'
openweathermap_key = 'KEY ON SLACK'

In [None]:
# This is the URL of our API request
url = f'http://api.ipstack.com/check?access_key={ipstack_api_key}'
resp = requests.get(url)

In [None]:
# Let's check the status code. 200 means success!
print(f"Response Status Code: {resp.status_code}")

In [None]:
# The .text attribute shows us the raw data the server sent back.
# As you can see, it's a string that is formatted like a dictionary. This is JSON!
print(resp.text)


### What is JSON?
JSON (JavaScript Object Notation) is the de facto standard for transmitting data through APIs. It is human-readable text that looks very similar to a Python dictionary, and that simplicity makes it a good fit for APIs.

For example, a simple JSON object looks like this:
```json
{
  "name": "John Doe",
  "city": "New York",
  "isStudent": true
}
```
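
To see how such text becomes a Python object, the sketch below parses the example above with the standard-library `json` module, which performs essentially the same conversion that `.json()` does on a response. The variable names are illustrative only.

```python
import json

# The same JSON object as above, written as a Python string
json_text = '{"name": "John Doe", "city": "New York", "isStudent": true}'

# json.loads ("load string") parses JSON text into Python objects
person = json.loads(json_text)

print(type(person))         # a Python dictionary
print(person["city"])       # values are accessed by key
print(person["isStudent"])  # JSON true becomes Python True
```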

In [None]:
# To work with this data in Python, we need to "parse" the JSON string into a Python dictionary.
# The .json() method does this for us automatically.
data = resp.json()

# Let's confirm that it is now a dictionary.
print(f"The type of our 'data' variable is: {type(data)}")

In [None]:
# Now that 'data' is a Python dictionary, we can inspect it.
# Notice the key-value pairs.
from pprint import pprint # pprint makes dictionaries print more neatly
pprint(data)

In [None]:
# And we can access the fields of the JSON as we normally access Python dictionary entries
print("Lon:", data["longitude"], "Lat:", data["latitude"])

In [None]:
# A few more data points
print("City:", data["city"])
print("Region:", data["region_name"], data["region_code"])
print("Zipcode:", data["zip"])

And in one piece:

In [None]:
import requests
url = f'http://api.ipstack.com/check?access_key={ipstack_api_key}'
resp = requests.get(url)
data = resp.json()
print("Lon:", data["longitude"], "Lat:", data["latitude"])
print("City:", data["city"])
print("Region:", data["region_name"], data["region_code"])
print("Zipcode:", data["zip"])

## Using Parameters with API Calls



The first API call that we tried was very simple. We just fetched a URL. Now let's see a URL that accepts as input a set of **parameters**. We have already seen this concept with functions; the parameters of the API calls are the exact equivalent but for Web APIs, which are, at their core, functions that we call over the web.

### Example: OpenWeatherMap

Let's try to query OpenWeatherMap now, to get data about the weather. [Documentation](https://openweathermap.org/api/one-call-3).

Below you can find the URL that you can copy and paste in your browser, to get the weather for the lat/lon coordinates (`40.728955`, `-73.996154`) (i.e., the Stern building).

You will notice that it contains parameters as part of the URL, including an `appid` which is a key that is used to limit the number of calls that can be issued by a single application.

Try the URL in your browser.

Below you can find the same request in Python, but now we use a Python dictionary to organize and list the parameters.

In [None]:
import requests
from pprint import pprint

openweathermap_url = "https://api.openweathermap.org/data/3.0/onecall"

# Using a 'params' dictionary is the best practice.
# The `requests` library will correctly format this into a URL for us.
# It's cleaner and safer than building the URL string by hand.
parameters = {
    'lat'   : 40.728955,
    'lon'   : -73.996154,
    'units' : 'imperial',
    'exclude' : 'minutely,hourly,daily',
    'mode'  : 'json',
    'appid' : openweathermap_key
}

resp = requests.get(openweathermap_url, params=parameters)

# Always good to check for success!
print(f"Response Status Code: {resp.status_code}")

data = resp.json()
pprint(data)
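
To see roughly what URL a parameter dictionary turns into, here is a small sketch that uses the standard-library `urlencode` function rather than `requests` itself. The key `'YOUR_KEY_HERE'` is a placeholder, and the exact encoding that `requests` produces may differ in minor details.

```python
from urllib.parse import urlencode

base_url = "https://api.openweathermap.org/data/3.0/onecall"
example_params = {
    'lat': 40.728955,
    'lon': -73.996154,
    'units': 'imperial',
    'exclude': 'minutely,hourly,daily',
    'appid': 'YOUR_KEY_HERE',  # placeholder, not a real key
}

# urlencode joins the key=value pairs with '&' and escapes special characters
# (for example, the commas in 'exclude' become %2C)
query_string = urlencode(example_params)
full_url = f"{base_url}?{query_string}"
print(full_url)
```

After a real call, `resp.url` shows the URL that `requests` actually sent, which is a handy way to debug parameter mistakes.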

### Navigating Nested Data

Look at the structure of the `data` dictionary above. You'll see that some values are themselves dictionaries. For example, the value for the key `'current'` is another dictionary containing the current weather details.

To get the temperature, we first need to access the `'current'` dictionary, and *then* access the `'temp'` key inside of it.

`data['current']` will give us the inner dictionary.
`data['current']['temp']` will give us the temperature.

The weather description is nested one level deeper, under `data['current']['weather']`. Notice that the value for `'weather'` is a list (inside `[` `]`). So we need to access the first item of the list with `[0]`, which is a dictionary, and then get the value for the key `'description'`.
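
The same navigation can be practiced offline on a small hand-written dictionary. The `sample` below only mimics the shape described above; its values are made up and the real response contains many more fields.

```python
# A trimmed-down, made-up stand-in for the API response
sample = {
    'current': {
        'temp': 71.3,
        'weather': [
            {'main': 'Clouds', 'description': 'scattered clouds'}
        ]
    }
}

current = sample['current']            # inner dictionary
print(current['temp'])                 # a number

weather_list = current['weather']      # a list of dictionaries
print(type(weather_list))
print(weather_list[0]['description'])  # first list item, then its key
```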

## Exercise 1


a. Extract the current temperature from the returned JSON response.


In [None]:
# your code here

b. Extract the description of the current weather.

In [None]:
# your code here

c. Change the units to `metric` and repeat.


In [None]:
# your code here

### Solution for Exercise 1

In [None]:
# First, access the 'current' dictionary
current_weather = data['current']

# Now, access the 'temp' key from the 'current_weather' dictionary
current_temp = current_weather['temp']

print(f"The current temperature is: {current_temp}°F")

In [None]:
# The 'weather' key contains a list of dictionaries. We want the first one, at index 0.
weather_list = data['current']['weather']
first_weather_item = weather_list[0]
weather_description = first_weather_item['description']

print(f"The current weather description is: {weather_description}")

# You can also do this in a single line, but it's good to understand the steps:
# print(data['current']['weather'][0]['description'])

In [None]:
import requests

openweathermap_url = "https://api.openweathermap.org/data/3.0/onecall"
parameters = {
    'lat'   : 40.728955,
    'lon'   : -73.996154,
    'units' : 'metric',
    'exclude' : 'minutely,hourly,daily',
    'mode'  : 'json',
    'appid' : openweathermap_key
}
resp = requests.get(openweathermap_url, params=parameters)
data = resp.json()
data

## Exercise 2



Read the location of your computer using the GeoIP API. Then query the OpenWeatherMap API to fetch the temperature for the location returned by the GeoIP API. For this exercise, you will need to read values from one Web API (GeoIP) and use them as input to another (OpenWeatherMap).

In [None]:
# your code here

### Solution for Exercise 2

In [None]:
import requests

# 1. Get location data from GeoIP API
print("Fetching location data...")
geoip_url = f'http://api.ipstack.com/check?access_key={ipstack_api_key}'
resp = requests.get(geoip_url)
geoip_data = resp.json()

# 2. Extract the latitude and longitude and other data from the response
# HINT: Look at the keys in geoip_data
lon = geoip_data["longitude"]
lat = geoip_data["latitude"]
city = geoip_data["city"]
state = geoip_data["region_code"]
zipcode = geoip_data["zip"]

# 3. Use the extracted lat and lon to call the OpenWeatherMap API
# Query the OpenWeatherMap API for the lat/lon coordinates returned by GeoIP
print("\nFetching weather data for this location...")
openweathermap_url = "https://api.openweathermap.org/data/3.0/onecall"
parameters = {
    'lat'   : lat,
    'lon'   : lon,
    'units' : 'imperial',
    'exclude' : 'minutely,hourly,daily',
    'mode'  : 'json',
    'appid' : openweathermap_key
}
resp = requests.get(openweathermap_url, params=parameters)
weather_data = resp.json()

# 4. Extract and print the temperature and description
# HINT: This is just like Exercise 1
weather_description = weather_data['current']['weather'][0]['description']
# Read the temperature from weather_data (this call), not from the earlier 'data' variable
current_temperature = weather_data['current']['temp']

# Print out the results
print("Location:", city, state, zipcode)
print("Weather:", weather_description)
print("Temperature:", current_temperature)

print(f"\nIn {city}, the temperature is {current_temperature}°F with {weather_description}.")
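
One possible way to make this two-step pipeline easier to reuse is to separate the formatting from the fetching. The sketch below assumes the same response shapes as above; the function name `summarize_weather` and the sample dictionaries are hypothetical and contain made-up values.

```python
def summarize_weather(geoip_data, weather_data, unit_symbol="°F"):
    """Build a one-sentence summary from the two parsed API responses."""
    city = geoip_data["city"]
    temp = weather_data["current"]["temp"]
    description = weather_data["current"]["weather"][0]["description"]
    return f"In {city}, the temperature is {temp}{unit_symbol} with {description}."

# Made-up stand-ins for the parsed JSON responses
sample_geoip = {"city": "New York", "latitude": 40.728955, "longitude": -73.996154}
sample_weather = {"current": {"temp": 68.2, "weather": [{"description": "clear sky"}]}}

print(summarize_weather(sample_geoip, sample_weather))
```

Because the function takes plain dictionaries, it can be tried out without spending any API calls.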