# Reading Data from API Gateway with Python's `requests`

This notebook shows how to retrieve data from an API Gateway endpoint with Python's `requests` library. It covers basic GET requests, common API Gateway features such as headers and query parameters, API keys, and loading the response into Pandas for analysis.

## What is an API Gateway?

An API Gateway acts as the single entry point for a group of microservices. It handles common tasks like authentication, authorization, routing, rate limiting, and monitoring, protecting the backend services from direct exposure. When you access data from an API, you're often interacting with an API Gateway.

## Prerequisites

We will be using two primary Python libraries:

-   `requests`: For making HTTP requests to web services.
-   `pandas`: For data manipulation and analysis, especially after receiving JSON data.

If you don't have them installed, run the following command in your terminal or a new notebook cell:

In [None]:
!pip install requests pandas

## 1. Making a Basic GET Request

The most common way to retrieve data from an API is via an HTTP GET request. The `requests.get()` function is used for this purpose.

For demonstration, we'll use a public API that mimics a simple data endpoint. Imagine this is your API Gateway endpoint.

In [2]:
import requests
import pandas as pd

# Replace this with your actual API Gateway URL
api_gateway_url = "https://jsonplaceholder.typicode.com/posts"

try:
    # A timeout keeps the call from hanging if the endpoint is unresponsive
    response = requests.get(api_gateway_url, timeout=10)
    response.raise_for_status()  # Raise an exception for HTTP errors (4xx or 5xx)

    # Reaching this point means no 4xx/5xx status was returned
    data = response.json()  # Parse the JSON response
    print("Successfully fetched data!")
    for post in data:
        print(post)

except requests.exceptions.HTTPError as http_err:
    print(f"HTTP error occurred: {http_err}")
    print("Response content:", http_err.response.text)
except requests.exceptions.ConnectionError as conn_err:
    print(f"Connection error occurred: {conn_err}")
except requests.exceptions.Timeout as timeout_err:
    print(f"Timeout error occurred: {timeout_err}")
except requests.exceptions.RequestException as req_err:
    print(f"An error occurred: {req_err}")

Successfully fetched data!
{'userId': 1, 'id': 1, 'title': 'sunt aut facere repellat provident occaecati excepturi optio reprehenderit', 'body': 'quia et suscipit\nsuscipit recusandae consequuntur expedita et cum\nreprehenderit molestiae ut ut quas totam\nnostrum rerum est autem sunt rem eveniet architecto'}
{'userId': 1, 'id': 2, 'title': 'qui est esse', 'body': 'est rerum tempore vitae\nsequi sint nihil reprehenderit dolor beatae ea dolores neque\nfugiat blanditiis voluptate porro vel nihil molestiae ut reiciendis\nqui aperiam non debitis possimus qui neque nisi nulla'}
{'userId': 1, 'id': 3, 'title': 'ea molestias quasi exercitationem repellat qui ipsa sit aut', 'body': 'et iusto sed quo iure\nvoluptatem occaecati omnis eligendi aut ad\nvoluptatem doloribus vel accusantium quis pariatur\nmolestiae porro eius odio et labore et velit aut'}
{'userId': 1, 'id': 4, 'title': 'eum et est occaecati', 'body': 'ullam et saepe reiciendis voluptatem adipisci\nsit amet autem assumenda provident 

In [38]:
import requests
import pandas as pd

# Replace this with your actual API Gateway URL
api_gateway_url = "https://jsonplaceholder.typicode.com/posts"

try:
    response = requests.get(api_gateway_url, timeout=10)
    response.raise_for_status()  # Raise an exception for HTTP errors (4xx or 5xx)
    data = response.json()  # Parse the JSON response into a list of dicts
except requests.exceptions.RequestException as err:
    # Catch request failures specifically instead of using a bare except
    print(f"Request failed: {err}")
else:
    # Runs only when the request succeeded, so `data` is always defined here
    df = pd.DataFrame(data)
    print(df.info())
    print(df.head())
    print(set(df["userId"]))  # Distinct user IDs in the dataset

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 100 entries, 0 to 99
Data columns (total 4 columns):
 #   Column  Non-Null Count  Dtype 
---  ------  --------------  ----- 
 0   userId  100 non-null    int64 
 1   id      100 non-null    int64 
 2   title   100 non-null    object
 3   body    100 non-null    object
dtypes: int64(2), object(2)
memory usage: 3.3+ KB
None
   userId  id                                              title  \
0       1   1  sunt aut facere repellat provident occaecati e...   
1       1   2                                       qui est esse   
2       1   3  ea molestias quasi exercitationem repellat qui...   
3       1   4                               eum et est occaecati   
4       1   5                                 nesciunt quas odio   

                                                body  
0  quia et suscipit\nsuscipit recusandae consequu...  
1  est rerum tempore vitae\nsequi sint nihil repr...  
2  et iusto sed quo iure\nvoluptatem occaecati om...
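
Once the JSON list is in a DataFrame, the usual Pandas operations apply. The following is a minimal sketch, assuming records shaped like the `/posts` response; the three sample records and the name `posts_df` are made up for illustration so that the example runs without a network call.

```python
import pandas as pd

# Hypothetical records shaped like the list returned by response.json()
records = [
    {"userId": 1, "id": 1, "title": "first post"},
    {"userId": 1, "id": 2, "title": "second post"},
    {"userId": 2, "id": 3, "title": "third post"},
]

posts_df = pd.DataFrame(records)

# Count how many posts each user has written
posts_per_user = posts_df.groupby("userId")["id"].count()
print(posts_per_user)
```

With the real response, the same `groupby` call could be applied to `df` from the cell above.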

## 2. Handling API Gateway Specifics (Headers & Parameters)

An API Gateway often requires specific headers (e.g., `x-api-key` for API key authentication) or uses query parameters to filter or control the returned data. The `requests` library accepts both through the `headers` and `params` arguments.
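
To see what `requests` would actually send, a request can be prepared without being sent. This is a small sketch, not part of the original notebook: the `postId` filter and the `YOUR_API_KEY` value are illustrative assumptions, and no network call is made.

```python
import requests

# Illustrative values: a placeholder endpoint, an assumed filter, and a dummy key
url = "https://jsonplaceholder.typicode.com/comments"
params = {"postId": 1}
headers = {"x-api-key": "YOUR_API_KEY"}

# Build the request without sending it, so the final URL and headers can be inspected
prepared = requests.Request("GET", url, params=params, headers=headers).prepare()

print(prepared.url)                    # https://jsonplaceholder.typicode.com/comments?postId=1
print(prepared.headers["x-api-key"])   # YOUR_API_KEY
```

Passing a dict to `params` lets `requests` handle URL encoding, which is usually safer than concatenating the query string by hand.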

### Including Custom Headers (e.g., API Key)

If your API Gateway requires an API key, you'll typically pass it in a header named `x-api-key` or `Authorization`.

In [9]:
# Example: API key header (replace 'YOUR_API_KEY' with your actual key)
headers = {
    "x-api-key": "YOUR_API_KEY",
    "Content-Type": "application/json"
}

# The placeholder API does not check API keys and ignores the extra header,
# so this request succeeds even with a dummy key.
api_gateway_url_with_params = "https://jsonplaceholder.typicode.com/comments"

try:
    # In a real scenario, this URL would be your API Gateway endpoint requiring the key
    response_with_headers = requests.get(api_gateway_url_with_params, headers=headers, timeout=10)
    response_with_headers.raise_for_status()
    print("Request with headers successful (if API supported it)!")
    comments = response_with_headers.json()
    print(comments[:1])  # Show only the first record

except requests.exceptions.RequestException as e:
    print(f"Error making request with headers: {e}")
else:
    # Build the DataFrame only when the request succeeded
    df = pd.DataFrame(comments)
    print(df.head())
    print(df.describe())

Request with headers successful (if API supported it)!
[{'postId': 1, 'id': 1, 'name': 'id labore ex et quam laborum', 'email': 'Eliseo@gardner.biz', 'body': 'laudantium enim quasi est quidem magnam voluptate ipsam eos\ntempora quo necessitatibus\ndolor quam autem quasi\nreiciendis et nam sapiente accusantium'}]
   postId  id                                       name  \
0       1   1               id labore ex et quam laborum   
1       1   2  quo vero reiciendis velit similique earum   
2       1   3              odio adipisci rerum aut animi   
3       1   4                             alias odio sit   
4       1   5      vero eaque aliquid doloribus et culpa   

                    email                                               body  
0      Eliseo@gardner.biz  laudantium enim quasi est quidem magnam volupt...  
1  Jayne_Kuhic@sydney.com  est natus enim nihil est dolore omnis voluptat...  
2     Nikita@garfield.biz  quia molestiae reprehenderit quasi aspernatur\...  
3        

           postId          id
count  500.000000  500.000000
mean    50.500000  250.500000
std     28.894979  144.481833
min      1.000000    1.000000
25%     25.750000  125.750000
50%     50.500000  250.500000
75%     75.250000  375.250000
max    100.000000  500.000000
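
When many calls share the same API key, a `requests.Session` can carry the header for every request, and a retry policy can absorb temporary rate-limit or server errors. The sketch below is one possible setup, not the notebook's own method: `build_session`, the retry count, the backoff factor, and the status list are all assumptions to adjust for your gateway.

```python
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry


def build_session(api_key, retries=3):
    """Return a Session that sends the API key and retries transient failures."""
    session = requests.Session()
    # Every request made through this session includes the header
    session.headers.update({"x-api-key": api_key})

    # Retry on rate limiting (429) and common transient server errors
    retry = Retry(
        total=retries,
        backoff_factor=0.5,
        status_forcelist=[429, 500, 502, 503, 504],
    )
    adapter = HTTPAdapter(max_retries=retry)
    session.mount("https://", adapter)
    session.mount("http://", adapter)
    return session


# 'YOUR_API_KEY' is a placeholder; no request is sent here
session = build_session("YOUR_API_KEY")
print(session.headers["x-api-key"])
```

A call such as `session.get(url, timeout=10)` would then reuse the connection pool, the header, and the retry policy.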


In [17]:

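import requests

# Download a page's raw HTML and save it to disk
resp = requests.get("https://www.youtube.com", timeout=10)
resp.raise_for_status()
html_page = resp.text

# The context manager closes the file even if the write fails
with open("youtube.html", "w", encoding="utf-8") as fd:
    fd.write(html_page)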

In [27]:
import requests

# Call the same local endpoint four times and print each response
for _ in range(4):
    resp = requests.get("http://127.0.0.1:8000/api/new")
    print(resp)
    print(resp.text)


<Response [200]>
ABCD129
<Response [200]>
ABCD130
<Response [200]>
ABCD131
<Response [200]>
ABCD132


## 3. Using an API Key (OpenWeatherMap)

In [45]:
import os
import requests

# --- IMPORTANT: Never hard-code API keys; read them from environment variables ---
# Set the key in your shell before starting the notebook:
# export OPENWEATHER_API_KEY="your_actual_api_key_here"
OPENWEATHER_API_KEY = os.getenv("OPENWEATHER_API_KEY")

if not OPENWEATHER_API_KEY:
    # Stop here with a clear message instead of sending a request without a key
    raise RuntimeError(
        "OPENWEATHER_API_KEY environment variable not set. "
        "Get a free API key from OpenWeatherMap and export it first."
    )

BASE_URL = "https://api.openweathermap.org/data/2.5/weather"
CITY_NAME = "Hyderabad"
COUNTRY_CODE = "IN"  # India
UNITS = "metric"  # For Celsius temperatures

params = {
    "q": f"{CITY_NAME},{COUNTRY_CODE}",
    "appid": OPENWEATHER_API_KEY,
    "units": UNITS
}

try:
    response = requests.get(BASE_URL, params=params, timeout=10)
    response.raise_for_status()  # Raise an exception for HTTP errors (4xx or 5xx)
    weather_data = response.json()
    print(weather_data)
    # Individual fields can be read from the parsed dict, for example
    # weather_data["main"]["temp"] and weather_data["weather"][0]["description"].
except requests.exceptions.RequestException as e:
    print(f"Network or API request error: {e}")

{'coord': {'lon': 78.4744, 'lat': 17.3753}, 'weather': [{'id': 804, 'main': 'Clouds', 'description': 'overcast clouds', 'icon': '04n'}], 'base': 'stations', 'main': {'temp': 24.32, 'feels_like': 24.84, 'temp_min': 24.32, 'temp_max': 24.32, 'pressure': 1006, 'humidity': 78, 'sea_level': 1006, 'grnd_level': 942}, 'visibility': 10000, 'wind': {'speed': 7.2, 'deg': 269, 'gust': 12.93}, 'clouds': {'all': 100}, 'dt': 1750782637, 'sys': {'country': 'IN', 'sunrise': 1750724007, 'sunset': 1750771405}, 'timezone': 19800, 'id': 1269843, 'name': 'Hyderabad', 'cod': 200}
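
The printed dictionary is nested, so individual readings have to be looked up by key. Below is a minimal sketch of that step, assuming the response keeps the structure shown above. The `summarize_weather` helper is hypothetical, and the sample dict is a trimmed copy of the printed output so the example runs offline.

```python
# Trimmed copy of the response printed above
weather_data = {
    "weather": [{"id": 804, "main": "Clouds", "description": "overcast clouds", "icon": "04n"}],
    "main": {"temp": 24.32, "feels_like": 24.84, "humidity": 78},
    "name": "Hyderabad",
    "cod": 200,
}


def summarize_weather(data):
    """Return a one-line summary, or a fallback message if fields are missing."""
    try:
        temp = data["main"]["temp"]
        description = data["weather"][0]["description"]
    except (KeyError, IndexError, TypeError):
        return "Could not parse weather data; check the response structure."
    return f"{data.get('name', 'Unknown')}: {temp}°C, {description.capitalize()}"


print(summarize_weather(weather_data))  # Hyderabad: 24.32°C, Overcast clouds
```

Catching `KeyError` and `IndexError` keeps a changed or partial response from crashing the cell.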


In [13]:
# Requires: pip install google-auth-oauthlib google-api-python-client
import os
import pickle

from google_auth_oauthlib.flow import InstalledAppFlow
from google.auth.transport.requests import Request

# --- Configuration (replace with your Google Cloud project details) ---
CLIENT_SECRETS_FILE = 'client_secrets.json' # Download this from Google Cloud Console
SCOPES = ['https://www.googleapis.com/auth/drive.readonly'] # Example scope: read-only access to Google Drive

def authenticate_google_drive():
   creds = None
   # The file token.pickle stores the user's access and refresh tokens, and is
   # created automatically when the authorization flow completes for the first time.
   if os.path.exists('token.pickle'):
       with open('token.pickle', 'rb') as token:
           creds = pickle.load(token)

   # If there are no (valid) credentials available, let the user log in.
   if not creds or not creds.valid:
       if creds and creds.expired and creds.refresh_token:
           creds.refresh(Request())
       else:
           flow = InstalledAppFlow.from_client_secrets_file(
               CLIENT_SECRETS_FILE, SCOPES)
           creds = flow.run_local_server(port=0)
       # Save the credentials for the next run
       with open('token.pickle', 'wb') as token:
           pickle.dump(creds, token)
   return creds

def get_google_drive_files(credentials):
   """Lists the names and IDs of the first 10 files in Google Drive."""
   from googleapiclient.discovery import build
   service = build('drive', 'v3', credentials=credentials)

   results = service.files().list(pageSize=10, fields="nextPageToken, files(id, name)").execute()
   items = results.get('files', [])

   if not items:
       print('No files found.')
   else:
       print('Files:')
       for item in items:
           print(f"{item['name']} ({item['id']})")

if __name__ == "__main__":
   print("Initiating Google Drive OAuth flow...")
   credentials = authenticate_google_drive()
   if credentials:
       print("\nSuccessfully authenticated with Google Drive!")
       get_google_drive_files(credentials)
   else:
       print("Authentication failed.")

ModuleNotFoundError: No module named 'google_auth_oauthlib'

In [14]:
!pip install google_auth_oauthlib

Collecting google_auth_oauthlib
  Downloading google_auth_oauthlib-1.2.2-py3-none-any.whl.metadata (2.7 kB)
Collecting google-auth>=2.15.0 (from google_auth_oauthlib)
  Downloading google_auth-2.40.3-py2.py3-none-any.whl.metadata (6.2 kB)
Collecting requests-oauthlib>=0.7.0 (from google_auth_oauthlib)
  Downloading requests_oauthlib-2.0.0-py2.py3-none-any.whl.metadata (11 kB)
Collecting rsa<5,>=3.1.4 (from google-auth>=2.15.0->google_auth_oauthlib)
  Downloading rsa-4.9.1-py3-none-any.whl.metadata (5.6 kB)
Collecting oauthlib>=3.0.0 (from requests-oauthlib>=0.7.0->google_auth_oauthlib)
  Downloading oauthlib-3.3.1-py3-none-any.whl.metadata (7.9 kB)
Downloading google_auth_oauthlib-1.2.2-py3-none-any.whl (19 kB)
Downloading google_auth-2.40.3-py2.py3-none-any.whl (216 kB)
Downloading requests_oauthlib-2.0.0-py2.py3-none-any.whl (24 kB)
Downloading oauthlib-3.3.1-py3-none-any.whl (160 kB)
Downloading rsa-4.9.1-py3-none-any.whl (34 kB)
Installing collected packages: rsa, oauthlib, request

## Conclusion

You've learned how to make HTTP GET requests to an API Gateway (or any REST API) using Python's `requests` library. We covered including headers and query parameters, robust error handling, and seamlessly converting the JSON response into a Pandas DataFrame for further analysis. This forms the foundation for building data pipelines that consume data from web services.

In [93]:
import numpy as np
import pandas as pd

# Sample data: price is derived as sales * 100
l1 = np.array([12,23,45,6,7,8,9,0,1,2,3,44,3,5,6,77,88,0,4,33,2,2,2,2,2])
price = l1 * 100
dates = ["Day" + str(num) for num in range(1, len(l1) + 1)]
df1 = pd.DataFrame({"sales": l1, "price": price}, index=["Item" + str(num) for num in range(1, len(l1) + 1)])
df1["dates"] = dates

# Group by date, sum the prices, and keep the 10 highest totals
df2 = df1.groupby("dates")["price"].sum().sort_values(ascending=False)
top_days = df2[:10]
print(top_days)

dates
Day17    8800
Day16    7700
Day3     4500
Day12    4400
Day20    3300
Day2     2300
Day1     1200
Day7      900
Day6      800
Day5      700
Name: price, dtype: int64


In [92]:
%matplotlib inline
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Sample data: price is derived as sales * 100
l1 = np.array([12,23,45,6,7,8,9,0,1,2,3,44,3,5,6,77,88,0,4,33,2,2,2,2,2])
price = l1 * 100
dates = ["Day" + str(num) for num in range(1, len(l1) + 1)]
df1 = pd.DataFrame({"sales": l1, "price": price}, index=["Item" + str(num) for num in range(1, len(l1) + 1)])
df1["dates"] = dates

# Group by date, sum the prices, and keep the 10 highest totals
df2 = df1.groupby("dates")["price"].sum().sort_values(ascending=False)
top_days = df2[:10]
print(top_days)

# top_days is a Series: the dates are its index and the totals are its values,
# so indexing it with ['dates'] raises KeyError
plt.figure(figsize=(9, 6))
plt.scatter(top_days.index, top_days.values, alpha=0.7, color="teal", s=100)  # s sets the marker size
plt.title("Top 10 Days by Total Price")
plt.xlabel("Day")
plt.ylabel("Total price")
plt.grid(True, linestyle="--", alpha=0.6)
plt.show()