# Ingesting Data from REST API and CSV in Google Colab
This lab exercise demonstrates how to ingest data from a public REST API and from a CSV file hosted at a public URL. It is designed to run directly in Google Colab.

## Objectives
- Practice loading data from a REST API using Python.
- Load and inspect data from a CSV file.
- Compare the similarities and differences between API and CSV ingestion.

In [None]:
# Step 1: Set Up the Environment
# Install required libraries (requests and pandas)
!pip install requests pandas -q

In [1]:
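# Step 2: Ingest Data from a Public REST API
# Example: Fetch country metadata (country ID 42) from the OpenAQ v3 API.
import os

import requests
import pandas as pd

# The /countries/{id} endpoint returns a single country, so no query parameters are sent.
url = "https://api.openaq.org/v3/countries/42"
headers = {
    # Never hard-code API keys in a notebook. This assumes the key was stored
    # beforehand in an environment variable named OPENAQ_API_KEY (name chosen here).
    "X-API-Key": os.environ["OPENAQ_API_KEY"]
}

response = requests.get(url, headers=headers, timeout=30)
response.raise_for_status()  # Stop early on 4xx/5xx instead of parsing an error body
data = response.json()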
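
A request can fail for reasons unrelated to the notebook code, such as a network problem, an invalid key, or rate limiting. The sketch below shows one way to handle those cases. It is not part of the original lab: `fetch_json` is a hypothetical helper name, and the 30-second timeout is an arbitrary choice.

```python
import requests


def fetch_json(url, params=None, headers=None, timeout=30):
    """Return the decoded JSON body, or None if the request fails.

    Hypothetical helper for illustration; not part of the original lab.
    """
    try:
        response = requests.get(url, params=params, headers=headers, timeout=timeout)
        response.raise_for_status()  # raises requests.HTTPError on 4xx/5xx
        return response.json()
    except (requests.RequestException, ValueError) as exc:
        # RequestException covers connection errors, timeouts, malformed URLs,
        # and HTTP error statuses; ValueError covers a body that is not valid JSON.
        print(f"Could not load JSON from {url!r}: {exc}")
        return None
```

Returning `None` keeps the notebook running so the failure can be inspected; in a pipeline it may be preferable to let the exception propagate.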




In [3]:
# Convert the results to a DataFrame
df_api = pd.json_normalize(data['results'], sep='_')
print("Data from REST API:")
print(df_api.head())


Data from REST API:
   id code        name         datetimeFirst                 datetimeLast  \
0  42   KZ  Kazakhstan  2018-07-27T17:00:00Z  2025-08-04T11:06:37.167000Z   

                                          parameters  
0  [{'id': 1, 'name': 'pm10', 'units': 'µg/m³', '...  
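
The `parameters` column above still holds a list of dictionaries per row. One possible way to expand it into one row per parameter is `pd.json_normalize` with `record_path` and `meta`. The sketch below uses a hand-written record shaped like the printed response; the second parameter entry and the `country_` prefix are assumptions made for illustration, not real API output.

```python
import pandas as pd

# Hypothetical record shaped like the response printed above; the values are
# placeholders for illustration, not real API output.
sample_results = [
    {
        "id": 42,
        "code": "KZ",
        "name": "Kazakhstan",
        "parameters": [
            {"id": 1, "name": "pm10", "units": "µg/m³"},
            {"id": 2, "name": "pm25", "units": "µg/m³"},
        ],
    }
]

# One output row per entry in "parameters". The country-level fields are
# repeated on each row, prefixed so they do not collide with the parameter
# fields that share the names "id" and "name".
df_params = pd.json_normalize(
    sample_results,
    record_path="parameters",
    meta=["id", "code", "name"],
    meta_prefix="country_",
)
print(df_params)
```

The prefix matters here: without it, the country-level `id` and `name` would clash with the parameter-level fields of the same name.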


In [None]:
# Step 3: Ingest Data from a CSV File
# Example using the Uber pickup data for April 2014 from FiveThirtyEight's GitHub repository (direct CSV link):

csv_url = "https://raw.githubusercontent.com/fivethirtyeight/uber-tlc-foil-response/master/uber-trip-data/uber-raw-data-apr14.csv"
df_csv = pd.read_csv(csv_url)
print("\nData from CSV File:")
print(df_csv.head())



Data from CSV File:
          Date/Time      Lat      Lon    Base
0  4/1/2014 0:11:00  40.7690 -73.9549  B02512
1  4/1/2014 0:17:00  40.7267 -74.0345  B02512
2  4/1/2014 0:21:00  40.7316 -73.9873  B02512
3  4/1/2014 0:28:00  40.7588 -73.9776  B02512
4  4/1/2014 0:33:00  40.7594 -73.9722  B02512
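
`pd.read_csv` loads the `Date/Time` column as plain text, so a common next step is converting it to a datetime type. The sketch below inlines the five rows printed above so it runs without a download. The month-first format string is an assumption based on the file covering April 2014, and `df_sample` is a name introduced here.

```python
import io

import pandas as pd

# The five rows printed above, inlined so the sketch runs without a download.
sample_csv = io.StringIO(
    "Date/Time,Lat,Lon,Base\n"
    "4/1/2014 0:11:00,40.7690,-73.9549,B02512\n"
    "4/1/2014 0:17:00,40.7267,-74.0345,B02512\n"
    "4/1/2014 0:21:00,40.7316,-73.9873,B02512\n"
    "4/1/2014 0:28:00,40.7588,-73.9776,B02512\n"
    "4/1/2014 0:33:00,40.7594,-73.9722,B02512\n"
)
df_sample = pd.read_csv(sample_csv)

# Assumes month/day/year order; an explicit format avoids ambiguous parsing.
df_sample["Date/Time"] = pd.to_datetime(
    df_sample["Date/Time"], format="%m/%d/%Y %H:%M:%S"
)
print(df_sample.dtypes)
```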


In [None]:
# Step 4: Inspect and Compare Data
# Perform basic inspection on both data sources.
# Inspect columns and info for both DataFrames
print("API Data Columns:", df_api.columns)
print("CSV Data Columns:", df_csv.columns)

print("\nAPI Data Info:")
print(df_api.info())

print("\nCSV Data Info:")
print(df_csv.info())


API Data Columns: Index(['id', 'code', 'name', 'datetimeFirst', 'datetimeLast', 'parameters'], dtype='object')
CSV Data Columns: Index(['Date/Time', 'Lat', 'Lon', 'Base'], dtype='object')

API Data Info:
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1 entries, 0 to 0
Data columns (total 6 columns):
 #   Column         Non-Null Count  Dtype 
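
To compare the two sources side by side rather than reading two separate `info()` printouts, the basic facts about each DataFrame can be collected into one small table. This is a minimal sketch: `summarize` is a hypothetical helper, and the two small frames are stand-ins for `df_api` and `df_csv` so the example runs on its own.

```python
import pandas as pd


def summarize(df, source):
    """Return basic shape and quality facts about a DataFrame (hypothetical helper)."""
    return {
        "source": source,
        "rows": len(df),
        "columns": df.shape[1],
        "numeric_columns": df.select_dtypes(include="number").shape[1],
        "missing_values": int(df.isna().sum().sum()),
    }


# Small stand-ins for df_api and df_csv so the sketch is self-contained.
api_like = pd.DataFrame({"id": [42], "code": ["KZ"], "name": ["Kazakhstan"]})
csv_like = pd.DataFrame(
    {
        "Lat": [40.7690, 40.7267],
        "Lon": [-73.9549, -74.0345],
        "Base": ["B02512", "B02512"],
    }
)

comparison = pd.DataFrame(
    [summarize(api_like, "api"), summarize(csv_like, "csv")]
)
print(comparison)
```

In the notebook itself, the same helper could be called as `summarize(df_api, "api")` and `summarize(df_csv, "csv")`.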

## Step 5: Reflection Questions
At the end of your notebook, answer these questions:

- What were the main steps for ingesting data from a REST API vs. a CSV?

- What are some possible challenges or error scenarios for each ingestion method?

- For your workflow, when would you prefer an API vs. a CSV file?

In [None]:
# (Optional) Save Your Results
# Save both datasets as CSV files in the Colab working directory
df_api.to_csv("openaq_api.csv", index=False)
df_csv.to_csv("uber_csv.csv", index=False)

## Deliverables

- Code snippets and output for REST API and CSV ingestion.

- Completed reflection answers.