# 1. Introduction to Data Integration and APIs in Data Engineering


### 1.1 Importance in Data Engineering

<font size = "4">Data integration is central to data engineering, particularly as the volume of business data keeps growing. It provides a unified view of data from multiple sources, which supports collaboration and improves data accuracy, both of which matter for strategic decision-making. Modern cloud data platforms also play a significant role in data management by helping to maintain data quality and trustworthiness (<a href="https://www.techtarget.com/searchdatamanagement/feature/Effective-integration-key-to-creating-trusted-data">Source1</a>, <a href="https://dev.to/k_ndrick/data-engineering-for-beginners-a-step-by-step-guide-3d1f">Source2</a>).</font>


<font size = "4">The example below fetches data from a single API endpoint and displays it, a basic but fundamental step in data integration.</font><br><br>

In [57]:
import requests
import pandas as pd

# Define the API endpoint and parameters
api_endpoint = "https://api.openaq.org/v2/latest"
params = {
    "city": "London",
    "limit": 100
}
headers = {
    "X-API-Key": "YOUR_OPENAQ_API_KEY"  # placeholder; never commit a real key
}

# Make a GET request to the API
response = requests.get(api_endpoint, params=params, headers=headers)

# Check if the request was successful
if response.status_code == 200:
    # Parse the JSON response
    data = response.json()
    
    # Load data into a pandas DataFrame
    df = pd.DataFrame(data['results'])
    
    # Keep only the first measurement reported by each location
    measurements_df = pd.DataFrame([x['measurements'][0] for x in data['results']])
    
    # Perform data cleaning steps (like handling missing values)
    measurements_df.dropna(inplace=True)
    
    # Perform basic data analysis (like calculating the average value of measurements)
    average_measurement = measurements_df['value'].mean()
    print(f"The average measurement value is: {average_measurement}")

else:
    print(f"Failed to retrieve data. HTTP Status code: {response.status_code}")

The average measurement value is: 44.0136


<br><font size="4">The <code>measurements_df</code> DataFrame built above can be inspected directly, as shown below.</font><br><br>

In [58]:
measurements_df

Unnamed: 0,parameter,value,lastUpdated,unit
0,pm10,7.00,2024-02-10T12:04:13+00:00,µg/m³
1,pm10,15.00,2024-02-10T12:04:08+00:00,µg/m³
2,pm10,10.00,2024-02-10T12:04:08+00:00,µg/m³
3,pm10,14.00,2024-02-10T12:04:08+00:00,µg/m³
4,pm10,4.00,2024-02-10T12:04:07+00:00,µg/m³
...,...,...,...,...
95,humidity,64.00,2024-02-09T19:13:46+00:00,%
96,um003,0.03,2024-02-09T19:13:46+00:00,particles/cm³
97,pm1,1.30,2024-02-09T19:13:46+00:00,µg/m³
98,humidity,59.00,2024-02-09T19:13:46+00:00,%
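
<font size="4">The table mixes parameters that use different units (µg/m³, %, particles/cm³), so a single overall mean is hard to interpret. The sketch below averages the values per parameter instead. The <code>sample</code> frame holds a few rows copied from the table above, and <code>mean_by_parameter</code> is a helper name introduced here for illustration only.</font><br><br>

```python
import pandas as pd

# A few rows copied from the measurements_df output above
sample = pd.DataFrame(
    {
        "parameter": ["pm10", "pm10", "pm10", "pm10", "pm10",
                      "humidity", "um003", "pm1", "humidity"],
        "value": [7.0, 15.0, 10.0, 14.0, 4.0, 64.0, 0.03, 1.30, 59.0],
        "unit": ["µg/m³"] * 5 + ["%", "particles/cm³", "µg/m³", "%"],
    }
)


def mean_by_parameter(df):
    """Average `value` separately for each (parameter, unit) pair."""
    return (
        df.groupby(["parameter", "unit"], as_index=False)["value"]
        .mean()
        .rename(columns={"value": "mean_value"})
    )


per_parameter = mean_by_parameter(sample)
print(per_parameter)
```

<font size="4">The same helper could be applied to the full <code>measurements_df</code>, assuming it keeps the <code>parameter</code>, <code>value</code> and <code>unit</code> columns shown above.</font><br><br>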


<br><font size="4">The next example queries the <code>parks</code> endpoint of the National Park Service (NPS) API.</font><br><br>

In [7]:
import requests
import json
import pandas as pd

def fetch_data_from_api(api_url):
    """
    Send a GET request to the given API URL and return the parsed JSON.
    Raises requests.HTTPError if the server responds with an error status,
    so callers never receive an error string in place of data.
    """
    response = requests.get(api_url, timeout=10)
    response.raise_for_status()
    return response.json()

# Keep the API key in its own variable (placeholder; substitute your own key)
api_key = "YOUR_NPS_API_KEY"

# API endpoint to fetch data, including the API key as a query parameter
api_url = f'https://developer.nps.gov/api/v1/parks?limit=1&api_key={api_key}'

# Fetch data from the API
data = fetch_data_from_api(api_url)

<br><font size = "4">Here we pretty-print the JSON to show the nested nature of the returned data.</font><br><br>

In [8]:
# Pretty print the JSON data
print("Pretty Printed JSON Data:")
print(json.dumps(data, indent=4))

Pretty Printed JSON Data:
{
    "total": "471",
    "limit": "1",
    "start": "0",
    "data": [
        {
            "id": "77E0D7F0-1942-494A-ACE2-9004D2BDC59E",
            "url": "https://www.nps.gov/abli/index.htm",
            "fullName": "Abraham Lincoln Birthplace National Historical Park",
            "parkCode": "abli",
            "description": "For over a century people from around the world have come to rural Central Kentucky to honor the humble beginnings of our 16th president, Abraham Lincoln. His early life on Kentucky's frontier shaped his character and prepared him to lead the nation through Civil War. Visit our country's first memorial to Lincoln, built with donations from young and old, and the site of his childhood home.",
            "latitude": "37.5858662",
            "longitude": "-85.67330523",
            "latLong": "lat:37.5858662, long:-85.67330523",
            "activities": [
                {
                    "id": "13A57703-BB1A-41A2-94B8-53B692E

<br><font size = "4">As the output shows, the JSON has a multi-level parent-child structure, so a single call to <code>json_normalize</code> is not enough to analyze the data. Instead, we denormalize the nested elements dynamically, prefixing each child column with its parent key name as the DataFrame expands.</font><br><br>

<font size = "4">For example, <code>activities</code> contains nested elements with the keys <code>id</code> and <code>name</code>, which we flatten to <code>activities_id</code> and <code>activities_name</code>. The same is done repeatedly for every other column that holds nested key-value pairs.</font><br><br>
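
<font size = "4">For a single nested column, the renaming described above can be sketched with the <code>record_path</code>, <code>meta</code> and <code>record_prefix</code> arguments of <code>pd.json_normalize</code>. The <code>park</code> dictionary is a trimmed-down, hand-written stand-in for one API record (an assumption for illustration), using only fields that appear in the output above.</font><br><br>

```python
import pandas as pd

# Trimmed-down stand-in for one park record (illustrative, not a full API response)
park = {
    "parkCode": "abli",
    "fullName": "Abraham Lincoln Birthplace National Historical Park",
    "activities": [
        {"id": "13A57703-BB1A-41A2-94B8-53B692EB7238", "name": "Astronomy"},
        {"id": "43800AD1-D439-40F3-AAB3-9FB651FE45BB", "name": "Gift Shop and Souvenirs"},
    ],
}

# One row per activity; child keys get the parent name as a prefix,
# and parkCode is carried along so rows can be traced back to the park
activities_df = pd.json_normalize(
    park,
    record_path="activities",
    meta=["parkCode"],
    record_prefix="activities_",
)
print(activities_df)
```

<font size = "4">This handles one known column at a time. The function in the next cell generalizes the idea to every list-valued column it finds.</font><br><br>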

In [24]:
# Extract the 'data' field which contains the information we want to normalize
data_to_normalize = data.get('data', [{}])[0]

# Convert the nested JSON structures within each column to a flat structure
main_df = pd.json_normalize(data_to_normalize)

def extract_nested_data(df):
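    """
    Takes a DataFrame as input and repeatedly scans its columns for list-valued
    (nested) entries, expanding each one into prefixed columns. The loop runs
    until no list-valued columns remain. Only the first row of each column is
    inspected to decide whether it is nested. Every expanded list is joined back
    onto the frame, so row counts multiply as columns are expanded, which is why
    a single park record can grow to thousands of rows.
    """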
    while True:
        nested_cols = [col for col in df.columns if isinstance(df[col].iloc[0], list)]
        
        if not nested_cols:
            break
        
        for col in nested_cols:
            expanded_col_df = pd.json_normalize(df[col].explode().apply(lambda x: x if isinstance(x, dict) else {}))
            expanded_col_df.index = df.index.repeat(df[col].apply(len))
            prefix = f'{col}_'
            expanded_col_df = expanded_col_df.add_prefix(prefix)
            df = df.drop(columns=[col])
            df = df.join(expanded_col_df).reset_index(drop=True)
    
    return df

# Apply the function to denormalize all nested elements in the DataFrame
flattened_df = extract_nested_data(main_df)
flattened_df

Unnamed: 0,id,url,fullName,parkCode,description,latitude,longitude,latLong,states,directionsInfo,directionsUrl,weatherInfo,name,designation,relevanceScore,activities_id,activities_name,topics_id,topics_name,operatingHours_description,operatingHours_name,operatingHours_standardHours.wednesday,operatingHours_standardHours.monday,operatingHours_standardHours.thursday,operatingHours_standardHours.sunday,operatingHours_standardHours.tuesday,operatingHours_standardHours.friday,operatingHours_standardHours.saturday,addresses_postalCode,addresses_city,addresses_stateCode,addresses_countryCode,addresses_provinceTerritoryCode,addresses_line1,addresses_type,addresses_line3,addresses_line2,images_credit,images_title,images_altText,images_caption,images_url,contacts.phoneNumbers_phoneNumber,contacts.phoneNumbers_description,contacts.phoneNumbers_extension,contacts.phoneNumbers_type,contacts.emailAddresses_description,contacts.emailAddresses_emailAddress,operatingHours_exceptions_startDate,operatingHours_exceptions_name,operatingHours_exceptions_endDate,operatingHours_exceptions_exceptionHours.wednesday,operatingHours_exceptions_exceptionHours.monday,operatingHours_exceptions_exceptionHours.thursday,operatingHours_exceptions_exceptionHours.sunday,operatingHours_exceptions_exceptionHours.tuesday,operatingHours_exceptions_exceptionHours.friday,operatingHours_exceptions_exceptionHours.saturday
0,77E0D7F0-1942-494A-ACE2-9004D2BDC59E,https://www.nps.gov/abli/index.htm,Abraham Lincoln Birthplace National Historical Park,abli,"For over a century people from around the world have come to rural Central Kentucky to honor the humble beginnings of our 16th president, Abraham Lincoln. His early life on Kentucky's frontier shaped his character and prepared him to lead the nation through Civil War. Visit our country's first memorial to Lincoln, built with donations from young and old, and the site of his childhood home.",37.5858662,-85.67330523,"lat:37.5858662, long:-85.67330523",KY,The Birthplace Unit of the park is located approximately 2 miles south of the town of Hodgenville on U.S. Highway 31E South. The Boyhood Home Unit at Knob Creek is located approximately 10 miles northeast of the Birthplace Unit of the park.,http://www.nps.gov/abli/planyourvisit/directions.htm,"There are four distinct seasons in Central Kentucky. However, temperature and weather conditions can vary widely within those seasons. Spring and Fall are generally pleasant with frequent rain showers. Summer is usually hot and humid. 
Winter is moderately cold with mixed precipitation.",Abraham Lincoln Birthplace,National Historical Park,1.0,13A57703-BB1A-41A2-94B8-53B692EB7238,Astronomy,D10852A3-443C-4743-A5FA-6DD6D2A054B3,Birthplace,Memorial Building:\nopen 9:00 am - 4:30 pm eastern time.\n\nBirthplace Unit Visitor Center and Grounds: \nopen 9:00 am - 5:00 pm eastern time.,Birthplace Unit,9:00AM - 5:00PM,9:00AM - 5:00PM,9:00AM - 5:00PM,9:00AM - 5:00PM,9:00AM - 5:00PM,9:00AM - 5:00PM,9:00AM - 5:00PM,42748,Hodgenville,KY,US,,2995 Lincoln Farm Road,Physical,,,NPS Photo,The Memorial Building with fall colors,The Memorial Building surrounded by fall colors,"Over 200,000 people a year come to walk up the steps of the Memorial Building to visit the site where Abraham Lincoln was born",https://www.nps.gov/common/uploads/structured_data/3C861078-1DD8-B71B-0B774A242EF6A706.jpg,2703583137,,,Voice,,ABLI_Administration@nps.gov,2024-11-28,Park is Closed,2024-11-28,,,,,,,
1,77E0D7F0-1942-494A-ACE2-9004D2BDC59E,https://www.nps.gov/abli/index.htm,Abraham Lincoln Birthplace National Historical Park,abli,"For over a century people from around the world have come to rural Central Kentucky to honor the humble beginnings of our 16th president, Abraham Lincoln. His early life on Kentucky's frontier shaped his character and prepared him to lead the nation through Civil War. Visit our country's first memorial to Lincoln, built with donations from young and old, and the site of his childhood home.",37.5858662,-85.67330523,"lat:37.5858662, long:-85.67330523",KY,The Birthplace Unit of the park is located approximately 2 miles south of the town of Hodgenville on U.S. Highway 31E South. The Boyhood Home Unit at Knob Creek is located approximately 10 miles northeast of the Birthplace Unit of the park.,http://www.nps.gov/abli/planyourvisit/directions.htm,"There are four distinct seasons in Central Kentucky. However, temperature and weather conditions can vary widely within those seasons. Spring and Fall are generally pleasant with frequent rain showers. Summer is usually hot and humid. 
Winter is moderately cold with mixed precipitation.",Abraham Lincoln Birthplace,National Historical Park,1.0,13A57703-BB1A-41A2-94B8-53B692EB7238,Astronomy,D10852A3-443C-4743-A5FA-6DD6D2A054B3,Birthplace,Memorial Building:\nopen 9:00 am - 4:30 pm eastern time.\n\nBirthplace Unit Visitor Center and Grounds: \nopen 9:00 am - 5:00 pm eastern time.,Birthplace Unit,9:00AM - 5:00PM,9:00AM - 5:00PM,9:00AM - 5:00PM,9:00AM - 5:00PM,9:00AM - 5:00PM,9:00AM - 5:00PM,9:00AM - 5:00PM,42748,Hodgenville,KY,US,,2995 Lincoln Farm Road,Physical,,,NPS Photo,The Memorial Building with fall colors,The Memorial Building surrounded by fall colors,"Over 200,000 people a year come to walk up the steps of the Memorial Building to visit the site where Abraham Lincoln was born",https://www.nps.gov/common/uploads/structured_data/3C861078-1DD8-B71B-0B774A242EF6A706.jpg,2703583137,,,Voice,,ABLI_Administration@nps.gov,2024-12-25,Park is Closed,2024-12-25,,,,,,,
2,77E0D7F0-1942-494A-ACE2-9004D2BDC59E,https://www.nps.gov/abli/index.htm,Abraham Lincoln Birthplace National Historical Park,abli,"For over a century people from around the world have come to rural Central Kentucky to honor the humble beginnings of our 16th president, Abraham Lincoln. His early life on Kentucky's frontier shaped his character and prepared him to lead the nation through Civil War. Visit our country's first memorial to Lincoln, built with donations from young and old, and the site of his childhood home.",37.5858662,-85.67330523,"lat:37.5858662, long:-85.67330523",KY,The Birthplace Unit of the park is located approximately 2 miles south of the town of Hodgenville on U.S. Highway 31E South. The Boyhood Home Unit at Knob Creek is located approximately 10 miles northeast of the Birthplace Unit of the park.,http://www.nps.gov/abli/planyourvisit/directions.htm,"There are four distinct seasons in Central Kentucky. However, temperature and weather conditions can vary widely within those seasons. Spring and Fall are generally pleasant with frequent rain showers. Summer is usually hot and humid. 
Winter is moderately cold with mixed precipitation.",Abraham Lincoln Birthplace,National Historical Park,1.0,13A57703-BB1A-41A2-94B8-53B692EB7238,Astronomy,D10852A3-443C-4743-A5FA-6DD6D2A054B3,Birthplace,Memorial Building:\nopen 9:00 am - 4:30 pm eastern time.\n\nBirthplace Unit Visitor Center and Grounds: \nopen 9:00 am - 5:00 pm eastern time.,Birthplace Unit,9:00AM - 5:00PM,9:00AM - 5:00PM,9:00AM - 5:00PM,9:00AM - 5:00PM,9:00AM - 5:00PM,9:00AM - 5:00PM,9:00AM - 5:00PM,42748,Hodgenville,KY,US,,2995 Lincoln Farm Road,Physical,,,NPS Photo,The Memorial Building with fall colors,The Memorial Building surrounded by fall colors,"Over 200,000 people a year come to walk up the steps of the Memorial Building to visit the site where Abraham Lincoln was born",https://www.nps.gov/common/uploads/structured_data/3C861078-1DD8-B71B-0B774A242EF6A706.jpg,2703583137,,,Voice,,ABLI_Administration@nps.gov,2025-01-01,Park is Closed,2025-01-01,,,,,,,
3,77E0D7F0-1942-494A-ACE2-9004D2BDC59E,https://www.nps.gov/abli/index.htm,Abraham Lincoln Birthplace National Historical Park,abli,"For over a century people from around the world have come to rural Central Kentucky to honor the humble beginnings of our 16th president, Abraham Lincoln. His early life on Kentucky's frontier shaped his character and prepared him to lead the nation through Civil War. Visit our country's first memorial to Lincoln, built with donations from young and old, and the site of his childhood home.",37.5858662,-85.67330523,"lat:37.5858662, long:-85.67330523",KY,The Birthplace Unit of the park is located approximately 2 miles south of the town of Hodgenville on U.S. Highway 31E South. The Boyhood Home Unit at Knob Creek is located approximately 10 miles northeast of the Birthplace Unit of the park.,http://www.nps.gov/abli/planyourvisit/directions.htm,"There are four distinct seasons in Central Kentucky. However, temperature and weather conditions can vary widely within those seasons. Spring and Fall are generally pleasant with frequent rain showers. Summer is usually hot and humid. 
Winter is moderately cold with mixed precipitation.",Abraham Lincoln Birthplace,National Historical Park,1.0,13A57703-BB1A-41A2-94B8-53B692EB7238,Astronomy,D10852A3-443C-4743-A5FA-6DD6D2A054B3,Birthplace,Memorial Building:\nopen 9:00 am - 4:30 pm eastern time.\n\nBirthplace Unit Visitor Center and Grounds: \nopen 9:00 am - 5:00 pm eastern time.,Birthplace Unit,9:00AM - 5:00PM,9:00AM - 5:00PM,9:00AM - 5:00PM,9:00AM - 5:00PM,9:00AM - 5:00PM,9:00AM - 5:00PM,9:00AM - 5:00PM,42748,Hodgenville,KY,US,,2995 Lincoln Farm Road,Physical,,,NPS Photo,The Memorial Building with fall colors,The Memorial Building surrounded by fall colors,"Over 200,000 people a year come to walk up the steps of the Memorial Building to visit the site where Abraham Lincoln was born",https://www.nps.gov/common/uploads/structured_data/3C861078-1DD8-B71B-0B774A242EF6A706.jpg,2703583874,,,Fax,,ABLI_Administration@nps.gov,2024-11-28,Park is Closed,2024-11-28,,,,,,,
4,77E0D7F0-1942-494A-ACE2-9004D2BDC59E,https://www.nps.gov/abli/index.htm,Abraham Lincoln Birthplace National Historical Park,abli,"For over a century people from around the world have come to rural Central Kentucky to honor the humble beginnings of our 16th president, Abraham Lincoln. His early life on Kentucky's frontier shaped his character and prepared him to lead the nation through Civil War. Visit our country's first memorial to Lincoln, built with donations from young and old, and the site of his childhood home.",37.5858662,-85.67330523,"lat:37.5858662, long:-85.67330523",KY,The Birthplace Unit of the park is located approximately 2 miles south of the town of Hodgenville on U.S. Highway 31E South. The Boyhood Home Unit at Knob Creek is located approximately 10 miles northeast of the Birthplace Unit of the park.,http://www.nps.gov/abli/planyourvisit/directions.htm,"There are four distinct seasons in Central Kentucky. However, temperature and weather conditions can vary widely within those seasons. Spring and Fall are generally pleasant with frequent rain showers. Summer is usually hot and humid. 
Winter is moderately cold with mixed precipitation.",Abraham Lincoln Birthplace,National Historical Park,1.0,13A57703-BB1A-41A2-94B8-53B692EB7238,Astronomy,D10852A3-443C-4743-A5FA-6DD6D2A054B3,Birthplace,Memorial Building:\nopen 9:00 am - 4:30 pm eastern time.\n\nBirthplace Unit Visitor Center and Grounds: \nopen 9:00 am - 5:00 pm eastern time.,Birthplace Unit,9:00AM - 5:00PM,9:00AM - 5:00PM,9:00AM - 5:00PM,9:00AM - 5:00PM,9:00AM - 5:00PM,9:00AM - 5:00PM,9:00AM - 5:00PM,42748,Hodgenville,KY,US,,2995 Lincoln Farm Road,Physical,,,NPS Photo,The Memorial Building with fall colors,The Memorial Building surrounded by fall colors,"Over 200,000 people a year come to walk up the steps of the Memorial Building to visit the site where Abraham Lincoln was born",https://www.nps.gov/common/uploads/structured_data/3C861078-1DD8-B71B-0B774A242EF6A706.jpg,2703583874,,,Fax,,ABLI_Administration@nps.gov,2024-12-25,Park is Closed,2024-12-25,,,,,,,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
10795,77E0D7F0-1942-494A-ACE2-9004D2BDC59E,https://www.nps.gov/abli/index.htm,Abraham Lincoln Birthplace National Historical Park,abli,"For over a century people from around the world have come to rural Central Kentucky to honor the humble beginnings of our 16th president, Abraham Lincoln. His early life on Kentucky's frontier shaped his character and prepared him to lead the nation through Civil War. Visit our country's first memorial to Lincoln, built with donations from young and old, and the site of his childhood home.",37.5858662,-85.67330523,"lat:37.5858662, long:-85.67330523",KY,The Birthplace Unit of the park is located approximately 2 miles south of the town of Hodgenville on U.S. Highway 31E South. The Boyhood Home Unit at Knob Creek is located approximately 10 miles northeast of the Birthplace Unit of the park.,http://www.nps.gov/abli/planyourvisit/directions.htm,"There are four distinct seasons in Central Kentucky. However, temperature and weather conditions can vary widely within those seasons. Spring and Fall are generally pleasant with frequent rain showers. Summer is usually hot and humid. 
Winter is moderately cold with mixed precipitation.",Abraham Lincoln Birthplace,National Historical Park,1.0,43800AD1-D439-40F3-AAB3-9FB651FE45BB,Gift Shop and Souvenirs,A7359FC4-DAD8-45F5-AF15-7FF62F816ED3,Night Sky,"The Boyhood Home Unit at Knob Creek Grounds:\nopen daily dawn to dusk.\n\nKnob Creek Tavern Visitor Center:\nopen on weekends in April, May, September and October and 5 days a week from Memorial Day to Labor day from 10 am to 4pm (Thursday - Monday) closed for the winter.",Boyhood Unit,Closed,10:00AM - 4:00PM,10:00AM - 4:00PM,10:00AM - 4:00PM,Closed,10:00AM - 4:00PM,10:00AM - 4:00PM,42748,Hodgenville,KY,US,,2995 Lincoln Farm Road,Mailing,,,NPS Photo,The Symbolic Birth Cabin of Abraham Lincoln,The symbolic birth cabin on the traditional site of the birth of Abraham Lincoln.,The symbolic birth cabin of Abraham Lincoln.,https://www.nps.gov/common/uploads/structured_data/3C86137D-1DD8-B71B-0B978BACD7EBAEF1.jpg,2703583874,,,Fax,,ABLI_Administration@nps.gov,2024-09-01,Fall Hours - Knob Creek Visitor Center,2024-10-31,Closed,Closed,Closed,10:00AM - 4:00PM,Closed,Closed,10:00AM - 4:00PM
10796,77E0D7F0-1942-494A-ACE2-9004D2BDC59E,https://www.nps.gov/abli/index.htm,Abraham Lincoln Birthplace National Historical Park,abli,"For over a century people from around the world have come to rural Central Kentucky to honor the humble beginnings of our 16th president, Abraham Lincoln. His early life on Kentucky's frontier shaped his character and prepared him to lead the nation through Civil War. Visit our country's first memorial to Lincoln, built with donations from young and old, and the site of his childhood home.",37.5858662,-85.67330523,"lat:37.5858662, long:-85.67330523",KY,The Birthplace Unit of the park is located approximately 2 miles south of the town of Hodgenville on U.S. Highway 31E South. The Boyhood Home Unit at Knob Creek is located approximately 10 miles northeast of the Birthplace Unit of the park.,http://www.nps.gov/abli/planyourvisit/directions.htm,"There are four distinct seasons in Central Kentucky. However, temperature and weather conditions can vary widely within those seasons. Spring and Fall are generally pleasant with frequent rain showers. Summer is usually hot and humid. 
Winter is moderately cold with mixed precipitation.",Abraham Lincoln Birthplace,National Historical Park,1.0,43800AD1-D439-40F3-AAB3-9FB651FE45BB,Gift Shop and Souvenirs,A7359FC4-DAD8-45F5-AF15-7FF62F816ED3,Night Sky,"The Boyhood Home Unit at Knob Creek Grounds:\nopen daily dawn to dusk.\n\nKnob Creek Tavern Visitor Center:\nopen on weekends in April, May, September and October and 5 days a week from Memorial Day to Labor day from 10 am to 4pm (Thursday - Monday) closed for the winter.",Boyhood Unit,Closed,10:00AM - 4:00PM,10:00AM - 4:00PM,10:00AM - 4:00PM,Closed,10:00AM - 4:00PM,10:00AM - 4:00PM,42748,Hodgenville,KY,US,,2995 Lincoln Farm Road,Mailing,,,NPS Photo,Statue of the Lincoln Family in the Visitor Center,Statue of the Lincoln family in the park's Visitor Center,Visitors to the park can view the statue of the Lincoln family.,https://www.nps.gov/common/uploads/structured_data/3C8614D1-1DD8-B71B-0B1AF72CA452B051.jpg,2703583137,,,Voice,,ABLI_Administration@nps.gov,2024-04-01,Spring Hours - Visitor Center at Knob Creek,2024-05-31,Closed,Closed,Closed,10:00AM - 4:00PM,Closed,Closed,10:00AM - 4:00PM
10797,77E0D7F0-1942-494A-ACE2-9004D2BDC59E,https://www.nps.gov/abli/index.htm,Abraham Lincoln Birthplace National Historical Park,abli,"For over a century people from around the world have come to rural Central Kentucky to honor the humble beginnings of our 16th president, Abraham Lincoln. His early life on Kentucky's frontier shaped his character and prepared him to lead the nation through Civil War. Visit our country's first memorial to Lincoln, built with donations from young and old, and the site of his childhood home.",37.5858662,-85.67330523,"lat:37.5858662, long:-85.67330523",KY,The Birthplace Unit of the park is located approximately 2 miles south of the town of Hodgenville on U.S. Highway 31E South. The Boyhood Home Unit at Knob Creek is located approximately 10 miles northeast of the Birthplace Unit of the park.,http://www.nps.gov/abli/planyourvisit/directions.htm,"There are four distinct seasons in Central Kentucky. However, temperature and weather conditions can vary widely within those seasons. Spring and Fall are generally pleasant with frequent rain showers. Summer is usually hot and humid. 
Winter is moderately cold with mixed precipitation.",Abraham Lincoln Birthplace,National Historical Park,1.0,43800AD1-D439-40F3-AAB3-9FB651FE45BB,Gift Shop and Souvenirs,A7359FC4-DAD8-45F5-AF15-7FF62F816ED3,Night Sky,"The Boyhood Home Unit at Knob Creek Grounds:\nopen daily dawn to dusk.\n\nKnob Creek Tavern Visitor Center:\nopen on weekends in April, May, September and October and 5 days a week from Memorial Day to Labor day from 10 am to 4pm (Thursday - Monday) closed for the winter.",Boyhood Unit,Closed,10:00AM - 4:00PM,10:00AM - 4:00PM,10:00AM - 4:00PM,Closed,10:00AM - 4:00PM,10:00AM - 4:00PM,42748,Hodgenville,KY,US,,2995 Lincoln Farm Road,Mailing,,,NPS Photo,Statue of the Lincoln Family in the Visitor Center,Statue of the Lincoln family in the park's Visitor Center,Visitors to the park can view the statue of the Lincoln family.,https://www.nps.gov/common/uploads/structured_data/3C8614D1-1DD8-B71B-0B1AF72CA452B051.jpg,2703583137,,,Voice,,ABLI_Administration@nps.gov,2024-09-01,Fall Hours - Knob Creek Visitor Center,2024-10-31,Closed,Closed,Closed,10:00AM - 4:00PM,Closed,Closed,10:00AM - 4:00PM
10798,77E0D7F0-1942-494A-ACE2-9004D2BDC59E,https://www.nps.gov/abli/index.htm,Abraham Lincoln Birthplace National Historical Park,abli,"For over a century people from around the world have come to rural Central Kentucky to honor the humble beginnings of our 16th president, Abraham Lincoln. His early life on Kentucky's frontier shaped his character and prepared him to lead the nation through Civil War. Visit our country's first memorial to Lincoln, built with donations from young and old, and the site of his childhood home.",37.5858662,-85.67330523,"lat:37.5858662, long:-85.67330523",KY,The Birthplace Unit of the park is located approximately 2 miles south of the town of Hodgenville on U.S. Highway 31E South. The Boyhood Home Unit at Knob Creek is located approximately 10 miles northeast of the Birthplace Unit of the park.,http://www.nps.gov/abli/planyourvisit/directions.htm,"There are four distinct seasons in Central Kentucky. However, temperature and weather conditions can vary widely within those seasons. Spring and Fall are generally pleasant with frequent rain showers. Summer is usually hot and humid. 
Winter is moderately cold with mixed precipitation.",Abraham Lincoln Birthplace,National Historical Park,1.0,43800AD1-D439-40F3-AAB3-9FB651FE45BB,Gift Shop and Souvenirs,A7359FC4-DAD8-45F5-AF15-7FF62F816ED3,Night Sky,"The Boyhood Home Unit at Knob Creek Grounds:\nopen daily dawn to dusk.\n\nKnob Creek Tavern Visitor Center:\nopen on weekends in April, May, September and October and 5 days a week from Memorial Day to Labor day from 10 am to 4pm (Thursday - Monday) closed for the winter.",Boyhood Unit,Closed,10:00AM - 4:00PM,10:00AM - 4:00PM,10:00AM - 4:00PM,Closed,10:00AM - 4:00PM,10:00AM - 4:00PM,42748,Hodgenville,KY,US,,2995 Lincoln Farm Road,Mailing,,,NPS Photo,Statue of the Lincoln Family in the Visitor Center,Statue of the Lincoln family in the park's Visitor Center,Visitors to the park can view the statue of the Lincoln family.,https://www.nps.gov/common/uploads/structured_data/3C8614D1-1DD8-B71B-0B1AF72CA452B051.jpg,2703583874,,,Fax,,ABLI_Administration@nps.gov,2024-04-01,Spring Hours - Visitor Center at Knob Creek,2024-05-31,Closed,Closed,Closed,10:00AM - 4:00PM,Closed,Closed,10:00AM - 4:00PM


In [69]:
import requests
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

def get_openaq_data(city, api_key):
    """
    Retrieves air quality data from the OpenAQ API for the specified city.

    Parameters:
    city (str): Name of the city to retrieve data for
    api_key (str): The API key for the OpenAQ API
    
    Returns:
    pd.DataFrame: A DataFrame containing air quality data
    """
    api_endpoint = "https://api.openaq.org/v2/latest"
    params = {"city": city, "limit": 100}
    headers = {"X-API-Key": api_key}

    response = requests.get(api_endpoint, params=params, headers=headers)
    data = response.json()
    measurements_df = pd.DataFrame([x['measurements'][0] for x in data['results']])
    
    return measurements_df

def get_weather_data(city):
    """
    Retrieves current weather data for the specified city from the Weatherbit API.

    Parameters:
    city (str): Name of the city to retrieve data for
    
    Returns:
    pd.DataFrame: A DataFrame containing weather data
    """
    api_endpoint = f"https://api.weatherbit.io/v2.0/current?city={city}&key=YOUR_WEATHERBIT_KEY"
    response = requests.get(api_endpoint)
    data = response.json()
    weather_df = pd.DataFrame(data['data'])
    
    return weather_df

def analyze_data(openaq_df, weather_df):
    """
    Merges and analyzes data from the OpenAQ and the weather API dataframes 
    to find correlations between air quality and weather patterns.
    Visualizes the findings using plots.

    Parameters:
    openaq_df (pd.DataFrame): The DataFrame containing OpenAQ data
    weather_df (pd.DataFrame): The DataFrame containing weather data
    
    Returns:
    None
    """
    # Place the two frames side by side. pd.concat(axis=1) aligns rows by index
    # label, not by date or time, so this is only a positional pairing.
    merged_df = pd.concat([openaq_df, weather_df], axis=1)
    
    # Drop rows that have a missing value in any column
    merged_df.dropna(inplace=True)
    
    # Correlate the numeric columns only; text columns such as units are skipped
    correlation_matrix = merged_df.corr(numeric_only=True)
    
    # Visualize the correlation matrix using a heatmap
    plt.figure(figsize=(10,8))
    plt.title('Correlation Matrix')
    sns.heatmap(correlation_matrix, annot=True, cmap='coolwarm')
    plt.show()

# Define API key and city
OPENAQ_API_KEY = "YOUR_OPENAQ_API_KEY"
CITY = "London"

# Get data from APIs
openaq_data = get_openaq_data(CITY, OPENAQ_API_KEY)
weather_data = get_weather_data(CITY)

# Analyze data
analyze_data(openaq_data, weather_data)

KeyError: 'data'
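
<font size = "4">The <code>KeyError: 'data'</code> above means the parsed response had no <code>data</code> field. In this cell only <code>get_weather_data</code> indexes <code>data['data']</code>, so the weather request is the likely origin. A plausible cause, which the notebook does not record, is that the API returned an error object instead of results. The sketch below checks for the expected field before indexing, so the failure message shows what was actually received. The helper name <code>require_key</code> and both sample payloads are hypothetical.</font><br><br>

```python
def require_key(payload, key):
    """Return payload[key], or raise a ValueError that shows what the API sent."""
    if not isinstance(payload, dict):
        raise ValueError(f"Expected a JSON object, got {type(payload).__name__}")
    if key not in payload:
        raise ValueError(
            f"Response has no '{key}' field; top-level keys: {sorted(payload)}"
        )
    return payload[key]


# Hypothetical payloads: one successful, one error-shaped
ok_payload = {"data": [{"temp": 8.4, "city_name": "London"}], "count": 1}
error_payload = {"error": "API key not valid."}

print(require_key(ok_payload, "data"))

try:
    require_key(error_payload, "data")
except ValueError as exc:
    print(f"Request failed: {exc}")
```

<font size = "4">In the functions above, this would replace direct indexing such as <code>data['data']</code> with <code>require_key(response.json(), 'data')</code>. Calling <code>response.raise_for_status()</code> first would also surface HTTP error statuses before any parsing happens.</font><br><br>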

In [72]:
import requests
import pandas as pd

def fetch_openaq_data(city='Los Angeles', parameter='pm25'):
    """
    Fetch air quality data for a specific city from OpenAQ.
    """
    base_url = 'https://api.openaq.org/v1/measurements'
    query_params = {
        'city': city,
        'parameter': parameter,
        'limit': 100,
        'api_key': 'YOUR_OPENAQ_API_KEY'
    }
    response = requests.get(base_url, params=query_params)
    data = response.json()
    return pd.DataFrame(data['results'])

def fetch_weather_data(city='Los Angeles', api='weatherapi'):
    """
    Fetch weather data for a specific city using either WeatherAPI.com or Weatherbit.
    """
    if api == 'weatherapi':
        base_url = 'https://api.weatherapi.com/v1/current.json'
        query_params = {
            'key': 'YOUR_WEATHERAPI_KEY',
            'q': city,
            'aqi': 'yes'
        }
    elif api == 'weatherbit':
        base_url = 'https://api.weatherbit.io/v2.0/current'
        query_params = {
            'key': 'YOUR_WEATHERBIT_KEY',
            'city': city,
            'include': 'aqi'
        }
    else:
        raise ValueError(f"Unsupported weather API: {api!r}")
    response = requests.get(base_url, params=query_params)
    data = response.json()
    return pd.DataFrame([data['data']]) if api == 'weatherbit' else pd.DataFrame([data['current']])

def analyze_correlation(df1, df2, column1='pm25', column2='temp_c'):
    """
    Analyze correlation between air quality (e.g., PM2.5) and temperature.
    """
    combined_df = pd.merge(df1, df2, left_on='date', right_on='date', how='inner')
    correlation = combined_df[[column1, column2]].corr()
    return correlation

# Example usage:
openaq_data = fetch_openaq_data()
weather_data = fetch_weather_data()

# Assuming both dataframes have a common 'date' column after preprocessing
correlation_result = analyze_correlation(openaq_data, weather_data)
print(correlation_result)

KeyError: 'date'
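
<font size = "4"><code>pd.merge</code> raises <code>KeyError: 'date'</code> when a frame has no column with that name, so at least one of the two frames lacked a <code>date</code> column. In addition, a current-conditions response yields a single row, and a correlation needs several paired observations. The sketch below shows one way to pair two time series by nearest timestamp with <code>pd.merge_asof</code> before correlating them. The data is synthetic, and the column name <code>timestamp</code>, the 30-minute tolerance and the helper <code>align_and_correlate</code> are assumptions made for illustration.</font><br><br>

```python
import pandas as pd

# Synthetic hourly readings; real data would come from the two APIs
air = pd.DataFrame({
    "timestamp": pd.to_datetime(
        ["2024-02-10 10:05", "2024-02-10 11:02", "2024-02-10 12:04", "2024-02-10 13:01"],
        utc=True,
    ),
    "pm25": [12.0, 15.0, 9.0, 7.0],
})
weather = pd.DataFrame({
    "timestamp": pd.to_datetime(
        ["2024-02-10 10:00", "2024-02-10 11:00", "2024-02-10 12:00", "2024-02-10 13:00"],
        utc=True,
    ),
    "temp_c": [8.0, 9.5, 11.0, 12.0],
})


def align_and_correlate(left, right, on="timestamp", tolerance="30min",
                        columns=("pm25", "temp_c")):
    """Pair rows by nearest timestamp, then correlate the chosen columns."""
    # Fail early with a readable message if the key column is missing
    for name, frame in (("left", left), ("right", right)):
        if on not in frame.columns:
            raise ValueError(
                f"{name} frame has no '{on}' column; available: {list(frame.columns)}"
            )
    # merge_asof requires both frames to be sorted by the key
    merged = pd.merge_asof(
        left.sort_values(on),
        right.sort_values(on),
        on=on,
        direction="nearest",
        tolerance=pd.Timedelta(tolerance),
    )
    merged = merged.dropna(subset=list(columns))
    return merged, merged[list(columns)].corr()


merged, correlation = align_and_correlate(air, weather)
print(merged)
print(correlation)
```

<font size = "4">An exact-match <code>pd.merge</code> would work only if both sources reported identical timestamps. Nearest-match pairing with a tolerance is a common compromise when they do not.</font><br><br>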