# Tracking Near-Earth Objects (NEOs): Exploratory Data Analysis

## Data Collection
For this project, I obtained a dataset from NASA that provides detailed information about NEOs and their approximate distance from Earth. The data is pulled from the NASA API for January 2023. Because the API limits each request to a 7-day window, the data is collected in 4 separate chunks, each covering a 7-day period. These chunks are then merged into a single dataset, which spans January 1 to January 28, 2023.

Source: NASA Asteroids NeoWs (Near Earth Object Web Service)
Time Period: January 2023
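
The 7-day windows can also be generated rather than typed by hand. The following is a minimal sketch using only the standard library; the helper name `weekly_ranges` and its `max_days` parameter are illustrative assumptions and not part of the NASA API. Run over the full month, it yields five windows, the last one covering January 29 to 31, which four 7-day chunks starting on January 1 do not reach.

```python
from datetime import date, timedelta


def weekly_ranges(start, end, max_days=7):
    """Split the inclusive range [start, end] into windows of at most max_days days.

    Hypothetical helper: returns a list of (start_date, end_date) ISO strings.
    """
    ranges = []
    current = start
    while current <= end:
        # A window of max_days days ends max_days - 1 days after it starts
        window_end = min(current + timedelta(days=max_days - 1), end)
        ranges.append((current.isoformat(), window_end.isoformat()))
        current = window_end + timedelta(days=1)
    return ranges


january_ranges = weekly_ranges(date(2023, 1, 1), date(2023, 1, 31))
for start_date, end_date in january_ranges:
    print(start_date, end_date)
```

Each tuple can be passed to a fetch function as its `start_date` and `end_date` arguments.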

In [52]:
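import os        # Import the os module to read the API key from the environment
import requests  # Import the requests module to make HTTP requests
import json      # Import the json module to handle JSON data
import pandas as pd

# NASA API key: read from an environment variable (the name NASA_API_KEY is an assumption)
# instead of hardcoding it. 'DEMO_KEY' is NASA's public, rate-limited demo key.
api_key = os.environ.get('NASA_API_KEY', 'DEMO_KEY')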

# Function to make a request to the NASA API and fetch the data
def get_nasa_data(start_date, end_date, api_key):
    # Base URL for NASA's NeoWs feed endpoint
    base_url = "https://api.nasa.gov/neo/rest/v1/feed"
    params = {'start_date': start_date, 'end_date': end_date, 'api_key': api_key}
    try:
        # Make the GET request with the date range and API key;
        # the timeout keeps a stalled connection from hanging the notebook
        response = requests.get(base_url, params=params, timeout=30)

        # Check if the request was successful (status code 200)
        if response.status_code == 200:
            print(f"Data fetched for {start_date} to {end_date}")
            return response.json()  # Return the parsed JSON body as a dict
        else:
            print(f"Failed to retrieve data for {start_date} to {end_date}, Status Code: {response.status_code}")
            return None
    except (requests.RequestException, ValueError) as exc:
        # RequestException covers network failures; ValueError covers a body that is not valid JSON
        print(f"An error occurred for {start_date} to {end_date}: {exc}")
        return None

In [54]:
date_ranges = [
    ("2023-01-01", "2023-01-07"),
    ("2023-01-08", "2023-01-14"),
    ("2023-01-15", "2023-01-21"),
    ("2023-01-22", "2023-01-28")
]

# Initialize an empty list to hold the combined data
neo_data = []

# Loop through each date range, fetch the data and append it to the list
for start_date, end_date in date_ranges:
    data = get_nasa_data(start_date, end_date, api_key)
    if data:
        neo_data.append(data)  # Append data if fetched successfully
    else:
        print(f"No data for {start_date} to {end_date}")

# Build a DataFrame with one row per weekly response and inspect it.
# Everything that uses neo_df stays inside the if-branch so an empty result does not raise a NameError.
if neo_data:
    neo_df = pd.DataFrame(neo_data)
    print(neo_df)
    print(neo_df['near_earth_objects'])
else:
    print("No data was retrieved for the given date ranges.")

Data fetched for 2023-01-01 to 2023-01-07
Data fetched for 2023-01-08 to 2023-01-14
Data fetched for 2023-01-15 to 2023-01-21
Data fetched for 2023-01-22 to 2023-01-28
                                               links  element_count  \
0  {'next': 'http://api.nasa.gov/neo/rest/v1/feed...            115   
1  {'next': 'http://api.nasa.gov/neo/rest/v1/feed...            122   
2  {'next': 'http://api.nasa.gov/neo/rest/v1/feed...            108   
3  {'next': 'http://api.nasa.gov/neo/rest/v1/feed...            133   

                                  near_earth_objects  
0  {'2023-01-01': [{'links': {'self': 'http://api...  
1  {'2023-01-11': [{'links': {'self': 'http://api...  
2  {'2023-01-20': [{'links': {'self': 'http://api...  
3  {'2023-01-22': [{'links': {'self': 'http://api...  
0    {'2023-01-01': [{'links': {'self': 'http://api...
1    {'2023-01-11': [{'links': {'self': 'http://api...
2    {'2023-01-20': [{'links': {'self': 'http://api...
3    {'2023-01-22': [{'links': {'sel
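
The printed output shows that the dates inside one response are not necessarily in calendar order (the second response is displayed starting with `2023-01-11`). As a rough sketch, the weekly responses could be merged into a single date-sorted dictionary, with `element_count` used as a sanity check on the merge. The function name `merge_weekly_feeds` and the trimmed sample payloads are hypothetical; only the `element_count` and `near_earth_objects` keys come from the output above.

```python
def merge_weekly_feeds(feeds):
    """Combine the per-date NEO lists of several feed responses into one dict.

    Hypothetical helper: keys are ISO date strings, so sorting them as text
    also sorts them chronologically.
    """
    merged = {}
    for feed in feeds:
        for day, objects in feed["near_earth_objects"].items():
            merged.setdefault(day, []).extend(objects)
    return dict(sorted(merged.items()))


# Made-up, heavily trimmed stand-ins for two weekly responses
sample_feeds = [
    {
        "element_count": 2,
        "near_earth_objects": {
            "2023-01-02": [{"id": "a"}],
            "2023-01-01": [{"id": "b"}],
        },
    },
    {
        "element_count": 1,
        "near_earth_objects": {"2023-01-08": [{"id": "c"}]},
    },
]

merged = merge_weekly_feeds(sample_feeds)

# Sanity check: the merged lists should hold as many objects as the API reported
total_objects = sum(len(objects) for objects in merged.values())
expected_total = sum(feed["element_count"] for feed in sample_feeds)
print(list(merged), total_objects, expected_total)
```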

## Data Extraction and Transformation for NEOs
Now that I have fetched all 4 weeks of data from the NASA API, I will transform the raw JSON into a structured format using Python's `pandas` library. The goal is to extract the key details about the NEOs (Near-Earth Objects) and use them as variables.
The key variables chosen here are:
1. `id`: a unique identifier for each NEO
2. `name`: the name of the NEO
3. `absolute_magnitude_h`: the absolute magnitude (intrinsic brightness) of the NEO, which is an indicator of its size and reflectivity
4. `estimated_diameter_min_km` & `estimated_diameter_max_km`: the minimum and maximum estimated diameter of the NEO in kilometers
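5. `is_potentially_hazardous_asteroid`: a flag indicating whether the NEO is classified as potentially hazardous
6. `close_approach_data`: details of the NEO's close approach, from which I take the approach date, the relative velocity (km/s), the miss distance (km), and the orbiting body
7. `is_sentry_object`: a flag indicating whether the NEO is listed in NASA's Sentry impact-monitoring system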
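
To make the nesting concrete, here is a small sketch of how these variables sit inside one record and how they can be flattened into a single row. The field paths follow the NeoWs feed response; the helper name `flatten_neo` and every value in `sample_neo` are made up for illustration.

```python
def flatten_neo(neo):
    """Pick the key variables out of one nested NEO record (hypothetical helper)."""
    diameter_km = neo["estimated_diameter"]["kilometers"]
    approach = neo["close_approach_data"][0]  # first listed close approach
    return {
        "ID": neo["id"],
        "Name": neo["name"],
        "Abs_magnitude": neo["absolute_magnitude_h"],
        "Min_diameter": diameter_km["estimated_diameter_min"],
        "Max_diameter": diameter_km["estimated_diameter_max"],
        "Potential_hazard": neo["is_potentially_hazardous_asteroid"],
        "Close_approach_date": approach["close_approach_date"],
        "Relative_velocity": approach["relative_velocity"]["kilometers_per_second"],
        "Miss_distance": approach["miss_distance"]["kilometers"],
        "Orbiting_body": approach["orbiting_body"],
        "Sentry_object": neo["is_sentry_object"],
    }


# Made-up record with the same nesting as the API response
sample_neo = {
    "id": "0000001",
    "name": "(sample object)",
    "absolute_magnitude_h": 20.0,
    "estimated_diameter": {
        "kilometers": {"estimated_diameter_min": 0.1, "estimated_diameter_max": 0.2}
    },
    "is_potentially_hazardous_asteroid": False,
    "close_approach_data": [
        {
            "close_approach_date": "2023-01-01",
            # The API returns velocity and distance as strings
            "relative_velocity": {"kilometers_per_second": "12.5"},
            "miss_distance": {"kilometers": "1000000.0"},
            "orbiting_body": "Earth",
        }
    ],
    "is_sentry_object": False,
}

row = flatten_neo(sample_neo)
print(row)
```

A list of such rows can be passed directly to `pd.DataFrame` to obtain one row per NEO.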

In [57]:
"""
extract the 'near_earth_objects' data to form the dataframe
"""

# Collect one flat record (dict) per NEO
records = []

# Loop over each weekly response, each date within it, and each NEO listed for that date
for each_week in neo_df['near_earth_objects']:
    for date, objects in each_week.items():
        for neo in objects:
            diameter_km = neo['estimated_diameter']['kilometers']
            approach = neo['close_approach_data'][0]  # first listed close approach
            records.append({
                'ID': neo['id'],
                'Name': neo['name'],
                'Abs_magnitude': neo['absolute_magnitude_h'],
                'Min_diameter': diameter_km['estimated_diameter_min'],
                'Max_diameter': diameter_km['estimated_diameter_max'],
                'Potential_hazard': neo['is_potentially_hazardous_asteroid'],
                'Close_approach_date': approach['close_approach_date'],
                'Relative_velocity': approach['relative_velocity']['kilometers_per_second'],
                'Miss_distance': approach['miss_distance']['kilometers'],
                'Orbiting_body': approach['orbiting_body'],
                'Sentry_object': neo['is_sentry_object'],
            })

# Build the DataFrame from the list of records and preview the first rows
df = pd.DataFrame(records)
print(df.head())


        ID               Name  Abs_magnitude  Min_diameter  Max_diameter  \
0  2154347  154347 (2002 XK4)          16.08      1.616423      3.614431   
1  2385186  385186 (1994 AW1)          17.64      0.788052      1.762138   
2  2453309  453309 (2008 VQ4)          19.51      0.333085      0.744801   
3  3683468       (2014 QR295)          18.39      0.557898      1.247498   
4  3703782        (2015 AE45)          25.30      0.023150      0.051765   

   Potential_hazard Close_approach_date Relative_velocity       Miss_distance  \
0             False          2023-01-01     27.3921993676  49550754.592860912   
1              True          2023-01-01     12.9241938417  33403488.692868118   
2             False          2023-01-01       5.822172435  39565961.760205039   
3             False          2023-01-01     16.1804693508  39330824.516289241   
4             False          2023-01-01      6.8621510862   8526777.284930033   

  Orbiting_body  Sentry_object  
0         Earth        