# Final Project Proposal
### Leo Jia and Mark Sanghera

# Dataset Description

## Source
The dataset comes from the **Jet Propulsion Laboratory (JPL) Small-Body Database**. JPL is managed by the California Institute of Technology for NASA. The data is publicly available through the JPL Small-Body Database Search Engine.

## Format
The dataset is provided as a CSV file, with one row per asteroid and one column per attribute.

## Contents
The dataset includes detailed information about asteroids, such as their physical and orbital properties. Below is a brief description of the key attributes:

### Key Attributes
- **SPK-ID**: Object primary SPK-ID (unique identifier for the asteroid).
- **Object ID**: Internal database ID for the object.
- **Object Fullname**: Full name or designation of the asteroid.
- **pdes**: Primary designation of the asteroid.
- **name**: International Astronomical Union (IAU) name of the asteroid.
- **NEO**: Near-Earth Object flag, indicating if the object is a Near-Earth Object (1) or not (0).
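- **PHA**: Potentially Hazardous Asteroid flag, indicating whether the asteroid is classified as potentially hazardous (1) or not (0). JPL assigns this status to asteroids with an Earth Minimum Orbit Intersection Distance of 0.05 au or less and an absolute magnitude (H) of 22.0 or brighter.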
- **H**: Absolute magnitude parameter.
- **Diameter**: Estimated diameter of the asteroid in kilometers.
- **Albedo**: Geometric albedo (reflectivity) of the asteroid.
- **Diameter_sigma**: 1-sigma uncertainty in the diameter measurement (km).
- **Orbit_id**: Orbit solution ID.
- **Epoch**: Epoch of osculation in modified Julian day form.
- **Equinox**: Reference frame equinox.
- **e**: Orbital eccentricity.
- **a**: Semi-major axis (in astronomical units).
- **q**: Perihelion distance (in astronomical units).
- **i**: Orbital inclination, the angle of the orbit with respect to the ecliptic plane (in degrees).
- **tp**: Time of perihelion passage.
- **moid_ld**: Earth Minimum Orbit Intersection Distance (in lunar distances, as the `_ld` suffix indicates).

## Class Information
For this project, the **PHA (Potentially Hazardous Asteroid) flag** will serve as the class label. The task is to predict whether an asteroid is potentially hazardous (`PHA = 1`) or not (`PHA = 0`). The prediction will be based on attributes such as:
- Diameter
- Orbital eccentricity (`e`)
- Inclination (`i`)
- Earth Minimum Orbit Intersection Distance (`moid_ld`)
- Semi-major axis (`a`)
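
As a minimal sketch of how the class label and candidate features might be pulled out of the CSV, the Python snippet below keeps only rows where every selected attribute and the label are present. The column names (`diameter`, `e`, `i`, `moid_ld`, `a`, `pha`), the helper `load_examples`, and the sample rows are assumptions made for illustration and are not taken from the actual export. The label parser accepts `1`/`0` as described above, and also `Y`/`N` in case the export encodes flags that way.

```python
import csv
import io

# Hypothetical column names; adjust to match the actual CSV header.
FEATURES = ["diameter", "e", "i", "moid_ld", "a"]
LABEL = "pha"


def load_examples(csv_file):
    """Return (X, y), keeping only rows with all features and a label present."""
    X, y = [], []
    for row in csv.DictReader(csv_file):
        try:
            values = [float(row[name]) for name in FEATURES]
        except (KeyError, ValueError, TypeError):
            continue  # skip rows with missing or non-numeric feature values
        raw_label = (row.get(LABEL) or "").strip().upper()
        if raw_label in ("1", "Y"):
            label = 1
        elif raw_label in ("0", "N"):
            label = 0
        else:
            continue  # skip rows with no usable label
        X.append(values)
        y.append(label)
    return X, y


# Synthetic rows for demonstration only (not real asteroid data).
SAMPLE = """diameter,e,i,moid_ld,a,pha
0.5,0.22,6.0,12.0,1.1,1
,0.10,3.0,400.0,2.5,0
2.0,0.15,10.0,350.0,2.7,0
"""

X, y = load_examples(io.StringIO(SAMPLE))
print(len(X), y)  # 2 [1, 0]
```

Dropping incomplete rows is only one option. If many asteroids lack a value for an attribute such as diameter, this approach could discard a large share of the data, so imputation or excluding that attribute may be worth comparing.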


# Implementation/Technical Merit
- Anticipated challenges in pre-processing and/or classification
  - The large number of attributes may lead to the curse of dimensionality
  - Cleaning the dataset without a major loss of information
  - Identifying and removing irrelevant features
  - Removing bias
  - Selecting the best classification algorithm
- How we will explore feature selection techniques to pare down the attributes
  - Correlation analysis
  - Statistical tests
  - Domain knowledge and good judgment
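
One possible form of the correlation analysis listed above is to rank each numeric attribute by the absolute value of its Pearson correlation with the binary PHA label. The sketch below uses only the Python standard library. The helper names `pearson` and `rank_features`, the `flat` column, and the toy values are hypothetical and chosen purely for illustration.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 if either input is constant."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def rank_features(columns, labels):
    """Sort (name, correlation) pairs by |correlation with the label|, strongest first."""
    scores = {name: pearson(values, labels) for name, values in columns.items()}
    return sorted(scores.items(), key=lambda item: abs(item[1]), reverse=True)


# Toy values for demonstration only (not real asteroid data).
columns = {
    "moid_ld": [5.0, 10.0, 300.0, 400.0],
    "e": [0.2, 0.1, 0.2, 0.1],
    "flat": [1.0, 1.0, 1.0, 1.0],  # constant column, carries no information
}
labels = [1, 1, 0, 0]

for name, score in rank_features(columns, labels):
    print(f"{name}: {score:+.3f}")
```

A ranking like this only captures linear, one-at-a-time relationships, so it is a starting point rather than a final selection. The same `pearson` helper can also be applied between pairs of attributes to spot redundancy: for example, perihelion distance satisfies `q = a(1 - e)`, so `q` is fully determined by `a` and `e`.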

# Potential Impact of the Results

## Why Are These Results Useful?
The results of this project could contribute to:
- **Planetary Defense**: By accurately predicting whether an asteroid is potentially hazardous, this project aids in early detection and risk assessment, enabling timely measures to mitigate potential threats to Earth.
- **Astronomical Research**: The insights gained can enhance our understanding of asteroid properties, their behavior, and their interaction with Earth's orbit, advancing scientific research in the field of astronomy.
- **Public Awareness and Safety**: The classification of hazardous asteroids can inform global initiatives and public outreach campaigns, ensuring preparedness and safety.

## Stakeholders
The following stakeholders will benefit from and be interested in the outcomes of this project:

### Primary Stakeholders
- **Planetary Defense Agencies**: Organizations such as NASA's Planetary Defense Coordination Office (PDCO) and the European Space Agency (ESA) that are directly responsible for monitoring and mitigating asteroid threats.
- **Astronomers and Researchers**: Scientists studying the physical and orbital properties of asteroids will find the classification insights valuable for ongoing research.
- **Government and Policy Makers**: Decision-makers who fund and develop space programs focused on asteroid impact prevention and disaster response.

### Secondary Stakeholders
- **Space Exploration Companies**: Organizations like SpaceX and Blue Origin that may utilize asteroid data for mission planning and resource identification.
- **Educators and Students**: The results can serve as a learning resource for academic programs in planetary science, data science, and astronomy.
- **General Public**: Individuals who benefit from increased awareness of space hazards and reassurance regarding Earth's safety.

