# Final Project

Use the NASA JPL Small-Body Database to Find Relationships Between Asteroid Parameters

The NASA JPL Small-Body Database contains about 1.2 million asteroid records, but not every record has a complete set of parameters. This project covers two topics:
1) Asteroid Mining: 

We will use several orbital parameters, probably together with asteroid size, mass (if available), and albedo, to predict possible spectral classifications. This is a classification problem, and we expect to use at least two different methods, such as K-Means (clustering) and PCA (dimensionality reduction).
[Science Projects: Asteroid Mining](https://www.sciencebuddies.org/science-fair-projects/project-ideas/Astro_p038/astronomy/asteroid-mining-gold-rush-in-space)
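
The clustering half of this plan can be sketched with a plain-Python K-Means (Lloyd's algorithm). This is only an illustration under stated assumptions: the (albedo, B-V) pairs are made-up placeholders rather than database values, the function name `kmeans` is hypothetical, and the evenly spaced choice of starting centroids is a simplification chosen for reproducibility. Because K-Means is unsupervised, the resulting clusters would still have to be compared against the known `spec_B` / `spec_T` labels.

```python
import math

def kmeans(points, k, n_iter=50):
    """Lloyd's algorithm on a list of equal-length tuples.

    Starting centroids are taken at evenly spaced positions in the input
    list (an assumption made here for reproducibility).
    """
    centroids = [points[i * len(points) // k] for i in range(k)]
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: each point joins its nearest centroid
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append(
                    tuple(sum(v) / len(members) for v in zip(*members))
                )
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break  # converged
        centroids = new_centroids
    return centroids, labels

# Made-up (albedo, B-V) pairs: three dark objects, then three brighter ones
features = [
    (0.04, 0.70), (0.05, 0.72), (0.06, 0.69),
    (0.20, 0.85), (0.23, 0.88), (0.25, 0.86),
]
centroids, labels = kmeans(features, k=2)
print(labels)
```

With real data the features have very different ranges, so they would normally be standardized before clustering; that step is left out here to keep the sketch short.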

2) Asteroid Mass Estimator:

There has been little quantitative analysis of asteroid masses in the NASA JPL dataset, and the majority of the data points lack GM values (GM is the product of the gravitational constant and the asteroid's mass). This is because an accurate asteroid mass estimate usually requires close encounters or high-accuracy orbit observations of a multiple-body system. However, mass is one of the most important asteroid properties: combining an asteroid's mass with its size gives its bulk density, which constrains its possible interior composition. The composition of asteroids reflects the accretion and collisional environment of the early solar system. It would therefore be useful to predict the likely mass from the known parameters. In this part, we will build a neural network that gives a possible mass range for asteroids with unknown mass.
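
The step from a GM value to mass and bulk density can be sketched as follows. This is a minimal illustration, assuming that `GM` is given in km^3/s^2 and `diameter` in km, and that the body is treated as a sphere. The function names are hypothetical, and the input values are illustrative numbers close to published values for Ceres rather than values read from the CSV file.

```python
import math

# Gravitational constant in km^3 kg^-1 s^-2 (6.67430e-11 m^3 kg^-1 s^-2)
G_KM3_PER_KG_S2 = 6.67430e-20

def mass_from_gm(gm_km3_s2):
    """Mass in kg from GM in km^3/s^2 (unit is an assumption, see above)."""
    return gm_km3_s2 / G_KM3_PER_KG_S2

def bulk_density(gm_km3_s2, diameter_km):
    """Bulk density in kg/m^3, assuming a sphere of the given diameter."""
    volume_m3 = math.pi / 6.0 * (diameter_km * 1.0e3) ** 3
    return mass_from_gm(gm_km3_s2) / volume_m3

# Illustrative inputs, roughly Ceres-like (assumed for this example)
example_gm = 62.63          # km^3/s^2
example_diameter = 939.4    # km

example_mass = mass_from_gm(example_gm)
example_density = bulk_density(example_gm, example_diameter)
print(f"mass = {example_mass:.3e} kg, density = {example_density:.0f} kg/m^3")
```

For these inputs the density comes out a little above 2000 kg/m^3, which is the kind of number one would compare against candidate compositions.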

However, since mass depends on multiple observables, the 15 GM data points in the database are likely insufficient for even the most preliminary analysis. [Density of asteroids by B. Carry](https://arxiv.org/pdf/1203.4336.pdf) is the first comprehensive review paper with mass estimates for roughly 250 asteroids.

Additionally, because volume scales as the third power of the linear dimension, a relative error in the measured dimensions is amplified roughly threefold in the volume, and therefore in any mass inferred from that volume.
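
The threefold amplification can be checked numerically. The sketch below uses hypothetical numbers (a 100 km sphere whose diameter is overestimated by 10%) and compares the exact relative volume error with the first-order estimate of three times the relative diameter error.

```python
import math

def sphere_volume(diameter):
    """Volume of a sphere with the given diameter."""
    return math.pi / 6.0 * diameter ** 3

diameter_km = 100.0         # hypothetical true diameter
rel_diameter_error = 0.10   # hypothetical 10% overestimate

true_volume = sphere_volume(diameter_km)
measured_volume = sphere_volume(diameter_km * (1.0 + rel_diameter_error))

# Exact relative error: (1 + e)^3 - 1
exact_rel_volume_error = measured_volume / true_volume - 1.0
# First-order approximation: 3 * e
linear_rel_volume_error = 3.0 * rel_diameter_error

print(f"exact: {exact_rel_volume_error:.3f}, first-order: {linear_rel_volume_error:.3f}")
```

A 10% diameter error therefore gives about a 33% volume error. The factor of three is a small-error approximation, and the exact amplification grows with the size of the error.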

3) Other Useful Links:

[List of exceptional asteroids on Wikipedia](https://en.wikipedia.org/wiki/List_of_exceptional_asteroids)

### Read data

In [5]:
import csv
# Open the CSV export of the Small-Body Database
with open('sbdb_asteroids.csv', newline='') as csv_file:
    # Create a csv reader object
    csv_reader = csv.reader(csv_file, delimiter = ',')
    columns = []

    # Collect every row of the file; the first row holds the column names
    for row in csv_reader:
        columns.append(row)
# Print the result (note that the total below also counts the header row)
column_names = columns[0]
print("List of column names: ", columns[0])
print("Total data point number: "+ str(len(columns)))

List of column names:  ['spec_B', 'spec_T', 'full_name', 'diameter', 'extent', 'albedo', 'a', 'q', 'i', 'GM', 'rot_per', 'BV', 'UB', 'IR']
Total data point number: 1242581


In [8]:
# Count the non-empty entries in each column, skipping the header row columns[0]
for i, name in enumerate(column_names):
    non_empty_number = sum(1 for data in columns[1:] if data[i] != '')
    print("The number of data with known "+name+": "+ str(non_empty_number))

The number of data with known spec_B: 1666
The number of data with known spec_T: 980
The number of data with known full_name: 1242580
The number of data with known diameter: 139680
The number of data with known extent: 20
The number of data with known albedo: 138546
The number of data with known a: 1242580
The number of data with known q: 1242580
The number of data with known i: 1242580
The number of data with known GM: 15
The number of data with known rot_per: 33350
The number of data with known BV: 1021
The number of data with known UB: 979
The number of data with known IR: 1
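
Since the mass estimator needs rows where several columns are known at once, a natural next step is to filter on combinations of columns rather than counting each one separately. The sketch below shows one way to do that with `csv.DictReader`. The helper `rows_with` is hypothetical, and the inline sample uses made-up placeholder values in a subset of the column layout above, not actual database entries.

```python
import csv
import io

# Made-up sample with a subset of the columns; values are placeholders
sample_csv = """full_name,diameter,albedo,GM
Object A,100.0,0.10,0.5
Object B,50.0,,
Object C,,0.20,
Object D,10.0,0.30,0.001
"""

def rows_with(rows, required):
    """Keep only the rows in which every required column is non-empty."""
    return [row for row in rows if all(row[name] != '' for name in required)]

# DictReader uses the header row as keys, so columns are addressed by name
rows = list(csv.DictReader(io.StringIO(sample_csv)))
usable = rows_with(rows, ["diameter", "GM"])
usable_names = [row["full_name"] for row in usable]
print(usable_names)
```

To run this on the real file, `io.StringIO(sample_csv)` would be replaced by the open file handle for `sbdb_asteroids.csv`.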
