# Entry 4 - Get the Data

In Entry 3, I defined my problem as:

**Holding all other factors constant, what mass is needed to retain an atmosphere on Mars?**

## The Problem

This sounds like a dataset I'm going to have to create myself.

*[Hands on Machine Learning with Scikit-Learn & TensorFlow](https://www.amazon.com/Hands-Machine-Learning-Scikit-Learn-TensorFlow/dp/1491962291)* recommends automating as much of the data acquisition process as possible, but this is a one-off dataset and the known parameters of planets don't change very often, so I'm not going to worry about that in this entry. If I were working on a project where I would connect to the data source again, like a Twitter NLP project, I would most certainly want to automate pulling data. Sounds like a project for another diary entry on a different mini-project.

If I were going to do any automation of this dataset, it would revolve around scraping table data from an HTML page. 
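
A minimal sketch of what that scraping step could look like, using only the standard library's `html.parser`. The `TableParser` class and the `SAMPLE_HTML` snippet are hypothetical stand-ins, not the real fact sheet markup; the sketch assumes the page lists bodies as columns and features as rows.

```python
from html.parser import HTMLParser


class TableParser(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows
        self._row = None    # row currently being built
        self._cell = None   # text fragments of the cell currently open

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that appears inside an open cell
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None and self._row is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


# Made-up stand-in for a downloaded page (assumption: bodies are columns)
SAMPLE_HTML = """
<table>
  <tr><th></th><th>MERCURY</th><th>VENUS</th><th>EARTH</th></tr>
  <tr><td>Mass (10^24 kg)</td><td>0.330</td><td>4.87</td><td>5.97</td></tr>
  <tr><td>Diameter (km)</td><td>4879</td><td>12,104</td><td>12,756</td></tr>
</table>
"""

parser = TableParser()
parser.feed(SAMPLE_HTML)
header, *body = parser.rows

# Pivot from "one row per feature" to "one dict per body"
bodies = {name.title(): {} for name in header[1:]}
for feature, *values in body:
    for name, value in zip(header[1:], values):
        bodies[name.title()][feature] = float(value.replace(",", ""))

print(bodies["Venus"])
```

pandas also provides `read_html`, which returns a list of DataFrames for the tables on a page, but it needs an HTML parser such as lxml installed.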

## The Options

The types of entities from which to draw the necessary data are rather limited. There are the planets, moons, and dwarf planets of this solar system and possibly exoplanets of other systems.

## The Proposed Solution

Fortunately, I didn't have to comb through information on each and every planetary body individually. The [planetary fact sheet](https://nssdc.gsfc.nasa.gov/planetary/factsheet/) has many of the features I need. It covers 8 planets, 1 moon, and 1 dwarf planet. Starting with this as a base and adding Titan, I gathered 27 features on 11 planetary bodies.

I considered including more moons (Jupiter has 79, Saturn 82, Uranus 27, and Neptune 14), but couldn't find sufficient information on the necessary features. The most important gap was the lack of information on atmospheric mass.
The same was true for exoplanets. They're just too far away to have good measurements.

In [1]:
import pandas as pd
planets = pd.read_excel('../data/planets_moons.xlsx')

In [2]:
print('The planetary bodies are:')
print(planets['name'].tolist())
print('The features are:')
print(planets.columns.tolist(), '\n')

The planetary bodies are:
['Mercury', 'Venus', 'Earth', 'Moon', 'Mars', 'Jupiter', 'Saturn', 'Titan', 'Uranus', 'Neptune', 'Pluto']
The features are:
['name', 'type', 'mass_1024kg', 'diameter_km', 'density_kg_m3', 'gravity_m_s2', 'escape_vel_km_s', 'rotation_period_hr', 'day_len_hr', 'distance_from_sun_106_km', 'perihelion_106\xa0km', 'aphelion_106\xa0km', 'orbital_period_days', 'orbital_velocity_km_s', 'orbital_inclination_degrees', 'orbital_eccentricity', 'obliquity_to_orbit_degrees', 'mean_temp_c', 'surface_pressure_bars', 'nbr_moons', 'rings', 'magnetic_field', 'equatorial_radius_km', 'mean_radius_km', 'V(1,0) (mag)', 'geometric_albedo', 'atmospheric_mass_kg'] 
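
Two of the column names in the output above contain a non-breaking space (`\xa0`), and `V(1,0) (mag)` mixes in parentheses and a comma, which makes them awkward to type later. Below is a minimal sketch of one way to normalize them. The `clean_column` helper and the snake_case target are assumptions for illustration, not something the dataset requires.

```python
import re


def clean_column(name):
    """Collapse every run of non-alphanumeric characters into one underscore."""
    cleaned = re.sub(r"[^0-9a-zA-Z]+", "_", name)
    # Drop leading/trailing underscores and lowercase the result
    return cleaned.strip("_").lower()


# A few of the column names printed above
raw = ['perihelion_106\xa0km', 'aphelion_106\xa0km', 'V(1,0) (mag)', 'mass_1024kg']
cleaned = [clean_column(c) for c in raw]
print(cleaned)
```

Applied to the DataFrame, this would be `planets.columns = [clean_column(c) for c in planets.columns]`.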


In [3]:
planets

| | name | type | mass_1024kg | diameter_km | density_kg_m3 | gravity_m_s2 | escape_vel_km_s | rotation_period_hr | day_len_hr | distance_from_sun_106_km | ... | mean_temp_c | surface_pressure_bars | nbr_moons | rings | magnetic_field | equatorial_radius_km | mean_radius_km | V(1,0) (mag) | geometric_albedo | atmospheric_mass_kg |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Mercury | planet | 0.33 | 4879.0 | 5427 | 3.7 | 4.3 | 1407.6 | 4222.6 | 57.9 | ... | 167 | 1e-14 | 0 | No | Yes | 2440.53 | 2439.4 | -0.6 | 0.106 | 1000.0 |
| 1 | Venus | planet | 4.87 | 12104.0 | 5243 | 8.9 | 10.4 | -5832.5 | 2802.0 | 108.2 | ... | 464 | 92.0 | 0 | No | No | 6051.8 | 6051.8 | -4.47 | 0.65 | 4.8e+20 |
| 2 | Earth | planet | 5.97 | 12756.0 | 5514 | 9.8 | 11.2 | 23.9 | 24.0 | 149.6 | ... | 15 | 1.014 | 1 | No | Yes | 6378.1366 | 6371.0084 | -3.86 | 0.367 | 1.4e+21 |
| 3 | Moon | moon | 0.073 | 3475.0 | 3340 | 1.6 | 2.4 | 655.7 | 708.7 | 149.6 | ... | -20 | 3e-15 | 0 | No | No | 1737.5 | 1737.4 | -0.08 | 0.12 | 100000.0 |
| 4 | Mars | planet | 0.642 | 6792.0 | 3933 | 3.7 | 5.0 | 24.6 | 24.7 | 227.9 | ... | -65 | 0.01 | 2 | No | No | 3396.19 | 3389.5 | -1.52 | 0.15 | 2.5e+16 |
| 5 | Jupiter | planet | 1898.0 | 142984.0 | 1326 | 23.1 | 59.5 | 9.9 | 9.9 | 778.6 | ... | -110 | 2.0 | 79 | Yes | Yes | 71492.0 | 69911.0 | -9.4 | 0.52 | 1.9e+27 |
| 6 | Saturn | planet | 568.0 | 120536.0 | 687 | 9.0 | 35.5 | 10.7 | 10.7 | 1433.5 | ... | -140 | 1000.0 | 82 | Yes | Yes | 60268.0 | 58232.0 | -8.88 | 0.47 | 5.4e+26 |
| 7 | Titan | moon | 0.126 | 5149.4 | 1882 | 1.4 | 2.6 | 382.0 | 382.0 | 1433.5 | ... | -179 | 1.6 | 0 | No | No | 2574.7 | 2574.7 | -8.1 | 0.21 | 9.1e+18 |
| 8 | Uranus | planet | 86.8 | 51118.0 | 1271 | 8.7 | 21.3 | -17.2 | 17.2 | 2872.5 | ... | -195 | 1000.0 | 27 | Yes | Yes | 25559.0 | 25362.0 | -7.19 | 0.51 | 8.6e+25 |
| 9 | Neptune | planet | 102.0 | 49528.0 | 1638 | 11.0 | 23.5 | 16.1 | 16.1 | 4495.1 | ... | -200 | 1000.0 | 14 | Yes | Yes | 24764.0 | 24622.0 | -6.87 | 0.41 | 1e+26 |
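
Because the columns were pulled from several sources, a quick consistency check between related features can catch transcription mistakes. Below is a minimal sketch that recomputes escape velocity from mass and mean radius with the standard formula $v_{esc} = \sqrt{2GM/r}$ and compares it with the listed value. The three rows are copied from the table above; the helper name and the choice of bodies are assumptions for illustration.

```python
import math

G = 6.674e-11  # gravitational constant, m^3 kg^-1 s^-2

# (mass_1024kg, mean_radius_km, escape_vel_km_s) copied from the table above
rows = {
    "Earth": (5.97, 6371.0084, 11.2),
    "Mars": (0.642, 3389.5, 5.0),
    "Titan": (0.126, 2574.7, 2.6),
}


def escape_velocity_km_s(mass_1024kg, radius_km):
    """v_esc = sqrt(2 G M / r), with inputs converted to SI units first."""
    mass_kg = mass_1024kg * 1e24
    radius_m = radius_km * 1e3
    return math.sqrt(2 * G * mass_kg / radius_m) / 1e3


differences = {}
for name, (mass, radius, listed) in rows.items():
    computed = escape_velocity_km_s(mass, radius)
    differences[name] = abs(computed - listed)
    print(f"{name}: computed {computed:.2f} km/s, listed {listed} km/s")
```

A large gap between the computed and listed values for a body would suggest that its mass, radius, or escape velocity was mistyped or taken from sources that disagree.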


## The Failure

This step was virtually error-free. Other than spending hours looking at a bunch of different sources and trying to find more information on moons and exoplanets, it went pretty smoothly.

The limited number of examples (only 11) may be problematic in future steps. For example, measuring how well the model performs could prove extremely challenging.
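
To make the concern concrete: with 11 rows, a conventional 80/20 split would leave only two or three bodies for testing. One common option for tiny datasets is leave-one-out cross-validation, where each body is held out once and the model is fit on the other ten. This is an assumption about what could work, not something decided in this entry. A minimal sketch of the splitting logic, with `leave_one_out` as a hypothetical helper:

```python
bodies = ['Mercury', 'Venus', 'Earth', 'Moon', 'Mars', 'Jupiter',
          'Saturn', 'Titan', 'Uranus', 'Neptune', 'Pluto']


def leave_one_out(items):
    """Yield (train, held_out) pairs, holding out each item exactly once."""
    for i, held_out in enumerate(items):
        train = items[:i] + items[i + 1:]
        yield train, held_out


splits = list(leave_one_out(bodies))
for train, held_out in splits[:3]:
    print(f"train on {len(train)} bodies, test on {held_out}")
```

Each of the 11 splits gives a single test prediction, so any performance estimate built this way would still be noisy.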

## Next Up

Explore the data.

#### Sources:

The planetary body data was retrieved from a variety of sources. There were also a couple of one-off searches for some of the more obscure information, like the V(1,0)(mag) for Titan. These were the major contributors:

- https://spacemath.gsfc.nasa.gov/Grade67/10Page7.pdf
- https://www.sciencedirect.com/topics/earth-and-planetary-sciences/planetary-atmosphere
- https://ssd.jpl.nasa.gov/?sat_phys_par
- https://ssd.jpl.nasa.gov/?planet_phys_par
- https://www.windows2universe.org/our_solar_system/planets_table.html
- https://www.windows2universe.org/our_solar_system/moons_table.html
- https://nssdc.gsfc.nasa.gov/planetary/factsheet/
- https://solarsystem.nasa.gov/moons/saturn-moons/titan/by-the-numbers/

![image.png](attachment:image.png)

