# Homework 2: Exploring Solar System Bodies (Pandas introduction)
Welcome to Homework 2!

In this assignment, we will analyze data about celestial bodies in the solar system using Python, NumPy, and Pandas. The goals of this assignment are to:

 - Open a simple dataset formatted as JSON using pandas.
 - Apply simple statistical analysis to real-world data.
 - Refine Python programming skills through hands-on practice.
 - Ensure you can run Python and Python notebook environments (e.g., Jupyter Notebook, JupyterLab, Colab, VSCode) and troubleshoot any setup issues.

A key part of this homework is verifying that you can successfully run Python notebooks. If you encounter any difficulties, seek help from the instructor or AIs. You can also use Slack to ask questions or share insights. If you see a classmate struggling, helping them out contributes to a collaborative learning environment (and may count toward extra engagement points 😀).

In [1]:
# If you are running this notebook on your local machine, make sure
# all the dependencies are installed. Uncomment the line below to
# install them; this may also be needed in online environments such
# as Google Colab.
#
# !pip install numpy pandas
#
# Also copy the data file to the same directory as this notebook
# and update the paths accordingly.

### Instructions

1. Follow the instructions for setting up your Python and Jupyter (or VSCode) environment and for cloning or downloading our repository. The instructions can be found in the class notes.
2. Ensure that you have Python, Jupyter Notebook, and the necessary libraries installed (`NumPy` and `Pandas`).
3. Load the dataset `Datasets/sol_data.json` into a Pandas DataFrame.
4. Answer the questions below by writing Python code.
5. No plots or visualizations are required; your insights should come from code-based analysis and outputs.

### Dataset Overview
The dataset contains information about celestial objects, including:
- **isPlanet**: Indicates whether the object is a planet (`True` or `False`).
- **isDwarfPlanet**: Indicates whether the object is a dwarf planet (`True` or `False`).
- **orbit_type**: Classifies the object as "Primary" (planets) or "Secondary" (moons).
- Physical and orbital properties, such as **mass_kg**, **density**, **meanRadius**, **gravity**, **sideralOrbit**, and more.
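
As a quick illustration of how pandas turns JSON records into typed columns, here is a minimal sketch. The two records are hypothetical stand-ins (the values are not read from `sol_data.json`), and `sample` is a name introduced only for this example.

```python
from io import StringIO

import pandas as pd

# Hypothetical two-record sample, used only to show the mechanics.
sample_json = StringIO(
    '[{"eName": "Moon", "isPlanet": false, "orbit_type": "Secondary", "gravity": 1.62},'
    ' {"eName": "Earth", "isPlanet": true, "orbit_type": "Primary", "gravity": 9.8}]'
)

# read_json accepts a path or a file-like object.
sample = pd.read_json(sample_json)

print(sample.shape)   # (rows, columns)
print(sample.dtypes)  # JSON true/false values are parsed as booleans
sample.info()         # non-null counts per column
```

Checking `dtypes` and `info()` right after loading is a cheap way to spot columns that were read as text or that contain missing values.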


### Submission Guidelines

- Submit your completed notebook as an HTML export or a PDF file.

To export to HTML, if you are on Jupyter, select `File` > `Export Notebook As` > `HTML`.

If you are on VSCode, you can use the `Jupyter: Export to HTML` command:
 - Open the command palette (Ctrl+Shift+P or Cmd+Shift+P on Mac).
 - Search for `Jupyter: Export to HTML`.
 - Save the HTML file to your computer and submit it via Canvas.

---

> **Hint:** If you are learning pandas, check out our tutorials or the official documentation:
> - [Pandas Getting started](https://pandas.pydata.org/docs/getting_started/intro_tutorials/index.html)
> - [Pandas DataFrame API Documentation](https://pandas.pydata.org/docs/reference/frame.html)
> - [Our lecture on Pandas](https://filipinascimento.github.io/usable_ai/panda_basics)
> 
> 
> **Using Generative AI Responsibly**
>
> You're welcome to use Generative AI to assist your learning, but focus on understanding the concepts rather than just solving the assignment. For example:
>
> - Instead of asking: `What's the code to count moons orbiting each planet?`
> - Try asking: `How can I use Pandas to group and count values? Can you provide examples? Can you explain the steps?`
>
> This way, you will learn how the solution works while building your skills. Remember to give context to the generative AI, so it can better assist you. Talk to the instructor and AIs if you have any questions or need insights.

In [2]:
# Local directory
import os
print(os.getcwd())

c:\Ricardo\2025-02 SP25 USABLE ARTIFICIAL INTELLIGENCE\GitHub\usable_ai\Homework


In [3]:
import pandas as pd
import numpy as np

# Load the dataset
data = pd.read_json('../Datasets/sol_data.json')
# The ../ goes up one level in the directory structure (here, from the
# Homework folder to the repository root).
# Note that the path is relative to the location of the notebook file. Double-check
# that the path is correct for your system.
data.head()

| | eName | isPlanet | isDwarfPlanet | semimajorAxis | perihelion | aphelion | eccentricity | inclination | density | gravity | ... | orbits | bondAlbido | geomAlbido | RV_abs | p_transit | transit_visibility | transit_depth | massj | semimajorAxis_AU | grav_int |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Moon | False | False | 384400 | 363300 | 405500 | 0.0549 | 5.145 | 3.344 | 1.62 | ... | Earth | | | | 1.811589 | 326.086108 | 2.2e-09 | 3.9e-05 | 0.00257 | 6.606324e+25 |
| 1 | Phobos | False | False | 9376 | 9234 | 9518 | 0.0151 | 1.075 | 1.9 | 0.0057 | ... | Mars | | | | 74.272078 | 13368.973976 | 2.2e-09 | 0.0 | 6.3e-05 | 1.601437e+22 |
| 2 | Deimos | False | False | 23458 | 23456 | 23471 | 0.0002 | 1.075 | 1.75 | 0.003 | ... | Mars | | | | 29.686035 | 5343.486231 | 2.2e-09 | 0.0 | 0.000157 | 5.792534e+20 |
| 3 | Io | False | False | 421800 | 0 | 0 | 0.004 | 0.036 | 3.53 | 1.79 | ... | Jupiter | | | | 1.6552 | 297.93606 | 6.8425e-06 | 4.7e-05 | 0.00282 | 6.666188e+25 |
| 4 | Europa | False | False | 671100 | 0 | 0 | 0.009 | 0.466 | 3.01 | 1.31 | ... | Jupiter | | | | 1.039939 | 187.188949 | 5.024e-06 | 2.5e-05 | 0.004486 | 1.415488e+25 |


### 1. General Information

- How many objects are in the dataset?
- How many are planets? How many are moons?


In [92]:
# Total number of objects
# Fill in code to calculate total number of objects
# Shape of data
print('Data shape:', data.shape)
print('Rows:',data.shape[0],'\tCols:',data.shape[1])

print('\nFor reference')
print('\nColumn names\n', data.columns.to_list())

Data shape: (265, 32)
Rows: 265 	Cols: 32

For reference

Column names
 ['eName', 'isPlanet', 'isDwarfPlanet', 'semimajorAxis', 'perihelion', 'aphelion', 'eccentricity', 'inclination', 'density', 'gravity', 'escape', 'meanRadius', 'equaRadius', 'polarRadius', 'flattening', 'dimension', 'sideralOrbit', 'sideralRotation', 'discoveryDate', 'mass_kg', 'volume', 'orbit_type', 'orbits', 'bondAlbido', 'geomAlbido', 'RV_abs', 'p_transit', 'transit_visibility', 'transit_depth', 'massj', 'semimajorAxis_AU', 'grav_int']


In [5]:
# Number of planets
# Fill in code to calculate number of planets

# Select rows where the isPlanet column is True, assuming the flag is True only for planets
planets = data[data['isPlanet'] == True]

print('Number of planets:', planets['eName'].count())
print('List of planets:', planets['eName'].to_list())

Number of planets: 8
List of planets: ['Uranus', 'Neptune', 'Jupiter', 'Mars', 'Mercury', 'Saturn', 'Earth', 'Venus']


In [74]:
# Number of moons
# Fill in code to calculate number of moons

# Select rows where 'isPlanet' is False and 'orbit_type' is 'Secondary' and 'isDwarfPlanet' is False :: Moons
# Reference https://science.nasa.gov/solar-system/moons/
moons = data[(data['isPlanet'] == False) & (data['orbit_type'] == 'Secondary') & (data['isDwarfPlanet'] == False)]

print('Number of moons:', moons['eName'].count())

Number of moons: 205


> **Hint**: By moon we mean a natural satellite of a planet or another object in the solar system. Take a look at the columns and see if you can identify the criteria for classifying an object as a moon. Ask the instructor or AIs for help if needed. 
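
One way to explore candidate criteria is to count how the classification columns combine. The sketch below uses a small hypothetical table (the rows and their flag values are made up for illustration and may not match `sol_data.json`), and the "has a host body" rule at the end is only one possible definition, not the assignment's official one.

```python
import pandas as pd

# Hypothetical rows for illustration only (not taken from sol_data.json).
toy = pd.DataFrame({
    "eName": ["Earth", "Moon", "Pluto", "Charon", "Phobos"],
    "isPlanet": [True, False, False, False, False],
    "isDwarfPlanet": [False, False, True, False, False],
    "orbit_type": ["Primary", "Secondary", "Primary", "Secondary", "Secondary"],
    "orbits": [None, "Earth", None, "Pluto", "Mars"],
})

# Count every combination of the three classification columns.
combos = toy.groupby(["isPlanet", "isDwarfPlanet", "orbit_type"]).size()
print(combos)

# Cross-tabulate two of them for a more compact view.
print(pd.crosstab(toy["isPlanet"], toy["orbit_type"]))

# One candidate definition of a moon: any object with a recorded host body.
moon_mask = toy["orbits"].notna()
print(toy.loc[moon_mask, "eName"].tolist())
```

Running the same counts on the real `data` frame shows whether different candidate definitions select the same rows or disagree on some objects.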

### 2. Planets

- What is the mean density of all planets?
- Which planet has the highest surface gravity, and what is its gravity value?
- List all planets in descending order of their mass.


In [93]:

# Mean density of all planets
# Fill in code

print('Mean density of Planets:', planets['density'].mean())

# Planet with the highest surface gravity
# Fill in code

planet_highest_gravity = planets.sort_values(by='gravity', ascending=False).iloc[0]    # position 0 is the max

print('\nPlanet with the highest surface gravity:', planet_highest_gravity['eName'], planet_highest_gravity['gravity'])

# Planets by descending mass
# Fill in code

# Assuming field 'mass_kg' is actually the mass in kg of a given object
planets_by_mass = planets.sort_values(by='mass_kg', ascending=False)

print('\nPlanets by mass descending:')
display(planets_by_mass[['eName', 'mass_kg']])


Mean density of Planets: 3.1301375

Planet with the highest surface gravity: Jupiter 24.79

Planets by mass descending:


| | eName | mass_kg |
|---|---|---|
| 238 | Jupiter | 1.9e+27 |
| 241 | Saturn | 5.68e+26 |
| 219 | Neptune | 1.02e+26 |
| 199 | Uranus | 8.68e+25 |
| 243 | Earth | 5.97e+24 |
| 244 | Venus | 4.87e+24 |
| 239 | Mars | 6.42e+23 |
| 240 | Mercury | 3.3e+23 |
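
Sorting the whole table and taking the first row works, but pandas also offers more direct tools for "largest value" questions. The following is a minimal sketch on a hypothetical three-row table; the names and numbers are invented for illustration.

```python
import pandas as pd

# Hypothetical values for illustration; not read from the dataset.
toy_planets = pd.DataFrame({
    "eName": ["Alpha", "Beta", "Gamma"],
    "gravity": [3.7, 24.8, 9.8],
    "mass_kg": [6.4e23, 1.9e27, 6.0e24],
})

# idxmax returns the index label of the largest value; .loc fetches that row.
top = toy_planets.loc[toy_planets["gravity"].idxmax()]
print(top["eName"], top["gravity"])

# nlargest keeps only the n largest rows, already in descending order.
heaviest = toy_planets.nlargest(2, "mass_kg")
print(heaviest[["eName", "mass_kg"]])
```

`idxmax` avoids sorting the whole frame, and `nlargest` reads closer to the question being asked when only the top few rows matter.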


### 3. Moons (Satellites)
- How many moons orbit each planet? Present this as a table or dictionary.
- What is the average radius (meanRadius) of all moons?
- Compare the average surface gravity of moons to that of planets.


In [108]:
# Number of moons orbiting each planet
# Fill in code

# Grouping by 'orbits' regardless of whether the host is a planet or not
# moons_planet = moons.groupby('orbits')
# display(moons_planet.size())

# filter 'orbits' so it matches a planet name
filtered_moons_planet = moons[moons['orbits'].isin(planets['eName'])]

print('Number of moons for each planet')

# Group and then use size to get the number of moons per 'orbits' that now only have planets
display(filtered_moons_planet.groupby('orbits').size())       

# Average radius of all moons
# Fill in code

# Use the moons table, which contains all satellites
print('Average radius of all moons (satellites):', moons['meanRadius'].mean())

# Compare average surface gravity of moons vs. planets
# Fill in code

print('\nAverage gravity \nPlanets:',planets['gravity'].mean(), '\tMoons:', moons['gravity'].mean())


Number of moons for each planet


orbits
Earth       1
Jupiter    79
Mars        2
Neptune    14
Saturn     65
Uranus     27
dtype: int64

Average radius of all moons (satellites): 120.96439024390246

Average gravity 
Planets: 10.16625 	Moons: 0.042440731707317075


### 4. Orbital Properties

- Which object has the highest orbital eccentricity, and what is its value?
- Calculate the average semi-major axis (semimajorAxis) for planets and compare it to that of moons.
- Identify the moon with the shortest orbital period (sideralOrbit) and the planet it orbits.


In [148]:
# Highest orbital eccentricity
# Fill in code

# Sort all objects in data by eccentricity, descending; the highest is then at position 0
highest_obj_by_eccentricity = data.sort_values(by='eccentricity', ascending=False).iloc[0]
print('Highest object by eccentricity\n')
display(highest_obj_by_eccentricity[['eName','eccentricity']])

# Average semi-major axis of planets vs. moons
# Fill in code

avg_semimajorAxis_planets = planets['semimajorAxis'].mean()
avg_semimajorAxis_moons = moons['semimajorAxis'].mean()

print('Average semi-major axis\nPlanets:',f"{avg_semimajorAxis_planets:>18,.2f}",'\nMoons:\t',f"{avg_semimajorAxis_moons:>18,.2f}")
print('Ratio Planets/Moons:',f"{avg_semimajorAxis_planets/avg_semimajorAxis_moons:.2f}")

# Moon with the shortest orbital period
# Fill in code

shortest_orbital_period = filtered_moons_planet.sort_values(by='sideralOrbit', ascending=True)

print('\nShortest orbital period of moon belonging to a planet:')
display(shortest_orbital_period[['eName', 'orbits','sideralOrbit']].iloc[0])

print('List to double check:\n')
display(shortest_orbital_period[['eName', 'orbits','sideralOrbit']])


Highest object by eccentricity



eName           Nereid
eccentricity    0.7512
Name: 155, dtype: object

Average semi-major axis
Planets:   1,264,715,207.25 
Moons:	      12,257,587.94
Ratio Planets/Moons: 103.18

Shortest orbital period of moon belonging to a planet:


eName           Ferdinand
orbits             Uranus
sideralOrbit      -2823.4
Name: 150, dtype: object

List to double check:



| | eName | orbits | sideralOrbit |
|---|---|---|---|
| 150 | Ferdinand | Uranus | -2823.4 |
| 145 | Setebos | Uranus | -2234.8 |
| 144 | Prospero | Uranus | -1977.3 |
| 143 | Sycorax | Uranus | -1283.4 |
| 147 | Trinculo | Uranus | -758.1 |
| ... | ... | ... | ... |
| 162 | Halimede | Neptune | 1879.7 |
| 164 | Sao | Neptune | 2914.1 |
| 165 | Laomedeia | Neptune | 3167.9 |
| 163 | Psamathe | Neptune | 9115.9 |
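
The listing above contains negative `sideralOrbit` values. These presumably encode retrograde orbits (an assumption; the dataset's documentation is not quoted here). If so, a plain ascending sort ranks the most negative value first, which is not the same as the shortest period. The sketch below, on a hypothetical three-moon table, shows how sorting on the absolute value would change the ranking.

```python
import pandas as pd

# Hypothetical moons; a negative period stands in for a retrograde orbit (assumption).
toy_moons = pd.DataFrame({
    "eName": ["A", "B", "C"],
    "orbits": ["X", "X", "Y"],
    "sideralOrbit": [-2800.0, 0.3, 12.5],
})

# A plain ascending sort puts the most negative value first.
naive_first = toy_moons.sort_values("sideralOrbit").iloc[0]
print(naive_first["eName"])

# Sorting on the absolute value ranks by period length, ignoring direction.
by_length = toy_moons.sort_values("sideralOrbit", key=lambda s: s.abs())
print(by_length.iloc[0][["eName", "orbits", "sideralOrbit"]])
```

The `key` argument of `sort_values` requires pandas 1.1 or later; on older versions, adding a helper column with `abs()` and sorting on it would give the same ordering.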


### 5. Discovery Dates

- How many objects have recorded discovery dates?
- Which moon (other than ours) has the earliest recorded discovery date, and when was it discovered?

> Look at the format of dates in the dataset. You will find NA values for objects without recorded discovery dates. Also, some dates are just a year, while others are more precise.
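
Mixed date formats can often be handled by reducing every entry to its year. Here is a minimal sketch under stated assumptions: the date strings below are hypothetical, the exact formats in the real `discoveryDate` column should be checked first, and `discoveryYear` is a helper column introduced only for this example.

```python
import pandas as pd

# Hypothetical date strings; the formats in the real column are an assumption.
toy = pd.DataFrame({
    "eName": ["A", "B", "C", "D"],
    "discoveryDate": ["08/01/1610", "1877", None, "25/03/1655"],
})

# count() skips missing values, so this is the number of recorded dates.
n_dated = toy["discoveryDate"].count()
print(n_dated)

# Pull the first run of four digits out of each string as the discovery year.
year = toy["discoveryDate"].str.extract(r"(\d{4})", expand=False)
toy["discoveryYear"] = pd.to_numeric(year, errors="coerce")

# idxmin ignores NaN, so undated rows are skipped.
earliest = toy.loc[toy["discoveryYear"].idxmin()]
print(earliest["eName"], int(earliest["discoveryYear"]))
```

If the real column stores empty strings instead of missing values, `count()` would include them, so it is worth inspecting a few raw entries before trusting the count.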


In [None]:
# Objects with discovery dates
# Fill in code



# Oldest discovered moon
# Fill in code

### 6. Advanced Analysis

- Calculate the average density of moons that orbit planets with a mass greater than Earth's mass (`5.97e24 kg`).
- Group all objects by their `orbit_type` and compute the average orbital eccentricity for each group.
- Identify the top 3 moons with the highest escape velocity (escape).
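
The three questions above combine a lookup, a grouped mean, and a top-n selection. The sketch below shows those pandas patterns on a hypothetical six-row table; every name and value is invented, and `toy_planets` and `toy_moons` are named so that they do not overwrite the notebook's own variables.

```python
import pandas as pd

# Hypothetical bodies; all values are made up for illustration.
bodies = pd.DataFrame({
    "eName": ["P1", "P2", "m1", "m2", "m3", "m4"],
    "isPlanet": [True, True, False, False, False, False],
    "orbit_type": ["Primary", "Primary", "Secondary", "Secondary", "Secondary", "Secondary"],
    "orbits": [None, None, "P1", "P1", "P2", "P2"],
    "mass_kg": [1.0e27, 1.0e23, 1.0e20, 2.0e20, 3.0e20, 4.0e20],
    "density": [1.3, 5.4, 2.0, 3.0, 1.0, 4.0],
    "eccentricity": [0.05, 0.20, 0.01, 0.03, 0.50, 0.06],
    "escape": [60000.0, 4000.0, 2500.0, 1500.0, 900.0, 3000.0],
})
toy_planets = bodies[bodies["isPlanet"]]
toy_moons = bodies[bodies["orbit_type"] == "Secondary"]

# Lookup: map each moon's host name to that host's mass.
host_mass = toy_moons["orbits"].map(toy_planets.set_index("eName")["mass_kg"])
threshold = 5.97e24  # Earth's mass in kg, as given in the question
dense_mean = toy_moons.loc[host_mass > threshold, "density"].mean()
print(dense_mean)

# Grouped mean: one average per orbit_type.
ecc_by_type = bodies.groupby("orbit_type")["eccentricity"].mean()
print(ecc_by_type)

# Top-n: the three rows with the largest escape velocity.
top3 = toy_moons.nlargest(3, "escape")
print(top3[["eName", "escape"]])
```

Mapping through a name-indexed Series keeps the moon table's row order and index, so the resulting boolean mask can be used directly with `.loc`.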


In [11]:
# Average density of moons orbiting planets with mass > Earth
# Fill in code

# Average orbital eccentricity by orbit_type
# Fill in code

# Top 3 moons with highest escape velocity
# Fill in code

### 7. Extra questions

1. How many moons have a mass less than 10% of the mass of Earth's Moon? What percentage of all moons does this represent?
2. Calculate the ratio of moons to planets in the dataset. Which planet has the highest number of moons relative to its mass?
3. Group moons by their host planet and calculate the average density for each group. Which planet hosts moons with the highest average density?
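
These questions rely on two small pandas idioms: the mean of a boolean mask is a fraction, and arithmetic between two Series aligns on index labels. Here is a minimal sketch with hypothetical data; `reference_mass` is a stand-in for the Moon's mass, and `toy_planet_mass` is an invented lookup table.

```python
import pandas as pd

# Hypothetical moons and host masses; values are made up for illustration.
toy_moons = pd.DataFrame({
    "eName": ["m1", "m2", "m3", "m4"],
    "orbits": ["P1", "P1", "P2", "P2"],
    "mass_kg": [7.0e22, 1.0e21, 5.0e20, 9.0e22],
    "density": [3.0, 2.0, 1.0, 2.0],
})
toy_planet_mass = pd.Series({"P1": 6.0e24, "P2": 2.0e27})
reference_mass = 7.0e22  # stand-in for the mass of Earth's Moon

# A boolean mask: sum() counts the True values, mean() gives their fraction.
small = toy_moons["mass_kg"] < 0.10 * reference_mass
n_small = int(small.sum())
pct_small = small.mean() * 100
print(n_small, pct_small)

# Moons per host, divided by host mass; the division aligns on the host name.
moons_per_planet = toy_moons.groupby("orbits").size()
moons_per_mass = moons_per_planet / toy_planet_mass
print(moons_per_mass.idxmax())

# Average moon density per host, and the host with the highest average.
mean_density = toy_moons.groupby("orbits")["density"].mean()
print(mean_density.idxmax())
```

Because the division aligns on labels, a host that appears in only one of the two Series would produce `NaN` rather than an error, which is worth checking for on real data.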

In [12]:
# Moons with a mass less than 10% of the Moon's mass, and their percentage of all moons
# Fill in code

# Ratio of moons to planets and planet with highest moon to mass ratio
# Fill in code

# Average density of moons per planet
# Fill in code
