# Final Project: Exploring Planet Habitability

## Import Python Packages

In [3]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

import os
import re
import math
import time
import random

from time import sleep
from collections import Counter
from collections import defaultdict
from glob import glob
from tqdm import tqdm
from IPython.display import Image

## What makes a planet habitable?

[NASA Exoplanets Link](https://seec.gsfc.nasa.gov/what_makes_a_planet_habitable.html)

The text indicates some high-level factors that are needed to sustain life for a significant period of time.

- liquid water
    - The region around a star where liquid water can exist on a planet’s surface is called the “habitable zone.”


- sun
    - The planet needs to orbit a star.
    - There are different types of stars with different temperatures.
    - There is a model for students at the website below.
    
<font style='color:red'>Not done with list</font>
    

In [2]:
%%html
<iframe src='https://activity-player.concord.org/?page=page_100200&runKey=b6df6cb1-1a87-4120-9077-e7619785bde0&sequence=https%3A%2F%2Fauthoring.concord.org%2Fapi%2Fv1%2Fsequences%2F390.json&sequenceActivity=activity_7680' width="1200" height="800"></iframe>

## How do we find life outside our solar system? 

[Finding Habitable Planets](https://seec.gsfc.nasa.gov/finding_habitable_planets.html)

<p>We start by looking for planets that resemble Earth, the only planet we know of that is habitable. This means we look for planets that are roughly the same size as Earth and orbit the right distance from their star to support liquid water at the surface, known as the habitable zone.</p>

## Search for Habitable planet database

- Tons of links
    - [The Planetary Habitability Laboratory (PHL)](https://phl.upr.edu/data)
<br>
- One link that looked interesting<br>
    - [Habitable Zone Gallery](http://www.hzgallery.org/table.html)

### Take a look at the Habitable Zone Gallery

In [3]:
df_habitable_zone_gal = pd.read_csv('Raw/habitable_zone_data.csv')
df_habitable_zone_gal.head()

Unnamed: 0,PLANET,MASS,RADIUS,PERIOD,ECC,OMEGA,THZC,THZO,TEQA,TEQB,TEQC,TEQD,OHZIN,CHZIN,CHZOUT,OHZOUT
0,11 Com b,19.4,,326.0,0.231,94.8,0.0,0.0,1209.5,1017.1,956.0,803.9,10.481,13.276,24.285,25.615
1,11 UMi b,14.74,,516.2,0.08,90.0,0.0,0.0,1111.3,934.5,1025.6,862.5,12.783,16.192,30.355,32.018
2,14 And b,4.8,,185.8,0.0,0.0,0.0,0.0,1003.1,843.5,1003.1,843.5,6.013,7.617,13.89,14.651
3,16 Cyg B b,1.78,,798.5,0.68,90.0,21.2,29.2,480.8,404.3,209.9,176.5,0.842,1.066,1.881,1.984
4,17 Sco b,4.32,,578.4,0.06,57.0,0.0,0.0,1038.4,873.2,977.9,822.3,10.825,13.712,25.776,27.188


*__What do the columns mean__?*

In [4]:
%%html
<iframe src='http://www.hzgallery.org/key.html' width="800" height="600"></iframe>

*__What could we use this dataset for__?*

- Mass: with too much mass, the surface gravity would be too strong.
- Radius
- Potentially the equilibrium temperatures.
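
One way to put these columns to work is a simple filter. The sketch below retypes the five rows displayed above by hand and keeps the planets that spend any time in the habitable zone. It assumes, from the column names alone, that `THZC` and `THZO` are the percentage of the orbit spent in the conservative and optimistic habitable zones; that reading should be confirmed against the column key before relying on it.

```python
import pandas as pd

# The five rows displayed above, retyped by hand (subset of columns).
sample = pd.DataFrame(
    {
        "PLANET": ["11 Com b", "11 UMi b", "14 And b", "16 Cyg B b", "17 Sco b"],
        "MASS": [19.4, 14.74, 4.8, 1.78, 4.32],
        "THZC": [0.0, 0.0, 0.0, 21.2, 0.0],
        "THZO": [0.0, 0.0, 0.0, 29.2, 0.0],
    }
)

# ASSUMPTION: THZC > 0 means the planet spends part of its orbit
# in the conservative habitable zone.
in_hz = sample[sample["THZC"] > 0]
hz_planets = in_hz["PLANET"].tolist()
print(hz_planets)
```

On the full file, the same filter would be applied to `df_habitable_zone_gal` instead of the hand-built `sample`.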

### Let's keep looking...

## Going back to the original question, "What makes a planet habitable?"

Strategy: Come up with a short list of properties. Use this list to guide data search and exploration.


### According to the Natural History Museum in the UK, there are 8 ingredients for life in space.

*Each property is explained in greater detail on the web page below*

[Web Link to Article](https://www.nhm.ac.uk/discover/eight-ingredients-life-in-space.html)

1. Water (Number 1 factor)
    - The region around a star where liquid water can exist on a planet’s surface is called the “habitable zone.”
1. Carbon
    - Carbon is the simple building block that organisms need to form organic compounds such as proteins, carbohydrates and fats.
1. Nitrogen
    - Nitrogen is also needed to make DNA and RNA, the carriers of the genetic code for life on Earth.
1. Phosphorus
    - Phosphorus is a key component of adenosine triphosphate (ATP)
    - And like nitrogen, phosphorus is necessary to create DNA and RNA
1. Sulphur
1. Luck
    - Over time, major catastrophes such as asteroid impacts and massive volcanic eruptions have wiped out many species.
1. Time
    - The earliest fossil evidence suggests life began roughly 1.1 billion years after Earth formed.
1. Location
    -  Need to orbit a star
    -  Need to be in the Goldilocks zone: neither too near to nor too far from the star it orbits.
        -  What does this mean specifically?  
            - Distance?
            - Temperature of the star?
1. Atmosphere
    -  Without an ozone layer, life would not be possible due to extreme radiation.
    

<span style="color:blue">

Comment: We probably won't be able to find a dataset where all of the above factors are described for all planets.  Therefore, we should focus on the top 3 or 4.
    
    
From the NASA Exoplanet webpage below
- The planet needs to be roughly the same size as Earth.
- The planet needs to be about as far from the star it orbits as Earth is from the Sun.
- The star the planet orbits needs to be the right temperature.
- The surface temperature needs to be within a certain range (need to find this range).  This ties into the distance from the planet to the star it orbits.


</span>

[NASA Exoplanets](https://seec.gsfc.nasa.gov/finding_habitable_planets.html)
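
The link between star temperature, distance, and planet temperature can be made concrete with the standard blackbody equilibrium temperature, T_eq = T_star * sqrt(R_star / (2 * d)) * (1 - A)^(1/4). The sketch below is only an illustration: the function name is made up for this notebook, the Sun and Earth constants are approximate, and the albedo of 0.3 is a commonly quoted value for Earth rather than something taken from the pages above.

```python
import math

def equilibrium_temperature(t_star_k, r_star_m, distance_m, albedo=0.3):
    """Blackbody equilibrium temperature of a planet, in kelvin.

    Ignores any greenhouse effect, so it is a lower bound on the
    real surface temperature of a planet with an atmosphere.
    """
    return (
        t_star_k
        * math.sqrt(r_star_m / (2.0 * distance_m))
        * (1.0 - albedo) ** 0.25
    )

# Approximate values for the Sun and Earth's orbit (assumed constants)
T_SUN_K = 5772.0   # effective temperature of the Sun
R_SUN_M = 6.957e8  # radius of the Sun
AU_M = 1.496e11    # mean Earth-Sun distance

t_earth = equilibrium_temperature(T_SUN_K, R_SUN_M, AU_M)
t_far = equilibrium_temperature(T_SUN_K, R_SUN_M, 1.52 * AU_M)  # Mars-like distance

print(round(t_earth), round(t_far))
```

For Earth this gives roughly 255 K, which is below the freezing point of water; the greenhouse effect of the atmosphere accounts for the gap to Earth's actual average surface temperature. That is one reason distance and star temperature alone do not settle habitability.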

### Finding Habitable Planets

How do we find life outside our solar system? We start by looking for planets that resemble Earth, the only planet we know of that is habitable. This means we look for planets that are roughly the same size as Earth and orbit the right distance from their star to support liquid water at the surface, known as the habitable zone.

## So let's first get data on what makes Earth habitable.

#### Links

1. *The Lucky Facts article goes into even more factors that make Earth habitable than those listed above*<br>
[Lucky Facts Article](https://www.livescience.com/21546-earth-facts.html)

2. Place Holder


## Scrape Planet HTML Data Tables

In [4]:
# import the necessary libraries
import requests
from bs4 import BeautifulSoup
import lxml.html as lh

In [5]:
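# Get the web page content
url = 'https://nssdc.gsfc.nasa.gov/planetary/factsheet/index.html'  # not fetched here
url_2 = 'https://nssdc.gsfc.nasa.gov/planetary/factsheet/planet_table_ratio.html'  # ratio-to-Earth table
# Note: despite the variable name, this response holds the ratio-to-Earth table
fact_sheet_metric = requests.get(url_2)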

In [6]:
# Print the status code of the response object `fact_sheet_metric`
print(f'Status code for `fact_sheet_metric` response object: {fact_sheet_metric.status_code}')

fact_sheet_metric.text[:100]

Status code for `fact_sheet_metric` response object: 200


'<html>\n<head>\n<title>Planetary Fact Sheet - Ratio to Earth</title>\n</head>\n<body bgcolor=FFFFFF>\n<p>'

In [7]:
tbl = lh.fromstring(fact_sheet_metric.content)

In [8]:
#Parse data that are stored between <tr>..</tr> of HTML
tr_elements = tbl.xpath('//tr')

In [9]:
# Sanity check: the first 5 rows should all have the same number of cells
[len(T) for T in tr_elements[:5]]

[11, 11, 11, 11, 11]

In [10]:
# Parse the first row as the table header

tr_elements = tbl.xpath('//tr')
# Each entry in `col` is a (header name, list of column values) pair
col = []
# For each cell in the header row, store its text and an empty list for the column data
for i, t in enumerate(tr_elements[0], start=1):
    name = t.text_content()
    print('%d:"%s"' % (i, name))
    col.append((name, []))

1:" "
2:" MERCURY "
3:" VENUS "
4:" EARTH "
5:" MOON "
6:" MARS "
7:" JUPITER "
8:" SATURN "
9:" URANUS "
10:" NEPTUNE "
11:" PLUTO "


In [11]:
# Fill the column lists that will become the DataFrame
# Since our first row is the header, data is stored from the second row onwards
for j in range(1, len(tr_elements)):
    # T is our j'th row
    T = tr_elements[j]

    # If the row does not have 11 cells, the //tr data is not from our table
    if len(T) != 11:
        break

    # Iterate through each cell of the row; i is the column index
    for i, t in enumerate(T.iterchildren()):
        data = t.text_content()
        # Skip the first column (the parameter name); convert other numeric cells to float
        if i > 0:
            try:
                data = float(data)
            except ValueError:
                # Leave non-numeric cells (e.g. 'Yes', '0.00257*') as text
                pass
        # Append the data to the list of the i'th column
        col[i][1].append(data)

In [12]:
Dict={title:column for (title,column) in col}
df_solar_sys_planet_facts_metric=pd.DataFrame(Dict)

df_solar_sys_planet_facts_metric

Unnamed: 0,Unnamed: 1,MERCURY,VENUS,EARTH,MOON,MARS,JUPITER,SATURN,URANUS,NEPTUNE,PLUTO
0,Mass,0.0553,0.815,1.0,0.0123,0.107,317.8,95.2,14.5,17.1,0.0022
1,Diameter,0.383,0.949,1.0,0.2724,0.532,11.21,9.45,4.01,3.88,0.187
2,Density,0.985,0.951,1.0,0.606,0.714,0.241,0.125,0.23,0.297,0.336
3,Gravity,0.378,0.907,1.0,0.166,0.377,2.36,0.916,0.889,1.12,0.071
4,Escape Velocity,0.384,0.926,1.0,0.213,0.45,5.32,3.17,1.9,2.1,0.116
5,Rotation Period,58.8,-244.0,1.0,27.4,1.03,0.415,0.445,-0.72,0.673,6.41
6,Length of Day,175.9,116.8,1.0,29.5,1.03,0.414,0.444,0.718,0.671,6.39
7,Distance from Sun,0.387,0.723,1.0,0.00257*,1.52,5.2,9.57,19.17,30.18,39.48
8,Perihelion,0.313,0.731,1.0,0.00247*,1.41,5.04,9.23,18.58,30.4,30.16
9,Aphelion,0.459,0.716,1.0,0.00267*,1.64,5.37,9.91,19.73,29.97,48.49


In [20]:
# Give the columns clean names (the scraped headers carry surrounding whitespace)
df_solar_sys_planet_facts_metric.columns = ['Parameter', 'MERCURY', 'VENUS', 'EARTH', 'MOON', 'MARS', 'JUPITER',
       'SATURN', 'URANUS', 'NEPTUNE', 'PLUTO']
# Drop the last row of the scraped table
df_solar_sys_planet_facts_metric = df_solar_sys_planet_facts_metric.iloc[:-1, :]
df_solar_sys_planet_facts_metric

Unnamed: 0,Parameter,MERCURY,VENUS,EARTH,MOON,MARS,JUPITER,SATURN,URANUS,NEPTUNE,PLUTO
0,Mass,0.0553,0.815,1.0,0.0123,0.107,317.8,95.2,14.5,17.1,0.0022
1,Diameter,0.383,0.949,1.0,0.2724,0.532,11.21,9.45,4.01,3.88,0.187
2,Density,0.985,0.951,1.0,0.606,0.714,0.241,0.125,0.23,0.297,0.336
3,Gravity,0.378,0.907,1.0,0.166,0.377,2.36,0.916,0.889,1.12,0.071
4,Escape Velocity,0.384,0.926,1.0,0.213,0.45,5.32,3.17,1.9,2.1,0.116
5,Rotation Period,58.8,-244.0,1.0,27.4,1.03,0.415,0.445,-0.72,0.673,6.41
6,Length of Day,175.9,116.8,1.0,29.5,1.03,0.414,0.444,0.718,0.671,6.39
7,Distance from Sun,0.387,0.723,1.0,0.00257*,1.52,5.2,9.57,19.17,30.18,39.48
8,Perihelion,0.313,0.731,1.0,0.00247*,1.41,5.04,9.23,18.58,30.4,30.16
9,Aphelion,0.459,0.716,1.0,0.00267*,1.64,5.37,9.91,19.73,29.97,48.49


In [21]:
df_solar_sys_planet_facts_metric_T = df_solar_sys_planet_facts_metric.set_index('Parameter').T
df_solar_sys_planet_facts_metric_T

Parameter,Mass,Diameter,Density,Gravity,Escape Velocity,Rotation Period,Length of Day,Distance from Sun,Perihelion,Aphelion,Orbital Period,Orbital Velocity,Orbital Eccentricity,Obliquity to Orbit,Surface Pressure,Number of Moons,Ring System?,Global Magnetic Field?
MERCURY,0.0553,0.383,0.985,0.378,0.384,58.8,175.9,0.387,0.313,0.459,0.241,1.59,12.3,0.001,0.0,0.0,No,Yes
VENUS,0.815,0.949,0.951,0.907,0.926,-244.0,116.8,0.723,0.731,0.716,0.615,1.18,0.401,0.113*,92.0,0.0,No,No
EARTH,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,No,Yes
MOON,0.0123,0.2724,0.606,0.166,0.213,27.4,29.5,0.00257*,0.00247*,0.00267*,0.0748*,0.0343*,3.29,0.285,0.0,0.0,No,No
MARS,0.107,0.532,0.714,0.377,0.45,1.03,1.03,1.52,1.41,1.64,1.88,0.808,5.6,1.07,0.01,2.0,No,No
JUPITER,317.8,11.21,0.241,2.36,5.32,0.415,0.414,5.2,5.04,5.37,11.9,0.439,2.93,0.134,Unknown*,79.0,Yes,Yes
SATURN,95.2,9.45,0.125,0.916,3.17,0.445,0.444,9.57,9.23,9.91,29.4,0.325,3.38,1.14,Unknown*,82.0,Yes,Yes
URANUS,14.5,4.01,0.23,0.889,1.9,-0.72,0.718,19.17,18.58,19.73,83.7,0.228,2.74,4.17*,Unknown*,27.0,Yes,Yes
NEPTUNE,17.1,3.88,0.297,1.12,2.1,0.673,0.671,30.18,30.4,29.97,163.7,0.182,0.677,1.21,Unknown*,14.0,Yes,Yes
PLUTO,0.0022,0.187,0.336,0.071,0.116,6.41,6.39,39.48,30.16,48.49,247.9,0.157,14.6,2.45*,0.00001,5.0,No,Unknown


In [23]:
# Save the data to a csv
df_solar_sys_planet_facts_metric_T.to_csv('Raw/planet_facts_rel_earth_solar_system_metric_T.csv')
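
Because every value in this table is a ratio to Earth, "roughly the same size and distance as Earth" becomes a check for values near 1. The sketch below copies a few values from the table above into a small frame and applies two tolerances. Both tolerances are hypothetical numbers picked for illustration; nothing in the sources above specifies them.

```python
import pandas as pd

# A few ratio-to-Earth values copied from the table above
ratios = pd.DataFrame(
    {
        "Diameter": [0.949, 1.0, 0.532, 11.21],
        "Distance from Sun": [0.723, 1.0, 1.52, 5.2],
    },
    index=["VENUS", "EARTH", "MARS", "JUPITER"],
)

# HYPOTHETICAL tolerances: how far from Earth's value (1.0) a ratio may be
SIZE_TOL = 0.5
DIST_TOL = 0.5

close_in_size = (ratios["Diameter"] - 1.0).abs() <= SIZE_TOL
close_in_distance = (ratios["Distance from Sun"] - 1.0).abs() <= DIST_TOL

candidates = ratios[close_in_size & close_in_distance].index.tolist()
print(candidates)
```

With these tolerances Venus passes alongside Earth, even though its surface pressure in the same table is 92 times Earth's. Size and distance alone are clearly not enough, which supports looking for temperature and atmosphere data as well.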

In [15]:
### So let's find datasets with the above data in blue.  Maybe start with our 