# Assignment #7 - Data Gathering and Warehousing - DSSA-5102

Instructor: Melissa Laurino<br>
Spring 2025<br>

Name: Tara Jacobsen
<br>
Date: 10 April 2025
<br>
<br>
**At this time in the semester:** <br>
- We have explored a dataset. <br>
- We have cleaned our dataset. <br>
- We created a GitHub account with a repository for this class and included a metadata README file about our data. <br>
- We introduced general SQL syntax, queries, and applications in Python.<br>
- We created our own databases from scratch using MySQL Workbench and Python with SQLAlchemy/MySQL Connector, both on our local MySQL server and as local files on our machine.
<br>

Now we will create and populate **all** tables for our dataset in our database and finalize our EER diagram.<br>

We created a database three different ways in our previous assignment: one database on our local MySQL server, one test database stored locally that integrates with MySQL, and one test database stored only locally as a .db file on your machine. Now you will create all tables and populate them with the data from your dataset (feel free to practice with all methods, but the first method is encouraged because it allows you to create your schema diagram). After populating your database, create a visual database schema diagram in MySQL Workbench. <br>
<br>
Be sure to comment all code. Include a .png image of your database schema from MySQL Workbench in your Blackboard submission or GitHub repository.

In [392]:
# Load necessary packages:
from sqlalchemy import create_engine, Column, String, Integer, Boolean, BigInteger, Float, text # Database navigation
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import sessionmaker
import mysql.connector
import sqlite3 # A second option for working with databases
import pandas as pd # Python data manipulation
import math

In [407]:
# Connect to the MySQL server 

conn = mysql.connector.connect(
        host="localhost", 
        user="root", 
        password="password") 


cursor = conn.cursor()

In [409]:
# Connect to database
DATABASE_URL = "mysql+mysqlconnector://root:password@localhost/exoplanets"
# Use MySQL Connector to connect to the database
engine = create_engine(DATABASE_URL) # Creates a connection to the MySQL database

print("Connected to MySQL database successfully!")

Connected to MySQL database successfully!


In [411]:
# Read in the CLEAN .csv file (Using pandas) we will use to populate our database. This is the same dataset that you cleaned for Assignment #2!
exoplanets = pd.read_csv("exoplanets_clean.csv")

# Create a function that replaces NA values with None, which the database driver inserts as NULL. It is applied to each value during the import,
# because converting the NA values before importing still returned errors.
def replace_na_with_null(value): 
    return value if pd.notna(value) else None # Return the normal value, unless it is na, then return None
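
The helper above can be checked on a few representative values before it is used in an import. This is a minimal sketch: the sample values are hypothetical stand-ins for cells of one DataFrame row, not values taken from the dataset.

```python
import pandas as pd

def replace_na_with_null(value):
    """Return None for any value pandas treats as missing, else the value unchanged."""
    return value if pd.notna(value) else None

# Hypothetical sample values: a float, three kinds of missing value, and a string
samples = [5.30, float("nan"), None, pd.NA, "Transit"]
cleaned = [replace_na_with_null(v) for v in samples]
print(cleaned)
```

Every kind of missing value collapses to `None`, which DB-API drivers send to the database as SQL NULL, while ordinary values pass through untouched.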


The columns being imported to the second table (system_info) will be: <br>
num_stars_in_system, int <br>
num_planets_in_system, int <br>
num_moons_in_system, int <br>
is_circumbinary, boolean <br>

All data types are integers except is_circumbinary, which is a boolean.


In [79]:
second_table_query = """CREATE TABLE IF NOT EXISTS system_info (
    system_id INT AUTO_INCREMENT PRIMARY KEY,
    num_stars_in_system INT,
    num_planets_in_system INT,
    num_moons_in_system INT,
    is_circumbinary BOOLEAN,
    planet_id INT, 
    FOREIGN KEY (planet_id) REFERENCES planet_identifiers(id)
);
"""

# Create a query for the second table (first table is already imported from previous assignment). Create a table with autoincrementing primary ID and link foreign id from planet_identifiers table. 

In [81]:
#Execute the query:
with engine.connect() as connection:
    connection.execute(text(second_table_query))
    print("Second table created successfully!")

Second table created successfully!


In [83]:
# Make sure the MySQL Connector cursor is using the correct database
cursor.execute("USE exoplanets;")

for _, row in exoplanets.iterrows():
    # Find the id in planet_identifiers that matches this row's planet_name
    cursor.execute("SELECT id FROM planet_identifiers WHERE planet_name = %s", (row['planet_name'],))
    result = cursor.fetchone() # Fetch the matching id, or None if there is no match
    if result is None:
        continue # Skip rows with no matching planet rather than failing on result[0]
    planet_identifiers_id = result[0] # Assign the id to planet_identifiers_id

    # Insert the row into system_info
    cursor.execute("""INSERT INTO system_info 
                      (num_stars_in_system, num_planets_in_system, num_moons_in_system, is_circumbinary, planet_id)
                      VALUES (%s, %s, %s, %s, %s)
                     """, 
                     (row['num_stars_in_system'], 
                      row['num_planets_in_system'],
                      row['num_moons_in_system'],
                      row['is_circumbinary'],
                      planet_identifiers_id))  # Use the fetched planet_identifiers_id

conn.commit()
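
The loop above runs one SELECT and one INSERT per row. As a possible alternative, the name-to-id lookup can be built once and the rows inserted in a single `executemany` call. The sketch below is an assumption-laden illustration, not the method used in this notebook: it uses an in-memory SQLite database so it runs anywhere, a reduced `system_info` table, and hypothetical planet names. With MySQL Connector the placeholders would be `%s` instead of `?`.

```python
import sqlite3

# In-memory SQLite database, used only so the sketch is self-contained
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE planet_identifiers (id INTEGER PRIMARY KEY, planet_name TEXT)")
cur.execute("""CREATE TABLE system_info (
    system_id INTEGER PRIMARY KEY AUTOINCREMENT,
    num_stars_in_system INTEGER,
    planet_id INTEGER,
    FOREIGN KEY (planet_id) REFERENCES planet_identifiers(id))""")

# Hypothetical parent rows and source rows (stand-ins for the cleaned DataFrame)
cur.executemany("INSERT INTO planet_identifiers (planet_name) VALUES (?)",
                [("Planet A",), ("Planet B",)])
rows = [
    {"planet_name": "Planet A", "num_stars_in_system": 1},
    {"planet_name": "Planet B", "num_stars_in_system": 2},
    {"planet_name": "Planet C", "num_stars_in_system": 1},  # has no parent row
]

# One SELECT builds the whole name -> id lookup
cur.execute("SELECT planet_name, id FROM planet_identifiers")
id_by_name = dict(cur.fetchall())

# Keep rows whose planet is known; record the rest for inspection
params = [(r["num_stars_in_system"], id_by_name[r["planet_name"]])
          for r in rows if r["planet_name"] in id_by_name]
skipped = [r["planet_name"] for r in rows if r["planet_name"] not in id_by_name]

# One batched INSERT instead of one statement per row
cur.executemany("INSERT INTO system_info (num_stars_in_system, planet_id) VALUES (?, ?)", params)
conn.commit()

inserted = cur.execute("SELECT COUNT(*) FROM system_info").fetchone()[0]
print(inserted, skipped)
```

Collecting the skipped names makes it easy to see which planets in the source file have no match in `planet_identifiers`.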

In [84]:
# Ensure table imported with sample query. 
with engine.connect() as connection:  
    practice_query = text("""SELECT num_stars_in_system, COUNT(*) as count
                                 FROM exoplanets.system_info
                                 GROUP BY num_stars_in_system
                                 ORDER BY count DESC
                                 LIMIT 4;
                                 """) # Count the frequency of the number of stars in systems
    practice_query = pd.read_sql(practice_query, connection) 
    
practice_query

# Most systems with observable exoplanets are a single star, with some binary star systems, and very few ternary
# and quaternary star systems. 

Unnamed: 0,num_stars_in_system,count
0,1,5259
1,2,458
2,3,69
3,4,2


In [85]:
# Group the data by count of num of stars
star_count = exoplanets.groupby('num_stars_in_system').size().reset_index(name='count')

# Sort the DataFrame by 'count' in descending order and take the top 4
star_count = star_count.sort_values(by='count', ascending=False).head(4)

print(star_count)
# Info matches!

   num_stars_in_system  count
0                    1   5259
1                    2    458
2                    3     69
3                    4      2
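
The comparison above is done by eye. It could also be done in code, so that a mismatch is impossible to miss. The following is a minimal sketch under stated assumptions: it uses an in-memory SQLite table and a short hypothetical list of `num_stars_in_system` values in place of the MySQL table and the DataFrame.

```python
import sqlite3
from collections import Counter

# Hypothetical source values standing in for the DataFrame column
source_values = [1, 1, 2, 1, 3, 2]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE system_info (num_stars_in_system INTEGER)")
conn.executemany("INSERT INTO system_info VALUES (?)", [(v,) for v in source_values])

# Frequency of each value according to the database
db_counts = dict(conn.execute(
    "SELECT num_stars_in_system, COUNT(*) FROM system_info GROUP BY num_stars_in_system"))

# Frequency of each value according to the source data
source_counts = dict(Counter(source_values))

counts_match = db_counts == source_counts
print(counts_match)
```

With pandas, a similar check could compare the two result DataFrames with `pd.testing.assert_frame_equal` once their indexes and column types have been aligned.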


The columns being imported to the third table (discovery_info) will be: <br>
discovery_method <br>
discovery_year <br>
discovery_publication_date <br>
discovery_locale <br>
discovery_facility <br>
discovery_telescope <br>
discovery_instrument <br>
detected_by_radial_velocity <br>
detected_by_pulsar_timing <br>
detected_by_pulsation_timing_variations <br>
detected_by_transit <br>
detected_by_astrometry <br>
detected_by_orbital_brightness_modulation<br>
detected_by_microlensing<br>
detected_by_eclipse_timing_variations<br>
detected_by_imaging<br>
detected_by_disk_kinematics<br>
is_controversial

In [124]:
third_table_query = """CREATE TABLE IF NOT EXISTS discovery_info (
    discovery_id INT AUTO_INCREMENT PRIMARY KEY,
    discovery_method VARCHAR(50),
    discovery_year INT,
    discovery_publication_date DATE,
    discovery_locale VARCHAR(50),
    discovery_facility VARCHAR(50),
    discovery_telescope LONGTEXT,
    discovery_instrument VARCHAR(50),
    detected_by_radial_velocity BOOLEAN,
    detected_by_pulsar_timing BOOLEAN, 
    detected_by_pulsation_timing_variations BOOLEAN, 
    detected_by_transit BOOLEAN, 
    detected_by_astrometry BOOLEAN, 
    detected_by_orbital_brightness_modulation BOOLEAN, 
    detected_by_microlensing BOOLEAN, 
    detected_by_eclipse_timing_variations BOOLEAN, 
    detected_by_imaging BOOLEAN, 
    detected_by_disk_kinematics BOOLEAN,
    is_controversial BOOLEAN,
    planet_id INT, 
    FOREIGN KEY (planet_id) REFERENCES planet_identifiers(id)
);
"""
# Create a query for the table. Create a table with autoincrementing primary ID and link foreign id from planet_identifiers table. 

In [126]:
with engine.connect() as connection:
    connection.execute(text(third_table_query))
    print("Third table created successfully!")

Third table created successfully!


In [128]:
with engine.connect() as connection:
    # Make sure MySQL is using the correct database
    cursor.execute("USE exoplanets;")

    # Populate the discovery_info table
    for _, row in exoplanets.iterrows():
        # Find the id in planet_identifiers that matches this row's planet_name
        cursor.execute("SELECT id FROM planet_identifiers WHERE planet_name = %s", (row['planet_name'],))
        result = cursor.fetchone() # Fetch this id number
        planet_identifiers_id = result[0]
        
        cursor.execute("""INSERT INTO discovery_info (discovery_method, discovery_year, discovery_publication_date,
        discovery_locale, discovery_facility, discovery_telescope, discovery_instrument,
        detected_by_radial_velocity, detected_by_pulsar_timing, detected_by_pulsation_timing_variations,
        detected_by_transit, detected_by_astrometry, detected_by_orbital_brightness_modulation,
        detected_by_microlensing, detected_by_eclipse_timing_variations, detected_by_imaging,
        detected_by_disk_kinematics, is_controversial, planet_id)
                          VALUES (%s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s) 
                        """, (row['discovery_method'], # %s acts as a placeholder for values that will be inserted into the table
                              row['discovery_year'],
                              row['discovery_publication_date'],
                              row['discovery_locale'],
                              row['discovery_facility'],
                              row['discovery_telescope'],
                              row['discovery_instrument'],
                              row['detected_by_radial_velocity'],
                              row['detected_by_pulsar_timing'],
                              row['detected_by_pulsation_timing_variations'],
                              row['detected_by_transit'],
                              row['detected_by_astrometry'],
                              row['detected_by_orbital_brightness_modulation'],
                              row['detected_by_microlensing'],
                              row['detected_by_eclipse_timing_variations'],
                              row['detected_by_imaging'],
                              row['detected_by_disk_kinematics'],
                              row['is_controversial'],
                              planet_identifiers_id
                            
                              ))
    conn.commit()

In [130]:
# Practice query
with engine.connect() as connection:  
    practice_query = text("""SELECT discovery_locale, COUNT(*) as count
                                 FROM exoplanets.discovery_info
                                 GROUP BY discovery_locale
                                 ORDER BY count DESC
                                 LIMIT 10;
                                 """) 
    practice_query = pd.read_sql(practice_query, connection) 
    
practice_query

# Most discoveries are from space telescopes. This makes sense because Earth's atmosphere makes it difficult to see
# dim objects like exoplanets.

Unnamed: 0,discovery_locale,count
0,space,3961
1,ground,1794
2,multiple locales,33


In [132]:
locale_count = exoplanets.groupby('discovery_locale').size().reset_index(name='count')

# Sort the DataFrame by 'count' in descending order and take the top 4
locale_count = locale_count.sort_values(by='count', ascending=False).head(4)

print(locale_count)
# Info matches!

   discovery_locale  count
2             space   3961
0            ground   1794
1  multiple locales     33


The columns being imported to the fourth table (planet_properties) will be: <br>
planet_radius_earth_radius <br>
planet_radius_upper_unc_earth_radius <br>
planet_radius_lower_unc_earth_radius <br>
planet_radius_earth_limit_flag <br>
planet_radius_jupiter_radius <br>
planet_radius_upper_unc_jupiter_radius <br>
planet_radius_lower_unc_jupiter_radius <br>
planet_radius_jupiter_limit_flag <br>
planet_mass_earth_mass <br>
planet_mass_upper_unc_earth_mass <br>
planet_mass_lower_unc_earth_mass <br>
planet_mass_earth_limit_flag <br>
planet_mass_jupiter_mass <br>
planet_mass_upper_unc_jupiter_mass <br>
planet_mass_lower_unc_jupiter_mass <br>
planet_mass_jupiter_limit_flag <br>
planet_mass_provenance <br>
planet_density_gcm3 <br>
planet_density_upper_unc_gcm3 <br>
planet_density_lower_unc_gcm3 <br>
planet_density_limit_flag <br>
ratio_planet_to_stellar_radius <br>
ratio_planet_to_stellar_radius_upper_unc <br>
ratio_planet_to_stellar_radius_lower_unc <br>
ratio_planet_to_stellar_radius_limit_flag



In [134]:
fourth_table_query = """CREATE TABLE IF NOT EXISTS planet_properties (
    properties_id INT AUTO_INCREMENT PRIMARY KEY,
    planet_radius_earth_radius FLOAT,
    planet_radius_upper_unc_earth_radius FLOAT,
    planet_radius_lower_unc_earth_radius FLOAT,
    planet_radius_earth_limit_flag BOOLEAN,
    planet_radius_jupiter_radius FLOAT,
    planet_radius_upper_unc_jupiter_radius FLOAT,
    planet_radius_lower_unc_jupiter_radius FLOAT,
    planet_radius_jupiter_limit_flag BOOLEAN,
    planet_mass_earth_mass FLOAT,
    planet_mass_upper_unc_earth_mass FLOAT,
    planet_mass_lower_unc_earth_mass FLOAT,
    planet_mass_earth_limit_flag BOOLEAN,
    planet_mass_jupiter_mass FLOAT,
    planet_mass_upper_unc_jupiter_mass FLOAT,
    planet_mass_lower_unc_jupiter_mass FLOAT,
    planet_mass_jupiter_limit_flag BOOLEAN,
    planet_mass_provenance VARCHAR(50),
    planet_density_gcm3 FLOAT,
    planet_density_upper_unc_gcm3 FLOAT,
    planet_density_lower_unc_gcm3 FLOAT,
    planet_density_limit_flag BOOLEAN,
    ratio_planet_to_stellar_radius FLOAT,
    ratio_planet_to_stellar_radius_upper_unc FLOAT,
    ratio_planet_to_stellar_radius_lower_unc FLOAT,
    ratio_planet_to_stellar_radius_limit_flag BOOLEAN,
    planet_id INT, 
    FOREIGN KEY (planet_id) REFERENCES planet_identifiers(id)
);
"""

# Create a query for the table. Create a table with autoincrementing primary ID and link foreign id from planet_identifiers table. 

In [136]:
# Create the table using above query
with engine.connect() as connection:
    connection.execute(text(fourth_table_query))
    print("Fourth table created successfully!")

Fourth table created successfully!


In [138]:
with engine.connect() as connection:
    # Make sure MySQL is using the correct database
    cursor.execute("USE exoplanets;")

    for _, row in exoplanets.iterrows():
        # Find the id in planet_identifiers that matches this row's planet_name
        cursor.execute("SELECT id FROM planet_identifiers WHERE planet_name = %s", (row['planet_name'],))
        result = cursor.fetchone() # Fetch this id number
        planet_identifiers_id = result[0]
        
        cursor.execute("""
            INSERT INTO planet_properties (
                planet_radius_earth_radius,
                planet_radius_upper_unc_earth_radius,
                planet_radius_lower_unc_earth_radius,
                planet_radius_earth_limit_flag,
                planet_radius_jupiter_radius,
                planet_radius_upper_unc_jupiter_radius,
                planet_radius_lower_unc_jupiter_radius,
                planet_radius_jupiter_limit_flag,
                planet_mass_earth_mass,
                planet_mass_upper_unc_earth_mass,
                planet_mass_lower_unc_earth_mass,
                planet_mass_earth_limit_flag,
                planet_mass_jupiter_mass,
                planet_mass_upper_unc_jupiter_mass,
                planet_mass_lower_unc_jupiter_mass,
                planet_mass_jupiter_limit_flag,
                planet_mass_provenance,
                planet_density_gcm3,
                planet_density_upper_unc_gcm3,
                planet_density_lower_unc_gcm3,
                planet_density_limit_flag,
                ratio_planet_to_stellar_radius,
                ratio_planet_to_stellar_radius_upper_unc,
                ratio_planet_to_stellar_radius_lower_unc,
                ratio_planet_to_stellar_radius_limit_flag,
                planet_id
            ) VALUES (
                %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s
            )
        """, (
            replace_na_with_null(row['planet_radius_earth_radius']),
            replace_na_with_null(row['planet_radius_upper_unc_earth_radius']),
            replace_na_with_null(row['planet_radius_lower_unc_earth_radius']),
            replace_na_with_null(row['planet_radius_earth_limit_flag']),
            replace_na_with_null(row['planet_radius_jupiter_radius']),
            replace_na_with_null(row['planet_radius_upper_unc_jupiter_radius']),
            replace_na_with_null(row['planet_radius_lower_unc_jupiter_radius']),
            replace_na_with_null(row['planet_radius_jupiter_limit_flag']),
            replace_na_with_null(row['planet_mass_earth_mass']),
            replace_na_with_null(row['planet_mass_upper_unc_earth_mass']),
            replace_na_with_null(row['planet_mass_lower_unc_earth_mass']),
            replace_na_with_null(row['planet_mass_earth_limit_flag']),
            replace_na_with_null(row['planet_mass_jupiter_mass']),
            replace_na_with_null(row['planet_mass_upper_unc_jupiter_mass']),
            replace_na_with_null(row['planet_mass_lower_unc_jupiter_mass']),
            replace_na_with_null(row['planet_mass_jupiter_limit_flag']),
            replace_na_with_null(row['planet_mass_provenance']),
            replace_na_with_null(row['planet_density_gcm3']),
            replace_na_with_null(row['planet_density_upper_unc_gcm3']),
            replace_na_with_null(row['planet_density_lower_unc_gcm3']),
            replace_na_with_null(row['planet_density_limit_flag']),
            replace_na_with_null(row['ratio_planet_to_stellar_radius']),
            replace_na_with_null(row['ratio_planet_to_stellar_radius_upper_unc']),
            replace_na_with_null(row['ratio_planet_to_stellar_radius_lower_unc']),
            replace_na_with_null(row['ratio_planet_to_stellar_radius_limit_flag']),
            planet_identifiers_id
        ))

    conn.commit()

In [140]:
# Practice query to make sure data wasn't doubled 
with engine.connect() as connection:  
    practice_query = text("""SELECT COUNT(*) as row_count
                              FROM exoplanets.planet_properties;""")  
    practice_query = pd.read_sql(practice_query, connection) 
    
# Print the results
practice_query

Unnamed: 0,row_count
0,5788


In [142]:
len(exoplanets)
# Info matches!

5788
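
The row-count check above detects doubled data after the fact. One way to prevent duplicates when a cell is re-run is a UNIQUE constraint combined with an insert that ignores conflicts. This is a sketch only: the UNIQUE constraint on `planet_id` is an assumption that is not part of the schema in this notebook, the rows are hypothetical, and the example uses in-memory SQLite, where the statement is `INSERT OR IGNORE` (MySQL spells it `INSERT IGNORE`).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE planet_properties (
    properties_id INTEGER PRIMARY KEY AUTOINCREMENT,
    planet_radius_earth_radius REAL,
    planet_id INTEGER UNIQUE  -- assumption: one properties row per planet
)""")

# Hypothetical (radius, planet_id) pairs
rows = [(1.5, 1), (2.0, 2)]

def load(connection, data):
    # OR IGNORE skips any row that would violate the UNIQUE constraint
    connection.executemany(
        "INSERT OR IGNORE INTO planet_properties "
        "(planet_radius_earth_radius, planet_id) VALUES (?, ?)",
        data)
    connection.commit()

load(conn, rows)
load(conn, rows)  # simulate running the import cell a second time

row_count = conn.execute("SELECT COUNT(*) FROM planet_properties").fetchone()[0]
print(row_count)
```

The second call adds nothing, so the table still holds one row per planet.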

The columns being imported to the fifth table (orbital_info) will be: <br>
orbital_period_days <br>
orbital_period_upper_unc_days <br>
orbital_period_lower_unc_days <br>
orbital_period_limit_flag <br>
orbit_semi_major_axis_au <br>
orbit_semi_major_axis_upper_unc_au <br>
orbit_semi_major_axis_lower_unc_au <br>
orbit_semi_major_axis_limit_flag <br>
angular_separation_mas <br>
angular_separation_limit_flag <br>
eccentricity <br>
eccentricity_upper_unc <br>
eccentricity_lower_unc <br>
eccentricity_limit_flag <br>
inclination_deg <br>
inclination_upper_unc_deg <br>
inclination_lower_unc_deg <br>
inclination_limit_flag



In [172]:
fifth_table_query = """CREATE TABLE IF NOT EXISTS orbital_info (
    orbital_id INT AUTO_INCREMENT PRIMARY KEY,
    orbital_period_days FLOAT,
    orbital_period_upper_unc_days FLOAT,
    orbital_period_lower_unc_days FLOAT, 
    orbital_period_limit_flag BOOLEAN, 
    orbit_semi_major_axis_au FLOAT,
    orbit_semi_major_axis_upper_unc_au FLOAT,
    orbit_semi_major_axis_lower_unc_au FLOAT, 
    orbit_semi_major_axis_limit_flag BOOLEAN,
    angular_separation_mas FLOAT, 
    angular_separation_limit_flag BOOLEAN, 
    eccentricity FLOAT,
    eccentricity_upper_unc FLOAT, 
    eccentricity_lower_unc FLOAT,
    eccentricity_limit_flag BOOLEAN,
    inclination_deg FLOAT,
    inclination_upper_unc_deg FLOAT,
    inclination_lower_unc_deg FLOAT,
    inclination_limit_flag BOOLEAN,
    properties_id INT, 
    FOREIGN KEY (properties_id) REFERENCES planet_properties(properties_id)
);
"""
# Create a query for the table. Create a table with autoincrementing primary ID and link foreign id from planet_properties table. 

In [174]:
# Create the table using above query
with engine.connect() as connection:
    connection.execute(text(fifth_table_query))
    print("Fifth table created successfully!")

Fifth table created successfully!


In [176]:
# All below data imports use the same, if not very similar, method. I will comment fully here to avoid repetition. 
with engine.connect() as connection:
    # Make sure MySQL is using the correct database
    cursor.execute("USE exoplanets;")

    for _, row in exoplanets.iterrows():
        # Get the planet_id from planet_identifiers using planet_name
        cursor.execute("SELECT id FROM planet_identifiers WHERE planet_name = %s", (row['planet_name'],))
        result = cursor.fetchone()

        if result:
            planet_id = result[0]  # Get the planet_id 
        
            # Get the properties_id from planet_properties using the planet_id we just got from above
            cursor.execute("SELECT properties_id FROM planet_properties WHERE planet_id = %s", (planet_id,))
            result_properties = cursor.fetchone() 
            
            if result_properties:
                properties_id = result_properties[0] # Get the properties_id, which links orbital_info directly to planet_properties
                # Continue as a normal import
                cursor.execute(""" 
                    INSERT INTO orbital_info (
                        orbital_period_days, orbital_period_upper_unc_days, orbital_period_lower_unc_days, 
                        orbital_period_limit_flag, orbit_semi_major_axis_au, orbit_semi_major_axis_upper_unc_au, 
                        orbit_semi_major_axis_lower_unc_au, orbit_semi_major_axis_limit_flag, angular_separation_mas, 
                        angular_separation_limit_flag, eccentricity, eccentricity_upper_unc, eccentricity_lower_unc, 
                        eccentricity_limit_flag, inclination_deg, inclination_upper_unc_deg, inclination_lower_unc_deg, 
                        inclination_limit_flag, properties_id)
                    VALUES (%s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s)
                """, (
                    replace_na_with_null(row['orbital_period_days']),
                    replace_na_with_null(row['orbital_period_upper_unc_days']),
                    replace_na_with_null(row['orbital_period_lower_unc_days']),
                    replace_na_with_null(row['orbital_period_limit_flag']),
                    replace_na_with_null(row['orbit_semi_major_axis_au']),
                    replace_na_with_null(row['orbit_semi_major_axis_upper_unc_au']),
                    replace_na_with_null(row['orbit_semi_major_axis_lower_unc_au']),
                    replace_na_with_null(row['orbit_semi_major_axis_limit_flag']),
                    replace_na_with_null(row['angular_separation_mas']),
                    replace_na_with_null(row['angular_separation_limit_flag']),
                    replace_na_with_null(row['eccentricity']),
                    replace_na_with_null(row['eccentricity_upper_unc']),
                    replace_na_with_null(row['eccentricity_lower_unc']),
                    replace_na_with_null(row['eccentricity_limit_flag']),
                    replace_na_with_null(row['inclination_deg']),
                    replace_na_with_null(row['inclination_upper_unc_deg']),
                    replace_na_with_null(row['inclination_lower_unc_deg']),
                    replace_na_with_null(row['inclination_limit_flag']),
                    properties_id  
                ))

    # Commit all the changes
    conn.commit()

In [178]:
with engine.connect() as connection:  
    practice_query = text("""SELECT COUNT(*) as row_count
                              FROM exoplanets.orbital_info;""")  
    practice_query = pd.read_sql(practice_query, connection) 
    
practice_query

Unnamed: 0,row_count
0,5788


In [182]:
len(exoplanets)
# Info matches!

5788
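
Because orbital_info points to planet_properties, which in turn points to planet_identifiers, a planet's name and its orbital data can be read back together by joining along the foreign keys. The sketch below illustrates the idea with an in-memory SQLite database, reduced versions of the three tables, and hypothetical rows. The aliases `ident`, `props`, and `orb` are chosen here for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE planet_identifiers (id INTEGER PRIMARY KEY, planet_name TEXT);
CREATE TABLE planet_properties (
    properties_id INTEGER PRIMARY KEY,
    planet_radius_earth_radius REAL,
    planet_id INTEGER REFERENCES planet_identifiers(id));
CREATE TABLE orbital_info (
    orbital_id INTEGER PRIMARY KEY,
    orbital_period_days REAL,
    properties_id INTEGER REFERENCES planet_properties(properties_id));
-- Hypothetical rows, not taken from the dataset
INSERT INTO planet_identifiers VALUES (1, 'Planet A'), (2, 'Planet B');
INSERT INTO planet_properties VALUES (10, 1.5, 1), (11, NULL, 2);
INSERT INTO orbital_info VALUES (100, 3.25, 10), (101, 40.0, 11);
""")

# Follow the chain planet_identifiers -> planet_properties -> orbital_info
join_query = """
SELECT ident.planet_name, props.planet_radius_earth_radius, orb.orbital_period_days
FROM planet_identifiers AS ident
JOIN planet_properties AS props ON props.planet_id = ident.id
JOIN orbital_info AS orb ON orb.properties_id = props.properties_id
ORDER BY ident.planet_name;
"""
joined = conn.execute(join_query).fetchall()
print(joined)
```

A NULL radius comes back as `None`, which mirrors how the missing values inserted through `replace_na_with_null` would appear when queried.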

The columns being imported to the sixth table (insolation_and_temp) will be: <br>
insolation_flux_earth_flux <br>
insolation_flux_upper_unc_earth_flux <br>
insolation_flux_lower_unc_earth_flux <br>
insolation_flux_limit_flag <br>
equilibrium_temperature_k <br>
equilibrium_temperature_upper_unc_k <br>
equilibrium_temperature_lower_unc_k <br>
equilibrium_temperature_limit_flag


In [184]:
sixth_table_query = """CREATE TABLE IF NOT EXISTS insolation_and_temp (
    ins_id INT AUTO_INCREMENT PRIMARY KEY,
    insolation_flux_earth_flux FLOAT, 
    insolation_flux_upper_unc_earth_flux FLOAT, 
    insolation_flux_lower_unc_earth_flux FLOAT, 
    insolation_flux_limit_flag BOOLEAN,
    equilibrium_temperature_k FLOAT,
    equilibrium_temperature_upper_unc_k FLOAT,
    equilibrium_temperature_lower_unc_k FLOAT,
    equilibrium_temperature_limit_flag BOOLEAN,
    properties_id INT, 
    FOREIGN KEY (properties_id) REFERENCES planet_properties(properties_id)
);
"""
# Create a query for the table. Create a table with autoincrementing primary ID and link foreign id from planet_properties table. 

In [186]:
# Create the table using above query
with engine.connect() as connection:
    connection.execute(text(sixth_table_query))
    print("Sixth table created successfully!")

Sixth table created successfully!


In [190]:
with engine.connect() as connection:
    cursor.execute("USE exoplanets;")

    for _, row in exoplanets.iterrows():
        cursor.execute("SELECT id FROM planet_identifiers WHERE planet_name = %s", (row['planet_name'],))
        result = cursor.fetchone()

        if result:
            planet_id = result[0] 
        
            cursor.execute("SELECT properties_id FROM planet_properties WHERE planet_id = %s", (planet_id,))
            result_properties = cursor.fetchone()
            
            if result_properties:
                properties_id = result_properties[0]
                
                # Insert only when a matching properties_id was found, so an id from an earlier row is never reused
                cursor.execute("""
                    INSERT INTO insolation_and_temp (
                        insolation_flux_earth_flux,
                        insolation_flux_upper_unc_earth_flux,
                        insolation_flux_lower_unc_earth_flux,
                        insolation_flux_limit_flag,
                        equilibrium_temperature_k,
                        equilibrium_temperature_upper_unc_k,
                        equilibrium_temperature_lower_unc_k,
                        equilibrium_temperature_limit_flag,
                        properties_id
                    ) VALUES (
                        %s, %s, %s, %s, %s, %s, %s, %s, %s
                    )
                """, (
                    replace_na_with_null(row['insolation_flux_earth_flux']),
                    replace_na_with_null(row['insolation_flux_upper_unc_earth_flux']),
                    replace_na_with_null(row['insolation_flux_lower_unc_earth_flux']),
                    replace_na_with_null(row['insolation_flux_limit_flag']),
                    replace_na_with_null(row['equilibrium_temperature_k']),
                    replace_na_with_null(row['equilibrium_temperature_upper_unc_k']),
                    replace_na_with_null(row['equilibrium_temperature_lower_unc_k']),
                    replace_na_with_null(row['equilibrium_temperature_limit_flag']),
                    properties_id
                ))

    conn.commit()


The columns being imported to the seventh table (transit_info) will be: <br>
transit_midpoint_days <br>
transit_midpoint_upper_unc_days <br>
transit_midpoint_lower_unc_days <br>
transit_midpoint_limit_flag <br>
transit_midpoint_time_system <br>
data_show_transit_timing_variations <br>
impact_parameter <br>
impact_parameter_upper_unc <br>
impact_parameter_lower_unc <br>
impact_parameter_limit_flag <br>
transit_depth_percent <br>
transit_depth_upper_unc_percent <br>
transit_depth_lower_unc_percent <br>
transit_depth_limit_flag <br>
transit_duration_hours <br>
transit_duration_upper_unc_hours <br>
transit_duration_lower_unc_hours <br>
transit_duration_limit_flag

In [198]:
seventh_table_query = """CREATE TABLE IF NOT EXISTS transit_info (
    transit_id INT AUTO_INCREMENT PRIMARY KEY,
    transit_midpoint_days FLOAT, 
    transit_midpoint_upper_unc_days FLOAT,
    transit_midpoint_lower_unc_days FLOAT,
    transit_midpoint_limit_flag BOOLEAN,
    transit_midpoint_time_system VARCHAR(25),
    data_show_transit_timing_variations BOOLEAN,
    impact_parameter FLOAT,
    impact_parameter_upper_unc FLOAT,
    impact_parameter_lower_unc FLOAT,
    impact_parameter_limit_flag BOOLEAN,
    transit_depth_percent FLOAT,
    transit_depth_upper_unc_percent FLOAT,
    transit_depth_lower_unc_percent FLOAT,
    transit_depth_limit_flag BOOLEAN,
    transit_duration_hours FLOAT,
    transit_duration_upper_unc_hours FLOAT,
    transit_duration_lower_unc_hours FLOAT,
    transit_duration_limit_flag BOOLEAN,
    orbital_id INT,
    FOREIGN KEY (orbital_id) REFERENCES orbital_info(orbital_id)
);
"""
# Create a query for the table. Create a table with autoincrementing primary ID and link foreign id from orbital_info table. 

In [200]:
# Create the table using above query
with engine.connect() as connection:
    connection.execute(text(seventh_table_query))
    print("Seventh table created successfully!")

Seventh table created successfully!


In [206]:
with engine.connect() as connection:
    cursor.execute("USE exoplanets;")

    for _, row in exoplanets.iterrows():
        cursor.execute("SELECT id FROM planet_identifiers WHERE planet_name = %s", (row['planet_name'],))
        result = cursor.fetchone()

        if result:
            planet_id = result[0]  
        
            cursor.execute("SELECT properties_id FROM planet_properties WHERE planet_id = %s", (planet_id,))
            result_properties = cursor.fetchone()
            
            if result_properties:
                properties_id = result_properties[0]
                
                # Get the orbital_id from orbital_info using properties_id
                cursor.execute("SELECT orbital_id FROM orbital_info WHERE properties_id = %s", (properties_id,))
                result_orbital = cursor.fetchone()

                if result_orbital:
                    orbital_id = result_orbital[0]
                    
                    cursor.execute("""
                        INSERT INTO transit_info(
                            transit_midpoint_days,
                            transit_midpoint_upper_unc_days,
                            transit_midpoint_lower_unc_days,
                            transit_midpoint_limit_flag,
                            transit_midpoint_time_system,
                            data_show_transit_timing_variations,
                            impact_parameter,
                            impact_parameter_upper_unc,
                            impact_parameter_lower_unc,
                            impact_parameter_limit_flag,
                            transit_depth_percent,
                            transit_depth_upper_unc_percent,
                            transit_depth_lower_unc_percent,
                            transit_depth_limit_flag,
                            transit_duration_hours,
                            transit_duration_upper_unc_hours,
                            transit_duration_lower_unc_hours,
                            transit_duration_limit_flag,
                            orbital_id
                        ) VALUES (
                            %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s
                        )
                    """, (
                        replace_na_with_null(row['transit_midpoint_days']),
                        replace_na_with_null(row['transit_midpoint_upper_unc_days']),
                        replace_na_with_null(row['transit_midpoint_lower_unc_days']),
                        replace_na_with_null(row['transit_midpoint_limit_flag']),
                        replace_na_with_null(row['transit_midpoint_time_system']),
                        replace_na_with_null(row['data_show_transit_timing_variations']),
                        replace_na_with_null(row['impact_parameter']),
                        replace_na_with_null(row['impact_parameter_upper_unc']),
                        replace_na_with_null(row['impact_parameter_lower_unc']),
                        replace_na_with_null(row['impact_parameter_limit_flag']),
                        replace_na_with_null(row['transit_depth_percent']),
                        replace_na_with_null(row['transit_depth_upper_unc_percent']),
                        replace_na_with_null(row['transit_depth_lower_unc_percent']),
                        replace_na_with_null(row['transit_depth_limit_flag']),
                        replace_na_with_null(row['transit_duration_hours']),
                        replace_na_with_null(row['transit_duration_upper_unc_hours']),
                        replace_na_with_null(row['transit_duration_lower_unc_hours']),
                        replace_na_with_null(row['transit_duration_limit_flag']),
                        orbital_id 
                    ))

    conn.commit()


The columns being imported to the eighth table (orbital_ratios) will be:<br>
ratio_semi_major_axis_to_stellar_radius<br>ratio_semi_major_axis_to_stellar_radius_upper_unc<br>ratio_semi_major_axis_to_stellar_radius_lower_unc<br>ratio_semi_major_axis_to_stellar_radius_limit_flag


In [208]:
exoplanets['ratio_semi_major_axis_to_stellar_radius']

0         NaN
1         NaN
2         NaN
3        5.30
4         NaN
        ...  
5783    40.81
5784     7.77
5785    16.59
5786    24.63
5787    27.93
Name: ratio_semi_major_axis_to_stellar_radius, Length: 5788, dtype: float64
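
The `NaN` entries in the output above are what the insert cells pass through `replace_na_with_null`, which is defined earlier in this notebook. As a rough illustration of the idea only, the sketch below uses a hypothetical stand-in named `nan_to_none` (not the notebook's own helper) that maps a floating-point `NaN` to `None`, which DB-API drivers such as `mysql.connector` and `sqlite3` send to the database as SQL `NULL`. The sample values are invented.

```python
import math

def nan_to_none(value):
    """Hypothetical stand-in: return None for a float NaN, otherwise the value unchanged."""
    if isinstance(value, float) and math.isnan(value):
        return None
    return value

# Invented sample mixing missing values, numbers, and a string
sample = [float("nan"), 5.30, "BJD-TDB", float("nan"), 40.81]
cleaned = [nan_to_none(v) for v in sample]
print(cleaned)
```

The `isinstance` check keeps strings and booleans from being handed to `math.isnan`, which only accepts numbers.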

In [413]:
eighth_table_query = """CREATE TABLE IF NOT EXISTS orbital_ratios (
    ratio_id INT AUTO_INCREMENT PRIMARY KEY,
    ratio_semi_major_axis_to_stellar_radius FLOAT,
    ratio_semi_major_axis_to_stellar_radius_upper_unc FLOAT,
    ratio_semi_major_axis_to_stellar_radius_lower_unc FLOAT,
    ratio_semi_major_axis_to_stellar_radius_limit_flag BOOLEAN,
    orbital_id INT,
    FOREIGN KEY (orbital_id) REFERENCES orbital_info(orbital_id)
);
"""
# Define the CREATE TABLE statement: an auto-incrementing primary key (ratio_id) and a foreign key to orbital_info(orbital_id).

In [415]:
# Create the table using above query
with engine.connect() as connection:
    connection.execute(text(eighth_table_query))
    print("Eighth table created successfully!")

Eighth table created successfully!


In [417]:
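# Populate orbital_ratios: resolve each planet's orbital_id, then insert its ratio values.
# Note: the statements below run on the mysql.connector cursor/conn created earlier;
# the SQLAlchemy connection opened here is not used inside the block.
with engine.connect() as connection:
    cursor.execute("USE exoplanets;")  # Select the target database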

    for _, row in exoplanets.iterrows():
        cursor.execute("SELECT id FROM planet_identifiers WHERE planet_name = %s", (row['planet_name'],))
        result = cursor.fetchone()

        if result:
            planet_id = result[0]  # Fetch the planet_id if the result is correct
        
            cursor.execute("SELECT properties_id FROM planet_properties WHERE planet_id = %s", (planet_id,))
            result_properties = cursor.fetchone()
            
            if result_properties:
                properties_id = result_properties[0]
                
                cursor.execute("SELECT orbital_id FROM orbital_info WHERE properties_id = %s", (properties_id,))
                result_orbital = cursor.fetchone()

                if result_orbital:
                    orbital_id = result_orbital[0]
                    
                    cursor.execute("""
                        INSERT INTO orbital_ratios(
                            ratio_semi_major_axis_to_stellar_radius,
                            ratio_semi_major_axis_to_stellar_radius_upper_unc,
                            ratio_semi_major_axis_to_stellar_radius_lower_unc,
                            ratio_semi_major_axis_to_stellar_radius_limit_flag,
                            orbital_id
                        ) VALUES (
                            %s, %s, %s, %s, %s
                        )
                    """, (
                        replace_na_with_null(row['ratio_semi_major_axis_to_stellar_radius']),
                        replace_na_with_null(row['ratio_semi_major_axis_to_stellar_radius_upper_unc']),
                        replace_na_with_null(row['ratio_semi_major_axis_to_stellar_radius_lower_unc']),
                        replace_na_with_null(row['ratio_semi_major_axis_to_stellar_radius_limit_flag']),
                        orbital_id  
                    ))

    conn.commit()
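
The cell above resolves `orbital_id` with three separate `SELECT` statements per row. The same lookup chain can also be written as one `JOIN`. The sketch below is only an illustration: it builds tiny stand-in tables in an in-memory SQLite database (so it runs without the MySQL server), keeps only the key columns that appear in the queries above, and uses a made-up planet name. SQLite uses `?` placeholders where `mysql.connector` uses `%s`.

```python
import sqlite3

# In-memory stand-in for the three tables in the lookup chain (invented data)
conn_demo = sqlite3.connect(":memory:")
cur = conn_demo.cursor()
cur.executescript("""
    CREATE TABLE planet_identifiers (id INTEGER PRIMARY KEY, planet_name TEXT);
    CREATE TABLE planet_properties (properties_id INTEGER PRIMARY KEY, planet_id INTEGER);
    CREATE TABLE orbital_info (orbital_id INTEGER PRIMARY KEY, properties_id INTEGER);
    INSERT INTO planet_identifiers VALUES (1, 'Demo b');
    INSERT INTO planet_properties VALUES (10, 1);
    INSERT INTO orbital_info VALUES (100, 10);
""")

# One query that walks planet_name -> planet id -> properties_id -> orbital_id
lookup_query = """
    SELECT oi.orbital_id
    FROM planet_identifiers AS pi
    JOIN planet_properties AS pp ON pp.planet_id = pi.id
    JOIN orbital_info AS oi ON oi.properties_id = pp.properties_id
    WHERE pi.planet_name = ?
"""
found = cur.execute(lookup_query, ("Demo b",)).fetchone()
missing = cur.execute(lookup_query, ("No such planet",)).fetchone()
print(found, missing)
conn_demo.close()
```

As with the nested `if` checks, a planet with no match at any step simply returns no row, so `fetchone()` gives `None`.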

The columns being imported to the ninth table (periastron_info) will be:<br> 
 epoch_of_periastron_days<br>epoch_of_periastron_upper_unc_days<br>epoch_of_periastron_lower_unc_days<br>epoch_of_periastron_limit_flag<br>epoch_of_periastron_time_system<br>argument_of_periastron_deg<br>argument_of_periastron_upper_unc_deg<br>argument_of_periastron_lower_unc_deg<br>argument_of_periastron_limit_flag


In [419]:
ninth_table_query = """CREATE TABLE IF NOT EXISTS periastron_info (
    peri_id INT AUTO_INCREMENT PRIMARY KEY,
    epoch_of_periastron_days FLOAT,
    epoch_of_periastron_upper_unc_days FLOAT,
    epoch_of_periastron_lower_unc_days FLOAT,
    epoch_of_periastron_limit_flag BOOLEAN,
    epoch_of_periastron_time_system VARCHAR(25),
    argument_of_periastron_deg FLOAT,
    argument_of_periastron_upper_unc_deg FLOAT,
    argument_of_periastron_lower_unc_deg FLOAT,
    argument_of_periastron_limit_flag BOOLEAN,
    orbital_id INT,
    FOREIGN KEY (orbital_id) REFERENCES orbital_info(orbital_id)
);
"""
# Define the CREATE TABLE statement: an auto-incrementing primary key (peri_id) and a foreign key to orbital_info(orbital_id).

In [421]:
# Create the table using above query
with engine.connect() as connection:
    connection.execute(text(ninth_table_query))
    print("Ninth table created successfully!")

Ninth table created successfully!


In [423]:
# Populate periastron_info: resolve each planet's orbital_id, then insert its periastron values
with engine.connect() as connection:
    cursor.execute("USE exoplanets;")

    for _, row in exoplanets.iterrows():
        cursor.execute("SELECT id FROM planet_identifiers WHERE planet_name = %s", (row['planet_name'],))
        result = cursor.fetchone()

        if result:
            planet_id = result[0]  # Fetch the planet_id if the result is correct
        
            cursor.execute("SELECT properties_id FROM planet_properties WHERE planet_id = %s", (planet_id,))
            result_properties = cursor.fetchone()
            
            if result_properties:
                properties_id = result_properties[0]
                
                cursor.execute("SELECT orbital_id FROM orbital_info WHERE properties_id = %s", (properties_id,))
                result_orbital = cursor.fetchone()

                if result_orbital:
                    orbital_id = result_orbital[0]
                    
                    cursor.execute("""
                        INSERT INTO periastron_info(
                            epoch_of_periastron_days,
                            epoch_of_periastron_upper_unc_days,
                            epoch_of_periastron_lower_unc_days,
                            epoch_of_periastron_limit_flag,
                            epoch_of_periastron_time_system,
                            argument_of_periastron_deg,
                            argument_of_periastron_upper_unc_deg,
                            argument_of_periastron_lower_unc_deg,
                            argument_of_periastron_limit_flag,
                            orbital_id
                        ) VALUES (
                            %s, %s, %s, %s, %s, %s, %s, %s, %s, %s
                        )
                    """, (
                        replace_na_with_null(row['epoch_of_periastron_days']),
                        replace_na_with_null(row['epoch_of_periastron_upper_unc_days']),
                        replace_na_with_null(row['epoch_of_periastron_lower_unc_days']),
                        replace_na_with_null(row['epoch_of_periastron_limit_flag']),
                        replace_na_with_null(row['epoch_of_periastron_time_system']),
                        replace_na_with_null(row['argument_of_periastron_deg']),
                        replace_na_with_null(row['argument_of_periastron_upper_unc_deg']),
                        replace_na_with_null(row['argument_of_periastron_lower_unc_deg']),
                        replace_na_with_null(row['argument_of_periastron_limit_flag']),
                        orbital_id  
                    ))

    conn.commit()
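
Each loop above issues one `INSERT` per DataFrame row. DB-API cursors, including those from `mysql.connector` and `sqlite3`, also provide `executemany`, which takes a list of parameter tuples and runs the statement once per tuple. A minimal sketch, again against an in-memory SQLite table, with a shortened hypothetical column set (`periastron_demo`) and invented values:

```python
import sqlite3

demo = sqlite3.connect(":memory:")
demo.execute("""
    CREATE TABLE periastron_demo (
        peri_id INTEGER PRIMARY KEY AUTOINCREMENT,
        argument_of_periastron_deg REAL,
        orbital_id INTEGER
    )
""")

# (argument_of_periastron_deg, orbital_id) pairs; None is stored as NULL
rows_to_insert = [(94.8, 1), (None, 2), (12.0, 3)]

# One call inserts every tuple in the list
demo.executemany(
    "INSERT INTO periastron_demo (argument_of_periastron_deg, orbital_id) VALUES (?, ?)",
    rows_to_insert,
)
demo.commit()

inserted = demo.execute("SELECT COUNT(*) FROM periastron_demo").fetchone()[0]
null_count = demo.execute(
    "SELECT COUNT(*) FROM periastron_demo WHERE argument_of_periastron_deg IS NULL"
).fetchone()[0]
print(inserted, null_count)
demo.close()
```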

The columns being imported to the tenth table (radial_velocity_and_obliquity) will be: <br> 
 radial_velocity_amplitude_ms<br>radial_velocity_amplitude_upper_unc_ms<br>radial_velocity_amplitude_lower_unc_ms<br>radial_velocity_amplitude_limit_flag<br> projected_obliquity_deg<br> projected_obliquity_upper_unc_deg<br> projected_obliquity_lower_unc_deg<br> projected_obliquity_limit_flag<br>true_obliquity_deg<br>true_obliquity_upper_unc_deg<br>true_obliquity_lower_unc_deg<br>true_obliquity_limit_flag


In [425]:
tenth_table_query = """CREATE TABLE IF NOT EXISTS radial_velocity_and_obliquity (
    rad_id INT AUTO_INCREMENT PRIMARY KEY,
    radial_velocity_amplitude_ms FLOAT, 
    radial_velocity_amplitude_upper_unc_ms FLOAT,
    radial_velocity_amplitude_lower_unc_ms FLOAT,
    radial_velocity_amplitude_limit_flag BOOLEAN,
    projected_obliquity_deg FLOAT,
    projected_obliquity_upper_unc_deg FLOAT,
    projected_obliquity_lower_unc_deg FLOAT,
    projected_obliquity_limit_flag BOOLEAN,
    true_obliquity_deg FLOAT,
    true_obliquity_upper_unc_deg FLOAT,
    true_obliquity_lower_unc_deg FLOAT,
    true_obliquity_limit_flag BOOLEAN,
    orbital_id INT,
    FOREIGN KEY (orbital_id) REFERENCES orbital_info(orbital_id)
);
"""
# Define the CREATE TABLE statement: an auto-incrementing primary key (rad_id) and a foreign key to orbital_info(orbital_id).

In [427]:
# Create the table using above query
with engine.connect() as connection:
    connection.execute(text(tenth_table_query))
    print("Tenth table created successfully!")

Tenth table created successfully!


In [429]:
# Populate radial_velocity_and_obliquity: resolve each planet's orbital_id, then insert its values
with engine.connect() as connection:
    cursor.execute("USE exoplanets;")

    for _, row in exoplanets.iterrows():
        cursor.execute("SELECT id FROM planet_identifiers WHERE planet_name = %s", (row['planet_name'],))
        result = cursor.fetchone()

        if result:
            planet_id = result[0]  # Fetch the planet_id if the result is correct
        
            cursor.execute("SELECT properties_id FROM planet_properties WHERE planet_id = %s", (planet_id,))
            result_properties = cursor.fetchone()
            
            if result_properties:
                properties_id = result_properties[0]
                
                cursor.execute("SELECT orbital_id FROM orbital_info WHERE properties_id = %s", (properties_id,))
                result_orbital = cursor.fetchone()

                if result_orbital:
                    orbital_id = result_orbital[0]
                    
                    cursor.execute("""
                        INSERT INTO radial_velocity_and_obliquity(
                            radial_velocity_amplitude_ms,
                            radial_velocity_amplitude_upper_unc_ms,
                            radial_velocity_amplitude_lower_unc_ms,
                            radial_velocity_amplitude_limit_flag,
                            projected_obliquity_deg,
                            projected_obliquity_upper_unc_deg,
                            projected_obliquity_lower_unc_deg,
                            projected_obliquity_limit_flag,
                            true_obliquity_deg,
                            true_obliquity_upper_unc_deg,
                            true_obliquity_lower_unc_deg,
                            true_obliquity_limit_flag,
                            orbital_id
                        ) VALUES (
                            %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s
                        )
                    """, (
                        replace_na_with_null(row['radial_velocity_amplitude_ms']),
                        replace_na_with_null(row['radial_velocity_amplitude_upper_unc_ms']),
                        replace_na_with_null(row['radial_velocity_amplitude_lower_unc_ms']),
                        replace_na_with_null(row['radial_velocity_amplitude_limit_flag']),
                        replace_na_with_null(row['projected_obliquity_deg']),
                        replace_na_with_null(row['projected_obliquity_upper_unc_deg']),
                        replace_na_with_null(row['projected_obliquity_lower_unc_deg']),
                        replace_na_with_null(row['projected_obliquity_limit_flag']),
                        replace_na_with_null(row['true_obliquity_deg']),
                        replace_na_with_null(row['true_obliquity_upper_unc_deg']),
                        replace_na_with_null(row['true_obliquity_lower_unc_deg']),
                        replace_na_with_null(row['true_obliquity_limit_flag']),
                        orbital_id  
                    ))

    conn.commit()
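
After a population cell finishes, a quick sanity check is to count the rows in the child table and list any parent rows that received no child row. The sketch below shows both checks on stand-in SQLite tables with invented data; the `LEFT JOIN ... IS NULL` pattern is standard SQL and should carry over to MySQL.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE orbital_info (orbital_id INTEGER PRIMARY KEY);
    CREATE TABLE radial_velocity_and_obliquity (
        rad_id INTEGER PRIMARY KEY,
        radial_velocity_amplitude_ms REAL,
        orbital_id INTEGER REFERENCES orbital_info(orbital_id)
    );
    INSERT INTO orbital_info VALUES (1), (2), (3);
    INSERT INTO radial_velocity_and_obliquity VALUES (1, 55.9, 1), (2, NULL, 2);
""")

# How many rows were loaded into the child table?
child_rows = db.execute("SELECT COUNT(*) FROM radial_velocity_and_obliquity").fetchone()[0]

# Which orbital_info rows have no radial_velocity_and_obliquity row?
unmatched = db.execute("""
    SELECT oi.orbital_id
    FROM orbital_info AS oi
    LEFT JOIN radial_velocity_and_obliquity AS rv ON rv.orbital_id = oi.orbital_id
    WHERE rv.rad_id IS NULL
""").fetchall()
print(child_rows, unmatched)
db.close()
```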

The columns being imported into the eleventh table (coordinates_and_motion) will be: <br> 
ra_str<br>ra_deg<br>dec_str<br>dec_deg<br>galactic_latitude_deg<br>galactic_longitude_deg<br>ecliptic_latitude_deg<br>ecliptic_longitude_deg<br>total_proper_motion_mas_yr<br>total_proper_motion_upper_unc_mas_yr<br>total_proper_motion_lower_unc_mas_yr<br>proper_motion_ra_mas_yr<br>proper_motion_ra_upper_unc_mas_yr<br>proper_motion_ra_lower_unc_mas_yr<br>proper_motion_dec_mas_yr<br>proper_motion_dec_upper_unc_mas_yr<br>proper_motion_dec_lower_unc_mas_yr<br>distance_pc<br>distance_upper_unc_pc<br>distance_lower_unc_pc<br>parallax_mas<br>parallax_upper_unc_mas<br>parallax_lower_unc_mas


In [431]:
eleventh_table_query = """CREATE TABLE IF NOT EXISTS coordinates_and_motion (
    coordinates_id INT AUTO_INCREMENT PRIMARY KEY,
    ra_str VARCHAR(25),
    ra_deg FLOAT,
    dec_str VARCHAR(25),
    dec_deg FLOAT,
    galactic_latitude_deg FLOAT,
    galactic_longitude_deg FLOAT, 
    ecliptic_latitude_deg FLOAT,
    ecliptic_longitude_deg FLOAT,
    total_proper_motion_mas_yr FLOAT,
    total_proper_motion_upper_unc_mas_yr FLOAT, 
    total_proper_motion_lower_unc_mas_yr FLOAT, 
    proper_motion_ra_mas_yr FLOAT,
    proper_motion_ra_upper_unc_mas_yr FLOAT, 
    proper_motion_ra_lower_unc_mas_yr FLOAT, 
    proper_motion_dec_mas_yr FLOAT, 
    proper_motion_dec_upper_unc_mas_yr FLOAT, 
    proper_motion_dec_lower_unc_mas_yr FLOAT,
    distance_pc FLOAT,
    distance_upper_unc_pc FLOAT,
    distance_lower_unc_pc FLOAT,
    parallax_mas FLOAT,
    parallax_upper_unc_mas FLOAT,
    parallax_lower_unc_mas FLOAT,
    orbital_id INT,
    FOREIGN KEY (orbital_id) REFERENCES orbital_info(orbital_id)
);
"""
# Define the CREATE TABLE statement: an auto-incrementing primary key (coordinates_id) and a foreign key to orbital_info(orbital_id).

In [433]:
# Create the table using above query
with engine.connect() as connection:
    connection.execute(text(eleventh_table_query))
    print("Eleventh table created successfully!")

Eleventh table created successfully!


In [435]:
# Populate coordinates_and_motion: resolve each planet's orbital_id, then insert its coordinates
with engine.connect() as connection:
    cursor.execute("USE exoplanets;")

    for _, row in exoplanets.iterrows():
        cursor.execute("SELECT id FROM planet_identifiers WHERE planet_name = %s", (row['planet_name'],))
        result = cursor.fetchone()

        if result:
            planet_id = result[0]  # Fetch the planet_id if the result is correct
        
            cursor.execute("SELECT properties_id FROM planet_properties WHERE planet_id = %s", (planet_id,))
            result_properties = cursor.fetchone()
            
            if result_properties:
                properties_id = result_properties[0]
                
                cursor.execute("SELECT orbital_id FROM orbital_info WHERE properties_id = %s", (properties_id,))
                result_orbital = cursor.fetchone()

                if result_orbital:
                    orbital_id = result_orbital[0]
                    
                    cursor.execute("""
                        INSERT INTO coordinates_and_motion(
                            ra_str,
                            ra_deg,
                            dec_str,
                            dec_deg,
                            galactic_latitude_deg,
                            galactic_longitude_deg,
                            ecliptic_latitude_deg,
                            ecliptic_longitude_deg,
                            total_proper_motion_mas_yr,
                            total_proper_motion_upper_unc_mas_yr,
                            total_proper_motion_lower_unc_mas_yr,
                            proper_motion_ra_mas_yr,
                            proper_motion_ra_upper_unc_mas_yr,
                            proper_motion_ra_lower_unc_mas_yr,
                            proper_motion_dec_mas_yr,
                            proper_motion_dec_upper_unc_mas_yr,
                            proper_motion_dec_lower_unc_mas_yr,
                            distance_pc,
                            distance_upper_unc_pc,
                            distance_lower_unc_pc,
                            parallax_mas,
                            parallax_upper_unc_mas,
                            parallax_lower_unc_mas,
                            orbital_id
                        ) VALUES (
                            %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s
                        )
                    """, (
                        replace_na_with_null(row['ra_str']),
                        replace_na_with_null(row['ra_deg']),
                        replace_na_with_null(row['dec_str']),
                        replace_na_with_null(row['dec_deg']),
                        replace_na_with_null(row['galactic_latitude_deg']),
                        replace_na_with_null(row['galactic_longitude_deg']),
                        replace_na_with_null(row['ecliptic_latitude_deg']),
                        replace_na_with_null(row['ecliptic_longitude_deg']),
                        replace_na_with_null(row['total_proper_motion_mas_yr']),
                        replace_na_with_null(row['total_proper_motion_upper_unc_mas_yr']),
                        replace_na_with_null(row['total_proper_motion_lower_unc_mas_yr']),
                        replace_na_with_null(row['proper_motion_ra_mas_yr']),
                        replace_na_with_null(row['proper_motion_ra_upper_unc_mas_yr']),
                        replace_na_with_null(row['proper_motion_ra_lower_unc_mas_yr']),
                        replace_na_with_null(row['proper_motion_dec_mas_yr']),
                        replace_na_with_null(row['proper_motion_dec_upper_unc_mas_yr']),
                        replace_na_with_null(row['proper_motion_dec_lower_unc_mas_yr']),
                        replace_na_with_null(row['distance_pc']),
                        replace_na_with_null(row['distance_upper_unc_pc']),
                        replace_na_with_null(row['distance_lower_unc_pc']),
                        replace_na_with_null(row['parallax_mas']),
                        replace_na_with_null(row['parallax_upper_unc_mas']),
                        replace_na_with_null(row['parallax_lower_unc_mas']),
                        orbital_id  
                    ))

    conn.commit()
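
The `VALUES` clause above needs exactly one `%s` per column (24 for this table), which is easy to miscount by hand. One option is to build the statement from a single column list. The helper name `build_insert` below is hypothetical, and the column list is truncated for brevity. Table and column names are interpolated directly into the string, so they should only ever come from trusted code, never from user input.

```python
def build_insert(table, columns):
    """Hypothetical helper: return an INSERT statement with one %s per column."""
    column_sql = ", ".join(columns)
    placeholders = ", ".join(["%s"] * len(columns))
    return f"INSERT INTO {table} ({column_sql}) VALUES ({placeholders})"

# Truncated column list, for illustration only
coordinate_columns = ["ra_str", "ra_deg", "dec_str", "dec_deg", "orbital_id"]
statement = build_insert("coordinates_and_motion", coordinate_columns)
print(statement)
```

The same list can then drive the parameter tuple, so the column names and the values cannot drift out of order.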

The columns being imported into the twelfth table (magnitudes) will be: <br>
b_magnitude_johnson<br> 
b_magnitude_johnson_upper_unc<br> 
b_magnitude_johnson_lower_unc<br>
v_magnitude_johnson<br> 
v_magnitude_johnson_upper_unc<br> 
v_magnitude_johnson_lower_unc<br> 
j_magnitude_2mass<br> 
j_magnitude_2mass_upper_unc<br> 
j_magnitude_2mass_lower_unc<br> 
h_magnitude_2mass<br>
h_magnitude_2mass_upper_unc<br> 
h_magnitude_2mass_lower_unc


In [437]:
twelfth_table_query = """CREATE TABLE IF NOT EXISTS magnitudes (
    mag_id INT AUTO_INCREMENT PRIMARY KEY,    
    b_magnitude_johnson FLOAT,
    b_magnitude_johnson_upper_unc FLOAT,
    b_magnitude_johnson_lower_unc FLOAT,
    v_magnitude_johnson FLOAT,
    v_magnitude_johnson_upper_unc FLOAT,
    v_magnitude_johnson_lower_unc FLOAT,
    j_magnitude_2mass FLOAT,
    j_magnitude_2mass_upper_unc FLOAT,
    j_magnitude_2mass_lower_unc FLOAT,
    h_magnitude_2mass FLOAT,
    h_magnitude_2mass_upper_unc FLOAT,
    h_magnitude_2mass_lower_unc FLOAT,
    orbital_id INT,
    FOREIGN KEY (orbital_id) REFERENCES orbital_info(orbital_id)
);
"""
# Define the CREATE TABLE statement: an auto-incrementing primary key (mag_id) and a foreign key to orbital_info(orbital_id).

In [439]:
# Create the table using above query
with engine.connect() as connection:
    connection.execute(text(twelfth_table_query))
    print("Twelfth table created successfully!")

Twelfth table created successfully!


In [441]:
# Populate magnitudes: resolve each planet's orbital_id, then insert its magnitude values
with engine.connect() as connection:
    cursor.execute("USE exoplanets;")

    for _, row in exoplanets.iterrows():
        cursor.execute("SELECT id FROM planet_identifiers WHERE planet_name = %s", (row['planet_name'],))
        result = cursor.fetchone()

        if result:
            planet_id = result[0]  # Fetch the planet_id if the result is correct
        
            cursor.execute("SELECT properties_id FROM planet_properties WHERE planet_id = %s", (planet_id,))
            result_properties = cursor.fetchone()
            
            if result_properties:
                properties_id = result_properties[0]
                
                cursor.execute("SELECT orbital_id FROM orbital_info WHERE properties_id = %s", (properties_id,))
                result_orbital = cursor.fetchone()

                if result_orbital:
                    orbital_id = result_orbital[0]
                    
                    cursor.execute("""
                        INSERT INTO magnitudes(
                            b_magnitude_johnson,
                            b_magnitude_johnson_upper_unc,
                            b_magnitude_johnson_lower_unc,
                            v_magnitude_johnson,
                            v_magnitude_johnson_upper_unc,
                            v_magnitude_johnson_lower_unc,
                            j_magnitude_2mass,
                            j_magnitude_2mass_upper_unc,
                            j_magnitude_2mass_lower_unc,
                            h_magnitude_2mass,
                            h_magnitude_2mass_upper_unc,
                            h_magnitude_2mass_lower_unc,
                            orbital_id
                        ) VALUES (
                            %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s
                        )
                    """, (
                        replace_na_with_null(row['b_magnitude_johnson']),
                        replace_na_with_null(row['b_magnitude_johnson_upper_unc']),
                        replace_na_with_null(row['b_magnitude_johnson_lower_unc']),
                        replace_na_with_null(row['v_magnitude_johnson']),
                        replace_na_with_null(row['v_magnitude_johnson_upper_unc']),
                        replace_na_with_null(row['v_magnitude_johnson_lower_unc']),
                        replace_na_with_null(row['j_magnitude_2mass']),
                        replace_na_with_null(row['j_magnitude_2mass_upper_unc']),
                        replace_na_with_null(row['j_magnitude_2mass_lower_unc']),
                        replace_na_with_null(row['h_magnitude_2mass']),
                        replace_na_with_null(row['h_magnitude_2mass_upper_unc']),
                        replace_na_with_null(row['h_magnitude_2mass_lower_unc']),
                        orbital_id  
                    ))

    conn.commit()

The columns being imported into the thirteenth table (stellar_properties) will be: <br> 
spectral_type<br>stellar_effective_temp_k<br>stellar_effective_temp_upper_unc_k<br>stellar_effective_temp_lower_unc_k<br>stellar_effective_temp_limit_flag<br>stellar_radius_solar_radius<br>stellar_radius_upper_unc_solar_radius<br>stellar_radius_lower_unc_solar_radius<br>stellar_radius_limit_flag<br>stellar_mass_solar_mass<br>stellar_mass_upper_unc_solar_mass<br>stellar_mass_lower_unc_solar_mass<br>stellar_mass_limit_flag<br>stellar_metallicity_dex<br>stellar_metallicity_upper_unc_dex<br>stellar_metallicity_lower_unc_dex<br>stellar_metallicity_limit_flag<br>stellar_metallicity_ratio<br>stellar_luminosity_log_solar<br>stellar_luminosity_upper_unc_log_solar<br>stellar_luminosity_lower_unc_log_solar<br>stellar_luminosity_limit_flag<br>stellar_surface_gravity_log10_cms2<br>stellar_surface_gravity_upper_unc_log10_cms2<br>stellar_surface_gravity_lower_unc_log10_cms2<br>stellar_surface_gravity_limit_flag<br>stellar_age_gyr<br>stellar_age_upper_unc_gyr<br>stellar_age_lower_unc_gyr<br>stellar_age_limit_flag<br>stellar_density_gcm3<br>stellar_density_upper_unc_gcm3<br> stellar_density_lower_unc_gcm3<br>
stellar_density_limit_flag<br>stellar_rotational_velocity_kms
<br>stellar_rotational_velocity_upper_unc_kms<br>stellar_rotational_velocity_lower_unc_kms<br>stellar_rotational_velocity_limit_flag<br>stellar_rotational_period_days<br>stellar_rotational_period_upper_unc_days<br>stellar_rotational_period_lower_unc_days<br>stellar_rotational_period_limit_flag<br>systemic_radial_velocity_kms<br>systemic_radial_velocity_upper_unc_kms<br>systemic_radial_velocity_lower_unc_kms<br>systemic_radial_velocity_limit_flag


In [443]:
thirteenth_table_query = """CREATE TABLE IF NOT EXISTS stellar_properties (
    stellar_id INT AUTO_INCREMENT PRIMARY KEY,    
    spectral_type LONGTEXT,
    stellar_effective_temp_k FLOAT,
    stellar_effective_temp_upper_unc_k FLOAT,
    stellar_effective_temp_lower_unc_k FLOAT,
    stellar_effective_temp_limit_flag BOOLEAN,
    stellar_radius_solar_radius FLOAT, 
    stellar_radius_upper_unc_solar_radius FLOAT, 
    stellar_radius_lower_unc_solar_radius FLOAT,
    stellar_radius_limit_flag BOOLEAN, 
    stellar_mass_solar_mass FLOAT,
    stellar_mass_upper_unc_solar_mass FLOAT, 
    stellar_mass_lower_unc_solar_mass FLOAT,
    stellar_mass_limit_flag BOOLEAN, 
    stellar_metallicity_dex FLOAT,
    stellar_metallicity_upper_unc_dex FLOAT,
    stellar_metallicity_lower_unc_dex FLOAT,
    stellar_metallicity_limit_flag BOOLEAN,
    stellar_metallicity_ratio VARCHAR(10),
    stellar_luminosity_log_solar FLOAT,
    stellar_luminosity_upper_unc_log_solar FLOAT, 
    stellar_luminosity_lower_unc_log_solar FLOAT,
    stellar_luminosity_limit_flag BOOLEAN,
    stellar_surface_gravity_log10_cms2 FLOAT,
    stellar_surface_gravity_upper_unc_log10_cms2 FLOAT,
    stellar_surface_gravity_lower_unc_log10_cms2 FLOAT,
    stellar_surface_gravity_limit_flag BOOLEAN,
    stellar_age_gyr FLOAT,
    stellar_age_upper_unc_gyr FLOAT,
    stellar_age_lower_unc_gyr FLOAT,
    stellar_age_limit_flag BOOLEAN,
    stellar_density_gcm3 FLOAT,
    stellar_density_upper_unc_gcm3 FLOAT,
    stellar_density_lower_unc_gcm3 FLOAT,
    stellar_density_limit_flag BOOLEAN,
    stellar_rotational_velocity_kms FLOAT,
    stellar_rotational_velocity_upper_unc_kms FLOAT,
    stellar_rotational_velocity_lower_unc_kms FLOAT,
    stellar_rotational_velocity_limit_flag BOOLEAN,
    stellar_rotational_period_days FLOAT,
    stellar_rotational_period_upper_unc_days FLOAT,
    stellar_rotational_period_lower_unc_days FLOAT,
    stellar_rotational_period_limit_flag BOOLEAN,
    systemic_radial_velocity_kms FLOAT,
    systemic_radial_velocity_upper_unc_kms FLOAT,
    systemic_radial_velocity_lower_unc_kms FLOAT,
    systemic_radial_velocity_limit_flag BOOLEAN,
    system_id INT,
    FOREIGN KEY (system_id) REFERENCES system_info(system_id)
);
"""
# Define the CREATE TABLE statement: an auto-incrementing primary key (stellar_id) and a foreign key to system_info(system_id).

In [445]:
# Create the table using above query
with engine.connect() as connection:
    connection.execute(text(thirteenth_table_query))
    print("Thirteenth table created successfully!")

Thirteenth table created successfully!


In [447]:
# Populate stellar_properties: resolve each planet's system_id, then insert its stellar values
with engine.connect() as connection:
    cursor.execute("USE exoplanets;")

    for _, row in exoplanets.iterrows():
        cursor.execute("SELECT id FROM planet_identifiers WHERE planet_name = %s", (row['planet_name'],))
        result = cursor.fetchone() 

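        if not result:
            continue  # Skip rows whose planet is missing from planet_identifiers

        planet_id = result[0]

        cursor.execute("SELECT system_id FROM system_info WHERE planet_id = %s", (planet_id,))
        result_system = cursor.fetchone()

        if not result_system:
            continue  # Skip rows with no matching system_info record

        system_id = result_system[0]  # Foreign key for the stellar_properties row
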
        cursor.execute("""
    INSERT INTO stellar_properties (
        spectral_type,
        stellar_effective_temp_k,
        stellar_effective_temp_upper_unc_k,
        stellar_effective_temp_lower_unc_k,
        stellar_effective_temp_limit_flag,
        stellar_radius_solar_radius,
        stellar_radius_upper_unc_solar_radius,
        stellar_radius_lower_unc_solar_radius,
        stellar_radius_limit_flag,
        stellar_mass_solar_mass,
        stellar_mass_upper_unc_solar_mass,
        stellar_mass_lower_unc_solar_mass,
        stellar_mass_limit_flag,
        stellar_metallicity_dex,
        stellar_metallicity_upper_unc_dex,
        stellar_metallicity_lower_unc_dex,
        stellar_metallicity_limit_flag,
        stellar_metallicity_ratio,
        stellar_luminosity_log_solar,
        stellar_luminosity_upper_unc_log_solar,
        stellar_luminosity_lower_unc_log_solar,
        stellar_luminosity_limit_flag,
        stellar_surface_gravity_log10_cms2,
        stellar_surface_gravity_upper_unc_log10_cms2,
        stellar_surface_gravity_lower_unc_log10_cms2,
        stellar_surface_gravity_limit_flag,
        stellar_age_gyr,
        stellar_age_upper_unc_gyr,
        stellar_age_lower_unc_gyr,
        stellar_age_limit_flag,
        stellar_density_gcm3,
        stellar_density_upper_unc_gcm3,
        stellar_density_lower_unc_gcm3,
        stellar_density_limit_flag,
        stellar_rotational_velocity_kms,
        stellar_rotational_velocity_upper_unc_kms,
        stellar_rotational_velocity_lower_unc_kms,
        stellar_rotational_velocity_limit_flag,
        stellar_rotational_period_days,
        stellar_rotational_period_upper_unc_days,
        stellar_rotational_period_lower_unc_days,
        stellar_rotational_period_limit_flag,
        systemic_radial_velocity_kms,
        systemic_radial_velocity_upper_unc_kms,
        systemic_radial_velocity_lower_unc_kms,
        systemic_radial_velocity_limit_flag,
        system_id
    ) VALUES (
        %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s,
        %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, 
        %s, %s, %s, %s, %s, %s, %s
    )
""", (
    replace_na_with_null(row['spectral_type']),
    replace_na_with_null(row['stellar_effective_temp_k']),
    replace_na_with_null(row['stellar_effective_temp_upper_unc_k']),
    replace_na_with_null(row['stellar_effective_temp_lower_unc_k']),
    replace_na_with_null(row['stellar_effective_temp_limit_flag']),
    replace_na_with_null(row['stellar_radius_solar_radius']),
    replace_na_with_null(row['stellar_radius_upper_unc_solar_radius']),
    replace_na_with_null(row['stellar_radius_lower_unc_solar_radius']),
    replace_na_with_null(row['stellar_radius_limit_flag']),
    replace_na_with_null(row['stellar_mass_solar_mass']),
    replace_na_with_null(row['stellar_mass_upper_unc_solar_mass']),
    replace_na_with_null(row['stellar_mass_lower_unc_solar_mass']),
    replace_na_with_null(row['stellar_mass_limit_flag']),
    replace_na_with_null(row['stellar_metallicity_dex']),
    replace_na_with_null(row['stellar_metallicity_upper_unc_dex']),
    replace_na_with_null(row['stellar_metallicity_lower_unc_dex']),
    replace_na_with_null(row['stellar_metallicity_limit_flag']),
    replace_na_with_null(row['stellar_metallicity_ratio']),
    replace_na_with_null(row['stellar_luminosity_log_solar']),
    replace_na_with_null(row['stellar_luminosity_upper_unc_log_solar']),
    replace_na_with_null(row['stellar_luminosity_lower_unc_log_solar']),
    replace_na_with_null(row['stellar_luminosity_limit_flag']),
    replace_na_with_null(row['stellar_surface_gravity_log10_cms2']),
    replace_na_with_null(row['stellar_surface_gravity_upper_unc_log10_cms2']),
    replace_na_with_null(row['stellar_surface_gravity_lower_unc_log10_cms2']),
    replace_na_with_null(row['stellar_surface_gravity_limit_flag']),
    replace_na_with_null(row['stellar_age_gyr']),
    replace_na_with_null(row['stellar_age_upper_unc_gyr']),
    replace_na_with_null(row['stellar_age_lower_unc_gyr']),
    replace_na_with_null(row['stellar_age_limit_flag']),
    replace_na_with_null(row['stellar_density_gcm3']),
    replace_na_with_null(row['stellar_density_upper_unc_gcm3']),
    replace_na_with_null(row['stellar_density_lower_unc_gcm3']),
    replace_na_with_null(row['stellar_density_limit_flag']),
    replace_na_with_null(row['stellar_rotational_velocity_kms']),
    replace_na_with_null(row['stellar_rotational_velocity_upper_unc_kms']),
    replace_na_with_null(row['stellar_rotational_velocity_lower_unc_kms']),
    replace_na_with_null(row['stellar_rotational_velocity_limit_flag']),
    replace_na_with_null(row['stellar_rotational_period_days']),
    replace_na_with_null(row['stellar_rotational_period_upper_unc_days']),
    replace_na_with_null(row['stellar_rotational_period_lower_unc_days']),
    replace_na_with_null(row['stellar_rotational_period_limit_flag']),
    replace_na_with_null(row['systemic_radial_velocity_kms']),
    replace_na_with_null(row['systemic_radial_velocity_upper_unc_kms']),
    replace_na_with_null(row['systemic_radial_velocity_lower_unc_kms']),
    replace_na_with_null(row['systemic_radial_velocity_limit_flag']),
    system_id
))

# Commit the transaction so the inserted rows are saved to the database
conn.commit()
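
The statement above pairs 47 column names with 47 `%s` placeholders and 47 values by hand, so a single miscount raises an error at execution time. As a minimal sketch only (not the approach used in this notebook), the statement and its parameter tuple could be generated from one column list. The names `na_to_none`, `build_insert`, `row_params`, and the table name `stellar_example` are hypothetical; `na_to_none` is a stand-in for `replace_na_with_null` and assumes missing values arrive as float NaN or `None` (it does not handle `pd.NA`).

```python
import math


def na_to_none(value):
    """Map float NaN to None so the driver sends SQL NULL.

    Hypothetical stand-in for replace_na_with_null; assumes missing
    values are float NaN or already None.
    """
    if isinstance(value, float) and math.isnan(value):
        return None
    return value


def build_insert(table, columns):
    """Build a parameterized INSERT with one %s placeholder per column."""
    column_list = ", ".join(columns)
    placeholders = ", ".join(["%s"] * len(columns))
    return f"INSERT INTO {table} ({column_list}) VALUES ({placeholders})"


def row_params(row, data_columns, *extra_values):
    """Collect cleaned values for data_columns, then append extra values
    such as a foreign key that does not come from the row itself."""
    return tuple(na_to_none(row[col]) for col in data_columns) + tuple(extra_values)


# Shortened column list for illustration; the real table has 46 data columns
data_columns = [
    "spectral_type",
    "stellar_effective_temp_k",
    "stellar_radius_solar_radius",
]

# Hypothetical table name; system_id is appended as the final column
sql = build_insert("stellar_example", data_columns + ["system_id"])

# A plain dict stands in for a DataFrame row here; row[col] works on both
example_row = {
    "spectral_type": "G2 V",
    "stellar_effective_temp_k": float("nan"),
    "stellar_radius_solar_radius": 1.0,
}
params = row_params(example_row, data_columns, 7)

# With an open mysql.connector cursor: cursor.execute(sql, params)
```

Because the placeholders are generated from the same list as the column names, the two counts cannot drift apart. Only the values go through `%s` parameters; the table and column names are formatted into the string, so they should come from a fixed list in the code and never from user input.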


In [449]:
# Close the cursor first, then the database connection
cursor.close()
conn.close()
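
If an insert fails partway through, the cell above never runs and the connection stays open. One option, shown here as a hedged sketch, is `contextlib.closing`, which calls `.close()` on exit even when an exception is raised. The example uses an in-memory SQLite database so it runs without a MySQL server; the names `demo_conn`, `demo_cursor`, and the `demo` table are hypothetical. The same pattern should apply to a `mysql.connector` connection and cursor, since both expose `close()`, but note that SQLite uses `?` placeholders where `mysql.connector` uses `%s`.

```python
import sqlite3
from contextlib import closing

# closing() guarantees .close() is called when the block exits,
# whether it finishes normally or raises an exception
with closing(sqlite3.connect(":memory:")) as demo_conn:
    with closing(demo_conn.cursor()) as demo_cursor:
        # Hypothetical two-column table for illustration
        demo_cursor.execute(
            "CREATE TABLE demo (system_id INTEGER, spectral_type TEXT)"
        )
        # None is stored as SQL NULL
        demo_cursor.execute("INSERT INTO demo VALUES (?, ?)", (1, None))
        demo_conn.commit()

        # Confirm the row was written before the connection closes
        demo_cursor.execute("SELECT COUNT(*) FROM demo")
        row_count = demo_cursor.fetchone()[0]
```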

**MySQL Workbench**<br>
To export your database schema as a .PNG:<br>
->Go to your EER Diagram<br>
->File<br>
->Export<br>
->Export as .PNG