# Improved Gaia Data
Based on the data mappings provided by [VizieR](https://vizier.cds.unistra.fr/viz-bin/VizieR-3?-source=I/355), we created an improved dataset of stellar radii, mass, and atomic abundances.

From this dataset, we can calculate the stellar core temperature using more accurate formulas based on the ideal gas law (or a more detailed model where possible).

We will also derive error formulas for the core temperature.
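
The temperature formula is not spelled out at this point. As a rough sketch only, one common order-of-magnitude estimate combines the ideal gas law with hydrostatic balance to give `T_c ≈ G * M * mu * m_u / (k_B * R)`. The solar radius value, the mean particle mass of 0.6 u (a typical figure for ionized gas of solar composition), and both function names below are assumptions introduced for illustration; they are not taken from the dataset.

```python
import math

G = 6.67430e-11              # gravitational constant, m^3 kg^-1 s^-2
K_B = 1.380649e-23           # Boltzmann constant, J/K
ATOMIC_UNIT = 1.6605402e-27  # kg, same value used later in this notebook
SOLAR_MASS = 1.989e30        # kg, same value used later in this notebook
SOLAR_RADIUS = 6.957e8       # m (assumption: not defined in the notebook)


def core_temperature_estimate(mass_solar, radius_solar, mean_particle_mass_u):
    """Order-of-magnitude core temperature in kelvin: G*M*mu*m_u / (k_B*R)."""
    mass = mass_solar * SOLAR_MASS
    radius = radius_solar * SOLAR_RADIUS
    return G * mass * mean_particle_mass_u * ATOMIC_UNIT / (K_B * radius)


def relative_error(mass_rel_err, radius_rel_err):
    """First-order error propagation for T ~ M / R with independent errors."""
    return math.sqrt(mass_rel_err ** 2 + radius_rel_err ** 2)


# Hypothetical check with solar inputs and an assumed mean particle mass of 0.6 u
t_sun = core_temperature_estimate(1.0, 1.0, 0.6)
print(f"Estimated solar core temperature: {t_sun:.2e} K")
```

For solar inputs the estimate is of order 10^7 K, which is the expected order of magnitude. It is a scaling relation rather than a stellar model, so it should be treated as a baseline to improve on.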

## Data Processing
First we will have to find the absolute proportion of each element in the star. The Gaia data gives the abundances in a relative format compared to the Sun.
Most are in relation to iron, but iron is related to hydrogen. So the only concentrations that are required are hydrogen and iron.

The relative abundance is defined by the following formula:

![Element Abundance Formula](imgs/latex_element_abundance.png)

Where:
`E sub 1` = Any one element such as iron, silicon, or the "metals" (all elements except hydrogen and helium)

`E sub 2` = Another element that E sub 1 is compared to, usually hydrogen or iron

Subscript star represents the ratio of the two elements in the given star

Subscript sun represents the ratio of the two elements in the Sun
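
As a minimal sketch of the definition above, the relative abundance is the difference between two base-10 logarithms: `log10(E1/E2)` in the star minus `log10(E1/E2)` in the Sun. Inverting it recovers the star's number ratio. The function names are hypothetical; the example values are the solar iron log abundance from the table in the next section (7.50 - 12 = -4.5) and a relative iron abundance of -0.09.

```python
import math


def bracket_abundance(ratio_star, ratio_sun):
    """Relative abundance [E1/E2] from the two number ratios."""
    return math.log10(ratio_star) - math.log10(ratio_sun)


def star_ratio(bracket, log_ratio_sun):
    """Invert the definition: number ratio E1/E2 in the star."""
    return 10 ** (bracket + log_ratio_sun)


# Solar log10(Fe/H) is 7.50 - 12 = -4.5; the star is reported at [Fe/H] = -0.09
fe_per_h = star_ratio(-0.09, -4.5)
print(f"Iron atoms per hydrogen atom in the star: {fe_per_h:.3e}")

# Round trip: feeding the ratio back in returns the relative abundance
print(round(bracket_abundance(fe_per_h, 10 ** -4.5), 6))
```

A relative abundance of 0 means the star has the solar ratio, and each step of 1 corresponds to a factor of 10.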

## Solar Abundances
Solar abundances are taken from [this paper](https://www.jstor.org/stable/pdf/1741141.pdf), where elemental abundances are measured relative to hydrogen
and expressed as the base-10 logarithm of the number of atoms of that element per 10^12 atoms of hydrogen.

The Sun is 71% hydrogen by mass (91.2% of its atoms are hydrogen) according to [this page](http://www.hyperphysics.gsu.edu/hbase/Tables/suncomp.html).
The Sun's mass is 1.989 × 10^30 kg, so it contains approximately 8.4 × 10^56 hydrogen atoms.

The relevant abundances are:

| Element   | Log Abundance Per 10^12 Hydrogen Atoms | % of Sun's Mass | # of Atoms       |
|-----------|----------------------------------------|-----------------|------------------|
| Hydrogen  | 12                                     | 71%             | 8.438245 • 10^56 |
| Helium    | 10.8                                   | 26.89884%       | 8.049642 • 10^55 |
| Carbon    | 8.62                                   | 0.35273%        | 3.517646 • 10^53 |
| Nitrogen  | 7.94                                   | 0.08594%        | 7.349404 • 10^52 |
| Oxygen    | 8.84                                   | 0.77976%        | 5.837839 • 10^53 |
| Magnesium | 7.60                                   | 0.06817%        | 3.359326 • 10^52 |
| Silicon   | 7.65                                   | 0.08838%        | 3.769226 • 10^52 |
| Sulfur    | 7.2                                    | 0.03580%        | 1.337372 • 10^52 |
| Calcium   | 6.35                                   | 0.00632%        | 1.889088 • 10^51 |
| Iron      | 7.50                                   | 0.12441%        | 2.668407 • 10^52 |
| Nickel    | 6.28                                   | 0.00788%        | 1.607874 • 10^51 |

In [1]:
# First calculate the absolute atom count of each element from its log abundance

def format_scientific(n):
    a = '%E' % n
    return a.split('E')[0].rstrip('0').rstrip('.') + ' • 10^' + a.split('E')[1].lstrip('+')


HYDROGEN_MASS_CONTENT = 0.71
HYDROGEN_ATOM_CONTENT = .912
HELIUM_ATOM_CONTENT = .087
ATOMIC_UNIT = 1.6605402E-27  # kg
SOLAR_MASS = 1.989 * 10 ** 30  # kg
H_MASS = 1.00784 * ATOMIC_UNIT
HE_MASS = 4.002602 * ATOMIC_UNIT
CARBON_MASS = 12.011 * ATOMIC_UNIT
NITROGEN_MASS = 14.0067 * ATOMIC_UNIT
OXYGEN_MASS = 15.999 * ATOMIC_UNIT
NEON_MASS = 20.1797 * ATOMIC_UNIT
MG_MASS = 24.305 * ATOMIC_UNIT
SILICON_MASS = 28.0855 * ATOMIC_UNIT
SULFUR_MASS = 32.065 * ATOMIC_UNIT
CALCIUM_MASS = 40.08 * ATOMIC_UNIT
IRON_MASS = 55.845 * ATOMIC_UNIT
NICKEL_MASS = 58.6934 * ATOMIC_UNIT

SOLAR_HYDROGEN_ATOM_COUNT = (HYDROGEN_MASS_CONTENT * SOLAR_MASS) / H_MASS
SOLAR_ATOM_COUNT = SOLAR_HYDROGEN_ATOM_COUNT / HYDROGEN_ATOM_CONTENT
SOLAR_HELIUM_ATOM_COUNT = SOLAR_ATOM_COUNT * HELIUM_ATOM_CONTENT
HE_COUNT = SOLAR_ATOM_COUNT * HELIUM_ATOM_CONTENT
CARBON_COUNT = 10 ** 8.62 / 10 ** 12 * SOLAR_HYDROGEN_ATOM_COUNT
NITROGEN_COUNT = 10 ** 7.94 / 10 ** 12 * SOLAR_HYDROGEN_ATOM_COUNT
OXYGEN_COUNT = 10 ** 8.84 / 10 ** 12 * SOLAR_HYDROGEN_ATOM_COUNT
MAGNESIUM_COUNT = 10 ** 7.60 / 10 ** 12 * SOLAR_HYDROGEN_ATOM_COUNT
SILICON_COUNT = 10 ** 7.65 / 10 ** 12 * SOLAR_HYDROGEN_ATOM_COUNT
SULFUR_COUNT = 10 ** 7.2 / 10 ** 12 * SOLAR_HYDROGEN_ATOM_COUNT
CALCIUM_COUNT = 10 ** 6.35 / 10 ** 12 * SOLAR_HYDROGEN_ATOM_COUNT
IRON_COUNT = 10 ** 7.50 / 10 ** 12 * SOLAR_HYDROGEN_ATOM_COUNT
NICKEL_COUNT = 10 ** 6.28 / 10 ** 12 * SOLAR_HYDROGEN_ATOM_COUNT

# Print the number of atoms and the percentage of the sun's mass
print(f"Hydrogen: {SOLAR_HYDROGEN_ATOM_COUNT * H_MASS / SOLAR_MASS * 100:.5f}% | {format_scientific(SOLAR_HYDROGEN_ATOM_COUNT)} atoms")
print(f"Helium: {HE_COUNT * HE_MASS / SOLAR_MASS * 100:.5f}% | {format_scientific(HE_COUNT)} atoms")
print(f"Carbon: {CARBON_COUNT * CARBON_MASS / SOLAR_MASS * 100:.5f}% | {format_scientific(CARBON_COUNT)} atoms")
print(f"Nitrogen: {NITROGEN_COUNT * NITROGEN_MASS / SOLAR_MASS * 100:.5f}% | {format_scientific(NITROGEN_COUNT)} atoms")
print(f"Oxygen: {OXYGEN_COUNT * OXYGEN_MASS / SOLAR_MASS * 100:.5f}% | {format_scientific(OXYGEN_COUNT)} atoms")
print(f"Magnesium: {MAGNESIUM_COUNT * MG_MASS / SOLAR_MASS * 100:.5f}% | {format_scientific(MAGNESIUM_COUNT)} atoms")
print(f"Silicon: {SILICON_COUNT * SILICON_MASS / SOLAR_MASS * 100:.5f}% | {format_scientific(SILICON_COUNT)} atoms")
print(f"Sulfur: {SULFUR_COUNT * SULFUR_MASS / SOLAR_MASS * 100:.5f}% | {format_scientific(SULFUR_COUNT)} atoms")
print(f"Calcium: {CALCIUM_COUNT * CALCIUM_MASS / SOLAR_MASS * 100:.5f}% | {format_scientific(CALCIUM_COUNT)} atoms")
print(f"Iron: {IRON_COUNT * IRON_MASS / SOLAR_MASS * 100:.5f}% | {format_scientific(IRON_COUNT)} atoms")
print(f"Nickel: {NICKEL_COUNT * NICKEL_MASS / SOLAR_MASS * 100:.5f}% | {format_scientific(NICKEL_COUNT)} atoms")

Hydrogen: 71.00000% | 8.438245 • 10^56 atoms
Helium: 26.89884% | 8.049642 • 10^55 atoms
Carbon: 0.35273% | 3.517646 • 10^53 atoms
Nitrogen: 0.08594% | 7.349404 • 10^52 atoms
Oxygen: 0.77976% | 5.837839 • 10^53 atoms
Magnesium: 0.06817% | 3.359326 • 10^52 atoms
Silicon: 0.08838% | 3.769226 • 10^52 atoms
Sulfur: 0.03580% | 1.337372 • 10^52 atoms
Calcium: 0.00632% | 1.889088 • 10^51 atoms
Iron: 0.12441% | 2.668407 • 10^52 atoms
Nickel: 0.00788% | 1.607874 • 10^51 atoms
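
The same calculation can be written as a loop over a table of elements, which also makes it easy to check how much of the Sun's mass the listed elements account for. This is only a sketch: the dictionary layout and the function names are assumptions, while the log abundances, atomic weights, and constants are the ones used above.

```python
ATOMIC_UNIT = 1.6605402e-27  # kg
SOLAR_MASS = 1.989e30        # kg
H_MASS = 1.00784 * ATOMIC_UNIT
SOLAR_HYDROGEN_ATOM_COUNT = 0.71 * SOLAR_MASS / H_MASS

# element: (log abundance per 10^12 hydrogen atoms, atomic weight in u)
ELEMENTS = {
    "Carbon": (8.62, 12.011),
    "Nitrogen": (7.94, 14.0067),
    "Oxygen": (8.84, 15.999),
    "Magnesium": (7.60, 24.305),
    "Silicon": (7.65, 28.0855),
    "Sulfur": (7.2, 32.065),
    "Calcium": (6.35, 40.08),
    "Iron": (7.50, 55.845),
    "Nickel": (6.28, 58.6934),
}


def atom_count(log_abundance):
    """Absolute number of atoms in the Sun for a given log abundance."""
    return 10 ** (log_abundance - 12) * SOLAR_HYDROGEN_ATOM_COUNT


def mass_percent(log_abundance, atomic_weight):
    """Share of the Sun's mass held by one element, in percent."""
    return atom_count(log_abundance) * atomic_weight * ATOMIC_UNIT / SOLAR_MASS * 100


for name, (log_abundance, weight) in ELEMENTS.items():
    print(f"{name}: {mass_percent(log_abundance, weight):.5f}% | {atom_count(log_abundance):.6e} atoms")

listed_metal_percent = sum(mass_percent(*values) for values in ELEMENTS.values())
print(f"Listed elements heavier than helium: {listed_metal_percent:.3f}% of the Sun's mass")
```

Added to the hydrogen and helium shares printed above, the listed elements do not quite reach 100%. The remainder presumably belongs to elements that are not in the table, such as neon.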


## Necessary Constants

The following constants are necessary for calculating average atomic mass of elements using spectral data taken from Gaia DR3.

In [2]:
import numpy as np

# Log10 of iron to hydrogen ratio
RATIO_FE_HYDROGEN = np.log10(IRON_COUNT / SOLAR_HYDROGEN_ATOM_COUNT)
RATIO_HE_HYDROGEN = np.log10(HE_COUNT / SOLAR_HYDROGEN_ATOM_COUNT)
SOLAR_METALICITY = np.log10((SOLAR_ATOM_COUNT - SOLAR_HYDROGEN_ATOM_COUNT - HE_COUNT) / SOLAR_HYDROGEN_ATOM_COUNT)
RATIO_SI_FE = np.log10(SILICON_COUNT / IRON_COUNT)
RATIO_CA_FE = np.log10(CALCIUM_COUNT / IRON_COUNT)
RATIO_S_FE = np.log10(SULFUR_COUNT / IRON_COUNT)
RATIO_MG_FE = np.log10(MAGNESIUM_COUNT / IRON_COUNT)
RATIO_N_FE = np.log10(NITROGEN_COUNT / IRON_COUNT)
RATIO_NI_FE = np.log10(NICKEL_COUNT / IRON_COUNT)
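
Because every metal count above is derived from the same hydrogen count, the hydrogen count cancels and each metal-to-metal or metal-to-hydrogen log ratio reduces to a difference of two entries in the solar abundance table. The sketch below illustrates that shortcut; the dictionary and function name are hypothetical. It does not apply to `RATIO_HE_HYDROGEN` or `SOLAR_METALICITY`, which the notebook derives from atom fractions rather than from the table.

```python
# Log abundances per 10^12 hydrogen atoms, copied from the solar abundance table
LOG_ABUNDANCE = {
    "H": 12.0,
    "N": 7.94,
    "Mg": 7.60,
    "Si": 7.65,
    "S": 7.2,
    "Ca": 6.35,
    "Fe": 7.50,
    "Ni": 6.28,
}


def solar_log_ratio(numerator, denominator):
    """log10 of the solar number ratio numerator/denominator."""
    return LOG_ABUNDANCE[numerator] - LOG_ABUNDANCE[denominator]


print(f"log10(Fe/H) = {solar_log_ratio('Fe', 'H'):.2f}")
print(f"log10(Si/Fe) = {solar_log_ratio('Si', 'Fe'):.2f}")
```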

### Data Collection

Now we must retrieve the data from the Gaia DR3 data set.

The data is stored in a table called `astrophysical_parameters` (additional data like right ascension and declination are stored in `gaia_source` and will be included via a join).

We will use the following ADQL query to retrieve the data:

```sql
SELECT TOP 1000000
        gaia.source_id, 
        gaia.ra AS ra,
        gaia.dec AS dec,
        parameters.logg_gspphot AS log_surface_gravity,
        parameters.logg_gspphot_lower AS log_surface_gravity_lower,
        parameters.logg_gspphot_upper AS log_surface_gravity_upper,
        parameters.distance_gspphot AS dist,
        parameters.distance_gspphot_lower AS distance_lower,
        parameters.distance_gspphot_upper AS distance_upper,
        parameters.mass_flame AS mass,
        parameters.mass_flame_lower AS mass_lower,
        parameters.mass_flame_upper AS mass_upper,
        parameters.radius_gspphot AS radius,
        parameters.radius_gspphot_lower AS radius_lower,
        parameters.radius_gspphot_upper AS radius_upper,
        parameters.fem_gspspec AS log_fe_h_abundance,
        parameters.fem_gspspec_lower AS log_fe_h_abundance_lower,
        parameters.fem_gspspec_upper AS log_fe_h_abundance_upper, -- Fe to hydrogen
        parameters.mh_gspspec AS log_metalicity,
        parameters.mh_gspspec_lower AS log_metalicity_lower,
        parameters.mh_gspspec_upper AS log_metalicity_upper, -- Metal (atoms heavier than helium) to hydrogen
        parameters.sife_gspspec AS log_si_fe_abundance,
        parameters.sife_gspspec_lower AS log_si_fe_abundance_lower,
        parameters.sife_gspspec_upper AS log_si_fe_abundance_upper, -- Silicon to iron
        parameters.cafe_gspspec AS log_ca_fe_abundance,
        parameters.cafe_gspspec_lower AS log_ca_fe_abundance_lower,
        parameters.cafe_gspspec_upper AS log_ca_fe_abundance_upper, -- Calcium to iron
        parameters.tife_gspspec AS log_ti_fe_abundance,
        parameters.tife_gspspec_lower AS log_ti_fe_abundance_lower,
        parameters.tife_gspspec_upper AS log_ti_fe_abundance_upper, -- Titanium to iron
        parameters.mgfe_gspspec AS log_mg_fe_abundance,
        parameters.mgfe_gspspec_lower AS log_mg_fe_abundance_lower,
        parameters.mgfe_gspspec_upper AS log_mg_fe_abundance_upper, -- Magnesium to iron
        parameters.ndfe_gspspec AS log_nd_fe_abundance,
        parameters.ndfe_gspspec_lower AS log_nd_fe_abundance_lower,
        parameters.ndfe_gspspec_upper AS log_nd_fe_abundance_upper, -- Neodymium to iron
        parameters.sfe_gspspec AS log_s_fe_abundance,
        parameters.sfe_gspspec_lower AS log_s_fe_abundance_lower,
        parameters.sfe_gspspec_upper AS log_s_fe_abundance_upper, -- Sulfur to iron
        parameters.zrfe_gspspec AS log_zr_fe_abundance,
        parameters.zrfe_gspspec_lower AS log_zr_fe_abundance_lower,
        parameters.zrfe_gspspec_upper AS log_zr_fe_abundance_upper, -- Zirconium to iron
        parameters.nfe_gspspec AS log_n_fe_abundance,
        parameters.nfe_gspspec_lower AS log_n_fe_abundance_lower,
        parameters.nfe_gspspec_upper AS log_n_fe_abundance_upper, -- Nitrogen to iron
        parameters.crfe_gspspec AS log_cr_fe_abundance,
        parameters.crfe_gspspec_lower AS log_cr_fe_abundance_lower,
        parameters.crfe_gspspec_upper AS log_cr_fe_abundance_upper, -- Chromium to iron
        parameters.cefe_gspspec AS log_ce_fe_abundance,
        parameters.cefe_gspspec_lower AS log_ce_fe_abundance_lower,
        parameters.cefe_gspspec_upper AS log_ce_fe_abundance_upper, -- Cerium to iron
        parameters.nife_gspspec AS log_ni_fe_abundance,
        parameters.nife_gspspec_lower AS log_ni_fe_abundance_lower,
        parameters.nife_gspspec_upper AS log_ni_fe_abundance_upper -- Nickel to iron
FROM gaiadr3.astrophysical_parameters AS parameters
INNER JOIN gaiadr3.gaia_source as gaia
ON gaia.source_id = parameters.source_id
WHERE 
        parameters.mass_flame IS NOT NULL
        AND parameters.radius_gspphot IS NOT NULL
        AND parameters.fem_gspspec IS NOT NULL
```

### NOTE
It may be necessary to run this query directly in the Gaia Archive, because it can take too long to run and return an incomplete result.
Gaia's TAP service has a 1 minute timeout for queries, so running the query in the archive was required to get the full dataset.

In [15]:
# Import gaia libraries
import pandas as pd
from astropy.table import Table
from astropy.io import fits
from astroquery.gaia import Gaia

Gaia.MAIN_GAIA_TABLE = "gaiadr3.gaia_source"

## Log in to Gaia
Use the credentials stored in the `gaia/CREDENTIALS` file.

In [None]:
Gaia.login(credentials_file='gaia/CREDENTIALS')
username = 'mwidmaie'

## Run job

In [None]:
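# The synchronous launch_job is capped at 2000 rows, so use the asynchronous job for a result this large
job = Gaia.launch_job_async("""SELECT TOP 1000000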
        gaia.source_id, 
        gaia.ra AS ra,
        gaia.dec AS dec,
        parameters.logg_gspphot AS log_surface_gravity,
        parameters.logg_gspphot_lower AS log_surface_gravity_lower,
        parameters.logg_gspphot_upper AS log_surface_gravity_upper,
        parameters.distance_gspphot AS dist,
        parameters.distance_gspphot_lower AS distance_lower,
        parameters.distance_gspphot_upper AS distance_upper,
        parameters.mass_flame AS mass,
        parameters.mass_flame_lower AS mass_lower,
        parameters.mass_flame_upper AS mass_upper,
        parameters.radius_gspphot AS radius,
        parameters.radius_gspphot_lower AS radius_lower,
        parameters.radius_gspphot_upper AS radius_upper,
        parameters.fem_gspspec AS log_fe_h_abundance,
        parameters.fem_gspspec_lower AS log_fe_h_abundance_lower,
        parameters.fem_gspspec_upper AS log_fe_h_abundance_upper, -- Fe to hydrogen
        parameters.mh_gspspec AS log_metalicity,
        parameters.mh_gspspec_lower AS log_metalicity_lower,
        parameters.mh_gspspec_upper AS log_metalicity_upper, -- Metal (atoms heavier than helium) to hydrogen
        parameters.sife_gspspec AS log_si_fe_abundance,
        parameters.sife_gspspec_lower AS log_si_fe_abundance_lower,
        parameters.sife_gspspec_upper AS log_si_fe_abundance_upper, -- Silicon to iron
        parameters.cafe_gspspec AS log_ca_fe_abundance,
        parameters.cafe_gspspec_lower AS log_ca_fe_abundance_lower,
        parameters.cafe_gspspec_upper AS log_ca_fe_abundance_upper, -- Calcium to iron
        parameters.mgfe_gspspec AS log_mg_fe_abundance,
        parameters.mgfe_gspspec_lower AS log_mg_fe_abundance_lower,
        parameters.mgfe_gspspec_upper AS log_mg_fe_abundance_upper, -- Magnesium to iron
        parameters.sfe_gspspec AS log_s_fe_abundance,
        parameters.sfe_gspspec_lower AS log_s_fe_abundance_lower,
        parameters.sfe_gspspec_upper AS log_s_fe_abundance_upper, -- Sulfur to iron
        parameters.nfe_gspspec AS log_n_fe_abundance,
        parameters.nfe_gspspec_lower AS log_n_fe_abundance_lower,
        parameters.nfe_gspspec_upper AS log_n_fe_abundance_upper, -- Nitrogen to iron
        parameters.nife_gspspec AS log_ni_fe_abundance,
        parameters.nife_gspspec_lower AS log_ni_fe_abundance_lower,
        parameters.nife_gspspec_upper AS log_ni_fe_abundance_upper -- Nickel to iron
FROM gaiadr3.astrophysical_parameters AS parameters
INNER JOIN gaiadr3.gaia_source as gaia
ON gaia.source_id = parameters.source_id
WHERE 
        parameters.mass_flame IS NOT NULL
        AND parameters.radius_gspphot IS NOT NULL
        AND parameters.fem_gspspec IS NOT NULL""", dump_to_file=True, output_format='votable', output_file='data/gaia_astrophysical_params.vot')

results = job.get_results()

## Getting GALAH Survey Data (DR 3)
We can download the FITS file from GALAH's [cloud storage](https://cloud.datacentral.org.au/teamdata/GALAH/public/GALAH_DR3/GALAH_DR3_main_allstar_v2.fits).

We will then load the file and match the data with Gaia's data.

In [None]:
import requests

url = 'https://cloud.datacentral.org.au/teamdata/GALAH/public/GALAH_DR3/GALAH_DR3_main_allstar_v2.fits'

response = requests.get(url)
response.raise_for_status()

# Store to 'data/GALAH_DR3_main_allstar_v2.fits', the same path that is loaded later
with open('data/GALAH_DR3_main_allstar_v2.fits', 'wb') as f:
    f.write(response.content)

### Loading the data

Now we load the data into a pandas dataframe and we can begin to process it.
In general, it is safe to assume that any remaining mass in the star is helium.

Here we will run error calculations, determine average atomic mass, and compute temperature and pressure.

**Please note that it may take a long time to load the dataset because it is very large**
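
As a starting point for the error calculations, the `_lower` and `_upper` columns can be pushed through the same conversion as the central value to give an interval on the absolute atom count. This is a sketch under stated assumptions: `abundance_interval` is a hypothetical helper, the base count of 10^57 hydrogen atoms is an illustrative placeholder, and the uncertainty of the base count itself (which depends on the stellar mass) is ignored.

```python
def true_abundance(relative, solar, atom_base):
    """Absolute atom count from a relative log abundance and the solar log ratio."""
    return 10 ** (relative + solar) * atom_base


def abundance_interval(value, lower, upper, solar, atom_base):
    """Return (low, central, high) atom counts for one relative abundance."""
    return tuple(true_abundance(x, solar, atom_base) for x in (lower, value, upper))


# Illustrative inputs: [Fe/H] = -0.09 with bounds -0.19 and 0.04,
# solar log10(Fe/H) = -4.5, and a placeholder base count of 1e57 hydrogen atoms
low, central, high = abundance_interval(-0.09, -0.19, 0.04, -4.5, 1.0e57)
print(f"Iron atoms: {central:.3e} (from {low:.3e} to {high:.3e})")
```

Because the bounds are given in log space, the resulting interval is asymmetric around the central value in linear space.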

In [13]:
gaia_astrophysical_parameters: pd.DataFrame = Table.read('data/gaia_astrophysical_params.vot').to_pandas()

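# The table is stored in the first extension HDU; the primary HDU (index 0) of a FITS table file carries no table data
with fits.open('data/GALAH_DR3_main_allstar_v2.fits') as data:
    galah_spectroscopy: pd.DataFrame = pd.DataFrame(data[1].data)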

Unnamed: 0,SOURCE_ID,ra,dec,log_surface_gravity,log_surface_gravity_lower,log_surface_gravity_upper,dist,distance_lower,distance_upper,mass,...,log_n_fe_abundance_upper,log_cr_fe_abundance,log_cr_fe_abundance_lower,log_cr_fe_abundance_upper,log_ce_fe_abundance,log_ce_fe_abundance_lower,log_ce_fe_abundance_upper,log_ni_fe_abundance,log_ni_fe_abundance_lower,log_ni_fe_abundance_upper
0,6030022070837144192,256.451915,-28.386731,0.4258,0.4236,0.4298,8873.726562,8792.102539,8896.735352,1.924311,...,,,,,,,,,,
1,6030024480434041088,256.473786,-28.289514,3.9771,3.9747,3.9794,177.857895,177.256805,178.513504,1.409851,...,0.65,,,,,,,,,
2,5961446561404741888,264.533028,-38.853071,0.8011,0.7857,0.8183,4266.241211,4183.351074,4363.954102,1.917952,...,,,,,,,,,,
3,5253591734375632768,160.349961,-62.539926,2.0188,1.9949,2.0451,2655.576416,2542.323242,2756.025879,5.191580,...,,,,,,,,,,
4,6030054785677545216,255.460271,-28.874299,2.8480,2.8385,2.8586,833.728271,824.458679,842.214905,1.325225,...,,,,,,,,,,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
897941,1795090244313637760,328.126261,23.536874,3.2907,3.2828,3.2952,650.739502,647.503113,656.571472,1.610108,...,,,,,,,,,,
897942,1796199411028057344,326.934031,24.756984,2.2941,2.2742,2.3111,2155.986572,2120.895508,2202.656494,3.501088,...,,,,,,,,,,
897943,1796205252183600256,326.939726,24.866595,2.4276,2.4196,2.4526,427.665710,416.779785,431.484314,3.001646,...,0.04,,,,,,,-0.01,-0.15,0.1
897944,1796205900723417472,326.932210,24.903695,2.7891,2.7769,2.8004,946.286682,934.122070,959.664307,2.701854,...,,,,,,,,,,


In [None]:
gaia_astrophysical_parameters

In [None]:
galah_spectroscopy

### Solve absolute abundances
Stellar models are needed to predict each star's hydrogen and helium content, which in turn determines its average atomic mass.
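
Once the atom counts are known, the average atomic mass is the count-weighted mean of the atomic weights. The sketch below uses the solar counts computed earlier as example input; `mean_atomic_mass` and the dictionary layout are hypothetical. It averages over atoms only and does not count free electrons, which would also contribute particles in an ionized stellar interior.

```python
# Atomic weights in atomic mass units, same values as the constants above
ATOMIC_WEIGHT = {"H": 1.00784, "He": 4.002602, "Fe": 55.845}


def mean_atomic_mass(counts):
    """Count-weighted mean atomic mass in u for a dict of element -> atom count."""
    total_atoms = sum(counts.values())
    total_mass = sum(n * ATOMIC_WEIGHT[element] for element, n in counts.items())
    return total_mass / total_atoms


# Example input: solar hydrogen, helium, and iron counts from the table above
solar_counts = {"H": 8.438245e56, "He": 8.049642e55, "Fe": 2.668407e52}
solar_mean = mean_atomic_mass(solar_counts)
print(f"Mean atomic mass: {solar_mean:.4f} u")
```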

In [14]:
test_star = gaia_astrophysical_parameters.iloc[1]

# For now, we will have to use solar data as the basis for hydrogen content

HYDROGEN_COUNT = test_star['mass'] * SOLAR_MASS * HYDROGEN_MASS_CONTENT / H_MASS
ATOM_TOTAL = HYDROGEN_COUNT / HYDROGEN_ATOM_CONTENT

# Base formula is relative abundance = log(star abundance) - log(solar abundance)
def true_abundance(relative, solar, atom_base):
    return 10**(relative + solar) * atom_base

metals = true_abundance(test_star['log_metalicity'], SOLAR_METALICITY, HYDROGEN_COUNT)
iron_atoms = true_abundance(test_star['log_fe_h_abundance'], RATIO_FE_HYDROGEN, HYDROGEN_COUNT)
silicon_atoms = true_abundance(test_star['log_si_fe_abundance'], RATIO_SI_FE, iron_atoms)

remaining_metals = metals - iron_atoms - silicon_atoms
helium_atoms = ATOM_TOTAL - HYDROGEN_COUNT - metals

test_star

SOURCE_ID                    6.030024e+18
ra                           2.564738e+02
dec                         -2.828951e+01
log_surface_gravity          3.977100e+00
log_surface_gravity_lower    3.974700e+00
log_surface_gravity_upper    3.979400e+00
dist                         1.778579e+02
distance_lower               1.772568e+02
distance_upper               1.785135e+02
mass                         1.409851e+00
mass_lower                   1.369776e+00
mass_upper                   1.449930e+00
radius                       2.090500e+00
radius_lower                 2.082200e+00
radius_upper                 2.099100e+00
log_fe_h_abundance          -9.000000e-02
log_fe_h_abundance_lower    -1.900000e-01
log_fe_h_abundance_upper     4.000000e-02
log_metalicity               4.000000e-02
log_metalicity_lower         2.000000e-02
log_metalicity_upper         6.000000e-02
log_si_fe_abundance          6.000000e-02
log_si_fe_abundance_lower   -5.000000e-02
log_si_fe_abundance_upper    1.400