# 30 Day Map Challenge

## Day 9 - Space -> Part 1: Processing the star database [data](http://www.astronexus.com/hyg) (HYG 3.0 used)

Plan: Draw star maps, taking inspiration from [Eleanor Lutz](https://github.com/eleanorlutz/western_constellations_atlas_of_space)

In [1]:
import numpy as np
import os
import pandas as pd

Setting the Working Directory

In [2]:
os.getcwd()
os.chdir("c:\\Users\\vicks\\OneDrive\\Data Science (not uni)\\Portfolio\\30 Day Map Challenge\\30 Day Map Challenge Data\\Space")

Importing the data

In [3]:
stars = pd.read_csv("hygdata_v3.csv.gz")
stars.head()

Unnamed: 0,id,hip,hd,hr,gl,bf,proper,ra,dec,dist,...,bayer,flam,con,comp,comp_primary,base,lum,var,var_min,var_max
0,0,,,,,,Sol,0.0,0.0,0.0,...,,,,1,0,,1.0,,,
1,1,1.0,224700.0,,,,,6e-05,1.089009,219.7802,...,,,Psc,1,1,,9.63829,,,
2,2,2.0,224690.0,,,,,0.000283,-19.49884,47.9616,...,,,Cet,1,2,,0.392283,,,
3,3,3.0,224699.0,,,,,0.000335,38.859279,442.4779,...,,,And,1,3,,386.901132,,,
4,4,4.0,224707.0,,,,,0.000569,-51.893546,134.2282,...,,,Phe,1,4,,9.366989,,,


Removing the Sun from the data

In [4]:
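# .copy() makes the filtered frame independent, so adding columns later does not trigger SettingWithCopyWarning
stars = stars[stars["proper"] != "Sol"].copy()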

Building a dictionary that maps Bayer designation abbreviations to their corresponding (non-ASCII) Greek letters

In [5]:
greek_dict = {'Alp': u"α",'Bet': u"β",'Chi': u"χ",'Del': u"δ",'Eps': u"ε",'Eta': u"η", 'Gam': u"γ",'Iot': u"ι",'Kap': u"κ",
              'Lam': u"λ",'Mu': u"μ",'Nu': u"ν", 'Ome': u"ω",'Omi': u"ο",'Phi': u"φ",'Pi': u"π",'Psi': u"ψ",'Rho': u"ρ",
              'Sig': u"σ",'Tau': u"τ",'The': u"θ",'Ups': u"υ",'Xi': u"ξ",'Zet': u"ζ"}

Defining a function that converts the Bayer designations of the stars in the dataframe to Greek letters

In [6]:
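print(stars[pd.notnull(stars['bayer'])]['bayer'].unique())
def get_greek_letter(n):
    '''Convert a Bayer code such as 'Kap-1' to its Greek form, e.g. 'κ1'.'''
    if pd.isnull(n):
        return np.nan
    split = n.split("-")
    # Fall back to the raw abbreviation if it is missing from greek_dict
    greek = greek_dict.get(split[0], split[0])
    if len(split) > 1:
        return greek + split[1]
    return greek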

['Tau' 'The' 'Zet' 'Alp' 'Bet' 'Kap-1' 'Eps' 'Gam-3' 'Kap-2' 'Gam' 'Chi'
 'Sig' 'Iot' 'Pi' 'Rho' 'Kap' 'Eta' 'Lam-1' 'Bet-1' 'Bet-2' 'Lam' 'Bet-3'
 'Lam-2' 'Del' 'Mu' 'Xi' 'Phi-1' 'Omi' 'Nu' 'Phi-2' 'Ups-1' 'Phi-3'
 'Ups-2' 'Phi-4' 'Ome' 'Psi-1' 'Ups' 'Psi-2' 'Phi' 'Psi-3' 'Psi' 'Tau-1'
 'Tau-2' 'Eta-1' 'Gam-2' 'Eta-2' 'Gam-1' 'Xi-1' 'Pi-1' 'Pi-2' 'Xi-2'
 'Iot-1' 'Iot-2' 'Eta-3' 'Rho-1' 'Rho-2' 'Rho-3' 'The-1' 'Tau-3' 'Zet-1'
 'Zet-2' 'Tau-4' 'Chi-1' 'Chi-2' 'Chi-3' 'Tau-5' 'Tau-6' 'Tau-7' 'Tau-8'
 'Tau-9' 'Ome-1' 'Omi-1' 'Omi-2' 'Ome-2' 'Ups-4' 'Del-1' 'Del-2' 'Del-3'
 'The-2' 'Sig-1' 'Sig-2' 'Pi-3' 'Pi-4' 'Pi-5' 'Pi-6' 'Nu-1' 'Nu-2' 'Nu-3'
 'Psi-4' 'Psi-5' 'Psi-6' 'Psi-7' 'Psi-8' 'Psi-9' 'Mu-1' 'Mu-2' 'Sig-3'
 'Alp-1' 'Alp-2' 'Zet-3' 'Zet-4' 'Eps-1' 'Eps-2']
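
As a quick check of the conversion described above, the following is a minimal, self-contained sketch. It assumes an abbreviated three-entry lookup and a hypothetical helper name, `bayer_to_greek`, rather than the notebook's own function, and it falls back to the raw abbreviation when a code is not in the lookup.

```python
import numpy as np
import pandas as pd

# Abbreviated lookup for illustration only; the notebook defines the full dictionary.
greek_lookup = {"Alp": "α", "Kap": "κ", "Tau": "τ"}

def bayer_to_greek(code):
    """Translate codes such as 'Kap-1' to 'κ1'; pass missing values through."""
    if pd.isna(code):
        return np.nan
    abbrev, _, number = code.partition("-")
    # Assumption: unknown abbreviations are kept as-is instead of raising an error.
    return greek_lookup.get(abbrev, abbrev) + number

sample = pd.Series(["Alp", "Kap-1", "Tau-9", np.nan])
converted = sample.apply(bayer_to_greek)
print(converted.tolist())
```

Using `str.partition` avoids indexing into a list that may have only one element when the code has no numeric suffix.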


Adding a Greek letter column to the dataframe

In [7]:
stars['greek_letters'] = stars['bayer'].apply(get_greek_letter)
stars.head()

Unnamed: 0,id,hip,hd,hr,gl,bf,proper,ra,dec,dist,...,flam,con,comp,comp_primary,base,lum,var,var_min,var_max,greek_letters
1,1,1.0,224700.0,,,,,6e-05,1.089009,219.7802,...,,Psc,1,1,,9.63829,,,,
2,2,2.0,224690.0,,,,,0.000283,-19.49884,47.9616,...,,Cet,1,2,,0.392283,,,,
3,3,3.0,224699.0,,,,,0.000335,38.859279,442.4779,...,,And,1,3,,386.901132,,,,
4,4,4.0,224707.0,,,,,0.000569,-51.893546,134.2282,...,,Phe,1,4,,9.366989,,,,
5,5,5.0,224705.0,,,,,0.000665,-40.591202,257.732,...,,Phe,1,5,,21.998851,,,,


Extracting the spectral class information: dropping the subtype and luminosity class to keep only the leading Morgan-Keenan (MK) class letter

In [8]:
print(len(stars[pd.notnull(stars['spect'])]['spect'].unique()), 'unique spectral designations')

4307 unique spectral designations


In [9]:
def get_first_letter(name):
    '''Reduce a spectral designation to its leading class letter'''
    if pd.isnull(name):
        return name
    # Drop the subdwarf prefix, e.g. 'sdM4' -> 'M4'
    if name[0:2] == 'sd':
        name = name[2:]
    # Strip punctuation so the first remaining character is the class letter
    alphas = ''.join(c for c in name if c not in '?:!/;.,[]{}()')
    if not alphas:
        return np.nan
    return alphas[0].upper()
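
To illustrate what this reduction does, here is a small self-contained sketch run on a few hand-written spectral strings (illustrative values, not rows taken from the catalogue). The helper name `spectral_letter` is hypothetical, and it is a slight variant of the function above: it returns the first alphabetic character rather than stripping a fixed set of punctuation.

```python
import numpy as np
import pandas as pd

def spectral_letter(spect):
    """Return the leading class letter of a spectral type, or NaN if there is none."""
    if pd.isna(spect):
        return np.nan
    text = str(spect).strip()
    if text.startswith("sd"):  # subdwarf prefix, e.g. 'sdM4'
        text = text[2:]
    for ch in text:
        if ch.isalpha():
            return ch.upper()
    return np.nan  # nothing alphabetic was found

samples = pd.Series(["G2V", "sdM4", "K0III", "DA", "(B9)", np.nan])
letters = samples.apply(spectral_letter)
print(letters.tolist())
```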

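Replacing the placeholder distance value of 100000 with NaN

In [10]:
# Assign the result back instead of using inplace=True on a column, which newer pandas versions warn about
stars['dist'] = stars['dist'].replace(to_replace=100000, value=np.nan)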
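
A minimal sketch of the same step on a toy dataframe, assuming (as the cell above does) that a distance of exactly 100000 marks a missing value. The `toy` frame and its `name` column are made up for illustration.

```python
import numpy as np
import pandas as pd

# Toy data: the middle row carries the placeholder distance.
toy = pd.DataFrame({
    "name": ["A", "B", "C"],
    "dist": [219.7802, 100000.0, 47.9616],
})

# Replace the placeholder with NaN and assign the result back to the column.
toy["dist"] = toy["dist"].replace(100000, np.nan)

print(toy["dist"].isna().sum())
```

With the placeholder turned into NaN, aggregates such as `toy["dist"].mean()` skip the missing row instead of being skewed by an artificial 100000.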

Extracting list of unique spectral designations

In [11]:
stars['spect_desig'] = stars['spect'].apply(get_first_letter)    
print(len(stars[pd.notnull(stars['spect_desig'])]['spect_desig'].unique()), 'unique spectral designations')
print(stars[pd.notnull(stars['spect_desig'])]['spect_desig'].unique())

14 unique spectral designations
['F' 'K' 'B' 'G' 'M' 'A' 'C' 'R' 'O' 'W' 'N' 'S' 'D' 'P']


Building a colour dictionary so plotted stars can be colour coded by spectral designation

In [12]:
color_dict = { 
    'O':'#57F0DE', 'B':'#06D3AC', 'A':'#028DAE', 'F':'#216975', 'G':'#4D4C88', 'K':'#6F3AA4', 'M':'#5A3874',  'L':'#FF2620',
    'T':'#D6B4F8', 'Y':'#275DC6', 'C':'#009263', 'R':'#009263', 'W':'#009263', 'N':'#009263', 'S':'#009263', 'D':'#009263',
    'P':'#009263', 'nan': '#30275C'
}

Adding a colour column to the dataframe, with a dark default for stars that have no spectral designation

In [13]:
stars['color'] = stars['spect_desig'].replace(to_replace=color_dict)
stars['color'] = stars['color'].replace(to_replace=np.nan, value='#30275C')

Adding a line colour column, with a lavender outline for the dark NaN-coloured stars

In [14]:
stars['linecolor'] = stars['color'].replace(['#30275C'], ['#B6ACE6'])
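
The two colour steps above can also be sketched with `Series.map` and `Series.where`. This is an alternative formulation on a toy dataframe, not the notebook's own code; the three-entry `palette` is an abbreviated stand-in for the full dictionary, and the constant names are hypothetical.

```python
import numpy as np
import pandas as pd

# Abbreviated palette for illustration; hex values are taken from the dictionary above.
palette = {"O": "#57F0DE", "G": "#4D4C88", "M": "#5A3874"}
DEFAULT_FILL = "#30275C"  # dark fill for stars without a spectral designation
DEFAULT_EDGE = "#B6ACE6"  # lavender outline so the dark fill stays visible

toy = pd.DataFrame({"spect_desig": ["G", "M", np.nan, "O"]})

# map() yields NaN for missing or unmapped keys, which fillna() turns into the default fill.
toy["color"] = toy["spect_desig"].map(palette).fillna(DEFAULT_FILL)

# Keep the fill colour as the outline, except where the fill is the dark default.
toy["linecolor"] = toy["color"].where(toy["color"] != DEFAULT_FILL, DEFAULT_EDGE)

print(toy)
```

One difference worth noting: `map` sends any letter that is absent from the palette to the default fill, whereas `replace` would leave the raw letter in the colour column.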

Saving the processed dataset to a CSV, and the stars visible to the naked eye (magnitude 6.5 or brighter) to a separate CSV

In [16]:
stars.to_csv("hygdata_processed.csv", index = False)
stars65 = stars[stars["mag"] <= 6.5]
stars65.to_csv("hygdata_processed_mag65.csv", index=False)
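
As a sanity check on the magnitude filter and the export, here is a small sketch that filters a toy dataframe and round-trips it through an in-memory CSV. The toy values and the use of `io.StringIO` in place of a file on disk are assumptions made for illustration.

```python
import io
import pandas as pd

# Toy data: lower magnitude means a brighter star.
toy = pd.DataFrame({"id": [1, 2, 3], "mag": [-1.44, 6.5, 9.1]})

# Keep stars at magnitude 6.5 or brighter, the cut-off used above.
visible = toy[toy["mag"] <= 6.5]

# Write to an in-memory buffer instead of disk, then read it back.
buffer = io.StringIO()
visible.to_csv(buffer, index=False)
buffer.seek(0)
reloaded = pd.read_csv(buffer)

print(len(reloaded))
```

Passing `index=False` keeps the dataframe index out of the file, so reloading does not produce an extra `Unnamed: 0` column.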
