# Creation of Stars to be Visualized

This notebook processes the `mag_5_stars` dataset. One small modification was needed to work with it: a `hemisphere` column was added. The notebook therefore reads the edited file, `mag_5_stars_edited.csv`.

`mag_5_stars_edited.csv` should be placed in the `Data Input` folder.

## Importing necessary packages

In [1]:
import pandas as pd
import numpy as np
import os
from pathlib import Path

## Setting up the important directories

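Running the cell below creates two folders if they do not already exist. `Data Input` is where the input data is placed, and `Data Processed` is where the processed output is kept. Two further folders for processed information are expected inside `Data Processed`; the cell below does not create them.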

In [2]:
working_dir = Path.cwd()

data_input_dir = working_dir / "Data Input"
data_process_dir = working_dir / "Data Processed"

# Create both folders; exist_ok=True makes the call a no-op if they already exist.
data_input_dir.mkdir(exist_ok=True)
data_process_dir.mkdir(exist_ok=True)

## Processing

### Mag 5 Star Catalog
---
I obtained the catalog from John P. Pratt's [website](https://www.johnpratt.com/items/astronomy/mag_5_stars.html). The website explains why the catalog was created and what it adds over the catalogs it was derived from.

The cell below reads `mag_5_stars_edited.csv` and sorts the stars by visual magnitude (`Vmag`) in ascending order, so the brightest stars come first.

In [3]:
import_frame = pd.read_csv(str(data_input_dir/"mag_5_stars_edited.csv"))

processed_frame = import_frame.sort_values(by=['Vmag'])
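
Before the processing loop runs, it may help to confirm that the file contains the columns the loop reads. The sketch below is one way to do that. `REQUIRED_COLUMNS` and `find_missing_columns` are hypothetical names introduced here, and the example frame holds made-up values rather than catalog data.

```python
import pandas as pd

# Columns read by the processing loop later in this notebook.
REQUIRED_COLUMNS = ["Name", "RA", "RAm", "RAs", "Dec", "Dm", "Ds", "hemisphere", "Vmag"]


def find_missing_columns(frame, required=REQUIRED_COLUMNS):
    """Return the required column names that are absent from `frame`."""
    return [name for name in required if name not in frame.columns]


# Tiny stand-in frame with made-up values; it lacks `hemisphere` on purpose.
example = pd.DataFrame(
    {"Name": ["Example"], "RA": [6], "RAm": [45], "RAs": [0.0],
     "Dec": [16], "Dm": [30], "Ds": [0.0], "Vmag": [1.0]}
)
missing = find_missing_columns(example)
print(missing)
```

Calling `find_missing_columns(import_frame)` right after `pd.read_csv` would flag an unedited `mag_5_stars` file, since that file has no `hemisphere` column.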

Previewing the first five rows to see the columns present in the catalog (`...` marks columns elided in the display)

In [4]:
processed_frame.head()

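| Unnamed: 0 | HR | Name | ID | # | Ltr | Dbl | Con# | Con | RA | RAm | ... | U-B | Sp Type | pmRA | pmD | Distance | RV | Sid Lat | Sid Lon | Zod lon | Zod |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1007 | 2491 | Sirius | Alp CMa | 9.0 | Alp | | 34.0 | CMa | 6 | 45 | ... | -0.05 | A1Vm | -0.553 | -1.205 | 8.7 | -8 | -39.6353 | 286.5225 | 17 | Gem |
| 584 | 2326 | Canopus | Alp Car | | Alp | | 47.2 | Car | 6 | 23 | ... | 0.1 | F0II | 0.022 | 0.021 | 116.4 | 21 | -75.8413 | 287.4015 | 18 | Gem |
| 419 | 5340 | Arcturus | Alp Boo | 16.0 | Alp | | 25.0 | Boo | 14 | 15 | ... | 1.27 | K1.5III | -1.093 | -1.998 | 36.2 | -5 | 31.0098 | 26.6746 | 27 | Vir |
| 816 | 5459 | Rigil Kentaurus | Alp1 Cen | | Alp1 | A | 37.0 | Cen | 14 | 39 | ... | 0.24 | G2V | -3.642 | 0.699 | 4.3 | -22 | -42.4348 | 61.9195 | 2 | Sco |
| 2322 | 7001 | Vega | Alp Lyr | 3.0 | Alp | | 16.0 | Lyr | 18 | 36 | ... | -0.01 | A0Va | 0.202 | 0.286 | 26.5 | -14 | 61.7597 | 107.757 | 18 | Sgr |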


In the following cell, I process the data and keep the columns I expect to use: the name, the coordinates converted to decimal degrees, and the visual magnitude.

The analysis is limited to the 500 brightest stars. The cutoff is the `count == 500` check in the next cell and can easily be changed there; that check records the `Pmag` and `AlphaPlot` values at that rank, while the loop itself still appends every star to `entry_list`. To convey brightness visually, each star gets a marker size (`Pmag`) and an opacity (`AlphaPlot`) for plotting, both scaled relative to Sirius. Note that this scaling is an arbitrary choice.
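
The scaling used in the next cell reduces to `exp((m_Sirius - Vmag) / 15)` for `Pmag` and `exp((m_Sirius - Vmag) / 10)` for `AlphaPlot`, because `exp(a) / exp(b) = exp(a - b)`. A minimal vectorized sketch of that formula follows. `brightness_scale`, `size_root`, and `alpha_root` are hypothetical names, and `-0.72` is only an illustrative magnitude.

```python
import numpy as np

SIRIUS_VMAG = -1.46  # reference magnitude used in the notebook


def brightness_scale(vmag, size_root=15, alpha_root=10):
    """Return (Pmag, AlphaPlot) arrays for the visual magnitude(s) `vmag`.

    Mirrors the notebook's scaling: exp(Sirius) / exp(Vmag), then a root.
    """
    ratio = np.exp(SIRIUS_VMAG - np.asarray(vmag, dtype=float))
    return ratio ** (1 / size_root), ratio ** (1 / alpha_root)


# The reference star maps to 1.0 for both values; fainter stars map below 1.0.
pmag, alpha = brightness_scale([-1.46, -0.72])
print(pmag, alpha)
```

Larger roots flatten the curve, so faint stars keep a marker size and opacity close to those of the brightest ones. Changing `size_root` or `alpha_root` is one way to tune the plot.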

In [5]:
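entry_list = []

# Sirius, the brightest star in the catalog, is the reference for the scaling.
sirius_mag = -1.46
process_sirius_mag = np.exp(sirius_mag)

# Track the rank of each star so the values at the cutoff can be recorded.
count = 0
Pmag_500 = 0
AlphaPlot_500 = 0

for index, row in processed_frame.iterrows():
    # Right ascension: hours, minutes, seconds -> decimal hours -> degrees.
    RA_hour = row["RA"]
    RA_minute = row["RAm"]
    RA_second = row["RAs"]

    RA_time = RA_hour + (RA_minute/60) + (RA_second/3600)
    RA_deg = RA_time*15  # 1 hour of right ascension = 15 degrees

    hemisphere = row["hemisphere"]

    # Declination: degrees, arcminutes, arcseconds -> decimal degrees.
    Dec_deg = row["Dec"]
    Dec_minute = row["Dm"]
    Dec_second = row["Ds"]

    Dec_decimal = Dec_deg + (Dec_minute/60) + (Dec_second/3600)

    # Southern declinations are negative.
    if hemisphere == "S":
        Dec_decimal = -Dec_decimal

    # Brightness relative to Sirius (1.0 for Sirius, smaller for fainter stars).
    process_mag = np.exp(float(row["Vmag"]))
    process_mag = process_sirius_mag/process_mag
    Pmag = process_mag ** (1/15)       # marker size
    AlphaPlot = process_mag ** (1/10)  # marker opacity
    print(f"{Pmag} | {AlphaPlot}")

    # Record the values at the 500th brightest star (the cutoff).
    count += 1
    if count == 500:
        Pmag_500 = Pmag
        AlphaPlot_500 = AlphaPlot

    new_entry = {'Name': row['Name'], 'RA': RA_deg,
                 "Dec": Dec_decimal, "Vmag": row['Vmag'], "Pmag": Pmag, "AlphaPlot": AlphaPlot}
    entry_list.append(new_entry)

print(f"Pmag 500: {Pmag_500} | AlphaPlot_500 {AlphaPlot_500}")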

1.0 | 1.0
0.9518637888816799 | 0.9286716938412872
0.9096761093060532 | 0.867621256485914
0.9078585752273646 | 0.8650222931107413
0.9054408441009895 | 0.8615691148989583
0.9024277325964113 | 0.8572720210114574
0.9000244644245421 | 0.8538497819684817
0.8845584662462257 | 0.8319358038266718
0.8798533791446438 | 0.8253068684916823
0.8775102290555767 | 0.8220122346781865
0.8710986917457983 | 0.813019649987571
0.8618563524741942 | 0.8001148492945541
0.8572720210114575 | 0.7937394660352428
0.8510083543721982 | 0.7850561775518294
0.8498744326821505 | 0.7834876342628625
0.8408572823643508 | 0.7710515858035663
0.8397368864178139 | 0.7695110237075758
0.8347135501780261 | 0.7626164964050653
0.8347135501780261 | 0.7626164964050653
0.8302735949819326 | 0.7565399032150474
0.8302735949819326 | 0.7565399032150474
0.8291673012150301 | 0.755028335480208
0.8209169487181869 | 0.7437874280201796
0.8138330762829207 | 0.7341807700262634
0.8138330762829207 | 0.7341807700262634
0.8132907017103442 | 0.7334469562
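
The coordinate conversion inside the loop above can also be written as two small standalone functions, which makes it easy to check by hand. This is a sketch: `ra_to_degrees` and `dec_to_degrees` are hypothetical names, the example coordinates are made up, and the code assumes (as the loop does) that the declination columns hold unsigned values with the sign carried by `hemisphere`.

```python
def ra_to_degrees(hours, minutes, seconds):
    """Convert right ascension from hours/minutes/seconds to decimal degrees."""
    decimal_hours = hours + minutes / 60 + seconds / 3600
    return decimal_hours * 15  # 24 hours of right ascension span 360 degrees


def dec_to_degrees(degrees, minutes, seconds, hemisphere):
    """Convert an unsigned declination plus hemisphere flag to signed degrees."""
    decimal = degrees + minutes / 60 + seconds / 3600
    # Assumption: "S" marks the southern hemisphere, any other value is northern.
    return -decimal if hemisphere == "S" else decimal


# Made-up coordinates chosen to give round numbers.
print(ra_to_degrees(6, 45, 0))         # 101.25
print(dec_to_degrees(16, 30, 0, "S"))  # -16.5
```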

## Saving the information

The dataframe is saved as `Clean_Star_List.csv` in the `Data Processed` folder.

In [6]:
columns = ["Name", "RA", "Dec", "Vmag", "Pmag","AlphaPlot"]

# Build the frame in one step from the list of dicts; concatenating row by row
# is slow and triggers a pandas warning when the starting frame is empty.
processed_frame = pd.DataFrame(entry_list, columns=columns)

output_path = data_process_dir / "Clean_Star_List.csv"
processed_frame.to_csv(output_path, index=False)
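
If only the brightest stars should end up in the saved file, the frame can be trimmed before it is written. The sketch below shows one possible way, not necessarily how the cutoff is applied elsewhere in this project. `keep_brightest` is a hypothetical helper and the magnitudes are made up.

```python
import pandas as pd


def keep_brightest(frame, n=500, mag_column="Vmag"):
    """Return the `n` rows with the smallest magnitude (the brightest stars)."""
    ordered = frame.sort_values(by=mag_column)  # ascending: brightest first
    return ordered.head(n).reset_index(drop=True)


# Made-up magnitudes for illustration only.
stars = pd.DataFrame({"Name": ["A", "B", "C", "D"], "Vmag": [2.0, -1.0, 0.5, 4.0]})
top_two = keep_brightest(stars, n=2)
print(top_two["Name"].tolist())
```

Applied to this notebook, `keep_brightest(processed_frame)` would be called just before `to_csv`, with `n` set to the desired cutoff.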

