# Automated Star Classification with Random Forest

### Data Features
- **_RAJ2000**: Right ascension (deg)
- **_DEJ2000**: Declination (deg)
- **HIP**: Hipparcos catalogue identifier
- **RAhms**: Right ascension (hr, min, sec)
- **DEdms**: Declination (deg, min, sec)
- **Vmag**: Apparent magnitude in the V band, commonly written *m* (mag)
- **Plx**: Trigonometric parallax in milliarcseconds (mas). Distance to the star in parsecs = 1000 / Plx
- **e_Plx**: Standard error of the parallax (mas)
- **B-V**: The color index of a star, calculated as the difference between its apparent magnitudes in the B (blue) and V (visual) photometric bands. It indicates the star's surface temperature and spectral characteristics:
    - Blue stars (hotter) have smaller B-V values (negative or near zero).
    - Reddish stars (cooler) have larger B-V values (positive).
- **Notes**
- **SpType**: The spectral type classifies stars by surface temperature and luminosity. A letter (O, B, A, F, G, K, M, from hottest to coolest) gives the temperature class, and an optional Roman numeral gives the luminosity class:
    - **I, II, III**: Giants - Supergiants (I), bright giants (II), and giants (III); large, luminous stars at various stages of evolution.
    - **IV, V, VI**: Dwarfs - Subgiants (IV), main-sequence stars (V), and subdwarfs (VI).
    - **VII**: White dwarfs - Small, dense stars at the end of their life cycle.
    - *None*: Special stars that do not fit neatly into the giant/dwarf classification scheme.

*(Source: The Hipparcos and Tycho Catalogues)*
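
The feature descriptions above imply two small derivations: a distance from `Plx`, and a giant/dwarf label from the Roman numeral in `SpType`. The sketch below shows one way these could be computed. The function names `distance_pc` and `luminosity_group`, the regular expression, and the choice to use the first numeral when a range such as `IV-V` is given are all illustrative assumptions, not necessarily the approach used later in this notebook.

```python
import re

# Luminosity classes grouped as in the feature list above.
GIANT_CLASSES = {"I", "II", "III"}
DWARF_CLASSES = {"IV", "V", "VI"}

# Longest numerals are listed first so "VII" is not read as "V"
# and "IV" is not read as "I". (Hypothetical pattern for illustration.)
_LUM_CLASS = re.compile(r"VII|VI|IV|V|III|II|I")


def distance_pc(plx_mas):
    """Distance in parsecs from a parallax given in milliarcseconds."""
    # Zero or negative parallaxes cannot be inverted into a distance.
    if plx_mas is None or plx_mas <= 0:
        return None
    return 1000.0 / plx_mas


def luminosity_group(sp_type):
    """Map a spectral type string to 'Giant', 'Dwarf', 'White dwarf', or None."""
    if not isinstance(sp_type, str):
        return None  # missing values (e.g. NaN) carry no label
    match = _LUM_CLASS.search(sp_type)
    if match is None:
        return None  # no Roman numeral, e.g. "F5"
    numeral = match.group(0)  # first numeral found, assumed to be the class
    if numeral in GIANT_CLASSES:
        return "Giant"
    if numeral in DWARF_CLASSES:
        return "Dwarf"
    return "White dwarf"  # the only remaining alternative is "VII"


print(luminosity_group("G8III"))  # Giant
print(luminosity_group("K3V"))    # Dwarf
print(luminosity_group("F5"))     # None
print(distance_pc(10.0))          # 100.0
```

With pandas, the same helpers could be applied column-wise, for example `data['SpType'].map(luminosity_group)`, assuming the column names shown in the list above.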

```python
import pandas as pd

# Load the tab-separated catalogue extract.
data = pd.read_csv('data_only.tsv', delimiter='\t')
```

     _RAJ2000   _DEJ2000  HIP        RAhms        DEdms   Vmag    Plx  e_Plx  \
0    0.000899   1.089009    1  00 00 00.22  +01 05 20.4   9.10   3.54   1.39   
1    0.004265 -19.498840    2  00 00 00.91  -19 29 55.8   9.27  21.90   3.10   
2    0.005024  38.859279    3  00 00 01.20  +38 51 33.4   6.61   2.81   0.63   
3    0.008629 -51.893546    4  00 00 02.01  -51 53 36.8   8.06   7.75   0.97   
4    0.009973 -40.591202    5  00 00 02.39  -40 35 28.4   8.55   2.87   1.11   
..        ...        ...  ...          ...          ...    ...    ...    ...   
195  0.627006  51.829976  196  00 02 30.48  +51 49 47.9   8.95   0.98   1.18   
196  0.632304  42.147779  197  00 02 31.76  +42 08 52.1   8.97   1.62   1.16   
197  0.632749  15.335173  198  00 02 31.85  +15 20 06.7   8.94   5.12   1.11   
198  0.635004  16.256565  199  00 02 32.41  +16 15 23.6   7.24   4.07   0.88   
199  0.635489 -33.992633  200  00 02 32.51  -33 59 33.6  10.32   3.17   1.67   

        B-V Notes SpType  
0     0.482 