# **Stars Classification**

## Overview

This notebook classifies stars from their physical properties. A logistic regression model predicts whether a star belongs to spectral class M, using its temperature, luminosity, radius, and absolute magnitude.

## Dataset

The dataset consists of several features for each star, covering both measured physical quantities and categorical labels.

Some of them are:

- Absolute Temperature (in K)
- Relative Luminosity (L/Lo)
- Relative Radius (R/Ro)
- Absolute Magnitude (Mv)
- Star Color (White, Red, Blue, Yellow, Yellow-Orange, etc.)
- Spectral Class (O, B, A, F, G, K, M)
- Star Type **(Red Dwarf, Brown Dwarf, White Dwarf, Main Sequence, SuperGiants, HyperGiants)**

The relative quantities use the following solar reference values:

- Lo = 3.828 x 10^26 Watts (Avg Luminosity of Sun)
- Ro = 6.9551 x 10^8 m (Avg Radius of Sun)
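
As a small illustration of how the relative columns relate to physical units, the sketch below converts L/Lo and R/Ro into watts and meters using the solar reference values listed above. The helper name `to_si` and the example star are hypothetical and not part of the dataset.

```python
# Solar reference values quoted in the dataset description.
L_SUN_WATTS = 3.828e26    # Lo, average luminosity of the Sun
R_SUN_METERS = 6.9551e8   # Ro, average radius of the Sun


def to_si(rel_luminosity, rel_radius):
    """Convert relative luminosity (L/Lo) and radius (R/Ro) to watts and meters."""
    return rel_luminosity * L_SUN_WATTS, rel_radius * R_SUN_METERS


# Hypothetical star with twice the Sun's luminosity and half its radius.
luminosity_w, radius_m = to_si(2.0, 0.5)
print(luminosity_w, radius_m)
```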

## Data Preprocessing

In [None]:
# Import libraries
import pandas as pd
import numpy as np
pd.set_option('future.no_silent_downcasting', True)

In [None]:
# Import data
star = pd.read_csv('Stars.csv')

In [None]:
# Displays the first few rows of the DataFrame.
star.head()

In [None]:
# Overview of the DataFrame, including the data types and non-null counts.
star.info()

In [None]:
# Provides descriptive statistics of the DataFrame
star.describe()

## Feature Selection

In [None]:
# number of categories
star[['Spectral Class']].value_counts()

In [None]:
# Encode the target as binary: spectral class M -> 0, every other class -> 1
star['Spectral Class'] = star['Spectral Class'].replace(
    {'M': 0, 'A': 1, 'B': 1, 'F': 1, 'O': 1, 'K': 1, 'G': 1}
).astype(int)

In [None]:
# number of categories
star[['Star type']].value_counts()

In [None]:
# Encode the main 'Star color' labels as integers (trailing-space variants included).
# Labels that are not listed in the mapping are left unchanged.
star.replace({'Star color':{ 'Red':0, 'Yellow':1, 'White':2, 'White ': 2, 'Blue ':3, 'Blue':3 }}, inplace=True)

In [None]:
# Counts the occurrences of each category in the 'Star color' column.
star[['Star color']].value_counts()
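
The mapping above lists whitespace variants such as `'Blue '` by hand. One alternative, sketched here on a toy Series rather than the real column, is to normalize the labels first (strip whitespace, lower-case, unify hyphens and spaces) so that each color needs only one entry. The sample labels are hypothetical and only illustrate the kind of inconsistency involved.

```python
import pandas as pd

# Toy labels (hypothetical) that mimic inconsistent spelling of color names.
colors = pd.Series(['Red', 'Blue ', 'white', 'Yellow-White', 'yellow white', 'Blue'])

normalized = (
    colors.str.strip()                               # drop leading/trailing spaces
          .str.lower()                               # unify capitalization
          .str.replace(r'[\s-]+', ' ', regex=True)   # treat hyphens and spaces alike
)
print(normalized.value_counts())
```

After normalization, a single dictionary entry per color is enough for the integer encoding.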

In [None]:
# define target and features
y = star['Spectral Class']
X = star[['Temperature (K)', 'Luminosity (L/Lo)', 'Radius (R/Ro)',
       'Absolute magnitude (Mv)']]

## Train the Model

In [None]:
# split
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, train_size = 0.8, random_state = 200)
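
If the two target classes turn out to be unevenly represented, passing `stratify` to `train_test_split` keeps the class proportions the same in both splits. The sketch below uses a toy array (not the star data) to show the effect; whether the notebook needs it depends on the class counts printed earlier.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Toy data (hypothetical): 20 samples, 15 of class 0 and 5 of class 1.
X_toy = np.arange(40).reshape(20, 2)
y_toy = np.array([0] * 15 + [1] * 5)

X_tr, X_te, y_tr, y_te = train_test_split(
    X_toy, y_toy, train_size=0.8, random_state=200, stratify=y_toy
)

# Both splits keep the 3:1 class ratio of the full toy set.
print(np.bincount(y_tr), np.bincount(y_te))
```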

In [None]:
# select model (max_iter raised above the default of 100 to help the solver converge)
from sklearn.linear_model import LogisticRegression
model = LogisticRegression(max_iter=1000)

In [None]:
# train model
model.fit(X_train, y_train)
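
The four features sit on very different numeric scales (temperature in thousands of kelvin, magnitude in tens), which can slow the solver's convergence. A common remedy, shown here as a minimal sketch on synthetic data rather than the notebook's actual model, is to standardize the features inside a pipeline. The names `X_demo`, `y_demo`, and `scaled_model`, and the labelling rule, are assumptions made for illustration.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Synthetic stand-in for two features on different scales (not the real data).
n = 200
temperature = rng.uniform(2000, 40000, n)
magnitude = rng.uniform(-10, 20, n)
X_demo = np.column_stack([temperature, magnitude])
y_demo = (temperature < 10000).astype(int)  # hypothetical labelling rule

# The scaler is fitted on the training data only, then reused at predict time.
scaled_model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
scaled_model.fit(X_demo, y_demo)
print(scaled_model.score(X_demo, y_demo))
```

Wrapping the scaler and the classifier in one pipeline means the same transformation is applied automatically whenever `predict` is called on new data.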

## Make a prediction

In [None]:
# predict
y_pred = model.predict(X_test)

In [None]:
y_pred

In [None]:
# import function
from sklearn.metrics import confusion_matrix, classification_report

In [None]:
confusion_matrix(y_test,y_pred)

In [None]:
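# Plot the confusion matrix using a heatmap
import matplotlib.pyplot as plt
import seaborn as sns

conf_matrix = confusion_matrix(y_test, y_pred)
plt.figure(figsize=(8, 6))
sns.heatmap(conf_matrix, annot=True, fmt="d", cmap="Blues", cbar=False)
plt.title("Confusion Matrix")
plt.xlabel("Predicted Spectral Class (0 = M, 1 = other)")
plt.ylabel("Actual Spectral Class (0 = M, 1 = other)")
plt.show()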

In [None]:
print(classification_report(y_test,y_pred))
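
To make the link between the confusion matrix and the classification report concrete, the sketch below derives precision and recall for class 1 from the four cells of a binary confusion matrix. The labels are toy values chosen for illustration, not the notebook's predictions.

```python
from sklearn.metrics import confusion_matrix, precision_score, recall_score

# Toy labels (hypothetical), using the same 0/1 encoding as the target.
y_true_demo = [0, 0, 0, 1, 1, 1, 1, 1]
y_pred_demo = [0, 0, 1, 1, 1, 1, 0, 1]

# For binary labels, ravel() yields the cells in the order tn, fp, fn, tp.
tn, fp, fn, tp = confusion_matrix(y_true_demo, y_pred_demo).ravel()

precision = tp / (tp + fp)  # share of predicted 1s that are correct
recall = tp / (tp + fn)     # share of actual 1s that were found
print(precision, recall)
```

These values agree with `precision_score` and `recall_score`, which are the same quantities that `classification_report` prints per class.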