# Stars

The file **stars.csv** contains information about 240 stars. Each row has the following columns:

- **Temperature (K)** — the temperature in Kelvin;
- **Luminosity (L/Lo)** — the luminosity of the star relative to the solar luminosity Lo = 3.828 × 10^26 W;
- **Radius (R/Ro)** — the radius of the star relative to the solar radius Ro = 6.9551 × 10^8 m;
- **Absolute magnitude (Mv)** — the absolute magnitude of the star;
- **Star color** — the color of the star;
- **Star type** — the type of star, represented by a number from 0 to 5, where:
	- 0 — Red Dwarf,
	- 1 — Brown Dwarf,
	- 2 — White Dwarf,
	- 3 — Main Sequence,
	- 4 — Super Giants,
	- 5 — Hyper Giants;
- **Spectral Class** — the spectral class of the star (one of O, B, A, F, G, K, or M).
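Because luminosity and radius are stored relative to the Sun, absolute values follow from multiplying by the solar constants quoted in the list above. A minimal sketch of that arithmetic; the names `L_SUN_W`, `R_SUN_M`, and `to_absolute` are hypothetical and only used for illustration:

```python
# Solar constants as quoted in the column descriptions.
L_SUN_W = 3.828e26   # solar luminosity, watts
R_SUN_M = 6.9551e8   # solar radius, metres


def to_absolute(luminosity_rel: float, radius_rel: float) -> tuple[float, float]:
    """Convert solar-relative luminosity and radius to watts and metres."""
    return luminosity_rel * L_SUN_W, radius_rel * R_SUN_M


# Example input: a star with Luminosity(L/Lo) = 198000.0 and Radius(R/Ro) = 10.2.
lum_w, rad_m = to_absolute(198000.0, 10.2)
print(f"{lum_w:.3e} W, {rad_m:.3e} m")
```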

The task is taken from [fadeevlecturer.github.io](https://fadeevlecturer.github.io/python_lectures/docs/index.html).

**`Task:`**

1. **Clean the color column:** standardize the values in this column so that variations like 'Blue white', 'Blue White', and 'Blue-white' are treated as the same;
2. **Star type names:** create a new column where the star type is represented as a full string instead of a number;
3. **Convert spectral class to numbers:** add a new column where the spectral class is represented by numbers, using the following mapping:
	- O → 0,
	- B → 1,
	- A → 2,
	- F → 3,
	- G → 4,
	- K → 5,
	- M → 6;
4. **Count the number of stars:** for each star color, star type, and spectral class, calculate the number of stars;
5. **Star type analysis:** for each star type, find the minimum, average, and maximum values of the absolute magnitude;
6. **Spectral class analysis:** for each spectral class, find the minimum, average, and maximum values of the temperature;
7. **Correlation analysis:** compute pairwise correlations between all numerical columns.
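Steps 4 to 7 map onto a handful of standard pandas calls. The sketch below runs them on a tiny made-up frame that only borrows the column names of `stars.csv`; the values are illustrative, and counting each column separately is just one possible reading of step 4.

```python
import pandas as pd

# Made-up sample rows; only the column names match stars.csv.
demo = pd.DataFrame({
    "Temperature (K)": [3000, 3500, 9000, 30000],
    "Absolute magnitude(Mv)": [17.0, 12.0, 13.0, -4.0],
    "Star type": [0, 1, 2, 3],
    "Star color": ["Red", "Red", "White", "Blue"],
    "Spectral Class": ["M", "M", "A", "O"],
})

# Step 4: number of stars per value of a categorical column.
color_counts = demo["Star color"].value_counts()

# Steps 5 and 6: minimum, mean, and maximum per group.
magnitude_by_type = (
    demo.groupby("Star type")["Absolute magnitude(Mv)"].agg(["min", "mean", "max"])
)
temperature_by_class = (
    demo.groupby("Spectral Class")["Temperature (K)"].agg(["min", "mean", "max"])
)

# Step 7: pairwise correlations, restricted to numeric columns.
correlations = demo.select_dtypes("number").corr()
```

Restricting `corr()` to numeric columns avoids errors from the string columns; any numerically encoded category (such as `Star type`) is still included.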


**`Cleaning the color column:`**

In [22]:
import pandas as pd
stars = pd.read_csv('stars.csv')
stars.sample(n=10)

| | Temperature (K) | Luminosity(L/Lo) | Radius(R/Ro) | Absolute magnitude(Mv) | Star type | Star color | Spectral Class |
|---|---|---|---|---|---|---|---|
| 84 | 14100 | 0.00067 | 0.0089 | 12.17 | 2 | Blue White | B |
| 81 | 10574 | 0.00014 | 0.0092 | 12.02 | 2 | White | F |
| 142 | 18290 | 0.0013 | 0.00934 | 12.78 | 2 | Blue | B |
| 17 | 3692 | 0.00367 | 0.47 | 10.8 | 1 | Red | M |
| 145 | 8924 | 0.00028 | 0.00879 | 14.87 | 2 | Blue white | A |
| 168 | 17383 | 342900.0 | 30.0 | -6.09 | 4 | Blue | O |
| 99 | 36108 | 198000.0 | 10.2 | -4.4 | 3 | Blue | O |
| 95 | 11250 | 672.0 | 6.98 | -2.3 | 3 | Blue-white | A |
| 19 | 3441 | 0.039 | 0.351 | 11.18 | 1 | Red | M |
| 150 | 29560 | 188000.0 | 6.02 | -4.01 | 3 | Blue-white | B |


In [23]:
stars['Star color'].value_counts()

Star color
Red                   112
Blue                   55
Blue-white             26
Blue White             10
yellow-white            8
White                   7
Blue white              3
Yellowish White         3
white                   3
Whitish                 2
Orange                  2
yellowish               2
Pale yellow orange      1
White-Yellow            1
Blue                    1
Yellowish               1
Orange-Red              1
Blue white              1
Blue-White              1
Name: count, dtype: int64

In [24]:
# Strip stray whitespace first, so that 'Blue ' and 'Blue white ' collapse
# into 'Blue' and 'Blue white' before the spelling variants are merged.
stars['Star color'] = stars['Star color'].str.strip()
replace_color_dict = {
  'Blue-white': 'Blue white',
  'Blue White': 'Blue white',
  'Blue-White': 'Blue white',
  'yellow-white': 'Yellowish white',
  'Yellowish White': 'Yellowish white',
  'white': 'White',
  'yellowish': 'Yellowish',
  'White-Yellow': 'White yellow',
  'Orange-Red': 'Orange red'
}
stars['Star color'] = stars['Star color'].replace(replace_color_dict)
stars['Star color'].unique()

array(['Red', 'Blue white', 'White', 'Yellowish white',
       'Pale yellow orange', 'Blue', 'Whitish', 'Orange', 'White yellow',
       'Yellowish', 'Orange red'], dtype=object)
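As a hedged alternative to listing every spelling by hand, differences in case, hyphenation, and surrounding whitespace can be normalized programmatically. The helper name `normalize_color` is hypothetical, and the sample values are a few raw spellings from the `value_counts()` output above. Synonyms such as 'yellow-white' and 'Yellowish White' would still need a small dictionary afterwards.

```python
import pandas as pd


def normalize_color(value: str) -> str:
    """Trim whitespace, turn hyphens into spaces, capitalize only the first letter."""
    words = value.strip().replace("-", " ").split()
    return " ".join(words).capitalize()


# A few raw spellings, including two with a trailing space.
raw = pd.Series(["Blue-white", "Blue White", "Blue white ", "Blue ", "white", "Orange-Red"])
cleaned = raw.map(normalize_color)
print(cleaned.unique())
```

Trailing spaces are easy to miss in printed output, so stripping them inside the helper guards against duplicates that look identical on screen.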

**`Star type names:`**

In [25]:
stars['Star type'].unique()

array([0, 1, 2, 3, 4, 5])

In [26]:
replace_type_dict = {
  0: 'Red Dwarf',
  1: 'Brown Dwarf',
  2: 'White Dwarf',
  3: 'Main Sequence',
  4: 'Super Giants',
  5: 'Hyper Giants'
}
stars['Star type names'] = stars['Star type'].replace(replace_type_dict)
stars['Star type names'].unique()

array(['Red Dwarf', 'Brown Dwarf', 'White Dwarf', 'Main Sequence',
       'Super Giants', 'Hyper Giants'], dtype=object)
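One caveat with `replace` is that a code missing from the dictionary passes through unchanged and stays a number in the new column. A minimal sketch of a stricter variant using `Series.map`, which turns unmapped codes into `NaN` so they can be detected; the sample codes below are made up, and 7 is deliberately invalid:

```python
import pandas as pd

type_names = {
    0: "Red Dwarf",
    1: "Brown Dwarf",
    2: "White Dwarf",
    3: "Main Sequence",
    4: "Super Giants",
    5: "Hyper Giants",
}

codes = pd.Series([0, 3, 5, 7])   # made-up codes; 7 has no entry in the dictionary
names = codes.map(type_names)     # unmapped codes become NaN

# Codes that could not be translated.
unmapped = codes[names.isna()]
print(unmapped.tolist())
```

For the data here, `unique()` shows only the codes 0 to 5, so both approaches give the same result.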

In [27]:
stars.sample(n=10)

| | Temperature (K) | Luminosity(L/Lo) | Radius(R/Ro) | Absolute magnitude(Mv) | Star type | Star color | Spectral Class | Star type names |
|---|---|---|---|---|---|---|---|---|
| 69 | 2871 | 0.00072 | 0.12 | 19.43 | 0 | Red | M | Red Dwarf |
| 141 | 21020 | 0.0015 | 0.0112 | 11.52 | 2 | Blue | B | White Dwarf |
| 105 | 14245 | 231000.0 | 42.0 | -6.12 | 4 | Blue | O | Super Giants |
| 153 | 16390 | 1278.0 | 5.68 | -3.32 | 3 | Blue white | B | Main Sequence |
| 195 | 3598 | 0.0027 | 0.67 | 13.667 | 1 | Red | M | Brown Dwarf |
| 213 | 22012 | 6748.0 | 6.64 | -2.55 | 3 | Blue white | B | Main Sequence |
| 124 | 3511 | 0.00064 | 0.109 | 17.12 | 0 | Red | M | Red Dwarf |
| 23 | 8500 | 0.0005 | 0.01 | 14.5 | 2 | White | A | White Dwarf |
| 79 | 3158 | 0.00135 | 0.161 | 13.98 | 1 | Red | M | Brown Dwarf |
| 130 | 3095 | 0.00019 | 0.492 | 10.87 | 1 | Red | M | Brown Dwarf |


**`Converting spectral class to numbers:`**

In [28]:
stars['Spectral Class'].unique()

array(['M', 'B', 'A', 'F', 'O', 'K', 'G'], dtype=object)

In [34]:
# The mapping follows the O, B, A, F, G, K, M order given in the task.
replace_class_dict = {
  'O': 0,
  'B': 1,
  'A': 2,
  'F': 3,
  'G': 4,
  'K': 5,
  'M': 6
}
# map() yields an integer column, which the correlation step needs.
stars['Spectral Class numbers'] = stars['Spectral Class'].map(replace_class_dict)
stars['Spectral Class numbers'].unique()

array([6, 1, 2, 3, 0, 5, 4])
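The same numbers can also be derived from the order of the classes instead of a hand-written dictionary. A minimal sketch using an ordered `pd.Categorical`; the variable names are hypothetical, and the sample values are the classes in the order returned by `unique()` above:

```python
import pandas as pd

# Spectral classes in the order required by the task: O -> 0, ..., M -> 6.
spectral_order = ["O", "B", "A", "F", "G", "K", "M"]

classes = pd.Series(["M", "B", "A", "F", "O", "K", "G"])
as_category = pd.Categorical(classes, categories=spectral_order, ordered=True)

# .codes holds the position of each value in spectral_order (-1 for unknown values).
codes = pd.Series(as_category.codes, index=classes.index)
print(codes.tolist())
```

The resulting codes are integers, so the column can take part in the correlation analysis, and any class outside the list shows up as -1 rather than silently keeping its letter.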

In [35]:
stars.sample(n=10)

| | Temperature (K) | Luminosity(L/Lo) | Radius(R/Ro) | Absolute magnitude(Mv) | Star type | Star color | Spectral Class | Star type names | Spectral Class numbers |
|---|---|---|---|---|---|---|---|---|---|
| 89 | 19860 | 0.0011 | 0.0131 | 11.34 | 2 | Blue | B | White Dwarf | 1 |
| 209 | 19360 | 0.00125 | 0.00998 | 11.62 | 2 | Blue | B | White Dwarf | 1 |
| 79 | 3158 | 0.00135 | 0.161 | 13.98 | 1 | Red | M | Brown Dwarf | 6 |
| 31 | 30000 | 28840.0 | 6.3 | -4.2 | 3 | Blue white | B | Main Sequence | 1 |
| 62 | 2983 | 0.00024 | 0.094 | 16.09 | 0 | Red | M | Red Dwarf | 6 |
| 9 | 2700 | 0.00018 | 0.13 | 16.05 | 0 | Red | M | Red Dwarf | 6 |
| 11 | 3129 | 0.0122 | 0.3761 | 11.79 | 1 | Red | M | Brown Dwarf | 6 |
| 138 | 3324 | 0.0034 | 0.34 | 12.23 | 1 | Red | M | Brown Dwarf | 6 |
| 197 | 3496 | 0.00125 | 0.336 | 14.94 | 1 | Red | M | Brown Dwarf | 6 |
| 82 | 8930 | 0.00056 | 0.0095 | 13.78 | 2 | White | A | White Dwarf | 2 |
