# Stars

The file **stars.csv** contains information about 240 stars. Each star is described by the following columns:

- **Temperature (K)** — the temperature in Kelvin;
- **Luminosity (L/Lo)** — the luminosity of the star relative to the solar luminosity Lo = 3.828 * 10^26 W;
- **Radius (R/Ro)** — the radius of the star relative to the radius of the Sun Ro = 6.9551 * 10^8 m;
- **Absolute magnitude (Mv)** — the absolute magnitude of the star;
- **Star color** — the color of the star;
- **Star type** — the type of star, represented by a number from 0 to 5, where:
	- 0 — Red Dwarf,
	- 1 — Brown Dwarf,
	- 2 — White Dwarf,
	- 3 — Main Sequence,
	- 4 — Super Giants,
	- 5 — Hyper Giants;
- **Spectral Class** — the spectral class of the star (one of O, B, A, F, G, K, or M).
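
Because luminosity and radius are given relative to the Sun, absolute values follow from multiplying by the two constants in the list above. A minimal sketch, assuming the column names `Luminosity(L/Lo)` and `Radius(R/Ro)` as they appear in the file header; the two-row frame is a stand-in for the full file (values taken from the dataset preview), and the names of the new columns are only illustrative:

```python
import pandas as pd

L_SUN_W = 3.828e26   # solar luminosity in watts (value from the list above)
R_SUN_M = 6.9551e8   # solar radius in metres (value from the list above)

# Two rows from the dataset preview; a stand-in for the full stars.csv.
demo = pd.DataFrame({
    'Luminosity(L/Lo)': [0.0005, 131000.0],
    'Radius(R/Ro)': [0.1542, 24.0],
})

# Illustrative helper columns holding the absolute values in SI units.
demo['Luminosity (W)'] = demo['Luminosity(L/Lo)'] * L_SUN_W
demo['Radius (m)'] = demo['Radius(R/Ro)'] * R_SUN_M
print(demo)
```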

The task is taken from [fadeevlecturer.github.io](https://fadeevlecturer.github.io/python_lectures/docs/index.html).

**`Task:`**

1. **Clean the color column:** standardize the values in this column so that variations like 'Blue white', 'Blue White', and 'Blue-white' are treated as the same;
2. **Star type names:** create a new column where the star type is represented as a full string instead of a number;
3. **Convert spectral class to numbers:** add a new column where the spectral class is represented by numbers, using the following mapping:
	- O → 0,
	- B → 1,
	- A → 2,
	- F → 3,
	- G → 4,
	- K → 5,
	- M → 6;
4. **Count the number of stars:** for each star color, star type, and spectral class, calculate the number of stars;
5. **Star type analysis:** for each star type, find the minimum, average, and maximum values of the absolute magnitude;
6. **Spectral class analysis:** for each spectral class, find the minimum, average, and maximum values of the temperature;
7. **Correlation analysis:** compute pairwise correlations between all numerical columns.


**`1. Cleaning of the color column:`**

In [4]:
import pandas as pd

# Load the dataset and preview 10 random rows
stars = pd.read_csv('stars.csv')
stars.sample(n=10)

| Unnamed: 0 | Temperature (K) | Luminosity(L/Lo) | Radius(R/Ro) | Absolute magnitude(Mv) | Star type | Star color | Spectral Class |
|---|---|---|---|---|---|---|---|
| 165 | 7282 | 131000.0 | 24.0 | -7.22 | 4 | Blue | O |
| 223 | 23440 | 537430.0 | 81.0 | -5.975 | 4 | Blue | O |
| 114 | 3610 | 132000.0 | 1522.0 | -10.86 | 5 | Red | M |
| 134 | 3542 | 0.0009 | 0.62 | 14.23 | 1 | Red | M |
| 116 | 4015 | 282000.0 | 1534.0 | -11.39 | 5 | Red | K |
| 69 | 2871 | 0.00072 | 0.12 | 19.43 | 0 | Red | M |
| 1 | 3042 | 0.0005 | 0.1542 | 16.6 | 0 | Red | M |
| 59 | 3535 | 195000.0 | 1546.0 | -11.36 | 5 | Red | M |
| 141 | 21020 | 0.0015 | 0.0112 | 11.52 | 2 | Blue | B |
| 58 | 3752 | 209000.0 | 955.0 | -11.24 | 5 | Red | M |
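
The `Unnamed: 0` column in the preview suggests that the file stores its row index as a first column without a header name. One way to avoid carrying it around as data is to pass `index_col=0` to `pd.read_csv`. A minimal sketch, using an in-memory CSV with two rows from the preview as a stand-in for `stars.csv`; that the real file begins with such a header-less index column is an assumption based on the output above:

```python
import io
import pandas as pd

# Stand-in for stars.csv: the header starts with an empty name for the index column.
csv_text = (
    ",Temperature (K),Luminosity(L/Lo),Radius(R/Ro),"
    "Absolute magnitude(Mv),Star type,Star color,Spectral Class\n"
    "1,3042,0.0005,0.1542,16.6,0,Red,M\n"
    "69,2871,0.00072,0.12,19.43,0,Red,M\n"
)

# Default read: the unnamed first column shows up as 'Unnamed: 0'.
raw = pd.read_csv(io.StringIO(csv_text))
print(raw.columns[0])

# With index_col=0 the first column becomes the row index instead.
demo = pd.read_csv(io.StringIO(csv_text), index_col=0)
print(demo.columns.tolist())
```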


In [7]:
stars['Star color'].value_counts()

Star color
Red                   112
Blue                   55
Blue-white             26
Blue White             10
yellow-white            8
White                   7
Blue white              3
Yellowish White         3
white                   3
Whitish                 2
Orange                  2
yellowish               2
Pale yellow orange      1
White-Yellow            1
Blue                    1
Yellowish               1
Orange-Red              1
Blue white              1
Blue-White              1
Name: count, dtype: int64
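
`Blue` and `Blue white` each appear twice in the counts above, which usually means the printed labels hide a difference such as trailing whitespace. A small sketch of one way to expose it, run on a stand-in series rather than the real column (the variable names are illustrative):

```python
import pandas as pd

# Stand-in for stars['Star color'] with a few of the variants listed above.
colors = pd.Series(['Blue', 'Blue ', 'Blue white', 'Blue white ', 'Red'])

# Values that change when stripped carry leading or trailing whitespace.
has_padding = colors != colors.str.strip()

# repr() makes the otherwise invisible spaces visible inside the quotes.
padded = colors[has_padding].map(repr).tolist()
print(padded)
```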

In [8]:
# Map spelling variants (case, hyphens) to one canonical label per color
replace_color_dict = {
  'Blue-white': 'Blue white',
  'Blue White': 'Blue white',
  'yellow-white': 'Yellowish white',
  'Yellowish White': 'Yellowish white',
  'white': 'White',
  'yellowish': 'Yellowish',
  'White-Yellow': 'White yellow',
  'Orange-Red': 'Orange red',
  'Blue-White': 'Blue white'
}
stars['Star color'] = stars['Star color'].replace(replace_color_dict)
# Check which distinct values remain after the replacement
stars['Star color'].unique()

array(['Red', 'Blue white', 'White', 'Yellowish white',
       'Pale yellow orange', 'Blue', 'Whitish', 'Orange', 'White yellow',
       'Blue ', 'Yellowish', 'Orange red', 'Blue white '], dtype=object)
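
The result still contains `'Blue '` and `'Blue white '`, because their trailing spaces keep them from matching any dictionary key. A possible alternative is to normalize the text before any mapping: strip whitespace, turn hyphens into spaces, and capitalize only the first letter. The sketch below runs on a stand-in series, not on the real column, and it does not merge synonyms such as `Yellow white` and `Yellowish white`, which would still need an explicit mapping like the dictionary above:

```python
import pandas as pd

# Stand-in for stars['Star color'] with variants seen in the counts above.
colors = pd.Series(['Blue-white', 'Blue White', 'Blue white ', 'Blue-White',
                    'Blue ', 'white', 'yellow-white', 'Orange-Red'])

cleaned = (
    colors.str.strip()                          # drop leading/trailing whitespace
          .str.replace('-', ' ', regex=False)   # 'Blue-white' -> 'Blue white'
          .str.capitalize()                     # 'Blue White' -> 'Blue white'
)
print(cleaned.unique().tolist())
```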