# Stars

The file **stars.csv** contains information about 240 stars. Each star is described by the following columns:

- **Temperature (K)** — the temperature in Kelvin;
- **Luminosity (L/Lo)** — the luminosity of the star relative to the solar luminosity Lo = 3.828 × 10^26 W;
- **Radius (R/Ro)** — the radius of the star relative to the radius of the Sun Ro = 6.9551 × 10^8 m;
- **Absolute magnitude (Mv)** — the absolute magnitude of the star;
- **Star color** — the color of the star;
- **Star type** — the type of star, represented by a number from 0 to 5, where:
	- 0 — Red Dwarf,
	- 1 — Brown Dwarf,
	- 2 — White Dwarf,
	- 3 — Main Sequence,
	- 4 — Super Giants,
	- 5 — Hyper Giants;
- **Spectral Class** — the spectral class of the star (one of O, B, A, F, G, K, or M).
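Because luminosity and radius are stored relative to the Sun, absolute values can be recovered by multiplying by the two constants quoted above. The following is a minimal sketch of that arithmetic; the helper name `to_absolute_units` and the example star are hypothetical and not part of the task.

```python
# Solar reference values quoted in the column descriptions above.
SOLAR_LUMINOSITY_W = 3.828e26  # Lo, in watts
SOLAR_RADIUS_M = 6.9551e8      # Ro, in metres


def to_absolute_units(rel_luminosity, rel_radius):
    """Convert (L/Lo, R/Ro) to (watts, metres). Hypothetical helper."""
    return rel_luminosity * SOLAR_LUMINOSITY_W, rel_radius * SOLAR_RADIUS_M


# Made-up star: twice as luminous as the Sun and half its radius.
luminosity_w, radius_m = to_absolute_units(2.0, 0.5)
print(f"{luminosity_w:.3e} W")
print(f"{radius_m:.3e} m")
```

With pandas the same conversion is a single column-wise multiplication, for example `stars['Luminosity(L/Lo)'] * SOLAR_LUMINOSITY_W`, assuming the column is named as in the CSV header.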

The task is taken from [fadeevlecturer.github.io](https://fadeevlecturer.github.io/python_lectures/docs/index.html).

**`Task:`**

1. **Clean the color column:** standardize the values in this column so that variations like 'Blue white', 'Blue White', and 'Blue-white' are treated as the same;
2. **Star type names:** create a new column where the star type is represented as a full string instead of a number;
3. **Convert spectral class to numbers:** add a new column where the spectral class is represented by numbers, using the following mapping:
	- O → 0,
	- B → 1,
	- A → 2,
	- F → 3,
	- G → 4,
	- K → 5,
	- M → 6;
4. **Count the number of stars:** for each star color, star type, and spectral class, calculate the number of stars;
5. **Star type analysis:** for each star type, find the minimum, average, and maximum values of the absolute magnitude;
6. **Spectral class analysis:** for each spectral class, find the minimum, average, and maximum values of the temperature;
7. **Correlation analysis:** compute pairwise correlations between all numerical columns.
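As a rough orientation, steps 3 to 7 correspond to a handful of standard pandas operations. The sketch below runs them on a tiny made-up frame named `toy`; its values are illustrative and not taken from stars.csv, and the column names are assumed to match the CSV header. It also assumes pandas 1.5 or newer, where `DataFrame.corr` accepts `numeric_only`.

```python
import pandas as pd

# Tiny made-up frame; values are illustrative, not from stars.csv.
toy = pd.DataFrame({
    "Temperature (K)": [3000, 3500, 10000, 30000],
    "Absolute magnitude(Mv)": [16.0, 12.0, 1.0, -6.0],
    "Star type": [0, 0, 3, 4],
    "Spectral Class": ["M", "M", "A", "O"],
})

# Step 3: spectral class -> number, using the mapping O=0 ... M=6.
spectral_order = {letter: i for i, letter in enumerate("OBAFGKM")}
toy["Spectral class number"] = toy["Spectral Class"].map(spectral_order)

# Step 4: number of stars per category (shown here for one column).
counts = toy["Spectral Class"].value_counts()

# Steps 5 and 6: min / mean / max of one column within each group.
magnitude_by_type = (
    toy.groupby("Star type")["Absolute magnitude(Mv)"].agg(["min", "mean", "max"])
)
temperature_by_class = (
    toy.groupby("Spectral Class")["Temperature (K)"].agg(["min", "mean", "max"])
)

# Step 7: pairwise correlations between the numerical columns only.
correlations = toy.corr(numeric_only=True)

print(magnitude_by_type)
print(temperature_by_class)
```

Whether step 4 is meant per column or per combination of columns is not stated in the task; the sketch counts one column at a time.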


**`Cleaning of the color column:`**

In [26]:
```python
import pandas as pd

stars = pd.read_csv('stars.csv')
stars.sample(n=10)
```

```text
Unnamed: 0,Temperature (K),Luminosity(L/Lo),Radius(R/Ro),Absolute magnitude(Mv),Star type,Star color,Spectral Class
131,3607,0.00023,0.38,10.34,1,Red,M
169,9373,424520.0,24.0,-5.99,4,Blue,O
210,22350,12450.0,6.36,-3.67,3,Blue-white,B
125,3225,0.00076,0.121,19.63,0,Red,M
111,3605,126000.0,1124.0,-10.81,5,Red,M
179,24490,248490.0,1134.5,-8.24,5,Blue-white,B
137,3598,0.0011,0.56,14.26,1,Red,M
153,16390,1278.0,5.68,-3.32,3,Blue-white,B
132,3100,0.008,0.31,11.17,1,Red,M
72,3304,0.0085,0.18,13.2,1,Red,M
```


In [27]:
```python
stars['Star color'].value_counts()
```

```text
Star color
Red                   112
Blue                   55
Blue-white             26
Blue White             10
yellow-white            8
White                   7
Blue white              3
Yellowish White         3
white                   3
Whitish                 2
Orange                  2
yellowish               2
Pale yellow orange      1
White-Yellow            1
Blue                    1
Yellowish               1
Orange-Red              1
Blue white              1
Blue-White              1
Name: count, dtype: int64
```
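Some labels in this listing appear twice ('Blue', 'Blue white'), which suggests that they differ only by invisible characters such as a trailing space. One possible way to surface such variants is to group the raw labels under a normalized key and print them with `repr()`. This is a sketch: the helper `canonical_key` is hypothetical, and `raw_labels` is a hand-picked subset of the labels, with trailing spaces assumed for the duplicated ones.

```python
from collections import defaultdict

# Hand-picked subset of the raw labels; the trailing spaces are an
# assumption about why 'Blue' and 'Blue white' are listed twice.
raw_labels = [
    "Blue", "Blue ", "Blue-white", "Blue White",
    "Blue white", "Blue white ", "white", "White",
]


def canonical_key(label):
    """Key that ignores case, surrounding whitespace, and hyphen vs. space."""
    return " ".join(label.replace("-", " ").split()).lower()


variants = defaultdict(list)
for label in raw_labels:
    variants[canonical_key(label)].append(label)

for key, spellings in variants.items():
    # repr() makes trailing whitespace visible in the printed label.
    print(key, "<-", [repr(s) for s in spellings])
```

On the real column, `stars['Star color'].unique()` could be used in place of `raw_labels`.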

In [28]:
```python
# Map each spelling variant to a single canonical label (exact matches only).
replace_color_dict = {
    'Blue-white': 'Blue white',
    'Blue White': 'Blue white',
    'yellow-white': 'Yellowish white',
    'Yellowish White': 'Yellowish white',
    'white': 'White',
    'yellowish': 'Yellowish',
    'White-Yellow': 'White yellow',
    'Orange-Red': 'Orange red',
    'Blue-White': 'Blue white'
}
stars['Star color'] = stars['Star color'].replace(replace_color_dict)
stars['Star color'].unique()
```

```text
array(['Red', 'Blue white', 'White', 'Yellowish white',
       'Pale yellow orange', 'Blue', 'Whitish', 'Orange', 'White yellow',
       'Blue ', 'Yellowish', 'Orange red', 'Blue white '], dtype=object)
```
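The array above still lists 'Blue ' and 'Blue white ' as separate values: these labels carry a trailing space, which the exact-match dictionary does not cover. One possible fix is to strip whitespace before replacing. The sketch below shows the idea on a small stand-in Series named `colors`, not on the real column, with a shortened dictionary.

```python
import pandas as pd

# Small stand-in for stars['Star color'], using labels seen in the outputs.
colors = pd.Series(
    ["Blue", "Blue ", "Blue white", "Blue white ", "Blue-White", "yellow-white"]
)

# Shortened version of the replacement dictionary, for illustration only.
replace_color_dict = {
    "Blue-White": "Blue white",
    "yellow-white": "Yellowish white",
}

# Strip surrounding whitespace first, then apply the exact-match replacements.
cleaned = colors.str.strip().replace(replace_color_dict)
print(cleaned.unique())
```

Applied to the notebook, this would amount to `stars['Star color'].str.strip().replace(replace_color_dict)` with the full dictionary.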

**`Star type names:`**

In [29]:
```python
stars['Star type'].unique()
```

```text
array([0, 1, 2, 3, 4, 5])
```

In [30]:
```python
# Numeric star type code -> full name, as listed in the column description.
replace_type_dict = {
    0: 'Red Dwarf',
    1: 'Brown Dwarf',
    2: 'White Dwarf',
    3: 'Main Sequence',
    4: 'Super Giants',
    5: 'Hyper Giants'
}
stars['Star type names'] = stars['Star type'].replace(replace_type_dict)
stars['Star type names'].unique()
```

```text
array(['Red Dwarf', 'Brown Dwarf', 'White Dwarf', 'Main Sequence',
       'Super Giants', 'Hyper Giants'], dtype=object)
```

In [33]:
```python
stars.sample(n=10)
```

```text
Unnamed: 0,Temperature (K),Luminosity(L/Lo),Radius(R/Ro),Absolute magnitude(Mv),Star type,Star color,Spectral Class,Star type names
221,12749,332520.0,76.0,-7.02,4,Blue,O,Super Giants
30,39000,204000.0,10.6,-4.7,3,Blue,O,Main Sequence
0,3068,0.0024,0.17,16.12,0,Red,M,Red Dwarf
178,12100,120000.0,708.9,-7.84,5,Blue white,B,Hyper Giants
8,2650,0.00069,0.11,17.45,0,Red,M,Red Dwarf
237,8829,537493.0,1423.0,-10.73,5,White,A,Hyper Giants
212,13089,788.0,5.992,-0.12,3,Blue white,A,Main Sequence
225,18734,224780.0,46.0,-7.45,4,Blue,O,Super Giants
198,3324,0.0065,0.471,12.78,1,Red,M,Brown Dwarf
239,37882,294903.0,1783.0,-7.8,5,Blue,O,Hyper Giants
```
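`Series.replace` leaves any value that is missing from the dictionary unchanged, so an unexpected type code would pass through silently as a number. An alternative is `Series.map`, which turns unmapped codes into `NaN` and so makes them easy to detect. The sketch below uses a stand-in Series named `star_type` with a deliberately invalid code 7; it is not the notebook's data.

```python
import pandas as pd

replace_type_dict = {
    0: 'Red Dwarf',
    1: 'Brown Dwarf',
    2: 'White Dwarf',
    3: 'Main Sequence',
    4: 'Super Giants',
    5: 'Hyper Giants',
}

# Stand-in for stars['Star type']; 7 is a deliberately invalid code.
star_type = pd.Series([0, 3, 5, 7])

# map() returns NaN for codes that are missing from the dictionary.
names = star_type.map(replace_type_dict)

# Collect the codes that could not be translated.
unmapped = star_type[names.isna()].unique()
print(names.tolist())
print(unmapped)
```

For the column in this notebook, where `unique()` shows only the codes 0 to 5, both approaches give the same names.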
