## Astronomical Data
There are billions of stars in our galaxy alone. Especially over the last few decades, a lot of data about their properties has been collected with telescopes such as *Hubble* or the *James Webb Space Telescope*. Catalogs like *Hipparcos* or *Gaia* compile large amounts of this data. Analysing millions or billions of records can be challenging, but libraries like Polars make the process quite efficient.

### Import Data
The file *hygdata_v41.csv* contains data about almost 120,000 stars from three well-known catalogs (Hipparcos, Yale Bright Star, Gliese), hence the name HYG.

Import the data into a Polars DataFrame and look up the meaning of the different columns (see https://github.com/astronexus/HYG-Database/tree/main/hyg).

Find data for some well-known stars, e.g. the distance to Sirius or the radial velocity of Vega.

In [None]:
import polars as pl

path = 'data/hygdata_v41.csv'

df = pl.read_csv(path)

### Hertzsprung-Russell Diagram
Stars can be categorized based on their surface temperature (related to the colour of a star) and their luminosity (related to the brightness).

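In a *Hertzsprung-Russell diagram*, the luminosity (usually on a logarithmic scale, in units of *magnitudes*) is plotted against the surface temperature or the color index. By convention, the temperature decreases from left to right, which corresponds to an increasing color index. Because brighter stars have smaller magnitudes, the magnitude axis is usually inverted so that bright stars appear at the top.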

Using the columns 'ci' (color index) and 'absmag' (absolute magnitude), create a Hertzsprung-Russell diagram for all stars in the dataframe. Mark the position of our Sun (Sol) in the diagram.

In order to convert the color index to the perceived colour, the steps described in https://stackoverflow.com/questions/21977786/star-b-v-color-index-to-apparent-rgb-color can be followed. Create a version of the Hertzsprung-Russell diagram with the temperature as the horizontal axis and with data points plotted with the colour of the corresponding star.
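
One commonly used approximation for the first step, from color index to temperature, is Ballesteros' formula, which treats the star as a black body. It is given here as an assumption; the linked discussion may describe a different fit:

$$
T \approx 4600\,\mathrm{K}\left(\frac{1}{0.92\,(B-V) + 1.7} + \frac{1}{0.92\,(B-V) + 0.62}\right)
$$

For the Sun ($B-V \approx 0.656$) this gives roughly 5760 K, close to its effective temperature of about 5770 K.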

The final version should look similar to this: 
![hr diagram](data/hr_diagram.png)

In [None]:
import numpy as np


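def bv2T(bv):
    """
    Convert B-V color index to temperature in Kelvin.

    Parameters
    ----------
    bv (float): B-V color index.

    Returns
    -------
    float: Temperature in Kelvin.

    """
    # Assumption: Ballesteros' black-body approximation is used here,
    # because the original cell left the body empty.
    # Works for scalars as well as NumPy arrays.
    bv = np.asarray(bv, dtype=float)
    return 4600.0 * (1.0 / (0.92 * bv + 1.7) + 1.0 / (0.92 * bv + 0.62))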

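def T2rgb(T):
    """Map temperatures in Kelvin to a list of RGBA tuples."""
    # Lookup table with the columns 'T', 'r', 'g' and 'b'
    convert = pl.read_csv('data/kelvin2rgb_10deg.csv')
    alpha = 0.5  # opacity applied to every color

    # np.interp expects the temperatures in ascending order
    temp = np.array(convert.get_column('T'))
    red = np.array(convert.get_column('r'))
    green = np.array(convert.get_column('g'))
    blue = np.array(convert.get_column('b'))

    # atleast_1d lets a single temperature be passed as well
    T = np.atleast_1d(T)
    r = np.interp(T, temp, red)
    g = np.interp(T, temp, green)
    b = np.interp(T, temp, blue)

    # Distinct loop names avoid shadowing the array b
    colors = [(ri, gi, bi, alpha) for ri, gi, bi in zip(r, g, b)]

    return colors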