# Star Charts Tutorial

## Overview
Astronomers have been cataloguing our celestial neighbours since the advent of recorded history. This tutorial uses a freely available star database to produce a 3D view of our celestial neighbourhood, along with other charts that illustrate facts about nearby stars.

## Learning Outcomes

1. Acquisition of data from file systems
2. Translation of data for use in various charts
3. Plotting a 3D scatter chart

## The Database

Our data will be from the HYG database made available by David Nash at the [Astronomy Nexus](http://astronexus.com/hyg).  Our edition of the database was acquired on June 11, 2021.

The HYG database is a CSV file that combines star information from the Hipparcos, Yale Bright Star, and Gliese catalogs. In total, it lists approximately 120,000 stars!

A zipped copy of the database is included with this Notebook as `hygdata_v3.csv.zip`. Let's create a working folder and extract the data:

In [20]:
import os
from os import path
import tempfile
import zipfile

# Create a temporary working directory (mkdtemp does not delete it; cleanup is left to us or the OS)
working_dir = tempfile.mkdtemp()

# Locate the folder that holds the zipped database, relative to the current directory
data_dir = path.join(os.getcwd(), "other", "star_charts")

# Unzip our database into the temporary working directory
data_archive_file = path.join(data_dir, "hygdata_v3.csv.zip")
with zipfile.ZipFile(data_archive_file, "r") as archive:
    archive.extractall(working_dir)
    
# Provide a reference for our CSV file
data_file = path.join(working_dir, "hygdata_v3.csv")
assert path.exists(data_file) and path.isfile(data_file)
print(f"The HYG Database is located at: {data_file}")

The HYG Database is located at: /tmp/tmpgtipc1zq/hygdata_v3.csv
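A side note on cleanup: `tempfile.mkdtemp()` creates the folder but never deletes it, so the extracted CSV stays on disk until something else removes it. If automatic cleanup matters, one option is `tempfile.TemporaryDirectory`. The sketch below is an illustration rather than part of this tutorial's pipeline, and `extract_archive` is a hypothetical helper name.

```python
import tempfile
import zipfile
from pathlib import Path


def extract_archive(archive_path, member_name):
    """Extract one member of a ZIP archive into a self-cleaning temp folder.

    Returns the TemporaryDirectory handle and the path of the extracted file.
    The folder and its contents are deleted when handle.cleanup() is called.
    """
    temp_dir = tempfile.TemporaryDirectory()
    with zipfile.ZipFile(archive_path, "r") as archive:
        archive.extract(member_name, temp_dir.name)
    return temp_dir, Path(temp_dir.name) / member_name
```

Keep the returned handle alive for as long as the extracted file is needed, then call `handle.cleanup()` once the data has been loaded into memory.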


Great! Let's take a quick look at the file:

In [35]:
import pandas

# Load the CSV, keeping blank fields as empty strings instead of NaN, then print a sample
hyg_data = pandas.read_csv(data_file, keep_default_na=False, low_memory=False)

print("Found columns:")
print(list(hyg_data.columns))
print("---")
print(f"Found rows: {hyg_data.shape[0]}")
print("---")
print("Sample:")
hyg_data.head()

Found columns:
['id', 'hip', 'hd', 'hr', 'gl', 'bf', 'proper', 'ra', 'dec', 'dist', 'pmra', 'pmdec', 'rv', 'mag', 'absmag', 'spect', 'ci', 'x', 'y', 'z', 'vx', 'vy', 'vz', 'rarad', 'decrad', 'pmrarad', 'pmdecrad', 'bayer', 'flam', 'con', 'comp', 'comp_primary', 'base', 'lum', 'var', 'var_min', 'var_max']
---
Found rows: 119614
---
Sample:


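| | id | hip | hd | hr | gl | bf | proper | ra | dec | dist | ... | bayer | flam | con | comp | comp_primary | base | lum | var | var_min | var_max |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | | | | | | Sol | 0.0 | 0.0 | 0.0 | ... | | | | 1 | 0 | | 1.0 | | | |
| 1 | 1 | 1.0 | 224700.0 | | | | | 6e-05 | 1.089009 | 219.7802 | ... | | | Psc | 1 | 1 | | 9.63829 | | | |
| 2 | 2 | 2.0 | 224690.0 | | | | | 0.000283 | -19.49884 | 47.9616 | ... | | | Cet | 1 | 2 | | 0.392283 | | | |
| 3 | 3 | 3.0 | 224699.0 | | | | | 0.000335 | 38.859279 | 442.4779 | ... | | | And | 1 | 3 | | 386.901132 | | | |
| 4 | 4 | 4.0 | 224707.0 | | | | | 0.000569 | -51.893546 | 134.2282 | ... | | | Phe | 1 | 4 | | 9.366989 | | | |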
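Because the file is loaded with `keep_default_na=False`, blank fields arrive as empty strings rather than NaN. The sketch below shows what that means in practice, using a small stand-in frame built from three columns of the sample rows above instead of the full `hyg_data`. Whether a given column in the real file arrives as text depends on its contents, so treat the conversion step as an assumption to verify.

```python
import pandas

# Stand-in for hyg_data: three columns from the five sample rows, blanks kept as ""
sample = pandas.DataFrame(
    {
        "hip": ["", "1.0", "2.0", "3.0", "4.0"],
        "proper": ["Sol", "", "", "", ""],
        "con": ["", "Psc", "Cet", "And", "Phe"],
    }
)

# Blank fields are empty strings, so isna() finds nothing to report
blank_by_isna = int(sample["proper"].isna().sum())

# Compare against "" to count blank fields instead
blank_by_compare = int((sample["proper"] == "").sum())

# Rows that carry a proper name
named = sample[sample["proper"] != ""]

# Convert a text column to numbers; blanks become NaN
hip_numbers = pandas.to_numeric(sample["hip"], errors="coerce")
```

The same comparison against `""` applies to any other sparsely populated column, such as `con` or `spect`.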


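Per David Nash, the database is organized with the following fields. The names here are descriptive; the CSV itself uses the abbreviated column names printed above (for example, `dist` for Distance and `proper` for ProperName):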

1. **[Star]ID**: The database primary key from a larger "master database" of stars.
1. **HD**: The star's ID in the Henry Draper catalog, if known.
1. **HR**: The star's ID in the Harvard Revised catalog, which is the same as its number in the Yale Bright Star Catalog.
1. **Gliese**: The star's ID in the third edition of the Gliese Catalog of Nearby Stars.
1. **BayerFlamsteed**: The Bayer / Flamsteed designation, from the Fifth Edition of the Yale Bright Star Catalog. This is a combination of the two designations. The Flamsteed number, if present, is given first; then a three-letter abbreviation for the Bayer Greek letter; the Bayer superscript number, if present; and finally, the three-letter constellation abbreviation. Thus Alpha Andromedae has the field value "21Alp And", and Kappa1 Sculptoris (no Flamsteed number) has "Kap1Scl".
1. **RA, Dec**: The star's right ascension and declination, for epoch 2000.0. Stars present only in the Gliese Catalog, which uses 1950.0 coordinates, have had these coordinates precessed to 2000.
1. **ProperName**: A common name for the star, such as "Barnard's Star" or "Sirius". I have taken these names primarily from the Hipparcos project's web site, which lists representative names for the 150 brightest stars and many of the 150 closest stars. I have added a few names to this list. Most of the additions are designations from catalogs mostly now forgotten (e.g., Lalande, Groombridge, and Gould ["G."]) except for certain nearby stars which are still best known by these designations.
1. **Distance**: The star's distance in parsecs, the most common unit in astrometry. To convert parsecs to light years, multiply by 3.262. A value of 10000000 indicates missing or dubious (e.g., negative) parallax data in Hipparcos.
1. **Mag**: The star's apparent visual magnitude.
1. **AbsMag**: The star's absolute visual magnitude (its apparent magnitude from a distance of 10 parsecs).
1. **Spectrum**: The star's spectral type, if known.
1. **ColorIndex**: The star's color index (blue magnitude - visual magnitude), where known.
1. **X,Y,Z**: The Cartesian coordinates of the star, in a system based on the equatorial coordinates as seen from Earth. +X is in the direction of the vernal equinox (at epoch 2000), +Z towards the north celestial pole, and +Y in the direction of R.A. 6 hours, declination 0 degrees.
1. **VX,VY,VZ**: The Cartesian velocity components of the star, in the same coordinate system described immediately above. They are determined from the proper motion and the radial velocity (when known). The velocity unit is parsecs per year; these are small values (around $10^{-5}$ to $10^{-6}$), but they enormously simplify calculations using parsecs as base units for celestial mapping.
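Several of the field descriptions above imply simple formulas. The sketch below spells them out as an illustration, not as the code used to build the database. It assumes right ascension is stored in decimal hours and declination in degrees (consistent with "+Y in the direction of R.A. 6 hours"), uses approximate values for the kilometres in a parsec and the seconds in a year, and relies on the standard distance-modulus relation for absolute magnitude. All function and constant names are hypothetical.

```python
import math

PARSEC_IN_LIGHT_YEARS = 3.262       # conversion factor quoted in the field list
KM_PER_PARSEC = 3.0857e13           # approximate value (assumption)
SECONDS_PER_YEAR = 365.25 * 86400   # Julian year (assumption)


def parsecs_to_light_years(dist_pc):
    """Convert a distance in parsecs to light years."""
    return dist_pc * PARSEC_IN_LIGHT_YEARS


def equatorial_to_cartesian(ra_hours, dec_degrees, dist_pc):
    """Return (x, y, z) in parsecs.

    +X points to the vernal equinox, +Y to RA 6h / Dec 0, +Z to the north
    celestial pole, matching the X,Y,Z field description.
    """
    ra = math.radians(ra_hours * 15.0)  # 24 hours of RA span 360 degrees
    dec = math.radians(dec_degrees)
    x = dist_pc * math.cos(dec) * math.cos(ra)
    y = dist_pc * math.cos(dec) * math.sin(ra)
    z = dist_pc * math.sin(dec)
    return x, y, z


def absolute_magnitude(apparent_mag, dist_pc):
    """Apparent magnitude as seen from 10 parsecs; dist_pc must be positive."""
    return apparent_mag - 5.0 * math.log10(dist_pc / 10.0)


def km_per_s_to_pc_per_year(speed_km_s):
    """Convert a speed in km/s to parsecs per year."""
    return speed_km_s * SECONDS_PER_YEAR / KM_PER_PARSEC
```

Under these assumptions, 1 km/s is roughly $10^{-6}$ parsecs per year, which is why the velocity components are such small numbers.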

David also notes potential quality issues with the data:

* The spectral types, in general, come from the Hipparcos catalog. A few stars, those found only in Gliese, have a spectral type from that catalog. The spectral types from Hipparcos have not been closely vetted and I have already found some probable errors. For example, the spectral type of 36 Ophiuchi B (a double star that was merged in Hipparcos) is given as K2 III (giant), when its luminosity clearly indicates K2 V (main sequence). Also, the star HIP 84720 (Gliese 666 A) is listed as M0 V, whereas its luminosity and color index are more consistent with a late G-type star (about G8 V). M0 V appears to be the spectral type of Gliese 666 B, a companion to this star. Use the spectral types with caution.
* There may be errors in the Henry Draper numbers in one or more catalogs, leading to false cross-references.
* There may be errors in the matching of Gliese stars to Hipparcos stars by position and magnitude. In general, this is likely to be an issue only for multiple stars with highly uncertain magnitudes in both catalogs, as the position constraints were fairly severe (stars had to have positions matching to +/- 0.15 degrees, less than the radius of the full Moon). I have not seen any apparent errors on scanning the database thus far, but this is one area that could be a problem.
* Radial velocity information can be quite uncertain. Uncertainties of a few km/second are not unusual. There are 3 primary sources: the values in the Gliese catalog, the values in the Yale catalog, and the Wilson Evans Batten catalog, in that order. I do not yet have a detailed breakdown of the uncertainties in these sources.
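To make the position constraint in the cross-matching note concrete, here is a minimal sketch of an angular-separation test with a 0.15 degree tolerance. It is not David's actual matching procedure, which also compared magnitudes. It assumes right ascension in decimal hours and declination in degrees, and both function names are hypothetical.

```python
import math


def angular_separation_deg(ra1_hours, dec1_deg, ra2_hours, dec2_deg):
    """Great-circle separation in degrees (haversine form, stable for small angles)."""
    ra1 = math.radians(ra1_hours * 15.0)
    ra2 = math.radians(ra2_hours * 15.0)
    dec1 = math.radians(dec1_deg)
    dec2 = math.radians(dec2_deg)
    h = (
        math.sin((dec2 - dec1) / 2.0) ** 2
        + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2.0) ** 2
    )
    # Clamp to 1.0 to guard against rounding slightly above the valid range
    return math.degrees(2.0 * math.asin(min(1.0, math.sqrt(h))))


def positions_match(ra1_hours, dec1_deg, ra2_hours, dec2_deg, tolerance_deg=0.15):
    """True when two catalog positions lie within tolerance_deg of each other."""
    separation = angular_separation_deg(ra1_hours, dec1_deg, ra2_hours, dec2_deg)
    return separation <= tolerance_deg
```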

Apart from distance values that make plotting difficult, this tutorial will generally trust the data.
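One way to handle the distance issue is to keep only stars inside a chosen radius, which also drops the very large placeholder distance used for missing parallax data. The sketch below assumes `dist` is already numeric and uses a stand-in frame: the first five rows come from the sample above, and the last row is a hypothetical star carrying the placeholder value. The 250 parsec cutoff and the name `stars_within` are arbitrary choices for illustration.

```python
import pandas


def stars_within(stars, max_distance_pc):
    """Return the rows of `stars` whose `dist` is at most max_distance_pc parsecs."""
    return stars[stars["dist"] <= max_distance_pc]


# Stand-in for hyg_data; the row with id -1 is hypothetical
sample = pandas.DataFrame(
    {
        "id": [0, 1, 2, 3, 4, -1],
        "dist": [0.0, 219.7802, 47.9616, 442.4779, 134.2282, 10000000.0],
    }
)

# Keep Sol and every sample star within 250 parsecs
plot_ready = stars_within(sample, 250.0)
```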

## 3D Chart