# High Quality Gaia Data

In [None]:
import numpy as np
np.set_printoptions(legacy='1.25')

from astropy.table import QTable
from astroquery.gaia import Gaia

### Let's get the data for the 50 brightest stars in the Gaia database.

In [None]:
my_query = """
SELECT TOP 50
source_id, parallax, phot_g_mean_mag
FROM gaiadr3.gaia_source_lite
WHERE phot_g_mean_mag < 3
AND parallax > 0.1
ORDER BY phot_g_mean_mag
"""

In [None]:
print(my_query)

In [None]:
my_job_query = Gaia.launch_job(my_query)

In [None]:
print(my_job_query)

In [None]:
target_table = my_job_query.get_results()

In [None]:
target_table[0:5]

### Let's find the distance and absolute magnitudes for the objects
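
For reference, these are the two relations that the functions in the next cell implement, assuming the Gaia `parallax` column is in milliarcseconds (mas), distances are in parsecs (pc), and interstellar extinction is ignored:

$$
d = \frac{1000}{p_{\mathrm{mas}}}\ \mathrm{pc}, \qquad M = m - 5 \log_{10}\left(\frac{d}{10\ \mathrm{pc}}\right)
$$

As a sanity check, a parallax of 100 mas gives $d$ = 10 pc, where the logarithm vanishes and $M = m$.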

In [None]:
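def find_distance(my_parallax):
    # Parallax in milliarcseconds (mas) -> distance in parsecs (pc)
    result = 1 / (my_parallax / 1000)
    return result

def find_absmag(my_appmag, my_distance):
    # Distance modulus: M = m - 5 * log10(d / 10 pc)
    result = my_appmag - (5 * np.log10(my_distance / 10))
    return result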

In [None]:
target_table['Distance'] = find_distance(
                               my_parallax = target_table['parallax']
                           )

In [None]:
target_table['AbsMag'] = find_absmag(
                               my_appmag = target_table['phot_g_mean_mag'],
                               my_distance = target_table['Distance']
                         )

In [None]:
target_table[0:5]

### Brightest Object

In [None]:
target_table.sort('AbsMag')

In [None]:
target_table[0]

### This is a crazy value for the Absolute Magnitude! 

### The most luminous stars in the universe have an absolute magnitude of around -10.

---

## What is wrong with the data?

<img src="https://uwashington-astro300.github.io/A300_images/data_error.png" width="600"/>

We have only used a small subset of the data available in the `gaia_source_lite` database. 

Full column list: [gaia_source_lite data columns](https://gaia.aip.de/metadata/gaiadr3/gaia_source_lite/)

You can see that there are a number of columns with names like `VALUE_error` and `VALUE_over_error`. These data will allow us to 
evaluate the quality of the measured data.

## Quality `parallax` data. 

The uncertainty of parallax measurements can come from many sources. Dim stars are harder to measure than bright stars. It is difficult to measure positions in crowded fields, like near the galactic plane. There are also systematic errors associated with the equipment and the data reduction pipeline. All of these contribute to the `standard error` of the `parallax` measurement.

The `parallax` error data are in the columns:

```
parallax_error          Standard error of parallax
parallax_over_error     Parallax divided by its standard error
```
The `parallax_over_error` column is particularly useful to get high quality parallax data. 

- It is very common to judge the quality of data by comparing the data (**signal**) to the error in the data (**noise**). 
- The ratio of the data / error is often called the signal-to-noise ratio (**SNR**). 
- The SNR of a measurement is often written with a lowercase sigma (σ).

The `parallax_over_error` column is the SNR for the parallax data.

What is considered a good SNR really depends on the particular situation. 

- As a very general rule of thumb, an SNR > 10 (10σ) is considered high quality.
- A SNR of 3 (3σ) is considered "barely detected".

In stellar spectroscopy

- 20σ is considered barely adequate
- High quality data is > 100σ
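
The general rules of thumb above (3σ and 10σ, not the spectroscopy thresholds) can be sketched as a small helper. The function names `snr` and `snr_quality`, the labels they return, and the example measurements are all made up for illustration; none of them come from Gaia or astroquery.

```python
def snr(value, error):
    """Signal-to-noise ratio: the measured value divided by its standard error."""
    return value / error


def snr_quality(ratio):
    """Label an SNR using the general rules of thumb (illustrative labels only)."""
    if ratio > 10:
        return "high quality"
    if ratio >= 3:
        return "detected"
    return "below detection threshold"


# Hypothetical (parallax, parallax_error) pairs in mas, not real Gaia values
examples = [(25.0, 0.5), (2.0, 0.5), (0.5, 0.5)]

for parallax, parallax_error in examples:
    ratio = snr(parallax, parallax_error)
    print(f"SNR = {ratio:.1f}: {snr_quality(ratio)}")
```

In a query, the same kind of cut could be applied on the server by adding a condition on the `parallax_over_error` column to the `WHERE` clause.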

---
## Back to our Strange Object

Our object with `source_id` = 2202630001603369856 has an absolute magnitude of -12.17.

What is going on?

In [None]:
query_strange = """
SELECT TOP 1
source_id, parallax, parallax_error, parallax_over_error
FROM gaiadr3.gaia_source_lite
WHERE source_id = 2202630001603369856
"""

In [None]:
my_job_query = Gaia.launch_job(query_strange)

In [None]:
my_job_query.get_results()

#### That value of `parallax_over_error` is not good ($\sigma$ = 0.45)!

#### The error in the parallax is about 2x the parallax value. 
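
To see why such a low SNR makes the derived distance meaningless, here is a minimal sketch. The parallax of 1 mas is a hypothetical value chosen for illustration (the measured value is in the query result above); only the ratio of 0.45 comes from the data. The helper `distance_range` is not part of any library.

```python
def distance_range(parallax_mas, snr):
    """Return (nearest, farthest) distance in pc for the 1-sigma parallax interval.

    `farthest` is None when the interval reaches zero or negative parallax,
    because the distance is then unbounded.
    """
    error_mas = parallax_mas / snr          # standard error implied by the SNR
    high = parallax_mas + error_mas         # largest parallax -> nearest distance
    low = parallax_mas - error_mas          # smallest parallax -> farthest distance
    nearest = 1000.0 / high
    farthest = 1000.0 / low if low > 0 else None
    return nearest, farthest


# Hypothetical parallax of 1 mas at two different SNR values
print(distance_range(1.0, 10.0))   # well measured: a narrow distance range
print(distance_range(1.0, 0.45))   # SNR from the text: no upper limit on distance
```

With an SNR below 1, the 1σ interval includes zero parallax, so the data are consistent with the object being arbitrarily far away, and any absolute magnitude computed from it is unreliable.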

---

# Galactic Coordinates

Galactic coordinates are a coordinate system based on the plane of the galaxy. The system is centered on the Sun, and longitude and latitude 0 point directly towards the center of the galaxy. Galactic longitude (**l**) is measured in the galactic plane, with the primary direction running from the Sun to the center of the galaxy, while galactic latitude (**b**) measures the angle of the object above the galactic plane.

- Galactic North Pole: b = 90°, l = n/a
- Galactic South Pole: b = -90°, l = n/a
- Galactic center: b = 0°, l = 0°
- Galactic anti-center: b = 0°, l = 180°
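
As a quick check on these reference points, here is a minimal sketch of the angular separation between two directions given as (l, b) in degrees. The function `angular_separation` is written here for illustration and is not a Gaia or astroquery call; it uses the standard great-circle formula in its `atan2` form, which stays accurate for very small and very large angles.

```python
import math


def angular_separation(l1, b1, l2, b2):
    """Great-circle angle in degrees between (l1, b1) and (l2, b2), all in degrees."""
    l1, b1, l2, b2 = map(math.radians, (l1, b1, l2, b2))
    dl = l2 - l1
    numerator = math.hypot(
        math.cos(b2) * math.sin(dl),
        math.cos(b1) * math.sin(b2) - math.sin(b1) * math.cos(b2) * math.cos(dl),
    )
    denominator = (math.sin(b1) * math.sin(b2)
                   + math.cos(b1) * math.cos(b2) * math.cos(dl))
    return math.degrees(math.atan2(numerator, denominator))


# Galactic center to anti-center, and North Galactic Pole to galactic center
print(angular_separation(0.0, 0.0, 180.0, 0.0))
print(angular_separation(0.0, 90.0, 0.0, 0.0))
```

The center and anti-center come out half a circle apart, and the pole a quarter circle from the center, as the list above implies.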

<p>
<img src="https://uwashington-astro300.github.io/A300_images/GalLongLat.jpg" width = "500">
</p>

---

### What type of object are we looking at that would lead to such an error?

We can use the information columns in the Gaia database to see if the object is a quasar, galaxy, or a multiple star.

```
in_qso_candidates    Flag indicating the availability of additional information in the QsoCandidates table
in_galaxy_candidates Flag indicating the availability of additional information in the GalaxyCandidates table
non_single_star      Flag indicating the availability of additional information in the various Non-Single Star tables

b                    Galactic latitude
l                    Galactic longitude
```

In [None]:
query_strange_two = """
SELECT TOP 1
source_id, in_qso_candidates, in_galaxy_candidates, non_single_star, b, l
FROM gaiadr3.gaia_source_lite
WHERE source_id = 2202630001603369856
"""

In [None]:
my_job_query = Gaia.launch_job(query_strange_two)

In [None]:
my_job_query.get_results()

### It seems that it is not a QSO or a galaxy, but a single star.

### However, it is a single star very close to the galactic plane (b = 4.3°).

----

# SIMBAD - Name resolver

The purpose of Simbad is to provide information on astronomical objects of interest which have been studied in scientific articles. It provides the bibliography, as well as available basic information such as the nature of the object. 

One of Simbad's most useful features is its ability to resolve the multitude of names given to objects in the literature.

In [None]:
from astroquery.simbad import Simbad

In [None]:
Simbad.query_objectids("Gaia DR3 2202630001603369856")

In [None]:
Simbad.query_objectids("Gaia DR3 2202630001603369856").show_in_notebook()

### This is a very well studied bright star! 

<img src="https://uwashington-astro300.github.io/A300_images/MuCeph.png" width="600"/>


[Mu Cephei](https://en.wikipedia.org/wiki/Mu_Cephei) is visually nearly 100,000 times brighter than the Sun, with an absolute visual magnitude of −7.6. It is also one of the largest known stars, with a radius around or over 1,000 times that of the Sun.

It is also located in a region rich in gas and dust.

This is exactly the type of object Gaia is NOT designed to study.

---
## Note on SELECT-ORDER Bias issues

- Let's get some data for 300(ish) objects in the direction of the galactic anti-center (**l** = 180°), but far above the galactic plane (**b** = +75°).



In [None]:
my_query = """
SELECT TOP 300
source_id, b, l, phot_g_mean_mag 
FROM gaiadr3.gaia_source_lite
WHERE DISTANCE( POINT(180.0, 75.0), POINT(l, b) ) < 1.5
AND phot_g_mean_mag  < 15
ORDER BY phot_g_mean_mag
"""

In [None]:
print(my_query)

In [None]:
my_job_query = Gaia.launch_job(my_query)

In [None]:
print(my_job_query)

In [None]:
target_table = my_job_query.get_results()

In [None]:
target_table[0:5]

In [None]:
target_table['phot_g_mean_mag'].info('stats')

This query matches over 1,000 objects. However, since we set `SELECT TOP` to 300 and `ORDER BY` to magnitude, we only got the 300 brightest objects in the field. The stats for this sample are not representative of the whole field.
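
A small simulation with synthetic magnitudes (not Gaia data) illustrates the effect. The triangular magnitude distribution is an arbitrary assumption, chosen only so that faint objects outnumber bright ones.

```python
import random
import statistics

random.seed(42)  # fixed seed so the sketch is reproducible

# 1,000 synthetic magnitudes between 8 and 15, skewed toward the faint end
field = [random.triangular(8.0, 15.0, 15.0) for _ in range(1000)]

# Equivalent of ORDER BY magnitude with TOP 300: keep only the 300 brightest
top_300 = sorted(field)[:300]

print(f"whole field   : mean = {statistics.mean(field):.2f}")
print(f"brightest 300 : mean = {statistics.mean(top_300):.2f}")
```

Because smaller magnitudes mean brighter objects, the truncated sample always has a lower (brighter) mean than the full field.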

### A more representative sample

One way to get a more representative sample is to decrease the radius of the search until the total number of objects found is less than the value of `SELECT TOP`.
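
One way to find that total without trial and error is to ask the archive for a count rather than the rows themselves. This is a hedged sketch: the helper `build_count_query` and the alias `n_sources` are made up for illustration, and it assumes the archive accepts the standard ADQL `COUNT(*)` aggregate on this table.

```python
def build_count_query(radius_deg, l_center=180.0, b_center=75.0, mag_limit=15):
    """Build an ADQL query that counts the sources in the field (hypothetical helper)."""
    return f"""
SELECT COUNT(*) AS n_sources
FROM gaiadr3.gaia_source_lite
WHERE DISTANCE( POINT({l_center}, {b_center}), POINT(l, b) ) < {radius_deg}
AND phot_g_mean_mag < {mag_limit}
"""


count_query = build_count_query(1.5)
print(count_query)

# To run it, reuse the same pattern as the other cells:
# Gaia.launch_job(count_query).get_results()
```

A count has no `TOP` or `ORDER BY`, so the number it returns is not affected by the truncation described above.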

In [None]:
my_query = """
SELECT TOP 300
source_id, b, l, phot_g_mean_mag 
FROM gaiadr3.gaia_source_lite
WHERE DISTANCE( POINT(180.0, 75.0), POINT(l, b) ) < 0.81
AND phot_g_mean_mag  < 15
ORDER BY phot_g_mean_mag
"""

In [None]:
my_job_query = Gaia.launch_job(my_query)

In [None]:
print(my_job_query)

In [None]:
target_table = my_job_query.get_results()

In [None]:
target_table['phot_g_mean_mag'].info('stats')

---
Another way is to keep the larger search radius, but increase the `SELECT TOP` value until you get all of the objects in the field.

In [None]:
my_query = """
SELECT TOP 2000
source_id, b, l, phot_g_mean_mag 
FROM gaiadr3.gaia_source_lite
WHERE DISTANCE( POINT(180.0, 75.0), POINT(l, b) ) < 1.5
AND phot_g_mean_mag  < 15
ORDER BY phot_g_mean_mag
"""

In [None]:
my_job_query = Gaia.launch_job(my_query)

In [None]:
print(my_job_query)

In [None]:
target_table = my_job_query.get_results()

In [None]:
target_table['phot_g_mean_mag'].info('stats')

### Notice that the stats for the two different techniques are pretty close to each other