# Gaia Data

### On June 13, 2022, the [Gaia project](https://www.cosmos.esa.int/web/gaia/dr3) released its third major data release, containing about 1.5 billion sources.

- For Astro 300, we will use a subset of the main data source. 

- This subset is still really large (1906.8 GB), so we will use Python to access the data efficiently.

- #### The Gaia database we will use is called `gaiadr3.gaia_source_lite`

In [None]:
import numpy as np
from astropy.table import QTable
from astroquery.gaia import Gaia

---
# SQL/ADQL Database query language
 
SQL (Structured Query Language) is a language designed for managing data held in a relational database management system. SQL has become the most widely used database language.

Astronomical Data Query Language (ADQL) is a specialised variant of SQL developed for use with the proliferation of astronomical datasets, and extends the functionality of SQL in an astronomical context.

[The Gaia ADQL cookbook](https://www.gaia.ac.uk/data/gaia-data-release-1/adql-cookbook) is a great resource for learning the ADQL syntax.


## ADQL Query

A typical ADQL query has the form:

```
SELECT 
{columns}
FROM {database}
WHERE {conditions}
```

ADQL keywords are usually written in ALL CAPS, and everything else (column and database names) is written in lowercase.

Here is a real example of an ADQL query that gets the columns `source_id`, `ra`, `dec`, and `parallax` from the `gaiadr3.gaia_source_lite` database for objects where the value of the `parallax` column is greater than 200 mas. The rows will be ordered by decreasing values of `parallax`, and `TOP 10` keeps only the first 10 of them:

```
SELECT TOP 10
source_id, ra, dec, parallax
FROM gaiadr3.gaia_source_lite
WHERE parallax > 200.0
ORDER BY parallax DESC
```

#### It is really good to add `TOP 10` to the `SELECT` when you first do a query, so you do not drop millions of lines into your notebook!
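
The `SELECT ... FROM ... WHERE` template can also be filled in from Python. The following is a minimal sketch, assuming a hypothetical helper called `build_adql` (it is not part of `astroquery`); it only assembles a string and does not contact the Gaia archive.

```python
def build_adql(columns, table, conditions=(), order_by=None, top=None):
    """Assemble a simple ADQL query string (hypothetical helper, for illustration only)."""
    # Add TOP n only when a row limit is requested
    select = "SELECT" if top is None else f"SELECT TOP {int(top)}"
    lines = [select, ", ".join(columns), f"FROM {table}"]
    # The first condition follows WHERE; any others are chained with AND
    if conditions:
        lines.append("WHERE " + "\nAND ".join(conditions))
    if order_by is not None:
        lines.append(f"ORDER BY {order_by}")
    return "\n".join(lines)


example_query = build_adql(
    columns=["source_id", "ra", "dec", "parallax"],
    table="gaiadr3.gaia_source_lite",
    conditions=["parallax > 200.0"],
    order_by="parallax DESC",
    top=10,
)
print(example_query)
```

The printed string has the same layout as the hand-written query, so it could be submitted to the archive in the same way as any other query string.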

---
## Let's get some data

- First we create the query as a triple-quoted (multi-line) string

In [None]:
query_one = """
SELECT TOP 10
source_id, ra, dec, parallax
FROM gaiadr3.gaia_source_lite
WHERE parallax > 200
ORDER BY parallax DESC
"""

In [None]:
print(query_one)

## Submit our query to the Gaia archive server

In [None]:
my_job_query = Gaia.launch_job(query_one)

### Check the status of the job

In [None]:
print(my_job_query)

### Looks good so get the results

- The results will be a nice astropy `Table`

In [None]:
my_parallax_table = my_job_query.get_results()

In [None]:
my_parallax_table

---

# Getting data from a specific piece of the sky

### A very common search is to find objects within a certain angular distance from a point on the sky

<img src="https://uwashington-astro300.github.io/A300_images/Orion_Circle.png" width="400"/>


- The command `POINT(RA(deg), DEC(deg))` specifies a point on the celestial sphere.

- The command `DISTANCE(point1, point2)` computes the spherical angular distance between two points.
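
To get a feel for what a spherical angular distance means, here is a rough sketch in plain Python using the haversine formula. This is one standard way to compute a great-circle separation; it is an assumption that it matches what the archive does internally, and the function name `angular_distance_deg` is made up for this example.

```python
import math


def angular_distance_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees between two (RA, Dec) points given in degrees."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    # Haversine formula: behaves well numerically for small separations
    a = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    # Clamp to 1.0 to guard against rounding just above the valid asin range
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(a))))


# Two points 0.5 deg apart in Dec (same RA)
sep_dec = angular_distance_deg(90.0, 10.0, 90.0, 10.5)
# Two points 0.5 deg apart in RA, both at Dec = 60 deg
sep_ra = angular_distance_deg(90.0, 60.0, 90.5, 60.0)
print(sep_dec, sep_ra)
```

The second separation comes out close to 0.25 degrees rather than 0.5, because lines of constant RA converge toward the poles. This is why a circular search uses a true angular distance instead of simple differences in `ra` and `dec`.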

#### Here is a query to find all Gaia objects

- Within 0.5 degrees of RA = 90.0 deg, Dec = 10.0 deg 
- Brighter than 10th mag
- Have color (BP-RP) data.

```
SELECT TOP 100
source_id, ra, dec, phot_g_mean_mag, bp_rp
FROM gaiadr3.gaia_source_lite
WHERE DISTANCE( POINT(90.0, 10.0), POINT(ra, dec) ) < 0.5
AND phot_g_mean_mag < 10.0
AND bp_rp IS NOT NULL
ORDER BY bp_rp ASC
```

- #### Extra `conditions` can be added with the `AND` command
- #### The `IS NOT NULL` command is very useful for ignoring rows with no data

In [None]:
query_circle = """
SELECT TOP 100
source_id, ra, dec, phot_g_mean_mag, bp_rp
FROM gaiadr3.gaia_source_lite
WHERE DISTANCE( POINT(90.0, 10.0), POINT(ra, dec) ) < 0.5
AND phot_g_mean_mag < 10.0
AND bp_rp IS NOT NULL
ORDER BY bp_rp ASC
"""

In [None]:
print(query_circle)

In [None]:
my_job_query = Gaia.launch_job(query_circle)

In [None]:
print(my_job_query)

In [None]:
my_circle_table = my_job_query.get_results()

In [None]:
my_circle_table

### Another common search is to find objects within a region bounded by ranges of `RA(deg)` and `DEC(deg)` coordinates

<img src="https://uwashington-astro300.github.io/A300_images/Orion_Square.png" width="400"/>


#### Here is a query to find all Gaia objects

- With RA between 89.0 and 90.0 deg and Dec between 8.0 and 10.0 deg
- Brighter than 10th mag
- Have color (BP-RP) data.

```
SELECT TOP 100
source_id, ra, dec, phot_g_mean_mag, bp_rp
FROM gaiadr3.gaia_source_lite
WHERE ra BETWEEN 89.0 AND 90.0
AND dec BETWEEN 8.0 AND 10.0 
AND phot_g_mean_mag < 10.0
AND bp_rp IS NOT NULL
ORDER BY bp_rp ASC
```

- #### The `BETWEEN` command is very useful for these sorts of searches
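
`BETWEEN` includes both end points, so `ra BETWEEN 89.0 AND 90.0` behaves like `89.0 <= ra <= 90.0`. The sketch below mimics the box selection in plain Python on a handful of made-up coordinates; the list `points` and the function `in_box` are hypothetical and exist only for illustration.

```python
# Made-up (ra, dec) pairs in degrees, NOT real Gaia sources
points = [(89.0, 8.0), (89.5, 9.0), (90.0, 10.0), (90.1, 9.0), (89.5, 7.9)]


def in_box(ra, dec, ra_min=89.0, ra_max=90.0, dec_min=8.0, dec_max=10.0):
    """Mimic `ra BETWEEN ra_min AND ra_max AND dec BETWEEN dec_min AND dec_max`."""
    # Chained <= comparisons are inclusive at both ends, like BETWEEN
    return ra_min <= ra <= ra_max and dec_min <= dec <= dec_max


selected = [p for p in points if in_box(*p)]
print(selected)
```

The points sitting exactly on the corners of the box are kept, while the two points just outside it are dropped. A box that straddles RA = 0/360 deg cannot be written as a single range and would need two RA ranges joined with `OR`.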

In [None]:
query_square = """
SELECT TOP 100
source_id, ra, dec, phot_g_mean_mag, bp_rp
FROM gaiadr3.gaia_source_lite
WHERE ra BETWEEN 89.0 AND 90.0
AND dec BETWEEN 8.0 AND 10.0 
AND phot_g_mean_mag < 10.0
AND bp_rp IS NOT NULL
ORDER BY bp_rp ASC
"""

In [None]:
my_job_query = Gaia.launch_job(query_square)

In [None]:
my_square_table = my_job_query.get_results()

In [None]:
my_square_table

In [None]:
my_square_table.show_in_notebook()

---

## ADQL queries can get SUPER complicated! - I have shown you the merest baby steps. 

## If you want to see how the pros work, check out the [Gaia ADQL Guide](https://www.cosmos.esa.int/web/gaia-users/archive/writing-queries)

---
# Strange Object

In your last homework, I asked you to find the object with the brightest absolute magnitude. 

You found that the object with `source_id` = 2202630001603369856 had an absolute magnitude of -12.17.

This is a crazy value. The most luminous stars in the universe have an absolute magnitude of around -10.

What is going on?

In [None]:
query_strange = """
SELECT TOP 2
source_id, parallax, parallax_error
FROM gaiadr3.gaia_source_lite
WHERE source_id = 2202630001603369856
"""

In [None]:
my_job_query = Gaia.launch_job(query_strange)

In [None]:
my_job_query.get_results()

### That is not good! The error in the parallax is about twice the value itself. 
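
To see why such a large parallax error is a problem, recall that the absolute magnitude follows from the apparent magnitude $m$ and the parallax $\varpi$ (in mas) through $M = m + 5\log_{10}(\varpi) - 10$. Here is a small sketch of how strongly $M$ reacts when the parallax moves by one error bar. The numbers are hypothetical and chosen only for illustration; they are not the catalog values for this source.

```python
import math


def absolute_magnitude(apparent_mag, parallax_mas):
    """M = m + 5*log10(parallax in mas) - 10; only defined for positive parallax."""
    if parallax_mas <= 0:
        raise ValueError("parallax must be positive to be turned into a distance")
    return apparent_mag + 5 * math.log10(parallax_mas) - 10


# Hypothetical values (NOT taken from the Gaia catalog)
m_app = 3.0               # apparent magnitude
plx, plx_err = 0.1, 0.2   # parallax and its error in mas; the error is twice the value

snr = plx / plx_err                                   # parallax signal-to-noise ratio
m_nominal = absolute_magnitude(m_app, plx)            # using the quoted parallax
m_plus = absolute_magnitude(m_app, plx + plx_err)     # parallax shifted up by one error bar

print(f"S/N = {snr:.1f}, M = {m_nominal:.2f}, M (+1 sigma) = {m_plus:.2f}")
```

Shifting the parallax up by one error bar changes the absolute magnitude by more than two magnitudes, and shifting it down by one error bar gives a negative parallax, for which no distance can be computed at all.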

What type of object are we looking at that would lead to such an error?

----

# SIMBAD - Name resolver

The purpose of Simbad is to provide information on astronomical objects of interest which have been studied in scientific articles. It provides the bibliography, as well as available basic information such as the nature of the object. 

One of Simbad's most useful features is its ability to resolve the multitude of names given to objects in the literature.

In [None]:
from astroquery.simbad import Simbad

In [None]:
Simbad.query_objectids("Gaia DR3 2202630001603369856")

In [None]:
Simbad.query_objectids("Gaia DR3 2202630001603369856").show_in_notebook()

This is a very well studied bright star! 

Mu Cephei is visually nearly 100,000 times brighter than the Sun, with an absolute visual magnitude of −7.6. It is also one of the largest known stars, with a radius around or over 1,000 times that of the Sun.

This is exactly the type of object Gaia is NOT designed to study.

We can see what it looks like with tools like [WikiSky](http://wikisky.org/)