# Gaia Data

### On June 13, 2022 the [Gaia project](https://www.cosmos.esa.int/web/gaia/dr3) released its third major data release, containing about 1.8 billion sources.

- For Astro 300, we will use a subset of the main data source.

- This subset is still really large (1906.8 GB), so we will use Python to access this data in an efficient manner.

- #### The Gaia database we will use is called `gaiadr3.gaia_source_lite`

In [None]:
!pip install astroquery

In [None]:
import numpy as np
from astropy.table import QTable
from astroquery.gaia import Gaia

---
# SQL/ADQL Database query language

SQL (Structured Query Language) is a language designed for managing data held in a relational database management system. SQL has become the most widely used database language.

Astronomical Data Query Language (ADQL) is a specialised variant of SQL developed for use with the proliferation of astronomical datasets, and extends the functionality of SQL in an astronomical context.

[The Gaia ADQL cookbook](https://www.gaia.ac.uk/data/gaia-data-release-1/adql-cookbook) is a great resource for learning the ADQL syntax.


## ADQL Query

A typical ADQL query has the form:

```
SELECT
{columns}
FROM {database}
WHERE {conditions}
```

ADQL keywords (such as `SELECT`, `FROM`, and `WHERE`) are usually written in ALL CAPS, while column and table names are lowercase.

Here is a real example of an ADQL query to get the columns `source_id`, `ra`, `dec`, and `parallax` from the `gaiadr3.gaia_source_lite` database for all objects where the value of the `parallax` column is greater than 200 mas. The rows will be ordered by decreasing values of `parallax`:

```
SELECT TOP 10
source_id, ra, dec, parallax
FROM gaiadr3.gaia_source_lite
WHERE parallax > 200.0
ORDER BY parallax DESC
```

#### It is really good to add `TOP 10` to the `SELECT` when you first do a query, so you do not drop millions of lines into your notebook!
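
The pieces of a query described above (columns, table, conditions, ordering, and an optional `TOP` limit) can also be assembled in Python. This is only a minimal sketch: `build_adql_query` is a hypothetical helper written for illustration, not part of `astroquery`, and it does no checking of the column or table names you pass in.

```python
def build_adql_query(columns, table, conditions=(), order_by=None, top=None):
    """Assemble an ADQL query string from its parts (hypothetical helper)."""
    # Adding TOP keeps a first test query from returning millions of rows.
    select = "SELECT" if top is None else f"SELECT TOP {int(top)}"
    lines = [select, ", ".join(columns), f"FROM {table}"]
    if conditions:
        # The first condition follows WHERE; any others are chained with AND.
        lines.append("WHERE " + "\nAND ".join(conditions))
    if order_by:
        lines.append(f"ORDER BY {order_by}")
    return "\n".join(lines)


query = build_adql_query(
    columns=["source_id", "ra", "dec", "parallax"],
    table="gaiadr3.gaia_source_lite",
    conditions=["parallax > 200.0"],
    order_by="parallax DESC",
    top=10,
)
print(query)
```

Building the string this way makes it easy to change one piece (for example the `TOP` limit) without retyping the whole query.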

---
## Let's get some data

- First we create the query as a triple-quoted (multi-line) string

In [None]:
query_one = """
SELECT TOP 10
source_id, ra, dec, parallax
FROM gaiadr3.gaia_source_lite
WHERE parallax > 200
ORDER BY parallax DESC
"""

In [None]:
print(query_one)


SELECT TOP 10
source_id, ra, dec, parallax
FROM gaiadr3.gaia_source_lite
WHERE parallax > 200
ORDER BY parallax DESC



## Submit our query to the Gaia archive server

In [None]:
my_job_query = Gaia.launch_job(query_one)

### Check the status of the job

In [None]:
print(my_job_query)

<Table length=10>
   name    dtype  unit                            description                            
--------- ------- ---- ------------------------------------------------------------------
source_id   int64      Unique source identifier (unique within a particular Data Release)
       ra float64  deg                                                    Right ascension
      dec float64  deg                                                        Declination
 parallax float64  mas                                                           Parallax
Jobid: None
Phase: COMPLETED
Owner: None
Output file: 1705527160970O-result.vot.gz
Results: None


### Looks good, so get the results

- The results will be a nice astropy `Table`

In [None]:
my_parallax_table = my_job_query.get_results()

In [None]:
my_parallax_table

source_id,ra,dec,parallax
,deg,deg,mas
int64,float64,float64,float64
5853498713190525696,217.39232147200883,-62.67607511676666,768.0665391873573
4472832130942575872,269.4485025254384,4.739420051112412,546.975939730948
3864972938605115520,164.10319030755974,7.002726940984864,415.17941567802137
762815470562110464,165.83095967577933,35.948653032660104,392.75294543876464
2947050466531873024,101.28662552099247,-16.720932526023173,374.489588528761
5140693571158946048,24.771674208211856,-17.947682860008488,373.8443122683992
5140693571158739840,24.771554293454543,-17.948299887129313,367.71189618147696
4075141768785646848,282.4587890175222,-23.83709744872712,336.0266016683708
1926461164913660160,355.4800152581559,44.17037570074776,316.4811867822692
5164707970261890560,53.22829341517546,-9.458168216292322,310.5772928005821
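
To get a feel for what these parallax values mean, here is a small sketch that turns a parallax into a distance with the standard relation d (pc) = 1000 / parallax (mas). The function name `parallax_to_distance_pc` is my own, and the parallaxes are copied (rounded) from the first three rows of the table above. Simply inverting the parallax is only a reasonable estimate when the parallax is much larger than its uncertainty, which should be the case for stars this close.

```python
def parallax_to_distance_pc(parallax_mas):
    """Distance in parsecs from a parallax in milliarcseconds (d = 1000 / p)."""
    if parallax_mas <= 0:
        raise ValueError("parallax must be positive to invert it into a distance")
    return 1000.0 / parallax_mas


# Parallaxes (mas) of the first three rows of the table above, rounded.
parallaxes_mas = [768.0665, 546.9759, 415.1794]
distances_pc = [parallax_to_distance_pc(p) for p in parallaxes_mas]

for p, d in zip(parallaxes_mas, distances_pc):
    print(f"parallax = {p:9.4f} mas  ->  distance = {d:.3f} pc")

# The query cut `parallax > 200` is therefore a cut on distance.
limit_pc = parallax_to_distance_pc(200.0)
print(f"parallax > 200 mas means closer than {limit_pc:.1f} pc")
```

So the query is really asking for the stars within 5 parsecs of the Sun.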


---

# Getting data from a specific piece of the sky

### A very common search is to find objects within a certain angular distance from a point on the sky

<img src="https://uwashington-astro300.github.io/A300_images/Orion_Circle.png" width="400"/>


- The command `POINT(RA(deg), DEC(deg))` specifies a point on the celestial sphere.

- The command `DISTANCE(point1, point2)` computes the spherical angular distance between two points.
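
To see what "spherical angular distance" means, here is a sketch of the same kind of calculation in plain Python using the haversine formula. This is one common way to compute a great-circle separation; it is not taken from the Gaia archive, and the function name `angular_distance_deg` is just for illustration.

```python
import math


def angular_distance_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees between two (RA, Dec) points in degrees."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    # Haversine formula: well behaved for the small separations we care about.
    sin_ddec = math.sin((dec2 - dec1) / 2.0)
    sin_dra = math.sin((ra2 - ra1) / 2.0)
    h = sin_ddec**2 + math.cos(dec1) * math.cos(dec2) * sin_dra**2
    # min() guards against rounding pushing h a hair above 1.
    return math.degrees(2.0 * math.asin(math.sqrt(min(1.0, h))))


# Move 0.5 deg in Dec, then 0.5 deg in RA, away from (RA, Dec) = (90, 10).
sep_north = angular_distance_deg(90.0, 10.0, 90.0, 10.5)
sep_east = angular_distance_deg(90.0, 10.0, 90.5, 10.0)
print(f"0.5 deg step in Dec: {sep_north:.4f} deg on the sky")
print(f"0.5 deg step in RA : {sep_east:.4f} deg on the sky")
```

A step of 0.5 degrees in RA covers slightly less than 0.5 degrees on the sky at Dec = 10 deg, because lines of constant RA converge toward the poles. This is why a distance-based search is better than comparing RA and Dec differences directly.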

#### Here is a query to find all Gaia objects

- Within 0.5 degrees of RA = 90.0 deg, Dec = 10.0 deg
- Brighter than 10th mag
- Have color (BP-RP) data.

```
SELECT TOP 100
source_id, ra, dec, phot_g_mean_mag, bp_rp
FROM gaiadr3.gaia_source_lite
WHERE DISTANCE( POINT(90.0, 10.0), POINT(ra, dec) ) < 0.5
AND phot_g_mean_mag < 10.0
AND bp_rp IS NOT NULL
ORDER BY bp_rp ASC
```

- #### Extra `conditions` can be added with the `AND` command
- #### The `IS NOT NULL` command is very useful for ignoring rows with no data

In [None]:
query_circle = """
SELECT TOP 100
source_id, ra, dec, phot_g_mean_mag, bp_rp
FROM gaiadr3.gaia_source_lite
WHERE DISTANCE( POINT(90.0, 10.0), POINT(ra, dec) ) < 0.5
AND phot_g_mean_mag < 10.0
AND bp_rp IS NOT NULL
ORDER BY bp_rp ASC
"""

In [None]:
print(query_circle)


SELECT TOP 100
source_id, ra, dec, phot_g_mean_mag, bp_rp
FROM gaiadr3.gaia_source_lite
WHERE DISTANCE( POINT(90.0, 10.0), POINT(ra, dec) ) < 0.5
AND phot_g_mean_mag < 10.0
AND bp_rp IS NOT NULL
ORDER BY bp_rp ASC



In [None]:
my_job_query = Gaia.launch_job(query_circle)

In [None]:
print(my_job_query)

<Table length=17>
      name       dtype  unit                            description                            
--------------- ------- ---- ------------------------------------------------------------------
      source_id   int64      Unique source identifier (unique within a particular Data Release)
             ra float64  deg                                                    Right ascension
            dec float64  deg                                                        Declination
phot_g_mean_mag float32  mag                                              G-band mean magnitude
          bp_rp float32  mag                                                     BP - RP colour
Jobid: None
Phase: COMPLETED
Owner: None
Output file: 1705527161653O-result.vot.gz
Results: None


In [None]:
my_circle_table = my_job_query.get_results()

In [None]:
my_circle_table

source_id,ra,dec,phot_g_mean_mag,bp_rp
,deg,deg,mag,mag
int64,float64,float64,float32,float32
3341748267281799040,89.62669440318659,10.099907155196268,9.142086,0.062874794
3341770055652565120,89.76583701548212,10.31244549927474,9.805223,0.120651245
3341739028808993024,90.25540119781957,10.40149385723267,8.541857,0.25239468
3341725559792528768,90.19658840286726,10.162951277601085,9.89426,0.27869606
3323655876463584768,89.9710403003534,9.533599452192728,8.196316,0.34491205
3341729128905719808,90.11810499216004,10.218009295612235,9.086484,0.46809578
3341724151042395648,90.19475621146648,10.10503617710665,9.959866,0.47952843
3341761878034880512,89.62494656452243,10.207791109727046,9.195797,0.56916904
3341673225616983936,90.08217970157622,9.702005508677903,8.569285,0.57319355
3341683262953793152,90.23201235122484,9.906884851454713,9.263163,0.6175747


### Another common search is to find objects within a certain region bounded by sets of `RA(deg)` and `DEC(deg)` coordinates

<img src="https://uwashington-astro300.github.io/A300_images/Orion_Square.png" width="400"/>


#### Here is a query to find all Gaia objects

- With RA between 89.0 and 90.0 deg, and Dec between 8.0 and 10.0 deg
- Brighter than 10th mag
- Have color (BP-RP) data.

```
SELECT TOP 100
source_id, ra, dec, phot_g_mean_mag, bp_rp
FROM gaiadr3.gaia_source_lite
WHERE ra BETWEEN 89.0 AND 90.0
AND dec BETWEEN 8.0 AND 10.0
AND phot_g_mean_mag < 10.0
AND bp_rp IS NOT NULL
ORDER BY bp_rp ASC
```

- #### The `BETWEEN` command is very useful for these sorts of searches
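
One detail worth knowing is that `BETWEEN` includes both end points, so `ra BETWEEN 89.0 AND 90.0` means `89.0 <= ra <= 90.0`. Here is a small Python sketch of the same test. The helper names `between` and `in_box` and the test positions are made up for illustration.

```python
def between(value, low, high):
    """Mimic `value BETWEEN low AND high`, which includes both end points."""
    return low <= value <= high


def in_box(ra, dec):
    """True if (ra, dec) in degrees falls inside the search box of the query above."""
    return between(ra, 89.0, 90.0) and between(dec, 8.0, 10.0)


# Hypothetical test positions (RA, Dec) in degrees.
positions = [(89.5, 9.0), (90.0, 10.0), (90.2, 9.0), (89.5, 7.9)]
inside = [in_box(ra, dec) for ra, dec in positions]

for (ra, dec), ok in zip(positions, inside):
    print(f"RA = {ra:5.1f}, Dec = {dec:4.1f} -> {'inside' if ok else 'outside'}")
```

The second position sits exactly on the corner of the box and still counts as inside.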

In [None]:
query_square = """
SELECT TOP 100
source_id, ra, dec, phot_g_mean_mag, bp_rp
FROM gaiadr3.gaia_source_lite
WHERE ra BETWEEN 89.0 AND 90.0
AND dec BETWEEN 8.0 AND 10.0
AND phot_g_mean_mag < 10.0
AND bp_rp IS NOT NULL
ORDER BY bp_rp ASC
"""

In [None]:
my_job_query = Gaia.launch_job(query_square)

In [None]:
my_square_table = my_job_query.get_results()

In [None]:
my_square_table

source_id,ra,dec,phot_g_mean_mag,bp_rp
,deg,deg,mag,mag
int64,float64,float64,float32,float32
3335676905808496896,89.38848043985271,9.581337497606762,8.391348,-0.035431862
3335698896040824704,89.11680996459627,9.509329864314287,5.9601636,-0.028367996
3323640753881335168,89.85727328078046,9.257982467210201,8.718452,0.09463692
3323455073854643072,89.1451158930138,8.31152881295884,9.9304,0.18818283
3323655876463584768,89.9710403003534,9.533599452192728,8.196316,0.34491205
3335658385909449856,89.76020498670931,9.47425573848086,9.206095,0.35547447
3335673946573907328,89.46195521752016,9.55274079662755,9.840582,0.36182785
3323523557111825280,89.93711657417262,8.594333738857253,9.81608,0.38627434
3335681922330220544,89.61968584711629,9.645174685233835,9.560016,0.39993286
...,...,...,...,...


In [None]:
my_square_table.show_in_notebook()

idx,source_id,ra,dec,phot_g_mean_mag,bp_rp
,,deg,deg,mag,mag
0,3335676905808496896,89.38848043985271,9.581337497606762,8.391348,-0.035431862
1,3335698896040824704,89.11680996459627,9.509329864314289,5.9601636,-0.028367996
2,3323640753881335168,89.85727328078046,9.2579824672102,8.718452,0.09463692
3,3323455073854643072,89.1451158930138,8.31152881295884,9.9304,0.18818283
4,3323655876463584768,89.9710403003534,9.533599452192728,8.196316,0.34491205
5,3335658385909449856,89.76020498670931,9.47425573848086,9.206095,0.35547447
6,3335673946573907328,89.46195521752016,9.55274079662755,9.840582,0.36182785
7,3323523557111825280,89.93711657417262,8.594333738857253,9.81608,0.38627434
8,3335681922330220544,89.61968584711629,9.645174685233837,9.560016,0.39993286
9,3335656599203090944,89.73216107415882,9.381710709535788,9.562954,0.40233994


---

## ADQL queries can get SUPER complicated! I have shown you only the merest baby steps.

## If you want to see how the pros work, check out the [Gaia ADQL Guide](https://www.cosmos.esa.int/web/gaia-users/archive/writing-queries)

---
# Strange Object

In your last homework, I asked you to find the object with the brightest absolute magnitude.

You found that the object with `source_id` = 2202630001603369856 had an absolute magnitude of -12.17.

This is a crazy value. The most luminous stars in the universe have an absolute magnitude of around -10.

What is going on?

In [None]:
query_strange = """
SELECT TOP 2
source_id, parallax, parallax_error
FROM gaiadr3.gaia_source_lite
WHERE source_id = 2202630001603369856
"""

In [None]:
my_job_query = Gaia.launch_job(query_strange)

In [None]:
my_job_query.get_results()

source_id,parallax,parallax_error
,mas,mas
int64,float64,float32
2202630001603369856,0.1190235841537322,0.26368567


### That is not good! The error in the parallax is about twice the parallax itself.
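
To put numbers on this, here is a rough sketch using the two values returned by the query above. It assumes the usual relation M = m - 10 + 5 log10(parallax in mas), so changing the parallax from p_old to p_new shifts the absolute magnitude by 5 log10(p_new / p_old). The function name `abs_mag_shift` is my own.

```python
import math

parallax_mas = 0.1190235841537322  # from the query result above
parallax_error_mas = 0.26368567    # from the query result above

# Signal-to-noise ratio of the parallax: well below 1 means "not really measured".
snr = parallax_mas / parallax_error_mas


def abs_mag_shift(parallax_old_mas, parallax_new_mas):
    """Change in absolute magnitude when the adopted parallax changes."""
    return 5.0 * math.log10(parallax_new_mas / parallax_old_mas)


# Shift the parallax by one standard error in each direction.
parallax_plus = parallax_mas + parallax_error_mas
parallax_minus = parallax_mas - parallax_error_mas

shift_plus = abs_mag_shift(parallax_mas, parallax_plus)

print(f"parallax / error = {snr:.2f}")
print(f"parallax + 1 sigma makes the star {shift_plus:.1f} mag fainter")
print(f"parallax - 1 sigma = {parallax_minus:.3f} mas (negative, so no distance at all)")
```

A one-sigma change in the parallax moves the absolute magnitude by more than two magnitudes in one direction and gives a meaningless negative parallax in the other, so an absolute magnitude computed from this parallax cannot be trusted.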

What type of object are we looking at that would lead to such an error?

----

# SIMBAD - Name resolver

The purpose of Simbad is to provide information on astronomical objects of interest which have been studied in scientific articles. It provides the bibliography, as well as available basic information such as the nature of the object.

One of Simbad's most useful features is its ability to resolve the multitude of names given to objects in the literature.

In [None]:
from astroquery.simbad import Simbad

In [None]:
Simbad.query_objectids("Gaia DR3 2202630001603369856")

ID
bytes32
Gaia DR3 2202630001603369856
TIC 260614141
NAME Herschel's Garnet Star
NAME Erakis
PLX 5252
* mu. Cep
AAVSO 2140+58
ADS 15271 A
AG+58 1378
BD+58 2316


In [None]:
Simbad.query_objectids("Gaia DR3 2202630001603369856").show_in_notebook()

idx,ID
0,Gaia DR3 2202630001603369856
1,TIC 260614141
2,NAME Herschel's Garnet Star
3,NAME Erakis
4,PLX 5252
5,* mu. Cep
6,AAVSO 2140+58
7,ADS 15271 A
8,AG+58 1378
9,BD+58 2316


This is a very well studied bright star!

Mu Cephei is visually nearly 100,000 times brighter than the Sun, with an absolute visual magnitude of −7.6. It is also one of the largest known stars, with a radius around or over 1,000 times that of the Sun.
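
The "nearly 100,000 times" figure follows from the magnitude scale, where a difference of 5 magnitudes is a factor of 100 in brightness. Here is a quick check, assuming an absolute visual magnitude of 4.83 for the Sun (a commonly quoted value that does not come from this notebook).

```python
M_V_SUN = 4.83     # absolute visual magnitude of the Sun (assumed value)
M_V_MU_CEP = -7.6  # absolute visual magnitude of Mu Cephei quoted above


def brightness_ratio(mag_faint, mag_bright):
    """Brightness ratio implied by a magnitude difference: 10 ** (0.4 * dm)."""
    return 10.0 ** (0.4 * (mag_faint - mag_bright))


ratio = brightness_ratio(M_V_SUN, M_V_MU_CEP)
print(f"Mu Cephei is about {ratio:,.0f} times brighter than the Sun in visual light")
```

The result comes out a little over 90,000, consistent with the "nearly 100,000" quoted above.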

This is exactly the type of object Gaia is NOT designed to study.

We can see what it looks like with tools like [WikiSky](http://wikisky.org/)