# Astroquery: a package for retrieving data from online sources

For more information, see the [astroquery documentation](https://astroquery.readthedocs.io/en/latest/), which includes extensive lists of [available services](https://astroquery.readthedocs.io/en/latest/#available-services) and of [catalogs, archives, and other services](https://astroquery.readthedocs.io/en/latest/#catalog-archive-and-other).

One of the nice things about `astroquery` is that it typically returns information as useful astropy objects, like tables.

In [None]:
from astroquery.vizier import Vizier
from astropy import units as u
from astropy.coordinates import SkyCoord
from astropy.table import Table

## Scenario: what other variables are in the field of DY Her?

Imagine you have some images of the variable DY Her and you wonder whether there are other variables in that field of view. If there are, you might also want to check whether you have data on any of them.

The astroquery package can help with that by allowing us to retrieve data from the VizieR service. [VizieR](https://vizier.cds.unistra.fr/viz-bin/VizieR) is an online catalog of catalogs.

### Use astroquery to find nearby variables

One of the things we will need for our search is the location of DY Her, so we'll start by looking that up using astropy coordinates.

In [None]:
dy_her_coord = SkyCoord.from_name('dy her')

We also need the names of the catalogs we want from VizieR. It is easiest to look up the names by searching at the VizieR web site. I've done that to get the two names below.

VizieR maintains a copy of the VSX database (it is only updated monthly, so don't look there for the latest new variable stars), and has a copy of the APASS catalog. APASS is the AAVSO Photometric All Sky Survey.

In [None]:
from astroquery.vizier import Vizier

apass_name = 'II/336/apass9'
vsx_name = 'B/vsx/vsx'

Vizier.ROW_LIMIT = -1  # By default a search returns only 50 rows. Setting this to -1 requests that all rows be returned
cat = Vizier.query_region(dy_her_coord,   # RA/Dec of center of search
                          radius=20 * u.arcmin,  # Radius of search, in arcmin
                          catalog=vsx_name)

A `Vizier` query returns a list of tables, one for each catalog that had matches. It is possible to search *all* VizieR catalogs for a particular position by omitting the `catalog` argument above. In this case, we only searched one catalog, so we only get back one table.

In [None]:
dy_her_vsx = cat[0]

Let's take a look at the table we got back.

In [None]:
dy_her_vsx

There are only five variable stars in this field of view, but perhaps we might have caught changes in more than one. To see whether that is the case, let's focus on just a few of the columns.

In [None]:
dy_her_vsx['Name', 'Type', 'Period', 'max']

### What data do we happen to have on these stars?

The telescope we used to take the data on DY Her can only detect stars down to about 16th magnitude given the exposure time we used, leaving us with just three possible stars. Happily, they all have fairly short periods so we may have caught some change in brightness. 

As a next step, we will read the data into an astropy table.

In [None]:
dy_her_data = Table.read('dy her-2023-06-19-relative-flux.csv')
len(dy_her_data)

Let's take a look at the table...

In [None]:
dy_her_data[:5]

That is a little overwhelming. Fortunately, we can pull out just the data corresponding to one star if we can identify which star in our data matches each of the stars we found in VSX.

We will do that by matching the coordinates of the stars in this data to the coordinates of the stars in VSX.

### Catalog matching using `SkyCoord`

Astropy coordinate objects come with a mechanism for matching one list of coordinates against another. Here, we would like to find a match for each of the VSX stars in our data. By default, what we will get back is the row number of the closest match in our data.

We begin by making coordinate objects for the VSX data and our data.

In [None]:
dy_her_vsx_coord = SkyCoord(ra=dy_her_vsx['RAJ2000'], dec=dy_her_vsx['DEJ2000'], unit='degree')

In [None]:
dy_her_data_coords = SkyCoord(ra=dy_her_data['RA'], dec=dy_her_data['Dec'], unit='degree')

Note that there are five VSX stars, so there are five coordinates in `dy_her_vsx_coord`, one for each star:

In [None]:
dy_her_vsx_coord

The single line below is doing a lot of work, and the order in which the coordinates appear matters. We want 5 matches in the end, one for each VSX star, so we start with the `dy_her_vsx_coord` coordinates, and then match those to our data.

In [None]:
match_idx, d2d, d3d = dy_her_vsx_coord.match_to_catalog_sky(dy_her_data_coords)

We get three different results here:

+ The row of the closest match to each VSX star, called `match_idx` above.
+ The angular distance between each VSX star and its closest match in our data, called `d2d` above.
+ The distance (in 3 dimensions) between each VSX star and its closest match, called `d3d` above. This is not meaningful in many cases, including this one, because we do not have any distance information.

Let's look at each of the three below.

In [None]:
match_idx

In [None]:
d2d

In [None]:
d3d

In [None]:
# Select the rows of our data that belong to the star with ID 5
star_5 = dy_her_data['star_id'] == 5
star_5_data = dy_her_data[star_5]

In [None]:
star_5_data['BJD', 'relative_flux']