# Download data 

## Using astroquery

### Download observations from MAST using astroquery for a part of an HST program

**TL;DR: If you want to download the data from the two visits, execute the script [on the PACMAN GitHub](https://github.com/sebastian-zieba/PACMAN/blob/master/docs/source/media/download/download_data_astroquery.py).**

This tutorial is based on the following Jupyter Notebooks on the STScI GitHub:
[(1)](https://github.com/spacetelescope/MAST-API-Notebooks/blob/master/HST/HST_Data_Access.ipynb)
and [(2)](https://github.com/spacetelescope/MAST-API-Notebooks/blob/master/AstroqueryIntro/AstroqueryFunctionalityDemo.ipynb).


Here we will download HST data using [astroquery](https://astroquery.readthedocs.io/en/latest/).

An example script showing how to do this can be found [on the PACMAN GitHub](https://github.com/sebastian-zieba/PACMAN/blob/master/docs/source/media/download/download_data_astroquery.py).
In this tutorial, we walk through this script using a Jupyter Notebook.

We recommend running this script or Jupyter Notebook in a new directory.
The FITS files will be saved there, and this directory will then serve as the "data directory".
More on the data directory can be found [here (in the context of the pcf)](https://pacmandocs.readthedocs.io/en/latest/pcf.html#datadir)
and [here](https://pacmandocs.readthedocs.io/en/latest/directories.html#nomenclature).

For simplicity, this example analyzes just two visits (with PACMAN index 5 and 6) taken in the middle of the GO13021 program:

Dates (YYYY-MM-DD): 2013-03-13 and 2013-03-15.

If you have already downloaded all 15 visits in GO13021, you can instead set
`which_visits = [5,6]` in the pcf (see the [pcf page](https://pacmandocs.readthedocs.io/en/latest/pcf.html#which-visits) about this parameter).

Alternative methods on how to download HST data can be found here: https://pacmandocs.readthedocs.io/en/latest/download_data.html

Let's first import some necessary packages.

In [1]:
from astropy.time import Time
import os
from astroquery.mast import Observations

### Set start and end time of the observations in MJD

We first need the start and end times of the observations to define which files we want to download.
The times in UT are listed in the [Visit Status Report for program 13021](https://www.stsci.edu/cgi-bin/get-visit-status?id=13021&markupFormat=html&observatory=HST).
Note that STScI uses a different indexing for the visits: PACMAN indexes the visits in temporal order.

From the website:

- START Mar 13, 2013 12:34:53, END Mar 13, 2013 18:17:34
- START Mar 15, 2013 02:52:08, END Mar 15, 2013 08:34:55

In [4]:
t_obs_utc = ['2013-03-13T12:34:53', '2013-03-15T08:34:55'] # very start and end of the two visits
t_obs = Time(t_obs_utc, format='isot', scale='utc')

In [5]:
t_obs

<Time object: scale='utc' format='isot' value=['2013-03-13T12:34:53.000' '2013-03-15T08:34:55.000']>

In [6]:
# pad the window by 0.0001 days (about 8.64 s) on each side
t_min_obs = t_obs.mjd[0] - 0.0001
t_max_obs = t_obs.mjd[1] + 0.0001
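
As a rough cross-check of the MJD values, the conversion can be sketched with the standard library alone. This is not how astropy computes the values internally; the helper `iso_to_mjd` is a hypothetical name introduced here, and the sketch ignores leap seconds, which should not matter for these two dates.

```python
from datetime import datetime

MJD_EPOCH = datetime(1858, 11, 17)  # MJD 0 corresponds to 1858-11-17T00:00:00
SECONDS_PER_DAY = 86400.0


def iso_to_mjd(iso_utc):
    """Convert an ISO UTC timestamp to MJD (hypothetical helper, no leap seconds)."""
    dt = datetime.strptime(iso_utc, "%Y-%m-%dT%H:%M:%S")
    return (dt - MJD_EPOCH).total_seconds() / SECONDS_PER_DAY


margin_days = 0.0001
margin_seconds = margin_days * SECONDS_PER_DAY  # size of the padding in seconds

# Same start and end times as in the cell above, padded on both sides
t_min_check = iso_to_mjd("2013-03-13T12:34:53") - margin_days
t_max_check = iso_to_mjd("2013-03-15T08:34:55") + margin_days

print(t_min_check, t_max_check, margin_seconds)
```

The integer parts of the two values are the MJD day numbers of 2013-03-13 and 2013-03-15, and the padding amounts to a few seconds, which is small compared to the gap between visits.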

The proposal ID is 13021, and we are only interested in data taken with the WFC3 infrared channel (WFC3/IR).

In [7]:
proposal_obs = Observations.query_criteria(proposal_id=13021, instrument_name='WFC3/IR', project='HST')
print("Number of observations:",len(proposal_obs))
print(proposal_obs)

Number of observations: 1115
dataproduct_type calib_level obs_collection ... intentType  obsid     objID  
---------------- ----------- -------------- ... ---------- -------- ---------
        spectrum           3            HST ...    science 23901558 139211258
        spectrum           3            HST ...    science 23901554 139211259
        spectrum           3            HST ...    science 23901557 139211260
        spectrum           3            HST ...    science 23901739 139211261
        spectrum           3            HST ...    science 23901737 139211268
        spectrum           3            HST ...    science 23901747 139211277
        spectrum           3            HST ...    science 23901722 139211708
             ...         ...            ... ...        ...      ...       ...
        spectrum           3            HST ...    science 23902270 139244759
        spectrum           3            HST ...    science 23902267 139244763
        spectrum           3       

### Use a mask so that we only download data taken during the desired time window

In [8]:
# create mask: keep observations whose start time (t_min, in MJD) lies inside the window
select = (t_min_obs <= proposal_obs['t_min'].value.data) & (proposal_obs['t_min'].value.data <= t_max_obs)

In [9]:
# mask main array
proposal_obs_select = proposal_obs[select]
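
To make the selection logic explicit, here is a minimal plain-Python sketch of the same window test. The start times in `t_min_column` are made up for illustration and are not taken from the MAST table; the window limits are rounded approximations of `t_min_obs` and `t_max_obs`.

```python
def in_window(t_start, t_lo, t_hi):
    """True if an exposure start time lies inside the closed window [t_lo, t_hi]."""
    return t_lo <= t_start <= t_hi


# Hypothetical exposure start times in MJD (invented for this sketch)
t_min_column = [56363.90, 56364.53, 56365.10, 56366.30, 56366.40]

# Approximate window limits in MJD, rounded for readability
t_lo, t_hi = 56364.5241, 56366.3577

mask = [in_window(t, t_lo, t_hi) for t in t_min_column]
selected = [t for t, keep in zip(t_min_column, mask) if keep]

print(mask)
print(selected)
```

Note that the test only looks at the start time of each observation, just like the mask above, so an exposure is kept whenever it begins inside the window.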

In [10]:
# list data products available during these times
data_products = Observations.get_product_list(proposal_obs_select)
print("Number of results:",len(data_products))
print(data_products)

Number of results: 2090
 obsID   obs_collection dataproduct_type ... dataRights calib_level
-------- -------------- ---------------- ... ---------- -----------
26572598            HST            image ...     PUBLIC           3
26572598            HST            image ...     PUBLIC           3
26572598            HST            image ...     PUBLIC           3
26572598            HST            image ...     PUBLIC           3
26572598            HST            image ...     PUBLIC           3
26572592            HST            image ...     PUBLIC           2
26572592            HST            image ...     PUBLIC           2
     ...            ...              ... ...        ...         ...
23902071            HST         spectrum ...     PUBLIC           1
23902071            HST         spectrum ...     PUBLIC           1
23902071            HST         spectrum ...     PUBLIC           3
23902071            HST         spectrum ...     PUBLIC           2
23902071            HST 

### We just want the _ima files

**PACMAN currently only works with `ima` files (file names ending in `_ima.fits`), so we select only these.
`ima` is an intermediate data product; the name stands for calibrated intermediate IR MultiAccum image.
From the [WFC3 data handbook](https://hst-docs.stsci.edu/wfc3dhb/chapter-2-wfc3-data-structure/2-1-types-of-wfc3-files):
“For the IR detector, an intermediate MultiAccum (ima) file is the result after all calibrations are applied (dark subtraction, linearity correction, flat fielding, etc.) to all of the individual readouts of the IR exposure.”**

In [11]:
# only select IMA files
data_products_ima = data_products[data_products['productSubGroupDescription'] == 'IMA']
print(data_products_ima)

 obsID   obs_collection dataproduct_type ... dataRights calib_level
-------- -------------- ---------------- ... ---------- -----------
23901914            HST            image ...     PUBLIC           2
23901915            HST         spectrum ...     PUBLIC           2
23901916            HST         spectrum ...     PUBLIC           2
23901917            HST         spectrum ...     PUBLIC           2
23901918            HST         spectrum ...     PUBLIC           2
23901919            HST         spectrum ...     PUBLIC           2
23901920            HST         spectrum ...     PUBLIC           2
     ...            ...              ... ...        ...         ...
23902065            HST         spectrum ...     PUBLIC           2
23902066            HST         spectrum ...     PUBLIC           2
23902067            HST         spectrum ...     PUBLIC           2
23902068            HST         spectrum ...     PUBLIC           2
23902069            HST         spectrum ...    

### Download the data

In [12]:
Observations.download_products(data_products_ima,mrp_only=False)

Downloading URL https://mast.stsci.edu/api/v0.1/Download/file?uri=mast:HST/product/ibxy06d0q_ima.fits to ./mastDownload/HST/ibxy06d0q/ibxy06d0q_ima.fits ... [Done]
Downloading URL https://mast.stsci.edu/api/v0.1/Download/file?uri=mast:HST/product/ibxy06d1q_ima.fits to ./mastDownload/HST/ibxy06d1q/ibxy06d1q_ima.fits ... [Done]
Downloading URL https://mast.stsci.edu/api/v0.1/Download/file?uri=mast:HST/product/ibxy06d2q_ima.fits to ./mastDownload/HST/ibxy06d2q/ibxy06d2q_ima.fits ... [Done]
Downloading URL https://mast.stsci.edu/api/v0.1/Download/file?uri=mast:HST/product/ibxy06d3q_ima.fits to ./mastDownload/HST/ibxy06d3q/ibxy06d3q_ima.fits ... [Done]
Downloading URL https://mast.stsci.edu/api/v0.1/Download/file?uri=mast:HST/product/ibxy06d4q_ima.fits to ./mastDownload/HST/ibxy06d4q/ibxy06d4q_ima.fits ... [Done]
Downloading URL https://mast.stsci.edu/api/v0.1/Download/file?uri=mast:HST/product/ibxy06d6q_ima.fits to ./mastDownload/HST/ibxy06d6q/ibxy06d6q_ima.fits ... [Done]
Downloading URL 

Downloading URL https://mast.stsci.edu/api/v0.1/Download/file?uri=mast:HST/product/ibxy06f1q_ima.fits to ./mastDownload/HST/ibxy06f1q/ibxy06f1q_ima.fits ... [Done]
Downloading URL https://mast.stsci.edu/api/v0.1/Download/file?uri=mast:HST/product/ibxy06f2q_ima.fits to ./mastDownload/HST/ibxy06f2q/ibxy06f2q_ima.fits ... [Done]
Downloading URL https://mast.stsci.edu/api/v0.1/Download/file?uri=mast:HST/product/ibxy06f4q_ima.fits to ./mastDownload/HST/ibxy06f4q/ibxy06f4q_ima.fits ... [Done]
Downloading URL https://mast.stsci.edu/api/v0.1/Download/file?uri=mast:HST/product/ibxy06f5q_ima.fits to ./mastDownload/HST/ibxy06f5q/ibxy06f5q_ima.fits ... [Done]
Downloading URL https://mast.stsci.edu/api/v0.1/Download/file?uri=mast:HST/product/ibxy06f6q_ima.fits to ./mastDownload/HST/ibxy06f6q/ibxy06f6q_ima.fits ... [Done]
Downloading URL https://mast.stsci.edu/api/v0.1/Download/file?uri=mast:HST/product/ibxy06f8q_ima.fits to ./mastDownload/HST/ibxy06f8q/ibxy06f8q_ima.fits ... [Done]
Downloading URL 

Downloading URL https://mast.stsci.edu/api/v0.1/Download/file?uri=mast:HST/product/ibxy07q0q_ima.fits to ./mastDownload/HST/ibxy07q0q/ibxy07q0q_ima.fits ... [Done]
Downloading URL https://mast.stsci.edu/api/v0.1/Download/file?uri=mast:HST/product/ibxy07q1q_ima.fits to ./mastDownload/HST/ibxy07q1q/ibxy07q1q_ima.fits ... [Done]
Downloading URL https://mast.stsci.edu/api/v0.1/Download/file?uri=mast:HST/product/ibxy07q2q_ima.fits to ./mastDownload/HST/ibxy07q2q/ibxy07q2q_ima.fits ... [Done]
Downloading URL https://mast.stsci.edu/api/v0.1/Download/file?uri=mast:HST/product/ibxy07q4q_ima.fits to ./mastDownload/HST/ibxy07q4q/ibxy07q4q_ima.fits ... [Done]
Downloading URL https://mast.stsci.edu/api/v0.1/Download/file?uri=mast:HST/product/ibxy07q5q_ima.fits to ./mastDownload/HST/ibxy07q5q/ibxy07q5q_ima.fits ... [Done]
Downloading URL https://mast.stsci.edu/api/v0.1/Download/file?uri=mast:HST/product/ibxy07q6q_ima.fits to ./mastDownload/HST/ibxy07q6q/ibxy07q6q_ima.fits ... [Done]
Downloading URL 

Downloading URL https://mast.stsci.edu/api/v0.1/Download/file?uri=mast:HST/product/ibxy07rrq_ima.fits to ./mastDownload/HST/ibxy07rrq/ibxy07rrq_ima.fits ... [Done]
Downloading URL https://mast.stsci.edu/api/v0.1/Download/file?uri=mast:HST/product/ibxy07rsq_ima.fits to ./mastDownload/HST/ibxy07rsq/ibxy07rsq_ima.fits ... [Done]
Downloading URL https://mast.stsci.edu/api/v0.1/Download/file?uri=mast:HST/product/ibxy07rtq_ima.fits to ./mastDownload/HST/ibxy07rtq/ibxy07rtq_ima.fits ... [Done]
Downloading URL https://mast.stsci.edu/api/v0.1/Download/file?uri=mast:HST/product/ibxy07ruq_ima.fits to ./mastDownload/HST/ibxy07ruq/ibxy07ruq_ima.fits ... [Done]
Downloading URL https://mast.stsci.edu/api/v0.1/Download/file?uri=mast:HST/product/ibxy07rvq_ima.fits to ./mastDownload/HST/ibxy07rvq/ibxy07rvq_ima.fits ... [Done]
Downloading URL https://mast.stsci.edu/api/v0.1/Download/file?uri=mast:HST/product/ibxy07rwq_ima.fits to ./mastDownload/HST/ibxy07rwq/ibxy07rwq_ima.fits ... [Done]
Downloading URL 

Local Path,Status,Message,URL
str47,str8,object,object
./mastDownload/HST/ibxy06d0q/ibxy06d0q_ima.fits,COMPLETE,,
./mastDownload/HST/ibxy06d1q/ibxy06d1q_ima.fits,COMPLETE,,
./mastDownload/HST/ibxy06d2q/ibxy06d2q_ima.fits,COMPLETE,,
./mastDownload/HST/ibxy06d3q/ibxy06d3q_ima.fits,COMPLETE,,
./mastDownload/HST/ibxy06d4q/ibxy06d4q_ima.fits,COMPLETE,,
./mastDownload/HST/ibxy06d6q/ibxy06d6q_ima.fits,COMPLETE,,
...,...,...,...
./mastDownload/HST/ibxy07rsq/ibxy07rsq_ima.fits,COMPLETE,,
./mastDownload/HST/ibxy07rtq/ibxy07rtq_ima.fits,COMPLETE,,
./mastDownload/HST/ibxy07ruq/ibxy07ruq_ima.fits,COMPLETE,,


Each file was saved in its own subdirectory, with the following layout:

- mastDownload
  - HST
     + file1
         -  file1_ima.fits
     + file2
         -  file2_ima.fits
     + file3
         -  file3_ima.fits

etc...

Let's now move all these IMA files into a common directory.

In [13]:
file_path = os.path.dirname(os.path.abspath("__file__")) # "__file__" is a plain string here, so this resolves to the current working directory

In [14]:
file_path

'/home/zieba/Desktop/Projects/Open_source/PACMAN/docs/source'

In [15]:
root_dir = file_path + '/mastDownload/HST' # root directory to be searched for .fits files
move_dir = file_path
filelist = []

# list all ima files in the subdirectories
for tree,fol,fils in os.walk(root_dir):
    filelist.extend([os.path.join(tree,fil) for fil in fils if fil.endswith('.fits')])

In [16]:
# all downloaded ima files
print(filelist)

['/home/zieba/Desktop/Projects/Open_source/PACMAN/docs/source/mastDownload/HST/ibxy07rkq/ibxy07rkq_ima.fits', '/home/zieba/Desktop/Projects/Open_source/PACMAN/docs/source/mastDownload/HST/ibxy06ecq/ibxy06ecq_ima.fits', '/home/zieba/Desktop/Projects/Open_source/PACMAN/docs/source/mastDownload/HST/ibxy06fcq/ibxy06fcq_ima.fits', '/home/zieba/Desktop/Projects/Open_source/PACMAN/docs/source/mastDownload/HST/ibxy07qiq/ibxy07qiq_ima.fits', '/home/zieba/Desktop/Projects/Open_source/PACMAN/docs/source/mastDownload/HST/ibxy06f5q/ibxy06f5q_ima.fits', '/home/zieba/Desktop/Projects/Open_source/PACMAN/docs/source/mastDownload/HST/ibxy07qaq/ibxy07qaq_ima.fits', '/home/zieba/Desktop/Projects/Open_source/PACMAN/docs/source/mastDownload/HST/ibxy07rdq/ibxy07rdq_ima.fits', '/home/zieba/Desktop/Projects/Open_source/PACMAN/docs/source/mastDownload/HST/ibxy06esq/ibxy06esq_ima.fits', '/home/zieba/Desktop/Projects/Open_source/PACMAN/docs/source/mastDownload/HST/ibxy06eoq/ibxy06eoq_ima.fits', '/home/zieba/Deskt

In [17]:
for fil in filelist:
    name = os.path.basename(fil)  # file name without the directory part
    os.rename(fil, os.path.join(move_dir, name))

In [18]:
# delete the mastDownload directory
os.system("rm -r {0}".format(file_path + '/mastDownload'))

0
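
The move and clean-up steps above rely on `/` as path separator and on the `rm` shell command. A portable alternative can be sketched with `pathlib` and `shutil`; this is not the code used in the PACMAN script, and `flatten_fits` is a hypothetical helper. The demonstration runs in a temporary directory with empty placeholder files, so it does not touch any real data.

```python
import shutil
import tempfile
from pathlib import Path


def flatten_fits(download_root, dest_dir, suffix="_ima.fits"):
    """Move all files ending in `suffix` from below download_root into dest_dir."""
    download_root = Path(download_root)
    dest_dir = Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    moved = []
    # sorted() collects all matches before any file is moved
    for src in sorted(download_root.rglob("*" + suffix)):
        target = dest_dir / src.name
        if target.exists():
            raise FileExistsError(f"refusing to overwrite {target}")
        shutil.move(str(src), str(target))
        moved.append(target)
    return moved


# Demonstration with empty placeholder files in a throwaway directory
with tempfile.TemporaryDirectory() as tmp:
    tmp = Path(tmp)
    for root in ("ibxy06d0q", "ibxy07q0q"):
        sub = tmp / "mastDownload" / "HST" / root
        sub.mkdir(parents=True)
        (sub / f"{root}_ima.fits").touch()

    moved = flatten_fits(tmp / "mastDownload", tmp)
    shutil.rmtree(tmp / "mastDownload")  # remove the emptied download tree
    moved_names = [p.name for p in moved]
    print(moved_names)
```

In the notebook, the equivalent call would presumably be `flatten_fits(root_dir, move_dir)`, followed by `shutil.rmtree` on the `mastDownload` directory. Refusing to overwrite existing files is a design choice of this sketch, not a behavior of the original cells.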

Now all 158 IMA files are in the same directory as the script, and we can use this directory as the data directory.
Alternatively, you can move these files somewhere else; whichever directory contains all the FITS files will be your "data directory".

Continue with [Stage 00](https://pacmandocs.readthedocs.io/en/latest/stage00.html) to start running PACMAN.