# Download data 

## Using astroquery

### Download observations from MAST using astroquery for a part of an HST program

You can download HST data from MAST with astroquery (https://astroquery.readthedocs.io/en/latest/).

The example below, written as a Jupyter Notebook, shows how to do that.

A script version of this Jupyter Notebook can be found here: https://github.com/sebastian-zieba/PACMAN/blob/master/docs/source/media/download/download_data_astroquery.py

For simplicity, this example analyzes just three visits taken in the middle of the GO13021 program:

Dates (YYYY-MM-DD): 2013-03-13, 2013-03-15, 2013-03-27

If you downloaded all 15 visits in GO13021, you can select the same three visits by setting which_visits = [5,6,7] in the pcf.

**PACMAN currently works only with files that have an ima extension, so these are the ones to select. ima is an intermediate data product; the name stands for calibrated intermediate IR MultiAccum image. From the WFC3 Data Handbook (https://hst-docs.stsci.edu/wfc3dhb/chapter-2-wfc3-data-structure/2-1-types-of-wfc3-files): “For the IR detector, an intermediate MultiAccum (ima) file is the result after all calibrations are applied (dark subtraction, linearity correction, flat fielding, etc.) to all of the individual readouts of the IR exposure.”**

Alternative methods for downloading HST data are described here: https://pacmandocs.readthedocs.io/en/latest/download_data.html

This tutorial is based on the following Jupyter Notebooks:

https://github.com/spacetelescope/MAST-API-Notebooks/blob/master/HST/HST_Data_Access.ipynb

https://github.com/spacetelescope/MAST-API-Notebooks/blob/master/AstroqueryIntro/AstroqueryFunctionalityDemo.ipynb 
    

In [2]:
import numpy as np
import os
from astroquery.mast import Observations

### Set start and end time of the observations in MJD

We use the following start and end times, converted to MJD, to define which files we want to download:

Beginning: 2013-3-13 12:42:48.74     year-month-day hour:minute:second

End: 2013-3-27 23:31:20.93     year-month-day hour:minute:second

In [4]:
t_min_obs =  56364.52973075 - 0.0001
t_max_obs =  56378.980103359994 + 0.0001
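The two MJD values above can be reproduced from the calendar timestamps. The following is a minimal sketch using only the standard library; `utc_to_mjd` is a hypothetical helper (not part of PACMAN or astroquery), and it assumes the timestamps are in UTC. If astropy is installed, `astropy.time.Time(...).mjd` should give equivalent values.

```python
from datetime import datetime

# MJD counts days since 1858-11-17 00:00 UTC.
MJD_EPOCH = datetime(1858, 11, 17)


def utc_to_mjd(timestamp):
    """Convert a 'YYYY-MM-DD HH:MM:SS.ff' UTC string to MJD (hypothetical helper)."""
    dt = datetime.strptime(timestamp, "%Y-%m-%d %H:%M:%S.%f")
    return (dt - MJD_EPOCH).total_seconds() / 86400.0


# Small padding in days (0.0001 d is roughly 9 s) so that exposures starting
# exactly at the window edges are not excluded by rounding.
pad = 0.0001
t_min_obs = utc_to_mjd("2013-03-13 12:42:48.74") - pad
t_max_obs = utc_to_mjd("2013-03-27 23:31:20.93") + pad
print(t_min_obs, t_max_obs)
```

The padding mirrors the `- 0.0001` and `+ 0.0001` terms in the cell above.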

The proposal ID is 13021, and we are only interested in the data taken with HST WFC3/IR.

In [5]:
proposal_obs = Observations.query_criteria(proposal_id=13021,  instrument_name='WFC3/IR', project='HST')
print("Number of observations:",len(proposal_obs))
print(proposal_obs)

Number of observations: 1115
dataproduct_type calib_level obs_collection ... intentType  obsid     objID  
---------------- ----------- -------------- ... ---------- -------- ---------
        spectrum           3            HST ...    science 23901558 139211258
        spectrum           3            HST ...    science 23901554 139211259
        spectrum           3            HST ...    science 23901557 139211260
        spectrum           3            HST ...    science 23901739 139211261
        spectrum           3            HST ...    science 23901737 139211268
        spectrum           3            HST ...    science 23901747 139211277
        spectrum           3            HST ...    science 23901722 139211708
        spectrum           3            HST ...    science 23901745 139211710
        spectrum           3            HST ...    science 23901748 139211712
        spectrum           3            HST ...    science 23901734 139211713
             ...         ...       

### Filter for the desired times

In [6]:
select = (t_min_obs <= proposal_obs['t_min'].value.data) & (proposal_obs['t_min'].value.data <= t_max_obs)

In [7]:
proposal_obs_select = proposal_obs[select]
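One way to sanity-check the time filter is to convert the selected start times back to calendar dates and confirm that only the three intended visit dates remain. This is a sketch under assumptions: `mjd_to_date` and `observing_dates` are hypothetical helpers, and the example values are illustrative rather than taken from the actual query result.

```python
from datetime import datetime, timedelta

MJD_EPOCH = datetime(1858, 11, 17)  # MJD 0 corresponds to 1858-11-17 00:00 UTC


def mjd_to_date(mjd):
    """Return the UTC calendar date of an MJD value (hypothetical helper)."""
    return (MJD_EPOCH + timedelta(days=float(mjd))).date()


def observing_dates(mjd_values):
    """Return the sorted unique calendar dates on which the start times fall."""
    return sorted({mjd_to_date(mjd) for mjd in mjd_values})


# Illustrative start times only. In the notebook you would pass the t_min
# values of proposal_obs_select instead (assumed here to be plain floats
# with no masked entries).
example_t_min = [56364.52973075, 56364.60, 56366.45, 56378.90]
for day in observing_dates(example_t_min):
    print(day.isoformat())
```

If the printed dates differ from 2013-03-13, 2013-03-15, and 2013-03-27, the time window probably needs adjusting.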

In [8]:
data_products = Observations.get_product_list(proposal_obs_select)
print("Number of results:",len(data_products))
print(data_products)

Number of results: 3147
 obsID   obs_collection dataproduct_type ... parent_obsid dataRights calib_level
-------- -------------- ---------------- ... ------------ ---------- -----------
26572598            HST            image ...     24807271     PUBLIC           3
26572598            HST            image ...     24807271     PUBLIC           3
26572598            HST            image ...     24807271     PUBLIC           3
26572598            HST            image ...     24807271     PUBLIC           3
26572598            HST            image ...     24807271     PUBLIC           3
26572592            HST            image ...     24807271     PUBLIC           2
26572592            HST            image ...     24807271     PUBLIC           2
26572592            HST            image ...     24807271     PUBLIC           2
26572592            HST            image ...     24807271     PUBLIC           2
26572592            HST            image ...     24807271     PUBLIC           2
    

### We just want the _ima files

As mentioned before, we are just interested in the ima files.

In [9]:
data_products_ima = data_products[data_products['productSubGroupDescription'] == 'IMA']
print(data_products_ima)

 obsID   obs_collection dataproduct_type ... parent_obsid dataRights calib_level
-------- -------------- ---------------- ... ------------ ---------- -----------
23901914            HST            image ...     24807271     PUBLIC           2
23901915            HST         spectrum ...     23901915     PUBLIC           2
23901916            HST         spectrum ...     23901916     PUBLIC           2
23901917            HST         spectrum ...     23901917     PUBLIC           2
23901918            HST         spectrum ...     23901918     PUBLIC           2
23901919            HST         spectrum ...     23901919     PUBLIC           2
23901920            HST         spectrum ...     23901920     PUBLIC           2
23901921            HST         spectrum ...     23901921     PUBLIC           2
23901922            HST         spectrum ...     23901922     PUBLIC           2
23901923            HST         spectrum ...     23901923     PUBLIC           2
     ...            ...     

### Download the data

In [10]:
# Quick test: uncomment the next line to download only the first 5 files
#data_products_ima = data_products_ima[:5]

In [11]:
Observations.download_products(data_products_ima,mrp_only=False)

Downloading URL https://mast.stsci.edu/api/v0.1/Download/file?uri=mast:HST/product/ibxy06d0q_ima.fits to ./mastDownload/HST/ibxy06d0q/ibxy06d0q_ima.fits ... [Done]
Downloading URL https://mast.stsci.edu/api/v0.1/Download/file?uri=mast:HST/product/ibxy06d1q_ima.fits to ./mastDownload/HST/ibxy06d1q/ibxy06d1q_ima.fits ... [Done]
Downloading URL https://mast.stsci.edu/api/v0.1/Download/file?uri=mast:HST/product/ibxy06d2q_ima.fits to ./mastDownload/HST/ibxy06d2q/ibxy06d2q_ima.fits ... [Done]
Downloading URL https://mast.stsci.edu/api/v0.1/Download/file?uri=mast:HST/product/ibxy06d3q_ima.fits to ./mastDownload/HST/ibxy06d3q/ibxy06d3q_ima.fits ... [Done]
Downloading URL https://mast.stsci.edu/api/v0.1/Download/file?uri=mast:HST/product/ibxy06d4q_ima.fits to ./mastDownload/HST/ibxy06d4q/ibxy06d4q_ima.fits ... [Done]


Local Path,Status,Message,URL
str47,str8,object,object
./mastDownload/HST/ibxy06d0q/ibxy06d0q_ima.fits,COMPLETE,,
./mastDownload/HST/ibxy06d1q/ibxy06d1q_ima.fits,COMPLETE,,
./mastDownload/HST/ibxy06d2q/ibxy06d2q_ima.fits,COMPLETE,,
./mastDownload/HST/ibxy06d3q/ibxy06d3q_ima.fits,COMPLETE,,
./mastDownload/HST/ibxy06d4q/ibxy06d4q_ima.fits,COMPLETE,,
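The table returned by `download_products` (the manifest shown above) can be checked before moving on. Below is a minimal sketch: `incomplete_downloads` is a hypothetical helper that relies only on the `Local Path` and `Status` columns visible in the output, and the stand-in manifest, including its `ERROR` row, is made up for illustration.

```python
def incomplete_downloads(manifest):
    """Return the local paths of all rows whose Status is not 'COMPLETE'.

    `manifest` can be any iterable of rows that support row["Status"] and
    row["Local Path"], for example a list of dicts as used below.
    """
    return [
        str(row["Local Path"])
        for row in manifest
        if str(row["Status"]) != "COMPLETE"
    ]


# Stand-in manifest for illustration; the second row is a made-up failure.
example_manifest = [
    {"Local Path": "./mastDownload/HST/ibxy06d0q/ibxy06d0q_ima.fits", "Status": "COMPLETE"},
    {"Local Path": "./mastDownload/HST/ibxy06d1q/ibxy06d1q_ima.fits", "Status": "ERROR"},
]
print(incomplete_downloads(example_manifest))
```

In the notebook, this would mean assigning the return value of `Observations.download_products(...)` to a variable and passing it to the helper; an empty list means every file reported `COMPLETE`.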


In [12]:
# __file__ is not defined inside a notebook, so use the current working directory
file_path = os.getcwd()

In [13]:
file_path

'/home/zieba/Desktop/Projects/Open_source/PACMAN/docs/source'

In [15]:
root_dir = file_path + '/mastDownload/HST' # Root directory to be searched for the downloaded .fits files.
move_dir = file_path
filelist = []

# list all ima files in the subdirectories
for tree,fol,fils in os.walk(root_dir):
    filelist.extend([os.path.join(tree,fil) for fil in fils if fil.endswith('.fits')])

In [16]:
# all downloaded ima files
print(filelist)

['/home/zieba/Desktop/Projects/Open_source/PACMAN/docs/source/mastDownload/HST/ibxy06d1q/ibxy06d1q_ima.fits',
 '/home/zieba/Desktop/Projects/Open_source/PACMAN/docs/source/mastDownload/HST/ibxy06d0q/ibxy06d0q_ima.fits',
 '/home/zieba/Desktop/Projects/Open_source/PACMAN/docs/source/mastDownload/HST/ibxy06d3q/ibxy06d3q_ima.fits',
 '/home/zieba/Desktop/Projects/Open_source/PACMAN/docs/source/mastDownload/HST/ibxy06d2q/ibxy06d2q_ima.fits',
 '/home/zieba/Desktop/Projects/Open_source/PACMAN/docs/source/mastDownload/HST/ibxy06d4q/ibxy06d4q_ima.fits']

In [17]:
# move every ima file out of its subdirectory into move_dir
for fil in filelist:
    name = os.path.basename(fil)
    os.rename(fil, os.path.join(move_dir, name))

In [18]:
# delete the mastDownload directory
os.system("rm -r {0}".format(file_path + '/mastDownload'))

0
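The last few cells depend on the Unix `rm` command. A more portable sketch of the same move-and-clean-up step, using only `pathlib` and `shutil`, is shown here. `collect_ima_files` is a hypothetical helper, not part of PACMAN, and it assumes the destination directory lies outside the download directory.

```python
import shutil
from pathlib import Path


def collect_ima_files(download_dir, dest_dir):
    """Move every *_ima.fits file below download_dir into dest_dir,
    then delete download_dir. Returns the new file paths."""
    download_dir, dest_dir = Path(download_dir), Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    moved = []
    # sorted() collects the full file list before anything is moved
    for src in sorted(download_dir.rglob("*_ima.fits")):
        target = dest_dir / src.name
        shutil.move(str(src), str(target))
        moved.append(target)
    shutil.rmtree(download_dir)  # remove the now unneeded directory tree
    return moved


# Only act if a mastDownload directory exists in the current working directory.
download_dir = Path.cwd() / "mastDownload"
if download_dir.is_dir():
    for path in collect_ima_files(download_dir, Path.cwd()):
        print(path)
```

Like the cells above, this removes the whole `mastDownload` directory, so any non-ima files left in it are deleted as well.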