# STARSKØPE

## Part III: Scraping the MAST API

Kepler observed parts of a 10 by 10 degree patch of sky near the constellation Cygnus for four years (17 three-month quarters) starting in 2009. The mission downloaded small sections of the sky at a 30-minute (long) cadence and a 1-minute (short) cadence in order to measure the variability of stars and find planets transiting them. These data are now available in the public S3 bucket s3://stpubdata/kepler/public on AWS.


https://mast-labs.stsci.io/

These data are available under the same terms as the public datasets for Hubble and TESS: data access is free if you compute against the data from within the AWS US-East region.

This script queries MAST for TESS FFI data for a single sector/camera/chip combination and downloads the data from the AWS public dataset rather than from MAST servers.

In [None]:
# Working with http://astroquery.readthedocs.io/en/latest/mast/mast.html
# Make sure you're running the latest version of Astroquery:
# pip install https://github.com/astropy/astroquery/archive/master.zip
# or !pip install astroquery

In [2]:
#!pip install astroquery

In [None]:
# !pip install awscli

In [7]:
import pandas as pd
import numpy as np
import os
from astroquery.mast import Observations
from astroquery.mast import Catalogs

import boto3

### Google Colab

In [4]:
from google.colab import drive
drive.mount('/gdrive',force_remount=True)

In [5]:
%cd '/gdrive/My Drive/'
%mkdir config
%pwd

In [None]:
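# Placeholder credentials: replace the <...> values with your own before running,
# and never commit real keys to version control.
text = '''
[default]
aws_access_key_id = <your access key id>
aws_secret_access_key = <your secret access key>
region = <your region>
'''
path = "/gdrive/My Drive/config/awscli.ini"
with open(path, 'w') as f:
    f.write(text)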
# !cat /gdrive/My\ Drive/config/awscli.ini
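
The credentials file written above uses an INI layout with a `[default]` profile. As a minimal sketch, the same file can be produced with the standard-library `configparser` instead of a hand-formatted string. The helper name `write_aws_credentials` is hypothetical and not part of the original notebook, and the key values passed to it are placeholders you supply yourself.

```python
import configparser
from pathlib import Path


def write_aws_credentials(path, access_key_id, secret_access_key, region=None):
    """Write a [default] profile to an INI-style AWS credentials file."""
    # interpolation=None keeps characters such as '%' in a key literal
    config = configparser.ConfigParser(interpolation=None)
    config["default"] = {
        "aws_access_key_id": access_key_id,
        "aws_secret_access_key": secret_access_key,
    }
    if region is not None:
        config["default"]["region"] = region

    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)  # create config/ if missing
    with path.open("w") as f:
        config.write(f)
    return path
```

Calling `write_aws_credentials("/gdrive/My Drive/config/awscli.ini", key_id, secret)` would then replace the `open`/`write` pair, with `key_id` and `secret` read from wherever you keep them.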

In [None]:
# `!export` runs in a throwaway subshell and does not affect this kernel,
# so the variable is set through os.environ instead.
os.environ['AWS_SHARED_CREDENTIALS_FILE'] = path
print(os.environ['AWS_SHARED_CREDENTIALS_FILE'])

/gdrive/My Drive/config/awscli.ini


In [None]:
%cd '/gdrive/My Drive/Colab Notebooks/starskope'

### Jupyter Lab

In [7]:
#%mkdir config

In [53]:
# text = '''
# [default]
# aws_access_key_id = <access_id>
# aws_secret_access_key = <access_key>
# aws_session_token= <token>
#'''

In [8]:
path = "./config/awscli.ini"
# with open(path, 'w') as f:
#    f.write(text)

In [9]:
# `!export` runs in a throwaway subshell and does not affect this kernel,
# so the variable is set through os.environ instead.
os.environ['AWS_SHARED_CREDENTIALS_FILE'] = path
print(os.environ['AWS_SHARED_CREDENTIALS_FILE'])

./config/awscli.ini
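
Because the Jupyter Lab path is relative, it stops resolving as soon as a later `%cd` changes the working directory. The following is a minimal sketch of a guard against that. The helper name `use_credentials_file` is hypothetical. It stores the absolute path in the environment and fails early if the file is missing.

```python
import os
from pathlib import Path


def use_credentials_file(path):
    """Point boto3 at a credentials file via AWS_SHARED_CREDENTIALS_FILE."""
    resolved = Path(path).expanduser().resolve()
    if not resolved.is_file():
        raise FileNotFoundError(f"No credentials file at {resolved}")
    # Store the absolute path so later directory changes do not break it
    os.environ["AWS_SHARED_CREDENTIALS_FILE"] = str(resolved)
    return resolved
```

With this in place, `use_credentials_file("./config/awscli.ini")` could stand in for the assignment to `os.environ` in the cell above.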


In [10]:
# s3://stpubdata/kepler/public

region = 'us-east-1'
s3 = boto3.resource('s3', region_name=region)
bucket = s3.Bucket('stpubdata')
location = {'LocationConstraint': region}


Cloud data access is enabled with the enable_cloud_dataset function, which makes AWS the preferred source for data access until it is disabled (disable_cloud_dataset).

To directly access a list of cloud URIs for a given dataset, use the get_cloud_uris function. When cloud access is enabled, the standard download function download_products will preferentially pull files from AWS when they are available. There is also a cloud_only flag which, when set to True, causes all data products not available in the cloud to be skipped.
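
The enable, look up, disable sequence described here can be sketched as a small wrapper. The function name `fetch_cloud_uris` is hypothetical. The `observations` argument is expected to behave like `astroquery.mast.Observations`, and it is passed in rather than imported so the sketch can be read and exercised without network access. Filtering out `None` entries is an assumption about how products without a cloud copy are reported.

```python
def fetch_cloud_uris(observations, products, provider="AWS"):
    """Return the cloud URIs for `products`, dropping entries with no cloud copy.

    `observations` is expected to behave like astroquery.mast.Observations.
    """
    observations.enable_cloud_dataset(provider=provider)
    try:
        uris = observations.get_cloud_uris(products)
    finally:
        # Switch back to the MAST servers even if the lookup fails
        observations.disable_cloud_dataset()
    # Assumption: products without a cloud copy may come back as None
    return [uri for uri in uris if uri is not None]


# Usage (requires astroquery and network access):
# from astroquery.mast import Observations
# s3_uris = fetch_cloud_uris(Observations, filtered)
```

Unlike the cells in this notebook, which leave cloud access switched on, the wrapper disables it again on exit. That is a design choice for keeping the session state predictable, not a requirement of the library.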

In [11]:
Observations.enable_cloud_dataset(provider='AWS', profile='default')

INFO: Using the S3 STScI public dataset [astroquery.mast.cloud]
INFO: See Request Pricing in https://aws.amazon.com/s3/pricing/ for details [astroquery.mast.cloud]
INFO: If you have not configured boto3, follow the instructions here: https://boto3.readthedocs.io/en/latest/guide/configuration.html [astroquery.mast.cloud]




In [12]:
# Example catalog query
catalog_data = Catalogs.query_criteria(catalog="Tic",Bmag=[30,50],objType="STAR")
print(catalog_data)

    ID    version  HIP TYC ...     e_Dec_orig     raddflag wdflag   objID   
--------- -------- --- --- ... ------------------ -------- ------ ----------
463721073 20190415  --  -- ...  0.489828592248652       -1      1  710312391
261459129 20190415  --  -- ...  0.200397148604244        1      0 1701625107
282391528 20190415  --  -- ...   0.47766300834538        0      0  574723760
125414201 20190415  --  -- ...   0.22398993783274        1      0  579825329
 81609218 20190415  --  -- ...  0.146788572369267        1      0  630541794
260216294 20190415  --  -- ...  0.187170498094167        1      0  683390717
123585000 20190415  --  -- ...  0.618316068787371        0      0  574511442
 64575709 20190415  --  -- ...   0.21969663115091        1      0  595775997
 94322581 20190415  --  -- ...  0.205286802302475        1      0  606092549
282024596 20190415  --  -- ...  0.548806522539047        1      0  573765450
406300991 20190415  --  -- ... 0.0518318978617112        0      0 1411465651

In [13]:
print(Observations.list_missions())

['BEFS', 'EUVE', 'FUSE', 'GALEX', 'HLA', 'HLSP', 'HST', 'HUT', 'IUE', 'JWST', 'K2', 'K2FFI', 'Kepler', 'KeplerFFI', 'PS1', 'SPITZER_SHA', 'SWIFT', 'TESS', 'TUES', 'WUPPE']


In [14]:
# Query TESS observations from sector 1 (sequence_number=1)
obs_table = Observations.query_criteria(obs_collection=['TESS'], filters='TESS', sequence_number=1)
obs_table[:5]

dataproduct_type,calib_level,obs_collection,obs_id,target_name,s_ra,s_dec,t_min,t_max,t_exptime,wavelength_region,filters,em_min,em_max,target_classification,obs_title,t_obs_release,instrument_name,proposal_pi,proposal_id,proposal_type,project,sequence_number,provenance_name,s_region,jpegURL,dataURL,dataRights,mtFlag,srcDen,intentType,obsid,objID
str10,int64,str4,str47,str9,float64,float64,float64,float64,float64,str7,str4,float64,float64,str1,str1,float64,str10,str14,str55,str1,str4,int64,str4,str111,str1,str73,str6,bool,float64,str7,str11,str11
timeseries,3,TESS,tess2018206045859-s0001-0000000029437243-0120-s,29437243,313.835773,-28.839668,58324.79285543,58352.6761776,120.0,Optical,TESS,600.0,1000.0,--,--,58458.58333,Photometer,"Ricker, George",,--,TESS,1,SPOC,CIRCLE ICRS 313.83577300 -28.83966800 0.00138889,--,mast:TESS/product/tess2018206045859-s0001-0000000029437243-0120-s_lc.fits,PUBLIC,False,,science,17000015080,17001681432
timeseries,3,TESS,tess2018206045859-s0001-0000000092451314-0120-s,92451314,314.623836,-31.705534,58324.79285543,58352.6761776,120.0,Optical,TESS,600.0,1000.0,--,--,58458.58333,Photometer,"Ricker, George",,--,TESS,1,SPOC,CIRCLE ICRS 314.62383600 -31.70553400 0.00138889,--,mast:TESS/product/tess2018206045859-s0001-0000000092451314-0120-s_lc.fits,PUBLIC,False,,science,17000003516,17001681814
timeseries,3,TESS,tess2018206045859-s0001-0000000206505336-0120-s,206505336,332.424436,-19.889275,58324.79285543,58352.6761776,120.0,Optical,TESS,600.0,1000.0,--,--,58458.58333,Photometer,"Ricker, George",G011250,--,TESS,1,SPOC,CIRCLE ICRS 332.42443600 -19.88927500 0.00138889,--,mast:TESS/product/tess2018206045859-s0001-0000000206505336-0120-s_lc.fits,PUBLIC,False,,science,17000014152,17001682620
timeseries,3,TESS,tess2018206045859-s0001-0000000260161144-0120-s,260161144,92.785389,-58.287688,58324.79285543,58352.6761776,120.0,Optical,TESS,600.0,1000.0,--,--,58458.58333,Photometer,"Ricker, George",G011250,--,TESS,1,SPOC,CIRCLE ICRS 92.78538900 -58.28768800 0.00138889,--,mast:TESS/product/tess2018206045859-s0001-0000000260161144-0120-s_lc.fits,PUBLIC,False,,science,17000009846,17001683256
timeseries,3,TESS,tess2018206045859-s0001-0000000355544723-0120-s,355544723,320.36052100000006,-54.076302,58324.79285543,58352.6761776,120.0,Optical,TESS,600.0,1000.0,--,--,58458.58333,Photometer,"Ricker, George",G011123,--,TESS,1,SPOC,CIRCLE ICRS 320.36052100 -54.07630200 0.00138889,--,mast:TESS/product/tess2018206045859-s0001-0000000355544723-0120-s_lc.fits,PUBLIC,False,,science,17000005066,17001684176


In [None]:
# Confirmed Exoplanets from K2:
# The planetary system K2-62 hosts at least 2 planets.
# K2-62bc

In [23]:
# Query K2 observations of the K2-62 system
obs_table = Observations.query_criteria(obs_collection=['K2'],
                                        objectname="K2-62",
                                        filters='KEPLER',
                                        provenance_name='K2',
                                        )

In [26]:
obs_table[:5]

dataproduct_type,calib_level,obs_collection,obs_id,target_name,s_ra,s_dec,t_min,t_max,t_exptime,wavelength_region,filters,em_min,em_max,target_classification,obs_title,t_obs_release,instrument_name,proposal_pi,proposal_id,proposal_type,project,sequence_number,provenance_name,s_region,jpegURL,dataURL,dataRights,mtFlag,srcDen,intentType,obsid,objID
str10,int64,str4,str47,str9,float64,float64,float64,float64,float64,str7,str4,float64,float64,str1,str1,float64,str10,str14,str55,str1,str4,int64,str4,str111,str1,str73,str6,bool,float64,str7,str11,str11
timeseries,3,TESS,tess2018206045859-s0001-0000000079403675-0120-s,79403675,320.208121,-53.034181,58324.79285543,58352.6761776,120.0,Optical,TESS,600.0,1000.0,--,--,58458.58333,Photometer,"Ricker, George",G011175_G011025_G011178_G011176_G011250,--,TESS,1,SPOC,CIRCLE ICRS 320.20812100 -53.03418100 0.00138889,--,mast:TESS/product/tess2018206045859-s0001-0000000079403675-0120-s_lc.fits,PUBLIC,False,,science,17000001837,17001681698
timeseries,3,TESS,tess2018206045859-s0001-0000000092501872-0120-s,92501872,314.93729800000006,-30.698645,58324.79285543,58352.6761776,120.0,Optical,TESS,600.0,1000.0,--,--,58458.58333,Photometer,"Ricker, George",,--,TESS,1,SPOC,CIRCLE ICRS 314.93729800 -30.69864500 0.00138889,--,mast:TESS/product/tess2018206045859-s0001-0000000092501872-0120-s_lc.fits,PUBLIC,False,,science,17000013149,17001681822
timeseries,3,TESS,tess2018206045859-s0001-0000000092708508-0120-s,92708508,316.4563069999999,-30.797143,58324.79285543,58352.6761776,120.0,Optical,TESS,600.0,1000.0,--,--,58458.58333,Photometer,"Ricker, George",,--,TESS,1,SPOC,CIRCLE ICRS 316.45630700 -30.79714300 0.00138889,--,mast:TESS/product/tess2018206045859-s0001-0000000092708508-0120-s_lc.fits,PUBLIC,False,,science,17000009865,17001681876
timeseries,3,TESS,tess2018206045859-s0001-0000000149308175-0120-s,149308175,83.492523,-63.90236,58324.79285543,58352.6761776,120.0,Optical,TESS,600.0,1000.0,--,--,58458.58333,Photometer,"Ricker, George",,--,TESS,1,SPOC,CIRCLE ICRS 83.49252300 -63.90236000 0.00138889,--,mast:TESS/product/tess2018206045859-s0001-0000000149308175-0120-s_lc.fits,PUBLIC,False,,science,17000008818,17001682282
timeseries,3,TESS,tess2018206045859-s0001-0000000207081058-0120-s,207081058,331.86723200000006,-41.815456,58324.79285543,58352.6761776,120.0,Optical,TESS,600.0,1000.0,--,--,58458.58333,Photometer,"Ricker, George",G011278_G011211,--,TESS,1,SPOC,CIRCLE ICRS 331.86723200 -41.81545600 0.00138889,--,mast:TESS/product/tess2018206045859-s0001-0000000207081058-0120-s_lc.fits,PUBLIC,False,,science,17000001355,17001682658


In [37]:
obs_table['obsid'][0]

'9500053094'

In [38]:
products = Observations.get_product_list(obs_table)

In [39]:
products[:5]

obsID,obs_collection,dataproduct_type,obs_id,description,type,dataURI,productType,productGroupDescription,productSubGroupDescription,productDocumentationURL,project,prvversion,proposal_id,productFilename,size,parent_obsid
str10,str2,str10,str20,str38,str1,str96,str9,str28,str8,str1,str2,str2,str27,str34,int64,str10
9500052804,K2,timeseries,ktwo206089508-c03_lc,Preview-Full,S,mast:K2/url/missions/k2/previews/c3/206000000/89000/ktwo206089508-c03_llc_bw_large.png,PREVIEW,--,--,--,K2,26,GO3038_GO3054,ktwo206089508-c03_llc_bw_large.png,16906,9500052804
9500052804,K2,timeseries,ktwo206089508-c03_lc,Lightcurve Long Cadence (KLC) - C03,S,mast:K2/url/missions/k2/lightcurves/c3/206000000/89000/ktwo206089508-c03_llc.fits,SCIENCE,Minimum Recommended Products,LLC,--,K2,26,GO3038_GO3054,ktwo206089508-c03_llc.fits,368640,9500052804
9500052804,K2,timeseries,ktwo206089508-c03_lc,Target Pixel Long Cadence (KTL) - C03,S,mast:K2/url/missions/k2/target_pixel_files/c3/206000000/89000/ktwo206089508-c03_lpd-targ.fits.gz,SCIENCE,Minimum Recommended Products,LPD-TARG,--,K2,26,GO3038_GO3054,ktwo206089508-c03_lpd-targ.fits.gz,5135319,9500052804
9500052804,K2,timeseries,ktwo206089508-c03_lc,Preview-Thumb,S,mast:K2/url/missions/k2/previews/c3/206000000/89000/ktwo206089508-c03_llc_bw_thumb.png,THUMBNAIL,--,--,--,K2,26,GO3038_GO3054,ktwo206089508-c03_llc_bw_thumb.png,1586,9500052804
9500052892,K2,timeseries,ktwo206092110-c03_lc,Preview-Full,S,mast:K2/url/missions/k2/previews/c3/206000000/92000/ktwo206092110-c03_llc_bw_large.png,PREVIEW,--,--,--,K2,26,GO3069_GO3107,ktwo206092110-c03_llc_bw_large.png,22172,9500052892


In [None]:
# Download the products, skipping any that are not available in the cloud
#manifest = Observations.download_products(products, cloud_only=True)

In [42]:
# LLC (lightcurve long cadence)
llc = Observations.filter_products(products,
                                        productSubGroupDescription='LLC')
llc[:5]

obsID,obs_collection,dataproduct_type,obs_id,description,type,dataURI,productType,productGroupDescription,productSubGroupDescription,productDocumentationURL,project,prvversion,proposal_id,productFilename,size,parent_obsid
str10,str2,str10,str20,str38,str1,str96,str9,str28,str8,str1,str2,str2,str27,str34,int64,str10
9500052804,K2,timeseries,ktwo206089508-c03_lc,Lightcurve Long Cadence (KLC) - C03,S,mast:K2/url/missions/k2/lightcurves/c3/206000000/89000/ktwo206089508-c03_llc.fits,SCIENCE,Minimum Recommended Products,LLC,--,K2,26,GO3038_GO3054,ktwo206089508-c03_llc.fits,368640,9500052804
9500052892,K2,timeseries,ktwo206092110-c03_lc,Lightcurve Long Cadence (KLC) - C03,S,mast:K2/url/missions/k2/lightcurves/c3/206000000/92000/ktwo206092110-c03_llc.fits,SCIENCE,Minimum Recommended Products,LLC,--,K2,26,GO3069_GO3107,ktwo206092110-c03_llc.fits,368640,9500052892
9500052909,K2,timeseries,ktwo206092615-c03_lc,Lightcurve Long Cadence (KLC) - C03,S,mast:K2/url/missions/k2/lightcurves/c3/206000000/92000/ktwo206092615-c03_llc.fits,SCIENCE,Minimum Recommended Products,LLC,--,K2,26,GO3069,ktwo206092615-c03_llc.fits,368640,9500052909
9500052922,K2,timeseries,ktwo206093036-c03_lc,Lightcurve Long Cadence (KLC) - C03,S,mast:K2/url/missions/k2/lightcurves/c3/206000000/93000/ktwo206093036-c03_llc.fits,SCIENCE,Minimum Recommended Products,LLC,--,K2,26,GO3051_GO3081,ktwo206093036-c03_llc.fits,368640,9500052922
9500052942,K2,timeseries,ktwo206093540-c03_lc,Lightcurve Long Cadence (KLC) - C03,S,mast:K2/url/missions/k2/lightcurves/c3/206000000/93000/ktwo206093540-c03_llc.fits,SCIENCE,Minimum Recommended Products,LLC,--,K2,26,GO3051_GO3069_GO3106_GO3107,ktwo206093540-c03_llc.fits,368640,9500052942


In [43]:
# LPD-TARG (target pixel long cadence)
tplc = Observations.filter_products(products,
                                        productSubGroupDescription='LPD-TARG')
tplc[5:]

obsID,obs_collection,dataproduct_type,obs_id,description,type,dataURI,productType,productGroupDescription,productSubGroupDescription,productDocumentationURL,project,prvversion,proposal_id,productFilename,size,parent_obsid
str10,str2,str10,str20,str38,str1,str96,str9,str28,str8,str1,str2,str2,str27,str34,int64,str10
9500052969,K2,timeseries,ktwo206094039-c03_lc,Target Pixel Long Cadence (KTL) - C03,S,mast:K2/url/missions/k2/target_pixel_files/c3/206000000/94000/ktwo206094039-c03_lpd-targ.fits.gz,SCIENCE,Minimum Recommended Products,LPD-TARG,--,K2,26,GO3051,ktwo206094039-c03_lpd-targ.fits.gz,6027384,9500052969
9500052972,K2,timeseries,ktwo206094098-c03_lc,Target Pixel Long Cadence (KTL) - C03,S,mast:K2/url/missions/k2/target_pixel_files/c3/206000000/94000/ktwo206094098-c03_lpd-targ.fits.gz,SCIENCE,Minimum Recommended Products,LPD-TARG,--,K2,26,GO3051,ktwo206094098-c03_lpd-targ.fits.gz,6719998,9500052972
9500052979,K2,timeseries,ktwo206094342-c03_lc,Target Pixel Long Cadence (KTL) - C03,S,mast:K2/url/missions/k2/target_pixel_files/c3/206000000/94000/ktwo206094342-c03_lpd-targ.fits.gz,SCIENCE,Minimum Recommended Products,LPD-TARG,--,K2,26,GO3051,ktwo206094342-c03_lpd-targ.fits.gz,8609670,9500052979
9500052992,K2,timeseries,ktwo206094605-c03_lc,Target Pixel Long Cadence (KTL) - C03,S,mast:K2/url/missions/k2/target_pixel_files/c3/206000000/94000/ktwo206094605-c03_lpd-targ.fits.gz,SCIENCE,Minimum Recommended Products,LPD-TARG,--,K2,26,GO3023_GO3054_GO3081_GO3104,ktwo206094605-c03_lpd-targ.fits.gz,22883051,9500052992
9500053009,K2,timeseries,ktwo206095133-c03_lc,Target Pixel Long Cadence (KTL) - C03,S,mast:K2/url/missions/k2/target_pixel_files/c3/206000000/95000/ktwo206095133-c03_lpd-targ.fits.gz,SCIENCE,Minimum Recommended Products,LPD-TARG,--,K2,26,GO3048,ktwo206095133-c03_lpd-targ.fits.gz,11435653,9500053009
9500053044,K2,timeseries,ktwo206096022-c03_lc,Target Pixel Long Cadence (KTL) - C03,S,mast:K2/url/missions/k2/target_pixel_files/c3/206000000/96000/ktwo206096022-c03_lpd-targ.fits.gz,SCIENCE,Minimum Recommended Products,LPD-TARG,--,K2,26,GO3054_GO3104,ktwo206096022-c03_lpd-targ.fits.gz,5979559,9500053044
9500053070,K2,timeseries,ktwo206096602-c03_lc,Target Pixel Long Cadence (KTL) - C03,S,mast:K2/url/missions/k2/target_pixel_files/c3/206000000/96000/ktwo206096602-c03_lpd-targ.fits.gz,SCIENCE,Minimum Recommended Products,LPD-TARG,--,K2,26,GO3051_GO3054_GO3104,ktwo206096602-c03_lpd-targ.fits.gz,6247976,9500053070
9500053071,K2,timeseries,ktwo206096692-c03_lc,Target Pixel Long Cadence (KTL) - C03,S,mast:K2/url/missions/k2/target_pixel_files/c3/206000000/96000/ktwo206096692-c03_lpd-targ.fits.gz,SCIENCE,Minimum Recommended Products,LPD-TARG,--,K2,26,GO3048,ktwo206096692-c03_lpd-targ.fits.gz,11823223,9500053071
9500053094,K2,timeseries,ktwo206097453-c03_lc,Target Pixel Long Cadence (KTL) - C03,S,mast:K2/url/missions/k2/target_pixel_files/c3/206000000/97000/ktwo206097453-c03_lpd-targ.fits.gz,SCIENCE,Minimum Recommended Products,LPD-TARG,--,K2,26,GO3054_GO3104,ktwo206097453-c03_lpd-targ.fits.gz,6108393,9500053094
9500053128,K2,timeseries,ktwo206098619-c03_lc,Target Pixel Long Cadence (KTL) - C03,S,mast:K2/url/missions/k2/target_pixel_files/c3/206000000/98000/ktwo206098619-c03_lpd-targ.fits.gz,SCIENCE,Minimum Recommended Products,LPD-TARG,--,K2,26,GO3051,ktwo206098619-c03_lpd-targ.fits.gz,6133412,9500053128


In [15]:
def obs_uris(obs_collection=['K2'], objectname='K2-62', filters='KEPLER', provenance_name='K2', prod_subgroup='LLC'):
    """
    Return a filtered list of S3 URIs for downloading products from the MAST archive.
    Cloud access must be enabled first (Observations.enable_cloud_dataset).
    """
    # Getting the cloud URIs
    obs_table = Observations.query_criteria(obs_collection=obs_collection,
                                        objectname=objectname,
                                        filters=filters,
                                        provenance_name=provenance_name)
    products = Observations.get_product_list(obs_table)
    filtered = Observations.filter_products(products,
                                        productSubGroupDescription=prod_subgroup)
    return  Observations.get_cloud_uris(filtered)

In [17]:
# llc_uris = obs_uris()

In [18]:
# # Getting the cloud URIs
# obs_table = Observations.query_criteria(obs_collection=['K2'],
#                                         objectname="K2-62",
#                                         filters='KEPLER',
#                                         provenance_name='K2')
# products = Observations.get_product_list(obs_table)
# filtered = Observations.filter_products(products,
#                                         productSubGroupDescription='LLC')
# s3_uris = Observations.get_cloud_uris(filtered)
# print(s3_uris)

ClientError: An error occurred (ForbiddenException) when calling the GetRoleCredentials operation: The requested role with name AWSAdministratorAccess does not exist

In [None]:
len(s3_uris)

23

In [None]:
#%mkdir -p data/mast/k2_c3_62
%cd data/mast/k2_c3_62

/gdrive/My Drive/Colab Notebooks/starskope/data/mast/k2_c3_62


In [None]:
# Download every file (use s3_uris[0:3] instead to fetch only the first few)
for url in s3_uris:
   # Extract the S3 key from the S3 URL
   fits_s3_key = url.replace("s3://stpubdata/", "")
   root = url.split('/')[-1]
   bucket.download_file(fits_s3_key, root, ExtraArgs={"RequestPayer": "requester"})
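
The loop above derives the S3 key and the local filename from each URI with string operations. As a minimal sketch, the same split can be done with `urllib.parse`, which also rejects anything that is not an `s3://` URI. The helper name `split_s3_uri` is hypothetical.

```python
from urllib.parse import urlparse


def split_s3_uri(uri):
    """Split 's3://bucket/key/parts/file' into (bucket, key, filename)."""
    parsed = urlparse(uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"Not an S3 URI: {uri!r}")
    key = parsed.path.lstrip("/")      # object key inside the bucket
    filename = key.rsplit("/", 1)[-1]  # last path component
    return parsed.netloc, key, filename
```

For a URI under `s3://stpubdata/`, the returned key and filename correspond to `fits_s3_key` and `root` in the loop.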

In [None]:
%ls

ktwo206089508-c03_llc.fits  ktwo206096692-c03_llc.fits
ktwo206092110-c03_llc.fits  ktwo206097453-c03_llc.fits
ktwo206092615-c03_llc.fits  ktwo206098619-c03_llc.fits
ktwo206093036-c03_llc.fits  ktwo206098990-c03_llc.fits
ktwo206093540-c03_llc.fits  ktwo206099456-c03_llc.fits
ktwo206094039-c03_llc.fits  ktwo206099582-c03_llc.fits
ktwo206094098-c03_llc.fits  ktwo206099965-c03_llc.fits
ktwo206094342-c03_llc.fits  ktwo206100060-c03_llc.fits
ktwo206094605-c03_llc.fits  ktwo206102898-c03_llc.fits
ktwo206095133-c03_llc.fits  ktwo206103033-c03_llc.fits
ktwo206096022-c03_llc.fits  ktwo212235329-c03_llc.fits
ktwo206096602-c03_llc.fits


In [None]:
# K2-1 has 4 confirmed planets
obs_table = Observations.query_criteria(obs_collection=['K2'],
                                        objectname="K2-1",
                                        filters='KEPLER',
                                        provenance_name='K2')
products = Observations.get_product_list(obs_table)
filtered = Observations.filter_products(products,
                                        productSubGroupDescription='LLC')

In [None]:
s3_uris = Observations.get_cloud_uris(filtered)
print(s3_uris)

['s3://stpubdata/k2/public/lightcurves/c19/246300000/67000/ktwo246367108-c19_llc.fits', 's3://stpubdata/k2/public/lightcurves/c19/246300000/67000/ktwo246367153-c19_llc.fits', 's3://stpubdata/k2/public/lightcurves/c19/246300000/67000/ktwo246367299-c19_llc.fits', 's3://stpubdata/k2/public/lightcurves/c19/246300000/67000/ktwo246367757-c19_llc.fits', 's3://stpubdata/k2/public/lightcurves/c19/246300000/67000/ktwo246367814-c19_llc.fits', 's3://stpubdata/k2/public/lightcurves/c19/246300000/69000/ktwo246369514-c19_llc.fits', 's3://stpubdata/k2/public/lightcurves/c19/246300000/69000/ktwo246369828-c19_llc.fits', 's3://stpubdata/k2/public/lightcurves/c12/246300000/71000/ktwo246371559-c12_llc.fits', 's3://stpubdata/k2/public/lightcurves/c19/246300000/72000/ktwo246372474-c19_llc.fits', 's3://stpubdata/k2/public/lightcurves/c12/246300000/73000/ktwo246373487-c12_llc.fits', 's3://stpubdata/k2/public/lightcurves/c12/246300000/75000/ktwo246375278-c12_llc.fits', 's3://stpubdata/k2/public/lightcurves/c12/

In [None]:
len(s3_uris)

29

In [None]:
%cd ../
%mkdir k2_1
%cd k2_1

/gdrive/My Drive/Colab Notebooks/starskope/data/mast
/gdrive/My Drive/Colab Notebooks/starskope/data/mast/k2_1


In [None]:
for url in s3_uris:
   # Extract the S3 key from the S3 URL
   fits_s3_key = url.replace("s3://stpubdata/", "")
   root = url.split('/')[-1]
   bucket.download_file(fits_s3_key, root, ExtraArgs={"RequestPayer": "requester"})
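
Since the same download loop appears for each target, it can be wrapped in a function. This is a sketch under stated assumptions: the name `download_uris` is hypothetical, `bucket` is expected to expose `download_file` in the way the boto3 `Bucket` used above does, and files are written into an explicit directory rather than relying on `%cd`.

```python
from pathlib import Path


def download_uris(bucket, s3_uris, dest_dir, prefix="s3://stpubdata/"):
    """Download each URI into dest_dir, skipping files that already exist."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    downloaded = []
    for uri in s3_uris:
        key = uri.replace(prefix, "", 1)    # S3 key relative to the bucket
        target = dest / uri.split("/")[-1]  # local file named after the object
        if target.exists():
            continue  # already fetched in an earlier run
        bucket.download_file(key, str(target),
                             ExtraArgs={"RequestPayer": "requester"})
        downloaded.append(target)
    return downloaded
```

A call such as `download_uris(bucket, s3_uris, "data/mast/k2_1")` would then replace both the `%mkdir`/`%cd` cell and the loop, and re-running it would not fetch files twice.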

In [None]:
%ls

ktwo246367108-c19_llc.fits  ktwo246377449-c12_llc.fits
ktwo246367153-c19_llc.fits  ktwo246378021-c12_llc.fits
ktwo246367299-c19_llc.fits  ktwo246378180-c12_llc.fits
ktwo246367757-c19_llc.fits  ktwo246378745-c12_llc.fits
ktwo246367814-c19_llc.fits  ktwo246378979-c12_llc.fits
ktwo246369514-c19_llc.fits  ktwo246380361-c12_llc.fits
ktwo246369828-c19_llc.fits  ktwo246380497-c12_llc.fits
ktwo246371559-c12_llc.fits  ktwo246380594-c12_llc.fits
ktwo246372474-c19_llc.fits  ktwo246381420-c12_llc.fits
ktwo246373487-c12_llc.fits  ktwo246381842-c12_llc.fits
ktwo246375278-c12_llc.fits  ktwo246382010-c12_llc.fits
ktwo246375295-c12_llc.fits  ktwo246382136-c12_llc.fits
ktwo246375730-c19_llc.fits  ktwo246383792-c12_llc.fits
ktwo246376758-c12_llc.fits  ktwo246384368-c12_llc.fits
ktwo246377341-c12_llc.fits
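
The listing mixes files from two campaigns (`c12` and `c19`). A minimal sketch for grouping downloaded light curves by campaign follows. The pattern is inferred from the filenames shown in this notebook and reads the number after `ktwo` as the target's catalog ID. The names `K2_LLC_PATTERN` and `group_by_campaign` are hypothetical.

```python
import re
from collections import defaultdict

# Inferred from the listings above: ktwo<target id>-c<campaign>_llc.fits
K2_LLC_PATTERN = re.compile(r"ktwo(\d+)-c(\d+)_llc\.fits$")


def group_by_campaign(filenames):
    """Map campaign label (e.g. '12') to the list of target IDs found in it."""
    groups = defaultdict(list)
    for name in filenames:
        match = K2_LLC_PATTERN.search(name)
        if match is None:
            continue  # not a long-cadence light curve file
        target_id, campaign = match.groups()
        groups[campaign].append(int(target_id))
    return dict(groups)
```

The campaign label is kept as a string so that any zero padding or multi-part labels in the filenames are preserved as written.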


In [19]:
Observations.disable_cloud_dataset()

# Get all K2 Confirmed Planet Targets

In [None]:
K2_confirmed_targets = ['K2-21','K2-28', 'K2-39', 'K2-54','K2-55', 'K2-57','K2-58','K2-59',
                        'K2-60', 'K2-61', 'K2-63', 'K2-64', 'K2-65', 'K2-66', 'K2-68',
                        'K2-70','K2-71','K2-72','K2-73','K2-74','K2-75','K2-76','K2-116',
                        'K2-167','K2-168','K2-169','K2-170','K2-171','K2-172']
# K2-1b
# K2-21bc
# K2-28b
# K2-39b
# K2-54b
# K2-55b
# K2-57b
# K2-58bcd
# K2-59bc
# K2-60b
# K2-61b
# K2-62b
# K2-63b
# K2-64b
# K2-65b
# K2-66b
# K2-68b
# K2-70b
# K2-71b
# K2-72bcde
# K2-73b
# K2-74b
# K2-75bc
# K2-76b
# K2-116b
# K2-167b
# K2-168b
# K2-169
# K2-170bc
# K2-171b
# K2-172bc
# WASP-75b
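
To fetch every target in the list, each one needs its own output directory. A minimal sketch of that planning step follows, assuming the `k2_1`-style directory naming and the `data/mast` base directory used earlier in this notebook. The helper names `target_dirname` and `plan_downloads` are hypothetical. No query or download happens here.

```python
def target_dirname(target):
    """Map a target name such as 'K2-21' to a directory name such as 'k2_21'."""
    return target.strip().lower().replace("-", "_")


def plan_downloads(targets, base_dir="data/mast"):
    """Return (target, directory) pairs without touching the network or disk."""
    return [(t, f"{base_dir}/{target_dirname(t)}") for t in targets]
```

Each pair from `plan_downloads(K2_confirmed_targets)` could then drive one query with `objectname` set to the target, followed by a download into the paired directory.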

In [None]:
Observations.enable_cloud_dataset()

In [None]:
obs_table = Observations.query_criteria(obs_collection=['K2'],
                                        objectname="K2-62",
                                        filters='KEPLER',
                                        provenance_name='K2')
products = Observations.get_product_list(obs_table)
filtered = Observations.filter_products(products,
                                        productSubGroupDescription='LLC')

In [None]:
# Getting the cloud URIs
# obs_table = Observations.query_criteria(obs_collection='HST',
#                                         filters='F606W',
#                                         instrument_name='ACS/WFC',
#                                         proposal_id=['12062'],
#                                         dataRights='PUBLIC')
# products = Observations.get_product_list(obs_table)
# filtered = Observations.filter_products(products,
#                                         productSubGroupDescription='DRZ')
# s3_uris = Observations.get_cloud_uris(filtered)
# print(s3_uris)

In [None]:

# target = "K2-21"

# #Do a cone search and find the Kepler long cadence data for your target
# obs = Observations.query_object(target)
# want = (obs['obs_collection'] == "Kepler") & (obs['t_exptime'] ==1800.0)

In [None]:
# #Pick which data you want to retrieve
# data_prod = Observations.get_product_list(obs[want])
# filt_prod = Observations.filter_products(data_prod, description="Lightcurve Long Cadence (LLC)")

In [None]:
# #Move data from the S3 bucket to the default astroquery location. 
# #cloud_only=True means that data will only be retrieved if available on AWS S3
# # %cd ../
# # %mkdir k2_21
# # %cd k2_21
# manifest = Observations.download_products(filt_prod, cloud_only=True)

In [None]:

# Observations.disable_cloud_dataset()

# TESS

In [None]:
#catalog_data = Catalogs.query_criteria(catalog="Tic",Bmag=[30,50],objType="STAR")
catalog_data = Catalogs.query_region("334.211055 -12.165732", radius=0.1,
                                      catalog="Tic")
print("Number of results:",len(catalog_data))
print(catalog_data[:10])



Number of results: 152


In [None]:
# Query for observations in sector 1 (s0001), camera 1, chip 1 (1-1)
obsTable = Observations.query_criteria(obs_id="tess-s0001-1-1")

In [None]:

# Get the products associated with these observations
products = Observations.get_product_list(obsTable)

In [None]:
# Return only the calibrated FFIs (.ffic.fits)
filtered = Observations.filter_products(products, 
                                        productSubGroupDescription="FFIC",
                                        mrp_only=False)

len(filtered)
# > 1282

# Enable 'cloud mode' for the module, which will return S3-like URLs for FITS files
# e.g. s3://stpubdata/tess/.../tess2018206192942-s0001-1-1-0120-s_ffic.fits
Observations.enable_cloud_dataset()

# Grab the S3 URLs for each of the observations
s3_urls = Observations.get_cloud_uris(filtered)

s3 = boto3.resource('s3')

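# Create an authenticated S3 client (replace the placeholder keys with your own).
# Note that downloads within US-East, e.g. to a node on EC2, are free.
# This client is not used below: the download goes through the `s3` resource.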
s3_client = boto3.client('s3',
                         aws_access_key_id='YOURAWSACCESSKEY',
                         aws_secret_access_key='YOURSECRETACCESSKEY')

bucket = s3.Bucket('stpubdata')

# Just download a few of the files (remove the [0:3] to download them all)
for url in s3_urls[0:3]:
  # Extract the S3 key from the S3 URL
  fits_s3_key = url.replace("s3://stpubdata/", "")
  root = url.split('/')[-1]
  bucket.download_file(fits_s3_key, root, ExtraArgs={"RequestPayer": "requester"})
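
The FFI filenames encode the sector, camera, and chip that the query above selects. A minimal sketch for reading them back from a downloaded filename follows. The pattern is inferred from the example name `tess2018206192942-s0001-1-1-0120-s_ffic.fits` quoted in the comments, and the names `FFI_PATTERN` and `parse_ffi_name` are hypothetical.

```python
import re

# Inferred layout: tess<timestamp>-s<sector>-<camera>-<chip>-<digits>-s_ffic.fits
FFI_PATTERN = re.compile(
    r"tess(?P<timestamp>\d+)-s(?P<sector>\d+)-(?P<camera>\d)-(?P<ccd>\d)"
    r"-\d+-s_ffic\.fits$"
)


def parse_ffi_name(name):
    """Extract timestamp, sector, camera and chip from a calibrated FFI filename."""
    match = FFI_PATTERN.search(name)
    if match is None:
        raise ValueError(f"Not a calibrated FFI filename: {name!r}")
    return {
        "timestamp": match["timestamp"],
        "sector": int(match["sector"]),
        "camera": int(match["camera"]),
        "ccd": int(match["ccd"]),
    }
```

This gives a quick check that the files fetched by the loop belong to the sector/camera/chip combination that was requested.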
  

Bernoulli Restricted Boltzmann Machine (RBM).

    A Restricted Boltzmann Machine with binary visible units and
    binary hidden units. Parameters are estimated using Stochastic Maximum
    Likelihood (SML), also known as Persistent Contrastive Divergence (PCD)
    [2].

    The time complexity of this implementation is ``O(d ** 2)`` assuming
    d ~ n_features ~ n_components.

In [None]:

import numpy as np
import matplotlib.pyplot as plt

from scipy.ndimage import convolve
from sklearn import linear_model, datasets, metrics
from sklearn.model_selection import train_test_split
from sklearn.neural_network import BernoulliRBM
from sklearn.pipeline import Pipeline
from sklearn.base import clone

In [None]:
# Authors: Yann N. Dauphin, Vlad Niculae, Gabriel Synnaeve
# License: BSD

# #############################################################################
# Setting up

def nudge_dataset(X, Y):
    """
    This produces a dataset 5 times bigger than the original one,
    by moving the 8x8 images in X around by 1px to left, right, down, up
    """
    direction_vectors = [
        [[0, 1, 0],
         [0, 0, 0],
         [0, 0, 0]],

        [[0, 0, 0],
         [1, 0, 0],
         [0, 0, 0]],

        [[0, 0, 0],
         [0, 0, 1],
         [0, 0, 0]],

        [[0, 0, 0],
         [0, 0, 0],
         [0, 1, 0]]]

    def shift(x, w):
        return convolve(x.reshape((8, 8)), mode='constant', weights=w).ravel()

    X = np.concatenate([X] +
                       [np.apply_along_axis(shift, 1, X, vector)
                        for vector in direction_vectors])
    Y = np.concatenate([Y for _ in range(5)], axis=0)
    return X, Y


# Load Data
X, y = datasets.load_digits(return_X_y=True)
X = np.asarray(X, 'float32')
X, Y = nudge_dataset(X, y)
X = (X - np.min(X, 0)) / (np.max(X, 0) + 0.0001)  # 0-1 scaling

X_train, X_test, Y_train, Y_test = train_test_split(
    X, Y, test_size=0.2, random_state=0)

# Models we will use
logistic = linear_model.LogisticRegression(solver='newton-cg', tol=1)
rbm = BernoulliRBM(random_state=0, verbose=True)

rbm_features_classifier = Pipeline(
    steps=[('rbm', rbm), ('logistic', logistic)])

# #############################################################################
# Training

# Hyper-parameters. These were set by cross-validation,
# using a GridSearchCV. Here we are not performing cross-validation to
# save time.
rbm.learning_rate = 0.06
rbm.n_iter = 10
# More components tend to give better prediction performance, but larger
# fitting time
rbm.n_components = 100
logistic.C = 6000

# Training RBM-Logistic Pipeline
rbm_features_classifier.fit(X_train, Y_train)

# Training the Logistic regression classifier directly on the pixel
raw_pixel_classifier = clone(logistic)
raw_pixel_classifier.C = 100.
raw_pixel_classifier.fit(X_train, Y_train)

# #############################################################################
# Evaluation

Y_pred = rbm_features_classifier.predict(X_test)
print("Logistic regression using RBM features:\n%s\n" % (
    metrics.classification_report(Y_test, Y_pred)))

Y_pred = raw_pixel_classifier.predict(X_test)
print("Logistic regression using raw pixel features:\n%s\n" % (
    metrics.classification_report(Y_test, Y_pred)))

# #############################################################################
# Plotting

plt.figure(figsize=(4.2, 4))
for i, comp in enumerate(rbm.components_):
    plt.subplot(10, 10, i + 1)
    plt.imshow(comp.reshape((8, 8)), cmap=plt.cm.gray_r,
               interpolation='nearest')
    plt.xticks(())
    plt.yticks(())
plt.suptitle('100 components extracted by RBM', fontsize=16)
plt.subplots_adjust(0.08, 0.02, 0.92, 0.85, 0.08, 0.23)

plt.show()