# Accessing CDF and FITS from S3

Here we walk through how to access CDF files and FITS files that are in AWS S3 storage.

- https://github.com/heliocloud-data/science-tutorials/blob/main/S3-Access-Demo.ipynb

Here is an example of a 'raw' read, where we open any binary file in S3 as bytes and extract information from it. In this example, we fetch the first few bytes of a CDF file and read its first field, the 'magic number' that identifies the file type (it should read as 'cdf30001').

In [None]:
import io

import boto3
from botocore import UNSIGNED
from botocore.client import Config

# Anonymous (unsigned) client, so no AWS credentials are needed for this demo bucket.
# https://stackoverflow.com/questions/34865927/can-i-use-boto3-anonymously
s3c = boto3.client('s3', config=Config(signature_version=UNSIGNED))

mybucket = 'gov-nasa-hdrl-data1'
mykey = 'demo-data/mms_fgm.cdf'
# Fetch only the first 9 bytes (the Range is inclusive) rather than the whole file.
obj = s3c.get_object(Bucket=mybucket, Key=mykey, Range='bytes=0-8')
rawdata = obj['Body'].read()
bdata = io.BytesIO(rawdata)

# The first 4 bytes hold the CDF magic number; show them as hex.
magic_number = bdata.read(4).hex()
print("Should print 'cdf30001' if read was correct:", magic_number)

Should print 'cdf30001' if read was correct: cdf30001
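
The same idea extends to telling file types apart from their first bytes. The sketch below is illustrative only: `sniff_format` is a hypothetical helper, not part of boto3 or cdflib, and it recognises just the 'cdf30001' value shown above plus the `SIMPLE  =` keyword that opens a standard FITS primary header. Other CDF versions may use different magic numbers, which this sketch does not handle. The byte strings are synthetic stand-ins for the result of a ranged S3 read.

```python
def sniff_format(header: bytes) -> str:
    """Guess a file format from its leading bytes (hypothetical helper)."""
    # The CDF file in this tutorial starts with the 4-byte magic number 0xCDF30001.
    if header[:4].hex() == "cdf30001":
        return "cdf"
    # A standard FITS primary header starts with the SIMPLE keyword,
    # padded to 8 characters and followed by "=".
    if header[:9] == b"SIMPLE  =":
        return "fits"
    return "unknown"


# Synthetic byte strings standing in for obj['Body'].read() from a ranged request.
print(sniff_format(bytes.fromhex("cdf30001") + b"\x00" * 5))
print(sniff_format(b"SIMPLE  =                    T"))
print(sniff_format(b"not a known header"))
```

In practice, `rawdata` from the cell above could be passed straight to such a helper, since only the first nine bytes are inspected.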


## The Core Examples

Here is the code to read each file type, in brief. We'll then go into each in more depth.

In [None]:
# CDF reading from S3 cloud
import cdflib
s3name="s3://gov-nasa-hdrl-data1/demo-data/mms_fgm.cdf"
with cdflib.CDF(s3name) as cdfin1:
    print(cdfin1.cdf_info())

CDFInfo(CDF='s3://gov-nasa-hdrl-data1/demo-data/mms_fgm.cdf', Version='3.6.0', Encoding=6, Majority='Column_major', rVariables=[], zVariables=['Epoch', 'mms1_fgm_b_gse_brst_l2', 'mms1_fgm_b_gsm_brst_l2', 'mms1_fgm_b_dmpa_brst_l2', 'mms1_fgm_b_bcs_brst_l2', 'mms1_fgm_flag_brst_l2', 'Epoch_state', 'mms1_fgm_r_gse_brst_l2', 'mms1_fgm_r_gsm_brst_l2', 'label_b_gse', 'label_b_gsm', 'label_b_dmpa', 'label_b_bcs', 'label_r_gse', 'label_r_gsm', 'represent_vec_tot', 'mms1_fgm_hirange_brst_l2', 'mms1_fgm_bdeltahalf_brst_l2', 'mms1_fgm_stemp_brst_l2', 'mms1_fgm_etemp_brst_l2', 'mms1_fgm_mode_brst_l2', 'mms1_fgm_rdeltahalf_brst_l2'], Attributes=[{'Project': 'Global'}, {'Source_name': 'Global'}, {'Discipline': 'Global'}, {'Data_type': 'Global'}, {'Descriptor': 'Global'}, {'File_naming_convention': 'Global'}, {'Data_version': 'Global'}, {'PI_name': 'Global'}, {'PI_affiliation': 'Global'}, {'TEXT': 'Global'}, {'Instrument_type': 'Global'}, {'Mission_group': 'Global'}, {'Logical_source': 'Global'}, {'L

In [None]:
# CDF reading from an HTTPS URL
import cdflib
urlname = "https://gov-nasa-hdrl-data1.s3.amazonaws.com/demo-data/mms_fgm.cdf"
with cdflib.CDF(urlname) as cdfin1:
    print(cdfin1.cdf_info())

CDFInfo(CDF='https://gov-nasa-hdrl-data1.s3.amazonaws.com/demo-data/mms_fgm.cdf', Version='3.6.0', Encoding=6, Majority='Column_major', rVariables=[], zVariables=['Epoch', 'mms1_fgm_b_gse_brst_l2', 'mms1_fgm_b_gsm_brst_l2', 'mms1_fgm_b_dmpa_brst_l2', 'mms1_fgm_b_bcs_brst_l2', 'mms1_fgm_flag_brst_l2', 'Epoch_state', 'mms1_fgm_r_gse_brst_l2', 'mms1_fgm_r_gsm_brst_l2', 'label_b_gse', 'label_b_gsm', 'label_b_dmpa', 'label_b_bcs', 'label_r_gse', 'label_r_gsm', 'represent_vec_tot', 'mms1_fgm_hirange_brst_l2', 'mms1_fgm_bdeltahalf_brst_l2', 'mms1_fgm_stemp_brst_l2', 'mms1_fgm_etemp_brst_l2', 'mms1_fgm_mode_brst_l2', 'mms1_fgm_rdeltahalf_brst_l2'], Attributes=[{'Project': 'Global'}, {'Source_name': 'Global'}, {'Discipline': 'Global'}, {'Data_type': 'Global'}, {'Descriptor': 'Global'}, {'File_naming_convention': 'Global'}, {'Data_version': 'Global'}, {'PI_name': 'Global'}, {'PI_affiliation': 'Global'}, {'TEXT': 'Global'}, {'Instrument_type': 'Global'}, {'Mission_group': 'Global'}, {'Logical_sou
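
The two cells above address the same object in two ways: as an `s3://bucket/key` URI and as an HTTPS URL. As a minimal sketch, the mapping between them can be written out as below. Both function names are hypothetical helpers, and the URL pattern is simply the one used in this tutorial (`https://<bucket>.s3.amazonaws.com/<key>`); buckets in other regions, or keys with characters that need URL-encoding, may need different handling.

```python
from typing import Tuple
from urllib.parse import urlparse


def split_s3_uri(uri: str) -> Tuple[str, str]:
    """Split 's3://bucket/key' into (bucket, key). Hypothetical helper."""
    parsed = urlparse(uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"not an s3:// URI: {uri!r}")
    # The bucket is the 'host' part; the key is the path without its leading slash.
    return parsed.netloc, parsed.path.lstrip("/")


def s3_uri_to_https(uri: str) -> str:
    """Build the HTTPS URL form used in this tutorial. Hypothetical helper."""
    bucket, key = split_s3_uri(uri)
    return f"https://{bucket}.s3.amazonaws.com/{key}"


s3name = "s3://gov-nasa-hdrl-data1/demo-data/mms_fgm.cdf"
bucket, key = split_s3_uri(s3name)
print(bucket)  # bucket name, usable as Bucket= in a boto3 call
print(key)     # object key, usable as Key= in a boto3 call
print(s3_uri_to_https(s3name))
```

The `(bucket, key)` pair is the form the boto3 `get_object` call in the raw-read example expects, so one `s3://` string can feed all three access styles.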

In [None]:
# FITS, using s3fs, reading from S3 cloud
import astropy.io.fits
# note: some AstroPy installations can open S3 files directly, with no intermediary
s3name="s3://gov-nasa-hdrl-data1/demo-data/sdo_aia.fits"
try:
    data = astropy.io.fits.open(s3name)
    print("astropy was compiled with S3 support!")
except Exception:
    print("astropy was not compiled with S3 support, using 's3fs'")
    import s3fs
    fs=s3fs.S3FileSystem(anon=True)
    fgrab = fs.open(s3name)
    data = astropy.io.fits.open(fgrab)

print(data[0].header[0:10])

astropy was not compiled with S3 support, using 's3fs'
SIMPLE  =                    T / file does conform to FITS standard             BITPIX  =                   16 / number of bits per data pixel                  NAXIS   =                    0 / number of data axes                            EXTEND  =                    T / FITS dataset may contain extensions            COMMENT   FITS (Flexible Image Transport System) format is defined in 'AstronomyCOMMENT   and Astrophysics', volume 376, page 359; bibcode: 2001A&A...376..359H END                                                                                                                                                                                                                                                                                                                                                                                                                                                                              
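
Each FITS header card is a fixed 80-character record, which is why a header dumped as one continuous string can be hard to read. As a rough sketch, such a string can be split back into one card per line with plain Python. `split_cards` is a hypothetical helper, not an AstroPy function, and the sample text is synthetic (modelled on the first cards above). Note that it drops blank cards, which is convenient for display but discards any intentionally blank cards inside a header.

```python
CARD_LENGTH = 80  # every FITS header card is a fixed 80-character record


def split_cards(header_text: str) -> list:
    """Split a concatenated FITS header string into cards. Hypothetical helper."""
    cards = [
        header_text[i:i + CARD_LENGTH]
        for i in range(0, len(header_text), CARD_LENGTH)
    ]
    # Strip the right-hand padding and drop cards that are entirely blank.
    return [card.rstrip() for card in cards if card.strip()]


# Synthetic header text: three cards padded to 80 characters, plus trailing padding.
sample = "".join(
    card.ljust(CARD_LENGTH)
    for card in [
        "SIMPLE  =                    T / file does conform to FITS standard",
        "BITPIX  =                   16 / number of bits per data pixel",
        "END",
    ]
) + " " * (2 * CARD_LENGTH)

for card in split_cards(sample):
    print(card)
```

With a real header, the input would be the string form of the header, for example `str(data[0].header[0:10])` from the cell above.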