public-ch10-data

Repo containing information regarding publicly available chapter 10 data

Public NASA ch10 data

The data derived from NASA Tail 667 flights is stored in the bucket nasa-public-ch10. It has the following structure:

chapter10/*.c10
ICD/icd.yaml
parsed/*.parquet
translated/*.parquet

This bucket is configured as requester-pays, so data access costs are billed to whoever requests the data. Because of this, an AWS account with public/private access keys is required. See these docs for more information.

Accessing the NASA ch10 data

The following assumes that Python is installed with the latest versions of s3fs, pandas, and pyarrow. To install these via conda, run the following from the root of this repository:

conda env create -f environment.yaml

With these packages installed, an s3fs filesystem can be instantiated. Replace <public_key> and <private_key> with the appropriate keys from AWS:

import s3fs

fs = s3fs.S3FileSystem(
    key='<public_key>', 
    secret='<private_key>', 
    anon=False, 
    requester_pays=True
)
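To keep keys out of source files, the same arguments can be read from the environment. The following is a minimal sketch, assuming the keys are exported under the standard AWS variable names AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY; the helper names s3_kwargs_from_env and make_filesystem are hypothetical and not part of this repository or of s3fs:

```python
import os


def s3_kwargs_from_env(environ=os.environ):
    """Build the S3FileSystem keyword arguments from environment variables.

    Hypothetical helper; raises KeyError if either variable is unset.
    """
    return {
        "key": environ["AWS_ACCESS_KEY_ID"],
        "secret": environ["AWS_SECRET_ACCESS_KEY"],
        "anon": False,
        "requester_pays": True,
    }


def make_filesystem(environ=os.environ):
    """Create the filesystem with the same settings as the example above."""
    import s3fs  # imported lazily so the helper above works without s3fs

    return s3fs.S3FileSystem(**s3_kwargs_from_env(environ))
```

Calling fs = make_filesystem() then yields a filesystem equivalent to the one constructed with hard-coded keys.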

To list out the available files in the bucket, run:

top_keys = fs.ls('s3://nasa-public-ch10/')
print(top_keys)

['nasa-public-ch10/ICD',
 'nasa-public-ch10/chapter10',
 'nasa-public-ch10/parsed',
 'nasa-public-ch10/translated']

This will show the four top-level keys in the bucket. Running the same command on a subkey, such as `fs.ls('s3://nasa-public-ch10/parsed/')`, will display the filenames of the data.
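When a prefix holds many files, it can help to keep only the keys with a given extension. The sketch below assumes fs is the filesystem created earlier and relies only on fs.ls returning a list of key strings, as in the output shown above; the function name list_with_suffix is hypothetical:

```python
def list_with_suffix(fs, prefix, suffix):
    """Return the sorted keys under `prefix` whose names end with `suffix`."""
    return sorted(key for key in fs.ls(prefix) if key.endswith(suffix))


# Example usage (requires AWS credentials and incurs request costs):
# parquet_keys = list_with_suffix(fs, 's3://nasa-public-ch10/parsed/', '.parquet')
# print(len(parquet_keys))
```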

Downloading and displaying data

The following downloads a chapter 10 file to the local /tmp directory:

fs.get('s3://nasa-public-ch10/chapter10/667200103210556.ch10', '/tmp/667200103210556.ch10')
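Because the bucket is requester-pays, repeated downloads of the same file cost money. One possible approach, sketched here under the assumption that a file already present locally is complete, is to skip the transfer when the local copy exists. The function name download_if_missing is hypothetical; fs.get is the same call used above:

```python
import os


def download_if_missing(fs, remote_path, local_dir="/tmp"):
    """Download `remote_path` into `local_dir` unless the file already exists.

    Returns the local path. Assumes an existing local file is complete;
    a partially downloaded file would need to be deleted by hand.
    """
    local_path = os.path.join(local_dir, os.path.basename(remote_path))
    if not os.path.exists(local_path):
        fs.get(remote_path, local_path)
    return local_path


# Example usage (requires AWS credentials and incurs request costs):
# path = download_if_missing(
#     fs, 's3://nasa-public-ch10/chapter10/667200103210556.ch10')
```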

While the ch10 data is binary, the parsed and translated data are both stored as Parquet and can be read into pandas:

import pyarrow.parquet as pq
import pandas as pd

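# NOTE: the path below is under translated/; to load parsed data,
# substitute a key listed by fs.ls('s3://nasa-public-ch10/parsed/').
parsed_example = (
    pq.ParquetDataset(
        'nasa-public-ch10/translated/667200103210556_1553_translated_NAV__00.parquet',
        filesystem=fs,
    )
    .read_pandas()
    .to_pandas()
)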

translated_example = (
    pq.ParquetDataset(
    's3://nasa-public-ch10/translated/667200103210556_1553_translated_NAV__00.parquet', 
    filesystem=fs)
    .read_pandas()
    .to_pandas()
)

The number preceding _1553_ is a unique identifier that links the *.c10, parsed, and translated data.
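The identifier can be used to collect all files that belong to one recording. The following is a minimal sketch, assuming the identifier is always the leading run of digits in the file name (true for the two file names shown above, not verified for every key in the bucket); flight_id and group_by_flight are hypothetical helper names:

```python
import re
from collections import defaultdict
from posixpath import basename

# Assumption: the identifier is the leading run of digits in the file name.
_ID_PATTERN = re.compile(r"^(\d+)")


def flight_id(key):
    """Return the leading numeric identifier of a key, or None if absent."""
    match = _ID_PATTERN.match(basename(key))
    return match.group(1) if match else None


def group_by_flight(keys):
    """Map each identifier to the list of keys that share it."""
    groups = defaultdict(list)
    for key in keys:
        fid = flight_id(key)
        if fid is not None:
            groups[fid].append(key)
    return dict(groups)


example_keys = [
    "nasa-public-ch10/chapter10/667200103210556.ch10",
    "nasa-public-ch10/translated/667200103210556_1553_translated_NAV__00.parquet",
]
print(group_by_flight(example_keys))
```

With the two example keys, both entries are grouped under the identifier 667200103210556.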
