[WIP] S3 support for spock config files (#56)
S3 support for spock config files + docs + unit tests (via moto).

Signed-off-by: Nicholas Cilfone <nicholas.cilfone@fmr.com>
ncilfone committed May 10, 2021
1 parent 5fd9809 commit da396a7
Showing 45 changed files with 1,703 additions and 1,147 deletions.
2 changes: 1 addition & 1 deletion .github/workflows/python-coverage.yaml
@@ -11,7 +11,6 @@ on:

jobs:
build:

runs-on: ubuntu-latest

steps:
@@ -26,6 +25,7 @@ jobs:
run: |
python -m pip install --upgrade pip
pip install -r DEV_REQUIREMENTS.txt
pip install -r S3_REQUIREMENTS.txt
- name: Test with pytest
run: |
43 changes: 43 additions & 0 deletions .github/workflows/python-pytest-s3.yaml
@@ -0,0 +1,43 @@
# This workflow will install Python dependencies, run S3 tests with a variety of Python versions
# For more information see: https://help.github.com/actions/language-and-framework-guides/using-python-with-github-actions

name: pytest-s3

on:
  pull_request:
    branches: [master]
  push:
    branches: [master]

jobs:
  build:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        python-version: [3.6, 3.7, 3.8, 3.9]

    steps:
    - uses: actions/checkout@v2
    - name: Set up Python ${{ matrix.python-version }}
      uses: actions/setup-python@v2
      with:
        python-version: ${{ matrix.python-version }}

    - name: Install dependencies
      run: |
        python -m pip install --upgrade pip
        pip install -e .
        pip install -r DEV_REQUIREMENTS.txt
        pip install -r S3_REQUIREMENTS.txt
    - name: Test with pytest
      run: |
        pytest tests/s3 --cov=spock --cov-config=.coveragerc --junitxml=junit/test-results-${{ matrix.python-version }}.xml --cov-report=xml --cov-report=html
    - name: Upload pytest test results
      uses: actions/upload-artifact@v2
      with:
        name: pytest-results-${{ matrix.python-version }}
        path: junit/test-results-${{ matrix.python-version }}.xml
      # Use always() to always run this step to publish test results when there are test failures
      if: ${{ always() }}
11 changes: 6 additions & 5 deletions .github/workflows/python-pytest.yml
@@ -1,4 +1,4 @@
# This workflow will install Python dependencies, run tests with a variety of Python versions
# This workflow will install Python dependencies, run general tests with a variety of Python versions
# For more information see: https://help.github.com/actions/language-and-framework-guides/using-python-with-github-actions

name: pytest
@@ -11,7 +11,6 @@ on:

jobs:
build:

runs-on: ubuntu-latest
strategy:
matrix:
@@ -23,15 +22,17 @@ jobs:
uses: actions/setup-python@v2
with:
python-version: ${{ matrix.python-version }}

- name: Install dependencies
run: |
python -m pip install --upgrade pip
pip install -e .
pip install pylint pytest pytest-cov
if [ -f REQUIREMENTS.txt ]; then pip install -r REQUIREMENTS.txt; fi
pip install -r DEV_REQUIREMENTS.txt
- name: Test with pytest
run: |
pytest --cov=spock --cov-config=.coveragerc --junitxml=junit/test-results-${{ matrix.python-version }}.xml --cov-report=xml --cov-report=html
pytest tests/base --cov=spock --cov-config=.coveragerc --junitxml=junit/test-results-${{ matrix.python-version }}.xml --cov-report=xml --cov-report=html
- name: Upload pytest test results
uses: actions/upload-artifact@v2
with:
4 changes: 4 additions & 0 deletions .gitignore
@@ -1,5 +1,9 @@
# Created by .ignore support plugin (hsz.mobi)
### Python template

# Debugging folder
debug/

# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
1 change: 1 addition & 0 deletions DEV_REQUIREMENTS.txt
@@ -1,6 +1,7 @@
-r REQUIREMENTS.txt
coveralls
coverage
moto
portray
pytest
pytest-cov
22 changes: 14 additions & 8 deletions README.md
@@ -25,6 +25,12 @@ Requires Python 3.6+
pip install spock-config
```

w/ S3 extension

```bash
pip install spock-config[s3]
```

## Version(s)

All prior versions are available on PyPi. If legacy API and backend support is needed please install a pre v2.0.0+
@@ -39,18 +45,16 @@ recent features, bugfixes, and hotfixes.

See [Releases](https://github.com/fidelity/spock/releases) for more information.

#### March 18th, 2021
#### May 6th, 2021
* Added S3 support with `pip install spock-config[s3]`
* The S3 addon automatically handles loading from and saving to paths defined with `s3://` URI(s) when an
active `boto3.Session` is passed in

#### March 18th, 2021

* Support for Google docstring style annotation of `spock` class (and Enums) and attributes
* Added in ability to print docstring annotated help information to command line with `--help` argument

#### March 1st, 2021

* Removed legacy backend and API (dataclasses and custom typed interface)
* Updated markdown save call to support advanced types so that saved configurations are now valid `spock` config
input files
* Changed tuples to support length restrictions

## Documentation

Current documentation and more information can be found [here](https://fidelity.github.io/spock/).
@@ -72,6 +76,8 @@ set of parameters.
* Tractability and Reproducibility: Save runtime parameter configuration to YAML, TOML, or JSON with a single chained
command (with extra runtime info such as Git info, Python version, machine FQDN, etc). The saved markdown file can be
used as the configuration input to reproduce prior runtime configurations.
* S3 Addon: Automatically detects `s3://` URI(s) and handles loading and saving `spock` configuration files when an
active `boto3.Session` is passed in (plus any additional `S3Transfer` configurations)

#### Main Contributors

4 changes: 4 additions & 0 deletions S3_REQUIREMENTS.txt
@@ -0,0 +1,4 @@
boto3
botocore
hurry.filesize
s3transfer
7 changes: 6 additions & 1 deletion docs/Installation.md
@@ -8,11 +8,16 @@

### Install/Upgrade

#### Pip/PyPi
#### PyPi
```bash
pip install spock-config
```

#### w/ S3 Extension
```bash
pip install spock-config[s3]
```

#### Pip From Source
```bash
pip install git+https://github.com/fidelity/spock
11 changes: 8 additions & 3 deletions docs/Motivation.md
@@ -79,11 +79,16 @@ and automatic defaults.
* Easily Managed Parameter Groups: Each class automatically generates its own object within a single namespace.
* Parameter Inheritance: Classes support inheritance allowing for complex configurations derived from a common base
set of parameters.
* Complex Types: Nested Lists/Tuples, List/Tuples of Enum of `@spock` classes, List of repeated `@spock` classes
* Multiple Configuration File Types: Configurations are specified from YAML, TOML, or JSON files.
* Hierarchical Configuration: Composed from multiple configuration files via simple include statements.
* Command-Line Overrides: Quickly experiment by overriding a value with automatically generated command line arguments
* Hierarchical Configuration: Compose from multiple configuration files via simple include statements.
* Command-Line Overrides: Quickly experiment by overriding a value with automatically generated command line arguments.
* Immutable: All classes are *frozen* preventing any misuse or accidental overwrites.
* Tractability and Reproducibility: Save currently running parameter configuration with a single chained command.
* Tractability and Reproducibility: Save runtime parameter configuration to YAML, TOML, or JSON with a single chained
command (with extra runtime info such as Git info, Python version, machine FQDN, etc). The saved markdown file can be
used as the configuration input to reproduce prior runtime configurations.
* S3 Addon: Automatically detects `s3://` URI(s) and handles loading and saving `spock` configuration files when an
active `boto3.Session` is passed in (plus any additional `S3Transfer` configurations)

### Other Libraries

130 changes: 130 additions & 0 deletions docs/addons/S3.md
@@ -0,0 +1,130 @@
# S3 Support

When installed with the S3 addon `spock` will attempt to identify S3 URI(s) (e.g. `s3://<bucket-name>/<key>`) and handle
them automatically. The user only needs to provide an active `boto3.session.Session` to an `S3Config` object and pass
it to the `ConfigArgBuilder`.
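
To make the URI convention concrete, here is a minimal sketch of splitting an `s3://<bucket-name>/<key>` path into its bucket and key, and of telling it apart from an ordinary file path. This is not `spock`'s actual implementation; the names `S3Location` and `parse_s3_uri` are hypothetical and exist only for illustration.

```python
from typing import NamedTuple, Optional


class S3Location(NamedTuple):
    """Bucket and key extracted from an S3 URI (hypothetical helper type)."""
    bucket: str
    key: str


def parse_s3_uri(path: str) -> Optional[S3Location]:
    """Return the bucket and key for an s3:// URI, or None for any other path."""
    prefix = "s3://"
    if not path.startswith(prefix):
        return None
    # Everything up to the first slash is the bucket; the rest is the key
    bucket, _, key = path[len(prefix):].partition("/")
    if not bucket:
        return None
    return S3Location(bucket=bucket, key=key)


print(parse_s3_uri("s3://my-bucket/path/to/file/config.yaml"))
print(parse_s3_uri("./configs/config.yaml"))
```

A plain string prefix check is used rather than a URL parser so that keys containing characters such as `?` or `#` are kept intact.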


### Installing

Install `spock` with the extra s3 related dependencies.

```bash
pip install spock-config[s3]
```
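
If it is unclear whether the extra dependencies were actually installed, a quick check along the following lines can help. This is only a sketch: the helper name `missing_modules` is hypothetical and not part of `spock`, and the module names are assumed from the packages listed in `S3_REQUIREMENTS.txt`.

```python
from importlib.util import find_spec
from typing import Iterable, List


def missing_modules(names: Iterable[str]) -> List[str]:
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if find_spec(name) is None]


# Top-level modules assumed to be provided by the s3 extra
missing = missing_modules(["boto3", "botocore", "s3transfer"])
if missing:
    print(f"S3 extras not installed, missing: {missing}")
else:
    print("S3 extras available")
```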

### Creating a boto3 Session

The user must provide an active `boto3.session.Session` object to `spock` in order for the library to automatically
handle S3 URI(s). Configuration is **highly dependent** upon your current AWS setup/security. Please refer to the
`boto3` docs for [session](https://boto3.amazonaws.com/v1/documentation/api/latest/reference/core/session.html) and
[credentials](https://boto3.amazonaws.com/v1/documentation/api/latest/guide/credentials.html) for help on how to
correctly configure your `boto3.session.Session`.

For instance, suppose we obtain temporary credentials via SAML authorization, with the SAML assertion, role ARN, and
principal ARN already stored in the `SamlString`, `RoleArn`, and `PrincipalArn` environment variables:

```python
import os

import boto3

# Exchange the SAML assertion for temporary credentials via STS
client = boto3.client('sts')

token = client.assume_role_with_saml(
    RoleArn=os.environ.get("RoleArn"),
    PrincipalArn=os.environ.get("PrincipalArn"),
    SAMLAssertion=os.environ.get("SamlString")
)

credentials = token['Credentials']

# Build the session from the temporary credentials
session = boto3.Session(
    aws_access_key_id=credentials['AccessKeyId'],
    aws_secret_access_key=credentials['SecretAccessKey'],
    aws_session_token=credentials['SessionToken'],
    region_name=os.environ.get('AWS_REGION')
)
```

### Using the S3Config Object

As an example, let's create a basic `@spock` decorated class, instantiate an `S3Config` object from `spock.addons` with
the `boto3.session.Session` we created above, and pass it to the `ConfigArgBuilder`.

```python
from spock.addons import S3Config
from spock.builder import ConfigArgBuilder
from spock.config import spock
from typing import List

@spock
class BasicConfig:
    """Basic spock configuration for example purposes

    Attributes:
        parameter: simple boolean that flags rounding
        fancy_parameter: parameter that multiplies a value
        fancier_parameter: parameter that gets added to product of val and fancy_parameter
        most_fancy_parameter: values to apply basic algebra to
    """
    parameter: bool
    fancy_parameter: float
    fancier_parameter: float
    most_fancy_parameter: List[int]

def main():
    # Create an S3Config object and pass in the boto3 session
    s3_config = S3Config(
        session=session
    )
    # Chain the generate function to the ConfigArgBuilder call
    # Pass in the S3Config object
    config = ConfigArgBuilder(
        BasicConfig,
        desc='S3 example',
        s3_config=s3_config
    ).generate()
```

### Defining the configuration file with an S3 URI

Usually we pass a relative or absolute system path as the configuration file command line argument. Here we pass
in an S3 URI instead:

```bash
$ python simple.py -c s3://my-bucket/path/to/file/config.yaml
```

With an `S3Config` object passed into the `ConfigArgBuilder`, the S3 URI will automatically be handled by `spock`.

### Saving to an S3 URI

Similarly, we usually pass a relative or absolute system path to the `SavePath` special argument type or
to the `user_specified_path` kwarg. Here we give an S3 URI instead:

```python
def main():
    # Create an S3Config object and pass in the boto3 session
    s3_config = S3Config(
        session=session
    )
    # Chain the save and generate functions to the ConfigArgBuilder call
    # Pass in the S3Config object
    config = ConfigArgBuilder(
        BasicConfig,
        desc='S3 example',
        s3_config=s3_config
    ).save(user_specified_path="s3://my-bucket/path/to/file/").generate()
```

With an `S3Config` object passed into the `ConfigArgBuilder`, the S3 URI will automatically be handled by `spock`.

### S3Transfer ExtraArgs

If you require any other settings for uploading files to or downloading files from S3, the `S3Config` class has two
extra attributes:

`download_config` takes an `S3DownloadConfig` object from `spock.addons`, which supports all ExtraArgs from
[S3Transfer.ALLOWED_DOWNLOAD_ARGS](https://boto3.amazonaws.com/v1/documentation/api/latest/reference/customizations/s3.html#boto3.s3.transfer.S3Transfer.ALLOWED_DOWNLOAD_ARGS)

`upload_config` takes an `S3UploadConfig` object from `spock.addons`, which supports all ExtraArgs from
[S3Transfer.ALLOWED_UPLOAD_ARGS](https://boto3.amazonaws.com/v1/documentation/api/latest/reference/customizations/s3.html#boto3.s3.transfer.S3Transfer.ALLOWED_UPLOAD_ARGS)
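
To illustrate the idea of a config object that carries ExtraArgs, the sketch below shows one way such an object could be reduced to the plain dictionary that S3 transfer calls expect, keeping only the values that were explicitly set. The class `UploadExtraArgs` and the function `to_extra_args` are hypothetical stand-ins, not `spock`'s `S3UploadConfig`; the three field names are a small subset of the keys in `S3Transfer.ALLOWED_UPLOAD_ARGS`.

```python
from dataclasses import asdict, dataclass
from typing import Any, Dict, Optional


@dataclass
class UploadExtraArgs:
    """Hypothetical stand-in for an upload config; fields mirror ExtraArgs keys."""
    ACL: Optional[str] = None
    ContentType: Optional[str] = None
    ServerSideEncryption: Optional[str] = None


def to_extra_args(config: UploadExtraArgs) -> Dict[str, Any]:
    """Keep only the settings that were explicitly provided (i.e. not None)."""
    return {key: value for key, value in asdict(config).items() if value is not None}


extra_args = to_extra_args(UploadExtraArgs(ServerSideEncryption="AES256"))
print(extra_args)  # {'ServerSideEncryption': 'AES256'}
```

Dropping unset values keeps the resulting dictionary limited to the options the user actually asked for, so defaults on the S3 side are left untouched.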
5 changes: 5 additions & 0 deletions pyproject.toml
@@ -60,5 +60,10 @@ Motivation = "docs/Motivation.md"
[[tool.portray.mkdocs.nav."Advanced Features"]]
"Command Line Overrides" = "docs/advanced_features/Command-Line-Overrides.md"

# Addons
[[tool.portray.mkdocs.nav]]
[[tool.portray.mkdocs.nav."Addons"]]
"S3" = "docs/addons/S3.md"

[[tool.portray.mkdocs.nav]]
Contributing = "CONTRIBUTING.md"
10 changes: 9 additions & 1 deletion setup.py
@@ -1,4 +1,8 @@
# -*- coding: utf-8 -*-

# Copyright 2019 FMR LLC <opensource@fidelity.com>
# SPDX-License-Identifier: Apache-2.0

"""Spock Setup"""

from pkg_resources import parse_requirements
@@ -11,6 +15,9 @@
with open('REQUIREMENTS.txt', 'r') as fid:
    install_reqs = [str(req) for req in parse_requirements(fid)]

with open('S3_REQUIREMENTS.txt', 'r') as fid:
    s3_reqs = [str(req) for req in parse_requirements(fid)]

setuptools.setup(
name='spock-config',
description='Spock is a framework designed to help manage complex parameter configurations for Python applications',
@@ -44,5 +51,6 @@
keywords=['configuration', 'argparse', 'parameters', 'machine learning', 'deep learning', 'reproducibility'],
packages=setuptools.find_packages(exclude=["*.tests", "*.tests.*", "tests.*", "tests"]),
python_requires='>=3.6',
install_requires=install_reqs
install_requires=install_reqs,
extras_require={'s3': s3_reqs}
)
15 changes: 15 additions & 0 deletions spock/addons/__init__.py
@@ -0,0 +1,15 @@
# -*- coding: utf-8 -*-

# Copyright 2019 FMR LLC <opensource@fidelity.com>
# SPDX-License-Identifier: Apache-2.0

"""
Spock is a framework that helps manage complex parameter configurations for Python applications
Please refer to the documentation provided in the README.md
"""
from spock.addons.s3.utils import S3Config
from spock.addons.s3.configs import S3DownloadConfig
from spock.addons.s3.configs import S3UploadConfig

__all__ = ["s3", "S3Config", "S3DownloadConfig", "S3UploadConfig"]