Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make sagemaker an optional dependency #184

Open
2 tasks done
nicolaei opened this issue Dec 8, 2021 · 5 comments
Open
2 tasks done

Make sagemaker an optional dependency #184

nicolaei opened this issue Dec 8, 2021 · 5 comments

Comments

@nicolaei
Copy link

nicolaei commented Dec 8, 2021

I'd like to see sagemaker marked as an optional dependency, as it is not always required.

Use Case

Not all users of this package are using the sagemaker dependency, and it adds a lot of bloat to overall installations by forcing users to have gcc and g++ to be able to install numpy and pandas.

Proposed Solution

It's quite simple to actually do this.

In setup.py you'd have to change install_requires and add extras_require:

setup(
    # [...]
    install_requires = [
        "boto3>=1.14.38",
        "pyyaml"
    ]
    extras_require = {
        "sagemaker":  ["sagemaker>=2.1.0"],
        "test": [] # [...]
    }
   # [...]
)

In addition to this, some extra documentation should probably be added to the README to make it easier for users to understand how to install the extra dependency (pip install stepfunctions[sagemaker]).

Other

If this is implemented, the major version probably has to be bumped because it's a change in how the package is installed.


  • 👋 I may be able to implement this feature request
  • ⚠️ This feature might incur a breaking change

This is a 🚀 Feature Request

@wong-a
Copy link
Contributor

wong-a commented Dec 9, 2021

Makes sense. We received similar request in #116

I agree this would be a breaking change. I'm not sure how many customers are using this SDK without sagemaker. So far, I've only seen this request from 3 individuals.

@nicolaei
Copy link
Author

I did some testing, and it looks like some of the implicit imports in stepfunctions.steps would also need to be conditional if this change is to be implemented. Specifically the ones around sagemaker.

If you want me to, I'll can look more into it and send a PR next week!


As a side-note, here is a bit more info about my use-case:

We're currently using step functions as a deployment pipeline inside our accounts.
I'm currently doing a rewrite to make the pipeline more powerful by using this library to dynamically create step functions based on a configuration file for each repo.

The main problem right now is that I have to build the lambda inside of a docker container running terraform. This container doesn't have gcc, g++ or make by default, so I have to install it at runtime to avoid redoing a bunch of container images. Numpy and pandas also take a long time to install 😞

Luckely, I saw the announcement about lambdas being able to use ECR in other account a few days ago, so that should mostly alleviate the problem for me when I get it implemented (we're using an account per team, per environment).

Here is the current work in progress, if you're interested: nsbno/terraform-aws-delivery-pipeline.

@DrewThomasCorps
Copy link

My team would love if this work was completed and merged. We prefer writing amazon states language in Python as opposed to JSON, and this gives us the functionality to do that. An implementation detail prevents us from using the CDK where the step functions are defined, so making this SDK lighter would make it easier to use.

@wong-a
Copy link
Contributor

wong-a commented Jun 3, 2022

An implementation detail prevents us from using the CDK where the step functions are defined

Can you elaborate on what issue prevents you from using CDK? @DrewThomasCorps

@DrewThomasCorps
Copy link

For the public record, this conversation was held offline. CDK also adds some bloat and in our case it's preferable to write Amazon States Language with a simple library, rather than requiring constructs, IDs, and stacks.

This is not a pressing concern and there are workarounds in place so we do not need this immediately resolved, but will be keeping an eye on it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants