Unable to use this library in AWS Lambda due to package size exceeded max limit #1200
Hi @nemalipuri! Unfortunately, running sagemaker-python-sdk in AWS Lambda is not currently supported. This is a pain point we're aware of and are working on prioritizing a solution for. I would normally recommend pinning python-dateutil to 2.8.0 to resolve the conflict, but I experimented locally and found that, even without boto3, the zip (55MB) is still over the 50MB zipped limit for Lambda. An alternative is to remove the numpy and scipy dependencies entirely from the sagemaker installation, as they account for ~73% of the installation size. Thanks!
@knakad Thanks for looking into this. Almost a year back I used the SageMaker Python SDK in Lambda without any issues; the version was 1.18.0 and the package size was smaller. Another use case has come up now, and when I tried to pull the latest package, its size exceeded the unzipped limit (260MB). The use case is to build an ML model with a custom container and implement Lambda functions for creating the training job and the endpoint. Step Functions will invoke these Lambdas at scheduled times to automate the workflow. I am not using scipy in my client code. I did try without boto3, botocore, and scipy, but Lambda failed with the error 'No module named numpy.core._multiarray_umath'. If you could provide a workaround that would be great; otherwise the plain SDK (via boto3) is the only option I have to call the SageMaker APIs from Lambda. Thank you.
Until sagemaker-python-sdk is officially supported in AWS Lambda, here's a workaround that removes a bit of bloat from the installation, allowing it to fit in Lambda without sacrificing any functionality:
I was able to upload the following zip along with a simple handler. This solution also doesn't require you to fork any of the code, so you can more easily run the latest sagemaker-python-sdk with the latest features/bug fixes. Please try it out and let me know if you run into any issues =)
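The exact zip contents referenced above aren't preserved in this thread, so here is a hedged sketch of the general shape of the workaround (dropping numpy/scipy plus test and bytecode clutter, as suggested earlier in the thread). The directory layout below is simulated, because the real first step, `pip install sagemaker --target python/`, needs network access:

```shell
# Simulated stand-in for: pip install sagemaker --target python/
mkdir -p python/numpy python/scipy python/sagemaker/tests python/sagemaker/__pycache__
touch python/sagemaker/session.py

# Drop the heaviest dependencies (numpy + scipy are ~73% of the install)
rm -rf python/numpy python/scipy

# Strip test suites and bytecode caches before zipping
find python -type d \( -name tests -o -name __pycache__ \) -prune -exec rm -rf {} +

# Finally, zip the python/ directory for upload, e.g.:
#   zip -r9 sagemaker_slim.zip python
```

Whether the trimmed package still imports cleanly depends on which sagemaker code paths you exercise, so test your handler after pruning.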
Perfect, it worked after executing the above steps. Thank you so much!
Anytime! Leaving this issue open to track the workaround and the feature request.
@knakad This looks like a great solution and I'd like to implement it. I followed the steps you listed, created a layer, and attached it to my Lambda function, but I still get the error when I try to import the sagemaker package in my Lambda function:
Any idea what could be causing the issue? I don't get any hints in the CloudWatch logs; it just looks like the function is not able to find the sagemaker package from the attached layer. Thanks in advance for your help on this.
Has any date been decided for sagemaker-python-sdk support in AWS Lambda?
After facing the issue myself, I read through the documentation and found a requirement on the path within the zip file that must be followed. There are two options. Documentation for ease of reference: https://docs.aws.amazon.com/lambda/latest/dg/configuration-layers.html
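Concretely, the path requirement means the libraries must live under a top-level `python/` directory inside the layer zip. A minimal sketch follows; a placeholder file stands in for the real `pip install sagemaker --target layer/python` output, which needs network access:

```shell
mkdir -p layer/python
# Real step (omitted here): pip install sagemaker --target layer/python
touch layer/python/placeholder.py          # stand-in for installed packages

# Zip so that entries start with python/ (this is what Lambda requires)
(cd layer && python3 -m zipfile -c ../sagemaker_layer.zip python)
python3 -m zipfile -l sagemaker_layer.zip  # every entry should begin with python/
```

If the entries in the listing start with anything other than `python/` (for example `layer/python/`), the runtime will not find the packages and imports will fail exactly as described above.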
+1 to all of this! Looking forward to using SageMaker in Lambda once this is resolved.
@knakad Do we need to manually zip the sagemaker installation along with handler.py and upload it to S3? Also, how will the Lambda function pick up the new zip file? It would be helpful if you could list the steps for this. Thanks!
In order to create a valid sagemaker SDK layer, it is important to build the layer with an AWS-compatible numpy version (since some numpy packages are binary). Here is a slightly updated version of the above that has worked for me:
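The script itself isn't reproduced in this thread, but the key idea (fetching Linux-compatible binary wheels even when building the layer on another OS) can be sketched with pip's cross-platform flags. The platform tag below is an assumption and should match your Lambda runtime's architecture:

```shell
# Have pip select Linux (manylinux) numpy wheels even on macOS/Windows.
# manylinux2014_x86_64 is an assumed tag; use an aarch64 tag for
# Graviton-based Lambda functions.
python3 -m pip install numpy \
    --platform manylinux2014_x86_64 \
    --only-binary=:all: \
    --target python/
```

This achieves the same end as manually downloading the Linux `.whl` file: the binary parts of numpy in the layer match what the Lambda runtime expects.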
@arne-munch-ellingsen Thank you for the lead. I tried your code, but when testing the Lambda function I got the following error: My local machine (where I ran your code) is macOS; any idea what I am missing?
@shlomi-schwartz Are you trying to import scipy in your Lambda function? If that is the case, you will have to add scipy to your layer as well, using the same "trick" I used to add the AWS Lambda Python 3.7-specific numpy library. The SageMaker SDK does not include scipy.
@arne-munch-ellingsen Thanks for the tip. I was not calling scipy myself; it is one of the dependencies of sagemaker==1.71.1. I used your trick and downloaded the .whl file, and it works now! Thanks again 👍
This worked for me, but I am looking forward to actual support for the SageMaker SDK in Lambda.

```
mkdir lambda_deployment
cd lambda_deployment
touch lambda_function.py
```

Write the logic in the `lambda_function.py` file, then zip the `lambda_deployment` directory and upload it.
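For the handler itself, here is a minimal hedged sketch using the low-level boto3 client (which the sagemaker SDK wraps); every name, ARN, and parameter value below is an illustrative placeholder, not something quoted from this thread:

```python
# lambda_function.py: minimal sketch; all job parameters are placeholders
# supplied through the invoking event.

def lambda_handler(event, context, client=None):
    """Start a SageMaker training job described by the event payload."""
    if client is None:
        # Deferred import so the module can be unit-tested without boto3/AWS.
        import boto3
        client = boto3.client("sagemaker")
    response = client.create_training_job(
        TrainingJobName=event["job_name"],
        AlgorithmSpecification={
            "TrainingImage": event["image_uri"],   # custom container image
            "TrainingInputMode": "File",
        },
        RoleArn=event["role_arn"],
        OutputDataConfig={"S3OutputPath": event["output_path"]},
        ResourceConfig={
            "InstanceType": "ml.m5.large",         # assumed instance type
            "InstanceCount": 1,
            "VolumeSizeInGB": 10,
        },
        StoppingCondition={"MaxRuntimeInSeconds": 3600},
    )
    return {"TrainingJobArn": response["TrainingJobArn"]}
```

Because the client is injectable, the handler can be exercised locally with a stub before deploying the zip.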
Further to @arne-munch-ellingsen's post, you can skip downloading the numpy whl and instead use the AWSLambda-Python37-SciPy1x layer provided by AWS (arn:aws:lambda:eu-west-2:142628438157:layer:AWSLambda-Python37-SciPy1x:35).
I'm trying to follow this tutorial about scheduling Data Wrangler processing jobs. I created my Lambda function by uploading the zip file produced by these commands:
The zip file is around 35MB. Then, when I try to add the SciPy layer to the Lambda, I get the following error: "Function code combined with layers exceeds the maximum allowed size of 262144000 bytes. The actual size is 263682135 bytes." Does anyone know how to deal with this? Could I somehow reduce the sagemaker size even further?
Tried many approaches and nothing worked for me. It turned out to be caused by my local machine not running Linux (I have macOS Catalina). I followed the instructions here for the installation of numpy and it worked like a charm 😄 (credits to Shandy Roque):
@knakad Any news on this? Lambda functions are great for orchestrating more complicated flows that end with a SageMaker prediction. Not being able to use the SDK is irritating. Heavy dependencies could be moved to a package extra.
Any updates on this? The new package updates have broken the way to import the sagemaker module.
As mentioned by @KaramRazooq, the above instructions needed to be updated. In a nutshell, I had to downgrade jsonschema to 4.17.3 and install the Linux-specific pandas package. I built upon the solution given by @arne-munch-ellingsen. Here is the version that worked for me:
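The full script isn't quoted in this thread; below is a condensed, hedged sketch of the two fixes mentioned. The jsonschema pin comes from the comment above, while the platform tag is an assumption that should match your Lambda runtime:

```shell
# Pin jsonschema to the version reported to work with the SDK here
python3 -m pip install "jsonschema==4.17.3" --target python/

# Install a Linux-built pandas wheel so its binary parts match Lambda
python3 -m pip install pandas \
    --platform manylinux2014_x86_64 \
    --only-binary=:all: \
    --target python/
```

As with numpy, the point is that any package with compiled components must come from a Linux wheel, not from a build done on your local OS.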
@PreethiJC's method works, as long as you include the compatible runtimes and architectures!
Hey, y'all! Any news on the MR? I feel this is an important update for sagemaker-python-sdk, especially for usage in Lambdas.
The sagemaker library should really be made available through https://aws.amazon.com/serverless/serverlessrepo/. All I want to do is kick off a training job.
It seems a fix has been merged; does the issue still exist?
Closing this issue now; feel free to reopen if there is still a problem.
Hmm. I am executing the following script:
Maybe the issue is resolved, but could we still try to optimize the size of this package? The abstractions maintained here are really good for handling SageMaker Pipelines.
Thanks for the information @guimorg. I've updated the priority to the highest level so this gets quicker attention.
Breaking out sagemaker.workflow (and maybe others) into their own libraries would address many use cases.
As it stands, I still have to install all of sagemaker and its huge dependencies, and I end up with a Lambda package that is too big to deploy as a zip file.
System Information
Describe the problem
I'm trying to use the SageMaker Python SDK in Lambda to trigger train and deploy steps. I packaged the dependencies along with the function code, and when trying to create the Lambda function it throws the error 'Unzipped size must be smaller than 262144000 bytes'.
Sorry, I know this issue relates to a Lambda service limit, but I want to check: is there any way I can reduce the size of the dependencies?
I have tried removing boto3 and botocore from the function zip file, since Lambda provides these libraries, but that led to a different issue: 'expecting python-dateutil<2.8.1,>=2.1'.
Minimal repro / logs
AWS Lambda error 'Unzipped size must be smaller than 262144000 bytes'
```
mkdir python
cd python
pip install sagemaker --target .
chmod 777 python
```

Zip the python directory and upload the zip file to S3. Error when creating the AWS Layer: 'Failed to create layer version: Unzipped size must be smaller than 262144000 bytes'
Similarly, instead of a Layer, when I packaged the code with dependencies and uploaded the zip file to the Lambda function directly, I received the error 'Unzipped size must be smaller than 262144000 bytes'.
Appreciate your help.