Implement experimental AWS SageMaker support #1968

Merged
AdeelH merged 26 commits into azavea:master from the sagemaker-part1 branch on Nov 1, 2023

@AdeelH AdeelH commented Oct 31, 2023

Overview

Supersedes #1871.

This PR adds the rastervision_aws_sagemaker module which allows running RV jobs on AWS SageMaker in much the same way that rastervision_aws_batch allows running jobs on AWS Batch.

To use this functionality, add something like the following to your ~/.rastervision/default configuration file:

[SAGEMAKER]
exec_role=AmazonSageMakerExecutionRole
cpu_image=123.dkr.ecr.us-east-1.amazonaws.com/raster-vision
cpu_instance_type=ml.p3.2xlarge
gpu_image=123.dkr.ecr.us-east-1.amazonaws.com/raster-vision
gpu_instance_type=ml.p3.2xlarge
use_spot_instances=yes

where

  • exec_role is an IAM role with appropriate S3 and SageMaker permissions.
  • *_image is the URI of a docker image.
  • *_instance_type is the SageMaker instance type. See the AWS SageMaker documentation for a list of available instance types.
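
As a rough illustration of the profile layout above, the following sketch parses the same [SAGEMAKER] section with Python's standard-library configparser and checks that the keys from the example are present. This is not how Raster Vision itself loads its configuration; the helper name read_sagemaker_section, the EXPECTED_KEYS tuple, and the choice to treat every key as required are assumptions made for this example only.

```python
import configparser

# Keys taken from the example profile above. Treating all of them as
# required is an assumption of this sketch, not a documented rule.
EXPECTED_KEYS = (
    "exec_role",
    "cpu_image",
    "cpu_instance_type",
    "gpu_image",
    "gpu_instance_type",
    "use_spot_instances",
)

# The example profile from the PR description, inlined so the sketch
# runs without touching ~/.rastervision/default.
EXAMPLE_PROFILE = """\
[SAGEMAKER]
exec_role=AmazonSageMakerExecutionRole
cpu_image=123.dkr.ecr.us-east-1.amazonaws.com/raster-vision
cpu_instance_type=ml.p3.2xlarge
gpu_image=123.dkr.ecr.us-east-1.amazonaws.com/raster-vision
gpu_instance_type=ml.p3.2xlarge
use_spot_instances=yes
"""


def read_sagemaker_section(profile_text: str) -> dict:
    """Hypothetical helper: parse INI text and return the [SAGEMAKER] options."""
    parser = configparser.ConfigParser()
    parser.read_string(profile_text)
    if not parser.has_section("SAGEMAKER"):
        raise KeyError("no [SAGEMAKER] section found")
    section = parser["SAGEMAKER"]
    missing = [key for key in EXPECTED_KEYS if key not in section]
    if missing:
        raise KeyError(f"missing keys: {missing}")
    options = dict(section)
    # configparser treats yes/no, true/false, on/off and 1/0 as booleans.
    options["use_spot_instances"] = section.getboolean("use_spot_instances")
    return options


options = read_sagemaker_section(EXAMPLE_PROFILE)
print(options["exec_role"], options["use_spot_instances"])
```

A check like this can catch a misspelled section name or a missing key before a job is submitted; to run it against a real profile, read the file's text and pass it to the helper instead of EXAMPLE_PROFILE.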

Checklist

  • Added needs-backport label if PR is bug fix that applies to previous minor release
  • Ran scripts/format_code and committed any changes
  • Documentation updated if needed
  • PR has a name that won't get you publicly shamed for vagueness

Notes

  • This should be considered very early-stage and experimental.
  • It is currently not included in the default pip install rastervision.
  • We are not currently making use of SageMaker features like hyperparameter tuning jobs and model deployment.

Testing Instructions

  • See new unit tests.
  • Run an example with rastervision run sagemaker .. instead of rastervision run batch ...

@AdeelH AdeelH mentioned this pull request Oct 31, 2023
4 tasks
@AdeelH AdeelH changed the title from "Implement experimental SageMaker support" to "Implement experimental AWS SageMaker support" Oct 31, 2023
@AdeelH AdeelH force-pushed the sagemaker-part1 branch 2 times, most recently from 92079a2 to 761ceaf November 1, 2023 18:58
@AdeelH AdeelH marked this pull request as ready for review November 1, 2023 19:46

codecov bot commented Nov 1, 2023

Codecov Report

Merging #1968 (e22fd18) into master (7106652) will increase coverage by 0.00%.
The diff coverage is 88.88%.

@@           Coverage Diff           @@
##           master    #1968   +/-   ##
=======================================
  Coverage   86.19%   86.20%           
=======================================
  Files         191      193    +2     
  Lines        9362     9434   +72     
=======================================
+ Hits         8070     8133   +63     
- Misses       1292     1301    +9     

| Files | Coverage Δ |
|---|---|
| ...s_sagemaker/rastervision/aws_sagemaker/__init__.py | 100.00% <100.00%> (ø) |
| ...rastervision/aws_sagemaker/aws_sagemaker_runner.py | 87.50% <87.50%> (ø) |

... and 1 file with indirect coverage changes

@AdeelH AdeelH merged commit e56f93d into azavea:master Nov 1, 2023
2 checks passed
@AdeelH AdeelH deleted the sagemaker-part1 branch November 1, 2023 21:06