MLOps modules #62
Conversation
I just did a quick prelim review. More to follow in later discussions but two quick items:
- I've called out a few places where you can use `name_prefix` to avoid potential naming collisions.
- I understand the CSV files are valuable in git as a training dataset, but it would be good to replace the zip (binary) files with their code equivalent if and when possible. As a general best practice, it's a good idea to avoid binary files since their diffs can't be readily evaluated by humans. If these zips are for lambda functions, #2 will get a lot easier after refactoring onto the existing lambda-python module, since the zipping and packaging become automated during the terraform apply.
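For illustration, a minimal sketch of the `name_prefix` pattern (the resource and role names here are hypothetical, not the actual module code):

```hcl
# Hypothetical sketch: name_prefix lets AWS append a unique suffix,
# avoiding collisions when the same module is deployed more than once.
resource "aws_iam_role" "lambda_exec" {
  name_prefix        = "mlops-lambda-"  # AWS adds a random suffix
  assume_role_policy = data.aws_iam_policy_document.lambda_assume.json
}
```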
Hi @aaronsteers, I've added in the name prefixes. Yes, the ZIPs are the Lambda functions (not CSVs), so I hope that will become easier with the lambda module! One question: are there any plans to add `force_destroy` to the S3 module? SageMaker training jobs store their models in S3 buckets, which makes clean-up harder if we want to destroy said bucket. Jack
I would like to avoid using `force_destroy`. What do you think? Do you think this would work for what you are looking for?
@aaronsteers Completely get the rationale, and thinking about it, it probably makes sense for me not to use `force_destroy` on my buckets either. I tested the feature and it works fine. However, it would only work with an existing bucket, right? Is there a way to create a dependency if you wanted to create that bucket in the same Terraform apply?
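For context, `force_destroy` is a flag on the `aws_s3_bucket` resource; a rough sketch (the bucket name below is hypothetical):

```hcl
# Sketch only: with force_destroy = true, `terraform destroy` deletes
# the bucket even if it still holds objects (e.g. SageMaker model
# artifacts). All contents are removed without confirmation.
resource "aws_s3_bucket" "model_artifacts" {
  bucket        = "example-sagemaker-artifacts"  # hypothetical name
  force_destroy = true
}
```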
* replace zip files with py
* move lambda files to catalog
* remove extra comments
* reference arns from lambda module
* refactor vars to insulate lambda defs from s3_triggers
* get arns from lambda module
* refactor lambda function definitions
* python cleanup and auto-formatting (using black)
* fix source path
* move lambda functions to ml ops
* fix errors from refactoring
* fix missing requirements.txt
* Attempted bugfix: accessing non-existent pip[0]
* typo
* improved examples
* Update components/aws/lambda-python/outputs.tf per suggested change (Co-Authored-By: Jack Sandom <60360603+jacksandom@users.noreply.github.com>)
* updated variable name
* updated variable name
* output iam roles for ecs-task and lambda
* Lambda IAM SageMaker policy attachment

Co-authored-by: Jack Sandom <60360603+jacksandom@users.noreply.github.com>
Co-authored-by: jacksandom <jack.sandom@slalom.com>
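The packaging automation mentioned earlier (zipping during `terraform apply`, so binary zips never need to live in git) is typically done with Terraform's built-in `archive` provider; a rough sketch, with all paths and names assumed:

```hcl
# Assumed paths/names -- illustrates apply-time zipping of lambda source.
data "archive_file" "lambda_package" {
  type        = "zip"
  source_dir  = "${path.module}/lambda_src"
  output_path = "${path.module}/build/lambda.zip"
}

resource "aws_lambda_function" "fn" {
  function_name    = "mlops-example"  # hypothetical
  handler          = "main.lambda_handler"
  runtime          = "python3.8"
  filename         = data.archive_file.lambda_package.output_path
  source_code_hash = data.archive_file.lambda_package.output_base64sha256
  role             = aws_iam_role.lambda_exec.arn
}
```

The `source_code_hash` ties the function to the zip's contents, so Terraform redeploys only when the source actually changes.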
@jacksandom - When I merged in master, it started including documentation metadata in the CI/CD tests. Here's the output. Basically, this just means we need to add a new comment block at the top of the file:

```hcl
/*
* This is my short description about the module and how to use it.
* _Markdown_ formatting *is* supported.
*/
```

Do you mind taking a stab at this? If you are interested, you can also test the auto-documentation by running the following:

The above will update the two …
@jacksandom - I went ahead and added the comment header in …
@jacksandom - I am almost done with the full review, and I wanted to send this over since I'm already late sending you this feedback. I may have a few other smaller comments but I think this covers the bulk of it. Thanks!
Round 2 of feedback. I think this is everything. Thanks!
@jacksandom - I have good news! I believe I've resolved the resource count issue in this commit: a65ac67. The problem area was a distinct list of bucket names, which was then used to drive the count of resource permission objects created for IAM policies. Instead, there is now only one IAM policy object of each type. While the objects the policy refers to are still driven by the distinct list of buckets, that count no longer affects the number of resources Terraform creates. For readability, I also migrated a policy JSON string into the …
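The pattern described above might look roughly like this (variable and resource names are hypothetical, not the actual module code):

```hcl
locals {
  bucket_names = distinct(var.s3_bucket_names)
}

# One policy document whose single statement covers every bucket...
data "aws_iam_policy_document" "s3_access" {
  statement {
    actions   = ["s3:GetObject", "s3:PutObject"]
    resources = [for b in local.bucket_names : "arn:aws:s3:::${b}/*"]
  }
}

# ...so only one IAM policy resource is created, regardless of how
# many buckets are in the list -- the list length no longer drives count.
resource "aws_iam_policy" "s3_access" {
  name_prefix = "s3-access-"
  policy      = data.aws_iam_policy_document.s3_access.json
}
```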
@jacksandom - I sent you a direct message regarding the ECR login error. I would like to use the AWS Credential Helper if possible - here: https://github.com/awslabs/amazon-ecr-credential-helper I remember I got this working before, but I don't see where/if I logged any documentation on that here in this repo. The desired behavior would be that we could use the credential helper instead of having to run …
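Per the credential helper's README, the setup amounts to a Docker config entry (the registry URL below is a placeholder), after which `docker push`/`pull` against ECR authenticate automatically with no explicit login step:

```json
{
  "credHelpers": {
    "<aws_account_id>.dkr.ecr.<region>.amazonaws.com": "ecr-login"
  }
}
```

This goes in `~/.docker/config.json`, with the `docker-credential-ecr-login` binary on the `PATH`.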
@aaronsteers - thanks for reviewing - all my changes are in. Two main things left:
Regarding Glue jobs:
Complexities:
1. `python-lambda` module. (P1)
2. `whl` file for glue transformation dependency. (P2)
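On the `whl` dependency point: Glue jobs take packaged Python dependencies via the `--extra-py-files` argument. A hypothetical sketch (all S3 paths and names assumed):

```hcl
resource "aws_glue_job" "transform" {
  name     = "mlops-transform"  # hypothetical
  role_arn = aws_iam_role.glue.arn

  command {
    script_location = "s3://example-bucket/glue/transform.py"  # assumed
  }

  default_arguments = {
    # Glue makes the wheel's packages importable before the script runs
    "--extra-py-files" = "s3://example-bucket/deps/example_dep.whl"
  }
}
```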