# Adds Python tests for ingestor API lib to Github Actions. #10
## Conversation
```python
# client = boto3.client("s3", **get_s3_credentials())
client = boto3.client("s3")
```
Why this change? The intention is that we should test access via the data access role, not via the role of this service. This way, a single role can be used by many different services for data access. If we ever need to ask data providers to grant access to our system, it reduces their burden: they only need to grant access to the one role that we use throughout our system.
```suggestion
client = boto3.client("s3", **get_s3_credentials())
```
@alukach If you recall, the Lambda role assumption was failing when we performed initial testing of the construct (see PR comment #8 (comment)). This is a temporary fix so that we can have a functional stack with public buckets until we can diagnose the issue. There are a few discussions of this on Slack as well: https://developmentseed.slack.com/archives/C03N2CN2YJ0/p1670378666706349.
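For context, `get_s3_credentials` was presumably assuming the shared data-access role via STS and returning client kwargs. A minimal sketch of that pattern, assuming the construct injects the role ARN (the ARN and session name below are placeholders, not the construct's actual values):

```python
def get_s3_credentials(role_arn: str, sts_client=None) -> dict:
    """Assume the shared data-access role and return kwargs for boto3.client("s3")."""
    if sts_client is None:
        import boto3  # imported lazily so the helper is easy to stub in tests
        sts_client = boto3.client("sts")
    # sts:AssumeRole returns short-lived credentials for the data-access role
    creds = sts_client.assume_role(
        RoleArn=role_arn,
        RoleSessionName="data-access-session",  # session name is an assumption
    )["Credentials"]
    return {
        "aws_access_key_id": creds["AccessKeyId"],
        "aws_secret_access_key": creds["SecretAccessKey"],
        "aws_session_token": creds["SessionToken"],
    }
```

Passing `sts_client` explicitly makes the helper testable without live AWS credentials, which may also help when diagnosing why the assumption fails inside the Lambda.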
```diff
  Bucket=bucket,
  Key=key,
- **{"RequestPayer": "requester"} if settings.requester_pays else {},
+ **{"RequestPayer": "requester"},
```
Why this change? My inclination is to keep the service configurable; it's possible that others would not want to assume the data access costs.
If we do want to remove the configurable behavior, then we should probably drop the `requester_pays` setting from our configuration tooling.
As we discussed in a Slack DM, this was failing in real-world tests. I'd flag that this also needs further investigation: based on my initial research, there is no cost or downside to specifying requester-pays on requests against buckets with no requester-pays configuration.
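One way to keep the configurable behavior while isolating it for testing would be a small helper (the helper name is hypothetical; `settings.requester_pays` is from the existing code):

```python
def request_payer_args(requester_pays: bool) -> dict:
    """Build optional kwargs for S3 get_object/head_object calls.

    Only deployments that opt in via the requester_pays setting accept
    the data transfer costs; everyone else sends a plain request.
    """
    return {"RequestPayer": "requester"} if requester_pays else {}
```

Call sites would then read `client.get_object(Bucket=bucket, Key=key, **request_payer_args(settings.requester_pays))`, keeping the conditional in one place.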
```python
# collections = set([item["collection"] for item in items])
# for collection in collections:
#     loader.update_collection_summaries(collection)
```
Why this removal? Should we put it behind a settings flag?
I'd concur with a settings flag for this behavior, but I think that would require a more in-depth PR. Most of these commits were to make the construct usable for our initial testing. In this case I'd recommend a follow-on PR which makes this behavior configurable on a cron schedule (which will require additional infrastructure).
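A sketch of what a flag-gated version could look like, assuming a boolean `update_collection_summaries` settings attribute (that attribute is hypothetical; `loader.update_collection_summaries` is from the commented-out code):

```python
def update_summaries(items, loader, settings) -> list:
    """Recompute collection summaries for ingested items, if enabled in settings.

    Returns the sorted list of collection ids that were updated so callers
    can log or assert on what happened.
    """
    if not getattr(settings, "update_collection_summaries", False):
        return []  # feature is opt-in; skip the potentially expensive update
    collections = {item["collection"] for item in items}  # de-duplicate
    for collection in collections:
        loader.update_collection_summaries(collection)
    return sorted(collections)
```

A cron-scheduled variant would move this out of the ingest path entirely and run it over all collections on a timer, as suggested above.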
```yaml
name: Test & Build

on:
  push:

jobs:
  package:
    uses: ./.github/workflows/build.yaml
    with:
      release: true
    secrets:
      DS_RELEASE_BOT_ID: ${{ secrets.DS_RELEASE_BOT_ID }}
      DS_RELEASE_BOT_PRIVATE_KEY: ${{ secrets.DS_RELEASE_BOT_PRIVATE_KEY }}
```
I think we will still need this to build and release the package.
We should most likely filter by branch name for that though.
I'm a bit confused on this point. Shouldn't this be fully handled by the build -> release workflows and only triggered by release tags? I was unclear due to the use of nested workflows.
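If the intent is that releases only run from release tags, the `push` trigger could be narrowed rather than firing on every push. A sketch (the `v*` tag pattern is an assumption about this repo's tagging convention):

```yaml
on:
  push:
    tags:
      - "v*"
```

Alternatively, a `branches:` filter under `push:` would restrict builds to the main branch while leaving tag-driven releases to a separate workflow.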
```typescript
handler.addToRolePolicy(
  new iam.PolicyStatement({
    actions: ["s3:Get*", "s3:List*"],
    resources: ["arn:aws:s3:::*"],
  })
);
```
I don't believe this service should need to connect directly to any S3 buckets; rather, that is the job of the data access role.
As described above, because role assumption within this handler was failing, we needed to apply a broad policy to support reading from public buckets until we can diagnose what is preventing us from injecting a user-defined role to assume.
Thanks for taking this on!
@sharkinsspatial if we're going to pull features out because they're causing issues, can you document these problems via GitHub issues? The assume-role thing is strange, but I'm guessing it's due to a small issue in the CDK configuration (I'm testing to see if I can replicate/resolve it). I'm still unsure what the issue is with the requester-pays flag; I don't understand how it could cause a problem.
@alukach I made issues covering these at #12 and #11. As mentioned in #8, these changes were intended to be temporary so that we could have a working published release. I attempted to investigate the above issues unsuccessfully, but given our time constraints to deploy a working stack, I implemented these temporary hack fixes. I'd be psyched if you can reproduce these and discover the root causes 🙇. In order to track potential regressions like this in future releases, I also propose we build out a simple integration test suite for the construct: #13.