Skip to content

Conversation

@sharkinsspatial
Copy link
Member

No description provided.

@sharkinsspatial sharkinsspatial marked this pull request as ready for review January 25, 2023 02:18
@jjfrench jjfrench mentioned this pull request Jan 25, 2023
Comment on lines +28 to +29
# client = boto3.client("s3", **get_s3_credentials())
client = boto3.client("s3")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why this change? The intention is that we should test access via the data access role, not as the role of this service. This way, there is a single role that can be used by many different services for data access. In the event that we need to ask data providers to grant access to our system, it reduces the burden to ask them to grant access to just one role that we use throughout our system.

Suggested change
# client = boto3.client("s3", **get_s3_credentials())
client = boto3.client("s3")
client = boto3.client("s3", **get_s3_credentials())

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@alukach If you recall, the Lambda role assumption was failing when we performed initial testing of the construct (see PR comment #8 (comment)). This is a temporary fix so that we can have a functional stack with public buckets until we can diagnose the issue. There are a few discussions of this on Slack as well https://developmentseed.slack.com/archives/C03N2CN2YJ0/p1670378666706349.

Bucket=bucket,
Key=key,
**{"RequestPayer": "requester"} if settings.requester_pays else {},
**{"RequestPayer": "requester"},
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why this change? My inclination is to keep the service configurable, it's possible that others would not want to assume the data access costs.

If we do want to remove the configurable nature, then we should probably drop the requester_pays setting from our configuration tooling.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As we discussed in Slack DM, this was failing in real world tests. I'd flag this also needs further investigation. Based on my initial research there is no cost/downside to specifying requester-pays for requests against buckets with no requester pays configuration.

Comment on lines +100 to +102
# collections = set([item["collection"] for item in items])
# for collection in collections:
# loader.update_collection_summaries(collection)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why this removal? Should we put behind a settings flag?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd concur with a settings flag for this behavior but I think this would require a more in-depth PR. Most of these commits were to make the construct usable for our initial testing. In this case I'd recommend a follow on PR which makes this behavior configurable on cron schedule (which will require additional infrastructure).

Comment on lines -1 to -13
name: Test & Build

on:
push:

jobs:
package:
uses: ./.github/workflows/build.yaml
with:
release: true
secrets:
DS_RELEASE_BOT_ID: ${{ secrets.DS_RELEASE_BOT_ID }}
DS_RELEASE_BOT_PRIVATE_KEY: ${{ secrets.DS_RELEASE_BOT_PRIVATE_KEY }}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we will still need this to build and release the package.

We should most likely filter by branch name for that though.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a bit confused on this point. Shouldn't this be fully handled by the build -> release workflows and only triggered by release tags? I was unclear due to the use of nested workflows.

Comment on lines +111 to +116
handler.addToRolePolicy(
new iam.PolicyStatement({
actions: ["s3:Get*", "s3:List*"],
resources: ["arn:aws:s3:::*"],
})
);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't believe that this service should need to directly connect to any S3 buckets, rather that is the job of the data access role.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As described above, because role assumption within this handler was failing we needed to apply a broad role to support reading from public buckets until we can diagnose what is preventing us from injecting a user defined role to assume.

@alukach
Copy link
Member

alukach commented Jan 26, 2023

Thanks for taking this on!

@alukach
Copy link
Member

alukach commented Jan 30, 2023

@sharkinsspatial if we're going to pull features out because they're causing issues, can you document these problems via GitHub issues?

The assume role thing is strange but I'm guessing it's due to a small issue in the cdk configuration (I'm testing out to see if I can replicate/resolve)

I'm still unsure as to what the issue is with the requesters pay flag, I don't understand how it could cause a problem.

@sharkinsspatial
Copy link
Member Author

@alukach I made issues covering these at #12 and #11. As mentioned in #8 these changes were intended to be temporary so that we could have a working published release of cdk-pgstac that could be used directly in our CI deployment for https://github.com/developmentseed/aws-asdi-pgstac without the need for deploying against a local build of the construct.

I attempted to investigate the above issues unsuccessfully but given our time constraints to deploy a working stack I implemented these temporary hack fixes. I'd be psyched if you can reproduce these and discover the root causes 🙇 .

In order to track potential regressions like this in future releases, I also propose we build out a simple integration test suite for the construct #13.

@sharkinsspatial sharkinsspatial deleted the fix/ci_tests branch March 10, 2023 02:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants