-
Notifications
You must be signed in to change notification settings - Fork 6
Adds Python tests for ingestor API lib to Github Actions. #10
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from all commits
523f2bf
45742f6
71ebbf8
9538370
1409e18
95b0b83
f6fb43d
707b44e
427dbee
5020d95
cf596a8
eed076f
3baf9c5
05ca824
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,24 @@ | ||
| name: Run Python Tests | ||
|
|
||
| on: | ||
| push: | ||
|
|
||
| jobs: | ||
| test: | ||
| env: | ||
| AWS_DEFAULT_REGION: us-west-2 | ||
| runs-on: ubuntu-latest | ||
| strategy: | ||
| matrix: | ||
| python: [3.9] | ||
| steps: | ||
| - uses: actions/checkout@v2 | ||
| - name: Setup Python | ||
| uses: actions/setup-python@v2 | ||
| with: | ||
| python-version: ${{ matrix.python }} | ||
| - name: Install Tox and any other packages | ||
| run: pip install tox | ||
| - name: Run Tox | ||
| # Run tox using the version of Python in `PATH` | ||
| run: tox -e py |
This file was deleted.
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,2 @@ | ||
| httpx | ||
| moto[dynamodb, ssm]>=4.0.9 |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -10,7 +10,8 @@ | |
| from pypgstac.load import Methods | ||
| from pypgstac.db import PgstacDB | ||
|
|
||
| from .dependencies import get_settings, get_table | ||
| from .dependencies import get_table | ||
| from .config import settings | ||
| from .schemas import Ingestion, Status | ||
| from .vedaloader import VEDALoader | ||
|
|
||
|
|
@@ -80,6 +81,7 @@ def load_into_pgstac(creds: DbCreds, ingestions: Sequence[Ingestion]): | |
| """ | ||
| Bulk insert STAC records into pgSTAC. | ||
| """ | ||
| print("Connecting to pgstac") | ||
| with PgstacDB(dsn=creds.dsn_string, debug=True) as db: | ||
| loader = VEDALoader(db=db) | ||
|
|
||
|
|
@@ -88,18 +90,16 @@ def load_into_pgstac(creds: DbCreds, ingestions: Sequence[Ingestion]): | |
| convert_decimals_to_float(i.item) | ||
| for i in ingestions | ||
| ] | ||
|
|
||
| print(f"Ingesting {len(items)} items") | ||
| loading_result = loader.load_items( | ||
| file=items, | ||
| # use insert_ignore to avoid overwritting existing items or upsert to replace | ||
| insert_mode=Methods.upsert, | ||
| ) | ||
|
|
||
| # Trigger update on summaries and extents | ||
| collections = set([item.collection for item in items]) | ||
| for collection in collections: | ||
| loader.update_collection_summaries(collection) | ||
| # collections = set([item["collection"] for item in items]) | ||
| # for collection in collections: | ||
| # loader.update_collection_summaries(collection) | ||
|
Comment on lines
+100
to
+102
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Why this removal? Should we put behind a settings flag?
Member
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I'd concur with a settings flag for this behavior but I think this would require a more in-depth PR. Most of these commits were to make the construct usable for our initial testing. In this case I'd recommend a follow on PR which makes this behavior configurable on |
||
|
|
||
| return loading_result | ||
|
|
||
|
|
@@ -114,7 +114,7 @@ def update_dynamodb( | |
| """ | ||
| # Update records in DynamoDB | ||
| print(f"Updating ingested items status in DynamoDB, marking as {status}...") | ||
| table = get_table(get_settings()) | ||
| table = get_table(settings) | ||
| with table.batch_writer(overwrite_by_pkeys=["created_by", "id"]) as batch: | ||
| for ingestion in ingestions: | ||
| batch.put_item( | ||
|
|
||
| Original file line number | Diff line number | Diff line change | ||||||
|---|---|---|---|---|---|---|---|---|
|
|
@@ -8,8 +8,8 @@ def get_s3_credentials(): | |||||||
| from .config import settings | ||||||||
|
|
||||||||
| print("Fetching S3 Credentials...") | ||||||||
|
|
||||||||
| response = boto3.client("sts").assume_role( | ||||||||
| client = boto3.client("sts") | ||||||||
| response = client.assume_role( | ||||||||
| RoleArn=settings.data_access_role, | ||||||||
| RoleSessionName="stac-ingestor-data-validation", | ||||||||
| ) | ||||||||
|
|
@@ -24,14 +24,14 @@ def s3_object_is_accessible(bucket: str, key: str): | |||||||
| """ | ||||||||
| Ensure we can send HEAD requests to S3 objects. | ||||||||
| """ | ||||||||
| from .config import settings | ||||||||
|
|
||||||||
| client = boto3.client("s3", **get_s3_credentials()) | ||||||||
| # client = boto3.client("s3", **get_s3_credentials()) | ||||||||
| client = boto3.client("s3") | ||||||||
|
Comment on lines
+28
to
+29
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Why this change? The intention is that we should test access via the data access role, not as the role of this service. This way, there is a single role that can be used by many different services for data access. In the event that we need to ask data providers to grant access to our system, it reduces the burden to ask them to grant access to just one role that we use throughout our system.
Suggested change
Member
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @alukach If you recall, the Lambda role assumption was failing when we performed initial testing of the construct (see PR comment #8 (comment)). This is a temporary fix so that we can have a functional stack with public buckets until we can diagnose the issue. There are a few discussions of this on Slack as well https://developmentseed.slack.com/archives/C03N2CN2YJ0/p1670378666706349. |
||||||||
| try: | ||||||||
| client.head_object( | ||||||||
| Bucket=bucket, | ||||||||
| Key=key, | ||||||||
| **{"RequestPayer": "requester"} if settings.requester_pays else {}, | ||||||||
| **{"RequestPayer": "requester"}, | ||||||||
|
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Why this change? My inclination is to keep the service configurable, it's possible that others would not want to assume the data access costs. If we do want to remove the configurable nature, then we should probably drop the
Member
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. As we discussed in Slack DM, this was failing in real world tests. I'd flag this also needs further investigation. Based on my initial research there is no cost/downside to specifying |
||||||||
| ) | ||||||||
| except client.exceptions.ClientError as e: | ||||||||
| raise ValueError( | ||||||||
|
|
||||||||
Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't believe that this service should need to directly connect to any S3 buckets, rather that is the job of the data access role.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As described above, because role assumption within this handler was failing we needed to apply a broad role to support reading from public buckets until we can diagnose what is preventing us from injecting a user defined role to assume.