This repository has been archived by the owner on Apr 13, 2023. It is now read-only.

feat: update lambda state machine to accommodate tenantId #367

Merged
merged 3 commits on Jun 30, 2021

Conversation

Contributor

@Bingjiling Bingjiling commented Jun 25, 2021

Issue #, if available:

Description of changes:

Checklist:

  • Have you successfully deployed to an AWS account with your changes?
  • Have you written new tests for your core changes, as applicable?

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@Bingjiling Bingjiling changed the title from "[WIP] feat: update lambda state machine to accomandate tenantId" to "[WIP] feat: update lambda state machine to accommodate tenantId" Jun 25, 2021
Comment on lines +71 to +73
filtered_tenant_id_frame = Filter.apply(frame = original_data_source_dyn_frame,
                                        f = lambda x:
                                        x['_tenantId'] == tenantId)
Contributor

Instead of filtering on the Glue side, would it be better to have a secondary index on tenantId? This will become an expensive operation if we have to scan across all tenants.
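
To make the cost concern concrete, here is a minimal pure-Python sketch. It is not the Glue job or DynamoDB itself: an in-memory list stands in for the table, and the names `scan_and_filter`, `build_tenant_index`, and `query_index` are hypothetical, chosen only for illustration. It contrasts filtering after a full scan with a lookup through an index keyed on `_tenantId`.

```python
from collections import defaultdict


def scan_and_filter(table, tenant_id):
    """Read every record, then keep the ones for one tenant (scan-side filtering)."""
    examined = 0
    matches = []
    for record in table:
        examined += 1
        if record["_tenantId"] == tenant_id:
            matches.append(record)
    return matches, examined


def build_tenant_index(table):
    """Hypothetical stand-in for a secondary index keyed on _tenantId."""
    index = defaultdict(list)
    for record in table:
        index[record["_tenantId"]].append(record)
    return index


def query_index(index, tenant_id):
    """Read only the records stored under one tenant's key."""
    matches = index.get(tenant_id, [])
    return matches, len(matches)


# Illustrative data only; real items would be FHIR resources.
table = [
    {"id": "1", "_tenantId": "tenantA"},
    {"id": "2", "_tenantId": "tenantB"},
    {"id": "3", "_tenantId": "tenantA"},
    {"id": "4", "_tenantId": "tenantC"},
]

scanned, scan_cost = scan_and_filter(table, "tenantA")
indexed, index_cost = query_index(build_tenant_index(table), "tenantA")
```

Both paths return the same records, but the scan touches every record in the table while the index lookup touches only the matching tenant's records. The sketch ignores the cost of maintaining the index.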

Contributor Author

Hmm, that's a great question. The design doc specifies that the filtering is done as part of the Glue job, and no secondary index was introduced for any table. @carvantes, any thoughts?

Contributor

The Glue job always scans the entire DDB table no matter what; there's no way to use a query. This is a limitation of the current AWS Glue + DDB integration.

There are existing scenarios where this is far from ideal. For example, exporting a single FHIR resource type and exporting only the resources modified in the last hour both scan the entire table.

There is room for improvement on the bulk export solution, but we are not changing the fundamentals here.
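
The scan-then-filter behavior described above can be sketched in plain Python. This is only an illustration under stated assumptions, not the project's export script: `make_export_filter` and `export` are hypothetical names, the list stands in for the DDB table, and the `resourceType` and `meta.lastUpdated` fields are assumed to follow the usual FHIR resource layout. Whatever criteria are requested, every record is read once and the predicate only decides what gets exported.

```python
from datetime import datetime, timezone


def make_export_filter(tenant_id=None, resource_type=None, since=None):
    """Build one predicate; criteria left as None are not applied."""
    def keep(record):
        if tenant_id is not None and record.get("_tenantId") != tenant_id:
            return False
        if resource_type is not None and record.get("resourceType") != resource_type:
            return False
        if since is not None:
            # Timestamps in this sketch use explicit "+00:00" offsets.
            last_updated = datetime.fromisoformat(record["meta"]["lastUpdated"])
            if last_updated < since:
                return False
        return True
    return keep


def export(table, keep):
    """Read every record once; the predicate only decides what is exported."""
    records_read = 0
    exported = []
    for record in table:
        records_read += 1
        if keep(record):
            exported.append(record)
    return exported, records_read


# Illustrative data only.
table = [
    {"id": "p1", "resourceType": "Patient", "_tenantId": "tenantA",
     "meta": {"lastUpdated": "2021-06-25T10:00:00+00:00"}},
    {"id": "o1", "resourceType": "Observation", "_tenantId": "tenantA",
     "meta": {"lastUpdated": "2021-06-28T09:30:00+00:00"}},
    {"id": "p2", "resourceType": "Patient", "_tenantId": "tenantB",
     "meta": {"lastUpdated": "2021-06-28T12:00:00+00:00"}},
]

since = datetime(2021, 6, 27, tzinfo=timezone.utc)
exported, records_read = export(
    table, make_export_filter(tenant_id="tenantA", since=since)
)
```

Here only one record is exported, yet `records_read` equals the size of the whole table. That is the trade-off accepted in this PR: adding the tenant criterion does not change how much data is read.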

@Bingjiling Bingjiling changed the title from "[WIP] feat: update lambda state machine to accommodate tenantId" to "feat: update lambda state machine to accommodate tenantId" Jun 28, 2021
bulkExport/glueScripts/export-script.py (outdated, resolved)
@Bingjiling Bingjiling merged commit 9fedf56 into feat-multitenancy Jun 30, 2021
@Bingjiling Bingjiling deleted the multi-tenancy-bulk-export branch June 30, 2021 14:37
carvantes added a commit that referenced this pull request Aug 18, 2021
* feat: add tenantId attribute to Cognito user pool (#348)

* feat: remove unneeded scope checks in authorizer (#347)

* feat: update lambda state machine to accommodate tenantId (#367)

* feat: add "enableMultiTenancy" CFN parameter (#381)

* test: add multi-tenancy integ tests (#387)

* fix: remove _id, _tenantId from bulk export results (#384)

* feat: Group export scripts (#389)

* fix: add multi-tenant metadata route (#392)

* fix: allow more concurrent export jobs for multi-tenant deployments (#397)

* test: integ tests for Group export (#393)

* feat: add ES hard delete config value (#398)

* docs: update postman collection and docs to use Id token (#399)

* docs: add multi-tenancy docs (#400)


Co-authored-by: Yanyu Zheng <yz2690@columbia.edu>

BREAKING CHANGE: The Cognito IdToken is now used instead of the accessToken to authorize requests.
carvantes added a commit that referenced this pull request Aug 24, 2021
* feat: update lambda state machine to accommodate tenantId (#367)

* feat: add "enableMultiTenancy" CFN parameter (#382)

* fix: pass enableMultiTenancy to ES

* fix: remove _id, _tenantId from bulk export results

* feat: Group export scripts (#389)

* chore: script generating patient compartment search params

* feat: update Glue script for group export

* Upload patient compartment jsons to S3

* fix: allow more concurrent export jobs for multi-tenant deployments (#397)

* feat: add ES hard delete config value (#398)

* docs: add multi-tenancy docs (#400)

* fix: pass enableMultiTenancy flag to s3DataService

* test: add multi-tenancy integ tests (#387)

* test: integ tests for Group export (#393)

* chore: upgrade dependencies

* add public multi-tenant routes

* add system/read and user/delete permissions to defaults

* test: fix tests for smart multi-tenancy

* test: update gh actions to also test multi-tenant environment

* docs: update bulk export docs to mention group export

Co-authored-by: Yanyu Zheng <yz2690@columbia.edu>
3 participants