Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support WRITE access for consumer roles #1332

Closed
zsaltys opened this issue Jun 14, 2024 · 4 comments
Closed

Support WRITE access for consumer roles #1332

zsaltys opened this issue Jun 14, 2024 · 4 comments

Comments

@zsaltys
Copy link
Contributor

zsaltys commented Jun 14, 2024

Is your idea related to a problem? Please describe.
Currently data.all only grants READ only access to consumer IAM roles. However organizations like ours need to manage WRITE access as well to define which roles can write to which S3 buckets or which databases. Otherwise we end up managing read only access with data.all and all write access outside of data.all. We would like to unify and manage both read and write access via data.all.

Describe the solution you'd like
We need to think this through in context of:

  • S3 bucket sharing
  • Access points
  • LakeFormation
  • Incoming RedShift integration

S3 bucket sharing
Overall a very simple change. The IAM role just needs to be granted PUTObject permissions + KMS Encrypt permissions on the key etc. I would not want to grant DELETE or anything else besides PUT? Perhaps this can be configurable on config what permissions to grant so organizations can decide. Those extra permissions be granted by the user themselves on the IAM role using IaC when needed.

Access points
I don't know too much about these but I suspect they work similarly to S3 bucket sharing. Hope the team can clarify on this ticket.

LakeFormation
The question is what permissions we should grant here. We could limit only to just basic to add new partitions (don't know which permission controls that atm) which is what most of the writing roles will ever need. Or we can also grant things like DROP, INSERT, DELETE, CREATE. I think to be most useful to everyone WRITE should grant full write access with all the other permissions that are not currently granted. Though this could be inconsistent with S3 permissions if for example we only grant PUT on S3 but grant DELETE on the DB? Perhaps this could also be configurable but the default should be consistent for all.

RedShift
Don't know enough on this to comment.

My proposal would be that WRITE access is defined when creating a share. We don't want to do it when registering a consumer role because consumer role could also be used for both write access on same account and read access on other accounts (ex EMR role). We can either define WRITE access per share item or for the entire share. I think defining it for the entire SHARE makes sense. Default should be READ ONLY. Write access should only be selectable (and validated by backend) if the consumer role and dataset belong to the same environment. The only problem I see is currently share items let you select specific tables so we could grant write access to them. But how do we grant CREATE access on the DB? Do we just do it implicitly which can confuse the user.. What if he doesn't select any tables should we still grant WRITE access to the DB? The only way I can think of solving this is that there must be a new share item for the DATABASE itself.

We also must make sure the share validator / health checker is made aware of the new extra permissions.

@anmolsgandhi
Copy link
Contributor

Hi - Thanks for opening the issue, it will be part of v2.7 release. cc: @dlpzx

@dlpzx
Copy link
Contributor

dlpzx commented Jul 25, 2024

Hi @petrkalos, having a look at the docs, Redshift write permissions for datashares (which is what we are going to use) are currently in preview :https://docs.aws.amazon.com/redshift/latest/dg/multi-warehouse-writes-data-sharing.html

@petrkalos
Copy link
Contributor

Hi @zsaltys please take a look at the following video demonstrating the WRITE access workflow.
As you can see I am initially trying to PutObject as a scientist and get permission denied as expected.
I then request READ/WRITE/MODIFY access for the Scientist team and try again which succeeds.

Some commends wrt to your initial thoughts

  • I split the permissions to WRITE and MODIFY which will map to the corresponding rules depending on the share type (i.e for s3 WRITE will have PutObject and MODIFY will have DeleteObject etc). I prefer not to make this configurable as feature flags add complexity to the codebase.
  • consumer role and dataset belong to the same environment
    Since we are implementing this feature I'd prefer to make generally available and not have those restrictions that are not enforced by any tech limitations. At the end of the day a request for access is being made and approved.

write.mp4

@noah-paige noah-paige added this to the v2.7.0 Sprint 4 milestone Aug 27, 2024
petrkalos added a commit that referenced this issue Sep 2, 2024
### Feature or Bugfix
- Feature

### Detail
Adding support for requesting WRITE/MODIFY permissions for an S3Bucket

### Testing
Tested in local and dev delpoyment

### Relates
#1332 

### Security
Please answer the questions below briefly where applicable, or write
`N/A`. Based on
[OWASP 10](https://owasp.org/Top10/en/).

- Does this PR introduce or modify any input fields or queries - this
includes
fetching data from storage outside the application (e.g. a database, an
S3 bucket)?
  - Is the input sanitized?
- What precautions are you taking before deserializing the data you
consume?
  - Is injection prevented by parametrizing queries?
  - Have you ensured no `eval` or similar functions are used?
- Does this PR introduce any functionality or component that requires
authorization?
- How have you ensured it respects the existing AuthN/AuthZ mechanisms?
  - Are you logging failed auth attempts?
- Are you using or adding any cryptographic features?
  - Do you use a standard proven implementations?
  - Are the used keys controlled by the customer? Where are they stored?
- Are you introducing any new policies/roles/users?
  - Have you used the least-privilege principle? How?


By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.
@dlpzx
Copy link
Contributor

dlpzx commented Sep 5, 2024

Closing as complete. To be released in 2.7

@dlpzx dlpzx closed this as completed Sep 5, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Status: Done
Development

No branches or pull requests

5 participants