-
Notifications
You must be signed in to change notification settings - Fork 14
fix: Standardized aws service schemas using base schema #4
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
mfranceschit
commented
Apr 4, 2022
- Created a common interface for AWS schemas following the pattern at the GCP and Azure providers.
- Required changes to enable new RulesEngine optimization.
src/services/base/schema.graphql
Outdated
| arn: String! @id @search(by: [hash]) | ||
| accountId: String! @search(by: [hash, regexp]) | ||
| region: String @search(by: [hash, regexp]) | ||
| cgId: String @search(by: [hash]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not required? why do we need it in connections then?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is why. As I told you, it is an uncommon case, but it exists. It appears so infrequently that it is better to leave it that way instead of creating a new base interface that only we will use once or twice.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh, I see! thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this cg id for cloudtrail causes us a lot of issues. I wonder if it makes sense to reconsider our solution here. 🤔 Idk what we could do outside of limiting cloudtrails to only show up in the account they belong to. (I believe this is only an issue for cloudtrails in multi accounts)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@aacotroneo We decided to use the id and remove the cgId to reduce edge cases like cloudtrail.
|
you are purposely disabling all query, mutation, and subscription GraphQL APIs to be able to implement it on a service type(dodge a couple of DGraph errors). The directive is inherited as well. You should re-enable all the |
835c148 to
12ddf03
Compare
| accountId: String! @search(by: [hash]) | ||
| arn: String! @id @search(by: [hash, regexp]) | ||
| region: String @search(by: [hash, regexp]) | ||
| cgId: String @search(by: [hash, regexp]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why do we need to add this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because we're trying to standardize the common attributes that an AWS schema could have. Notice that is nullable because for this service the cgId doesn't perform a role. To keep the solution agnostic at the RulesEngine, we're using the extraFields field to create connections, take a look here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
but if its a standard field would it not be in the base schema and not here? I am just wondering why it is being added here but not to other schemas.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
how would this change the 'data contract'? At EP, we assume that every resource has a unique id. is it not possible to keep that nice property? (we'll need to code special cases to deal with this too)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@aacotroneo sadly the reason is due mainly to cloudtrail. because cloudtrail can live in multiple accounts (but still has the arn of the account where it was created), our multi account scanning had an issue. Either create a new id (which we did which is cgId) or only show the cloudtrail in the account it "belongs" to. (which I considered but was suggested from @ckoning that users will want to see it in both accounts as it would show in both in the management console). I am wondering if we want to reconsider this decision. Currently we use cgId because cloudtrail does have an id and we thought it may be confusing to users if the id field for cloudtrail in CG was NOT the actual cloudtrail id. Its a difficult problem with no easy solutions (outside of composite ids which dgraph doesnt have). @ckoning now that this problem has had some time to sit, what do you think would be best here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, but that's an unneeded restriction, and it's harder to reason about as a user of CG:
contract 1:
id - an opaque string that uniquely identifies a resource across accounts
contract 2:
id - a value that corresponds directly to an aws id. when that value is not unique across accounts refer to cgID
cgID: an opaque string that uniquely represents a resource across accounts, only meaningful when IDs clash
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes I am not a fan of the cgId paradigm. I would prefer to not have to use it. But idk if the trade off of losing the value of contract 2 (part 1) would be worth moving to contract 1. I preferred only having the cloudtrail show up in the account that owned it (although I understand the drawbacks of that as well). Moving the random string to the id also brings with it another issue. The cgId value is only initialized in the format function, but the connections functions deal with the raw data. So the connections files would have no access to the value needed to form connections to the cloudtrail service. I suppose we could initialize it in the data function... 🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would prefer contract 1. It's easier to implement and it follows a clear pattern, especially because of Dgraph limitations. That opaque string could be an ARN for most of the cases and a unique identifier for those services that don't have one. I'm leaning toward standardized how providers behave between them, which would allow keeping our paradigm agnostic, even move forward with the code generators projects.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lets chat on a huddle for this so we can discuss pros and cons.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It sounds like we are going to use a generated string as the id for cloudtrail instances after a convo with @mfranceschit and see how that goes.
Release Github Actions pipelines to Beta channel
Releases Github Actions pipelines to main branch
12ddf03 to
477d353
Compare
477d353 to
f0f7cfd
Compare