Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

AppSync transform violates DynamoDB rules for sparse index for GSI #262

Closed
arkllc opened this issue Oct 2, 2020 · 16 comments
Closed

AppSync transform violates DynamoDB rules for sparse index for GSI #262

arkllc opened this issue Oct 2, 2020 · 16 comments

Comments

@arkllc
Copy link

arkllc commented Oct 2, 2020

DynamoDB spec allows the GSI sort key to be null. The AppSync transform however forces it to be non-null. This was working fine until 4.18.0 but the bug was introduced thereafter.

Please refer to DynamoDB documentation here - https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/GSI.html

Here is the section on the page -
"A global secondary index only tracks data items where its key attributes actually exist. For example, suppose that you added another new item to the GameScores table, but only provided the required primary key attributes.
Because you didn't specify the TopScore attribute, DynamoDB would not propagate this item to GameTitleIndex. Thus, if you queried GameScores for all the Comet Quest items, you would get the following four items. A similar query on GameTitleIndex would still return three items, rather than four. This is because the item with the nonexistent TopScore is not propagated to the index."

Also some other documentation here https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/bp-indexes-general-sparse-indexes.html

Amplify CLI Version
Introduced in versions after 4.18.0

Additional context
I had created another issue 4413 but there was no response for a long time. I am stuck until this is implemented hence I created another bug to help get attention.

All the details are in the issue link above.

@nikhname
Copy link
Contributor

nikhname commented Oct 2, 2020

Hello @arkllc, we don't allow nullable fields in the index since if a single field is null, composite key creation will not work. This is because composite keys are created by hashing the first key with the second key (SORT_KEY_1_VAL#SORT_KEY_2_VAL). As an enhancement we will look into using a placeholder for null values.

@arkllc
Copy link
Author

arkllc commented Oct 3, 2020

Thanks, @nikhname for the update. Any idea when this will be fixed? Need to understand if I should change my implementation or should I wait for this fix?

@arkllc
Copy link
Author

arkllc commented Oct 21, 2020

@nikhname @ammarkarachi could you please provide guidance in the meantime until this is fixed. If it's going to be fixed soon then I will wait it out. Else I have to change my implementation. Please let me know ASAP.

@arkllc
Copy link
Author

arkllc commented Nov 28, 2020

Not sure if anyone looking into this issue. All I can see is that many other folks are facing the issue but somehow the owners of this project are not prioritizing this issue. Or not even providing any guidance around what to do in the meanwhile.

Would appreciate any updates.

@frankclaassen
Copy link

I see this as a critical design flaw in the amplify / appsync implementation as the current implementation does not cater for real world data. For a system to enforce a value for something that does not exist or could potentially not exist like a person's middle name is ridiculous and utterly poorly thought out. This seems like a very simple fix, if the supplied value is empty, then it will not cause any errors when constructing the composite key value. I can do this with my own code without issues, can't see why this cannot be fixed in appsync / amplify. This is not an enhancement but something which should be inherently supported by default in a framework and the lack thereof makes it a critical design / implementation flaw

@arkllc
Copy link
Author

arkllc commented Dec 30, 2020

@ammarkarachi @nikhname any guidance here??

@xitanggg
Copy link

This seems to be addressed by the recent merge: aws-amplify/amplify-cli#5153

@arkllc
Copy link
Author

arkllc commented Jan 28, 2021

Thanks, @xitanggg. When will this be released? Also, how do I test this now?

@arkllc
Copy link
Author

arkllc commented May 1, 2021

@RossWilliams Will aws-amplify/amplify-cli#5153 fix this bug too? If yes, I am on 4.50.0 but somehow I still am not able to create data with null secondary keys. So I am unclear if I am missing anything here. Appreciate it in advance, if you could please help clarify

@RossWilliams
Copy link

@arkllc If you provide a code example I can try to help.

@arkllc
Copy link
Author

arkllc commented May 29, 2021

@RossWilliams Thanks for your offer.

Here is the code sample. I have a Metrics table that has belongsTo relationship with Tenant table. Now each tenant could have many JobDefinitions and Job executions. The same Metrics table can also belongTo JobDefinitions and JobExecutions. Now, as JobDefinitions and JobExecutions dont happen at the very creation Tenant itself, the jobdefinitions and jobExecutions are secondary keys and not mandatory!.

type Metrics @model
    @key(name: "byTenantIDByCreatedDate", fields: ["tenantID","createdAt"], queryField: "MetricsByTenantIDByCreatedDate")
    @key(name: "byTenantIDByJobDefinitionID", fields: ["tenantID","jobDefinitionID"], queryField: "MetricsByTenantIDByJobDefinitionID")
    @key(name: "byTenantIDByJobDefinitionIDByJobExecutionID", fields: ["tenantID","jobDefinitionID","jobExecutionID"], queryField: "MetricsByTenantIDByJobDefinitionIDByJobExecutionID")
    @key(name: "byTenantIDByJobDefinitionIDByJobExecutionIDByCreatedDate", fields: ["tenantID","jobDefinitionID","jobExecutionID","createdAt"], queryField: "MetricsByTenantIDByJobDefinitionIDByJobExecutionIDByCreatedDate")
    @auth(rules: [{allow: groups, groups: ["Admin"], operations: [create, read, update, delete]},
                  {allow: groups, groupsField: "groupsCanAccess", operations: [read]},
                  {allow: private, provider: iam, operations: [create,read, update ] }
                  ]) {
  id: ID!
  tenantID: ID!
  tenant: Tenant  @connection(fields: ["tenantID"])
  jobDefinitionID: ID #Note no ! 
  jobExecutionID: ID #Note no ! 
  ## Some For each Job Executions
  ## Some For each Job Definitions 
  ## For each Tenant
  groupsCanAccess : [String]
  owner: String
  createdAt: AWSDateTime
}

Before 4.18 version of appsync I could create Metrics table with no issue. However now when I do the following mutation

mutation CreateTenantsMetrics {
  T1M: createMetrics(input: {tenantID: "T1-T", groupsCanAccess: ["T1-T-Admin", "T1-T-PowerUser", "T1-T-User"]}) {
    id
  }
}

I get the error

"message": "When creating any part of the composite sort key for @key 'byTenantIDByJobDefinitionIDByJobExecutionIDByCreatedDate', you must provide all fields for the key. Missing key: 'jobDefinitionID'."

Appreciate your help in advance

@arkllc
Copy link
Author

arkllc commented May 29, 2021

@RossWilliams Also please note that DynamoDB table design actually encourages the use of sparse indexes and not create different tables. Hence the workaround of creating specific metrics table for like JobDefinitionMetrics and JobExecutionMetrics is really not recommended and I would like to avoid them as much as possible.

@RossWilliams
Copy link

RossWilliams commented Jun 10, 2021

@arkllc You have identified a documentation problem, and I can see how it is confusing.
Amplify supports sparse GSIs, but it does not support sparse GSIs when using a composite sort key (more than 1 sort key property).

The reason why this happens is due to developer choice in the way the transform was made. It is trying ensure the value of the composite key is the same as the underlying data. If a user tries to update a model, but only supplies some of the composite key information, the system does not know how to update the composite key. Not allowing the update operation makes sense in this case.

A way the code could be improved is to make a condition check in the dynamodb update operation to see if the composite key is currently set, and only fail the update operation if both the composite key has previously been set AND the user is trying to update only some of the composite key. This will cause the error message to show up much less often, but might not show up until production.

If you want some workarounds of the existing code:

  1. Manage the GSI keys fields yourself. set a GSI1PK and GSI1SK property on your model, and make sure they are kept up to date yourself. Since it isn't a composite key, one or both can be null.
  2. Set default values for the optional fields as "!!!" or something similar and pay the extra costs for adding items to a GSI that you will never need.

@arkllc
Copy link
Author

arkllc commented Jun 12, 2021

Thanks @RossWilliams for looking into this.

So can AWS improve the code as you outlined based on this issue or does one need to create another issue? Which is really the right way of solving this issue.

I also appreciate the workarounds but aws-amplify/amplify-cli#1 means writing and managing more which is exactly what I am trying to avoid. aws-amplify/amplify-cli#2 is a bit more elegant at some extra cost. However does AppSync have any way that I can declare a default value in schema in case the user does not provide a value?

@RossWilliams
Copy link

2 is harder than it first appears to do in a generic way, you may be better off handling it in your client code. This is because the value is not just a default for when an item is created, but it needs to be supplied to all update operations as well, and you likely do not want your set value to be overridden with the default value. Instead, in your client code, you should always pass in the current value of these fields, and in the create operations, supply the default as well. This is not easy to model in a vtl template.

If you want to keep your client very clean I'd recommend using a lambda mutation resolver instead of relying on the VTL template. But again this means managing more code yourself.

I can't comment on whether the code can or should be changed at this point. It will change the behaviour for users. Currently users are given a friendly warning message by the vtl request template that they must supply specific properties in a mutation. If the condition check was moved to dynamodb, it is not as easy (or possible) to provide a user-friendly warning message, and instead users will receive a generic condition check failed error. You could fork the @key directive and create your own plugin. Change the lines here to insert a condition expression. The condition expression will be added if any of the compound key items are missing from the input list, and will assert that the compound key does not exist. But I understand that is additional work for you to do.

@siegerts siegerts added the feature-request New feature or request label Sep 3, 2021
@josefaidt josefaidt added the p3 label Mar 24, 2022
@alharris-at alharris-at transferred this issue from aws-amplify/amplify-cli May 17, 2022
@renebrandel
Copy link
Contributor

Closing - confirming Ross' statements above.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

8 participants