Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Can't differentiate between read and write in ConsumedCapacity #112

Closed
bvisness opened this issue Jan 13, 2020 · 6 comments
Closed

Can't differentiate between read and write in ConsumedCapacity #112

bvisness opened this issue Jan 13, 2020 · 6 comments

Comments

@bvisness
Copy link

The ConsumedCapacity type in guregu is basically the same as the ConsumedCapacity type in the AWS SDK (but "with less pointers"), except that where the Table, GlobalSecondaryIndexes, and LocalSecondaryIndexes fields in the AWS SDK further break down reads and writes...

// some fields omitted
type ConsumedCapacity struct {
	// The amount of throughput consumed on each global index affected by the operation.
	GlobalSecondaryIndexes map[string]*Capacity `type:"map"`

	// The amount of throughput consumed on each local index affected by the operation.
	LocalSecondaryIndexes map[string]*Capacity `type:"map"`

	// The amount of throughput consumed on the table affected by the operation.
	Table *Capacity `type:"structure"`
}

type Capacity struct {
	// The total number of capacity units consumed on a table or an index.
	CapacityUnits *float64 `type:"double"`

	// The total number of read capacity units consumed on a table or an index.
	ReadCapacityUnits *float64 `type:"double"`

	// The total number of write capacity units consumed on a table or an index.
	WriteCapacityUnits *float64 `type:"double"`
}

...guregu has opted to throw away the read/write breakdown and just report the total (corresponding to CapacityUnits above):

// some fields omitted
type ConsumedCapacity struct {
	// GSI is a map of Global Secondary Index names to consumed capacity units.
	GSI map[string]float64
	// GSI is a map of Local Secondary Index names to consumed capacity units.
	LSI map[string]float64
	// Table is the amount of throughput consumed by the table.
	Table float64
}

This makes it impossible to accurately break down consumed capacity for complex queries, and for no reason I can discern. If the goal is to have fewer pointers, you can still have a Capacity struct that breaks down read and write according to the information returned by Dynamo.

I can see a couple ways forward to fix this:

  1. Change the custom ConsumedCapacity type to include all three pieces of information present in the AWS response. This would unfortunately have to be a breaking change, or would require new fields to be introduced.
  2. Expose the raw AWS SDK ConsumedCapacity type, pointers and all. This would not be breaking.
@guregu
Copy link
Owner

guregu commented Jan 14, 2020

I didn't "opt to throw away" the read and write data. It didn't exist as part of the API when this was implemented. It's obvious if you glance at the old Capacity or ConsumedCapacity structs. It looks like read and write data was added when transactions were implemented.

Anyway, this change unfortunately puts us in an awkward position. I don't want to add breaking changes just yet, but I'll take a note of this for potential 2.0 changes. In the meantime, I'll add some fields to ConsumedCapacity with the read/write data.

Believe it or not, this library has been stable for longer than the official SDK. Many changes and additions to the DynamoDB API have happened since then. I would appreciate if you took that into consideration before assuming that I am throwing away your data for no reason.

@guregu
Copy link
Owner

guregu commented Jan 14, 2020

I released v1.6.0 which adds fields to ConsumedCapacity for the indexes and the table read/write info. It's not ideal having a separate map for reads and writes but it's the best we can do until v2.0.
However after doing some testing for this feature I noticed the read and write consumed capacity is only returned by the DynamoDB API for transactions. Other actions only return the total. I added some documentation to indicate this. I can't find any documentation about this behavior from AWS. It doesn't seem to be a bug in the Go AWS SDK because the raw HTTP responses lack this data. If you discover any more details about this please let me know.
I'll close this for now but re-open it if you find any issues.

@guregu guregu closed this as completed Jan 14, 2020
@bvisness
Copy link
Author

Thanks much for that. I’m sorry for assuming the worst when I didn’t have all the context.

That’s strange to hear that the API only returns totals. I also can’t find documentation about that, plus I’m finding some actually inaccurate gems like this:

9C644470-CF50-4430-A8FA-9EB0A8BC7398

I think I’ll ask our AWS rep today for more details and see if they can clarify that in the documentation. Thanks for bringing that to our attention.

Thanks for the quick turnaround, and apologies again for being rude.

@bvisness
Copy link
Author

The documentation is still mostly unclear, but I think things are coming together. I don't think there are actually any DynamoDB operations (besides transactions!) that consume both RCU and WCU.

For example, take a look at these sections, which list the operations that consume RCU and the operations that consume WCU.

See also the ReturnValues parameter of UpdateItem, which states that no matter what data you choose to return, no RCU will be consumed.

I think the most likely explanation now (which I will confirm with our AWS rep) is that the ConsumedCapacity type was intended to report either RCU or WCU, depending on what type of operation you did. But, when they added transactions, you could combine both RCU and WCU operations into a single request, so they needed to be able to differentiate them. However, since they didn't clarify this behavior in the docs, I think the ConsumedCapacity type now suggests a different mode of behavior.

@bvisness
Copy link
Author

bvisness commented Jan 15, 2020

I confirmed with AWS support that there are no operations that consume both RCU and WCU (besides transactions, as you noted).

@guregu
Copy link
Owner

guregu commented Jan 21, 2020

That makes a lot of sense, thanks! I'll improve the godoc for these later.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants