Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add DELETE method for data API #1132

Merged
merged 12 commits into from
Jan 14, 2023
Merged

Conversation

nguyer
Copy link
Contributor

@nguyer nguyer commented Jan 6, 2023

Signed-off-by: Nicko Guyer <nicko.guyer@kaleido.io>
Signed-off-by: Nicko Guyer <nicko.guyer@kaleido.io>
@codecov-commenter
Copy link

codecov-commenter commented Jan 6, 2023

Codecov Report

Merging #1132 (6398aed) into main (42f5105) will increase coverage by 0.37%.
The diff coverage is 100.00%.

@@             Coverage Diff             @@
##             main     #1132      +/-   ##
===========================================
+ Coverage   99.62%   100.00%   +0.37%     
===========================================
  Files         303       305       +2     
  Lines       19795     19963     +168     
===========================================
+ Hits        19721     19963     +242     
+ Misses         74         0      -74     
Impacted Files Coverage Δ
internal/apiserver/route_get_data_blob.go 100.00% <ø> (ø)
internal/apiserver/routes.go 100.00% <ø> (ø)
internal/events/event_manager.go 100.00% <ø> (ø)
internal/shareddownload/download_manager.go 100.00% <ø> (ø)
internal/apiserver/route_delete_data.go 100.00% <100.00%> (ø)
internal/broadcast/manager.go 100.00% <100.00%> (ø)
internal/broadcast/operations.go 100.00% <100.00%> (ø)
internal/data/blobstore.go 100.00% <100.00%> (ø)
internal/data/data_manager.go 100.00% <100.00%> (ø)
internal/database/sqlcommon/blob_sql.go 100.00% <100.00%> (ø)
... and 27 more

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

Signed-off-by: Nicko Guyer <nicko.guyer@kaleido.io>
Copy link
Contributor

@peterbroadhurst peterbroadhurst left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great @nguyer!

A couple of minor changes requested.

DROP TABLE blobs;
ALTER TABLE temp_blobs RENAME TO blobs;
CREATE INDEX blobs_hash_data_id ON blobs(hash, data_id);
CREATE INDEX blobs_payload_ref ON blobs(payload_ref);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note for self: Confirm why we need this reverse lookup index

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok - this is for a compatibility check during deletion, that we do not delete a payload ref in DX if it's referred to by another data ID.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct 🙂

DROP INDEX blobs_hash;
DROP TABLE blobs;
ALTER TABLE temp_blobs RENAME TO blobs;
CREATE INDEX blobs_hash_data_id ON blobs(hash, data_id);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this just be namespace, data_id? Not sure the hash adds value, and particularly not first as data_id is the thing we expect to be unique in most practical cases.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought we had discussed this and decided that we should always be looking up blobs by both their hash and data ID to prevent spoofing of different data?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, but I don't think that changes my suggestion on the index that's most efficient.

I believe the lookup is:
{ ns: "mynamespace", data_id: "id12345", hash: "abce1234" }

I assert the most efficient index to support this lookup is:
namespace, data_id

  • Leaving off namespace feels wrong
  • Adding in a 3rd dimension of hash seems unnecessary (as the scan length should be trivially small as we're covering a very minor edge case there)

internal/apiserver/route_delete_data.go Outdated Show resolved Hide resolved
internal/events/dx_callbacks.go Outdated Show resolved Hide resolved
Signed-off-by: Nicko Guyer <nicko.guyer@kaleido.io>
Signed-off-by: Nicko Guyer <nicko.guyer@kaleido.io>
Signed-off-by: Nicko Guyer <nicko.guyer@kaleido.io>
Signed-off-by: Nicko Guyer <nicko.guyer@kaleido.io>
Signed-off-by: Nicko Guyer <nicko.guyer@kaleido.io>
}
if data.Blob != nil && data.Blob.Hash != nil {

blob, err := dm.database.GetBlob(ctx, dm.namespace.Name, data.ID, data.Blob.Hash)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe we determined there could be multiple blobs at this point @nguyer

e.g. I was expecting to see:

blobs, err := dm.database.GetBlobs(ctx, db.namespace.Name, fb.And(fb.Eq("data_id",data.ID), fb.Eq(data.Blob.Hash))

I'm not sure it even makes sense to have a GetBlob database-level action, because the only unique key for a record is the Sequence key (which we correctly use in delete).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The specific case I'm thinking of, is when you receive the same data, with the same ID/Hash from multiple peers in your business network, on different privacy groups, in multiple different messages.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for working that all through 👍

Signed-off-by: Nicko Guyer <nicko.guyer@kaleido.io>
pkg/database/plugin.go Outdated Show resolved Hide resolved
Copy link
Contributor

@awrichar awrichar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall looks very good. Noted a few spelling items and some questions around namespace scoping in the new landscape.

Signed-off-by: Nicko Guyer <nicko.guyer@kaleido.io>
Signed-off-by: Nicko Guyer <nicko.guyer@kaleido.io>
Signed-off-by: Nicko Guyer <nicko.guyer@kaleido.io>
Copy link
Contributor

@peterbroadhurst peterbroadhurst left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🗄️
🔥
🗑️

@peterbroadhurst peterbroadhurst merged commit cac029c into hyperledger:main Jan 14, 2023
@peterbroadhurst peterbroadhurst deleted the delete-data branch January 14, 2023 05:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants