Chore/docs #228
When data files are submitted to a Gen3 data commons using Sheepdog, the files are automatically indexed into Indexd. Sheepdog checks whether the file being submitted has a hash and file size that match an existing Indexd record and, if so, uses the returned document GUID as the object ID reference. If no match is found, a new record is created and stored in Indexd.
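The match-or-create behaviour described above can be sketched roughly as follows. This is a minimal illustration, not Sheepdog's or Indexd's actual code: the `records` dictionary is a hypothetical in-memory stand-in for Indexd, `find_or_create` is an invented helper name, and the matching rule (same size plus at least one shared hash value) is an assumption.

```python
import uuid

# Hypothetical in-memory stand-in for Indexd: GUID -> record.
records = {}


def find_or_create(hashes, size, urls=None):
    """Return the GUID of a record matching hash and size, creating one if none exists."""
    for guid, rec in records.items():
        # Assumed rule: a match needs the same size and at least one shared hash value.
        same_hash = any(rec["hashes"].get(k) == v for k, v in hashes.items())
        if rec["size"] == size and same_hash:
            return guid
    # No match found: create and store a new record under a fresh GUID.
    guid = str(uuid.uuid4())
    records[guid] = {"hashes": dict(hashes), "size": size, "urls": list(urls or [])}
    return guid
```

Submitting the same hash and size twice yields the same GUID, while a different size produces a new record.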
1) Fence requests a blank object from Indexd. Indexd creates an object with no hash, size or URLs, only the `uploader` and optionally `file_name` fields.
2) The Indexd listener monitors bucket updates and updates Indexd with the URL, hash and size.
3) The client application (windmill or gen3-data-client) lists records for data files which the user needs to submit to the graph. The user fills in all empty fields and submits the request to Indexd to update the `authz` or `acl`.
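The three steps above can be sketched as state changes on a single record. This is only an illustration under stated assumptions: `index` is a hypothetical in-memory stand-in for Indexd, all function names are invented, and treating a record as "still to be submitted" when it has neither `authz` nor `acl` is an assumption.

```python
import uuid

# Hypothetical in-memory stand-in for Indexd: GUID -> record.
index = {}


def create_blank(uploader, file_name=None):
    """Step 1: a blank record with no hash, size or URLs."""
    guid = str(uuid.uuid4())
    index[guid] = {"uploader": uploader, "file_name": file_name,
                   "hashes": {}, "size": None, "urls": [],
                   "authz": [], "acl": []}
    return guid


def on_bucket_update(guid, url, hashes, size):
    """Step 2: the listener fills in URL, hash and size once the file lands."""
    rec = index[guid]
    rec["urls"].append(url)
    rec["hashes"].update(hashes)
    rec["size"] = size


def unmapped_records(uploader):
    """Step 3a: records from this uploader that have no access control yet."""
    return [g for g, r in index.items()
            if r["uploader"] == uploader and not r["authz"] and not r["acl"]]


def map_record(guid, authz=None, acl=None):
    """Step 3b: the user supplies `authz` or `acl` for the record."""
    rec = index[guid]
    if authz:
        rec["authz"] = list(authz)
    if acl:
        rec["acl"] = list(acl)
```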
this is not super clear/specific but since it's not intended to be a Data Upload user doc, it's probably fine. maybe you can link to https://gen3.org/resources/user/submit-data/#3-map-uploaded-files-to-a-data-file-node?
this isn't for the data upload flow though, this is assuming the files already exist in storage
For data that already exists in buckets, the SNS or PubSub notifications may be simulated so that the indexing functions are started for each object in the bucket. This is useful because only a single code path is needed to index the contents of a bucket.
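A rough sketch of the single-code-path idea follows. The event dictionary shape, the `indexed` store and all function names are hypothetical; real SNS and PubSub messages have their own formats, which this example does not attempt to reproduce.

```python
# Hypothetical stand-in for the index: URL -> metadata.
indexed = {}


def handle_event(event):
    """The single indexing code path, used for real and simulated notifications."""
    url = f"{event['scheme']}://{event['bucket']}/{event['key']}"
    indexed[url] = {"size": event["size"]}
    return url


def simulate_events(bucket, objects, scheme="s3"):
    """Yield one synthetic event per object that already exists in the bucket."""
    for key, size in objects:
        yield {"scheme": scheme, "bucket": bucket, "key": key, "size": size}


def backfill(bucket, objects):
    """Index existing objects by feeding simulated events to the same handler."""
    return [handle_event(e) for e in simulate_events(bucket, objects)]
```

Because `backfill` only builds events and delegates to `handle_event`, any change to the indexing logic applies equally to new uploads and to existing objects.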
3. Indexing void objects to fully control the bucket structure.
## Indexd REST API for Record Creation
this section could be a subsection of https://github.com/uc-cdis/indexd/tree/chore/docs#i-want-to-associate-indexd-data-to-structured-data-in-a-gen3-data-commons: either create records through sheepdog or directly in indexd
ooh, good point
well, it doesn't only fit under the use case there though. I think I wanna have this outside for more visibility
Co-Authored-By: Pauline Ribeyre <ribeyre@uchicago.edu>
:D
New Features
Breaking Changes
Bug Fixes
Improvements
Dependency updates
Deployment changes