Mongo 1: Server side implementation #1066
BryonLewis
left a comment
I'm not done reviewing but I wanted to log some comments (in the code) before I complete it.
This is a list for myself of things to look back at:
- I need to look a bit more into the patching and see if I can break it.
- /process_items: I need to look a bit more into corner cases regarding the track.json and others.
- migration: I need to look closer at the migration script.
    # Write the revision to the log
    additions = len(insert_operations)
    deletions = len(update_operations) - additions
Could this number be negative? During an overwrite=True, update_operations would have a length of zero if items were only inserted, so the log_entry would have a deletions value of -additions. I think this is true during any processing of an imported dataset. I've confirmed this by looking at the revision log after uploading new data.
yep, I messed this up
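The arithmetic behind this concern can be reproduced in isolation. This is a minimal sketch, assuming insert_operations and update_operations are plain lists; the function name revision_counts and the sample values are hypothetical stand-ins, not code from the PR.

```python
from typing import List, Tuple


def revision_counts(insert_operations: List, update_operations: List) -> Tuple[int, int]:
    """Mirror the arithmetic from the diff above (hypothetical helper)."""
    additions = len(insert_operations)
    deletions = len(update_operations) - additions
    return additions, deletions


# A fresh import: three inserts and no update operations at all.
print(revision_counts(["t1", "t2", "t3"], []))  # prints (3, -3)
```

With no update operations, the deletions count is simply the negated additions count, which matches what was seen in the revision log after uploading new data.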
    .pagingParams("revision", defaultLimit=20)
    .modelParam("folderId", **DatasetModelParam, level=AccessType.READ)
    )
    def get_revisions(self, limit: int, offset: int, sort, folder):
Just wondering if the default sort order should be descending instead of ascending? In actual usage you may want the default to be the date/ltime/description of the last 20 instead of the first 20. This is further influenced by the default behavior of get_revisions, which I think defaults to descending. So the endpoint defaults to ascending but the function defaults to descending.
I see that this was added to GET /dive_dataset, which is probably a good idea. I meant for it to be added to the revisions endpoint here, GET /dive_annotation/revision.
BryonLewis
left a comment
I don't know how much migrations.py was supposed to be an example vs an actual path to migration. I left a few comments on it.
337d8c6 to 1578509
Responded to comments.
* Poetry fix * Add check for app root
BryonLewis
left a comment
Additionally you may want to add in the poetry fix for the web server so this can be docker-composed and run for testing.
* init * Allowing multicam to write tracks * inut name change * lint fixes * Updates to multicam * fixing import loading * removing multicamImageFiles change * Fixing various issues * mend * mend * switching to every * Example without fetching metadata (#1088) Co-authored-by: Brandon Davis <brandon.davis@kitware.com>
1578509 to 477f794
BryonLewis
left a comment
Pulled and checked my comments, did some more basic testing and it looks good.
* Poetry fix (#1087) * Poetry fix * Add check for app root * Desktop/sealion multicam (#1024) * init * Allowing multicam to write tracks * inut name change * lint fixes * Updates to multicam * fixing import loading * removing multicamImageFiles change * Fixing various issues * mend * mend * switching to every * Example without fetching metadata (#1088) Co-authored-by: Brandon Davis <brandon.davis@kitware.com> * Server-side implementation * Include description and timestamp * Respond to comments * Switch to checking mongo results * select sub-element * Unmangle indices Co-authored-by: BryonLewis <61746913+BryonLewis@users.noreply.github.com>
* Mongo 1: Server side implementation (#1066) * Poetry fix (#1087) * Poetry fix * Add check for app root * Desktop/sealion multicam (#1024) * init * Allowing multicam to write tracks * inut name change * lint fixes * Updates to multicam * fixing import loading * removing multicamImageFiles change * Fixing various issues * mend * mend * switching to every * Example without fetching metadata (#1088) Co-authored-by: Brandon Davis <brandon.davis@kitware.com> * Server-side implementation * Include description and timestamp * Respond to comments * Switch to checking mongo results * select sub-element * Unmangle indices Co-authored-by: BryonLewis <61746913+BryonLewis@users.noreply.github.com> * Mongo 2: Rollback API (#1067) * Implement rollback * Switch to POST /dive_annotation/rollback * Mongo 3: Utilize new endpoints in celery (#1068) * Utilize new endpoints in celery * Respond to comments * Linting, formatting, and unit tests * respond to comments * Import shutil * Client changes to support revisions (#1070) * Remove broken summary and report generation (#1071) * Add simple sharing test and new indices * Migraction script updates * Add loading state to clone button * label fetch Co-authored-by: BryonLewis <61746913+BryonLewis@users.noreply.github.com>
Reviewer notes
Design
This is a soft delete implementation that tolerates data duplication in favor of defending against data loss.
With rev_created and rev_deleted, you can check out any point in history. This is the simplest implementation I could come up with, and should be looked at critically. I'm open to changing it.
Logic - Creation
- revision is an increasing integer key (per dataset).
- rev_created is set as the current revision.
- When a newer revision replaces or removes an annotation, the rev_deleted field is set, marking that previous version as lazy-deleted.
Logic - Checkout
{ DATASET: dsFolder['_id'], REVISION_CREATED: {'$lte': head}, '$or': [{REVISION_DELETED: {'$gt': head}}, {REVISION_DELETED: {'$exists': False}}], }
"Get annotations from this dataset where revision_created is less than or equal to target and either deleted is greater than target or deleted is not set."
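The creation and checkout rules above can be tried out without a database. What follows is a minimal in-memory sketch for a single dataset, not the PR's implementation: the RevisionStore class, its save and checkout methods, and the row layout are all hypothetical, and the checkout filter is a plain-Python restatement of the query's conditions.

```python
from typing import Any, Dict, List, Optional


class RevisionStore:
    """In-memory stand-in for one dataset's annotation rows (hypothetical)."""

    def __init__(self) -> None:
        self.head = 0  # latest revision number, increasing per dataset
        self.rows: List[Dict[str, Any]] = []

    def save(self, upsert: Dict[int, Any], delete: Optional[List[int]] = None) -> int:
        """Record one save as a new revision and return its number."""
        self.head += 1
        touched = set(upsert) | set(delete or [])
        for row in self.rows:
            # Lazy-delete the live version of every touched track.
            if row["id"] in touched and "rev_deleted" not in row:
                row["rev_deleted"] = self.head
        for track_id, data in upsert.items():
            # New versions are always appended, never written in place.
            self.rows.append({"id": track_id, "data": data, "rev_created": self.head})
        return self.head

    def checkout(self, target: int) -> Dict[int, Any]:
        """Tracks visible at `target`: created at or before it, not yet deleted."""
        return {
            row["id"]: row["data"]
            for row in self.rows
            if row["rev_created"] <= target
            and ("rev_deleted" not in row or row["rev_deleted"] > target)
        }


store = RevisionStore()
r1 = store.save({1: "a", 2: "b"})  # create two tracks
r2 = store.save({1: "a2"})         # replace track 1
r3 = store.save({}, delete=[2])    # remove track 2
```

Checking out r1 still returns both original tracks even after the later saves, which is the duplication-for-safety tradeoff described under Design.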
Deconflicting
There's another concern we could address here: deconflicting multi-user saves. Suppose User A loads, User B loads, User B saves, and then User A saves. Currently, each save is an overwrite, so User A might break some of B's work.
Now that we have revisions, each save could require that the user provide the revision ID they loaded; if the ID has changed since they loaded, the server could attempt to merge and detect a collision. This could be considered in-scope, but I think maintaining the existing behavior for now is best.
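The revision check described above could look roughly like this. It is a sketch of the rejection case only (no merge attempt), and every name in it (Dataset, SaveConflict, loaded_revision) is hypothetical rather than part of this PR.

```python
from typing import Any, Dict, Tuple


class SaveConflict(Exception):
    """Raised when the dataset changed after the client loaded it."""


class Dataset:
    """Hypothetical dataset holding a head revision and its current tracks."""

    def __init__(self) -> None:
        self.head = 0
        self.tracks: Dict[int, Any] = {}

    def load(self) -> Tuple[int, Dict[int, Any]]:
        """Return the current revision together with a copy of the tracks."""
        return self.head, dict(self.tracks)

    def save(self, loaded_revision: int, tracks: Dict[int, Any]) -> int:
        """Accept the save only if nobody else saved since `loaded_revision`."""
        if loaded_revision != self.head:
            raise SaveConflict(
                f"loaded revision {loaded_revision}, but head is now {self.head}"
            )
        self.tracks = dict(tracks)
        self.head += 1
        return self.head
```

In the User A / User B sequence, B's save succeeds and moves the head forward, so A's later save with the stale revision raises SaveConflict instead of silently overwriting B's work.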
There are even advanced options where we could utilize the notification stream to deliver notifications to all users who have a given dataset open. This is well out of scope for this stage of changes, but interesting to think about.