Skip to content

Mongo 1: Server side implementation#1066

Merged
subdavis merged 8 commits into
develop-mongo-tracksfrom
mongo/server-side-implementation
Dec 20, 2021
Merged

Mongo 1: Server side implementation#1066
subdavis merged 8 commits into
develop-mongo-tracksfrom
mongo/server-side-implementation

Conversation

@subdavis
Copy link
Copy Markdown
Contributor

@subdavis subdavis commented Nov 22, 2021

Reviewer notes

  • This is the largest change and will require the most discussion
  • It leaves jobs (pipelines/training/summary) in a broken state fixed by Mongo 3: Utilize new endpoints in celery #1068, but I spearated them because the changes are pretty much completely independent, and I thought it would be easier to review by breaking it apart.

Design

This is a soft delete implementation that tolerates data duplication in favor of defending against data loss.

  • By modeling the annotations with a rev_created and rev_deleted, you can check out any point in history. This is the simplest implementation I could come up with, and should be looked at critically. I'm open to changing it.
  • There is also a revision log collection. This collection is non-critical (doesn't need to be joined in order to reconstruct history). It's just a convenience for attaching metadata to revisions, such as author, description, and timestamp.

Logic - Creation

  • revision is an increasing integer key (per dataset)
  • When a track is saved, the full track is saved as a single mongo record. rev_created is set as the current revision
  • If a previous version of the track existed, its rev_deleted field is set, marking that previous version as lazy-deleted.

Logic - Checkout

{
        DATASET: dsFolder['_id'],
        REVISION_CREATED: {'$lte': head},
        '$or': [{REVISION_DELETED: {'$gt': head}}, {REVISION_DELETED: {'$exists': False}}],
    }

"Get annotations from this dataset where revision_created is less than or equal to target and either deleted is greater than target or deleted is not set."

Deconflicting

There's another concern we could address here: deconflicting multi-user saves. Suppose User A loads, User B Loads, User B saves, and User A saves. Currently, each save is an overwrite and User A might break some of B's work.

Now that we have revisions, each save could require that the user provide the ID they loaded and if the ID has changed since they loaded, the server could attempt to merge and detect a collision. This could be considered in-scope, but I think maintaining the existing behavior for now is best.

There are even advanced options where we could utilize the notification stream to deliver notifications to all users who have a given dataset open. This is well out of scope for this stage of changes, but interesting to think about.

Copy link
Copy Markdown
Collaborator

@BryonLewis BryonLewis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not done reviewing but I wanted to log some comments (in the code) before I complete it.
This is an list for myself for stuff to look back at:

  • Need to look a bit more into the patching and seeing if I can break that.
  • /process_items - I need to look a bit more into that for corner cases regarding the track.json and others.
  • migration - need to look closer at the script for migration

Comment thread server/dive_server/crud_annotation.py Outdated

# Write the revision to the log
additions = len(insert_operations)
deletions = len(update_operations) - additions
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could this number be negative? update_operations during an overwrite=True would never get a length if the only items were inserted? So the log_entry would have a -additions value? This I think is true during any processing of a imported dataset. I've confirmed this by looking at the revision log after uploading new data.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yep, I messed this up

.pagingParams("revision", defaultLimit=20)
.modelParam("folderId", **DatasetModelParam, level=AccessType.READ)
)
def get_revisions(self, limit: int, offset: int, sort, folder):
Copy link
Copy Markdown
Collaborator

@BryonLewis BryonLewis Dec 2, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just wondering if the default sort order should be descending instead of ascending? In actual usage may want the default to be the date/ltime/description of the last 20 instead of the first 20. This is further influenced by the default behavior of get_revisions which I think defaults to descending. So the endpoint defaults to ascending but the function defaults to ascending.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see that this was added to the GET /dive_dataset which is probably a good idea. I meant for it to be added to the revisions here GET /dive_annoation/revision

Comment thread server/dive_server/views_annotation.py
Copy link
Copy Markdown
Collaborator

@BryonLewis BryonLewis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know how much migrations.py was supposed to be an example vs an actual path to migration. I left a few comments on it.

Comment thread server/scripts/migrations.py
Comment thread server/scripts/migrations.py
@subdavis subdavis force-pushed the mongo/server-side-implementation branch from 337d8c6 to 1578509 Compare December 8, 2021 20:13
@subdavis subdavis requested a review from BryonLewis December 8, 2021 20:15
@subdavis
Copy link
Copy Markdown
Contributor Author

subdavis commented Dec 8, 2021

Responded to comments.

* Poetry fix

* Add check for app root
Copy link
Copy Markdown
Collaborator

@BryonLewis BryonLewis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Additionally you may want to add in the poetry fix for the web server so this can be docker-composed and run for testing.

.pagingParams("revision", defaultLimit=20)
.modelParam("folderId", **DatasetModelParam, level=AccessType.READ)
)
def get_revisions(self, limit: int, offset: int, sort, folder):
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see that this was added to the GET /dive_dataset which is probably a good idea. I meant for it to be added to the revisions here GET /dive_annoation/revision

Comment thread server/dive_server/crud_annotation.py Outdated
* init

* Allowing multicam to write tracks

* inut name change

* lint fixes

* Updates to multicam

* fixing import loading

* removing multicamImageFiles change

* Fixing various issues

* mend

* mend

* switching to every

* Example without fetching metadata (#1088)

Co-authored-by: Brandon Davis <brandon.davis@kitware.com>
@subdavis subdavis force-pushed the mongo/server-side-implementation branch from 1578509 to 477f794 Compare December 19, 2021 20:46
@BryonLewis BryonLewis self-requested a review December 20, 2021 13:22
Copy link
Copy Markdown
Collaborator

@BryonLewis BryonLewis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pulled and checked my comments, did some more basic testing and it looks good.

@subdavis subdavis merged commit d677718 into develop-mongo-tracks Dec 20, 2021
@subdavis subdavis deleted the mongo/server-side-implementation branch December 20, 2021 14:07
subdavis added a commit that referenced this pull request Dec 20, 2021
* Poetry fix (#1087)

* Poetry fix

* Add check for app root

* Desktop/sealion multicam (#1024)

* init

* Allowing multicam to write tracks

* inut name change

* lint fixes

* Updates to multicam

* fixing import loading

* removing multicamImageFiles change

* Fixing various issues

* mend

* mend

* switching to every

* Example without fetching metadata (#1088)

Co-authored-by: Brandon Davis <brandon.davis@kitware.com>

* Server-side implementation

* Include description and timestamp

* Respond to comments

* Switch to checking mongo results

* select sub-element

* Unmangle indices

Co-authored-by: BryonLewis <61746913+BryonLewis@users.noreply.github.com>
subdavis added a commit that referenced this pull request Dec 20, 2021
* Poetry fix (#1087)

* Poetry fix

* Add check for app root

* Desktop/sealion multicam (#1024)

* init

* Allowing multicam to write tracks

* inut name change

* lint fixes

* Updates to multicam

* fixing import loading

* removing multicamImageFiles change

* Fixing various issues

* mend

* mend

* switching to every

* Example without fetching metadata (#1088)

Co-authored-by: Brandon Davis <brandon.davis@kitware.com>

* Server-side implementation

* Include description and timestamp

* Respond to comments

* Switch to checking mongo results

* select sub-element

* Unmangle indices

Co-authored-by: BryonLewis <61746913+BryonLewis@users.noreply.github.com>
subdavis added a commit that referenced this pull request Dec 28, 2021
* Poetry fix (#1087)

* Poetry fix

* Add check for app root

* Desktop/sealion multicam (#1024)

* init

* Allowing multicam to write tracks

* inut name change

* lint fixes

* Updates to multicam

* fixing import loading

* removing multicamImageFiles change

* Fixing various issues

* mend

* mend

* switching to every

* Example without fetching metadata (#1088)

Co-authored-by: Brandon Davis <brandon.davis@kitware.com>

* Server-side implementation

* Include description and timestamp

* Respond to comments

* Switch to checking mongo results

* select sub-element

* Unmangle indices

Co-authored-by: BryonLewis <61746913+BryonLewis@users.noreply.github.com>
subdavis added a commit that referenced this pull request Jan 3, 2022
* Poetry fix (#1087)

* Poetry fix

* Add check for app root

* Desktop/sealion multicam (#1024)

* init

* Allowing multicam to write tracks

* inut name change

* lint fixes

* Updates to multicam

* fixing import loading

* removing multicamImageFiles change

* Fixing various issues

* mend

* mend

* switching to every

* Example without fetching metadata (#1088)

Co-authored-by: Brandon Davis <brandon.davis@kitware.com>

* Server-side implementation

* Include description and timestamp

* Respond to comments

* Switch to checking mongo results

* select sub-element

* Unmangle indices

Co-authored-by: BryonLewis <61746913+BryonLewis@users.noreply.github.com>
subdavis added a commit that referenced this pull request Jan 20, 2022
* Mongo 1: Server side implementation (#1066)

* Poetry fix (#1087)

* Poetry fix

* Add check for app root

* Desktop/sealion multicam (#1024)

* init

* Allowing multicam to write tracks

* inut name change

* lint fixes

* Updates to multicam

* fixing import loading

* removing multicamImageFiles change

* Fixing various issues

* mend

* mend

* switching to every

* Example without fetching metadata (#1088)

Co-authored-by: Brandon Davis <brandon.davis@kitware.com>

* Server-side implementation

* Include description and timestamp

* Respond to comments

* Switch to checking mongo results

* select sub-element

* Unmangle indices

Co-authored-by: BryonLewis <61746913+BryonLewis@users.noreply.github.com>

* Mongo 2: Rollback API (#1067)

* Implement rollback

* Switch to POST /dive_annotation/rollback

* Mongo 3: Utilize new endpoints in celery (#1068)

* Utilize new endpoints in celery

* Respond to comments

* Linting, formatting, and unit tests

* respond to comments

* Import shutil

* Client changes to support revisions (#1070)

* Remove broken summary and report generation (#1071)

* Add simple sharing test and new indices

* Migraction script updates

* Add loading state to clone button

* label fetch

Co-authored-by: BryonLewis <61746913+BryonLewis@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants