summary mode implementation #6

Open
seveibar opened this issue Jun 30, 2020 · 6 comments

Comments

@seveibar
Contributor

seveibar commented Jun 30, 2020

This relates to RFC: Supporting Large Datasets. The collaboration server should minimize payloads by returning a "sampleSummary", which contains enough information to display aggregate data about the samples but not enough to view any individual sample. When a collaborative session is started in summarizeSamples mode, instead of returning a udt_json that looks like this:

{
  "interface": { /* ... */ },
  "samples": [ /* ... */ ]
}

It returns a SummarizedUDTObject that looks like the following:

{
  "interface": { /* ... */ },
  "sampleSummary": [
    { "state": "complete", "hasAnnotation": false, "version": 32 },
    // ...
  ]
}
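As a rough sketch of how the server could derive this object from the full one: the field names on the full sample (`annotation`, `state`, `version`) and the default values are assumptions for illustration, not the actual UDT schema.

```typescript
// Hypothetical shape of a full sample; only the fields used here are listed.
type FullSample = {
  imageUrl?: string;
  annotation?: unknown;
  state?: string; // assumed to be tracked per sample
  version?: number; // assumed to be tracked per sample
};

type SampleSummaryEntry = {
  state: string;
  hasAnnotation: boolean;
  version: number;
};

type FullUDT = { interface: object; samples: FullSample[] };
type SummarizedUDT = { interface: object; sampleSummary: SampleSummaryEntry[] };

// Reduce one sample to the fields needed for aggregate display.
function summarizeSample(sample: FullSample): SampleSummaryEntry {
  return {
    state: sample.state ?? "incomplete", // default value is an assumption
    hasAnnotation: sample.annotation !== undefined && sample.annotation !== null,
    version: sample.version ?? 0, // default value is an assumption
  };
}

// Keep the interface as-is and replace samples with their summaries.
function toSummarizedUDT(udt: FullUDT): SummarizedUDT {
  return {
    interface: udt.interface,
    sampleSummary: udt.samples.map(summarizeSample),
  };
}
```

The summary never copies `imageUrl` or the annotation itself, so the payload stays small regardless of how heavy each sample is.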

The summarizeSamples mode should be a column on the session table and a POST body parameter to POST /api/session.
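A minimal sketch of reading that parameter from the request body. The `udt` field name and the default of `false` are assumptions; the returned flag is what would be written to the session table column.

```typescript
// Hypothetical request body for POST /api/session; "udt" is an assumed field name.
type CreateSessionBody = { udt: unknown; summarizeSamples: boolean };

function parseCreateSessionBody(raw: unknown): CreateSessionBody {
  if (typeof raw !== "object" || raw === null) {
    throw new Error("request body must be a JSON object");
  }
  const fields = raw as Record<string, unknown>;
  const flag = fields["summarizeSamples"];
  if (flag !== undefined && typeof flag !== "boolean") {
    throw new Error("summarizeSamples must be a boolean when provided");
  }
  // Default to full samples so existing clients keep their current behavior.
  return { udt: fields["udt"], summarizeSamples: flag === true };
}
```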

Diffs should be run against sampleSummary instead of samples in summarizeSamples mode.
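To illustrate what diffing against sampleSummary could look like, here is a hand-rolled, index-by-index comparison that emits operations loosely modeled on JSON Patch (RFC 6902). It is not the diff library the server uses, and the `/sampleSummary/...` paths are an assumption based on the object shape above.

```typescript
type SummaryEntry = { state: string; hasAnnotation: boolean; version: number };

// Patch operations loosely modeled on JSON Patch (RFC 6902).
type PatchOp =
  | { op: "add"; path: string; value: SummaryEntry }
  | { op: "replace"; path: string; value: SummaryEntry }
  | { op: "remove"; path: string };

function sameEntry(a: SummaryEntry, b: SummaryEntry): boolean {
  return a.state === b.state && a.hasAnnotation === b.hasAnnotation && a.version === b.version;
}

function diffSampleSummary(before: SummaryEntry[], after: SummaryEntry[]): PatchOp[] {
  const ops: PatchOp[] = [];
  const shared = Math.min(before.length, after.length);
  // Entries present on both sides: emit a replace only when something changed.
  for (let i = 0; i < shared; i++) {
    const prev = before[i];
    const next = after[i];
    if (prev !== undefined && next !== undefined && !sameEntry(prev, next)) {
      ops.push({ op: "replace", path: `/sampleSummary/${i}`, value: next });
    }
  }
  // Entries appended at the end ("-" means "end of array" in JSON Patch).
  for (let i = shared; i < after.length; i++) {
    const next = after[i];
    if (next !== undefined) {
      ops.push({ op: "add", path: "/sampleSummary/-", value: next });
    }
  }
  // Entries removed from the end, highest index first so paths stay valid.
  for (let i = before.length - 1; i >= shared; i--) {
    ops.push({ op: "remove", path: `/sampleSummary/${i}` });
  }
  return ops;
}
```

Because each entry is only three small fields, a changed annotation produces a patch of a few dozen bytes instead of re-sending the sample.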

@Ownmarc

Ownmarc commented Jun 30, 2020

Would you want a minimized version of each image in this sample summarization? Could there be multiple levels of "summarization"? What about using some kind of pagination, as some databases do?

@seveibar
Contributor Author

seveibar commented Jun 30, 2020

  • It would give us a cool browser if we supported an imageThumbnailUrl key or something similar in the summarization. I wrote more about this in Image Thumbnails in Sample Summarization #9.
  • I think it would be possible for each UDT instance connected to the collaborative session to have a different view of the dataset. This is essentially pagination, proposed below.

With a "sampleRange" view:

{
  "interface": { /* ... */ },
  "summary": {
    "totalSamples": 0,
    "stateCounts": {
       "complete": 10
    },
    "sampleRange": [0, 50],
    "samples": [
      { "state": "complete", "hasAnnotation": false, "version": 32 },
      // ...
    ]
  }
}

It should be possible to apply whole-json diffs intelligently against this type of object, while maintaining a small payload.
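A sketch of building such a view from the per-sample summary entries. Treating sampleRange as a half-open `[start, end)` interval and clamping it to the dataset size are both assumptions; the thread does not specify either.

```typescript
type SummaryEntry = { state: string; hasAnnotation: boolean; version: number };

type SummaryView = {
  totalSamples: number;
  stateCounts: Record<string, number>;
  sampleRange: [number, number];
  samples: SummaryEntry[];
};

function buildSummaryView(all: SummaryEntry[], start: number, end: number): SummaryView {
  // Aggregate counts cover the whole dataset, not just the requested range.
  const stateCounts: Record<string, number> = {};
  for (const entry of all) {
    stateCounts[entry.state] = (stateCounts[entry.state] ?? 0) + 1;
  }
  // Clamp the requested range so it never points outside the dataset.
  const from = Math.max(0, Math.min(start, all.length));
  const to = Math.max(from, Math.min(end, all.length));
  return {
    totalSamples: all.length,
    stateCounts,
    sampleRange: [from, to],
    samples: all.slice(from, to),
  };
}
```

Each connected client can request its own range, while `totalSamples` and `stateCounts` stay identical across clients.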

If, however, we're looking at paginated views, we might as well just return the full samples:

// BAD, probably will be too big on full image segmentation with image masks
{
  "interface": { /* ... */ },
  "summary": {
    "totalSamples": 0,
    "stateCounts": {
       "complete": 10
    }
  },
  "sampleRange": [0, 50],
  "samples": [
      { /* full sample with imageUrl etc. */ }
  ]
}

The payload size becomes a serious problem with full pixel segmentation: we're seeing collaborative sessions with 200 samples exceed 5 MB and slow everything down.
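One way to make that threshold concrete is a size check on the serialized object. This is only a sketch: the function name is hypothetical, the 5 MB default is taken from the observation above rather than from any measured limit, and the character count of the JSON string is used as an approximation of bytes (exact only for ASCII-only content).

```typescript
// Approximate serialized size; equals the byte count when the JSON is ASCII-only.
function approxJsonSize(value: unknown): number {
  return JSON.stringify(value).length;
}

// Hypothetical helper: decide whether a session payload is large enough
// that summary mode is worth using. The default limit is an assumption.
function shouldUseSummaryMode(udt: unknown, limit: number = 5 * 1024 * 1024): boolean {
  return approxJsonSize(udt) > limit;
}
```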

@seveibar
Contributor Author

One argument to support summarized samples (i.e. summary.sampleRange with summary.samples) is that storing image masks will be expensive, and eventually only 1-5 samples can be in memory at any given time, but we still want to have the nice grid view containing an overview of 500 samples.
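On the client, this could look like a small bounded cache of full samples sitting next to the always-available summaries. The sketch below is a plain least-recently-used cache keyed by sample index; the class name and the choice of LRU eviction are assumptions, not something decided in this thread.

```typescript
// Holds at most `capacity` full samples; the grid view would read from the
// summary entries instead and never touch this cache.
class FullSampleCache<T> {
  private readonly capacity: number;
  private readonly entries: Map<number, T> = new Map();

  constructor(capacity: number) {
    this.capacity = capacity;
  }

  has(index: number): boolean {
    return this.entries.has(index);
  }

  get(index: number): T | undefined {
    const value = this.entries.get(index);
    if (value === undefined) return undefined;
    // Re-insert to mark as most recently used (Map preserves insertion order).
    this.entries.delete(index);
    this.entries.set(index, value);
    return value;
  }

  set(index: number, sample: T): void {
    this.entries.delete(index);
    this.entries.set(index, sample);
    if (this.entries.size > this.capacity) {
      // The first key in iteration order is the least recently used.
      const oldest = this.entries.keys().next().value;
      if (oldest !== undefined) this.entries.delete(oldest);
    }
  }
}
```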

@seveibar seveibar changed the title from "sampleSummary implementation" to "summary mode implementation" Jun 30, 2020
@seveibar
Contributor Author

I would not recommend variable summary levels for the first PR, though I think this will be easy to do after the foundation is in place.

@seveibar
Contributor Author

seveibar commented Jul 15, 2020

We're continuing to take a look at this:

This Summary Object is a good pick for the first version:

{
  "interface": { /* ... */ },
  "summary": {
    "samples": [
      { "state": "complete", "version": 32 },
      // ...
    ]
  }
}
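Since a client may receive either the full object or this first-version Summary Object, a type guard is one way to tell them apart. This is a sketch under the assumption that the presence of a top-level `summary` key is what distinguishes the two shapes; the type names are hypothetical.

```typescript
type FullUDT = { interface: object; samples: object[] };

type FirstVersionSummary = {
  interface: object;
  summary: { samples: { state: string; version: number }[] };
};

// Narrow the union by checking for the top-level "summary" key.
function isSummarized(udt: FullUDT | FirstVersionSummary): udt is FirstVersionSummary {
  return "summary" in udt;
}

// Example of code that works with either shape.
function countSamples(udt: FullUDT | FirstVersionSummary): number {
  return isSummarized(udt) ? udt.summary.samples.length : udt.samples.length;
}
```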

@seveibar
Contributor Author

seveibar commented Jul 15, 2020

[Attached image: "Untitled (1)"]
