Workaround for WebGL crash for datasets with many segmentation layers #6995

Merged
merged 6 commits into from
Apr 27, 2023

@philippotto philippotto commented Apr 20, 2023

For a dataset with the following layers

  • 3 * segmentation uint64
  • 2 * segmentation uint32
  • 1 * color uint8

... webKnossos crashed for me with "Medium" hardware utilization. With "Very Low" it did not crash. Through iterative shader editing, I found out that the complexity of the shader code was effectively causing the crash. My assumption is that a lower hardware utilization leaves more VRAM available for the shader code, which would explain why it doesn't crash in that case.

For that particular dataset constellation, compiling the shader code for all segmentation layers is not necessary, as we currently can only render one segmentation at a time, anyway. Therefore, I adapted the code to take this into account.

The workaround doesn't change the behavior for color-only datasets, but a dataset with six uint8 color layers doesn't crash for me in the first place.

The downside of this approach is that the shader has to be compiled more often when changing layer visibilities. However, the shader compilation is cached (so, toggling between two layers should always be fast after the first toggle). In the end, it's better than crashing, I'd argue.
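
To illustrate the idea, here is a minimal sketch, not the actual webKnossos code. All names (`LayerInfo`, `layersForShader`, `ShaderCache`) are hypothetical, and it assumes that every color layer stays in the shader while only the first visible segmentation layer is included. The cache stores whatever the `build` callback returns, which stands in for a compiled shader program.

```typescript
// Hypothetical types and names; an illustration of the approach only.
type LayerCategory = "color" | "segmentation";

interface LayerInfo {
  name: string;
  category: LayerCategory;
  visible: boolean;
}

// Select the layers whose code is emitted into the shader:
// all color layers plus at most one (the first visible) segmentation layer.
function layersForShader(layers: LayerInfo[]): LayerInfo[] {
  const colorLayers = layers.filter((l) => l.category === "color");
  const visibleSegmentation = layers.find(
    (l) => l.category === "segmentation" && l.visible,
  );
  return visibleSegmentation ? [...colorLayers, visibleSegmentation] : colorLayers;
}

// Caches one build result per distinct set of shader layers, so that
// toggling back to an already seen layer does not trigger a rebuild.
class ShaderCache<T> {
  private readonly cache = new Map<string, T>();
  private readonly build: (layerNames: string[]) => T;
  buildCount = 0;

  constructor(build: (layerNames: string[]) => T) {
    this.build = build;
  }

  get(layers: LayerInfo[]): T {
    const names = layersForShader(layers).map((l) => l.name);
    const key = JSON.stringify(names);
    let entry = this.cache.get(key);
    if (entry === undefined) {
      entry = this.build(names);
      this.buildCount++;
      this.cache.set(key, entry);
    }
    return entry;
  }
}
```

With this shape, switching the visible segmentation layer changes the cache key and causes one additional build, while switching back reuses the earlier entry.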

URL of deployed dev instance (used for testing):

  • https://___.webknossos.xyz

Steps to test:

  • Set up a dataset with the following datasource-properties.json (the layer folders don't need to exist):
{
  "id": {
    "name": "many_layers",
    "team": "sample_organization"
  },
  "dataLayers": [
    {
      "name": "c2_segmentation",
      "boundingBox": {
        "topLeft": [
          552960,
          325632,
          0
        ],
        "width": 147456,
        "height": 102400,
        "depth": 5312
      },
      "wkwResolutions": [
        {
          "resolution": [
            2,
            2,
            1
          ],
          "cubeLength": 1024
        }
      ],
      "elementClass": "uint64",
      "largestSegmentId": 281474976710656,
      "category": "segmentation",
      "dataFormat": "wkw"
    },
    {
      "name": "c3_types",
      "boundingBox": {
        "topLeft": [
          552960,
          325632,
          0
        ],
        "width": 147456,
        "height": 102400,
        "depth": 5312
      },
      "wkwResolutions": [
        {
          "resolution": [
            16,
            16,
            2
          ],
          "cubeLength": 1024
        }
      ],
      "elementClass": "uint32",
      "largestSegmentId": 2000,
      "category": "segmentation",
      "dataFormat": "wkw"
    },
    {
      "name": "proofread_104",
      "boundingBox": {
        "topLeft": [
          552960,
          325632,
          0
        ],
        "width": 147456,
        "height": 102400,
        "depth": 5312
      },
      "wkwResolutions": [
        {
          "resolution": [
            2,
            2,
            1
          ],
          "cubeLength": 1024
        }
      ],
      "elementClass": "uint64",
      "largestSegmentId": 281474976710656,
      "category": "segmentation",
      "dataFormat": "wkw"
    },
    {
      "name": "segmentation",
      "boundingBox": {
        "topLeft": [
          552960,
          325632,
          0
        ],
        "width": 147456,
        "height": 102400,
        "depth": 5312
      },
      "wkwResolutions": [
        {
          "resolution": [
            2,
            2,
            1
          ],
          "cubeLength": 1024
        }
      ],
      "elementClass": "uint64",
      "largestSegmentId": 281474976710656,
      "category": "segmentation",
      "dataFormat": "wkw"
    },
    {
      "name": "c2_types",
      "boundingBox": {
        "topLeft": [
          552960,
          325632,
          0
        ],
        "width": 147456,
        "height": 102400,
        "depth": 5312
      },
      "wkwResolutions": [
        {
          "resolution": [
            16,
            16,
            2
          ],
          "cubeLength": 1024
        }
      ],
      "elementClass": "uint32",
      "largestSegmentId": 2000,
      "category": "segmentation",
      "dataFormat": "wkw"
    },
    {
      "name": "color",
      "category": "color",
      "boundingBox": {
        "topLeft": [
          0,
          0,
          0
        ],
        "width": 1031461,
        "height": 712623,
        "depth": 5312
      },
      "wkwResolutions": [
        {
          "resolution": [
            1,
            1,
            1
          ],
          "cubeLength": 1024
        },
        {
          "resolution": [
            2,
            2,
            1
          ],
          "cubeLength": 1024
        },
        {
          "resolution": [
            4,
            4,
            1
          ],
          "cubeLength": 1024
        },
        {
          "resolution": [
            8,
            8,
            1
          ],
          "cubeLength": 1024
        },
        {
          "resolution": [
            16,
            16,
            2
          ],
          "cubeLength": 1024
        },
        {
          "resolution": [
            32,
            32,
            4
          ],
          "cubeLength": 1024
        },
        {
          "resolution": [
            64,
            64,
            8
          ],
          "cubeLength": 1024
        },
        {
          "resolution": [
            128,
            128,
            16
          ],
          "cubeLength": 1024
        },
        {
          "resolution": [
            256,
            256,
            32
          ],
          "cubeLength": 1024
        },
        {
          "resolution": [
            512,
            512,
            64
          ],
          "cubeLength": 1024
        },
        {
          "resolution": [
            1024,
            1024,
            128
          ],
          "cubeLength": 1024
        },
        {
          "resolution": [
            2048,
            2048,
            256
          ],
          "cubeLength": 1024
        }
      ],
      "elementClass": "uint8",
      "dataFormat": "wkw"
    }
  ],
  "scale": [
    4,
    4,
    33
  ]
}


…ader needs to be compiled more often then, though)
@philippotto philippotto self-assigned this Apr 21, 2023
@philippotto philippotto marked this pull request as ready for review April 21, 2023 08:12

philippotto commented Apr 21, 2023

@daniel-wer Another idea for improving crash behavior would be to automatically change the hardware utilization after a crash. It could work something like this:

  • webgl context loss happens
  • --> store this info in localStorage
  • (if the context is restored, delete the info from localStorage again)
  • on page load, check whether there was a recent crash. if yes, change the hardware utilization (e.g., decrement by one step or just put it to minimum)
  • communicate this to the user
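
The steps above could be sketched roughly as follows. This is only an illustration under several assumptions: the storage key, the ten-minute definition of "recent", the list of utilization steps (only "Very Low", "Medium" and "High" are mentioned in this thread) and all function names are made up. The storage is passed in as a small interface so that the logic stays independent of the browser.

```typescript
// All names and constants are hypothetical; this is not webKnossos code.
const CRASH_KEY = "webglCrashTimestamp";
const RECENT_CRASH_WINDOW_MS = 10 * 60 * 1000; // assumed meaning of "recent"

// Assumed utilization steps, ordered from lowest to highest.
const UTILIZATION_STEPS = ["Very Low", "Low", "Medium", "High"] as const;
type Utilization = (typeof UTILIZATION_STEPS)[number];

// Subset of the Web Storage API that the sketch relies on.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
  removeItem(key: string): void;
}

// Called when the WebGL context is lost.
function recordContextLoss(store: KeyValueStore, now: number): void {
  store.setItem(CRASH_KEY, String(now));
}

// Called when the context is restored, so the crash is not counted.
function recordContextRestored(store: KeyValueStore): void {
  store.removeItem(CRASH_KEY);
}

// Called on page load. Lowers the utilization by one step if a recent crash
// was recorded and clears the marker so that it is only applied once.
function utilizationAfterPageLoad(
  store: KeyValueStore,
  current: Utilization,
  now: number,
): { utilization: Utilization; lowered: boolean } {
  const raw = store.getItem(CRASH_KEY);
  if (raw === null) {
    return { utilization: current, lowered: false };
  }
  store.removeItem(CRASH_KEY);
  const crashedAt = Number(raw);
  const isRecent =
    Number.isFinite(crashedAt) && now - crashedAt <= RECENT_CRASH_WINDOW_MS;
  if (!isRecent) {
    return { utilization: current, lowered: false };
  }
  const index = UTILIZATION_STEPS.indexOf(current);
  const next = UTILIZATION_STEPS[Math.max(0, index - 1)] ?? current;
  return { utilization: next, lowered: next !== current };
}
```

In a browser, `window.localStorage` satisfies `KeyValueStore`, and the two record functions would be called from listeners for the canvas events `webglcontextlost` and `webglcontextrestored`. The `lowered` flag tells the caller whether there is something to communicate to the user.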

However, I'm not fully convinced. Sometimes a webgl context loss just happens and a simple page refresh is enough to fix it. On the other hand, once webgl is acting up, it can be quite brittle (e.g., chrome completely refused to instantiate webgl after repeated crashes even though I opened a normal dataset).

As an alternative to my above suggestion, one could also show a blocking prompt before setting things up à la "The last time you used webKnossos, webgl crashed, do you want to change the Hardware Utilization from High to Very Low to avoid repeated crashes? You can change this back in the settings in the left sidebar."
Then, it would be opt-in, but the blocking prompt could still be intrusive.

What are your thoughts on this?

@daniel-wer daniel-wer left a comment

Code LGTM 👍

Regarding changing the hardware utilization following a WebGL crash, as discussed, this would be hard to get right and there is no immediate need to implement it. I'd postpone working on that.

Also, regarding my observation that, in an uncached state, it takes ~10x as long to activate a uint64 segmentation layer as a uint32 segmentation layer: The shaders are essentially equal, only the layer name is different. This suggests that it might actually not be the compilation but the linking of the shader that is taking so long 🤔
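
One way to check this hypothesis would be to time both stages separately. The following is a rough sketch under assumptions, not code from this PR: `GlLike` is a hand-written subset of the WebGL context API so that the function can be type-checked outside a browser, and `timeCompileAndLink` is a hypothetical helper. Reading `COMPILE_STATUS` and `LINK_STATUS` is used here to force each stage to finish before the clock is read. Note that some implementations may defer part of the compilation work to link time, so the split is only indicative.

```typescript
// Hand-written subset of the WebGL context API that this sketch relies on.
interface GlLike {
  readonly VERTEX_SHADER: number;
  readonly FRAGMENT_SHADER: number;
  readonly COMPILE_STATUS: number;
  readonly LINK_STATUS: number;
  createShader(type: number): unknown;
  shaderSource(shader: unknown, source: string): void;
  compileShader(shader: unknown): void;
  getShaderParameter(shader: unknown, pname: number): unknown;
  createProgram(): unknown;
  attachShader(program: unknown, shader: unknown): void;
  linkProgram(program: unknown): void;
  getProgramParameter(program: unknown, pname: number): unknown;
}

interface StageTimings {
  compileMs: number;
  linkMs: number;
}

// Hypothetical helper: measures compile time (both shaders) and link time.
function timeCompileAndLink(
  gl: GlLike,
  vertexSource: string,
  fragmentSource: string,
  now: () => number = () => Date.now(),
): StageTimings {
  const compile = (type: number, source: string): unknown => {
    const shader = gl.createShader(type);
    gl.shaderSource(shader, source);
    gl.compileShader(shader);
    // Querying the status blocks until compilation has finished.
    if (!gl.getShaderParameter(shader, gl.COMPILE_STATUS)) {
      throw new Error("Shader compilation failed");
    }
    return shader;
  };

  const compileStart = now();
  const vertexShader = compile(gl.VERTEX_SHADER, vertexSource);
  const fragmentShader = compile(gl.FRAGMENT_SHADER, fragmentSource);
  const compileMs = now() - compileStart;

  const linkStart = now();
  const program = gl.createProgram();
  gl.attachShader(program, vertexShader);
  gl.attachShader(program, fragmentShader);
  gl.linkProgram(program);
  // Querying the status blocks until linking has finished.
  if (!gl.getProgramParameter(program, gl.LINK_STATUS)) {
    throw new Error("Program linking failed");
  }
  const linkMs = now() - linkStart;

  return { compileMs, linkMs };
}
```

In a browser, a real WebGL context can be passed as `gl`, and `() => performance.now()` can be passed as `now` for a finer clock.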

@philippotto philippotto merged commit d647b97 into master Apr 27, 2023
2 checks passed
@philippotto philippotto deleted the gpu-mem-improvements branch April 27, 2023 10:25
hotzenklotz added a commit that referenced this pull request Apr 28, 2023
…ove_wkconnect

* 'master' of github.com:scalableminds/webknossos:
  Update docker compose commands + dev install readme (#7002)
  Add segment groups (#6966)
  Add screenshot nightly test for wkorg (#7030)
  Workaround for WebGL crash for datasets with many segmentation layers (#6995)
  Fix download of public annotation, include access ctx in user cache key (#7025)
  Fix that changing a segment color could lead to a crash (#7000)
  Add more error chaining to annotation download (#7023)
  Guard against NaNs in shader (#7018)
  Store editable mappings in multiple fossildb columns+keys (#6903)
  Context action to move tree to group (#7005)
  Release 23.05.0 (#7014)
  Remove vault cache when reloading dataset (#7007)
  Fix viewing of public datasets (#7010)
  Update screenshots scalebar positioning (#7003)
  Update team members (#6999)
philippotto added a commit that referenced this pull request May 2, 2023
…#6995)

* reduce shader complexity when having multiple segmentation layers (shader needs to be compiled more often then, though)

* fix tests

* clean up

* update changelog
philippotto added a commit that referenced this pull request May 3, 2023
* Release 23.05.0

* Guard against NaNs in shader (#7018)

* guard against nan when deciding whether to use performance optimization in shader

* update changelog

* Workaround for WebGL crash for datasets with many segmentation layers (#6995)

* reduce shader complexity when having multiple segmentation layers (shader needs to be compiled more often then, though)

* fix tests

* clean up

* update changelog

* Release 23.05.1

* fix incorrect merge
philippotto added a commit that referenced this pull request May 8, 2023
* Release 23.05.0

* Guard against NaNs in shader (#7018)

* guard against nan when deciding whether to use performance optimization in shader

* update changelog

* Workaround for WebGL crash for datasets with many segmentation layers (#6995)

* reduce shader complexity when having multiple segmentation layers (shader needs to be compiled more often then, though)

* fix tests

* clean up

* update changelog

* Release 23.05.1

* Fix access check in time tracking controller (#7055)

* Release 23.05.2

---------

Co-authored-by: Florian M <fm3@users.noreply.github.com>
hotzenklotz added a commit that referenced this pull request May 17, 2023
…ty-list-drawings

* 'master' of github.com:scalableminds/webknossos: (25 commits)
  Fix issues with styling in dark mode on login page (#7052)
  Fix nightly by setting missing token (#7048)
  Release 23.05.1 (#7042)
  DRY types in update_actions.ts (#7036)
  Remove some spammy logging from backend (#7039)
  Use zarr string fill values (#7017)
  Fix voxel offset for Neuroglancer Precomputed datasets (#7019)
  Log when user is activated (#7027)
  Fix exception in applying UpdateTreeGroupVisibility skeleton action (#7037)
  Fix organization storage layouting (#7034)
  Update docker compose commands + dev install readme (#7002)
  Add segment groups (#6966)
  Add screenshot nightly test for wkorg (#7030)
  Workaround for WebGL crash for datasets with many segmentation layers (#6995)
  Fix download of public annotation, include access ctx in user cache key (#7025)
  Fix that changing a segment color could lead to a crash (#7000)
  Add more error chaining to annotation download (#7023)
  Guard against NaNs in shader (#7018)
  Store editable mappings in multiple fossildb columns+keys (#6903)
  Context action to move tree to group (#7005)
  ...
Development

Successfully merging this pull request may close these issues.

Crash due to many segmentation layers
2 participants