Why Batch inference is not support currently? #2644

Open

LovingThresh opened this issue Feb 24, 2023 · 12 comments
@LovingThresh

AssertionError: Batch inference is not support currently, as the image size might be different in a batch.

It is confusing.

@xiexinch
Collaborator

Hi @LovingThresh,
Since the validation or test set images may have different shapes, supporting batch inference at evaluation might affect the results.

@edsml-hmc122

> Hi @LovingThresh,
> Since the validation or test set images may have different shapes, supporting batch inference at evaluation might affect the results.

See #2965
It would be nice to at least have support for batched inference when all images have the same dimensions.

@louan1998

louan1998 commented Jun 25, 2023

I followed the official documentation to run the "unet_s5-d16_deeplabv3_4xb4-40k_chase-db1-128x128.py" config, and it reported the error "AssertionError: Batch inference is not support currently, as the image size might be different in a batch" after 4000 iterations. I then used the --resume option to continue training from the 4000-iteration checkpoint, and it reported the same error again at 8000 iterations. This dataset is supported by mmseg, and I saw that the images in it all have the same size. Why is that?

@edsml-hmc122

> I followed the official documentation to run the "unet_s5-d16_deeplabv3_4xb4-40k_chase-db1-128x128.py" config, and it reported the error "AssertionError: Batch inference is not support currently, as the image size might be different in a batch" after 4000 iterations. I then used the --resume option to continue training from the 4000-iteration checkpoint, and it reported the same error again at 8000 iterations. This dataset is supported by mmseg, and I saw that the images in it all have the same size. Why is that?

@louan1998 I think right now it doesn't matter what the image sizes actually are. MMSegmentation will reject the inference if batch_size != 1, even if the images are all the same size. It's just not implemented yet, from what I understand. :(
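
(For context, the check behind this error appears to be a plain assertion on the number of images in the predict-time batch rather than an actual shape comparison, which is why it fires even when all images are the same size. The sketch below is a paraphrase of that behaviour, not the exact MMSegmentation source; treat the function name and details as assumptions.)

```python
import torch

def preprocess_predict_batch(inputs: list) -> torch.Tensor:
    """Paraphrased sketch of the old predict-time behaviour (assumed, not the
    exact MMSegmentation code): any batch with more than one image is rejected
    outright, regardless of the actual image shapes."""
    assert len(inputs) == 1, (
        'Batch inference is not support currently, '
        'as the image size might be different in a batch')
    # With batch_size == 1, the single (C, H, W) image just gets a batch dim.
    return torch.stack(inputs, dim=0)
```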

@louan1998

louan1998 commented Jun 27, 2023

@edsml-hmc122 That's right! I noticed that too! If batch_size = 1 in val_dataloader, there is no problem! Anyway, batch_size doesn't play that big role in validation and testing, right?

@chenhuagg

@louan1998 How did you solve it in the end?

xiexinch added a commit that referenced this issue Jul 20, 2023
## Motivation

#3181
#2965
#2644
#1645
#1444
#1370
#125

## Modification

Remove the assertion at data_preprocessor
@louan1998

louan1998 commented Jul 30, 2023

> @louan1998 How did you solve it in the end?

Set the batch_size in the val_dataloader to 1:

```python
val_dataloader = dict(
    batch_size=1,
    num_workers=16,
    persistent_workers=True,
    sampler=dict(type='DefaultSampler', shuffle=False),
    dataset=dict(
        type=dataset_type,
        data_root=data_root,
        data_prefix=dict(
            img_path='images/validation',
            seg_map_path='annotations/validation'),
        pipeline=test_pipeline))
```

@edsml-hmc122

> @edsml-hmc122 That's right! I noticed that too! If batch_size = 1 in val_dataloader, there is no problem! Anyway, batch_size doesn't play that big role in validation and testing, right?

It plays a huge role if you have a lot of validation/testing data. The process is slowed down by a massive amount, and you are under-utilising your GPU. I hope the developers will be able to implement batched inference asap; it's the biggest downside of MMSegmentation, in my opinion, compared to other libraries. Also, MMDetection supports batched inference from what I understand, so maybe that code can be ported.

@xiexinch
Collaborator

Hi @edsml-hmc122,
I've removed this limitation; you might give it a try. If there are any problems, feel free to create an issue and we'll fix it asap.
https://github.com/open-mmlab/mmsegmentation/blob/main/mmseg/models/data_preprocessor.py#L135
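
(For readers following along: with the assertion gone, batching same-sized images at predict time essentially amounts to stacking the tensors. The sketch below only illustrates that idea under the assumption that every image in the batch shares one shape; it is not the code in data_preprocessor.py.)

```python
import torch

def stack_predict_batch(inputs: list) -> torch.Tensor:
    """Illustrative sketch (not the actual MMSegmentation change): stack a
    predict-time batch when every image has the same shape, otherwise fail
    with a clear message instead of a blanket batch_size == 1 assertion."""
    first_shape = inputs[0].shape
    if any(img.shape != first_shape for img in inputs):
        raise ValueError(
            'Images in a predict batch must share one shape; resize or pad '
            'them in the test pipeline first.')
    # (C, H, W) tensors -> (N, C, H, W) batch
    return torch.stack(inputs, dim=0)
```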

@edsml-hmc122

edsml-hmc122 commented Aug 10, 2023

> Hi @edsml-hmc122,
> I've removed this limitation; you might give it a try. If there are any problems, feel free to create an issue and we'll fix it asap.
> https://github.com/open-mmlab/mmsegmentation/blob/main/mmseg/models/data_preprocessor.py#L135

Thank you, I will try to test it when I have time! Also, it might be worth setting this issue to "open" again.

@edsml-hmc122

edsml-hmc122 commented Aug 20, 2023

@xiexinch

> Hi @edsml-hmc122, I've removed this limitation; you might give it a try. If there are any problems, feel free to create an issue and we'll fix it asap. https://github.com/open-mmlab/mmsegmentation/blob/main/mmseg/models/data_preprocessor.py#L135

Hi, I have done some testing.
In general, batched inference seems to be working: more GPU VRAM is used and the logger shows fewer total iterations (since batches are larger).

However, the larger I make the batch size, the lower the inference speed, so it doesn't lead to much acceleration from what I can tell.
The GPU usage is also very strange: it is mostly at 0% and spikes to 100% every 1-2 seconds.
So I think this is a good start, but it doesn't seem to really accelerate the inference process yet.

Tested on an RTX 3080: with a batch size of 20 it used 5637MiB / 10240MiB VRAM and took 1m22s for 1000 samples of size 512x512.
Edit: With a batch size of 10, it took only ~20s.

With a batch size of 1, the time was around 2m30s; I thought the speedup would be larger.

Please re-open this issue if you think it's worth investigating.

@xiexinch
Collaborator

> @xiexinch
>
> Hi, I have done some testing. In general, batched inference seems to be working: more GPU VRAM is used and the logger shows fewer total iterations (since batches are larger).
>
> However, the larger I make the batch size, the lower the inference speed, so it doesn't lead to much acceleration from what I can tell. The GPU usage is also very strange: it is mostly at 0% and spikes to 100% every 1-2 seconds. So I think this is a good start, but it doesn't seem to really accelerate the inference process yet.
>
> Tested on an RTX 3080: with a batch size of 20 it used 5637MiB / 10240MiB VRAM and took 1m22s for 1000 samples of size 512x512. Edit: With a batch size of 10, it took only ~20s.
>
> With a batch size of 1, the time was around 2m30s; I thought the speedup would be larger.
>
> Please re-open this issue if you think it's worth investigating.

Thanks for your feedback! We'll test this case and then find a better solution. Could you provide your config to us if it's available?

xiexinch reopened this Aug 21, 2023
nahidnazifi87 pushed a commit to nahidnazifi87/mmsegmentation_playground that referenced this issue Apr 5, 2024