[Doc] Add dataflow document #2403

Merged

MeowZheng merged 6 commits into open-mmlab:dev-1.x from xiexinch:dataflow_doc

Dec 29, 2022

Collaborator

xiexinch commented Dec 12, 2022

Motivation

Add dataflow doc

Modification

docs/en/advanced_guides/data_flow.md

xiexinch added Doc 1.x labels

mm-assistant bot assigned MengzhangLI

xiexinch added 2 commits

December 13, 2022 12:31


          draft

f54dcfd


          update loss

dd1a894

xiexinch force-pushed the dataflow_doc branch from 3539902 to dd1a894 Compare

December 13, 2022 04:33

codecov bot commented Dec 13, 2022

Codecov Report

Base: 83.33% // Head: 83.33% // No change to project coverage 👍

Coverage data is based on head (6293e07) compared to base (6cb64e3).
Patch has no changes to coverable lines.

❗ Current head 6293e07 differs from pull request most recent head ec4dd03. Consider uploading reports for the commit ec4dd03 to get more accurate results

Additional details and impacted files

@@           Coverage Diff            @@
##           dev-1.x    #2403   +/-   ##
========================================
  Coverage    83.33%   83.33%           
========================================
  Files          143      143           
  Lines         8127     8127           
  Branches      1211     1211           
========================================
  Hits          6773     6773           
  Misses        1165     1165           
  Partials       189      189

| Flag | Coverage Δ |
|---|---|
| unittests | `83.33% <ø> (ø)` |


MengzhangLI reviewed

docs/en/advanced_guides/data_flow.md Outdated


		### Data Preprocessor to Model

		Though drawn separately in the diagram [above](#overview-of-dataflow), data_preprocessor is a part of the model and thus can be found in [Model tutorial](./models.md) at Seg DataPreprocessor chapter.

Contributor

MengzhangLI Dec 19, 2022

Add the specific HTTP link to models.md.

MengzhangLI reviewed

docs/en/advanced_guides/data_flow.md Outdated


		The same as Data Preprocessor, loss function is also a part of the model, it's a property of [decode head](https://github.com/open-mmlab/mmsegmentation/blob/dev-1.x/mmseg/models/decode_heads/decode_head.py#L142).

		In MMSegmentation, the method [loss_by_feat](https://github.com/open-mmlab/mmsegmentation/blob/dev-1.x/mmseg/models/decode_heads/decode_head.py#L291) of `decode_head` is a unify interface used to compute loss.

Contributor

MengzhangLI Dec 19, 2022

Suggested change

      
            In MMSegmentation, the method [loss_by_feat](https://github.com/open-mmlab/mmsegmentation/blob/dev-1.x/mmseg/models/decode_heads/decode_head.py#L291) of `decode_head` is a unify interface used to compute loss.
          
            In MMSegmentation, the method [loss_by_feat](https://github.com/open-mmlab/mmsegmentation/blob/dev-1.x/mmseg/models/decode_heads/decode_head.py#L291) of `decode_head` is a unified interface used to compute loss.

MeowZheng added this to the 1.0.0rc3 milestone

MeowZheng reviewed

docs/en/advanced_guides/data_flow.md Outdated

Comment on lines 7 to 11

+              As illustrated in the [Runner document of MMEngine](https://mmengine.readthedocs.io/en/latest/tutorials/runner.html), the following diagram shows the basic dataflow.
+              ![Basic dataflow](https://user-images.githubusercontent.com/112053249/199228350-5f80699e-7fd2-4b4c-ac32-0b16b1922c2e.png)
+              The dashed border, gray filled shapes represent different data formats, while solid boxes represent modules/methods. Due to the great flexibility and extensibility of MMEngine, you can always inherit some key base classes and override their methods, so the above diagram doesn’t always hold. It only holds when you are not customizing your own `Runner` or `TrainLoop`, and you are not overriding `train_step`, `val_step` or `test_step` method in your custom model.

Collaborator

MeowZheng Dec 27, 2022

This should emphasize that the Runner controls the dataflow, and add examples of train_cfg, test_cfg and val_cfg in mmseg. It might also link to the runner design documentation: https://github.com/open-mmlab/mmengine/blob/main/docs/en/design/runner.md

It is also necessary to explain how train_step, val_step and test_step work in each training and testing iteration, and to add the links https://github.com/open-mmlab/mmsegmentation/blob/dev-1.x/docs/en/advanced_guides/models.md#train_step, https://github.com/open-mmlab/mmsegmentation/blob/dev-1.x/docs/en/advanced_guides/models.md#val_step and https://github.com/open-mmlab/mmsegmentation/blob/dev-1.x/docs/en/advanced_guides/models.md#test_step
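
To make the request above concrete, here is a minimal config sketch of the three loop settings the Runner reads. `IterBasedTrainLoop`, `ValLoop` and `TestLoop` are MMEngine loop types; the `max_iters` and `val_interval` values are illustrative assumptions, not values taken from this PR.

```python
# Loop settings that tell the Runner how to drive training, validation and testing.
# The numeric values below are placeholders chosen for illustration.
train_cfg = dict(type='IterBasedTrainLoop', max_iters=20000, val_interval=2000)
val_cfg = dict(type='ValLoop')
test_cfg = dict(type='TestLoop')
```

With settings like these, the training loop would call `train_step` once per iteration and hand over to the validation loop every `val_interval` iterations.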

docs/en/advanced_guides/data_flow.md Outdated


		The dashed border, gray filled shapes represent different data formats, while solid boxes represent modules/methods. Due to the great flexibility and extensibility of MMEngine, you can always inherit some key base classes and override their methods, so the above diagram doesn’t always hold. It only holds when you are not customizing your own `Runner` or `TrainLoop`, and you are not overriding `train_step`, `val_step` or `test_step` method in your custom model.

		## Format convention

Collaborator

MeowZheng Dec 27, 2022

Suggested change

      
            ## Format convention
          
            ## Dataflow convention in MMSegmentation

docs/en/advanced_guides/data_flow.md Outdated


		The return value is the same as `PackSegInputs` except the `inputs` would be transferred to GPU and some additional metainfo like `pad_shape` and `padding_size` would be added to the `data_samples`.

		### Model to Evaluator

Collaborator

MeowZheng Dec 27, 2022

Suggested change

      
            ### Model to Evaluator
          
            ### Model output
          
            #### To Evaluator
          
            #### Optim Wrapper
          
            ####

docs/en/advanced_guides/data_flow.md Outdated


		The return value is the same as `PackSegInputs` except the `inputs` would be transferred to GPU and some additional metainfo like `pad_shape` and `padding_size` would be added to the `data_samples`.

		### Model to Evaluator

Collaborator

MeowZheng Dec 27, 2022

We can add some explanation of the Runner's control flow during training and testing, for example `train_step`: model.forward -> optim wrapper, and `test_step`: model.forward -> evaluator, and attach the link https://github.com/open-mmlab/mmsegmentation/blob/dev-1.x/docs/en/advanced_guides/models.md

Also highlight the inputs and outputs of the model and the other modules.

xiexinch added 3 commits

December 28, 2022 15:39


          update

910b9f5


          add runner

bebead0


          add steps

6293e07

MeowZheng reviewed

docs/en/advanced_guides/data_flow.md Outdated


		MMSegmentation defines the default data format at [PackSegInputs](https://github.com/open-mmlab/mmsegmentation/blob/dev-1.x/mmseg/datasets/transforms/formatting.py#L12), it's the last component of `train_pipeline` and `test_pipeline`. Please refer to [data transform documentation](https://mmsegmentation.readthedocs.io/en/dev-1.x/advanced_guides/transforms.html) for more information about data transform `pipeline`.

		Without any modifications, the return value of PackSegInputs is usually a `dict` and has only two keys, `inputs` and `data_samples`. The following pseudo-code shows the data types of the data loader output, `inputs` is the list of input tensors to the model and `data_samples` contains a list of input images' meta information and corresponding ground truth.

Collaborator

MeowZheng Dec 29, 2022

Suggested change

      
            Without any modifications, the return value of PackSegInputs is usually a `dict` and has only two keys, `inputs` and `data_samples`. The following pseudo-code shows the data types of the data loader output, `inputs` is the list of input tensors to the model and `data_samples` contains a list of input images' meta information and corresponding ground truth.
          
            Without any modifications, the return value of `PackSegInputs` is usually a `dict` with only two keys, `inputs` and `data_samples`. The following pseudo-code shows the data types of the data loader output in mmseg: the data loader fetches a batch of data samples from the dataset and packs them into a dictionary of lists. `inputs` is the list of input tensors to the model, and `data_samples` is a list containing the meta information and corresponding ground truth of the input images.
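
The dictionary-of-lists layout described above can be sketched in plain Python. This is only an illustration under stated assumptions: `ToySegDataSample` and `toy_collate` are hypothetical stand-ins (not MMSegmentation or MMEngine APIs), and nested lists take the place of image tensors.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class ToySegDataSample:
    """Hypothetical stand-in for SegDataSample: meta info plus ground truth."""
    metainfo: Dict[str, Any] = field(default_factory=dict)
    gt_sem_seg: List[List[int]] = field(default_factory=list)


def toy_collate(samples: List[Dict[str, Any]]) -> Dict[str, List[Any]]:
    """Pack per-sample dicts into one dict of lists, keeping samples unstacked."""
    return {
        'inputs': [s['inputs'] for s in samples],
        'data_samples': [s['data_samples'] for s in samples],
    }


# Two fake samples; nested lists stand in for image tensors of different sizes.
sample_a = {
    'inputs': [[0, 1], [2, 3]],
    'data_samples': ToySegDataSample({'img_shape': (2, 2)}, [[0, 1], [1, 0]]),
}
sample_b = {
    'inputs': [[4, 5, 6]],
    'data_samples': ToySegDataSample({'img_shape': (1, 3)}, [[1, 1, 0]]),
}

batch = toy_collate([sample_a, sample_b])
print(sorted(batch.keys()))  # ['data_samples', 'inputs']
print(len(batch['inputs']))  # 2
```

The images stay in a list rather than one stacked tensor because they may still differ in size at this stage.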

docs/en/advanced_guides/data_flow.md Outdated


		### Data Preprocessor to Model

		Though drawn separately in the diagram [above](#overview-of-dataflow), data_preprocessor is a part of the model and thus can be found in [Model tutorial](https://mmsegmentation.readthedocs.io/en/dev-1.x/advanced_guides/models.html) at Seg DataPreprocessor chapter.

Collaborator

MeowZheng Dec 29, 2022

Suggested change

      
            Though drawn separately in the diagram [above](#overview-of-dataflow), data_preprocessor is a part of the model and thus can be found in [Model tutorial](https://mmsegmentation.readthedocs.io/en/dev-1.x/advanced_guides/models.html) at Seg DataPreprocessor chapter.
          
            Though drawn separately in the diagram [above](#overview-of-dataflow), data_preprocessor is a part of the model and thus can be found in [Model tutorial](https://mmsegmentation.readthedocs.io/en/dev-1.x/advanced_guides/models.html) at data preprocessor chapter.

docs/en/advanced_guides/data_flow.md Outdated


		Though drawn separately in the diagram [above](#overview-of-dataflow), data_preprocessor is a part of the model and thus can be found in [Model tutorial](https://mmsegmentation.readthedocs.io/en/dev-1.x/advanced_guides/models.html) at Seg DataPreprocessor chapter.

		The return value of Data Preprocessor is a dict, contains `inputs` and `data_samples`, `inputs` would be transferred to GPU and some additional metainfo like `pad_shape` and `padding_size` would be added to the `data_samples`. When transfer to the network, the dict would be unpacked to two values. The following pseudo-codes show the return value of data preprocessor and the input values of model.

Collaborator

MeowZheng Dec 29, 2022

Suggested change

      
            The return value of Data Preprocessor is a dict, contains `inputs` and `data_samples`, `inputs` would be transferred to GPU and some additional metainfo like `pad_shape` and `padding_size` would be added to the `data_samples`. When transfer to the network, the dict would be unpacked to two values. The following pseudo-codes show the return value of data preprocessor and the input values of model.
          
            The return value of the data preprocessor is a dictionary containing `inputs` and `data_samples`. `inputs` is the batched images, a 4D tensor, and some additional meta information used in data preprocessing is added to `data_samples`. When passed to the network, the dictionary is unpacked into two values. The following pseudo-code shows the return value of the data preprocessor and the input values of the model.
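
As a rough sketch of the step described above, the following toy code pads images to a common size, stacks them into one batch, records the padded size as meta information, and unpacks the resulting dictionary into the model's two arguments. `toy_preprocess` and `toy_model` are hypothetical names, the images are single-channel nested lists (so the batch is N x H x W rather than a 4D tensor), and no device transfer is modelled.

```python
from typing import Any, Dict, List, Tuple

Image = List[List[float]]  # one single-channel H x W image as nested lists


def toy_preprocess(data: Dict[str, List[Any]], pad_val: float = 0.0) -> Dict[str, Any]:
    """Pad every image to the largest H and W in the batch, then batch them."""
    images: List[Image] = data['inputs']
    max_h = max(len(img) for img in images)
    max_w = max(len(img[0]) for img in images)
    batched = []
    for img, sample in zip(images, data['data_samples']):
        # Pad each row on the right, then append full padding rows at the bottom.
        padded = [row + [pad_val] * (max_w - len(row)) for row in img]
        padded += [[pad_val] * max_w for _ in range(max_h - len(img))]
        batched.append(padded)
        # Record the padded size as extra meta info, mirroring the `pad_shape` idea.
        sample['pad_shape'] = (max_h, max_w)
    return {'inputs': batched, 'data_samples': data['data_samples']}


def toy_model(inputs: List[Image], data_samples: List[Dict[str, Any]]) -> Tuple[int, int, int]:
    """Stand-in for the model's forward: returns the batch shape (N, H, W)."""
    return len(inputs), len(inputs[0]), len(inputs[0][0])


data = {
    'inputs': [[[1.0, 2.0], [3.0, 4.0]], [[5.0, 6.0, 7.0]]],
    'data_samples': [{'img_shape': (2, 2)}, {'img_shape': (1, 3)}],
}
out = toy_preprocess(data)
# The dictionary is unpacked into the two keyword arguments of the model.
shape = toy_model(**out)
print(shape)  # (2, 2, 3)
```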

docs/en/advanced_guides/data_flow.md Outdated

Comment on lines 48 to 53

+              ```python
+              dict(
+                  inputs=torch.Tensor,
+                  data_samples=Optional[List[SegDataSample], None]
+              )
+              ```

Collaborator

MeowZheng Dec 29, 2022

Suggested change

      
```python
dict(
    inputs=torch.Tensor,
    data_samples=Optional[List[SegDataSample], None]
)
```

```python
dict(
    inputs=torch.Tensor,
    data_samples=List[SegDataSample]
)
```

I think this tutorial is about dataflow under Runner control, where the ground truth is always present.

docs/en/advanced_guides/data_flow.md Outdated

+              test_cfg = dict(type='TestLoop')
+              ```
+              In the above diagram, the red line indicates the [train_step](https://github.com/open-mmlab/mmsegmentation/blob/dev-1.x/docs/en/advanced_guides/models.md#train_step). At each training iteration, dataloader loads images from storage and transfer to data preprocessor, data preprocessor would put images to GPU device and stack data to batch, then model accept the batch data as inputs, finally the outputs of the model would be sent to optimizer. The blue line indicates [val_step](https://github.com/open-mmlab/mmsegmentation/blob/dev-1.x/docs/en/advanced_guides/models.md#val_step) and [test_step](https://github.com/open-mmlab/mmsegmentation/blob/dev-1.x/docs/en/advanced_guides/models.md#test_step). The dataflow of these two process is similar to the `train_step` except the outputs of model, since model parameters are freezed when doing evaluation, the model output would be transferred to [Evaluator](https://github.com/open-mmlab/mmsegmentation/blob/dev-1.x/docs/en/advanced_guides/evaluation.md#ioumetric) to compute metrics.

Collaborator

MeowZheng Dec 29, 2022

Suggested change

      
            In the above diagram, the red line indicates the [train_step](https://github.com/open-mmlab/mmsegmentation/blob/dev-1.x/docs/en/advanced_guides/models.md#train_step). At each training iteration, dataloader loads images from storage and transfer to data preprocessor, data preprocessor would put images to GPU device and stack data to batch, then model accept the batch data as inputs, finally the outputs of the model would be sent to optimizer. The blue line indicates [val_step](https://github.com/open-mmlab/mmsegmentation/blob/dev-1.x/docs/en/advanced_guides/models.md#val_step) and [test_step](https://github.com/open-mmlab/mmsegmentation/blob/dev-1.x/docs/en/advanced_guides/models.md#test_step). The dataflow of these two process is similar to the `train_step` except the outputs of model, since model parameters are freezed when doing evaluation, the model output would be transferred to [Evaluator](https://github.com/open-mmlab/mmsegmentation/blob/dev-1.x/docs/en/advanced_guides/evaluation.md#ioumetric) to compute metrics.
          
            In the above diagram, the red line indicates the [train_step](https://github.com/open-mmlab/mmsegmentation/blob/dev-1.x/docs/en/advanced_guides/models.md#train_step). At each training iteration, the dataloader loads images from storage and passes them to the data preprocessor, which moves the images to the target device and stacks them into a batch. The model then accepts the batched data as inputs, and its outputs are sent to the optimizer. The blue line indicates [val_step](https://github.com/open-mmlab/mmsegmentation/blob/dev-1.x/docs/en/advanced_guides/models.md#val_step) and [test_step](https://github.com/open-mmlab/mmsegmentation/blob/dev-1.x/docs/en/advanced_guides/models.md#test_step). The dataflow of these two processes is similar to that of `train_step` except for the model outputs: since model parameters are frozen during evaluation, the model output is passed to the [Evaluator](https://github.com/open-mmlab/mmsegmentation/blob/dev-1.x/docs/en/advanced_guides/evaluation.md#ioumetric) to compute metrics.
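
The two paths described above can be sketched with a toy model. Everything here is a hypothetical stand-in written for illustration (`ToyModel`, `ToyOptimWrapper`, a scalar weight, a mean-absolute-error loss); it mirrors only the order of the steps, not the actual MMEngine implementation.

```python
from typing import Any, Dict, List


class ToyOptimWrapper:
    """Hypothetical stand-in for an optimizer wrapper; it only records losses."""

    def __init__(self) -> None:
        self.seen_losses: List[float] = []

    def update_params(self, loss: float) -> None:
        self.seen_losses.append(loss)


class ToyModel:
    """Toy model whose prediction is the input scaled by a single weight."""

    def __init__(self, weight: float = 2.0) -> None:
        self.weight = weight

    def data_preprocessor(self, data: Dict[str, List[float]]) -> Dict[str, List[float]]:
        # A real preprocessor would move data to a device and batch it.
        return data

    def forward(self, inputs: List[float], targets: List[float], mode: str) -> Any:
        preds = [self.weight * x for x in inputs]
        if mode == 'loss':
            # Mean absolute error as a placeholder loss.
            return sum(abs(p - t) for p, t in zip(preds, targets)) / len(preds)
        if mode == 'predict':
            return preds
        raise ValueError(f'unsupported mode: {mode}')

    def train_step(self, data: Dict[str, List[float]], optim_wrapper: ToyOptimWrapper) -> float:
        data = self.data_preprocessor(data)
        loss = self.forward(**data, mode='loss')
        optim_wrapper.update_params(loss)  # training output goes to the optimizer
        return loss

    def val_step(self, data: Dict[str, List[float]]) -> List[float]:
        data = self.data_preprocessor(data)
        return self.forward(**data, mode='predict')  # predictions go to the evaluator


batch = {'inputs': [1.0, 2.0], 'targets': [2.0, 5.0]}
model, optim = ToyModel(), ToyOptimWrapper()
print(model.train_step(batch, optim))  # 0.5
print(model.val_step(batch))           # [2.0, 4.0]
```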

docs/en/advanced_guides/data_flow.md Outdated


		### Model output

		#### To Evaluator

Collaborator

MeowZheng Dec 29, 2022

Suggested change

      
            #### To Evaluator
          
            As the [model tutorial](https://github.com/open-mmlab/mmsegmentation/blob/dev-1.x/docs/en/advanced_guides/models.md#forward) mentions, `forward` supports three modes, each with its own kind of output. `train_step` and `test_step` (or `val_step`) correspond to the `'loss'` and `'predict'` modes respectively.

docs/en/advanced_guides/data_flow.md Outdated


		#### To Evaluator

		At the evaluation procedure, the inference results would be transferred to `Evaluator`. You might read the [evaluation document](https://mmsegmentation.readthedocs.io/en/dev-1.x/advanced_guides/evaluation.html) for more information about `Evaluator`.

Collaborator

MeowZheng Dec 29, 2022

Suggested change

      
            At the evaluation procedure, the inference results would be transferred to `Evaluator`. You might read the [evaluation document](https://mmsegmentation.readthedocs.io/en/dev-1.x/advanced_guides/evaluation.html) for more information about `Evaluator`.
          
            In `test_step` or `val_step`, the inference results are passed to the `Evaluator`. See the [evaluation document](https://mmsegmentation.readthedocs.io/en/dev-1.x/advanced_guides/evaluation.html) for more information about `Evaluator`.
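
To illustrate what an evaluator does with the inference results, here is a toy sketch that accumulates per-class intersection and union over batches and then reports a mean IoU. `ToyIoUEvaluator` is a hypothetical class written for this example; it is not MMSegmentation's `IoUMetric`, and predictions and labels are assumed to be flattened lists of class indices.

```python
from typing import Dict, List, Sequence


class ToyIoUEvaluator:
    """Hypothetical evaluator: accumulates per-class intersection and union."""

    def __init__(self, num_classes: int) -> None:
        self.num_classes = num_classes
        self.intersect = [0] * num_classes
        self.union = [0] * num_classes

    def process(self, preds: Sequence[int], gts: Sequence[int]) -> None:
        """Consume the flattened predictions and labels of one batch."""
        for c in range(self.num_classes):
            for p, g in zip(preds, gts):
                self.intersect[c] += int(p == c and g == c)
                self.union[c] += int(p == c or g == c)

    def evaluate(self) -> Dict[str, float]:
        """Compute mean IoU over classes that appeared at least once."""
        ious: List[float] = [
            i / u for i, u in zip(self.intersect, self.union) if u > 0
        ]
        return {'mIoU': sum(ious) / len(ious) if ious else 0.0}


evaluator = ToyIoUEvaluator(num_classes=2)
evaluator.process(preds=[0, 1, 1], gts=[0, 1, 0])
print(evaluator.evaluate())  # {'mIoU': 0.5}
```

Accumulating counts in `process` and dividing only in `evaluate` keeps the metric correct when batches have different sizes.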

docs/en/advanced_guides/data_flow.md Outdated


		![SegDataSample](../../../resources/SegDataSample.png)

		#### Optim Wrapper

Collaborator

MeowZheng Dec 29, 2022

Suggested change

#### Optim Wrapper


          update

ec4dd03

MeowZheng approved these changes

MeowZheng merged commit 5723d16 into open-mmlab:dev-1.x

MeowZheng pushed a commit to MeowZheng/mmsegmentation that referenced this pull request


          [Doc] Add dataflow document (open-mmlab#2403)

7e148ea

* draft

* update loss

* update

* add runner

* add steps

* update

MeowZheng pushed 2 more commits to MeowZheng/mmsegmentation that referenced this pull request, each with the same message and commit list as above

55af6b8

1ee27cc

MeowZheng pushed 3 commits that referenced this pull request, each titled "[Doc] Add dataflow document (#2403)" with the same commit list as above

fce7e6a

ba66a07

ad99ad1

aravind-h-v pushed a commit to aravind-h-v/mmsegmentation that referenced this pull request


          Support convert LoRA safetensors into diffusers format (open-mmlab#2403)

63805f8

* add lora convertor

* Update convert_lora_safetensor_to_diffusers.py

* Update README.md

* Update convert_lora_safetensor_to_diffusers.py

nahidnazifi87 pushed a commit to nahidnazifi87/mmsegmentation_playground that referenced this pull request


          [Doc] Add dataflow document (open-mmlab#2403)

4fc4097

* draft

* update loss

* update

* add runner

* add steps

* update
