[Doc] Add dataflow document #2403

Merged
merged 6 commits into from
Dec 29, 2022

Conversation

xiexinch
Collaborator

Motivation

Add dataflow doc

Modification

  • docs/en/advanced_guides/data_flow.md

@xiexinch xiexinch added Doc 1.x Related issue of 1.x version labels Dec 12, 2022
@codecov

codecov bot commented Dec 13, 2022

Codecov Report

Base: 83.33% // Head: 83.33% // No change to project coverage 👍

Coverage data is based on head (6293e07) compared to base (6cb64e3).
Patch has no changes to coverable lines.

❗ Current head 6293e07 differs from pull request most recent head ec4dd03. Consider uploading reports for the commit ec4dd03 to get more accurate results

Additional details and impacted files
@@           Coverage Diff            @@
##           dev-1.x    #2403   +/-   ##
========================================
  Coverage    83.33%   83.33%           
========================================
  Files          143      143           
  Lines         8127     8127           
  Branches      1211     1211           
========================================
  Hits          6773     6773           
  Misses        1165     1165           
  Partials       189      189           
Flag Coverage Δ
unittests 83.33% <ø> (ø)



### Data Preprocessor to Model

Though drawn separately in the diagram [above](#overview-of-dataflow), data_preprocessor is a part of the model and thus can be found in [Model tutorial](./models.md) at Seg DataPreprocessor chapter.
Contributor

Add the specific HTTP link to models.md.


Like the data preprocessor, the loss function is also part of the model: it is a property of the [decode head](https://github.com/open-mmlab/mmsegmentation/blob/dev-1.x/mmseg/models/decode_heads/decode_head.py#L142).

In MMSegmentation, the method [loss_by_feat](https://github.com/open-mmlab/mmsegmentation/blob/dev-1.x/mmseg/models/decode_heads/decode_head.py#L291) of `decode_head` is a unify interface used to compute loss.
Contributor

Suggested change
In MMSegmentation, the method [loss_by_feat](https://github.com/open-mmlab/mmsegmentation/blob/dev-1.x/mmseg/models/decode_heads/decode_head.py#L291) of `decode_head` is a unify interface used to compute loss.
In MMSegmentation, the method [loss_by_feat](https://github.com/open-mmlab/mmsegmentation/blob/dev-1.x/mmseg/models/decode_heads/decode_head.py#L291) of `decode_head` is a unified interface used to compute the loss.

@MeowZheng MeowZheng added this to the 1.0.0rc3 milestone Dec 22, 2022
Comment on lines 7 to 11
As illustrated in the [Runner document of MMEngine](https://mmengine.readthedocs.io/en/latest/tutorials/runner.html), the following diagram shows the basic dataflow.

![Basic dataflow](https://user-images.githubusercontent.com/112053249/199228350-5f80699e-7fd2-4b4c-ac32-0b16b1922c2e.png)

The dashed border, gray filled shapes represent different data formats, while solid boxes represent modules/methods. Due to the great flexibility and extensibility of MMEngine, you can always inherit some key base classes and override their methods, so the above diagram doesn’t always hold. It only holds when you are not customizing your own `Runner` or `TrainLoop`, and you are not overriding `train_step`, `val_step` or `test_step` method in your custom model.
Collaborator

It should emphasize that the Runner controls the dataflow, and add examples of train_cfg, test_cfg and val_cfg in mmseg. It might also link to the runner design documentation: https://github.com/open-mmlab/mmengine/blob/main/docs/en/design/runner.md

It is also necessary to explain what train_step, val_step and test_step do in each training and testing iteration, and to add the links https://github.com/open-mmlab/mmsegmentation/blob/dev-1.x/docs/en/advanced_guides/models.md#train_step, https://github.com/open-mmlab/mmsegmentation/blob/dev-1.x/docs/en/advanced_guides/models.md#val_step and https://github.com/open-mmlab/mmsegmentation/blob/dev-1.x/docs/en/advanced_guides/models.md#test_step
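
As a concrete illustration of the loop settings this comment asks for, a minimal sketch follows. `test_cfg = dict(type='TestLoop')` is quoted elsewhere in this thread; the `IterBasedTrainLoop` and `ValLoop` type names and the iteration counts are assumptions modelled on MMEngine-style configs, not values taken from this PR.

```python
# Loop settings written as plain config dicts, one per loop.
# Type names other than 'TestLoop' and all numeric values are assumptions.
train_cfg = dict(type='IterBasedTrainLoop', max_iters=20000, val_interval=2000)
val_cfg = dict(type='ValLoop')
test_cfg = dict(type='TestLoop')

for name, cfg in [('train', train_cfg), ('val', val_cfg), ('test', test_cfg)]:
    print(f"{name}: {cfg['type']}")
```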


The dashed border, gray filled shapes represent different data formats, while solid boxes represent modules/methods. Due to the great flexibility and extensibility of MMEngine, you can always inherit some key base classes and override their methods, so the above diagram doesn’t always hold. It only holds when you are not customizing your own `Runner` or `TrainLoop`, and you are not overriding `train_step`, `val_step` or `test_step` method in your custom model.

## Format convention
Collaborator

Suggested change
## Format convention
## Dataflow convention in MMSegmentation


The return value is the same as `PackSegInputs` except the `inputs` would be transferred to GPU and some additional metainfo like `pad_shape` and `padding_size` would be added to the `data_samples`.

### Model to Evaluator
Collaborator

Suggested change
### Model to Evaluator
### Model output
#### To Evaluator
#### Optim Wrapper
####


The return value is the same as `PackSegInputs` except the `inputs` would be transferred to GPU and some additional metainfo like `pad_shape` and `padding_size` would be added to the `data_samples`.

### Model to Evaluator
Collaborator

We can add some explanation of the Runner's control flow during training and testing, e.g. calling train_step: model.forward -> optim wrapper, and calling test_step: model.forward -> evaluator, and attach the link https://github.com/open-mmlab/mmsegmentation/blob/dev-1.x/docs/en/advanced_guides/models.md

Also highlight the inputs and outputs of the model and the other modules.


MMSegmentation defines the default data format in [PackSegInputs](https://github.com/open-mmlab/mmsegmentation/blob/dev-1.x/mmseg/datasets/transforms/formatting.py#L12), the last component of `train_pipeline` and `test_pipeline`. Please refer to the [data transform documentation](https://mmsegmentation.readthedocs.io/en/dev-1.x/advanced_guides/transforms.html) for more information about the data transform `pipeline`.

Without any modifications, the return value of PackSegInputs is usually a `dict` and has only two keys, `inputs` and `data_samples`. The following pseudo-code shows the data types of the data loader output, `inputs` is the list of input tensors to the model and `data_samples` contains a list of input images' meta information and corresponding ground truth.
Collaborator

Suggested change
Without any modifications, the return value of PackSegInputs is usually a `dict` and has only two keys, `inputs` and `data_samples`. The following pseudo-code shows the data types of the data loader output, `inputs` is the list of input tensors to the model and `data_samples` contains a list of input images' meta information and corresponding ground truth.
Without any modifications, the return value of `PackSegInputs` is usually a `dict` with only two keys, `inputs` and `data_samples`. The following pseudo-code shows the data types of the data loader output in mmseg: a batch of data samples fetched from the dataset, which the data loader packs into a dictionary of lists. `inputs` is the list of input tensors to the model, and `data_samples` contains a list of the input images' meta information and corresponding ground truth.
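
To make the "dictionary of lists" wording concrete, here is a minimal standard-library sketch. `FakeSegDataSample` and `collate` are hypothetical stand-ins invented for illustration (nested lists replace tensors); they are not the mmseg or MMEngine implementations.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class FakeSegDataSample:
    """Hypothetical stand-in for SegDataSample: meta info plus ground truth."""
    metainfo: Dict[str, Any] = field(default_factory=dict)
    gt_sem_seg: List[List[int]] = field(default_factory=list)


def collate(samples: List[Dict[str, Any]]) -> Dict[str, list]:
    """Pack per-sample dicts into one dict of lists, one entry per sample."""
    return {
        'inputs': [s['inputs'] for s in samples],
        'data_samples': [s['data_samples'] for s in samples],
    }


# Two packed samples; nested lists stand in for image tensors.
samples = [
    {'inputs': [[0, 1], [2, 3]],
     'data_samples': FakeSegDataSample({'img_path': 'a.png'}, [[0, 1], [1, 0]])},
    {'inputs': [[4, 5], [6, 7]],
     'data_samples': FakeSegDataSample({'img_path': 'b.png'}, [[1, 1], [0, 0]])},
]
batch = collate(samples)
print(sorted(batch.keys()))
print(len(batch['inputs']), len(batch['data_samples']))
```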


### Data Preprocessor to Model

Though drawn separately in the diagram [above](#overview-of-dataflow), data_preprocessor is a part of the model and thus can be found in [Model tutorial](https://mmsegmentation.readthedocs.io/en/dev-1.x/advanced_guides/models.html) at Seg DataPreprocessor chapter.
Collaborator

Suggested change
Though drawn separately in the diagram [above](#overview-of-dataflow), data_preprocessor is a part of the model and thus can be found in [Model tutorial](https://mmsegmentation.readthedocs.io/en/dev-1.x/advanced_guides/models.html) at Seg DataPreprocessor chapter.
Though drawn separately in the diagram [above](#overview-of-dataflow), data_preprocessor is a part of the model and is therefore described in the data preprocessor chapter of the [Model tutorial](https://mmsegmentation.readthedocs.io/en/dev-1.x/advanced_guides/models.html).


Though drawn separately in the diagram [above](#overview-of-dataflow), data_preprocessor is a part of the model and thus can be found in [Model tutorial](https://mmsegmentation.readthedocs.io/en/dev-1.x/advanced_guides/models.html) at Seg DataPreprocessor chapter.

The return value of Data Preprocessor is a dict, contains `inputs` and `data_samples`, `inputs` would be transferred to GPU and some additional metainfo like `pad_shape` and `padding_size` would be added to the `data_samples`. When transfer to the network, the dict would be unpacked to two values. The following pseudo-codes show the return value of data preprocessor and the input values of model.
Collaborator

Suggested change
The return value of Data Preprocessor is a dict, contains `inputs` and `data_samples`, `inputs` would be transferred to GPU and some additional metainfo like `pad_shape` and `padding_size` would be added to the `data_samples`. When transfer to the network, the dict would be unpacked to two values. The following pseudo-codes show the return value of data preprocessor and the input values of model.
The return value of the data preprocessor is a dictionary containing `inputs` and `data_samples`. `inputs` is the batched images, a 4D tensor, and some additional meta info used in data preprocessing is added to `data_samples`. When passed to the network, the dictionary is unpacked into two values. The following pseudo-code shows the return value of the data preprocessor and the input values of the model.
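
The batching and unpacking described here can be sketched with the standard library alone. `toy_preprocess` and `toy_model` are hypothetical names, nested lists stand in for tensors, plain dicts stand in for `SegDataSample`, and `pad_shape` is used only because it is the meta key named in this thread; none of this is the actual mmseg code.

```python
from typing import Any, Dict, List

Image = List[List[int]]  # 2D nested list standing in for an image tensor


def toy_preprocess(data: Dict[str, list], pad_val: int = 0) -> Dict[str, Any]:
    """Pad every image to the largest height/width, then batch them."""
    imgs: List[Image] = data['inputs']
    max_h = max(len(img) for img in imgs)
    max_w = max(len(img[0]) for img in imgs)
    batched = []
    for img, sample in zip(imgs, data['data_samples']):
        padded = [row + [pad_val] * (max_w - len(row)) for row in img]
        padded += [[pad_val] * max_w for _ in range(max_h - len(img))]
        batched.append(padded)
        sample['pad_shape'] = (max_h, max_w)  # meta info added while padding
    return {'inputs': batched, 'data_samples': data['data_samples']}


def toy_model(inputs, data_samples):
    """Receives the two unpacked values rather than the dict itself."""
    return len(inputs), [s['pad_shape'] for s in data_samples]


data = {
    'inputs': [[[1, 2, 3]], [[4], [5]]],  # image shapes 1x3 and 2x1
    'data_samples': [{}, {}],             # plain dicts stand in for SegDataSample
}
out = toy_preprocess(data)
print(toy_model(**out))  # the dict is unpacked into `inputs` and `data_samples`
```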

Comment on lines 48 to 53
```python
dict(
inputs=torch.Tensor,
data_samples=Optional[List[SegDataSample], None]
)
```
Collaborator

Suggested change
```python
dict(
inputs=torch.Tensor,
data_samples=Optional[List[SegDataSample], None]
)
```
```python
dict(
    inputs=torch.Tensor,
    data_samples=List[SegDataSample]
)
```

I think this tutorial is about data flow under runner control, which always contains the ground truth.

```python
test_cfg = dict(type='TestLoop')
```

In the above diagram, the red line indicates the [train_step](https://github.com/open-mmlab/mmsegmentation/blob/dev-1.x/docs/en/advanced_guides/models.md#train_step). At each training iteration, dataloader loads images from storage and transfer to data preprocessor, data preprocessor would put images to GPU device and stack data to batch, then model accept the batch data as inputs, finally the outputs of the model would be sent to optimizer. The blue line indicates [val_step](https://github.com/open-mmlab/mmsegmentation/blob/dev-1.x/docs/en/advanced_guides/models.md#val_step) and [test_step](https://github.com/open-mmlab/mmsegmentation/blob/dev-1.x/docs/en/advanced_guides/models.md#test_step). The dataflow of these two process is similar to the `train_step` except the outputs of model, since model parameters are freezed when doing evaluation, the model output would be transferred to [Evaluator](https://github.com/open-mmlab/mmsegmentation/blob/dev-1.x/docs/en/advanced_guides/evaluation.md#ioumetric) to compute metrics.
Collaborator

Suggested change
In the above diagram, the red line indicates the [train_step](https://github.com/open-mmlab/mmsegmentation/blob/dev-1.x/docs/en/advanced_guides/models.md#train_step). At each training iteration, dataloader loads images from storage and transfer to data preprocessor, data preprocessor would put images to GPU device and stack data to batch, then model accept the batch data as inputs, finally the outputs of the model would be sent to optimizer. The blue line indicates [val_step](https://github.com/open-mmlab/mmsegmentation/blob/dev-1.x/docs/en/advanced_guides/models.md#val_step) and [test_step](https://github.com/open-mmlab/mmsegmentation/blob/dev-1.x/docs/en/advanced_guides/models.md#test_step). The dataflow of these two process is similar to the `train_step` except the outputs of model, since model parameters are freezed when doing evaluation, the model output would be transferred to [Evaluator](https://github.com/open-mmlab/mmsegmentation/blob/dev-1.x/docs/en/advanced_guides/evaluation.md#ioumetric) to compute metrics.
In the above diagram, the red line indicates the [train_step](https://github.com/open-mmlab/mmsegmentation/blob/dev-1.x/docs/en/advanced_guides/models.md#train_step). At each training iteration, the dataloader loads images from storage and passes them to the data preprocessor, which moves the images to the target device and stacks them into a batch. The model then accepts the batch as input, and its outputs are sent to the optimizer. The blue line indicates [val_step](https://github.com/open-mmlab/mmsegmentation/blob/dev-1.x/docs/en/advanced_guides/models.md#val_step) and [test_step](https://github.com/open-mmlab/mmsegmentation/blob/dev-1.x/docs/en/advanced_guides/models.md#test_step). The dataflow of these two processes is similar to that of `train_step` except for the model outputs: since model parameters are frozen during evaluation, the model output is passed to the [Evaluator](https://github.com/open-mmlab/mmsegmentation/blob/dev-1.x/docs/en/advanced_guides/evaluation.md#ioumetric) to compute metrics.


### Model output

#### To Evaluator
Collaborator

Suggested change
#### To Evaluator
As the [model tutorial](https://github.com/open-mmlab/mmsegmentation/blob/dev-1.x/docs/en/advanced_guides/models.md#forward) mentions, `forward` has 3 modes with 3 kinds of output. `train_step` and `test_step` (or `val_step`) correspond to `'loss'` and `'predict'` respectively.
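
The correspondence between steps and forward modes can be sketched as follows. `ToyModel` is a hypothetical class written for illustration; the `'tensor'` name for the third mode follows the MMEngine convention and is an assumption here, since this thread names only `'loss'` and `'predict'`.

```python
from typing import Dict, List


class ToyModel:
    """Hypothetical model whose forward output depends on `mode`."""

    def forward(self, inputs: List[float], data_samples: List[float], mode: str):
        if mode == 'loss':
            # Training: return a dict of named loss values.
            err = [(x - y) ** 2 for x, y in zip(inputs, data_samples)]
            return {'loss_toy': sum(err) / len(err)}
        if mode == 'predict':
            # Validation/testing: return one prediction per sample.
            return [round(x) for x in inputs]
        if mode == 'tensor':
            # Raw outputs without any post-processing.
            return list(inputs)
        raise ValueError(f'unknown mode: {mode}')

    def train_step(self, data: Dict[str, list]) -> Dict[str, float]:
        # In the real flow the losses would be handed to the optim wrapper.
        return self.forward(**data, mode='loss')

    def val_step(self, data: Dict[str, list]) -> List[int]:
        # In the real flow the predictions would be handed to the Evaluator.
        return self.forward(**data, mode='predict')

    test_step = val_step  # same path as validation in this sketch


model = ToyModel()
data = {'inputs': [0.9, 0.2], 'data_samples': [1.0, 0.0]}
print(model.train_step(data))
print(model.val_step(data))
```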


#### To Evaluator

At the evaluation procedure, the inference results would be transferred to `Evaluator`. You might read the [evaluation document](https://mmsegmentation.readthedocs.io/en/dev-1.x/advanced_guides/evaluation.html) for more information about `Evaluator`.
Collaborator

Suggested change
At the evaluation procedure, the inference results would be transferred to `Evaluator`. You might read the [evaluation document](https://mmsegmentation.readthedocs.io/en/dev-1.x/advanced_guides/evaluation.html) for more information about `Evaluator`.
In `test_step` or `val_step`, the inference results are passed to the `Evaluator`. See the [evaluation document](https://mmsegmentation.readthedocs.io/en/dev-1.x/advanced_guides/evaluation.html) for more information about `Evaluator`.
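
What an evaluator does with those inference results can be sketched as an accumulate-then-compute metric. `ToyIoUMetric` and its method names are hypothetical and use flattened label lists in place of tensors; the sketch illustrates the idea of an IoU metric and is not the mmseg `IoUMetric` implementation.

```python
from typing import Dict, List


class ToyIoUMetric:
    """Hypothetical metric: accumulates per-class intersection and union."""

    def __init__(self, num_classes: int) -> None:
        self.num_classes = num_classes
        self.intersect = [0] * num_classes
        self.union = [0] * num_classes

    def process(self, preds: List[List[int]], gts: List[List[int]]) -> None:
        """Update the running counts with one batch of flattened label maps."""
        for pred, gt in zip(preds, gts):
            for c in range(self.num_classes):
                p = [x == c for x in pred]
                g = [x == c for x in gt]
                self.intersect[c] += sum(a and b for a, b in zip(p, g))
                self.union[c] += sum(a or b for a, b in zip(p, g))

    def compute(self) -> Dict[str, float]:
        """Return the mean IoU over the classes that occur at least once."""
        ious = [i / u for i, u in zip(self.intersect, self.union) if u > 0]
        return {'mIoU': sum(ious) / len(ious)}


metric = ToyIoUMetric(num_classes=2)
metric.process(preds=[[0, 0, 1, 1]], gts=[[0, 1, 1, 1]])
print(metric.compute())
```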


![SegDataSample](../../../resources/SegDataSample.png)

#### Optim Wrapper
Collaborator

Suggested change
#### Optim Wrapper

@MeowZheng MeowZheng merged commit 5723d16 into open-mmlab:dev-1.x Dec 29, 2022
MeowZheng pushed a commit to MeowZheng/mmsegmentation that referenced this pull request Dec 30, 2022
* draft

* update loss

* update

* add runner

* add steps

* update
MeowZheng pushed a commit that referenced this pull request Dec 30, 2022
* draft

* update loss

* update

* add runner

* add steps

* update
aravind-h-v pushed a commit to aravind-h-v/mmsegmentation that referenced this pull request Mar 27, 2023
* add lora convertor

* Update convert_lora_safetensor_to_diffusers.py

* Update README.md

* Update convert_lora_safetensor_to_diffusers.py
nahidnazifi87 pushed a commit to nahidnazifi87/mmsegmentation_playground that referenced this pull request Apr 5, 2024
* draft

* update loss

* update

* add runner

* add steps

* update