Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Rec Seq2seq components and configure file, change all config files due to interface change #253

Merged
merged 7 commits into from
May 5, 2023

Conversation

zhtmike
Copy link
Collaborator

@zhtmike zhtmike commented May 4, 2023

Thank you for your contribution to the MindOCR repo.
Before submitting this PR, please make sure:

Motivation

(Write your motivation for proposed changes here.)

Test Plan

(How should this PR be tested? Do you require special setup to run the test or repro the fixed bug?)

Related Issues and PRs

(Is this PR part of a group of changes? Link the other relevant PRs and Issues here. Use https://help.github.com/en/articles/closing-issues-using-keywords for help on GitHub syntax)

@zhtmike zhtmike changed the title Add Rec Seq2seq components ane configure file, change all config files due to interface change Add Rec Seq2seq components and configure file, change all config files due to interface change May 4, 2023
@@ -185,7 +186,7 @@ predict:
- ToCHWImage:
# the order of the dataloader list, matching the network input and the input labels for the loss function, and optional data for debug/visaulize
output_columns: [ 'img_path', 'image', 'raw_img_shape' ]
num_columns_to_net: 1 # num inputs for network forward func
columns_indices_for_net: [0] # input indices for network forward func in output_columns
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

img_path 作为网络输入看起来有些问题

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

经确认pred pipeline目前是不会用到num_columns_to_net的,所以这个参数配置已删除。

- ToCHWImage:
# the order of the dataloader list, matching the network input and the input labels for the loss function, and optional data for debug/visaulize
output_columns: ["image", "text_seq"] #, 'length'] #'img_path']
columns_indices_for_net: [0, 1] # input indices for network forward func in output_columns
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这里不用指定column_indices_for_label?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

column_indices_for_label如果不提供会 implicitly 用indices = [1]做为label。现已添加。

@@ -280,3 +377,13 @@ def __call__(self, data):
assert trans.num_classes==96
assert out['length'] ==len(text), 'Not equal: {}, {}'.format(out['length'], text) # use_space_char=False, but the dict contains space, % is also in dict
print(seq)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这些中间的print是否必要? 如仅调试用,应删去。

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

已删去。

@@ -33,13 +33,16 @@ def __init__(self, config: dict):

self.model_name = f'{backbone_name}_{neck_name}_{head_name}'

def construct(self, x):
def construct(self, x, y=None):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

需要补充说明入参,x和y。并更新mindocr/models/README.md相应的接口描述。

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

已补充docstring和更新接口描述。

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder how we will deal with the problem if the backbone or neck will require an extra input.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Current we don't support this feature.

README.md Outdated
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

建议优化命名:
columns_indices_for_net -> net_input_column_index
columns_indices_for_label -> label_column_index
columns_indices_for_meta_data -> meta_data_column_index
另外,meta_data_column_index 应避免手动指定,output_columns中除去net_input_column_index 和 label_column_index部分剩下都是meta data.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

已更新命名。meta_data_column_index已按这逻辑更新。

@SamitHuang SamitHuang requested a review from HaoyangLee May 4, 2023 04:51
mindocr/losses/builder.py Outdated Show resolved Hide resolved
if self.label_indices is not None:
gt = [data[x] for x in self.label_indices]
else:
gt = [data[1]]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
gt = [data[1]]
start_idx = 0 if self.input_indices is None else len(self.input_indices)
gt = [data[start_idx:]]

I suggest using the rest of the axes not used in self.input_indices as labels when self.label_indices is None.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The rest of the axes will be marked as meta_data.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can it be gt = [data[start_idx:end_idx]] where end_idx is calculated based on self.meta_data_indices the same way as start_idx does?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We use indicies here explicitly instead of range. Since the column property can be overlapped, for example, label_1 can be used as network input, and label, as well as marked as meta data in the same time.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can assume that the indices are continuous for the default case, so we can avoid putting default values into config files.

num_columns_to_net: 1 # num inputs for network forward func
num_columns_of_labels: 2 # num labels
net_input_column_index: [0] # input indices for network forward func in output_columns
label_column_index: [1, 2] # input indices marked as label
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like our config files are rapidly increasing, it's easy to make mistakes when config files are long. These fields have default values and can be safely disregarded from the config. Please see my suggestion in mindocr/utils/callbacks.py.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, they have default value. net_input_column_index is default to be [0], label_column_index is default to [1]. We can create a PR later to clean the all unecessary settings.

tools/train.py Show resolved Hide resolved
@@ -33,13 +33,16 @@ def __init__(self, config: dict):

self.model_name = f'{backbone_name}_{neck_name}_{head_name}'

def construct(self, x):
def construct(self, x, y=None):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder how we will deal with the problem if the backbone or neck will require an extra input.

Copy link
Collaborator

@SamitHuang SamitHuang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good job! Some things to be supplemented next

  1. readme file for crnn_seq2seq.
  2. avoid setting meta_data_indices as input for simplicity, this can be inferred by set(all_indices) - set(input_indices) - set(label_indices).

@zhtmike zhtmike merged commit 7607dda into mindspore-lab:main May 5, 2023
@zhtmike zhtmike deleted the seq2seq_merge branch May 8, 2023 02:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants