Update conll05.py #8495
The label file in the conll05 dataset is in the wrong order.
Thank you for this PR! It seems to fix a bug: the format of the label file differs from that of the vocabulary file. Would you mind opening an issue describing this bug and marking this PR as fixing it by writing "Fix #xxx" in the PR description? (#xxx is the ID of the issue you just created.) Thanks!
python/paddle/v2/dataset/conll05.py
Outdated
        tag_dict.add(line[2:])
    elif line.startswith("I-"):
        tag_dict.add(line[2:])
    else:
I am not a Python expert, but it seems that we can remove L54-L55?
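To illustrate the suggestion, here is a minimal sketch, assuming the loop body only collects tag names and that the flagged lines are a trailing `else: continue`. The function name `collect_tags` and its `lines` parameter are hypothetical and not taken from conll05.py.

```python
def collect_tags(lines):
    """Collect tag names from label lines such as "B-A0" or "I-A0".

    Hypothetical helper for illustration; not the code in conll05.py.
    """
    tag_dict = set()
    for line in lines:
        line = line.strip()
        # str.startswith accepts a tuple of prefixes, so the two
        # branches that did the same thing can be merged into one.
        if line.startswith(("B-", "I-")):
            tag_dict.add(line[2:])
        # No trailing `else: continue` is needed: with nothing after
        # the if-statement, the loop moves to the next line anyway.
    return tag_dict
```

Under these assumptions, dropping the `else` branch does not change which tags are collected, since lines without a `B-` or `I-` prefix are simply skipped.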
Fix #8079 and remove the redundant code
I reported this problem earlier in issue #8079, and it is now referenced in the description. In the new update, the two redundant lines of code have been deleted.
LGTM
The label file in the conll05 dataset is in the wrong order. This pull request puts the label dict in the right order. Fix #8079.