
Add conll2003 ner,pos,chunk task. #226

Closed
wants to merge 11 commits into from

Conversation

sbmaruf
Contributor

@sbmaruf sbmaruf commented Jun 12, 2021

Prompt Description:

  1. flat_question_with_label : Regular task. Labels are normalized in the case of POS tagging.
  2. flat_question_with_random_label : Users will not always provide labels strictly from the dataset; they may provide only a subset. So here we provide a subset of the labels. If a gold label in the sample is not in the subset, we replace it with "O". When choosing random tags, we always include the "O" tag for NER, POS and chunk labels (see the sketch after this list).
  3. flat_question_without_label : Regular task. No labels are provided.
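
For illustration, a minimal Python sketch of the subsetting logic in prompt 2, assuming a hypothetical label inventory and function name (the actual prompts implement this inline in Jinja):

import random

# Hypothetical NER label inventory for illustration; the real prompts
# draw the labels from the conll2003 dataset features.
ALL_LABELS = ["O", "B-PER", "I-PER", "B-LOC", "I-LOC", "B-ORG", "I-ORG"]

def subset_labels_and_relabel(gold_tags, k=4):
    # Pick a random subset of labels, always keeping "O",
    # and map any gold tag outside the subset to "O".
    subset = set(random.sample([l for l in ALL_LABELS if l != "O"], k))
    subset.add("O")
    relabeled = [t if t in subset else "O" for t in gold_tags]
    return sorted(subset), relabeled

labels, tags = subset_labels_and_relabel(["B-PER", "O", "B-LOC", "B-ORG"])
print(labels, tags)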

POS label Normalization

Both the NER and chunk tasks contain the "O" tag, but POS does not.

In the case of part-of-speech tags, a few labels read oddly in natural language. For example, see a prompt with all POS labels:

Generate parts of speech from the following sentence. The parts of speech tags are ", '', #, $, (, ), ,, ., :, ``, CC, CD, DT, EX, FW, IN, JJ, JJR, JJS, LS, MD, NN, NNP, NNPS, NNS, NN|SYM, PDT, POS, PRP, PRP$, RB, RBR, RBS, RP, SYM, TO, UH, VB, VBD, VBG, VBN, VBP, VBZ, WDT, WP, WP$, WRB

Here the first 9 labels are normalized to the "O" tag.
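
For reference, a minimal Python sketch of that normalization, assuming a plain set lookup (the template hard-codes the equivalent mapping inline):

# Punctuation-like POS labels taken from the prompt above; the PR maps
# the leading punctuation labels to the "O" tag.
PUNCT_TAGS = {'"', "''", "#", "$", "(", ")", ",", ".", ":", "``"}

def normalize_pos(tag):
    return "O" if tag in PUNCT_TAGS else tag

print([normalize_pos(t) for t in ["NNP", ",", "VBD", "."]])
# -> ['NNP', 'O', 'VBD', 'O']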

Earlier Pull

Earlier, zip was not available in the template environment, so I wrote brute-force code with O(n^2) complexity. Now that zip is available, I rewrote it with simpler notation and a loop (O(n) complexity); a sketch of the zip-based pairing follows. While merging I messed up the previous pull #170, so I closed it and created this new pull.
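
As a sketch of the O(n) pairing, assuming zip is exposed to the Jinja environment as a global (names here are illustrative, not the PR's actual template):

import jinja2

env = jinja2.Environment()
env.globals["zip"] = zip  # assumption: zip made available to templates

template = env.from_string(
    "{% for tok, tag in zip(tokens, tags) %}{{ tok }}/{{ tag }} {% endfor %}"
)
print(template.render(tokens=["EU", "rejects", "German", "call"],
                      tags=["B-ORG", "O", "B-MISC", "O"]))
# -> EU/B-ORG rejects/O German/B-MISC call/O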

@sbmaruf
Copy link
Contributor Author

sbmaruf commented Jun 12, 2021

@srush Can you please take a look here? I closed the previous pull #170.

@VictorSanh VictorSanh self-assigned this Jun 15, 2021
Comment on lines 19 to 20
WDT\",\n44:\"WP\",\n45:\"WP$\",\n46:\"WRB\"\n}) %}\n{% set _task = [\"named\
\ entities\", \"chunk tag\", \"parts of speech\"] | choice %}\n{% set _label_dict\
Member

Can you separate the templates per task ("named entities", "chunk tag", "parts of speech")? It's very hard to read or review right now...

Contributor Author

Ok. Separating.

@VictorSanh
Member

Please separate the templates (don't try to squeeze multiple templates into one).
This will remove the need for complex things like do, which does not compile right now on my side.
[screenshot: template compilation error, 2021-06-15]

@sbmaruf
Contributor Author

sbmaruf commented Jun 15, 2021

@VictorSanh Can you take a look? The do extension is required to update a dictionary without creating a variable; otherwise the test would not pass.
So I have added the do extension and the shuffle filter here.

@srush
Collaborator

srush commented Jun 15, 2021

Hmm, I don't think we should add do or shuffle. Can you explain why we need them?

@sbmaruf
Contributor Author

sbmaruf commented Jun 16, 2021

@srush

Requirement of do:
I can add content to a dictionary in two ways:
Method 1: {% set _dummy=_random_label_dict.update({k:v}) %}
Method 2: {% do _random_label_dict.update({k:v}) %}

If I use Method 1, it causes an error in check_templates because of the unused variable _dummy.

Requirement of shuffle:
Users will not always provide labels in the exact order used by the dataset. To mimic this behavior, I shuffle the labels in the prompt description (see the sketch below).
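
A sketch of how both requirements can be wired up in Jinja2 (jinja2 ships the do extension; the shuffle filter shown here is an assumption about how it might be registered, not the PR's exact code):

import random
import jinja2

def shuffle_filter(seq):
    # Hypothetical shuffle filter: returns a new shuffled list.
    seq = list(seq)
    random.shuffle(seq)
    return seq

env = jinja2.Environment(extensions=["jinja2.ext.do"])
env.filters["shuffle"] = shuffle_filter

# Method 2 from above: do mutates the dict without binding a dummy variable.
t = env.from_string(
    "{% set d = {} %}"
    "{% for k in ['NN', 'VB'] | shuffle %}{% do d.update({k: loop.index}) %}{% endfor %}"
    "{{ d }}"
)
print(t.render())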

@srush
Collaborator

srush commented Jun 16, 2021

Let me be more clear. I get what these do, but I don't understand why they are necessary.

We are aiming to be stateless and deterministic when possible. For "randomness" we added a choice filter. But it is not clear to me why this dataset needs shuffle.

(Happy to talk on slack if that is easier)
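
For reference, a sketch of the kind of choice filter being referred to, assuming it is registered as a simple random pick (the exact promptsource implementation may differ):

import random
import jinja2

env = jinja2.Environment()
env.filters["choice"] = random.choice  # assumption: choice as a filter

t = env.from_string('{% set task = ["named entities", "chunk tag"] | choice %}{{ task }}')
print(t.render())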

@sbmaruf
Contributor Author

sbmaruf commented Jun 16, 2021

Poked you on the bigscience Slack yesterday; here is the message.
I wanted to add shuffle because the conll2003 POS tagging task has more than 40 classes. At inference time a user may want to include the labels in the prompt, but they may list them in a different order, or may not know the original order the labels appeared in during training. If the model is always trained with the labels in the same order, it might memorize that order. That's why I thought of adding the labels to the prompt in a different order for different samples, so the model becomes aware of this.

@srush
Collaborator

srush commented Jun 17, 2021

I get what you are doing now, and it is clever. But I am not going to approve the do extension or the use of random (please use choice instead). I am okay with using shuffle if you can do it without do. These are not meant to be so code-heavy. (I know you got a hard one.)

@sbmaruf
Contributor Author

sbmaruf commented Jun 17, 2021

@srush Actually, I gave this some thought during my free time. I understand your concern about deterministic prompts, and I am open to your suggestion. Do you think adding
(i) "shuffle" (requires the shuffle filter)
(ii) "subset of labels" (requires the do extension)
would substantially improve supervised prompting? If you think so, I will put in more effort to find a workaround; otherwise, I will just remove them. Please let me know.

@srush
Collaborator

srush commented Jun 18, 2021

I would prefer that we just don't include these things. This seems to be the only dataset that uses them.

@sbmaruf
Contributor Author

sbmaruf commented Jun 21, 2021

@srush
I removed those prompts. Please take a look.

Note: Not sure why check_code_quality and show_new_templates failed; locally they passed. Let me know if I need to do anything more.

@VictorSanh VictorSanh closed this Jun 25, 2022