Hashtagify transformation #246
tasks = [
    TaskType.TEXT_CLASSIFICATION,
    TaskType.TEXT_TO_TEXT_GENERATION,
    TaskType.TEXT_TAGGING,
TEXT_TAGGING is not applicable here because the transformed sentence has a different number of tokens than the input sentence (New Delhi --> #NewDelhi).
I agree. It can be removed.
Should I remove it and commit again?
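The token-count mismatch raised in this thread can be illustrated with a small, dependency-free sketch. The helper `hashtagify_phrase` and the tag labels are hypothetical and are not part of the PR; whitespace splitting stands in for whatever tokenizer a tagging task would actually use.

```python
def hashtagify_phrase(sentence: str, phrase: str) -> str:
    """Hypothetical helper: collapse a multi-word phrase into one hashtag token."""
    hashtag = "#" + "".join(phrase.split())
    return sentence.replace(phrase, hashtag)


original = "New Delhi is among the many famous places in India."
transformed = hashtagify_phrase(original, "New Delhi")

# Whitespace tokenization is an assumption made only for this illustration.
original_tokens = original.split()
transformed_tokens = transformed.split()

# One tag per input token; the label names are placeholders.
original_tags = ["B-LOC", "I-LOC"] + ["O"] * (len(original_tokens) - 2)

print(transformed)
print(len(original_tokens), len(transformed_tokens))
# The per-token tags no longer line up with the transformed tokens.
print(len(original_tags) == len(transformed_tokens))
```

Because two input tokens merge into one, any per-token label sequence for the input cannot be reused for the output without an extra alignment step, which is why dropping TEXT_TAGGING from `tasks` is the simpler choice.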
…r/NL-Augmenter into hashtagify_transformation
nltkdl("maxent_ne_chunker")
nltkdl("punkt")
nltkdl("averaged_perceptron_tagger")
nltkdl("stopwords")
You might want to move this to the constructor.
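A minimal sketch of what that move might look like, assuming `nltkdl` is an alias for `nltk.download` (the diff does not show the import). The class name `HashtagifySketch` and the `download` parameter are hypothetical, added here so the example runs without network access; the real class would inherit from `SentenceOperation`.

```python
from typing import Callable, Optional

# Resource names taken from the module-level calls in the diff.
NLTK_RESOURCES = (
    "maxent_ne_chunker",
    "punkt",
    "averaged_perceptron_tagger",
    "stopwords",
)


class HashtagifySketch:
    """Stand-in for the PR's class; the base class is omitted in this sketch."""

    def __init__(
        self,
        seed: int = 666,
        max_outputs: int = 1,
        download: Optional[Callable[[str], object]] = None,
    ):
        self.seed = seed
        self.max_outputs = max_outputs
        if download is None:
            # Imported lazily so importing this module has no side effects.
            import nltk

            download = nltk.download
        # Downloads now happen on instantiation, not at import time.
        for resource in NLTK_RESOURCES:
            download(resource)
```

With this layout, importing the module no longer triggers downloads; they run only when the transformation is constructed. Passing a fake `download` callable also makes the constructor easy to exercise in tests.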
def __init__(self, seed=666, max_outputs=1):
    super().__init__(seed, max_outputs=max_outputs)
    self.nlp = spacy.load("en_core_web_sm")
You can use spacy like this
#NewDelhi is among the many famous places in India.
```
I would suggest adding the robustness evaluation section too.
return perturbed_texts


class HashtagifyTransformation(SentenceOperation):
Please add a docstring for the class HashtagifyTransformation.
This transformation adapts an input sentence by identifying named entities and other common words and turning them into hashtags, as often used in social media.
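One possible shape for the requested docstring, built from the description above and the README example in the diff. This is only a sketch: `SentenceOperation` is replaced by an empty placeholder so the snippet runs on its own, and the wording is a suggestion, not the text that was eventually committed.

```python
class SentenceOperation:
    """Placeholder for the project's base class (assumption for this sketch)."""


class HashtagifyTransformation(SentenceOperation):
    """Turn named entities and other common words into hashtags.

    The transformation adapts an input sentence by identifying named
    entities and other common words and rewriting them as hashtags,
    as often used in social media.

    Example:
        "New Delhi is among the many famous places in India."
        -> "#NewDelhi is among the many famous places in India."
    """


# The docstring is available to help() and documentation tools.
print(HashtagifyTransformation.__doc__.splitlines()[0])
```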