Augmentation #4

Closed
5 of 12 tasks
KennethEnevoldsen opened this issue Mar 21, 2021 · 3 comments
Labels
enhancement New feature or request
Milestone

Comments

@KennethEnevoldsen
Collaborator

KennethEnevoldsen commented Mar 21, 2021

  • Entity augmentation
    • Gender augmentation (awareness of gender)
    • Second order person augmentation (Lastname, Firstname)
    • Usernames (autogenerates e.g. WhiteTruffle101 or Kenneth Enevoldsen -> KennethEnevoldsen)
  • Misspelling augmentations, see e.g. this repo
    • Keystroke error based on keyboard distance
  • Historic augmentations
    • æ -> ae, å -> aa (and a), ø -> oe
    • uppercasing of nouns
  • Social media
    • Adding hashtags augmentation
  • Others, potentially see this tweet or this kaggle summary
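
As a rough illustration of two of the character-level ideas above (historic spellings and keystroke errors), here is a minimal Python sketch. It is not the implementation used in this project: the function names `historic_spelling` and `keystroke_errors`, the `level` and `seed` parameters, and the partial `NEIGHBOURS` keyboard map are all hypothetical and only meant to show the idea.

```python
import random
from typing import Optional

# Historic Danish spellings from the list above: æ -> ae, å -> aa, ø -> oe.
HISTORIC = {"æ": "ae", "ø": "oe", "å": "aa", "Æ": "Ae", "Ø": "Oe", "Å": "Aa"}

# Partial, illustrative QWERTY neighbour map (an assumption, not a full layout).
NEIGHBOURS = {
    "a": "qwsz",
    "e": "wsdr",
    "n": "bhjm",
    "o": "iklp",
    "s": "awedxz",
    "t": "rfgy",
}


def historic_spelling(text: str) -> str:
    """Replace æ, ø and å with their older digraph spellings."""
    return "".join(HISTORIC.get(ch, ch) for ch in text)


def keystroke_errors(text: str, level: float = 0.1, seed: Optional[int] = None) -> str:
    """Swap a character for a neighbouring key with probability `level`."""
    rng = random.Random(seed)
    out = []
    for ch in text:
        keys = NEIGHBOURS.get(ch.lower())
        if keys and rng.random() < level:
            new = rng.choice(keys)
            # Keep the casing of the original character.
            out.append(new.upper() if ch.isupper() else new)
        else:
            out.append(ch)
    return "".join(out)


print(historic_spelling("Blåbærgrød"))  # Blaabaergroed
```

A real keystroke augmenter would need a full keyboard layout (ideally a Danish one) and probably a distance-weighted choice of neighbour; the map here only covers a handful of keys.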
@martincjespersen

martincjespersen commented May 3, 2021

  • Add entity augmentation for abbreviated first names e.g., Martin Jespersen -> M. Jespersen or Jespersen, M.
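
To make the suggestion concrete, a minimal sketch of the abbreviation could look like the following. The function name `abbreviate_first_name` and the `last_name_first` flag are hypothetical, and the sketch assumes a simple "Firstname Lastname" string where everything after the first token is treated as the last name.

```python
def abbreviate_first_name(full_name: str, last_name_first: bool = False) -> str:
    """Abbreviate the first name of a 'Firstname Lastname' string."""
    parts = full_name.split()
    if len(parts) < 2:
        return full_name  # a single token has nothing to abbreviate
    initial = parts[0][0] + "."
    last = " ".join(parts[1:])  # assumption: the rest is the last name
    return f"{last}, {initial}" if last_name_first else f"{initial} {last}"


print(abbreviate_first_name("Martin Jespersen"))                        # M. Jespersen
print(abbreviate_first_name("Martin Jespersen", last_name_first=True))  # Jespersen, M.
```

In an actual augmenter the entity span and its labels would also have to be updated, since the replacement changes the token count.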

@KennethEnevoldsen KennethEnevoldsen added this to the first release milestone Jun 24, 2021
@KennethEnevoldsen
Collaborator Author

Hi @martincjespersen, entity augmentation has just been added in the 1.0.0 update. I recommend you check out the documentation for robustness and biases in Danish NLP models. There will also be a paper coming out soon (probably Monday), and I will make sure to send it your way. Parts of this work have very much been inspired by your fairness benchmark 🙏, so I very much hope you find the results interesting.

@KennethEnevoldsen
Collaborator Author

As seen in issue #59, the augmenters will be replaced by the augmenty package.

@martincjespersen you might be especially interested in this project. It is still under development, but already looking very promising.
