Resolve NLP Bugs #76

Open
2 tasks
pmitros opened this issue May 15, 2023 · 0 comments
pmitros commented May 15, 2023

  • Fix informal language indicator.
    • Right now, the function for this indicator looks for a list of features that appear more often in informal language, but every feature is weighted equally and all detection happens at the word level.
    • We need to add other features (e.g., sentence fragments and slang elements such as "like" used as an adjective or interjection).
    • It would also be better if we had weights for informality, so that words like "this" or "that" were weighted less heavily than "like, like". To do this, we'd have to train a classifier on a labeled corpus. Good job for a grad student sometime?
    • In many cases, it would be better to tag entire sentences rather than words. This might be the easiest rewrite of the current indicator. If we added a layer to the indicator so that it returned offsets for a whole sentence only when the proportion of informal words in that sentence was above a threshold, we'd get highlighting that looks more plausible from a teacher's point of view, for relatively small effort.
  • Fix transition words. Right now, the temporal transition word indicator looks for temporal noun phrases but places no limits on their length or syntactic position. It should be restricted (a) to limit the length of the noun phrases it recognizes and (b) to exclude phrases that function as predicates (e.g., spaCy dependency labels such as attr).
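
A minimal sketch of the sentence-level layer suggested for the informal language indicator, assuming the indicator can be reduced to a word lookup. The names `INFORMAL_WORDS` and `informal_sentence_offsets`, the regex-based sentence splitter, and the 0.2 threshold are all hypothetical placeholders, not the project's actual code or feature list.

```python
import re
from typing import List, Set, Tuple

# Hypothetical stand-in for the indicator's feature list; the real list
# is not shown in this issue.
INFORMAL_WORDS: Set[str] = {"like", "gonna", "wanna", "kinda", "stuff", "yeah"}

# Naive sentence and word splitters, used only to keep the sketch
# self-contained. A real implementation would reuse the parser's sentences.
SENTENCE_RE = re.compile(r"[^.!?]+[.!?]*")
WORD_RE = re.compile(r"[A-Za-z']+")


def informal_sentence_offsets(
    text: str,
    threshold: float = 0.2,  # assumed value; would need tuning
    informal_words: Set[str] = INFORMAL_WORDS,
) -> List[Tuple[int, int]]:
    """Return (start, end) character offsets of sentences whose proportion
    of informal words is above `threshold`."""
    offsets = []
    for match in SENTENCE_RE.finditer(text):
        raw = match.group()
        words = WORD_RE.findall(raw)
        if not words:
            continue
        informal = sum(1 for w in words if w.lower() in informal_words)
        if informal / len(words) > threshold:
            # Trim surrounding whitespace so the highlight covers only
            # the sentence itself.
            start = match.start() + (len(raw) - len(raw.lstrip()))
            end = match.start() + len(raw.rstrip())
            offsets.append((start, end))
    return offsets
```

Because each hit is a whole-sentence span, a single "this" or "that" in an otherwise formal sentence would no longer be highlighted on its own. Per-feature weights could later replace the simple count without changing the return type.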
@pmitros pmitros created this issue from a note in Backlog (Icebox) May 15, 2023
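
For the transition word fix described in the issue body, the two restrictions could be sketched as a filter over candidate noun phrases. This assumes each candidate behaves like a spaCy `Span` (supports `len()` and exposes `root.dep_`, `start_char`, and `end_char`). The function name, the four-token cap, and the set of excluded labels are hypothetical; the issue names only attr.

```python
from typing import Iterable, List, Set, Tuple

MAX_NP_TOKENS = 4                   # assumed cap on phrase length, in tokens
PREDICATE_DEPS: Set[str] = {"attr"}  # the issue names attr; others may apply


def filter_temporal_noun_phrases(
    noun_phrases: Iterable,
    max_tokens: int = MAX_NP_TOKENS,
    excluded_deps: Set[str] = PREDICATE_DEPS,
) -> List[Tuple[int, int]]:
    """Keep only short noun phrases that are not functioning as predicates.

    Each item is expected to look like a spaCy Span: len() gives the token
    count, .root.dep_ the dependency label of the head token, and
    .start_char / .end_char the character offsets.
    """
    kept = []
    for phrase in noun_phrases:
        if len(phrase) > max_tokens:
            continue  # (a) too long to be a plausible transition phrase
        if phrase.root.dep_ in excluded_deps:
            continue  # (b) phrase is functioning as a predicate
        kept.append((phrase.start_char, phrase.end_char))
    return kept
```

With spaCy, the candidates would presumably come from `doc.noun_chunks` after the existing temporal check, so this filter could run as a final step without touching the detection logic.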