Add automatic file tag suggestions by file content using Machine Learning #9

Open
SeanPedersen opened this issue Jan 2, 2021 · 1 comment
Assignees
Labels
enhancement New feature or request
Projects

Comments

SeanPedersen commented Jan 2, 2021

When a new file is added, automatically infer tag suggestions for it from the tags of semantically similar existing files.

Depends on #24
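
To make the idea concrete, here is a minimal sketch of tag inference from similar files. It assumes every file already has an embedding vector (how those are computed is not specified in this issue), and `cosine`, `suggest_tags`, and the `library` layout are hypothetical names for illustration, not part of the project. Each tag is scored by the summed cosine similarity of the `k` nearest files that carry it.

```python
import math
from collections import defaultdict


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def suggest_tags(new_vec, library, k=3):
    """Rank tags for a new file.

    library: list of (embedding, set_of_tags) pairs for existing files (assumed layout).
    Returns (tag, score) pairs, highest score first, where the score is the
    summed similarity of the k nearest files that carry the tag.
    """
    nearest = sorted(library, key=lambda item: cosine(new_vec, item[0]), reverse=True)[:k]
    scores = defaultdict(float)
    for vec, tags in nearest:
        sim = cosine(new_vec, vec)
        for tag in tags:
            scores[tag] += sim
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Toy 2-dimensional embeddings, made up for illustration only.
library = [
    ([1.0, 0.0], {"physics", "paper"}),
    ([0.9, 0.1], {"physics"}),
    ([0.0, 1.0], {"recipes"}),
]
ranked = suggest_tags([1.0, 0.05], library, k=2)
```

In this toy example the two nearest files both carry `physics`, so it ranks first, and `recipes` never appears because its file is not among the nearest neighbours.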

@SeanPedersen SeanPedersen added the enhancement New feature or request label Jan 2, 2021
@SeanPedersen SeanPedersen changed the title Add automatic file tagging by file content using Machine Learning Add automatic file tag suggestions by file content using Machine Learning Jan 2, 2021
@SeanPedersen
Member Author

This feature is non-trivial and hard to get right. False positives (wrongly suggested tags) will be really annoying.

Some possible mitigations:

  • Be really conservative, e.g. suggest only the few top matches (those with high confidence)
  • Analyze cluster variance for each tag and prefer low variance (high density / coherence)
    • This will prevent generic tags like eBooks from being suggested for, e.g., a scientific paper. Specific tags will be preferred instead, increasing the signal.
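
Both mitigations could be combined roughly as follows. This is only a sketch under assumptions: files are represented by embedding vectors, a tag's "cluster variance" is taken to be the mean squared distance of its files to the tag centroid, and the scoring rule (similarity to the centroid minus a variance penalty) as well as `min_score`, `max_tags`, and `penalty` are hypothetical choices that would need tuning.

```python
import math


def centroid(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]


def variance(vectors):
    """Mean squared Euclidean distance of the vectors to their centroid."""
    c = centroid(vectors)
    return sum(sum((x - m) ** 2 for x, m in zip(v, c)) for v in vectors) / len(vectors)


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def conservative_suggestions(new_vec, tag_members, min_score=0.8, max_tags=2, penalty=1.0):
    """Suggest at most max_tags tags whose penalized score reaches min_score.

    tag_members: dict mapping a tag to the embeddings of files carrying it (assumed layout).
    Spread-out (generic) tags lose score through the variance penalty.
    """
    scored = []
    for tag, vectors in tag_members.items():
        score = cosine(new_vec, centroid(vectors)) - penalty * variance(vectors)
        if score >= min_score:
            scored.append((tag, score))
    scored.sort(key=lambda kv: kv[1], reverse=True)
    return scored[:max_tags]


# Toy data: "physics" is a tight cluster, "eBooks" is spread across the space.
tag_members = {
    "physics": [[1.0, 0.0], [0.9, 0.1]],
    "eBooks": [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]],
}
suggestions = conservative_suggestions([1.0, 0.05], tag_members)
```

Here the spread-out `eBooks` cluster falls below the threshold, so only the tight `physics` tag is suggested.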

@SeanPedersen SeanPedersen self-assigned this Jan 9, 2021
@SeanPedersen SeanPedersen added this to Low Priority in TODO Feb 1, 2021
@SeanPedersen SeanPedersen moved this from Low Priority to High Priority in TODO Feb 1, 2021