Create Annif corpus #560

Open
3 tasks
acka47 opened this issue Dec 1, 2020 · 2 comments

@acka47
Contributor

acka47 commented Dec 1, 2020

  • create a Short text document corpus (TSV file)
  • document the process
  • publish the corpus data (could also be useful as a test set for others who learn/teach Annif with German-language data)
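
A minimal sketch of the first task, for illustration only. It assumes the layout that Annif's documentation describes for a short text document corpus, as far as I understand it: one document per line, the text in the first column, then a tab, then the subject URIs in angle brackets. The function names, the sample title, and the `example.org` URIs are hypothetical placeholders, not actual NWBib data or existing lobid code.

```python
import io


def to_corpus_line(text, subject_uris):
    """Format one document as a TSV line: text, a tab, then <URI> entries."""
    # Collapse tabs, newlines and repeated spaces so the text stays in one column.
    clean_text = " ".join(text.split())
    uris = " ".join(f"<{uri}>" for uri in subject_uris)
    return f"{clean_text}\t{uris}"


def write_corpus(records, out):
    """Write (title, subject_uris) pairs to `out`; return the number of lines written."""
    written = 0
    for title, subject_uris in records:
        # Records without a title or without subjects are useless for training.
        if not title.strip() or not subject_uris:
            continue
        out.write(to_corpus_line(title, subject_uris) + "\n")
        written += 1
    return written


# Hypothetical sample record; real input would come from the NWBib data.
sample_records = [
    ("Geschichte der Stadt Köln", ["http://example.org/subjects/1"]),
]
buffer = io.StringIO()
write_corpus(sample_records, buffer)
print(buffer.getvalue(), end="")
```

Writing to any file-like object keeps the sketch easy to test; passing `open("corpus.tsv", "w", encoding="utf-8")` instead of the `StringIO` buffer would produce the actual file.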

Based on this, we'd be able to approach further steps in the long run:

  • train Annif with the data & provide a suggest API
  • use Annif for adding subjects to NWBib predecessor bibliographies
  • use/build an Annif plugin for Alma
  • enable suggestions in Alma for NWBib editors

acka47 assigned fsteeg and dr0i Dec 1, 2020
acka47 added this to Backlog in lobid board via automation Dec 1, 2020
acka47 moved this from Backlog to Ready in lobid board Dec 1, 2020
acka47 assigned acka47 and unassigned fsteeg and dr0i Dec 1, 2020

@fsteeg
Member

fsteeg commented Dec 1, 2020

Here is what I did with NWBib data in/after the Annif workshop at SWIB19:

https://github.com/fsteeg/python-data-analysis/tree/master/annif

@acka47
Contributor Author

acka47 commented Dec 3, 2020

See also hbz/lobid-resources#681
