A setup for keyword extraction from Estonian text with KeyBERT. It covers the provided languages and uses a specially chosen MMR diversity setting.

hermanpetrov/KeyBERT-Estonian-setup

KeyBERT-Estonian-setup

This is a setup for keyword extraction from Estonian text with the KeyBERT library. It covers the provided languages and uses a specially chosen MMR (Maximal Marginal Relevance) diversity setting. The chosen settings are based on research carried out for a master's thesis.

KeyBERT library: https://maartengr.github.io/KeyBERT/#about-the-project

The analysis and research scripts are in the RESEARCH folder. It contains all the components used to test the tool's accuracy, measured by F1 score for single-keyword extraction, against Sketch Engine's Simple Maths and the PageRank-based TextRank.
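To make the MMR diversity setting more concrete, here is a minimal pure-Python sketch of the usual MMR selection rule: each step picks the candidate that is most similar to the document and least similar to the keywords already chosen. This is an illustration only, not KeyBERT's own code and not the script from the RESEARCH folder. The function names, the toy vectors a caller would pass in, and the default `diversity=0.5` are assumptions made for the example.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 for zero vectors)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def mmr_select(doc_vec, cand_vecs, candidates, top_n=5, diversity=0.5):
    """Pick up to top_n candidates with Maximal Marginal Relevance.

    diversity=0.0 ranks purely by similarity to the document;
    values closer to 1.0 penalise candidates that resemble earlier picks.
    """
    if not candidates:
        return []

    # Relevance of every candidate to the document.
    doc_sim = [cosine(vec, doc_vec) for vec in cand_vecs]

    # Start with the single most relevant candidate.
    selected = [max(range(len(candidates)), key=lambda i: doc_sim[i])]
    remaining = [i for i in range(len(candidates)) if i != selected[0]]

    while remaining and len(selected) < top_n:
        def mmr_score(i):
            # Redundancy: highest similarity to anything already selected.
            redundancy = max(cosine(cand_vecs[i], cand_vecs[j]) for j in selected)
            return (1 - diversity) * doc_sim[i] - diversity * redundancy

        best = max(remaining, key=mmr_score)
        selected.append(best)
        remaining.remove(best)

    return [candidates[i] for i in selected]
```

With a low diversity value, near-duplicate candidates (for example two inflected forms of the same word) can both be selected. A higher value pushes the second pick toward a different topic.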

KeyBERT achieved good accuracy compared with the corpus-based Simple Maths method and TextRank, so the current setup is the highest-scoring configuration found for single-keyword extraction.
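For readers unfamiliar with the metric, the following sketch shows one common way to compute precision, recall and F1 for a set of extracted keywords against a reference set. It assumes exact matching after lowercasing and trimming. The matching rules actually used in the thesis scripts (lemmatisation, for example) are not described here and may differ. The function name `keyword_f1` is made up for this example.

```python
def keyword_f1(predicted, gold):
    """Return (precision, recall, f1) for predicted vs. reference keywords.

    Assumption: keywords match only if they are identical after
    lowercasing and stripping surrounding whitespace.
    """
    pred = {kw.strip().lower() for kw in predicted}
    ref = {kw.strip().lower() for kw in gold}
    if not pred or not ref:
        return 0.0, 0.0, 0.0

    true_positives = len(pred & ref)
    precision = true_positives / len(pred)
    recall = true_positives / len(ref)
    if true_positives == 0:
        return precision, recall, 0.0

    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1
```

The same function can score the output of any of the compared tools, as long as each tool's keywords are passed in as a plain list of strings.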

As an experiment, the project also looked at finding single keywords through KeyBERT's n-gram setting and found that keyphrase extraction yields many more keywords than single-keyword extraction.
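The n-gram experiment can be tried along these lines. The sketch calls KeyBERT's `extract_keywords` with MMR enabled and a configurable `keyphrase_ngram_range`, then separates single-word results from multi-word keyphrases. The model name `paraphrase-multilingual-MiniLM-L12-v2`, the `diversity=0.5` value and both helper function names are placeholders chosen for this example, not the settings selected in the thesis. `stop_words=None` is used because KeyBERT's default English stop-word list does not suit Estonian text.

```python
def extract_with_ngrams(text, ngram_range=(1, 1), top_n=10, diversity=0.5,
                        model_name="paraphrase-multilingual-MiniLM-L12-v2"):
    """Run KeyBERT with MMR on one document; returns a list of (keyword, score).

    model_name and diversity are illustrative defaults (assumptions),
    not the values chosen in the repository's research.
    """
    # Imported here so the helper below stays usable without KeyBERT installed.
    from keybert import KeyBERT

    kw_model = KeyBERT(model=model_name)
    return kw_model.extract_keywords(
        text,
        keyphrase_ngram_range=ngram_range,  # (1, 1) = single words only
        stop_words=None,                    # default 'english' list is not suitable
        use_mmr=True,
        diversity=diversity,
        top_n=top_n,
    )


def split_single_and_phrases(results):
    """Split (keyword, score) pairs into single words and multi-word phrases."""
    singles = [(kw, score) for kw, score in results if len(kw.split()) == 1]
    phrases = [(kw, score) for kw, score in results if len(kw.split()) > 1]
    return singles, phrases
```

Calling `extract_with_ngrams(text, (1, 1))` keeps the output to single keywords, while a wider range such as `(1, 3)` also allows keyphrases of up to three words. Passing the second result through `split_single_and_phrases` shows how many of each kind come back.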
