Yake.NET

A pure C# implementation of YAKE (Yet Another Keyword Extractor) — an unsupervised, lightweight, and language-agnostic keyword extraction algorithm for single documents.

Lower score = more relevant keyword.

Features

✅ No external dependencies — 100% pure C#
✅ Unsupervised — no training data or models needed
✅ Language-agnostic — works with any language when paired with custom stopwords
✅ N-gram support — extracts single words and multi-word keyphrases
✅ Built-in English stopwords
✅ Deduplication of near-duplicate candidates
✅ Fully configurable via YakeOptions
✅ Targets .NET 10

Installation

dotnet add package Yake.NET

Quick Start

using Yake.NET;

var extractor = new YakeExtractor();
var keywords = extractor.Extract("Your document text goes here...");

foreach (var kw in keywords)
    Console.WriteLine(kw); // e.g. "keyword extraction (score: 0.001234)"

Configuration

var extractor = new YakeExtractor(new YakeOptions
{
    MaxNGramSize          = 3,    // include up to trigrams
    TopN                  = 10,   // return top 10 keywords
    DeduplicationThreshold = 0.9, // similarity threshold for dedup (0–1)
    WindowSize            = 2,    // co-occurrence window
    Stopwords             = null  // null = use built-in English list
});
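To give a feel for what `DeduplicationThreshold` controls, here is a minimal sketch of threshold-based deduplication. It is not Yake.NET's actual implementation: the similarity measure (normalised Levenshtein distance), the `>=` comparison, and the helper names `Levenshtein`, `Similarity`, and `Deduplicate` are all assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: classic two-row dynamic-programming edit distance.
static int Levenshtein(string a, string b)
{
    var prev = new int[b.Length + 1];
    var curr = new int[b.Length + 1];
    for (int j = 0; j <= b.Length; j++) prev[j] = j;

    for (int i = 1; i <= a.Length; i++)
    {
        curr[0] = i;
        for (int j = 1; j <= b.Length; j++)
        {
            int cost = a[i - 1] == b[j - 1] ? 0 : 1;
            curr[j] = Math.Min(Math.Min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
        }
        (prev, curr) = (curr, prev); // reuse the two rows
    }
    return prev[b.Length];
}

// Hypothetical similarity in [0, 1]: 1.0 means identical strings.
static double Similarity(string a, string b)
{
    int maxLen = Math.Max(a.Length, b.Length);
    return maxLen == 0 ? 1.0 : 1.0 - (double)Levenshtein(a, b) / maxLen;
}

// Assumption: candidates arrive best-first, and a candidate is dropped when it
// is at least `threshold` similar to one that has already been kept.
static List<string> Deduplicate(IEnumerable<string> rankedCandidates, double threshold)
{
    var kept = new List<string>();
    foreach (var candidate in rankedCandidates)
    {
        bool isDuplicate = kept.Exists(k => Similarity(k, candidate) >= threshold);
        if (!isDuplicate) kept.Add(candidate);
    }
    return kept;
}

var ranked = new[] { "keyword extraction", "keyword extractions", "stopwords" };
var result = Deduplicate(ranked, 0.9);
Console.WriteLine(string.Join(", ", result));
```

Under these assumptions, "keyword extractions" differs from "keyword extraction" by one edit over 19 characters (similarity of about 0.95), so it is dropped at a threshold of 0.9. A lower threshold removes more candidates; a threshold close to 1 keeps almost everything.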

Custom Stopwords

// Extend the built-in English stopwords
var customStopwords = StopwordsEn.Words.Concat(["custom", "words"]);

var extractor = new YakeExtractor(new YakeOptions
{
    Stopwords = customStopwords
});

How YAKE Works

YAKE scores each candidate keyword using 5 statistical features:

| Feature | Description |
| --- | --- |
| T_Casing | Ratio of capitalised (non-sentence-start) occurrences |
| T_Position | Inverse log of the sentence where the word first appears |
| T_Frequency | Normalised term frequency |
| T_Relatedness | Diversity of co-occurring words relative to frequency |
| T_DifferentSentence | Fraction of sentences containing the word |

The final score formula for n-grams:

score(candidate) = ∏(word_scores) / (n × (n + Σ word_scores))
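The formula above can be evaluated by hand. The sketch below computes it for a list of per-word scores, assuming `n` is the number of words in the candidate (the README does not define `n`). `ScoreCandidate` is a hypothetical helper written for this example, not part of the Yake.NET API.

```csharp
using System;
using System.Linq;

// Hypothetical helper: combines per-word scores into a candidate score using
// product / (n * (n + sum)). Assumption: n = number of words in the candidate.
static double ScoreCandidate(double[] wordScores)
{
    if (wordScores.Length == 0)
        throw new ArgumentException("A candidate needs at least one word.", nameof(wordScores));

    int n = wordScores.Length;
    double product = wordScores.Aggregate(1.0, (acc, s) => acc * s);
    double sum = wordScores.Sum();
    return product / (n * (n + sum));
}

double unigram = ScoreCandidate(new[] { 0.5 });
double bigram = ScoreCandidate(new[] { 0.1, 0.2 });
Console.WriteLine($"unigram: {unigram:F4}, bigram: {bigram:F4}");
```

With these inputs the unigram works out to 0.5 / (1 × 1.5) ≈ 0.333 and the bigram to 0.02 / (2 × 2.3) ≈ 0.0043. Because the numerator is a product, a phrase made of low-scoring (relevant) words ends up with a much lower score than any of its words alone.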

API Reference

`YakeExtractor`

| Member | Description |
| --- | --- |
| `YakeExtractor()` | Creates extractor with default English options |
| `YakeExtractor(YakeOptions)` | Creates extractor with custom options |
| `IReadOnlyList<KeywordResult> Extract(string text)` | Extracts keywords from text |

`KeywordResult`

| Property | Type | Description |
| --- | --- | --- |
| `Keyword` | `string` | The extracted keyword or keyphrase |
| `Score` | `double` | YAKE score (lower = more relevant) |

`YakeOptions`

| Property | Default | Description |
| --- | --- | --- |
| `MaxNGramSize` | `3` | Maximum n-gram length |
| `TopN` | `10` | Number of keywords to return |
| `DeduplicationThreshold` | `0.9` | Similarity cutoff for deduplication |
| `WindowSize` | `2` | Co-occurrence context window size |
| `Stopwords` | `null` | Custom stopwords (null = English built-ins) |

References

Campos, R., Mangaravite, V., Pasquali, A., Jorge, A., Nunes, C., & Jatowt, A. (2020). YAKE! Keyword extraction from single documents using multiple local features. Information Sciences, 509, 257–289.

License

MIT
