add benchmark docs
ophelielacroix committed Sep 29, 2020
1 parent c758e74 commit 398c353
Showing 3 changed files with 48 additions and 5 deletions.
Binary file added docs/imgs/chunk_features.png
39 changes: 39 additions & 0 deletions docs/models/chunking.md
Noun-Phrase (NP) Chunking
=========================

Chunking is the task of grouping the words of a sentence into syntactic phrases (e.g. noun-phrase, verb phrase).
Here, we focus on the prediction of noun-phrases. Noun phrases can be pronouns, proper nouns or nouns (potentially combined with adjectives or verbs).
In sentences, noun phrases are generally used as subjects or objects (or complements of prepositions).
Examples of noun-phrases:
* en `bog` (NOUN)
* en `god bog` (ADJ+NOUN)
* en `reserveret bog` (VERB+NOUN)

NP-chunks can be deduced from dependencies.
The NP-chunking "model" is thus a conversion function that depends on a [dependency model](https://github.com/alexandrainst/danlp/blob/master/docs/models/dependency.md) trained on the Danish UD treebank.
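The idea of deducing NP chunks from a dependency parse can be pictured with a deliberately simplified sketch. Note that the token format, tag sets and chunking rule below are illustrative assumptions, not the package's actual API or conversion logic:

```python
# Hypothetical, simplified NP-chunking rule over a dependency parse.
# Each token is (form, UPOS tag, head), where head is a 1-based index
# into the sentence (0 = root), as in CoNLL-U.

NOMINAL = {"NOUN", "PROPN", "PRON"}
MODIFIER = {"DET", "ADJ", "VERB", "NUM", "ADV"}

def np_chunks(tokens):
    """Return NP spans as (start, end) pairs, 0-based, end-exclusive."""
    chunks = []
    for i, (form, upos, head) in enumerate(tokens):
        if upos not in NOMINAL:
            continue
        # Grow the chunk leftwards over contiguous modifiers whose
        # head is this nominal token.
        start = i
        while start > 0 and tokens[start - 1][1] in MODIFIER \
                and tokens[start - 1][2] == i + 1:
            start -= 1
        chunks.append((start, i + 1))
    return chunks

# "en god bog" (a good book): DET and ADJ both attach to the NOUN.
sent = [("en", "DET", 3), ("god", "ADJ", 3), ("bog", "NOUN", 0)]
print(np_chunks(sent))  # -> [(0, 3)]
```

The real conversion function handles many more dependency relations; this sketch only shows why a trained dependency model is sufficient to produce NP chunks without a separate chunking model.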

| Model | Train Data | License | Trained by | Tags | DaNLP |
|-------|-------|-------|-------|-------|-------|
| [SpaCy](https://github.com/alexandrainst/danlp/blob/master/docs/models/chunking.md#spacy) | [Danish Dependency Treebank](<https://github.com/alexandrainst/danlp/blob/master/docs/datasets.md#danish-dependency-treebank-dane>) | MIT | Alexandra Institute | NP | ✔️ |




##### :wrench: SpaCy

Read more about the SpaCy model in the dedicated [SpaCy docs](<https://github.com/alexandrainst/danlp/blob/master/docs/spacy.md>). It has also been trained on the [Danish Dependency Treebank](<https://github.com/alexandrainst/danlp/blob/master/docs/datasets.md#danish-dependency-treebank-dane>) dataset.

![](../imgs/chunk_features.png)




## 📈 Benchmarks

NP chunking scores (precision, recall and F1) are reported below:

| Model | Precision | Recall | F1 |
|-------|-----------|--------|-------|
| SpaCy | 91.32 | 91.79 | 91.56 |
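As a sanity check, F1 is the harmonic mean of precision and recall, so it can be recomputed from the other two columns (the tiny discrepancy with the table comes from precision and recall themselves being rounded):

```python
precision, recall = 91.32, 91.79  # values from the table above

# F1 is the harmonic mean of precision and recall.
f1 = 2 * precision * recall / (precision + recall)
print(round(f1, 2))  # agrees with the reported 91.56 up to rounding
```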

See detailed scoring of the benchmarks in the [examples](<https://github.com/alexandrainst/danlp/tree/master/examples>) folder.
14 changes: 9 additions & 5 deletions examples/benchmarks/README.md
Benchmarks scripts
==================

This folder contains scripts to reproduce the benchmark results reported for the different models in the docs in this project.

The benchmark scripts evaluate models implemented in the danlp package, but also models implemented in other frameworks.
Therefore, additional packages need to be installed to run some of the scripts.
You can either look in a specific script to check which packages it needs, or install all required packages with `pip install requirements_benchmark.py`.
To run `sentiment_benchmarks_twitter` you need a Twitter developer account and have to set the keys as environment variables.

#### List of current benchmark scripts

- Benchmark script for word embeddings in `wordembeddings_benchmarks.py`

- Benchmark script on the
[DaNE](https://github.com/alexandrainst/danlp/blob/master/docs/datasets.md#danish-dependency-treebank)
NER dataset in `ner_benchmarks.py`

- Benchmark script for sentiment classification on [LCC Sentiment](https://github.com/alexandrainst/danlp/blob/master/docs/datasets.md#lcc-sentiment) and [Europarl Sentiment](https://github.com/alexandrainst/danlp/blob/master/docs/datasets.md#europarl-sentiment) using the tools [AFINN](https://github.com/alexandrainst/danlp/blob/master/docs/models/sentiment_analysis.md#afinn) and [Sentida](https://github.com/alexandrainst/danlp/blob/master/docs/models/sentiment_analysis.md#sentida), where the scores are converted to a three-class problem. It also includes a benchmark of [BERT Tone (polarity)](https://github.com/alexandrainst/danlp/blob/master/docs/models/sentiment_analysis.md#wrenchbert-tone), in `sentiment_benchmarks.py`

- `sentiment_benchmarks_twitter.py` shows evaluation on a small Twitter dataset for both polarity and subjectivity/objectivity classification

- Benchmark script of [Part of Speech tagging](<https://github.com/alexandrainst/danlp/blob/master/docs/models/pos.md>) on the [Danish Dependency Treebank](<https://github.com/alexandrainst/danlp/blob/master/docs/datasets.md#danish-dependency-treebank-dane>). spaCy, flair and polyglot models are benchmarked in `pos_benchmarks.py`

- Benchmark script of [Dependency Parsing](<https://github.com/alexandrainst/danlp/blob/master/docs/models/dependency.md>) on the [Danish Dependency Treebank](<https://github.com/alexandrainst/danlp/blob/master/docs/datasets.md#danish-dependency-treebank-dane>). A spaCy model is benchmarked in `dependency_benchmarks.py`


- Benchmark script of [Noun-phrase Chunking](<https://github.com/alexandrainst/danlp/blob/master/docs/models/chunking.md>) on the [Danish Dependency Treebank](<https://github.com/alexandrainst/danlp/blob/master/docs/datasets.md#danish-dependency-treebank-dane>). A spaCy model is benchmarked in `chunking_benchmarks.py`
