update docs
ikuyamada committed Jan 11, 2024
1 parent 76b2281 commit 3fc51ac
Showing 4 changed files with 28 additions and 179 deletions.
45 changes: 1 addition & 44 deletions docs/css/extra.css
@@ -1,4 +1,4 @@
#main_title {
h1 {
font-family: 'Raleway', sans-serif;
font-weight: 500;
}
@@ -8,46 +8,3 @@
font-size: 18px;
font-weight: 400;
}

pre code {
font-family: Menlo, Monaco, Consolas, "Courier New", monospace;
}

code.no-highlight {
color: black;
}

div.col-md-9 h1:first-of-type .headerlink {
display: none;
}

div.col-md-9 h1:first-of-type {
font-size: 40px;
font-weight: 300;
}

@media screen and (min-width: 768px) and (max-width: 999px) {
.nav>li>a {
padding-left: 15px;
padding-right: 15px;
}
}

@media screen and (min-width: 768px) and (max-width: 991px) {
.nav>li>a[href="#"] {
display: none;
}
}

@media screen and (min-width: 768px) and (max-width: 1199px) {
.nav>li>a[href^="https://github.com/"] {
display: none;
}
.nav>li>a[rel="prev"] {
display: none;
}

.nav>li>a[rel="next"] {
display: none;
}
}
110 changes: 0 additions & 110 deletions docs/custom_theme/nav.html

This file was deleted.

45 changes: 23 additions & 22 deletions docs/index.md
@@ -1,9 +1,10 @@
<h1 id="main_title">Wikipedia2Vec</h1>
# Wikipedia2Vec

---

<a class="github-button" href="https://github.com/wikipedia2vec/wikipedia2vec" data-size="large" data-show-count="true" aria-label="Star wikipedia2vec/wikipedia2vec on GitHub">Star</a>

### Introduction
## Introduction

Wikipedia2Vec is a tool used for obtaining embeddings (or vector representations) of words and entities (i.e., concepts that have corresponding pages in Wikipedia) from Wikipedia.
It is developed and maintained by [Studio Ousia](http://www.ousia.jp).
@@ -15,31 +16,31 @@ This tool implements the [conventional skip-gram model](https://en.wikipedia.org

An empirical comparison between Wikipedia2Vec and existing embedding tools (i.e., FastText, Gensim, RDF2Vec, and Wiki2vec) is available [here](https://arxiv.org/abs/1812.06280).

### Pretrained Embeddings
## Pretrained Embeddings

Pretrained embeddings for 12 languages (i.e., English, Arabic, Chinese, Dutch, French, German, Italian, Japanese, Polish, Portuguese, Russian, and Spanish) can be downloaded from [this page](pretrained.md).

### Use Cases
## Use Cases

Wikipedia2Vec has been applied to the following tasks:

* Entity linking: [Yamada et al., 2016](https://arxiv.org/abs/1601.01343), [Eshel et al., 2017](https://arxiv.org/abs/1706.09147), [Chen et al., 2019](https://arxiv.org/abs/1911.03834), [Poerner et al., 2020](https://arxiv.org/abs/1911.03681), [van Hulst et al., 2020](https://arxiv.org/abs/2006.01969).
* Named entity recognition: [Sato et al., 2017](http://www.aclweb.org/anthology/I17-2017), [Lara-Clares and Garcia-Serrano, 2019](http://ceur-ws.org/Vol-2421/eHealth-KD_paper_6.pdf).
* Question answering: [Yamada et al., 2017](https://arxiv.org/abs/1803.08652), [Poerner et al., 2020](https://arxiv.org/abs/1911.03681).
* Entity typing: [Yamada et al., 2018](https://arxiv.org/abs/1806.02960).
* Text classification: [Yamada et al., 2018](https://arxiv.org/abs/1806.02960), [Yamada and Shindo, 2019](https://arxiv.org/abs/1909.01259), [Alam et al., 2020](https://link.springer.com/chapter/10.1007/978-3-030-61244-3_9).
* Relation classification: [Poerner et al., 2020](https://arxiv.org/abs/1911.03681).
* Paraphrase detection: [Duong et al., 2018](https://ieeexplore.ieee.org/abstract/document/8606845).
* Knowledge graph completion: [Shah et al., 2019](https://aaai.org/ojs/index.php/AAAI/article/view/4162), [Shah et al., 2020](https://www.aclweb.org/anthology/2020.textgraphs-1.9/).
* Fake news detection: [Singh et al., 2019](https://arxiv.org/abs/1906.11126), [Ghosal et al., 2020](https://arxiv.org/abs/2010.10836).
* Plot analysis of movies: [Papalampidi et al., 2019](https://arxiv.org/abs/1908.10328).
* Novel entity discovery: [Zhang et al., 2020](https://arxiv.org/abs/2002.00206).
* Entity retrieval: [Gerritse et al., 2020](https://link.springer.com/chapter/10.1007%2F978-3-030-45439-5_7).
* Deepfake detection: [Zhong et al., 2020](https://arxiv.org/abs/2010.07475).
* Conversational information seeking: [Rodriguez et al., 2020](https://arxiv.org/abs/2005.00172).
* Query expansion: [Rosin et al., 2020](https://arxiv.org/abs/2012.12065).

### References
- Entity linking: [Yamada et al., 2016](https://arxiv.org/abs/1601.01343), [Eshel et al., 2017](https://arxiv.org/abs/1706.09147), [Chen et al., 2019](https://arxiv.org/abs/1911.03834), [Poerner et al., 2020](https://arxiv.org/abs/1911.03681), [van Hulst et al., 2020](https://arxiv.org/abs/2006.01969).
- Named entity recognition: [Sato et al., 2017](http://www.aclweb.org/anthology/I17-2017), [Lara-Clares and Garcia-Serrano, 2019](http://ceur-ws.org/Vol-2421/eHealth-KD_paper_6.pdf).
- Question answering: [Yamada et al., 2017](https://arxiv.org/abs/1803.08652), [Poerner et al., 2020](https://arxiv.org/abs/1911.03681).
- Entity typing: [Yamada et al., 2018](https://arxiv.org/abs/1806.02960).
- Text classification: [Yamada et al., 2018](https://arxiv.org/abs/1806.02960), [Yamada and Shindo, 2019](https://arxiv.org/abs/1909.01259), [Alam et al., 2020](https://link.springer.com/chapter/10.1007/978-3-030-61244-3_9).
- Relation classification: [Poerner et al., 2020](https://arxiv.org/abs/1911.03681).
- Paraphrase detection: [Duong et al., 2018](https://ieeexplore.ieee.org/abstract/document/8606845).
- Knowledge graph completion: [Shah et al., 2019](https://aaai.org/ojs/index.php/AAAI/article/view/4162), [Shah et al., 2020](https://www.aclweb.org/anthology/2020.textgraphs-1.9/).
- Fake news detection: [Singh et al., 2019](https://arxiv.org/abs/1906.11126), [Ghosal et al., 2020](https://arxiv.org/abs/2010.10836).
- Plot analysis of movies: [Papalampidi et al., 2019](https://arxiv.org/abs/1908.10328).
- Novel entity discovery: [Zhang et al., 2020](https://arxiv.org/abs/2002.00206).
- Entity retrieval: [Gerritse et al., 2020](https://link.springer.com/chapter/10.1007%2F978-3-030-45439-5_7).
- Deepfake detection: [Zhong et al., 2020](https://arxiv.org/abs/2010.07475).
- Conversational information seeking: [Rodriguez et al., 2020](https://arxiv.org/abs/2005.00172).
- Query expansion: [Rosin et al., 2020](https://arxiv.org/abs/2012.12065).

## References

If you use Wikipedia2Vec in a scientific publication, please cite the following paper:

@@ -86,6 +87,6 @@ Ikuya Yamada, Hiroyuki Shindo, [Neural Attentive Bag-of-Entities Model for Text
}
```

### License
## License

[Apache License 2.0](http://www.apache.org/licenses/LICENSE-2.0)
7 changes: 4 additions & 3 deletions mkdocs.yml
@@ -3,19 +3,20 @@ site_description: Wikipedia2Vec is a tool used for obtaining embeddings (vector
repo_url: https://github.com/wikipedia2vec/wikipedia2vec
edit_uri: ""
site_author: Studio Ousia
pages:
nav:
- Home: index.md
- Introduction: intro.md
- User Guide:
- Installation: install.md
- Learning Embeddings: commands.md
- API Usage: usage.md
- Pretrained Embeddings: pretrained.md
- Embeddings: pretrained.md
- Demo: "https://wikipedia2vec.github.io/demo/"
extra_css:
- css/extra.css
theme:
name: mkdocs
custom_dir: docs/custom_theme
markdown_extensions:
- toc:
permalink:
permalink: "#"
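
Note on the `pages:` to `nav:` change in this hunk: it appears to follow the MkDocs convention, where the `pages` setting was renamed to `nav` in MkDocs 1.0. The sketch below shows one way such a rename could be scripted. It is not part of this commit; the helper name `rename_pages_to_nav` is hypothetical, and the sample config is a shortened excerpt built from lines shown above.

```python
import re


def rename_pages_to_nav(config_text: str) -> str:
    """Rename a top-level `pages:` key to `nav:` (hypothetical helper).

    Only a `pages:` key that starts at column 0 is rewritten, so indented
    keys and ordinary text containing the word are left untouched.
    """
    return re.sub(r"^pages:(?=\s|$)", "nav:", config_text, flags=re.MULTILINE)


# Shortened excerpt of an old-style config, for illustration only.
old_config = """site_author: Studio Ousia
pages:
- Home: index.md
- Introduction: intro.md
"""

new_config = rename_pages_to_nav(old_config)
print(new_config)
```

A plain text substitution is used instead of a YAML round trip so that comments and key order in the file are preserved.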
