## Installation

To install the latest version with pip, run:

In [None]:
!pip install worde4mde

Alternatively, to use the source code version, run:

In [None]:
import sys
sys.path.append("../src")

## Use

The library simplifies loading the different models trained as part of WordE4MDE. In particular, the following models are available:
* **sgram-mde**: A word2vec model trained with modeling texts. It is the smallest model but performs similarly to the others.
* **sgram-mde-so**: A similar model, but also trained with posts from StackOverflow.
* **glove-mde**: A GloVe model trained with modeling texts. Also a small model.
* **fasttext-mde**: A [FastText](https://fasttext.cc/) model which solves the out-of-vocabulary problem by including subword information. This model is much larger than the others (~2GB).
* **fasttext-mde-so**: A similar model but trained also with posts from StackOverflow.

To load a model, call the `load_embeddings` function, which takes care of downloading the model and storing it in the `.worde4mde` folder in the user's home directory.

In [None]:
import worde4mde

In [None]:
# Pick one model identifier; leave the others commented out
# model_id = 'sgram-mde'
# model_id = 'glove-mde'
# model_id = 'fasttext-mde'
# model_id = 'fasttext-mde-so'
model_id = 'sgram-mde-so'

model = worde4mde.load_embeddings(embedding_model=model_id)
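
To check what has already been downloaded, the cache folder mentioned above can be inspected directly. The following is a minimal sketch that assumes the folder is exactly `.worde4mde` under the user's home directory, as described; the helper name `cached_model_files` is hypothetical and not part of the library.

```python
from pathlib import Path

# Assumption: models are cached in `.worde4mde` inside the user's home directory
cache_dir = Path.home() / ".worde4mde"

def cached_model_files(directory=cache_dir):
    """Hypothetical helper: list the entries in the cache folder, or [] if it does not exist yet."""
    if not directory.is_dir():
        return []
    return sorted(entry.name for entry in directory.iterdir())

print(cached_model_files())
```

An empty list simply means that no model has been downloaded yet.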

## Example

As a simple example of using the model, let's build a function to compute the words that are most similar to a given one.

In [None]:
def similar_words_to(model, term, topn=10):
    """Return the `topn` words most similar to `term`, using gensim's `most_similar`."""
    similar = model.most_similar(positive=[term], topn=topn)
    # `most_similar` returns (word, score) pairs; keep only the words
    return [word for word, _score in similar]
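
gensim's `most_similar` ranks words by the cosine similarity of their vectors. To make that ranking concrete, here is a small sketch in plain Python. The three-dimensional vectors are made up for illustration and do not come from any WordE4MDE model, and `toy_most_similar` is a hypothetical stand-in, not the gensim implementation.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Made-up toy vectors, for illustration only
toy_vectors = {
    "transformation": [0.9, 0.1, 0.0],
    "mapping": [0.8, 0.2, 0.1],
    "button": [0.0, 0.1, 0.9],
}

def toy_most_similar(term, topn=10):
    """Rank every other word by cosine similarity to `term`, highest first."""
    query = toy_vectors[term]
    scored = [
        (word, cosine_similarity(query, vector))
        for word, vector in toy_vectors.items()
        if word != term
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:topn]

print(toy_most_similar("transformation", topn=2))
```

With these toy values, "mapping" ranks above "button" because its vector points in nearly the same direction as the one for "transformation".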

In [None]:
similar_words_to(model, 'transformation', topn = 20)

# Note: for the FastText models, pass `model.wv` instead of `model`,
# since the function expects the gensim word vectors
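
To avoid remembering which model needs `.wv`, a small wrapper can pick the right object automatically. This is a sketch under an assumption: that the FastText models expose their word vectors under a `wv` attribute (as the note above suggests) while the other models can be queried directly. The helper name `as_keyed_vectors` is hypothetical.

```python
def as_keyed_vectors(model):
    """Hypothetical helper: return `model.wv` when it exists, otherwise the model itself."""
    # Assumption: only models that wrap their vectors (e.g. FastText) have a `wv` attribute
    return getattr(model, "wv", model)
```

It could then be used as `similar_words_to(as_keyed_vectors(model), 'transformation', topn=20)`, regardless of which `model_id` was loaded.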