document embedding and machine learning script for beginners


This repository contains a Korean corpus and Python scripts for training doc2vec models and inferring vectors for test documents.

Demo Site

Raw Corpus

PreTrained Doc2vec Model

Korean word2vec-api / doc2vec-api

A simple web service that provides a word embedding API. The methods are based on the Gensim Word2Vec / Doc2Vec implementations. Models are passed as parameters and must be in the Word2Vec / Doc2Vec text or binary format. The word2vec-api script is forked from the word2vec-api GitHub project, with minor updates to support Korean word2vec models.

  • Install Dependencies
pip2 install -r requirements.txt
  • Launching the service
python word2vec-api --model path/to/the/model [--host host --port 1234]
ex) python /home/ --model /home/model/all_terms_50vectors --path /word2vec --host --port 4000

python doc2vec-api --model path/to/the/model [--host host --port 1234]
ex) python /home/ --model /home/model/all_terms_50vectors --path /doc2vec --host --port 4000

  • Example calls
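The original example calls are not preserved here. As a rough illustration only, the sketch below shows how a request URL could be assembled from the `--host`, `--port`, and `--path` values used at launch. The host `localhost`, the endpoint name `most_similar`, and the parameter names `positive` and `topn` are assumptions modeled on the upstream word2vec-api project and may not match this fork; check the script for the routes it actually registers.

```python
from urllib.parse import urlencode


def build_call_url(host, port, path, endpoint, params):
    """Build a GET URL for a service launched with --host, --port and --path.

    `endpoint` and the keys in `params` are hypothetical; they are not
    confirmed by this repository.
    """
    base = f"http://{host}:{port}/{path.strip('/')}"
    # doseq=True expands list values into repeated keys. urlencode
    # percent-encodes non-ASCII text (such as Korean) as UTF-8.
    query = urlencode(params, doseq=True)
    return f"{base}/{endpoint}?{query}"


# Assumed values: host "localhost", endpoint "most_similar".
url = build_call_url(
    "localhost", 4000, "/word2vec", "most_similar",
    {"positive": ["한국"], "topn": 5},
)
print(url)
```

Building the query with `urlencode` rather than string concatenation keeps Korean query terms correctly percent-encoded, which matters when the URL is pasted into a tool such as `curl`.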