fastTextR


The fastTextR package is an R wrapper for the skipgram and cbow functions (only) of the fastText library. fastText is a library for efficient learning of word representations and sentence classification. Because fastText uses C++11 features, it requires a compiler with good C++11 support, such as gcc-4.6.3 or newer, or clang-3.3 or newer. More information about the fastText library can be found at https://github.com/facebookresearch/fastText. The COPYRIGHTS, LICENSE and PATENTS files can be found in the inst folder of the R package.

A detailed example can be found in my blog post about text processing, in the section 'word vectors'.


To install the package from GitHub, use the install_github function of the devtools package:

devtools::install_github('mlampros/fastTextR')


Use the following link to report bugs or issues with the R wrapper:

https://github.com/mlampros/fastTextR/issues


Example usage
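
The example in this section reads a plain-text input file ('dat.txt') and a file of query words ('queries.txt'). As a minimal sketch of how such files might be prepared with base R: the toy corpus, the lower-casing and punctuation removal, the query words and the temporary directory are all illustrative assumptions, not requirements of the package. It also assumes one sentence per line in the input file and one word per line in the queries file.

```r
# Hypothetical toy corpus (illustrative only): one sentence per line.
corpus <- c("The quick brown fox jumps over the lazy dog.",
            "Word vectors capture similarities between words!")

# Simple normalisation (an assumption, not something the package requires):
# lower-case, replace punctuation with spaces, collapse repeated whitespace.
clean <- tolower(corpus)
clean <- gsub("[[:punct:]]+", " ", clean)
clean <- trimws(gsub("[[:space:]]+", " ", clean))

# Write the files to a temporary folder; adjust the folder to your own setup.
data_dir <- file.path(tempdir(), "data_fasttext")
dir.create(data_dir, showWarnings = FALSE)

input_path   <- file.path(data_dir, "dat.txt")
queries_path <- file.path(data_dir, "queries.txt")

writeLines(clean, input_path)                     # training text
writeLines(c("foxes", "wording"), queries_path)   # hypothetical query words
```

The resulting paths can then be passed as input_path and unknown_words_path in the calls shown in this section.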


# example input data ---> 'dat.txt'



library(fastTextR)



#--------------------------
# skipgram or cbow methods
#--------------------------


res = skipgram_cbow(input_path = "/data_fasttext/dat.txt",
                    output_path = "/data_fasttext/model",
                    method = "skipgram", lr = 0.1,
                    lrUpdateRate = 100, dim = 100,
                    ws = 5, epoch = 5, minCount = 1,
                    neg = 5, wordNgrams = 1, loss = "ns",
                    bucket = 2000000, minn = 0,
                    maxn = 0, thread = 6, t = 0.0001,
                    verbose = 2)


#-------------------------------------------------------------
# prediction of unknown words for the skipgram and cbow models
#-------------------------------------------------------------


res = predict_unknown_words(skipgram_cbow_model_output = "/data_fasttext/model.bin",
                            unknown_words_path = "/data_fasttext/queries.txt",
                            output_path = "/data_fasttext/NEW_VEC",
                            verbose = TRUE)

More information about the parameters of each function can be found in the package documentation.
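
Once vectors have been written to disk, they can be loaded back into R for inspection. The sketch below assumes the text layout used by the fastText library for word vectors: an optional header line with the vocabulary size and dimension, followed by one line per word holding the word and its values separated by spaces. Whether the files produced by this wrapper follow exactly that layout is an assumption to verify against your own output. The helpers read_vec and cosine and the toy file are hypothetical and not part of the package.

```r
# Write a tiny file in the assumed layout so that the sketch is self-contained.
vec_path <- file.path(tempdir(), "toy_model.vec")
writeLines(c("2 3",
             "fox 0.1 0.2 0.3 ",
             "dog 0.4 0.5 0.6 "), vec_path)

# Hypothetical helper: read word vectors into a numeric matrix (one row per word).
read_vec <- function(path, has_header = TRUE) {
  lines <- readLines(path)
  if (has_header) lines <- lines[-1]            # drop the "<n_words> <dim>" line
  parts <- strsplit(trimws(lines), "[[:space:]]+")
  n_dim <- length(parts[[1]]) - 1L              # values per word
  mat <- t(vapply(parts, function(p) as.numeric(p[-1]), numeric(n_dim)))
  rownames(mat) <- vapply(parts, `[`, character(1), 1)
  mat
}

# Hypothetical helper: cosine similarity between two vectors.
cosine <- function(a, b) sum(a * b) / sqrt(sum(a^2) * sum(b^2))

vectors <- read_vec(vec_path)
cosine(vectors["fox", ], vectors["dog", ])
```

Set has_header = FALSE if the file contains only word lines without a leading size line.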