Model cannot be output if the dimension of the word vector is raised #17

chuangys · 2017-10-01T22:41:59Z

Everything is okay with the default parameter settings. But when I raise the dimension of the word vector to 200 or 300, model training is still fast but it hangs at model output. Could you help check it?

pommedeterresautee · 2017-10-03T10:56:23Z

Hi, can you provide some code? (one I can test, with some data)

I just tried

library(fastrtext)

data("train_sentences")
data("test_sentences")

# prepare data
tmp_file_model <- tempfile()

train_labels <- paste0("__label__", train_sentences[,"class.text"])
train_texts <- tolower(train_sentences[,"text"])
train_to_write <- paste(train_labels, train_texts)
train_tmp_file_txt <- tempfile()
writeLines(text = train_to_write, con = train_tmp_file_txt)

test_labels <- paste0("__label__", test_sentences[,"class.text"])
test_texts <- tolower(test_sentences[,"text"])
test_to_write <- paste(test_labels, test_texts)

# learn model
execute(commands = c("supervised", "-input", train_tmp_file_txt,
                     "-output", tmp_file_model, "-dim", 200, "-lr", 1,
                     "-epoch", 20, "-wordNgrams", 2, "-verbose", 1))

model <- load_model(tmp_file_model)
predict(model, sentences = test_sentences[1, "text"])

And had no issue...

Can you try -verbose 1 in your command line?

chuangys · 2017-10-26T03:31:24Z

@pommedeterresautee
Your code runs fine in my environment, so I need to correct my description of the problem.
With the same example data plus the pre-trained vectors, I can reproduce the hang at model output.

Source code below:

library(fastrtext)
data("train_sentences")
data("test_sentences")
tmp_file_model <- tempfile()
print(tmp_file_model)
# fastText expects every label to carry the "__label__" prefix
train_labels <- paste0("__label__", train_sentences[,"class.text"])
train_texts <- tolower(train_sentences[,"text"])
train_to_write <- paste(train_labels, train_texts)
train_tmp_file_txt <- tempfile()
print(train_tmp_file_txt)
writeLines(text = train_to_write, con = train_tmp_file_txt)
# learn model, initialised from the pre-trained vectors
execute(commands = c("supervised", "-input", train_tmp_file_txt,
                     "-output", tmp_file_model, "-dim", 300, "-lr", 1,
                     "-epoch", 300, "-wordNgrams", 2, "-verbose", 1,
                     "-pretrainedVectors", "e:/baproject/data/pretrainedword2vec/wiki-news-300d-1M.vec"))

The wiki-news-300d-1M.vec file was downloaded from the facebookresearch pre-trained vectors page at the website below.
https://fasttext.cc/docs/en/english-vectors.html
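
One quick check before training with `-pretrainedVectors` is whether the dimension declared in the `.vec` file matches the value passed to `-dim`. Below is a minimal sketch in base R, assuming the file follows the usual fastText text layout in which the first line holds the word count and the vector dimension. `read_vec_header` is a hypothetical helper, not part of fastrtext, and the small demo file stands in for the real download.

```r
# Hypothetical helper (not part of fastrtext): read the first line of a
# .vec file, assumed to be "<number of words> <dimension>".
read_vec_header <- function(path) {
  header <- readLines(path, n = 1, warn = FALSE)
  parts <- as.numeric(strsplit(trimws(header), "[[:space:]]+")[[1]])
  if (length(parts) != 2 || anyNA(parts)) {
    stop("unexpected header line: ", header)
  }
  list(n_words = parts[1], dim = parts[2])
}

# Small stand-in file so the sketch runs without the real download.
demo_vec <- tempfile(fileext = ".vec")
writeLines(c("2 3", "hello 0.1 0.2 0.3", "world 0.4 0.5 0.6"), demo_vec)

hdr <- read_vec_header(demo_vec)
hdr$dim  # compare this with the value passed to "-dim"
```

Pointing `read_vec_header` at the real file only reads one line, so it stays cheap even for very large vector files.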

pommedeterresautee · 2017-11-20T21:36:52Z

It may be related to a RAM issue. Did you fix it?

dockstreet · 2018-01-08T21:01:42Z

Hi - I'm having the same issue as @chuangys; it seems to hang on the larger vec file. I have 16GB of RAM.

pommedeterresautee · 2018-01-08T22:15:16Z

Do you have some test code? Did you check the RAM? (Models trained by Facebook are quite big.)
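
A rough back-of-the-envelope estimate can help judge whether RAM is a plausible cause. The sketch below assumes that fastText stores its input matrix as 32-bit floats with one row per vocabulary word plus one row per hash bucket, and that `-bucket` keeps a default of 2,000,000. Both are assumptions about fastText internals rather than anything stated in this thread, and `estimate_input_matrix_gib` is a hypothetical helper. The word count of one million is taken from the name of the wiki-news-300d-1M.vec file.

```r
# Hypothetical helper: approximate size of the input matrix in GiB.
# Assumes (n_words + bucket) rows of `dim` 4-byte floats; it ignores the
# output matrix, the dictionary, and any temporary copies.
estimate_input_matrix_gib <- function(n_words, dim, bucket = 2e6,
                                      bytes_per_value = 4) {
  (n_words + bucket) * dim * bytes_per_value / 1024^3
}

# About one million pre-trained words at dimension 300
size_300 <- estimate_input_matrix_gib(n_words = 1e6, dim = 300)
# Same vocabulary size at dimension 100, for comparison
size_100 <- estimate_input_matrix_gib(n_words = 1e6, dim = 100)

size_300
size_100
```

Under these assumptions the estimate grows linearly with `-dim`, so the dimension-300 case needs three times the memory of the dimension-100 case before anything is written to disk.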

dockstreet · 2018-01-09T17:50:04Z

I do.

execute(commands = c("supervised", "-input", "C:/Users/xxx/R/fasttext_test/train.txt",
                     "-output", "C:/Users/xxx/R/fasttext_test/train.bin", "-lr", 1,
                     "-epoch", 50, "-wordNgrams", 2, "-verbose", 1))

This worked (while the Facebook one would not). However, I'm using pre-trained vectors:
https://github.com/jazzyarchitects/fasttext-node/raw/master/train.txt

Here is the RAM size

memory.limit()
[1] 16204

Would you know of a larger example, known to work with a pretrained vec from an external source, that I could try with fastrtext? It may help clarify whether it's my environment or not.

datalee · 2018-01-22T09:35:50Z

Hi, I have a question: does the pretrainedVectors argument not support the vec files produced by gensim? Thanks.

pommedeterresautee · 2018-01-22T10:53:43Z

@datalee what is the feature you are referring to?

datalee · 2018-01-22T13:12:08Z

@pommedeterresautee classification.

pommedeterresautee · 2018-01-22T16:12:53Z

pretrainedVectors is the text file produced by fasttext when you learn a model, whatever it is. I don't know the gensim format, but it should not be hard to convert (word\tvector where each value is separated by a space).
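
For converting vectors from another tool, here is a minimal sketch of writing a word-vector matrix to a text file. It assumes the layout of the `.vec` files that fastText distributes: a header line with the word count and dimension, then one line per word with the word followed by its values, all separated by single spaces. `write_vec` and the toy matrix are hypothetical. The sketch puts a space after the word; change that separator if your setup expects a tab.

```r
# Hypothetical helper: write a numeric matrix (one row per word, row names
# are the words) as "<n_words> <dim>" followed by "word v1 v2 ... vdim".
write_vec <- function(vectors, path) {
  stopifnot(is.matrix(vectors), is.numeric(vectors),
            !is.null(rownames(vectors)))
  header <- paste(nrow(vectors), ncol(vectors))
  rows <- vapply(
    seq_len(nrow(vectors)),
    function(i) {
      values <- format(vectors[i, ], scientific = FALSE, trim = TRUE)
      paste(rownames(vectors)[i], paste(values, collapse = " "))
    },
    character(1)
  )
  writeLines(c(header, rows), con = path)
  invisible(path)
}

# Toy vectors standing in for a model exported from another library.
toy <- matrix(c(0.1, 0.2, 0.3,
                0.4, 0.5, 0.6),
              nrow = 2, byrow = TRUE,
              dimnames = list(c("hello", "world"), NULL))

out <- write_vec(toy, tempfile(fileext = ".vec"))
vec_lines <- readLines(out)
vec_lines
```

Whether fastrtext accepts the result is best checked on a small file like this one before converting a full model.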

dockstreet mentioned this issue Jan 5, 2018

Question - how to load vec & bin file from external source? #22

Closed

pommedeterresautee closed this as completed Mar 12, 2019
