Questions on the input format for charRNN model (in R) #2774

rajafan · 2016-07-20T06:04:54Z

I am trying to build the charRNN model using mx.lstm in R and have some questions about the input data format. Any help or clarification would be highly appreciated. Apologies for the length; I am trying to make the questions as clear as possible.

Suppose a text file contains M unique characters and has a total length of N:

In mxnet/R-package/vignettes/CharRnnModel.Rmd, I do not see a step that performs 1-of-k encoding. As I understand it, 1-of-k encoding transforms each character into a vector of length M whose values are all 0 except for a 1 at the position denoting the character. In the vignette, however, each character is represented as a single value from 0 to 64 (the text file has 65 unique characters, i.e., M = 65) and fed directly to the mx.lstm function. Is this correct? Also, why use values from 0 to 64 to represent the 65 unique characters rather than 1 to 65? I ask because model performance is poor when 1 to 65 is used for the encoding.
The input data are prepared differently in mxnet/R-package/vignettes/CharRnnModel.Rmd and dmlc/mxnet-gtc-tutorial/char-lstm/char_lstm.R. In the first file, the characters are arranged as a matrix in which each column represents a sequence and the number of rows equals the model parameter seq.len. In the second file, each row represents a sequence and the number of rows equals the model parameter batch.size. These two approaches look quite different to me, and I am not sure which one to follow when preparing the input data.
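The two questions above can be illustrated with a small sketch. It is written in plain Python rather than R, and everything in it (the toy corpus, the two-column `table`, and the helper names) is hypothetical and not taken from the vignette or from mx.lstm. It assumes that the network's first layer is an embedding lookup expecting 0-based ids, which is an assumption about the model and not something confirmed in this thread. Under that assumption, feeding an integer id is equivalent to feeding the 1-of-k vector, and the two matrix layouts are transposes of each other when they hold the same fixed-length sequences.

```python
# Illustrative sketch only; names and data are hypothetical.
text = "hello world"                       # stand-in corpus of length N
vocab = sorted(set(text))                  # the M unique characters
M = len(vocab)
char_to_id = {ch: i for i, ch in enumerate(vocab)}  # 0-based ids: 0..M-1
ids = [char_to_id[ch] for ch in text]      # integer encoding of the corpus


def one_hot(index, size):
    """Return the 1-of-k vector for a 0-based index."""
    vec = [0] * size
    vec[index] = 1
    return vec


# Hypothetical embedding table with M rows and 2 columns (assumed first layer).
table = [[float(i), float(10 * i)] for i in range(M)]


def embed_by_lookup(index):
    """Pick row `index` of the table, as an embedding lookup would."""
    return table[index]


def embed_by_one_hot(index):
    """Multiply the 1-of-k vector by the table; gives the same row."""
    vec = one_hot(index, M)
    return [sum(vec[r] * table[r][c] for r in range(M)) for c in range(2)]


def split_sequences(values, seq_len):
    """Cut values into consecutive sequences of length seq_len; drop the remainder."""
    count = len(values) // seq_len
    return [values[k * seq_len:(k + 1) * seq_len] for k in range(count)]


def transpose(matrix):
    """Swap rows and columns of a rectangular list of lists."""
    return [list(col) for col in zip(*matrix)]


rows_layout = split_sequences(ids, 5)      # each row is one sequence
cols_layout = transpose(rows_layout)       # each column is one sequence; seq_len rows
```

In this sketch, ids shifted to 1..M would leave row 0 of `table` unused and push the largest id past the last row, which is one possible reason a 1-based encoding could behave badly with a 0-based lookup. Which layout to prepare depends on what the consuming function expects, since both hold the same data here.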

Thanks in advance for any help.

rajafan · 2016-07-20T08:09:51Z

@ziyeqinghan @piiswrong, it would be great to see your comments. Thanks.

ankkhedia · 2018-08-16T23:50:53Z

Hi @rajafan, the RNN and LSTM API in the MXNet R package has changed significantly over the last year, and the mx.lstm API no longer exists. However, various new functionality has been added, and you can find good examples of time series, language modeling, and a few other use cases here:
https://jeremiedb.github.io/mxnet_R_bucketing/LanguageModel_GPU.html

Please try the latest API for your use case and let us know if you still need assistance. Also, please provide a sample code snippet or links so that we can help you better.

@sandeep-krishnamurthy @nswamy Could you please add the Pending Requester Info tag to this issue?

ankkhedia · 2018-08-31T16:49:39Z

@sandeep-krishnamurthy @nswamy Could you please close this issue due to inactivity?

@rajafan Please feel free to reopen it if it was closed in error.

thirdwing added the R label Jul 18, 2017

nswamy added the Pending Requester Info label Aug 17, 2018

sandeep-krishnamurthy closed this as completed Aug 31, 2018
