This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

Questions on the input format for charRNN model (in R) #2774

Closed
rajafan opened this issue Jul 20, 2016 · 3 comments

Comments

rajafan commented Jul 20, 2016

I am trying to build a charRNN model using mx.lstm in R, and I have the following questions, mostly about the input data format. Any help or clarification would be highly appreciated. Apologies for the length; I am trying to make the question as clear as possible.

Suppose a text file contains M unique characters and has a total length of N:

  1. As far as I can tell from mxnet/R-package/vignettes/CharRnnModel.Rmd, there is no step that performs 1-of-k encoding. As I understand it, 1-of-k encoding transforms each character into a vector of length M whose values are all 0 except for a 1 at the position denoting the character. In the vignette, however, each character is represented as a single value from 0 to 64 (there are 65 unique characters in the text file, i.e., M = 65) and fed directly to the mx.lstm function. Is this correct? Also, why use values from 0 to 64 to represent the 65 unique characters rather than 1 to 65? I ask because the model performs poorly when I encode with 1 to 65.
  2. The input data are prepared differently in mxnet/R-package/vignettes/CharRnnModel.Rmd and dmlc/mxnet-gtc-tutorial/char-lstm/char_lstm.R. In the first file, the characters are arranged as a matrix in which each column is a sequence and the number of rows equals the model parameter seq.len. In the second file, each row is a sequence and the number of rows equals the model parameter batch.size. These two approaches look quite different to me, and I am not sure which one to follow when preparing the input data.
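
For readers following along, the two questions above can be illustrated with a minimal sketch in plain Python. It does not use the mxnet R API, and every name in it (build_vocab, embed_lookup, to_columns, the toy weight table) is hypothetical. It assumes the first layer of the network is an embedding lookup over a table with M rows indexed from 0, which the thread does not confirm for mx.lstm. Under that assumption, looking up row i gives the same result as multiplying a 1-of-k vector by the table, so no explicit 1-of-k step is needed, and the valid indices are 0 to M - 1. The sketch also shows that the column-per-sequence and row-per-sequence layouts are transposes of each other.

```python
# Illustrative sketch only: plain Python, not the mxnet R API.

def build_vocab(text):
    """Map each unique character to a 0-based integer index."""
    return {ch: i for i, ch in enumerate(sorted(set(text)))}

def one_hot(index, size):
    """1-of-k vector: all zeros except a 1 at position `index`."""
    return [1.0 if j == index else 0.0 for j in range(size)]

def embed_lookup(index, table):
    """Embedding-style lookup: select row `index` of the weight table."""
    return table[index]

def one_hot_matmul(index, table):
    """Explicit 1-of-k encoding followed by a vector-matrix product."""
    vec = one_hot(index, len(table))
    dim = len(table[0])
    return [sum(vec[i] * table[i][d] for i in range(len(table)))
            for d in range(dim)]

def to_columns(ids, seq_len):
    """Layout with seq_len rows, one sequence per column (leftover ids dropped)."""
    n_seq = len(ids) // seq_len
    return [[ids[c * seq_len + r] for c in range(n_seq)]
            for r in range(seq_len)]

def transpose(matrix):
    """Swap rows and columns, giving one sequence per row."""
    return [list(row) for row in zip(*matrix)]

text = "hello world"
vocab = build_vocab(text)
ids = [vocab[ch] for ch in text]   # integer codes in the range 0 .. M-1
M = len(vocab)

# Hypothetical M x 2 weight table standing in for learned embedding weights.
table = [[float(i), float(i) * 0.5] for i in range(M)]

cols = to_columns(ids, seq_len=5)  # 5 rows, each column is one sequence
rows = transpose(cols)             # each row is one sequence
```

In this sketch, shifting the codes to 1 through M would make index M fall outside the table. Whether that accounts for the poor performance reported with 1 to 65, and which of the two layouts mx.lstm expects, are not settled in this thread.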

Thanks in advance for any help.

rajafan commented Jul 20, 2016

@ziyeqinghan @piiswrong, it would be great to see your comments. Thanks.

@thirdwing thirdwing added the R label Jul 18, 2017
ankkhedia (Contributor) commented

Hi @rajafan, the RNN and LSTM API in the MXNet R package has changed significantly over the last year, and mx.lstm no longer exists. However, various new functionalities have been added, and you can find good examples of time series, language modeling, and a few other use cases here:
https://jeremiedb.github.io/mxnet_R_bucketing/LanguageModel_GPU.html

Please try the latest API for your use case and let us know if you still need assistance. Also, please provide a sample code snippet or links so that we can help you better.

@sandeep-krishnamurthy @nswamy Could you please add the tag Pending Requester Info to this issue?

ankkhedia (Contributor) commented

@sandeep-krishnamurthy @nswamy Could you please close this issue due to inactivity?

@rajafan, please feel free to reopen it if it was closed in error.
