This repository was archived by the owner on Nov 17, 2023. It is now read-only.

refactor lstm model and add documentations in R #2329

Merged
thirdwing merged 8 commits into apache:master from ziyeqinghan:master on Jun 7, 2016
Conversation

@ziyeqinghan
Contributor

@ziyeqinghan ziyeqinghan commented Jun 3, 2016

I refactored the LSTM model from #1673 into a set of high-level reusable APIs for R users. The high-level API includes:

  • mx.lstm: train an LSTM unrolled model
  • mx.lstm.inference: create an LSTM inference model
  • mx.lstm.forward: forward function for the LSTM inference model

I also added a char-rnn example and documentation showing how to use the LSTM model to build a character-level language model and generate text from it.
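To illustrate, the three entry points are meant to be chained roughly as below. This is a hedged sketch: the argument names mirror the CharRnnModel vignette added in this PR, while the data objects (X.train, X.val, vocab, start.char) are placeholders, not part of the API.

```r
# Sketch only: argument names follow the CharRnnModel vignette in this
# PR; X.train, X.val, vocab and start.char are placeholder inputs.
# Assumes require(mxnet) has been called before invoking the function.
train_and_sample <- function(X.train, X.val, vocab, start.char) {
  # 1. Train the unrolled LSTM language model.
  model <- mx.lstm(X.train, X.val,
                   ctx = mx.cpu(),
                   num.round = 1,
                   num.lstm.layer = 1,
                   seq.len = 32,
                   num.hidden = 16,
                   num.embed = 16,
                   num.label = vocab,
                   batch.size = 32,
                   input.size = vocab,
                   learning.rate = 0.1)
  # 2. Turn the trained parameters into an inference model.
  infer.model <- mx.lstm.inference(num.lstm.layer = 1,
                                   input.size = vocab,
                                   num.hidden = 16,
                                   num.embed = 16,
                                   num.label = vocab,
                                   batch.size = 1,
                                   arg.params = model$arg.params,
                                   ctx = mx.cpu())
  # 3. One forward step yields the next-character distribution.
  mx.lstm.forward(infer.model, start.char, new.seq = TRUE)
}
```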

The training result is:

Epoch [31] Train: NLL=3.47213018872144, Perp=32.2052727363657
...
Epoch [961] Train: NLL=2.32060007657895, Perp=10.181782322355
Iter [1] Train: Time: 186.397065639496 sec, NLL=2.31135356537961, Perp=10.0880702804858
Iter [1] Val: NLL=1.94184484060012, Perp=6.97160060607419
Epoch [992] Train: NLL=1.84784553299322, Perp=6.34613225095329
...
Epoch [1953] Train: NLL=1.70175791172558, Perp=5.48357857093351
Iter [2] Train: Time: 188.929051160812 sec, NLL=1.70103940328978, Perp=5.47963998859367
Iter [2] Val: NLL=1.74979316010449, Perp=5.75341251767988
...
Epoch [2914] Train: NLL=1.54738185300295, Perp=4.69915099483974
Iter [3] Train: Time: 185.425321578979 sec, NLL=1.54604189517013, Perp=4.69285854740519
Iter [3] Val: NLL=1.67780240235925, Perp=5.35377758479576
Epoch [2945] Train: NLL=1.48868466087876, Perp=4.43126307034767
...
Iter [4] Train: Time: 185.487086296082 sec, NLL=1.4744973925858, Perp=4.36883940994296
Iter [4] Val: NLL=1.64488167325603, Perp=5.18039689118454
Epoch [3937] Train: NLL=1.46355541021581, Perp=4.32129622881604
...
Epoch [4898] Train: NLL=1.42900458455642, Perp=4.17454171976281
Iter [5] Train: Time: 185.070136785507 sec, NLL=1.42909226256273, Perp=4.17490775130428
Iter [5] Val: NLL=1.62716655804022, Perp=5.08943365437187
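For readers puzzling over the two columns in the log: Perp is simply exp(NLL), i.e. perplexity is the exponential of the per-character negative log-likelihood. This can be checked against the first and last lines above:

```r
# Perplexity is exp(NLL); verify against the logged first/last values.
nll  <- c(3.47213018872144, 1.62716655804022)
perp <- exp(nll)
stopifnot(abs(perp[1] - 32.2052727363657) < 1e-6)
stopifnot(abs(perp[2] - 5.08943365437187) < 1e-6)
```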

The result of generating a sequence of 75 chars is:

ah not a drobl greens
Settled asing lately sistering sounted to their hight

@thirdwing
Contributor

The testing failed because you didn't provide the input.txt.

There are other minor issues. I will look into them.

@terrytangyuan
Member

@ziyeqinghan Awesome! Could you also post some results here from the added example?

@ziyeqinghan
Contributor Author

ziyeqinghan commented Jun 3, 2016

@thirdwing I have added a download function to get the input.txt.

@ziyeqinghan
Contributor Author

@terrytangyuan I updated my comment to post some results here from the added example. :)

@@ -0,0 +1,589 @@
require(mxnet)
Member

remove this line

@terrytangyuan
Member

@ziyeqinghan Thanks! Could you perhaps add some simple unit tests (mini-example) so it's easier for other people to contribute in the future?


```{r}
model <- mx.lstm(X.train, X.val,
                 ctx=mx.gpu(0),
```

Contributor: Please use CPU instead.

@terrytangyuan
Member

@ziyeqinghan A few comments:

  1. Try using apply-family methods instead of slow for loops when many iterations are required.
  2. Even though the current code base does not have a consistent code style or lint check, let's try to keep things consistent (this can be a separate follow-up PR), e.g. variable naming, spaces around math expressions, etc.
  3. Some regexes can be combined, and conditions can be combined using functions such as all() or any().
  4. It would be good to have some small unit tests (can be a separate PR).
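To make points 1 and 3 concrete, here is a small base-R sketch; the names are illustrative and not taken from the PR:

```r
# Point 1: prefer vectorised/apply-family code over an explicit loop.
squares_loop <- function(xs) {
  out <- numeric(length(xs))
  for (i in seq_along(xs)) out[i] <- xs[i]^2
  out
}
squares_apply <- function(xs) sapply(xs, function(x) x^2)
stopifnot(identical(squares_loop(1:5), squares_apply(1:5)))

# Point 3: combine conditions with all()/any() instead of long chains.
params <- c(num.round = 5, learning.rate = 0.1, wd = 0.00001)
stopifnot(all(params > 0))   # instead of p1 > 0 && p2 > 0 && ...
```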

@thirdwing
Contributor

We can add lint checks if we really want: https://github.com/jimhester/lintr

@terrytangyuan
Member

Yeah let's add that and have some important linters first.

@thirdwing
Contributor

This is a built-in function in RStudio and can be added into travis testing.

@terrytangyuan
Member

@thirdwing I added that for xgboost before but haven't enabled it yet; here's the example you could borrow.

@ziyeqinghan force-pushed the master branch 3 times, most recently from f46145e to 3ff8f0f on June 5, 2016.
@ziyeqinghan
Contributor Author

I have modified the code and have a couple of questions.

  1. I don't know which regexes can be combined using all or any.
  2. I'm not clear on how to unit-test models; maybe check whether the training error is decreasing, or something else?

Could you give me some suggestions or examples? Thank you so much!

@terrytangyuan
Member

@ziyeqinghan The changes look good to me. Yeah, asserting on the training error would be good for now. Make sure you use a very small sample of data and train just enough steps so it won't take too long.
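The suggested test can be as simple as asserting that the per-epoch training NLL goes down on a tiny sample. The numbers below are placeholders standing in for what a short mx.lstm run would report, not real output:

```r
# Hypothetical per-epoch training NLL from a short run on a tiny sample.
nll <- c(3.47, 2.32, 1.85, 1.70, 1.55)
stopifnot(nll[length(nll)] < nll[1])  # error decreased overall
stopifnot(all(diff(nll) < 0))         # and at every recorded epoch here
```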

download.file(url='https://raw.githubusercontent.com/dmlc/web-data/master/mxnet/tinyshakespeare/input.txt',
destfile='input.txt', method='wget')
}
setwd("..")
Contributor

Can you change this a little bit? Please don't change the working directory, or it might fail other tests.
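One way to satisfy this request (a sketch, not necessarily the exact code that landed) is to build the destination with file.path() so the working directory never changes:

```r
# Download without setwd(): construct the full destination path instead
# of changing into data_dir. The URL is the one used in the vignette.
download.data <- function(data_dir) {
  dir.create(data_dir, showWarnings = FALSE)
  dest <- file.path(data_dir, "input.txt")
  if (!file.exists(dest)) {
    download.file(
      url = "https://raw.githubusercontent.com/dmlc/web-data/master/mxnet/tinyshakespeare/input.txt",
      destfile = dest)
  }
  dest
}
```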

Contributor Author

I have changed it. :)

Contributor

I will merge this if all tests pass.


@thirdwing
Contributor

https://travis-ci.org/dmlc/mxnet/jobs/135611995#L2930

The testing needs too much time. Maybe we need to use another set of parameters, or just skip the RNN example.

@ziyeqinghan
Contributor Author

I made the parameter num.round smaller to reduce the running time.

@thirdwing
Contributor

It is quite weird that the mnist test failed. This is not your fault. Let's comment out the related lines and merge your PR; I will fix that later.

In
https://github.com/ziyeqinghan/mxnet/blob/master/R-package/vignettes/mnistCompetition.Rmd#prediction-and-submission

You should find

submission <- data.frame(ImageId=1:ncol(test), Label=pred.label)
#write.csv(submission, file='submission.csv', row.names=FALSE, quote=FALSE)

Please also comment out the submission line.

@ziyeqinghan
Contributor Author

The reason the mnist test failed is that I modified mx.io.arrayiter to let the label support multiple dimensions, and I made a mistake.

I'm sorry and have fixed the bug.

@thirdwing thirdwing merged commit 55ced9e into apache:master Jun 7, 2016
CharlesMei pushed a commit to CharlesMei/mxnet that referenced this pull request Jun 7, 2016
* modify mx.io.arrayiter to let label support multi-dimension

* refactor lstm model to some high-level reusable API in R

* add the rnn example and documentations of Char-RNN model in R

* modify some codes of lstm model in R

* add unit-test for lstm model in R

* modify the code in CharRnnModel.Rmd to unchange the working directory

* change num.round smaller to decrease the running time

* fix a bug on mx.io.arrayiter
This was referenced Jun 14, 2016