
workable example needed #7

Closed
nicolas-ivanov opened this issue Nov 16, 2015 · 20 comments

@nicolas-ivanov
Contributor

Dear developers, thank you for your work!
It would be of great help to use your seq2seq implementation; however, after a considerable amount of time and effort I still can't do it due to the lack of documentation and examples. The Keras docs don't say much about seq2seq mapping either. So please add a simple workable code example that demonstrates the usage of your library: how to prepare the input data and how to implement the training and prediction procedures. It would be greatly appreciated.
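
For readers landing here, a minimal sketch of the kind of end-to-end example being asked for, assuming the SimpleSeq2Seq interface described in the repository README (all shapes and hyperparameters below are illustrative, and the exact constructor arguments may differ between versions of the library):

```python
import numpy as np
from seq2seq.models import SimpleSeq2Seq

# Illustrative data: 1000 sequence pairs, inputs of length 12 with 5-dim
# feature vectors, targets of length 8 with 20-dim vectors (e.g. one-hot
# words or embeddings).
x = np.random.random((1000, 12, 5))
y = np.random.random((1000, 8, 20))

model = SimpleSeq2Seq(input_dim=5, hidden_dim=10, output_length=8, output_dim=20)
model.compile(loss='mse', optimizer='rmsprop')

model.fit(x, y, nb_epoch=1)    # Keras 1.x-era keyword; later versions use epochs=
preds = model.predict(x[:10])  # shape (10, 8, 20), one vector per output timestep
```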

@farizrahman4u
Owner

This will be done as soon as possible. I have university exams right now, and there are no other contributors involved. In the meantime, you could post links to datasets appropriate for a seq2seq model here in this issue, and I will write examples for training on them.

@viveksck

@farizrahman4u: As a simple example, could you post one that does POS tagging using SeqToSeq? One can use a standard dataset such as http://www.cnts.ua.ac.be/conll2000/chunking/ as data. This should be a good starting point, since POS tagging is easily understood and there are tons of baselines to compare against. Besides, NLTK also has support for the CONLL2000 tasks.
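
If it helps, the CONLL2000 data is bundled with NLTK; a small sketch of loading the tagged sentences with the standard corpus reader (the word/tag pairs would still need to be indexed and padded before feeding them to a model):

```python
import nltk
nltk.download('conll2000')
from nltk.corpus import conll2000

# Each sentence is a list of (word, POS-tag) pairs.
train_sents = conll2000.tagged_sents('train.txt')
test_sents = conll2000.tagged_sents('test.txt')
print(train_sents[0][:5])  # e.g. [('Confidence', 'NN'), ('in', 'IN'), ...]
```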

@placebokkk

@viveksck I think seq2seq is not designed for tagging tasks. A tagging task requires that the output sequence be aligned with the input sequence. For seq2seq, we need a QA or MT dataset.

@viveksck

So synced input and output (keras-team/keras#168) is not supported in seq2seq. Is that correct?

@farizrahman4u
Owner

@viveksck keras-team/keras#168 can be done using a single LSTM layer.
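
For reference, a rough sketch of what that single-LSTM tagger could look like in plain Keras (1.x-era API; vocabulary size, tag count and sequence length are made-up values), where the output at each timestep stays aligned with the input:

```python
from keras.models import Sequential
from keras.layers import Embedding, LSTM, TimeDistributed, Dense

vocab_size, n_tags, maxlen = 10000, 45, 50   # illustrative values

model = Sequential()
model.add(Embedding(vocab_size, 128, input_length=maxlen))
model.add(LSTM(128, return_sequences=True))              # one output per input timestep
model.add(TimeDistributed(Dense(n_tags, activation='softmax')))
model.compile(loss='categorical_crossentropy', optimizer='adam')
# model.fit(X, Y) with X of shape (samples, maxlen) and Y of shape (samples, maxlen, n_tags)
```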

@avisingh599

One suggestion for dataset: VQA

In this dataset, the aim is to answer open-ended questions about images. The answer length is variable (though about 80% of answers are in the 1-3 word range). As an example for seq2seq, we can just ignore the image and try to answer using only the question (this should give a performance of about 40-45% on the validation set). If this works out, I can later create an example that looks at the image and uses seq2seq for modeling the text.

The dataset is fairly large (about 248K train and 121K val examples), so we should be able to learn some meaningful structure using LSTM encoder/decoder pairs.

I have written some helper scripts for working with this dataset; they can be used to convert its JSON files to a simpler (and faster to load) plain-text format. You can find them here, if needed.

@farizrahman4u
Owner

@avisingh599 Cool! I will check it out.

@libraua
Contributor

libraua commented Nov 29, 2015

Here is an interesting thread about datasets for sequence-to-sequence models.
It seems like a dataset of subtitles could be useful. link

@nicolas-ivanov
Contributor Author

@libraua, thank you! The Reddit post has quite useful links in the comments.

@libraua
Contributor

libraua commented Dec 1, 2015

@nicolas-ivanov would you be so kind as to share your findings with us? Does it work?

@nicolas-ivanov
Contributor Author

@libraua at least I managed to run the fit() and predict() functions :)

Unfortunately, I only get rubbish predictions so far, and I'm willing to concede that there is some bug in my interpretation of the predicted vectors. However, the weird thing is that it takes an enormous amount of time to make a prediction for one sentence - up to a minute. For comparison, my sequence-to-word model did this job in less than a second. Regarding this slowness, I can only assume that there's a defect in Fariz's seq2seq implementation...

I'll try to structure my code into some readable form and then share a link to it here on Thursday.

@farizrahman4u
Owner

@nicolas-ivanov Can you please post your dataset here?

@nicolas-ivanov
Contributor Author

@farizrahman4u sure! tomorrow

@nicolas-ivanov
Contributor Author

@libraua here comes my usage example of seq2seq: https://github.com/nicolas-ivanov/debug_seq2seq
There is quite a lot of code but I hope it's still readable.

The model trains and now even predicts some sequences that have the potential to turn into meaningful statements after a substantial number of iterations...
The only misfortune so far is that the model can't be saved and restored properly, presumably due to some layer inconsistency in seq2seq. I'll report that issue now.

@farizrahman4u you can find the dataset for training here. However, the slow-prediction problem I mentioned before went away as soon as I started using a simple argmax() for restoring words from vectors instead of trying to diversify the generated results.
So if you have time, please take a look at the issue with model.save_weights().
Thanks
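
For anyone reproducing this, the argmax decoding mentioned above boils down to something like the sketch below, assuming predict() returns a per-timestep score vector over the vocabulary for each sample and that index_to_word is a hypothetical index-to-token lookup:

```python
import numpy as np

def decode_argmax(prediction, index_to_word):
    # prediction: array of shape (output_length, vocab_size) for one sample.
    # index_to_word: hypothetical dict mapping vocabulary indices to tokens.
    return [index_to_word[int(np.argmax(step))] for step in prediction]
```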

@farizrahman4u
Owner

I will fix it soon. Meanwhile, you can save your model as-is using cPickle.
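
A minimal sketch of that workaround (Python 2 era; whether a compiled model pickles cleanly depends on the Keras/Theano version, so treat it as a stopgap rather than a supported save path):

```python
import cPickle

# Save the trained model object wholesale.
with open('seq2seq_model.pkl', 'wb') as f:
    cPickle.dump(model, f, protocol=cPickle.HIGHEST_PROTOCOL)

# ...and restore it later.
with open('seq2seq_model.pkl', 'rb') as f:
    model = cPickle.load(f)
```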

@nicolas-ivanov
Contributor Author

Ok, thank you

@viksit

viksit commented Dec 4, 2015

@nicolas-ivanov in your system, it appears that all predictions are " . . . . . . " - is this something you've seen before? I haven't debugged your code just yet, so I'm not sure what's going on.

@nicolas-ivanov
Contributor Author

@viksit right, I got the same, and after a while the model starts to return sequences like " i , . $$$ $$$ . . . " where $$$ represents the end-of-sentence symbol. That looks a bit more like what we should get - at least the end of the sentence is present - but it's the best I could get on my laptop using the tiny parameters from config.py.
Running this example on a GPU with larger params (i.e. more hidden layers with larger dimensionalities) might bring better results.

@lkluo

lkluo commented Apr 25, 2017

This library is good; however, I strongly suggest writing your code based directly on Keras or TensorFlow, which are more flexible for your specific needs.

@amelmusic

It would be great if someone could convert this example, including the prediction part, to use the seq2seq library: https://blog.keras.io/a-ten-minute-introduction-to-sequence-to-sequence-learning-in-keras.html

I tried, but couldn't get results anywhere near those of the original Keras example.

Thanks!
