
workable example needed #7

Closed
nicolas-ivanov opened this issue Nov 16, 2015 · 20 comments

@nicolas-ivanov
Contributor

Dear developers, thank you for your work!
It would be of great help to use your seq2seq implementation; however, after a considerable amount of time and effort I still can't do it due to the lack of documentation and examples. The Keras docs don't say much about seq2seq mapping either. So please add a simple workable code example that demonstrates the usage of your library: how to prepare the input data and how to implement the training and prediction procedures. It would be greatly appreciated.
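
For readers landing here, a minimal sketch of the kind of end-to-end example being asked for, assuming the SimpleSeq2Seq interface described in the repository README (all shapes and hyperparameters below are illustrative, and the exact constructor arguments may differ between versions of the library):

```python
import numpy as np
from seq2seq.models import SimpleSeq2Seq

# Illustrative data: 1000 sequence pairs, inputs of length 12 with 5-dim
# feature vectors, targets of length 8 with 20-dim vectors (e.g. one-hot
# words or embeddings).
x = np.random.random((1000, 12, 5))
y = np.random.random((1000, 8, 20))

model = SimpleSeq2Seq(input_dim=5, hidden_dim=10, output_length=8, output_dim=20)
model.compile(loss='mse', optimizer='rmsprop')

model.fit(x, y, nb_epoch=1)    # Keras 1.x-era keyword; later versions use epochs=
preds = model.predict(x[:10])  # shape (10, 8, 20), one vector per output timestep
```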

@farizrahman4u
Owner

This will be done as soon as possible. I have university exams right now, and there are no other contributors involved. In the meantime, you could post links to datasets appropriate for a seq2seq model here in this issue, and I will write examples for training on them.

@viveksck

@farizrahman4u: As a simple example, could you post one that does POS tagging using SeqToSeq? One can use a standard dataset such as http://www.cnts.ua.ac.be/conll2000/chunking/ as data. This should be a good starting point, since POS tagging is easily understood and there are tons of baselines to compare against. Besides, NLTK also has support for the CONLL2000 tasks.
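
If it helps, the CONLL2000 data is bundled with NLTK; a small sketch of loading the tagged sentences with the standard corpus reader (the word/tag pairs would still need to be indexed and padded before feeding them to a model):

```python
import nltk
nltk.download('conll2000')
from nltk.corpus import conll2000

# Each sentence is a list of (word, POS-tag) pairs.
train_sents = conll2000.tagged_sents('train.txt')
test_sents = conll2000.tagged_sents('test.txt')
print(train_sents[0][:5])  # e.g. [('Confidence', 'NN'), ('in', 'IN'), ...]
```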

@placebokkk

@viveksck I think seq2seq is not designed for tagging tasks. A tagging task requires that the output sequence be aligned with the input sequence. For seq2seq, we need a QA or MT dataset.

@viveksck

So synced input and output (keras-team/keras#168) is not supported in seq2seq. Is that correct?

@farizrahman4u
Owner

@viveksck keras-team/keras#168 can be done using a single LSTM layer.
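
For reference, a rough sketch of what that single-LSTM tagger could look like in plain Keras (1.x-era API; vocabulary size, tag count and sequence length are made-up values), where the output at each timestep stays aligned with the input:

```python
from keras.models import Sequential
from keras.layers import Embedding, LSTM, TimeDistributed, Dense

vocab_size, n_tags, maxlen = 10000, 45, 50   # illustrative values

model = Sequential()
model.add(Embedding(vocab_size, 128, input_length=maxlen))
model.add(LSTM(128, return_sequences=True))              # one output per input timestep
model.add(TimeDistributed(Dense(n_tags, activation='softmax')))
model.compile(loss='categorical_crossentropy', optimizer='adam')
# model.fit(X, Y) with X of shape (samples, maxlen) and Y of shape (samples, maxlen, n_tags)
```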

@avisingh599

One suggestion for dataset: VQA

In this dataset, the aim is to answer open-ended questions about images. The answer length is variable (though about 80% of answers are in the 1-3 word range). As an example for seq2seq, we can just ignore the image and try to answer using only the question (this should give a performance of about 40-45% on the validation set). If this works out, I can later create an example that looks at the image and uses seq2seq for modeling the text.

The dataset is fairly large (about 248K train and 121K val examples), so we should be able to learn some meaningful structure using LSTM encoder/decoder pairs.

I have written some helper scripts for working with this dataset; they can be used to convert its JSON files to a simpler (and faster to load) plain-text format. You can find them here, if needed.

@farizrahman4u
Owner

@avisingh599 Cool! I will check it out.

@libraua
Contributor

libraua commented Nov 29, 2015

Here is an interesting thread about datasets for sequence-to-sequence models.
It seems like a dataset of subtitles could be useful. link

@nicolas-ivanov
Contributor Author

@libraua, thank you! The Reddit post has quite useful links in the comments.

@libraua
Contributor

libraua commented Dec 1, 2015

@nicolas-ivanov would you be so kind as to share your findings with us? Does it work?

@nicolas-ivanov
Contributor Author

@libraua at least I managed to run the fit() and predict() functions :)

Unfortunately, I only get rubbish predictions so far, and I'm willing to concede that there is some bug in my interpretation of the predicted vectors. However, the weird thing is that it takes an enormous amount of time to make a prediction for one sentence - up to a minute. For comparison, my sequence-to-word model did this job in less than a second. Regarding this slowness, I can only assume that there's a defect in Fariz's seq2seq implementation...

I'll try to structure my code into some readable form and then share a link to it here on Thursday.

@farizrahman4u
Owner

@nicolas-ivanov Can you please post your dataset here?

@nicolas-ivanov
Contributor Author

@farizrahman4u sure! tomorrow

@nicolas-ivanov
Contributor Author

@libraua here comes my usage example of seq2seq: https://github.com/nicolas-ivanov/debug_seq2seq
There is quite a lot of code but I hope it's still readable.

The model trains and now even predicts some sequences that have the potential to turn into meaningful statements after a substantial number of iterations...
The only misfortune so far is that the model can't be saved and restored properly, presumably due to some layer inconsistency in seq2seq. I'll report that issue now.

@farizrahman4u you can find the dataset for training here. However, the slow-prediction problem I mentioned before went away as soon as I started using a simple argmax() for restoring words from vectors instead of trying to diversify the generated results.
So if you have time, please take a look at the issue with model.save_weights().
Thanks
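
For anyone reproducing this, the argmax decoding mentioned above boils down to something like the sketch below, assuming predict() returns a per-timestep score vector over the vocabulary for each sample and that index_to_word is a hypothetical index-to-token lookup:

```python
import numpy as np

def decode_argmax(prediction, index_to_word):
    # prediction: array of shape (output_length, vocab_size) for one sample.
    # index_to_word: hypothetical dict mapping vocabulary indices to tokens.
    return [index_to_word[int(np.argmax(step))] for step in prediction]
```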

@farizrahman4u
Owner

I will fix it soon. Meanwhile, you can save your model as-is using cPickle.
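
A minimal sketch of that workaround (Python 2 era; whether a compiled model pickles cleanly depends on the Keras/Theano version, so treat it as a stopgap rather than a supported save path):

```python
import cPickle

# Save the trained model object wholesale.
with open('seq2seq_model.pkl', 'wb') as f:
    cPickle.dump(model, f, protocol=cPickle.HIGHEST_PROTOCOL)

# ...and restore it later.
with open('seq2seq_model.pkl', 'rb') as f:
    model = cPickle.load(f)
```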

@nicolas-ivanov
Contributor Author

Ok, thank you

@viksit

viksit commented Dec 4, 2015

@nicolas-ivanov in your system, it appears that all predictions are " . . . . . . " - is this something you've seen before? I haven't debugged your code just yet, so I'm not sure what's going on.

@nicolas-ivanov
Contributor Author

@viksit right, I got the same, and after a while the model starts to return sequences like " i , . $$$ $$$ . . . " where $$$ represents the end-of-sentence symbol. That looks a bit more like what we should get - at least the end of the sentence is present - but it's the best I could get on my laptop using the tiny parameters from config.py.
Running this example on a GPU with larger params (i.e. more hidden layers with larger dimensionalities) might bring better results.

@lkluo

lkluo commented Apr 25, 2017

This library is good; however, I strongly suggest writing your code based directly on Keras or TensorFlow, which are more flexible for your specific needs.

@amelmusic

It would be great if someone could convert this example, including the prediction part, to use the seq2seq library: https://blog.keras.io/a-ten-minute-introduction-to-sequence-to-sequence-learning-in-keras.html

I tried, but couldn't get results anywhere near those of the original Keras example.

Thanks!
