workable example needed #7
Comments
This will be done as soon as possible. I am having university exams right now, and there are no other contributors involved. In the meantime, you could post links to datasets appropriate for a seq2seq model here in this issue, and I will write examples for training on them.
@farizrahman4u: As a simple example, could you post one that does POS tagging using SeqToSeq? One can use a standard dataset such as http://www.cnts.ua.ac.be/conll2000/chunking/. This should be a good starting point for people, as POS tagging is easily understood and there are tons of baselines to compare against. Besides, NLTK also has support for the CoNLL-2000 tasks.
@viveksck I think seq2seq is not designed for tagging tasks. A tagging task requires the output sequence to be aligned with the input sequence. For seq2seq, we need a QA or MT dataset.
So synced input/output is not supported in seq2seq (keras-team/keras#168). Is that correct?
@viveksck keras-team/keras#168 can be done using a single LSTM layer. |
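To illustrate the point above: a synced tagging task needs no encoder/decoder at all, just a recurrent layer that returns its full output sequence. A minimal Keras sketch of such a tagger (the layer sizes, vocabulary size, and tag count below are illustrative, not taken from this repository):

```python
from keras.models import Sequential
from keras.layers import Embedding, LSTM, TimeDistributed, Dense

vocab_size = 10000   # illustrative: size of the word vocabulary
n_tags = 45          # illustrative: number of POS tags
maxlen = 50          # sentences padded/truncated to this length

model = Sequential()
model.add(Embedding(vocab_size, 64, input_length=maxlen))
# return_sequences=True yields one output per input timestep,
# so the output sequence stays aligned with the input sequence
model.add(LSTM(64, return_sequences=True))
# a softmax over the tag set at every timestep
model.add(TimeDistributed(Dense(n_tags, activation='softmax')))
model.compile(loss='categorical_crossentropy', optimizer='adam')
```

With this architecture the model can be trained directly on (sentence, tag-sequence) pairs, which is why a QA or MT dataset (where input and output lengths differ) is the better fit for a seq2seq demo.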
One suggestion for a dataset: VQA. In this dataset, the aim is to answer open-ended questions about images. The answer length is variable (though about 80% of answers are in the 1-3 word range). As an example for seq2seq, we can just ignore the image and try to answer using only the question (this should give a performance of about 40-45% on the validation set). If this works out, I can later create an example that looks at the image and uses seq2seq for modeling the text. The dataset is fairly large (about 248K train and 121K val examples), so we should be able to learn some meaningful structure using LSTM encoder/decoder pairs. I have written some helper scripts for working with this dataset, which can be used for converting their JSON files to a simpler (and faster to load) plain-text format. You can find them here, if needed.
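The JSON-to-plain-text conversion mentioned above can be sketched in a few lines of dependency-free Python. The schema assumed here (a top-level "questions" list with "question_id" and "question" fields) is an assumption about the VQA release format; check it against the real files before use:

```python
import json

def questions_to_text(json_str):
    """Flatten a VQA-style questions JSON blob into one question per line.
    Assumed schema: {"questions": [{"question_id": ..., "question": ...}, ...]}
    -- verify against the actual dataset files."""
    data = json.loads(json_str)
    lines = []
    for q in data['questions']:
        lines.append('%d\t%s' % (q['question_id'], q['question']))
    return '\n'.join(lines)

sample = '{"questions": [{"question_id": 1, "question": "What color is the cat?"}]}'
print(questions_to_text(sample))  # -> "1\tWhat color is the cat?"
```

The resulting tab-separated text file loads much faster than re-parsing the full JSON on every run.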
@avisingh599 Cool! I will check it out. |
@libraua, thank you! The reddit post has quite useful links in the comments. |
@nicolas-ivanov would you be so kind to share with us your findings? Does it work? |
@libraua At least I managed to run the fit() and predict() functions :) Unfortunately, I get only rubbish predictions so far, and I'm willing to concede that there is some bug in my interpretation of the predicted vectors. However, the weird thing is that it takes an enormous amount of time to make a prediction for one sentence - up to a minute. For comparison, my sequence-to-word model did this job in less than a second. Regarding this slowness, I can only assume that there's a defect in Fariz's seq2seq implementation... I'll try to structure my code into some readable form and then share the link to it here on Thursday.
@nicolas-ivanov Can you please post your dataset here? |
@farizrahman4u sure! tomorrow |
@libraua Here comes my usage example of seq2seq: https://github.com/nicolas-ivanov/debug_seq2seq. The model trains and now even predicts some sequences that have the potential to turn into meaningful statements after a substantial number of iterations... @farizrahman4u you can find the dataset for training here. However, the slow-prediction problem I mentioned before went away as soon as I started using a simple argmax() for restoring words from vectors, instead of trying to diversify the generated results.
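The argmax decoding described above needs no model in the loop: given the per-timestep probability vectors a seq2seq model predicts, pick the highest-scoring vocabulary index at each step. A dependency-free sketch (the vocabulary, end-of-sentence marker, and probabilities below are made up for illustration):

```python
def argmax_decode(prob_seq, index_to_word, eos_token='$$$'):
    """Map a sequence of probability vectors to words by taking the
    highest-probability index at each timestep; stop at the EOS token."""
    words = []
    for probs in prob_seq:
        idx = max(range(len(probs)), key=probs.__getitem__)
        word = index_to_word[idx]
        if word == eos_token:
            break
        words.append(word)
    return words

# toy vocabulary and a fake 3-step prediction
vocab = ['hello', 'world', '$$$']
pred = [
    [0.7, 0.2, 0.1],   # -> 'hello'
    [0.1, 0.8, 0.1],   # -> 'world'
    [0.2, 0.1, 0.7],   # -> '$$$' (stop)
]
print(argmax_decode(pred, vocab))  # -> ['hello', 'world']
```

Sampling or beam-search decoding gives more diverse output, but as noted above it can be far slower than this greedy pass.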
I will fix it soon. Meanwhile, you can save your model as is using cPickle.
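Pickling a compiled Keras model object directly can be fragile; a workaround in the same spirit is to pickle the weight list returned by model.get_weights() and restore it later with model.set_weights(). A minimal round-trip sketch, using a plain nested list as a stand-in for the real weight arrays (cPickle is the Python 2 module; it became pickle's C implementation in Python 3):

```python
import pickle  # use cPickle on Python 2 for speed
from io import BytesIO

# stand-in for model.get_weights(), which returns a list of arrays
weights = [[0.1, 0.2], [0.3, 0.4, 0.5]]

buf = BytesIO()              # any writable binary file works here
pickle.dump(weights, buf)
buf.seek(0)
restored = pickle.load(buf)

assert restored == weights
# a real model would then call model.set_weights(restored)
```

Rebuilding the architecture in code and loading only the weights sidesteps whatever part of the model object fails to serialize.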
Ok, thank you |
@nicolas-ivanov In your system, it appears that all predictions are " . . . . . . " - is this something you've seen before? I haven't debugged your code just yet, so I'm not sure what's going on.
@viksit Right, I got the same, and after a while the model starts to return sequences like " i , . $$$ $$$ . . . " where $$$ represents the end-of-sentence symbol. That looks a bit closer to what we should get - at least the end of the sentence is present - but that's the best I could get on my laptop using the tiny parameters from config.py.
This library is good; however, I strongly suggest writing your code based directly on Keras or TensorFlow, which are more flexible for your special needs.
It would be great if someone could convert this example (with prediction as well) to use the seq2seq library: https://blog.keras.io/a-ten-minute-introduction-to-sequence-to-sequence-learning-in-keras.html. I tried, but couldn't get results anywhere near those of the Keras example. Thanks!
Dear developers, thank you for your work!
It would be of great help to use your seq2seq implementation; however, after a considerable amount of time and effort I still can't do it due to the lack of documentation and examples. The Keras docs don't tell much about seq2seq mapping either. So please add a simple workable code example that demonstrates the usage of your library: how to prepare the input data and how to implement the training and prediction procedures. It would be greatly appreciated.
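Until official examples land, the input-preparation step requested above generally boils down to: build a vocabulary, map each sentence to a fixed-length list of word indices (padding and truncating as needed), and one-hot encode into a 3D array of shape (samples, timesteps, vocab_size). A dependency-free sketch of that pipeline (the sentences, markers, and sizes are illustrative):

```python
PAD, EOS = '<pad>', '<eos>'

def build_vocab(sentences):
    """Index every word, reserving 0 for padding and 1 for end-of-sentence."""
    vocab = {PAD: 0, EOS: 1}
    for sent in sentences:
        for word in sent.split():
            vocab.setdefault(word, len(vocab))
    return vocab

def encode(sentence, vocab, maxlen):
    """One-hot encode a sentence into a (maxlen, len(vocab)) nested list."""
    ids = [vocab[w] for w in sentence.split()][:maxlen - 1] + [vocab[EOS]]
    ids += [vocab[PAD]] * (maxlen - len(ids))
    return [[1.0 if i == idx else 0.0 for i in range(len(vocab))]
            for idx in ids]

sents = ['how are you', 'i am fine']
vocab = build_vocab(sents)
x = [encode(s, vocab, maxlen=5) for s in sents]
# x has shape (2 samples, 5 timesteps, vocab_size); converted to an
# array, this is the kind of tensor fit() expects as encoder input,
# with the reply sentences encoded the same way as the target
```

For real data you would convert these nested lists to a NumPy array and, at prediction time, invert the encoding with an argmax over the vocabulary axis.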