some code missing? #3
Comments
Did you change
Thanks. I found the reason just now.
@ruotianluo |
Sadly no, because I feel like it will be a little bit messy to add it.
@ruotianluo |
Sorry, what I meant is: alpha is the attention map, which is named weight in my code. You can save the weights at each timestep and visualize them using the last block of alpha_visualization.ipynb
@ruotianluo
here?
@brisker In principle it allows you to use a GRU instead of an LSTM; however, I forget whether I tested it or not.
@ruotianluo
@brisker Why not?
@ruotianluo |
scheduled sampling
@ruotianluo |
@brisker Yes. Note that weight is flattened; you should first reshape it to 7x7. (I forgot to mention: this show_attend_tell is not exactly the same as described in the paper, it's simplified a little bit.)
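A minimal numpy sketch of that reshaping step for visualization (the function name and the 7x7/224 sizes are illustrative assumptions, not taken from the repo's code):

```python
import numpy as np

def attention_to_map(weight_flat, att_size=7, img_size=224):
    """Reshape a flattened attention vector of length att_size**2 into a
    2-D map, then upsample to image resolution by repeating each cell
    (nearest-neighbour) so it can be overlaid on the input image."""
    att = np.asarray(weight_flat).reshape(att_size, att_size)
    scale = img_size // att_size          # 224 // 7 = 32
    return np.kron(att, np.ones((scale, scale)))

w = np.random.rand(49)                    # one timestep's flattened weights
m = attention_to_map(w)                   # (224, 224) heatmap
```

In practice a smoother upsampling (e.g. bilinear, as in the original notebook-style visualizations) looks nicer, but nearest-neighbour keeps the sketch dependency-free.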
@ruotianluo
you seem to just perform average pooling on the conv features as the fc_feats
@brisker This is because I'm using resnet. |
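That pooling step can be sketched in a couple of lines (the 49x2048 shape is an illustrative assumption matching a 7x7 ResNet feature map, not a value quoted from the repo):

```python
import numpy as np

# att_feats: flattened conv feature map from a ResNet backbone,
# e.g. 49 spatial locations x 2048 channels.
att_feats = np.random.rand(49, 2048)

# fc_feats is just the spatial average of the conv features:
fc_feats = att_feats.mean(axis=0)   # shape (2048,)
```

This mirrors ResNet's own global average pooling, so no separate fully-connected feature extractor is needed.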
@ruotianluo |
No, scheduled sampling is a different technique that is not mentioned in the Show, Attend and Tell paper; you can google the scheduled sampling paper.

It's designed to reduce the train/test discrepancy: during training the decoder is fed ground-truth tokens, but at test time it must consume its own predictions.
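The core of scheduled sampling fits in a few lines. A hedged sketch (function and argument names are made up for illustration; the repo's actual implementation may differ):

```python
import random

def decode_step_input(ground_truth_token, predicted_token, ss_prob):
    """Scheduled sampling: with probability ss_prob, feed the model its
    own previous prediction instead of the ground-truth token. ss_prob is
    typically annealed from 0 toward some maximum during training."""
    if random.random() < ss_prob:
        return predicted_token     # model's own output (test-time behavior)
    return ground_truth_token      # teacher forcing (train-time behavior)
```

With ss_prob=0 this reduces to ordinary teacher forcing; with ss_prob=1 the decoder always consumes its own predictions, as at test time.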
@ruotianluo |
Did you set model to evaluate? |
@ruotianluo |
How different are the results?
@ruotianluo |
Yes, as long as it's mathematically equivalent.
@ruotianluo |
The idea is that you can look at a different part of the image at each time step.
@ruotianluo |
We don't use fc_feats to compute the weights; we use att_feats. The hidden state is changing over time.
thanks for all your replies :) |
att_feats change over locations, and hidden states change over time. The output of the attention module is a weighted summation of att_feats.
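The mechanism described above can be sketched as a small additive-attention function (all weight-matrix names and dimensions here are illustrative assumptions, not identifiers from the repo):

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def attend(att_feats, h, W_a, W_h, v):
    """Score each of the L spatial locations from att_feats and the
    current hidden state h, softmax the scores into attention weights,
    and return the weighted summation of att_feats."""
    # att_feats: (L, D) location features; h: (H,) hidden state
    scores = np.tanh(att_feats @ W_a + h @ W_h) @ v   # (L,) one score per location
    weight = softmax(scores)                          # attention map, sums to 1
    att_res = weight @ att_feats                      # (D,) weighted sum
    return att_res, weight

rng = np.random.default_rng(0)
L, D, H, K = 49, 32, 16, 8
att_res, weight = attend(rng.normal(size=(L, D)), rng.normal(size=H),
                         rng.normal(size=(D, K)), rng.normal(size=(H, K)),
                         rng.normal(size=K))
```

Because h changes every timestep while att_feats stay fixed, the weights (and hence the attended region) shift as the caption is generated.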
Technically I don't concat, but mathematically it's equivalent. I wrote it this way to avoid duplicate computation.
@ruotianluo
xt is the output of each time step, and att_res is a weighted sum of att_feats, right?
Ok, it seems that I misunderstood your question. Yes, if you don't concat att_res here, it's not an attention model, and there's no visualization either, because there's no training signal to the attention module. |
thanks for all your replies :)
There are a lot of different fusion types proposed in the VQA literature; you can check them out. The easiest alternative is an elementwise product.
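The two fusion options mentioned here, side by side (the 512-dim vectors are an illustrative assumption):

```python
import numpy as np

# Two vectors to fuse, e.g. the word embedding xt and the attended
# feature att_res, assumed here to share the same dimensionality.
xt = np.random.rand(512)
att_res = np.random.rand(512)

concat_fused = np.concatenate([xt, att_res])  # concat fusion -> (1024,)
product_fused = xt * att_res                  # elementwise product -> (512,)
```

Note the trade-off: concat doubles the input size of the next layer, while the elementwise product keeps the dimensionality but requires both inputs to have matching shapes.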
python scripts/prepro_labels.py --input_json .../dataset_coco.json --output_json data/cocotalk.json --output_h5 data/cocotalk
failed. Here are the errors: it seems that some code is missing.