ValueError: Cannot feed value of shape (32, 15) for Tensor 'Placeholder_1:0', which has shape '(?,)' #3
Comments
Well, it's a shape mismatch. The …
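A minimal sketch of the usual fix for this particular mismatch, assuming the data loader produces one-hot labels (the variable names here are illustrative, not the repo's):

```python
import numpy as np

# Hypothetical one-hot labels: shape (32, 15), i.e. a batch of 32
# examples over 15 classes (sizes taken from the error message).
y_onehot = np.eye(15)[np.random.randint(0, 15, size=32)]

# 'Placeholder_1:0' has shape (?,): it expects a 1-D vector of integer
# class indices, one per example, not one-hot rows. Collapse the
# one-hot dimension with argmax before feeding:
y_indices = np.argmax(y_onehot, axis=1)
print(y_indices.shape)  # (32,)
```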
Thanks for that. But I am getting another problem: while training it shows the val accuracy is 0, and after training the test accuracy is also zero (Train Epoch time: 8.773 s, Test accuracy: 0.000000 %). Can you help me see where I went wrong?
Thanks, it was solved after using loadv2.
@kbkreddy The information you provided is not enough to get any ideas. You might print the loss on the training set to see if the model works correctly, and check the accuracy calculation.
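A sketch of that sanity check, reusing names that appear later in this thread (fill_feed_dict, run_train_step, classifier); the "loss" key is an assumption about what run_train_step returns:

```python
for step, (x_batch, y_batch) in enumerate(
        fill_feed_dict(x_train, y_train, config["batch_size"])):
    return_dict = run_train_step(classifier, sess, (x_batch, y_batch))
    if step % 100 == 0:
        # A loss that never decreases means the model is not learning,
        # which would explain a 0% accuracy without any metric bug.
        print("step %d, train loss %.4f" % (step, return_dict["loss"]))
```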
When I change the code to work on my dataset, it doesn't get trained (val accuracy is zero). Can you help me find where I might have gone wrong? I didn't change much of your code to make it work on my dataset; I just changed the number of classes to 5. One irony is that if I keep the number of classes at 15 and run on my dataset, the code works perfectly, but the problem comes when I change n_classes to 5.
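One quick, hedged diagnostic for the "works with 15, fails with 5" symptom (an assumption, since the thread never resolves this explicitly): if any label in the data is still ≥ 5, the loss and accuracy computation breaks once n_classes is lowered.

```python
import numpy as np

# y_train as loaded for your dataset; labels must be integers in [0, n_classes).
labels = np.asarray(y_train)
print("min:", labels.min(), "max:", labels.max())
print("unique labels:", np.unique(labels))
assert labels.max() < 5, "labels exceed n_classes - remap them to 0..4"
```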
Is there any way that I can visualize the attention vector? If so, how?
@kbkreddy You can print the attention weights by fetching the variable when evaluating the model.
I am really confused about which variable I need to fetch (is it alpha?). Can you help me with the code to get the attention vector? I want to see whether the attention is really working or not. Is the attention vector static for all inputs, or does it change according to the input?
@kbkreddy You asked a great question, which helped me find a bug!

```python
def get_attn_weight(model, sess, batch):
    # Build the same feed dict used for training, but fetch the
    # attention weights (model.alpha) instead of the train op.
    feed_dict = make_train_feed_dict(model, batch)
    return sess.run(model.alpha, feed_dict)
```

Then, fetch it when training or evaluating:

```python
for x_batch, y_batch in fill_feed_dict(x_train, y_train, config["batch_size"]):
    return_dict = run_train_step(classifier, sess, (x_batch, y_batch))
    attn = get_attn_weight(classifier, sess, (x_batch, y_batch))
    # Reshape the flat weights back to (batch_size, max_len)
    print(np.reshape(attn, (config["batch_size"], config["max_len"])))
```

Well, the result I got is that all the weights are equal to 1, like:
To keep the result short, I made …
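For reference, one common way to end up with every attention weight equal to exactly 1 (plausibly the bug meant here, though the thread does not spell it out) is applying the softmax over an axis of size 1, so each "distribution" contains a single element. A small numpy illustration:

```python
import numpy as np

def softmax(x, axis):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

scores = np.random.randn(2, 5, 1)            # (batch, max_len, 1)

wrong = softmax(scores, axis=-1)             # softmax over the size-1 axis:
print(wrong.ravel())                         # every weight is exactly 1.0

right = softmax(scores.squeeze(-1), axis=1)  # softmax over the time axis
print(right.sum(axis=1))                     # each row sums to 1, weights differ
```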
@TobiasLee Yeah, I came across that error too. Thanks for the correction, it's giving some values now. One doubt: I got the alpha values below.
How come the model is able to predict correctly even though the three highest softmax probabilities fall on 'a', 'in', and 'the'? I don't think 'a in the' alone can predict class 5.
May I know the difference between the attention mechanism in the modules folder and the one in the attention BiLSTM (this code)? Thanks.
To my understanding, the attention mechanism is just a weighted sum of hidden states, which can thus provide more information for the prediction. The example you showed may be a bad case, which I cannot explain either. As for the differences between the version in modules and the one in …
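As a hedged sketch of that "weighted sum of hidden states" idea (generic additive attention over BiLSTM states, in the thread's TF 1.x style; the names and sizes are illustrative, not necessarily the repo's module):

```python
import tensorflow as tf

def attention(rnn_output, attn_size=64):
    # rnn_output: (batch, max_len, hidden) hidden states from the BiLSTM.
    hidden = rnn_output.shape[-1].value
    W = tf.get_variable("attn_W", [hidden, attn_size])
    b = tf.get_variable("attn_b", [attn_size])
    v = tf.get_variable("attn_v", [attn_size])

    # Score each time step: u_t = tanh(h_t W + b), e_t = u_t . v
    u = tf.tanh(tf.tensordot(rnn_output, W, axes=1) + b)  # (batch, max_len, attn_size)
    e = tf.tensordot(u, v, axes=1)                        # (batch, max_len)

    # Normalize over the *time* axis so the weights form a distribution.
    alpha = tf.nn.softmax(e, axis=1)                      # (batch, max_len)

    # Weighted sum of hidden states: the extra information for prediction.
    context = tf.reduce_sum(rnn_output * tf.expand_dims(alpha, -1), axis=1)
    return context, alpha
```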
Okay, thanks for that, I'm getting better at this now. One more doubt I came across, please see the output below.
I used the attention mechanism in the modules folder. The alpha vector generation seems correct in this example, but note that the output of rnn_output * attention vector above is not what I expected: all the values in the second array are similar, whereas I was expecting the BiLSTM state for 'village' to have more weight, right?
Well, actually I'm not clear on what your problem is. Do you mean the implementation of the attention module may be wrong? I checked the origin repo and found that the author had updated the attention module, so I updated it too. You can try the new implementation.
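On the rnn_output * attention point, note that the element-wise product only weights the states correctly if alpha is broadcast along the hidden dimension. A quick numpy check (shapes are illustrative):

```python
import numpy as np

batch, max_len, hidden = 2, 4, 3
rnn_output = np.random.randn(batch, max_len, hidden)
alpha = np.random.rand(batch, max_len)
alpha /= alpha.sum(axis=1, keepdims=True)   # each row sums to 1

# Broadcast alpha over the hidden dimension before multiplying, so each
# time step's full hidden vector is scaled by its attention weight:
weighted = rnn_output * alpha[:, :, None]   # (batch, max_len, hidden)
context = weighted.sum(axis=1)              # (batch, hidden)

# A time step with a large weight dominates the context vector; if the
# weighted states all look similar, inspect alpha itself, not the product.
print(alpha)
print(context.shape)
```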
@TobiasLee: I too ran into the same issue as @kbkreddy above. I'm running adversarial_abblstm.py (the adversarial training example). In my dataset the number of classes is 7 and I get a validation accuracy of 0, but if I change n_class to 15, I do get some decent accuracy values. Can you look into the above? In order to fetch the attention vector while training, I modified alpha in my code to self.alpha
in the …
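For readers following along, a hedged sketch of what that self.alpha change looks like (a toy model, not the repo's actual class; tf.layers.dense stands in for whatever scoring layer the model uses):

```python
import tensorflow as tf

class AttnModel:
    def __init__(self, max_len=40, hidden=128):
        self.x = tf.placeholder(tf.float32, [None, max_len, hidden])
        scores = tf.layers.dense(self.x, 1)  # (batch, max_len, 1)
        # Keep the weights as an attribute (self.alpha) rather than a
        # local variable, so they can later be fetched with
        # sess.run(model.alpha, feed_dict=...):
        self.alpha = tf.nn.softmax(tf.squeeze(scores, -1), axis=1)
```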
I am getting the above error when I run the BiLSTM attention program on the DBpedia dataset:
```
Traceback (most recent call last):
  File "attn_bi_lstm.py", line 112, in <module>
    return_dict = run_train_step(classifier, sess, (x_batch, y_batch))
  File "/home/kbk/Desktop/BudddiHealth/higher models/Text-Classification-master/models/utils/model_helper.py", line 26, in run_train_step
    return sess.run(to_return, feed_dict)
  File "/home/kbk/miniconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 877, in run
    run_metadata_ptr)
  File "/home/kbk/miniconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1076, in _run
    str(subfeed_t.get_shape())))
ValueError: Cannot feed value of shape (32, 15) for Tensor 'Placeholder_1:0', which has shape '(?,)'
```