Any way to get a "Confidence" metric? #19
So I dug in a little bit here and wanted to report back. Still seeing some funkiness, but here's where I'm at. Looking at this code: `attention-ocr/aocr/model/model.py`, lines 177 to 179 in 4505228.

This seems to be where we get the predictions back for each AttentionDecoder (this list looks to be equal in size to MAX_PREDICTION, which makes sense, and each list in that list looks to be of size TARGET_VOCAB_SIZE, which also makes sense). To me these seem to be the prediction values for each character at each decoder step, so I extended this code to add a node with this entire list so I could get it at predict time:

```python
num_feed = []
allProbabilities = []
for l in range(len(self.attention_decoder_model.output)):
    outputs = self.attention_decoder_model.output[l]
    guess = tf.argmax(outputs, axis=1)
    num_feed.append(guess)
    allProbabilities.append(outputs)
all_probs_output = tf.convert_to_tensor(allProbabilities, name='allProbabilities')
```

Then, during predict time, I'm able to get the output of this tensor, softmax each list to turn it into probabilities, take the max probability for each AttentionDecoder, and either take the mean or the product of all the values, e.g.:

```python
allProbs = graph.get_tensor_by_name('prefix/allProbabilities:0')
# confidence = np.mean(softmax(probs).max(axis=2))

with tf.Session(graph=graph) as sess:
    (y_out, probs_output) = sess.run([y, allProbs], feed_dict={
        x: [img]
    })
return {
    "predictions": [{
        "ocr": str(y_out),
        "confidence": str(np.mean(softmax(probs_output).max(axis=2)))
    }]
}
```

However, everything I'm getting is around 50%–60%, which is wildly unhelpful. Any ideas on where I'm going wrong here?
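For what it's worth, the mean-of-max-softmax computation described above can be sketched in plain NumPy. The shapes, the random logits, and the locally defined `softmax` here are all made up for illustration; they are not the model's actual tensors:

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax over the vocabulary axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

# Toy logits: (max_prediction_length, batch_size, vocab_size).
rng = np.random.default_rng(0)
logits = rng.normal(size=(12, 1, 39))

probs = softmax(logits, axis=2)   # per-character distributions
per_char = probs.max(axis=2)      # best probability at each decoder step
mean_conf = per_char.mean()       # "average confidence" variant
prod_conf = per_char.prod()       # "sequence probability" variant
print(mean_conf, prod_conf)
```

Note that the product is always at most the mean (every factor is at most 1), so the two variants live on very different scales.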
The logic seems right (when using the product). I have a different implementation that I believe/hope is working, which might serve as a useful (though perhaps inefficient) reference. I basically populate a probability tensor in parallel with the prediction tensor:

```python
num_feed = []
prb_feed = []
for l in range(len(self.attention_decoder_model.output)):
    guess = tf.argmax(self.attention_decoder_model.output[l], axis=1)
    proba = tf.reduce_max(
        tf.nn.softmax(self.attention_decoder_model.output[l]), axis=1)
    num_feed.append(guess)
    prb_feed.append(proba)

# Join the predictions into a single output string.
trans_output = tf.transpose(num_feed)
trans_output = tf.map_fn(
    lambda m: tf.foldr(
        lambda a, x: tf.cond(
            tf.equal(x, DataGen.EOS_ID),
            lambda: '',
            lambda: table.lookup(x) + a
        ),
        m,
        initializer=''
    ),
    trans_output,
    dtype=tf.string
)

# Calculate the total probability of the output string.
trans_outprb = tf.transpose(prb_feed)
trans_outprb = tf.gather(trans_outprb, tf.range(tf.size(trans_output)))
trans_outprb = tf.map_fn(
    lambda m: tf.foldr(
        lambda a, x: tf.multiply(tf.cast(x, tf.float64), a),
        m,
        initializer=tf.cast(1, tf.float64)
    ),
    trans_outprb,
    dtype=tf.float64
)

self.prediction = tf.cond(
    tf.equal(tf.shape(trans_output)[0], 1),
    lambda: trans_output[0],
    lambda: trans_output,
)
self.probability = tf.cond(
    tf.equal(tf.shape(trans_outprb)[0], 1),
    lambda: trans_outprb[0],
    lambda: trans_outprb,
)

self.prediction = tf.identity(self.prediction, name='prediction')
self.probability = tf.identity(self.probability, name='probability')
```

I then add it to the output feed at each step:

```python
if not forward_only:
    output_feed += [
        self.summaries_by_bucket[0],
        self.updates[0],
        self.prediction,
    ]
else:
    output_feed += [
        self.prediction,
        self.probability,
    ]
    if self.visualize:
        output_feed += self.attention_decoder_model.attention_weights_history

outputs = self.sess.run(output_feed, input_feed)

res = {
    'loss': outputs[0],
}
if not forward_only:
    res['summaries'] = outputs[1]
    res['prediction'] = outputs[3]
else:
    res['prediction'] = outputs[1]
    res['probability'] = outputs[2]
    if self.visualize:
        res['attentions'] = outputs[3:]
```

Apologies for the long code snippets.
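As a side note, the two `tf.foldr` reductions above are easier to follow when mirrored in plain Python/NumPy. The `EOS_ID` value and the `vocab` table below are made up for illustration; only the fold logic matches the graph code:

```python
import numpy as np

EOS_ID = 2                         # assumed EOS token id
vocab = {3: 'c', 4: 'a', 5: 't'}   # hypothetical id -> character table

def join_until_eos(ids):
    # Mirrors the string tf.foldr: build the string right-to-left,
    # resetting to '' whenever EOS is seen, so anything after the
    # first EOS is discarded.
    out = ''
    for x in reversed(ids):
        out = '' if x == EOS_ID else vocab.get(x, '?') + out
    return out

def sequence_probability(per_char_probs):
    # Mirrors the probability tf.foldr: product of the per-character
    # max probabilities over ALL steps, including padding after EOS.
    return float(np.prod(per_char_probs))

print(join_until_eos([3, 4, 5, EOS_ID, 3]))  # -> 'cat'
```

Note the asymmetry: the string fold truncates at EOS, but the probability fold does not, so padding steps still contribute factors to the product.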
Thanks for the feedback! I just implemented yours and it is working fine, but some of the numbers still seem funny to me. For instance, on values that match 100%, I see probabilities ranging from 18% to 99% (e.g. 45%, 27%, 61%, 34%, 99%), and on values that are fairly wrong, I see anything from 17% to 99.8% (e.g. 54%, 99%, 95%, 32%, 70%). It's possible it's just my dataset, but I'm surprised to see such disparity. Have you been able to run this with your dataset and feel confident in the results? Theoretically everything makes sense, so I don't think it's an issue with the implementation; I'm just surprised to see such a wide range of results. I guess I don't have confidence in the confidence score.
Just piggybacking off my comment: I have my max prediction length set to 12, which gives me 12 probability lists. However, the overwhelming majority of my training set is ~6 characters, which is potentially why I'm seeing such crazy probability values. Perhaps it's those last 7–12 tensors, which are just "not confident", dragging my prediction down? If I drop max_prediction_length to 8, I get some more bearable values.
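A quick back-of-the-envelope illustration of that padding effect, with entirely made-up numbers: under the product formulation, six confident characters followed by six near-uniform padding steps get dragged down hard.

```python
import numpy as np

# Hypothetical per-step max probabilities for a 6-character word
# padded out to max_prediction_length = 12.
confident = [0.98] * 6   # the actual characters
padding = [0.55] * 6     # "unsure" steps after the word ends

full = np.prod(confident + padding)   # product over all 12 steps
trimmed = np.prod(confident)          # product over the word only

print(full, trimmed)
```

The trimmed product stays near 0.89 while the full product collapses to about 0.02, which is consistent with the wide, pessimistic scores described above.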
Interesting. I can confirm that the probabilities make sense on my dataset and task (which has a max prediction length of 20). It might also be that the model simply hasn't run for long enough, so the probability range is still quite wide for any given label, but would narrow with additional training.
It would be great to get the per-character confidence exposed in the outputs via a PR, if you guys don't mind sharing the great work! It's cool that a few of us seem to be actively using and improving this repo... let's help each other out :).
Thank you guys for that, your help on this project is really appreciated. ❤️ I don't have too much time for it, since it's just a side project for me, but I'll help out where I can — and of course happily review contributions and publish the new versions to pip.
I can submit a PR for the overall confidence today. I would love to get per-character confidence at some point, because honestly I'd love to return a list of possible solutions instead of just the most probable one, e.g. here are 20 potential OCRs with their confidence scores. But that would be a later thing, unless someone wants to jump on it.
PR submitted!
Totally agreed about the list of possible solutions being ideal; I'm running into a lot of '0' vs 'O' and '1' vs 'l' choices that don't always come out correctly. I'm looking forward to trying out the overall confidence too, though.
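For the '0' vs 'O' and '1' vs 'l' cases, the per-character softmax distributions already contain the runner-up candidates. A rough NumPy sketch of pulling the top-k guesses per position (the function name and shapes here are illustrative, not part of the repo):

```python
import numpy as np

def top_k_per_position(probs, k=3):
    # probs: (seq_len, vocab_size) softmax output for one image.
    # Returns, for each position, the k most likely (id, prob) pairs
    # in descending order of probability.
    idx = np.argsort(probs, axis=1)[:, ::-1][:, :k]
    return [[(int(i), float(p[i])) for i in row]
            for row, p in zip(idx, probs)]

# Toy distributions over a 4-symbol vocabulary at 2 positions.
probs = np.array([[0.05, 0.50, 0.40, 0.05],
                  [0.10, 0.10, 0.70, 0.10]])
print(top_k_per_position(probs, k=2))
```

Mapping the runner-up ids back through the character table would give exactly the kind of alternative-guesses list discussed above.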
Merged the overall probability, so I'll close this issue, and the multiple guesses are going to be tracked in #28. |
Hello,
I'm interested in knowing a "confidence" for a given prediction. Does anyone have any ideas on the best way to tackle this? I assume there is some output in the graph (potentially for each character?) that I could tap into to calculate this. Hope to play with this later this week but wanted to see if anyone had any ideas first.