prediction different between TF Serving 1.4 and TF 1.4 #656
Comments
I'm seeing something very similar at the moment, although the numerical difference is a bit more dramatic. Notably, I have an embedding layer with some all-zero vectors (e.g. for padding / OOV). I've broken it down in terms of what it takes to export a Keras model to TF Serving.
I'm using 1.4 here, but I can't confirm that I wasn't seeing this in 1.3. I guess I could try downgrading if that helps? Forgot to mention: this is all on CPU, using the default builds (i.e. none of the available CPU optimizations are being used). If needed I could probably put together a reproducible test case, but there are a lot of moving parts :) Also: the servable I'm testing has ops with boolean or int32 outputs as well, and all of those come out fine! However, the float outputs are all funky.
Further note: I tried to determine where the corruption is happening by quickly abusing a ThresholdedReLU Keras layer to zero out the embeddings and then add them back in, and then comparing the original embedding layer to the one with zeros added to it. If the zeros were broken within the graph, I'd see different numbers between the two, but it looks like they're the same. What I did notice on a second run is that I have two embedding vectors that are all zeros (to distinguish between OOV and pad tokens -- don't worry about it), and they're coming out as different (garbage) vectors. So, e.g., a sequence containing those tokens produces the expected zero vectors after the embedding layer in TF, but from TF Serving I get nonzero garbage in their place.
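For context, here's a minimal sketch (in plain NumPy, with hypothetical vocabulary size, dimensions, and row indices, since the original arrays didn't survive) of what the embedding lookup is expected to do: the PAD and OOV rows are exact zero vectors, and a lookup should return them unchanged.

```python
import numpy as np

# Hypothetical embedding table: rows 0 (PAD) and 1 (OOV) are distinct
# all-zero vectors, as described in the comment above.
vocab_size, embed_dim = 10, 4
rng = np.random.default_rng(0)
embeddings = rng.standard_normal((vocab_size, embed_dim))
embeddings[0] = 0.0  # PAD
embeddings[1] = 0.0  # OOV

# Embedding lookup for a token sequence containing PAD (0) and OOV (1).
tokens = np.array([5, 0, 1, 3])
looked_up = embeddings[tokens]

# In TF / Keras the zero rows come back exactly zero; the bug in this
# thread is that TF Serving returned garbage values for these rows.
assert np.all(looked_up[1] == 0.0) and np.all(looked_up[2] == 0.0)
```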
I'm working on putting together a minimal repro -- currently what I have relies on a bunch of weird custom code / Keras layers that isn't worth including. I've also noticed that the vectors change between versions (not between requests), so my previous comment about what the 0 gets changed to is inaccurate. Also, if I inspect the output of the ThresholdedReLU, I do see all zeros.
Here's a gist that should (at least, on my system) reproduce this issue: https://gist.github.com/zmjjmz/64cf9771922aa6cf58da6233e022f056
I was initially encountering this issue in a servable that used a lookup table, hence the initializers run by the export's main_op. So, I think I can narrow it down: something funky is going on in that main_op that's causing this.
Ok, so playing with this a bit more, I think the issue is specifically with the initializers in the main_op. Currently the export runs both the tables initializer and the global variables initializer there.
If I remove just that last initializer (the global variables initializer), this issue goes away, and I'm able to use the model as normal! Definitely something strange going on when the global variables initializer runs in the main_op.
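One plausible reading of the fix above is that the global variables initializer re-runs at serving load time, after the checkpoint restore, and clobbers the trained weights with fresh random values. A minimal NumPy sketch of that failure mode (the names and the toy linear model are illustrative, not TF Serving internals):

```python
import numpy as np

rng = np.random.default_rng(42)

# "Trained" weights as restored from a checkpoint; row 0 is a deliberate
# zero vector, like the PAD/OOV embedding rows discussed above.
trained_w = np.array([[0.0, 0.0], [1.5, -2.0]])

def predict(w, x):
    # Toy linear model standing in for the real graph.
    return x @ w

x = np.array([[1.0, 0.0]])  # selects row 0 of the weights
print(predict(trained_w, x))  # -> [[0. 0.]], the zero vector survives

# If a global-variables initializer runs again after the restore, the
# restored weights are overwritten with fresh random values -- the zero
# vector is gone and every float output changes.
reinitialized_w = rng.standard_normal(trained_w.shape)
assert not np.allclose(predict(trained_w, x), predict(reinitialized_w, x))
```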
Thanks for reporting back @zmjjmz. Resolving, since this seems export-specific for now.
Should I open this as a separate issue on the main tensorflow repo then? |
@zmjjmz Is this to follow up regarding the global variable initializer? If so, sure.
For me, the difference here disappeared in a recent TF / Keras update. As far as I could tell, it was a difference between TF prediction and TF Serving prediction that existed briefly and was fixed in a recent release.
After updating our TF, Keras, and TF Serving, I'm seeing a difference in prediction values on the same model and images between TF / Keras and Serving. I updated to TF 1.4 and Keras 2.0.9, and built TF Serving from the 1.4 branch (tried master too). Prediction on some random images gives:
Keras, TensorFlow, TensorFlowServing, TrueLabel
0.294510304928, 0.294510304928, 0.306598514318, 1
0.973454713821, 0.973454713821, 0.974921882153, 1
0.0169313177466, 0.0169313177466, 0.109000883996, 0
0.969210922718, 0.969210922718, 0.964440405369, 1
0.996860027313, 0.996860027313, 0.998536705971, 1
0.996983230114, 0.996983230114, 0.994152128696, 1
0.259784668684, 0.259784668684, 0.300680160522, 0
0.989252388477, 0.989252388477, 0.97792416811, 1
i.e. Keras and TF predict the same, but TF Serving gives different numbers. It's possible we didn't upgrade our TF Serving correctly (although we didn't see any errors).
Is anyone else getting this? We didn't get this on TF 1.3.
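A quick way to flag mismatches like the rows above is to compare the two backends' outputs with a float tolerance; the differences shown (roughly 0.001 to 0.09) are far beyond the ~1e-6 noise you'd expect from graph optimizations or CPU build differences. A minimal NumPy sketch, using the first three rows from the table:

```python
import numpy as np

# First three rows of the table above: Keras/TF agree, Serving differs.
keras_preds = np.array([0.294510304928, 0.973454713821, 0.0169313177466])
serving_preds = np.array([0.306598514318, 0.974921882153, 0.109000883996])

# Tolerance generous enough to absorb legitimate float noise between
# equivalent graphs, but tight enough to catch a real discrepancy.
mismatched = ~np.isclose(keras_preds, serving_preds, atol=1e-4)
print(int(mismatched.sum()))  # -> 3, every row exceeds the tolerance
```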