Deploy tensorflow graphs for faster evaluation and export to tensorflow-less environments running numpy.
import tfdeploy as td
import numpy as np
model = td.Model("/path/to/model.pkl")
inp, outp = model.get("input", "output")
batch = np.random.rand(10000, 784)
result = outp.eval({inp: batch})
Via pip
pip install tfdeploy
or by simply copying the file into your project.
Currently, all math ops and a selection of nn ops are implemented. The remaining ops will follow within a few days, so there might be some UnknownOperationException's during conversion. See milestone v0.2.0.
Working with tensorflow is awesome. Model definition and training are simple yet powerful, and the range of built-in features is just striking.
However, when it comes down to model deployment and evaluation, things get a bit more cumbersome than they should be. You either export your graph to a new file and save your trained variables in a separate file, or you make use of tensorflow's serving system. Wouldn't it be great if you could just export your model to a simple numpy-based callable? Of course it would. And this is exactly what tfdeploy does for you.
To boil it down, tfdeploy
- is lightweight. A single file with < 150 lines of core code. Just copy it to your project.
- is faster than using tensorflow's Tensor.eval.
- does not need tensorflow during evaluation.
- only depends on numpy.
- can load one or more models from a single file.
- does not support GPUs (maybe gnumpy is worth a try here).
The central class is tfdeploy.Model. The following two examples demonstrate how a model can be created from a tensorflow graph, saved to and loaded from disk, and eventually evaluated.
import tensorflow as tf
import tfdeploy as td
# build your graph
sess = tf.Session()
# use names for input and output layers
x = tf.placeholder("float", shape=[None, 784], name="input")
W = tf.Variable(tf.truncated_normal([784, 100], stddev=0.05))
b = tf.Variable(tf.zeros([100]))
y = tf.nn.softmax(tf.matmul(x, W) + b, name="output")
sess.run(tf.initialize_all_variables())
# ... training ...
# create a tfdeploy model and save it to disk
model = td.Model()
model.add(y, sess) # y and all its ops and related tensors are added recursively
model.save("model.pkl")
import numpy as np
import tfdeploy as td
model = td.Model("model.pkl")
# shorthand to x and y
x, y = model.get("input", "output")
# evaluate
batch = np.random.rand(10000, 784)
result = y.eval({x: batch})
tfdeploy supports most of the Operation's implemented in tensorflow. However, if you miss one (in that case, submit a PR or an issue ;) ) or if you're using custom ops, you might want to extend tfdeploy by defining a new class op that inherits from tfdeploy.Operation:
import tensorflow as tf
import tfdeploy as td
import numpy as np
# ... write your model here ...
# let's assume your final tensor "y" relies on an op of type "InvertedSoftmax"
# before creating the td.Model, you should add that op to tfdeploy
class InvertedSoftmax(td.Operation):
    @staticmethod
    def func(a):
        e = np.exp(-a)
        # ops should return a tuple
        return np.divide(e, np.sum(e, axis=-1, keepdims=True)),
# this is equivalent to
@td.Operation.factory
def InvertedSoftmax(a):
    e = np.exp(-a)
    return np.divide(e, np.sum(e, axis=-1, keepdims=True)),
# now we're good to go
model = td.Model()
model.add(y, sess)
model.save("model.pkl")
When writing new ops, three things are important:
- Try to avoid loops, prefer numpy vectorization.
- Return a tuple.
- Don't change incoming tensors/arrays in-place, always work on and return copies.
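The three rules above can be illustrated with plain numpy, independent of tfdeploy itself. The following is a minimal sketch, not part of the library: the function name inverted_softmax is hypothetical, and the subtraction of the row-wise maximum is an added numerical-stability step that the example op in this README does not use.

```python
import numpy as np


def inverted_softmax(a):
    """Softmax of -a over the last axis, written in the style of an op function."""
    # Rule 3: "-a" allocates a new array, so the caller's array is never modified.
    z = -a
    # Rule 1: vectorized over all rows at once, no Python loop.
    # Subtracting the row-wise max (an addition for this sketch) avoids overflow in exp.
    e = np.exp(z - np.max(z, axis=-1, keepdims=True))
    # Rule 2: return a tuple with one entry per output.
    return (e / np.sum(e, axis=-1, keepdims=True),)


a = np.array([[1.0, 2.0, 3.0], [0.0, 0.0, 0.0]])
a_before = a.copy()
(out,) = inverted_softmax(a)
```

After the call, a still equals a_before, and each row of out sums to 1.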
tfdeploy is lightweight (1 file, < 150 lines of core code) and fast. Internal evaluation calls add very little overhead, and tensor operations use numpy vectorization. The actual performance depends on the ops in your graph. While most of the tensorflow ops have a numpy equivalent or can be constructed from numpy functions, a few ops require additional Python-based loops (e.g. BatchMatMul). But in many cases it's potentially faster than using tensorflow's Tensor.eval.
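To make the difference between a loop-based op and a vectorized one concrete, here is a small numpy-only sketch of a batched matrix product written both ways. It is an illustration only: the function names are hypothetical, and it does not claim to reproduce how tfdeploy implements BatchMatMul.

```python
import numpy as np


def batch_matmul_loop(a, b):
    """Multiply matching matrices of two 3-D stacks with an explicit Python loop."""
    out = np.empty((a.shape[0], a.shape[1], b.shape[2]), dtype=np.result_type(a, b))
    for i in range(a.shape[0]):
        # one np.dot call per batch entry, so the Python loop adds per-entry overhead
        out[i] = np.dot(a[i], b[i])
    return (out,)


def batch_matmul_vectorized(a, b):
    """Same product in a single np.einsum call that handles the batch axis in numpy."""
    return (np.einsum("bij,bjk->bik", a, b),)


rng = np.random.RandomState(0)
a = rng.rand(4, 3, 5)
b = rng.rand(4, 5, 2)
(looped,) = batch_matmul_loop(a, b)
(vectorized,) = batch_matmul_vectorized(a, b)
```

Both variants produce the same array of shape (4, 3, 2); which one is faster depends on the batch size and the numpy version, so it is worth timing on your own data.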
This is a comparison for a basic graph where all ops are vectorized (basically Add, MatMul and Softmax):
> ipython -i tests/perf.py
In [1]: %timeit -n 100 test_tf()
100 loops, best of 3: 109 ms per loop
In [2]: %timeit -n 100 test_td()
100 loops, best of 3: 60.5 ms per loop
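The contents of tests/perf.py are not shown here. As a rough idea of what a graph made of MatMul, Add and Softmax amounts to in numpy, the sketch below evaluates such a forward pass directly. The shapes are borrowed from the earlier example as an assumption, and the function name forward is hypothetical.

```python
import numpy as np


def forward(x, W, b):
    """Evaluate softmax(x . W + b) row by row using only vectorized numpy calls."""
    logits = np.dot(x, W) + b  # MatMul followed by Add (b is broadcast over rows)
    # subtract the row-wise max before exp for numerical stability
    e = np.exp(logits - np.max(logits, axis=-1, keepdims=True))
    return e / np.sum(e, axis=-1, keepdims=True)  # Softmax


rng = np.random.RandomState(0)
x = rng.rand(1000, 784)                        # assumed batch of inputs
W = rng.normal(scale=0.05, size=(784, 100))    # assumed weight shape
b = np.zeros(100)
result = forward(x, W, b)
```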
If you want to contribute new ops and features, I'm happy to receive pull requests. Just make sure to add a new test case to tests/core.py or tests/ops.py and run it via:
> python -m unittest tests
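The layout of tests/core.py and tests/ops.py is not shown here, so the following is only a sketch of what a unittest case for a new op function might look like. The class name, the test names and the standalone inverted_softmax function are hypothetical; a real test would likely compare against the output of the corresponding tensorflow op.

```python
import unittest

import numpy as np


def inverted_softmax(a):
    """Stand-in for the function body of a new op; returns a 1-tuple."""
    e = np.exp(-a)
    return (np.divide(e, np.sum(e, axis=-1, keepdims=True)),)


class InvertedSoftmaxTestCase(unittest.TestCase):

    def setUp(self):
        self.a = np.random.RandomState(0).rand(5, 10)

    def test_returns_tuple(self):
        self.assertIsInstance(inverted_softmax(self.a), tuple)

    def test_rows_sum_to_one(self):
        (out,) = inverted_softmax(self.a)
        self.assertTrue(np.allclose(out.sum(axis=-1), 1.0))

    def test_input_not_modified(self):
        before = self.a.copy()
        inverted_softmax(self.a)
        self.assertTrue(np.array_equal(self.a, before))
```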
- Source hosted at GitHub
- Report issues, questions, feature requests on GitHub Issues
The MIT License (MIT)
Copyright (c) 2016 Marcel R.
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.