Analyse (and enhance) performances on multiple predictions #67
Hello. TL;DR: no, you are not; onnx-go is not (yet) optimized for performance. Longer answer: anyway, this is pure speculation, and to improve things we need to measure the performance and find out where the bottleneck is. Your example could be a perfect starting point for the analysis and some measurements. Do you mind sharing your complete example with us? On top of that, we can also try to run your tests concurrently via some goroutines to see how it behaves. Concurrency is also a goal, as @blackrez and I would like to be able to run onnx-go inside a web service. |
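As a rough illustration of the concurrency idea, here is a minimal, hypothetical sketch: the `predict` stub below is a stand-in for a real model call (it is an assumption, not onnx-go's API), and the pattern just fans independent batches out over a pool of workers. In Go the equivalent would likely use one model instance per goroutine, since a single computation graph is presumably not safe to run concurrently.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def predict(batch):
    # Hypothetical stand-in for a real inference call; a real predictor
    # (e.g. one model instance per worker) would go here.
    time.sleep(0.01)  # simulate inference latency
    return len(batch)

# Eight independent batches of 100 inputs each.
batches = [list(range(100)) for _ in range(8)]

# Run the batches concurrently and collect the results in order.
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(predict, batches))

print(results)
```

The same structure also gives a cheap way to check whether throughput scales with the number of workers before wiring in a real backend.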
@owulveryck, it's ok to share the model and data. It's only a study. Give me an email address or something? |
Is it huge? Can you copy/paste the python code here? |
I wrote a simple one. Hope it helps. |
A basic performance analysis has started in issue #68 |
I made a very simple test on my machine without actually checking the results; Gorgonia is winning against Tensorflow (if my test is ok):

Python

```python
import keras
from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation
from keras.optimizers import SGD
import keras.engine.saving
import keras.models
import math
import numpy as np
import time
import onnxmltools

model = Sequential()
data_size = 100000
x_train = np.array(np.random.rand(data_size, 39), dtype='float32')
y_train = np.zeros(shape=(data_size, 2), dtype='float32')
model.add(Dense(units=22, input_shape=(39,), activation='tanh'))
model.add(Dense(units=22, activation='tanh'))
model.add(Dense(units=2, activation='tanh'))
sgd = SGD(lr=0.01, momentum=0.9)
model.compile(loss='categorical_crossentropy', optimizer=sgd)
model.train_on_batch(x_train, y_train)

total = 0
for i in range(100):
    x_test = np.array(np.random.rand(data_size, 39), dtype='float32')
    start = time.time()
    y_predict = model.predict(x_test)
    end = time.time()
    total += (end - start)
    print(total / (i + 1))

onnx_model = onnxmltools.convert_keras(model, target_opset=7)
onnxmltools.save_model(onnx_model, "model.onnx")
```

The prediction takes 870ms on average.

Go

This code should run something similar:

```go
datasize := 100000
backend := gorgonnx.NewGraph()
model := onnx.NewModel(backend)
b, err := ioutil.ReadFile(os.Args[1])
if err != nil {
	log.Fatal(err)
}
err = model.UnmarshalBinary(b)
if err != nil {
	log.Fatal(err)
}
var d time.Duration
for i := 0; i < 100; i++ {
	input := tensor.New(tensor.WithShape(datasize, 39), tensor.Of(tensor.Float32), tensor.WithBacking(tensor.Random(tensor.Float32, datasize*39)))
	model.SetInput(0, input)
	t := time.Now()
	err = backend.Run()
	if err != nil {
		log.Fatal(err)
	}
	d += time.Since(t)
	fmt.Println(time.Duration(float64(d) / float64(i+1)))
}
```

It gives 320ms on average (and around 200ms with a patch that I will commit to the tensor package soon). |
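Both snippets above compute a running average of the prediction time by hand. As a side note, that pattern can be factored into a small stdlib-only helper; this sketch is hypothetical (`avg_seconds` is not part of either codebase) and uses `time.perf_counter`, which is monotonic and higher-resolution than `time.time`:

```python
import time

def avg_seconds(fn, runs=100):
    """Call fn() `runs` times and return the mean wall-clock seconds per call."""
    total = 0.0
    for _ in range(runs):
        start = time.perf_counter()  # monotonic, high-resolution clock
        fn()
        total += time.perf_counter() - start
    return total / runs

# Example: time a cheap stand-in workload.
mean = avg_seconds(lambda: sum(range(1000)), runs=50)
print(f"{mean:.6f} s/call")
```

Timing only the call itself, as both original loops do, keeps the random input generation out of the measurement, which is the right choice for comparing backends.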
Try this Keras predict, it's faster. |
A patch has been committed to the tensor package. |
onnx-go improved. My ratio is now 0.07s (keras) vs 0.3s (onnx-go). |
@bitnick10 , would you be kind enough to provide the python code and the Go code you are actually executing to get your results? This would allow us to run the exact same test as you do. Maybe a gist.github.com can do the job. Thanks |
I copied my Keras code and your Go code. I changed this line, which your Keras code was missing: |
You're right, setting it makes the difference. I did some profiling, and the next move will be to enhance the broadcasting (see issue #68). Or maybe we can try to generate a model that does not "compact the tensor" and uses broadcasting for testing, but I don't know if that is possible. |
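The exact changed line is not shown in the thread, but a plausible candidate is Keras's `batch_size` argument: `model.predict` defaults to batches of 32, so predicting 100000 rows pays per-batch overhead thousands of times unless a larger batch size is passed. This stdlib-only sketch is a toy cost model (the `per_batch_overhead` and `per_sample` numbers are made up for illustration, not Keras internals) of why bigger batches win:

```python
def batched_cost(n_samples, batch_size, per_batch_overhead=1.0, per_sample=0.01):
    """Toy cost model: each batch pays a fixed overhead plus per-sample work."""
    n_batches = -(-n_samples // batch_size)  # ceiling division
    return n_batches * per_batch_overhead + n_samples * per_sample

small = batched_cost(100000, 32)      # many batches -> overhead paid 3125 times
large = batched_cost(100000, 100000)  # one batch -> overhead paid once
print(small, large)
```

Under this model the per-sample work is identical in both cases; only the number of times the fixed overhead is paid changes, which matches the observed speedup when the whole test set is predicted in one batch.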
This PR from the tensor package gives good results. |
I have a demo model with 39 inputs. It takes 0.5s to predict 10000 samples using Keras; with onnx-go it takes 5s.
Am I making a mistake here?