recover TF inference for deepJet and deepDoubleX and MXNET inf for deepBoosted (migrated to ONNX in #28112) #28959
assign reconstruction
A new Issue was created by @slava77 Slava Krutelyov. @Dr15Jones, @smuzaffar, @silviodonato, @makortel, @davidlange6, @fabiocos can you please review it and eventually sign/assign? Thanks. cms-bot commands are listed here
@slava77 Thanks for opening the issue. Are you essentially suggesting to continue using ONNX inference on x86 and ARM, and TF on POWER (i.e. not using TF for all architectures, e.g. after #28711)? Technically we could do that with SwitchProducer, but we would need to solve #28576 first (and that will likely take a while).

@mrodozov @smuzaffar We should probably disable building the ONNX-dependent libraries and plugins on POWER anyway.
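For illustration only (this is not actual CMSSW or SwitchProducer code), a minimal sketch of the per-architecture backend dispatch being discussed; the backend labels and the `select_backend` helper are hypothetical:

```python
import platform

# Hypothetical registry mapping machine architectures to an inference
# backend label. ONNX Runtime is assumed available on x86_64/aarch64,
# while ppc64le (POWER) falls back to TensorFlow in this sketch.
BACKENDS = {
    "x86_64": "onnx",
    "aarch64": "onnx",
    "ppc64le": "tensorflow",
}

def select_backend(machine=None):
    """Pick an inference backend for the given (or current) architecture."""
    machine = machine or platform.machine()
    # Unknown architectures default to the TF path, which builds everywhere.
    return BACKENDS.get(machine, "tensorflow")
```

Note that this kind of runtime switch is exactly what makes the configuration hash differ between architectures, which is the concern raised in the next comment.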
@makortel I'm not sure if #28576 fully applies here. IIUC, it is possible to build ONNX on POWER and load the libraries without anything exploding.
I just want to remind that in that case the configuration hash would be different between x86 and POWER (which motivated the SwitchProducer).
Regarding "At the moment the POWER build fails": do I interpret this correctly to mean that cms-externals/onnxruntime#4 (or something like it) is needed to build ONNX for POWER in the first place?
Right, that's what I implied by "it's possible to build ONNX on power".
Thanks for the clarification. I'm not really fond of "hacking an external to build" as a mid/long-term solution. I suppose the added complexity might be justified in the (very) short term if ONNX continues to be "much faster" than TF for these models even after #28711. I'd prefer to find a general solution (#28576), because I'm afraid the situation will only get worse in the future (but I'm happy to be proven wrong).
cms-externals/onnxruntime#4 only allows us to build, but it does not run: the ONNX unit tests failed on PowerPC.
I just realized that this issue should extend to MXNet as well (used in the deepBoosted taggers).
#29172 is open to address this issue. Feedback is welcome! |
I'm closing this for now, after cms-sw/cmsdist#5743 was integrated.
[This issue is contingent on making ONNX work properly on PPC.]
As a part of #28112, support for running DeepFlavour and DeepDoubleX inference using TensorFlow was removed and replaced with ONNX-based inference.
The trained model is the same in both cases.
The migration was done to minimize code replication and to avoid implementing and supporting an adaptor layer that would allow consistent support of both inference engines.
Similarly, the deepBoosted tagger was migrated from MXNet.
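The adaptor layer mentioned above was deliberately avoided; purely as a sketch of what such a layer could look like, here is a minimal interface abstracting over inference engines (all class and method names here are hypothetical, not existing CMSSW code):

```python
from abc import ABC, abstractmethod

class InferenceBackend(ABC):
    """Hypothetical adaptor interface: a single entry point over several
    inference engines (TF, ONNX Runtime, MXNet), so tagger code would not
    depend on which engine is available on a given architecture."""

    @abstractmethod
    def run(self, inputs):
        """Map named input arrays to named output arrays."""

class IdentityBackend(InferenceBackend):
    """Trivial stand-in for a real engine wrapper, useful for testing."""

    def run(self, inputs):
        # A real backend would feed `inputs` to a loaded model here;
        # this stand-in simply echoes them back.
        return dict(inputs)
```

Maintaining such a layer is exactly the support burden the migration to a single engine was meant to avoid.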
However, it was later discovered that ONNX lacks support for the PPC architecture.
Significant effort was made on the CMS side to add support, e.g. in cms-externals/onnxruntime#4, but this did not give numerically correct results at run time.
There is significant motivation to provide PPC support with numerically correct results.
So, the more straightforward solution for now is to keep support for TF-based (and MXNet-based) inference on PPC.
@hqucms @makortel @mrodozov