Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add MXNet prediction tool #4167

Merged

Conversation

hqucms
Copy link
Contributor

@hqucms hqucms commented Jul 3, 2018

This is to add MXNet to CMSSW externals following the discussions in cms-sw/cmssw#21314. It is needed for the integration of the DeepAK8 tagger into CMSSW.

This includes both the C and C++ APIs. MXNet is built in prediction-only mode (MXNET_PREDICT_ONLY=1) here, as so far we only intend to use it for the evaluation of the neural networks, not for training (which is typically done on a GPU w/ the python API). By compiling in the prediction-only mode, MXNet is also set to run in Native Engine mode such that it runs in the master thread instead of creating its own thread pool.

@cmsbuild
Copy link
Contributor

cmsbuild commented Jul 3, 2018

A new Pull Request was created by @hqucms for branch IB/CMSSW_10_2_X/gcc630.

@cmsbuild, @smuzaffar, @gudrutis, @mrodozov can you please review it and eventually sign? Thanks.
You can sign-off by replying to this message having '+1' in the first line of your reply.
You can reject by replying to this message having '-1' in the first line of your reply.

@hqucms
Copy link
Contributor Author

hqucms commented Jul 3, 2018

@gouskos who would also like to follow this thread.

@davidlange6
Copy link
Contributor

please test

@cmsbuild
Copy link
Contributor

cmsbuild commented Jul 3, 2018

The tests are being triggered in jenkins.
https://cmssdt.cern.ch/jenkins/job/ib-any-integration/28993/console

@smuzaffar
Copy link
Contributor

@hqucms , please also update cmssw-tool-conf.spec to depend on mxnet-predict-toolfile. This is needed to get it within cmssw env.

@smuzaffar
Copy link
Contributor

@mrodozov , can you please check if this builds correctly on slc7/aarch64 and , gcc7/gcc8?

@cmsbuild
Copy link
Contributor

cmsbuild commented Jul 3, 2018

Pull request #4167 was updated.

@hqucms
Copy link
Contributor Author

hqucms commented Jul 3, 2018

@smuzaffar Done.

@cmsbuild
Copy link
Contributor

cmsbuild commented Jul 3, 2018

@cmsbuild
Copy link
Contributor

cmsbuild commented Jul 3, 2018

Comparison job queued.

@cmsbuild
Copy link
Contributor

cmsbuild commented Jul 3, 2018

Comparison is ready
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-4167/28993/summary.html

Comparison Summary:

  • No significant changes to the logs found
  • Reco comparison results: 0 differences found in the comparisons
  • DQMHistoTests: Total files compared: 31
  • DQMHistoTests: Total histograms compared: 2899480
  • DQMHistoTests: Total failures: 1
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 2899289
  • DQMHistoTests: Total skipped: 190
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 30 files compared)
  • Checked 128 log files, 14 edm output root files, 31 DQM output files

@mrodozov
Copy link
Contributor

mrodozov commented Jul 5, 2018

tested on slc6_amd64_gcc630,700 slc7_amd64_gcc700,810 slc7_aarch64_gcc700

  • builds fine.

@mrodozov
Copy link
Contributor

mrodozov commented Jul 5, 2018

please test

@cmsbuild
Copy link
Contributor

cmsbuild commented Jul 5, 2018

The tests are being triggered in jenkins.
https://cmssdt.cern.ch/jenkins/job/ib-any-integration/29017/console

@cmsbuild
Copy link
Contributor

cmsbuild commented Jul 5, 2018

@cmsbuild
Copy link
Contributor

cmsbuild commented Jul 5, 2018

Comparison job queued.

@smuzaffar
Copy link
Contributor

+externals

@cmsbuild
Copy link
Contributor

cmsbuild commented Jul 5, 2018

This pull request is fully signed and it will be integrated in one of the next IB/CMSSW_10_2_X/gcc630 IBs (tests are also fine). This pull request will now be reviewed by the release team before it's merged. @davidlange6, @slava77, @smuzaffar, @fabiocos (and backports should be raised in the release meeting by the corresponding L2)

@cmsbuild
Copy link
Contributor

cmsbuild commented Jul 5, 2018

Comparison is ready
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-4167/29017/summary.html

Comparison Summary:

  • No significant changes to the logs found
  • Reco comparison results: 6 differences found in the comparisons
  • DQMHistoTests: Total files compared: 31
  • DQMHistoTests: Total histograms compared: 2899480
  • DQMHistoTests: Total failures: 2
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 2899288
  • DQMHistoTests: Total skipped: 190
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 30 files compared)
  • Checked 128 log files, 14 edm output root files, 31 DQM output files

@slava77
Copy link
Contributor

slava77 commented Jul 10, 2018

@smuzaffar @fabiocos
Please let me know what is the plan for merging this.
Thank you.

@fabiocos
Copy link
Contributor

@slava77 as the build looks functional I think that we have no technical obstacle to integrate it, @smuzaffar please comment in case.

Anyway, looking at the discussions at the root of this request, I would like to better understand the overall strategy, as this is the 3rd deep learning-oriented external tool that we are integrating to my knowledge, besides tensorflow and lwtnn. From the slides of @Dr15Jones and @makortel at the november O&C meeting I understand that the idea is to use the DeepAK8 tagger based on MXNet as a testbed for that approach. As I did not take part to that discussion, I would like to understand whether this request is part of an overall strategy, as it seems. Do we want to have the 3 of them within CMSSW for comparison, with the idea of possibly moving in the longer term towards a single approach?

@smuzaffar
Copy link
Contributor

@fabiocos , yes there are not technical obstacle integrating it. As this is new tool, so we can integrate it now OR we can wait for #4185 (which also include this PR changes) to be fully tested.

@makortel
Copy link
Contributor

My feeling (could be wrong) is that at the moment we don't have enough experience on these tools on the inference side to make a clear choice (are there any other downsides than supporting yet another external?). Also, if we start restricting the inference tools, it should be clearly communicated to the developer community.

@davidlange6
Copy link
Contributor

davidlange6 commented Jul 11, 2018 via email

@fabiocos
Copy link
Contributor

@davidlange6 I agree that general long term strategies goes beyond the single PR discussion. But the problem is practically posed through PRs. This tool was already discussed and mentioned, so ok, but in case more arrives, I think we need to understand where we want to go

@fabiocos
Copy link
Contributor

+1

@cmsbuild cmsbuild merged commit e10e505 into cms-sw:IB/CMSSW_10_2_X/gcc630 Jul 11, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

8 participants