Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Simplify cudacompat layer to use a 1-dimensional grid #586

Conversation

fwyzard
Copy link

@fwyzard fwyzard commented Nov 30, 2020

Remove the possibility of changing the grid size used by the cms::cudacompat layer, and make it a constant equal to {1, 1, 1}.

This avoids a thread-related problem caused by TBB using worker threads where the grid size had not been initialised.

The kernel for pixel clustering need to be rewritten to support a one-dimensional grid to run on the CPU.
Currently they are only used on the GPU in the Patatrack workflows, but they are exercised on the CPU by the gpuClustering_t tests; those tests have been commented out until the kernels can be updated.

Remove the possibility of changing the grid size used by the
cms::cudacompat layer, and make it a constant equal to {1, 1, 1}.

This avoids a thread-related problem caused by TBB using worker threads
where the grid size had not been initialised.

The kernel for pixel clustering need to be rewritten to support a
one-dimensional grid to run on the CPU.
Currently they are only used on the GPU in the Patatrack workflows, but
they are exercised on the CPU by the gpuClustering_t tests; those tests
have been commented out until the kernels can be updated.
@fwyzard fwyzard requested a review from VinInn November 30, 2020 10:56
@fwyzard
Copy link
Author

fwyzard commented Nov 30, 2020

Validation summary

Reference release CMSSW_11_2_0_pre10 at 6c149b2
Development branch cms-patatrack/CMSSW_11_2_X_Patatrack at e454ee0
Testing branch cms-patatrack/CMSSW_11_2_X_Patatrack at e454ee0 with PRs:

Validation plots

/RelValTTbar_14TeV/CMSSW_11_2_0_pre7-PU_112X_mcRun3_2021_realistic_v8-v1/GEN-SIM-DIGI-RAW

  • tracking validation plots and summary for workflow 11634.5
  • tracking validation plots and summary for workflow 11634.501
  • tracking validation plots and summary for workflow 11634.502
  • tracking validation plots and summary for workflow 11634.505
  • tracking validation plots and summary for workflow 11634.506

/RelValZMM_14/CMSSW_11_2_0_pre7-112X_mcRun3_2021_realistic_v8-v2/GEN-SIM-DIGI-RAW

  • tracking validation plots and summary for workflow 11634.5
  • tracking validation plots and summary for workflow 11634.501
  • tracking validation plots and summary for workflow 11634.502
  • tracking validation plots and summary for workflow 11634.505
  • tracking validation plots and summary for workflow 11634.506

/RelValZEE_14/CMSSW_11_2_0_pre7-112X_mcRun3_2021_realistic_v8-v1/GEN-SIM-DIGI-RAW

  • tracking validation plots and summary for workflow 11634.5
  • tracking validation plots and summary for workflow 11634.501
  • tracking validation plots and summary for workflow 11634.502
  • tracking validation plots and summary for workflow 11634.505
  • tracking validation plots and summary for workflow 11634.506

Validation plots (CPU vs GPU)

/RelValTTbar_14TeV/CMSSW_11_2_0_pre7-PU_112X_mcRun3_2021_realistic_v8-v1/GEN-SIM-DIGI-RAW

  • tracking validation plots and summary for workflows 11634.502 and 11634.501
  • tracking validation plots and summary for workflows 11634.506 and 11634.505

/RelValZMM_14/CMSSW_11_2_0_pre7-112X_mcRun3_2021_realistic_v8-v2/GEN-SIM-DIGI-RAW

  • tracking validation plots and summary for workflows 11634.502 and 11634.501
  • tracking validation plots and summary for workflows 11634.506 and 11634.505

/RelValZEE_14/CMSSW_11_2_0_pre7-112X_mcRun3_2021_realistic_v8-v1/GEN-SIM-DIGI-RAW

  • tracking validation plots and summary for workflows 11634.502 and 11634.501
  • tracking validation plots and summary for workflows 11634.506 and 11634.505

Throughput plots

/EphemeralHLTPhysics1/Run2018D-v1/RAW run=323775 lumi=53

scan-136.885502.png
zoom-136.885502.png
scan-136.885512.png
zoom-136.885512.png
scan-136.885522.png
zoom-136.885522.png

logs and nvprof/nvvp profiles

/RelValTTbar_14TeV/CMSSW_11_2_0_pre7-PU_112X_mcRun3_2021_realistic_v8-v1/GEN-SIM-DIGI-RAW

  • reference release, workflow 11634.5
  • development release, workflow 11634.5
  • development release, workflow 11634.501
  • development release, workflow 11634.502
    • ✔️ step3.py: log
    • ✔️ profile.py: log
    • ✔️ cuda-memcheck --tool initcheck (report, log) did not find any errors
    • ✔️ cuda-memcheck --tool memcheck --leak-check full --report-api-errors all (report, log) did not find any errors
    • ✔️ cuda-memcheck --tool synccheck (report, log) did not find any errors
  • development release, workflow 11634.505
  • development release, workflow 11634.506
    • ✔️ step3.py: log
    • ✔️ profile.py: log
    • ✔️ cuda-memcheck --tool initcheck (report, log) did not find any errors
    • ✔️ cuda-memcheck --tool memcheck --leak-check full --report-api-errors all (report, log) did not find any errors
    • ✔️ cuda-memcheck --tool synccheck (report, log) did not find any errors
  • development release, workflow 11634.511
  • development release, workflow 11634.512
    • ✔️ step3.py: log
    • ✔️ profile.py: log
    • ✔️ cuda-memcheck --tool initcheck (report, log) did not find any errors
    • ✔️ cuda-memcheck --tool memcheck --leak-check full --report-api-errors all (report, log) did not find any errors
    • cuda-memcheck --tool synccheck (report, log) found no CUDA-MEMCHECK results
  • development release, workflow 11634.521
  • development release, workflow 11634.522
    • ✔️ step3.py: log
    • ✔️ profile.py: log
    • ✔️ cuda-memcheck --tool initcheck (report, log) did not find any errors
    • ✔️ cuda-memcheck --tool memcheck --leak-check full --report-api-errors all (report, log) did not find any errors
    • ✔️ cuda-memcheck --tool synccheck (report, log) did not find any errors
  • development release, workflow 136.885502
  • development release, workflow 136.885512
  • development release, workflow 136.885522
  • testing release, workflow 11634.5
  • testing release, workflow 11634.501
  • testing release, workflow 11634.502
    • ✔️ step3.py: log
    • ✔️ profile.py: log
    • ✔️ cuda-memcheck --tool initcheck (report, log) did not find any errors
    • ✔️ cuda-memcheck --tool memcheck --leak-check full --report-api-errors all (report, log) did not find any errors
    • ✔️ cuda-memcheck --tool synccheck (report, log) did not find any errors
  • testing release, workflow 11634.505
  • testing release, workflow 11634.506
    • ✔️ step3.py: log
    • ✔️ profile.py: log
    • ✔️ cuda-memcheck --tool initcheck (report, log) did not find any errors
    • ✔️ cuda-memcheck --tool memcheck --leak-check full --report-api-errors all (report, log) did not find any errors
    • ✔️ cuda-memcheck --tool synccheck (report, log) did not find any errors
  • testing release, workflow 11634.511
  • testing release, workflow 11634.512
    • ✔️ step3.py: log
    • ✔️ profile.py: log
    • ✔️ cuda-memcheck --tool initcheck (report, log) did not find any errors
    • ✔️ cuda-memcheck --tool memcheck --leak-check full --report-api-errors all (report, log) did not find any errors
    • cuda-memcheck --tool synccheck (report, log) found no CUDA-MEMCHECK results
  • testing release, workflow 11634.521
  • testing release, workflow 11634.522
    • ✔️ step3.py: log
    • ✔️ profile.py: log
    • ✔️ cuda-memcheck --tool initcheck (report, log) did not find any errors
    • ✔️ cuda-memcheck --tool memcheck --leak-check full --report-api-errors all (report, log) did not find any errors
    • ✔️ cuda-memcheck --tool synccheck (report, log) did not find any errors
  • testing release, workflow 136.885502
  • testing release, workflow 136.885512
  • testing release, workflow 136.885522

/RelValZMM_14/CMSSW_11_2_0_pre7-112X_mcRun3_2021_realistic_v8-v2/GEN-SIM-DIGI-RAW

  • reference release, workflow 11634.5
  • development release, workflow 11634.5
  • development release, workflow 11634.501
  • development release, workflow 11634.502
    • ✔️ step3.py: log
    • ✔️ profile.py: log
    • ✔️ cuda-memcheck --tool initcheck (report, log) did not find any errors
    • ✔️ cuda-memcheck --tool memcheck --leak-check full --report-api-errors all (report, log) did not find any errors
    • ✔️ cuda-memcheck --tool synccheck (report, log) did not find any errors
  • development release, workflow 11634.505
  • development release, workflow 11634.506
    • ✔️ step3.py: log
    • ✔️ profile.py: log
    • ✔️ cuda-memcheck --tool initcheck (report, log) did not find any errors
    • ✔️ cuda-memcheck --tool memcheck --leak-check full --report-api-errors all (report, log) did not find any errors
    • ✔️ cuda-memcheck --tool synccheck (report, log) did not find any errors
  • development release, workflow 11634.511
  • development release, workflow 11634.512
    • ✔️ step3.py: log
    • ✔️ profile.py: log
    • ✔️ cuda-memcheck --tool initcheck (report, log) did not find any errors
    • ✔️ cuda-memcheck --tool memcheck --leak-check full --report-api-errors all (report, log) did not find any errors
    • cuda-memcheck --tool synccheck (report, log) found no CUDA-MEMCHECK results
  • development release, workflow 11634.521
  • development release, workflow 11634.522
    • ✔️ step3.py: log
    • ✔️ profile.py: log
    • ✔️ cuda-memcheck --tool initcheck (report, log) did not find any errors
    • ✔️ cuda-memcheck --tool memcheck --leak-check full --report-api-errors all (report, log) did not find any errors
    • ✔️ cuda-memcheck --tool synccheck (report, log) did not find any errors
  • development release, workflow 136.885502
  • development release, workflow 136.885512
  • development release, workflow 136.885522
  • testing release, workflow 11634.5
  • testing release, workflow 11634.501
  • testing release, workflow 11634.502
    • ✔️ step3.py: log
    • ✔️ profile.py: log
    • ✔️ cuda-memcheck --tool initcheck (report, log) did not find any errors
    • ✔️ cuda-memcheck --tool memcheck --leak-check full --report-api-errors all (report, log) did not find any errors
    • ✔️ cuda-memcheck --tool synccheck (report, log) did not find any errors
  • testing release, workflow 11634.505
  • testing release, workflow 11634.506
    • ✔️ step3.py: log
    • ✔️ profile.py: log
    • ✔️ cuda-memcheck --tool initcheck (report, log) did not find any errors
    • ✔️ cuda-memcheck --tool memcheck --leak-check full --report-api-errors all (report, log) did not find any errors
    • ✔️ cuda-memcheck --tool synccheck (report, log) did not find any errors
  • testing release, workflow 11634.511
  • testing release, workflow 11634.512
    • ✔️ step3.py: log
    • ✔️ profile.py: log
    • ✔️ cuda-memcheck --tool initcheck (report, log) did not find any errors
    • ✔️ cuda-memcheck --tool memcheck --leak-check full --report-api-errors all (report, log) did not find any errors
    • cuda-memcheck --tool synccheck (report, log) found no CUDA-MEMCHECK results
  • testing release, workflow 11634.521
  • testing release, workflow 11634.522
    • ✔️ step3.py: log
    • ✔️ profile.py: log
    • ✔️ cuda-memcheck --tool initcheck (report, log) did not find any errors
    • ✔️ cuda-memcheck --tool memcheck --leak-check full --report-api-errors all (report, log) did not find any errors
    • ✔️ cuda-memcheck --tool synccheck (report, log) did not find any errors
  • testing release, workflow 136.885502
  • testing release, workflow 136.885512
  • testing release, workflow 136.885522

/RelValZEE_14/CMSSW_11_2_0_pre7-112X_mcRun3_2021_realistic_v8-v1/GEN-SIM-DIGI-RAW

  • reference release, workflow 11634.5
  • development release, workflow 11634.5
  • development release, workflow 11634.501
  • development release, workflow 11634.502
    • ✔️ step3.py: log
    • ✔️ profile.py: log
    • ✔️ cuda-memcheck --tool initcheck (report, log) did not find any errors
    • ✔️ cuda-memcheck --tool memcheck --leak-check full --report-api-errors all (report, log) did not find any errors
    • ✔️ cuda-memcheck --tool synccheck (report, log) did not find any errors
  • development release, workflow 11634.505
  • development release, workflow 11634.506
    • ✔️ step3.py: log
    • ✔️ profile.py: log
    • ✔️ cuda-memcheck --tool initcheck (report, log) did not find any errors
    • ✔️ cuda-memcheck --tool memcheck --leak-check full --report-api-errors all (report, log) did not find any errors
    • ✔️ cuda-memcheck --tool synccheck (report, log) did not find any errors
  • development release, workflow 11634.511
  • development release, workflow 11634.512
    • ✔️ step3.py: log
    • ✔️ profile.py: log
    • ✔️ cuda-memcheck --tool initcheck (report, log) did not find any errors
    • ✔️ cuda-memcheck --tool memcheck --leak-check full --report-api-errors all (report, log) did not find any errors
    • cuda-memcheck --tool synccheck (report, log) found no CUDA-MEMCHECK results
  • development release, workflow 11634.521
  • development release, workflow 11634.522
    • ✔️ step3.py: log
    • ✔️ profile.py: log
    • ✔️ cuda-memcheck --tool initcheck (report, log) did not find any errors
    • ✔️ cuda-memcheck --tool memcheck --leak-check full --report-api-errors all (report, log) did not find any errors
    • ✔️ cuda-memcheck --tool synccheck (report, log) did not find any errors
  • development release, workflow 136.885502
  • development release, workflow 136.885512
  • development release, workflow 136.885522
  • testing release, workflow 11634.5
  • testing release, workflow 11634.501
  • testing release, workflow 11634.502
    • ✔️ step3.py: log
    • ✔️ profile.py: log
    • ✔️ cuda-memcheck --tool initcheck (report, log) did not find any errors
    • ✔️ cuda-memcheck --tool memcheck --leak-check full --report-api-errors all (report, log) did not find any errors
    • ✔️ cuda-memcheck --tool synccheck (report, log) did not find any errors
  • testing release, workflow 11634.505
  • testing release, workflow 11634.506
    • ✔️ step3.py: log
    • ✔️ profile.py: log
    • ✔️ cuda-memcheck --tool initcheck (report, log) did not find any errors
    • ✔️ cuda-memcheck --tool memcheck --leak-check full --report-api-errors all (report, log) did not find any errors
    • ✔️ cuda-memcheck --tool synccheck (report, log) did not find any errors
  • testing release, workflow 11634.511
  • testing release, workflow 11634.512
    • ✔️ step3.py: log
    • ✔️ profile.py: log
    • ✔️ cuda-memcheck --tool initcheck (report, log) did not find any errors
    • ✔️ cuda-memcheck --tool memcheck --leak-check full --report-api-errors all (report, log) did not find any errors
    • cuda-memcheck --tool synccheck (report, log) found no CUDA-MEMCHECK results
  • testing release, workflow 11634.521
  • testing release, workflow 11634.522
    • ✔️ step3.py: log
    • ✔️ profile.py: log
    • ✔️ cuda-memcheck --tool initcheck (report, log) did not find any errors
    • ✔️ cuda-memcheck --tool memcheck --leak-check full --report-api-errors all (report, log) did not find any errors
    • ✔️ cuda-memcheck --tool synccheck (report, log) did not find any errors
  • testing release, workflow 136.885502
  • testing release, workflow 136.885512
  • testing release, workflow 136.885522

Logs

The full log is available at https://patatrack.web.cern.ch/patatrack/validation/pulls/6fb888134e1c0c96596810e45bca76019e19f734/log .

@fwyzard
Copy link
Author

fwyzard commented Nov 30, 2020

Should fix #564

@VinInn
Copy link

VinInn commented Nov 30, 2020

if compiles and run the code is ok with me.
CPU workflows may even go faster w/o TLS

Copy link
Author

@fwyzard fwyzard left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove unnecessary asserts.

Co-authored-by: Matti Kortelainen <matti.kortelainen@cern.ch>
@fwyzard
Copy link
Author

fwyzard commented Dec 1, 2020

All crashes in the CPU workflows are indeed fixed.

No changes to the physics performance or throughput, as expected.

@fwyzard fwyzard merged commit 9de7905 into cms-patatrack:CMSSW_11_2_X_Patatrack Dec 1, 2020
@fwyzard fwyzard deleted the cudacompat_one_dimensional_grid branch December 1, 2020 01:22
fwyzard pushed a commit that referenced this pull request Dec 25, 2020
Remove the possibility of changing the grid size used by the
cms::cudacompat layer, and make it a constant equal to {1, 1, 1}.

This avoids a thread-related problem caused by TBB using worker threads
where the grid size had not been initialised.

The kernel for pixel clustering need to be rewritten to support a
one-dimensional grid to run on the CPU.
Currently they are only used on the GPU in the Patatrack workflows, but
they are exercised on the CPU by the gpuClustering_t tests; those tests
have been commented out until the kernels can be updated.
fwyzard added a commit that referenced this pull request Dec 26, 2020
Remove the possibility of changing the grid size used by the
cms::cudacompat layer, and make it a constant equal to {1, 1, 1}.

This avoids a thread-related problem caused by TBB using worker threads
where the grid size had not been initialised.

The kernel for pixel clustering need to be rewritten to support a
one-dimensional grid to run on the CPU.
Currently they are only used on the GPU in the Patatrack workflows, but
they are exercised on the CPU by the gpuClustering_t tests; those tests
have been commented out until the kernels can be updated.
fwyzard pushed a commit that referenced this pull request Dec 29, 2020
Remove the possibility of changing the grid size used by the
cms::cudacompat layer, and make it a constant equal to {1, 1, 1}.

This avoids a thread-related problem caused by TBB using worker threads
where the grid size had not been initialised.

The kernel for pixel clustering need to be rewritten to support a
one-dimensional grid to run on the CPU.
Currently they are only used on the GPU in the Patatrack workflows, but
they are exercised on the CPU by the gpuClustering_t tests; those tests
have been commented out until the kernels can be updated.
fwyzard pushed a commit that referenced this pull request Dec 29, 2020
Remove the possibility of changing the grid size used by the
cms::cudacompat layer, and make it a constant equal to {1, 1, 1}.

This avoids a thread-related problem caused by TBB using worker threads
where the grid size had not been initialised.

The kernel for pixel clustering need to be rewritten to support a
one-dimensional grid to run on the CPU.
Currently they are only used on the GPU in the Patatrack workflows, but
they are exercised on the CPU by the gpuClustering_t tests; those tests
have been commented out until the kernels can be updated.
fwyzard added a commit that referenced this pull request Dec 29, 2020
Remove the possibility of changing the grid size used by the
cms::cudacompat layer, and make it a constant equal to {1, 1, 1}.

This avoids a thread-related problem caused by TBB using worker threads
where the grid size had not been initialised.

The kernel for pixel clustering need to be rewritten to support a
one-dimensional grid to run on the CPU.
Currently they are only used on the GPU in the Patatrack workflows, but
they are exercised on the CPU by the gpuClustering_t tests; those tests
have been commented out until the kernels can be updated.
fwyzard added a commit that referenced this pull request Dec 29, 2020
Remove the possibility of changing the grid size used by the
cms::cudacompat layer, and make it a constant equal to {1, 1, 1}.

This avoids a thread-related problem caused by TBB using worker threads
where the grid size had not been initialised.

The kernel for pixel clustering need to be rewritten to support a
one-dimensional grid to run on the CPU.
Currently they are only used on the GPU in the Patatrack workflows, but
they are exercised on the CPU by the gpuClustering_t tests; those tests
have been commented out until the kernels can be updated.
fwyzard added a commit that referenced this pull request Dec 29, 2020
Remove the possibility of changing the grid size used by the
cms::cudacompat layer, and make it a constant equal to {1, 1, 1}.

This avoids a thread-related problem caused by TBB using worker threads
where the grid size had not been initialised.

The kernel for pixel clustering need to be rewritten to support a
one-dimensional grid to run on the CPU.
Currently they are only used on the GPU in the Patatrack workflows, but
they are exercised on the CPU by the gpuClustering_t tests; those tests
have been commented out until the kernels can be updated.
makortel added a commit to makortel/pixeltrack-standalone that referenced this pull request Dec 29, 2020
makortel added a commit to makortel/pixeltrack-standalone that referenced this pull request Dec 29, 2020
fwyzard pushed a commit that referenced this pull request Jan 13, 2021
Remove the possibility of changing the grid size used by the
cms::cudacompat layer, and make it a constant equal to {1, 1, 1}.

This avoids a thread-related problem caused by TBB using worker threads
where the grid size had not been initialised.

The kernel for pixel clustering need to be rewritten to support a
one-dimensional grid to run on the CPU.
Currently they are only used on the GPU in the Patatrack workflows, but
they are exercised on the CPU by the gpuClustering_t tests; those tests
have been commented out until the kernels can be updated.
fwyzard added a commit that referenced this pull request Jan 15, 2021
Remove the possibility of changing the grid size used by the
cms::cudacompat layer, and make it a constant equal to {1, 1, 1}.

This avoids a thread-related problem caused by TBB using worker threads
where the grid size had not been initialised.

The kernel for pixel clustering need to be rewritten to support a
one-dimensional grid to run on the CPU.
Currently they are only used on the GPU in the Patatrack workflows, but
they are exercised on the CPU by the gpuClustering_t tests; those tests
have been commented out until the kernels can be updated.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug-fix Pixels Pixels-related developments
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants