This repository was archived by the owner on Dec 9, 2024. It is now read-only.

Precompute custom layer variable #297

Merged
YonsiG merged 1 commit into master from precomputeVariables_master
Jul 7, 2023

Conversation

@VourMa
Contributor

@VourMa VourMa commented Jun 23, 2023

As per the title: we identified some variables that were recomputed inside the kernels even though they are properties of the modules and can therefore be computed up front. These variables are used in the Triplet and Quintuplet kernels.
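To illustrate the idea, here is a minimal host-side C++ sketch of the pattern, not the code from this PR. The `ModuleTable` struct, its field names, the `computeCustomLayer` formula, and the offset of 6 are all hypothetical stand-ins: the point is only that a quantity depending solely on module properties is filled in once when the modules are loaded, and the kernel body then reads it with a single lookup.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical module table; the field names are illustrative only.
struct ModuleTable {
    std::vector<std::int16_t> layer;        // layer number within the subdetector
    std::vector<bool> isEndcap;             // true for endcap modules
    std::vector<std::int16_t> customLayer;  // derived once, then read by the kernels
};

// Illustrative formula (an assumption): shift endcap layers past the barrel ones.
inline std::int16_t computeCustomLayer(std::int16_t layer, bool isEndcap) {
    return static_cast<std::int16_t>(layer + (isEndcap ? 6 : 0));
}

// Runs once when the modules are loaded, outside the per-event kernels.
void precomputeCustomLayers(ModuleTable& modules) {
    modules.customLayer.resize(modules.layer.size());
    for (std::size_t i = 0; i < modules.layer.size(); ++i) {
        modules.customLayer[i] =
            computeCustomLayer(modules.layer[i], modules.isEndcap[i]);
    }
}

// Stand-in for the body of a kernel thread. Before the change it would call
// computeCustomLayer(...) itself; afterwards it only loads the stored value.
inline std::int16_t customLayerInKernel(const ModuleTable& modules,
                                        std::size_t moduleIndex) {
    return modules.customLayer[moduleIndex];
}
```

In a GPU setting the precomputed array would be copied to the device together with the other module data, so each thread replaces a few dependent loads and arithmetic operations with one load.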

On cgpu-1 (A30)
Before:
image

After:
image

I guess the timing difference is within the usual run-to-run variation.

Stall reduction
Before (T3):
image

After (T3):
image

Before (T5):
image

After (T5):
image

@VourMa VourMa requested review from GNiendorf and YonsiG June 30, 2023 15:13
@GNiendorf
Member

GNiendorf commented Jun 30, 2023

What's the argument for precomputing these? I forget, but the modules.cu stuff isn't included in the timing, right? So are we expecting this quantity to already be on the GPU when we integrate into CMSSW?

Edit: Never mind, we discussed this in the meeting, and I agree that including the modules.cu calculations in the timing should be handled in a separate PR. It would be outside the scope of this one.

@GNiendorf
Member

GNiendorf commented Jun 30, 2023

Is there any benefit to the kernel timing from the reduced stalls btw? From the profiler.

@VourMa
Contributor Author

VourMa commented Jun 30, 2023

> Is there any benefit to the kernel timing from the reduced stalls btw? From the profiler.

Yes, about 7%: from 1.98 ms to 1.84 ms on the A100 of the NVIDIA cluster.

@YonsiG
Contributor

YonsiG commented Jul 7, 2023

Thank you for these checks, Manos! Gavin has asked some good questions, and I think the commit is clear for me to merge.

@YonsiG YonsiG merged commit 37d9d3c into master Jul 7, 2023
Comment thread SDL/Quintuplet.cu
@ariostas ariostas deleted the precomputeVariables_master branch May 8, 2024 21:06