Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Faster LBL derivatives? #342

Merged
merged 3 commits into from
Apr 29, 2021
Merged

Faster LBL derivatives? #342

merged 3 commits into from
Apr 29, 2021

Conversation

riclarsson
Copy link
Contributor

Here's my output from the script Oliver provided in a currently open issue:

Time per pressure/temperature combination
CASE 1 - All species w/ jacobian ArrayOfSpeciesTagVMR: 1.586376s
CASE 2 - All species w/ jacobian Line::VMR : 3.782718s
CASE 3 - All species w/o jacobian : 1.323047s
CASE 4 - One species at a time : 1.330896s

@olemke
Copy link
Member

olemke commented Apr 28, 2021

I reran case 1 and 4 through the profiler. The remaining difference now comes down to the extra time spent in ComputeValues::operator-= for the derivates.

Case 1
case1_5000_4

Case 4
case4_5000_4

@riclarsson
Copy link
Contributor Author

Thank you for running the profiler, Oliver. I will have a look. It is definitely doing more than it should for case 1. I think a similar cut-down in time should be possible as for frequency_loop, but I need to check. (My initial quick-and-dirty stuff just now didn't solve it, so I need to have a more detailed look when there's time.)

@riclarsson
Copy link
Contributor Author

New times are slightly faster. For the same case as above (5000 frequencies):

Time per pressure/temperature combination
CASE 1 - All species w/ jacobian ArrayOfSpeciesTagVMR: 1.486118s
CASE 2 - All species w/ jacobian Line::VMR           : 3.820726s
CASE 3 - All species w/o jacobian                    : 1.339322s
CASE 4 - One species at a time                       : 1.349771s

Also for 100,000 frequencies:

Time per pressure/temperature combination
CASE 1 - All species w/ jacobian ArrayOfSpeciesTagVMR: 21.045467s
CASE 2 - All species w/ jacobian Line::VMR           : 64.153041s
CASE 3 - All species w/o jacobian                    : 20.579016s
CASE 4 - One species at a time                       : 20.542538s

In #339 the case 1 to 4 ratio was 150-200%. It is 2-10% now. The case 2 to 4 ratio is pretty much the same as before since it is computing the full derivative.

@riclarsson riclarsson marked this pull request as ready for review April 29, 2021 09:49
@riclarsson
Copy link
Contributor Author

I'm marking this as ready for review since the previous tests showed where time was being spent. I think the small 'regression' in case 2/3/4 is just noise. They should not be seeing any difference that could cost time now cf the first time comparison of this PR (done at the commit before the latest)

@olemke
Copy link
Member

olemke commented Apr 29, 2021

I agree that the slight differences in case 2-4 are negligible. I see basically the same times for 3/4 and 2 scores even slightly lower.

Before cc84172
CASE 1: 1.273507
CASE 2: 2.751986
CASE 3: 1.070707
CASE 4: 1.074574

After cc84172
CASE 1: 1.129954
CASE 2: 2.709163
CASE 3: 1.071539
CASE 4: 1.075069

@olemke olemke merged commit 8025f55 into atmtools:master Apr 29, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants