
Performance Improvements and Code Simplification #494

Merged
merged 47 commits into from Jul 6, 2023

Conversation

RHammond2
Collaborator

Feature or improvement description
This pull request includes some minor to moderate speedups in the following models, with the likely biggest speedups occurring for a GCH analysis:

  • Gauss wake deflection
  • Crespo Hernandez wake turbulence
  • Gauss wake velocity
  • Jensen wake velocity
  • TurbOPark wake velocity

The methods used to accomplish this include:

  • changing np.array(boolean_operation) to boolean_operation to reduce the number of array creations
  • reducing the number of stored variables, especially for single-use variables
  • reassigning variables in place rather than creating new intermediate arrays
  • chunking large methods into multiple methods, especially those where the code is repeated (including calculations with minor variations)
  • more usage of the NumExpr library for large operations
  • list comprehensions in favor of loops and nested loops

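A minimal sketch of the first technique, using hypothetical data: a comparison between NumPy arrays (or an array and a scalar) already returns a boolean `ndarray`, so wrapping it in `np.array(...)` only adds a redundant copy.

```python
import numpy as np

u = np.linspace(0.0, 10.0, 5)  # illustrative wind-speed-like values

# Redundant: the comparison already yields a boolean ndarray,
# so np.array(...) just creates an extra copy of it.
mask_copy = np.array(u > 4.0)

# Equivalent without the extra array creation:
mask = u > 4.0

assert isinstance(mask, np.ndarray)
assert mask.dtype == np.bool_
assert np.array_equal(mask, mask_copy)
```

For large flow-field arrays, dropping the wrapper saves one full-array allocation per boolean operation.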
Related issue, if one exists

Impacted areas of the software

This impacts a large portion of the code base, especially given the use of the pre-commit workflow, which seems to have been turned off somewhere along the way.

Additional supporting information

Test results, if applicable

@RHammond2 RHammond2 added the enhancement An improvement of an existing feature label Sep 2, 2022
@RHammond2 RHammond2 requested review from bayc, rafmudaf and misi9170 and removed request for bayc September 2, 2022 23:36
@RHammond2
Collaborator Author

@rafmudaf I fixed up a merge conflict, and resolved the following based on our conversation:

  • Where black changed masking from variable[i:i+1] to variable[i : i + 1], I've put it back
  • I made all the exponents consistent to ** throughout the code
  • Any method that black put on multiple lines, I've reformatted to be of our previously discussed format of a single parameter per line

This should make the diff much easier to follow and keep it consistent with the expected formatting (aside from a handful of undiscussed or missed changes).

rafmudaf and others added 4 commits September 19, 2022 17:27
These are super useful for both memory and performance (runtime) profiling
@RHammond2
Collaborator Author

@rafmudaf this is up to date with the latest formatting changes. The only thing not passing the linter is a file that should be deleted before merging but shows some of the changes.

v_top = (Gamma_top * zT) / (2 * np.pi * rT) * core_shape
v_top = np.mean( v_top, axis=(3,4) )
# w_top = (-1 * Gamma_top * yLocs) / (2 * np.pi * rT) * core_shape * decay
v_top, _ = _calculate_vortex(z_i, HH, D, yLocs, eps, Gamma_top, which="top")
rafmudaf
Collaborator

@RHammond2 I see and appreciate your intent here. In the original code, there's a lot of repeat and it can clearly be consolidated into a function. However, I find that readability is reduced, in this particular case. In comparing to the paper, you can follow along with what the original code is calculating, but as a reader you have to parse the new function and map it back to the variables in the calling code. Additionally, it was really helpful to print and plot different terms from each vortex calculation during development, so it would be good to preserve that functionality for future debugging. Instead of consolidation to a function, I propose to incorporate numexpr into all of the terms. I'm happy to make those changes, but do you agree with that suggestion?
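A hedged sketch of the proposed alternative: keep each vortex term as its own named, inspectable variable (so it can still be printed and plotted against the paper during debugging), but evaluate the arithmetic with numexpr in a single pass. The variable names and values below are illustrative stand-ins, loosely modeled on the quoted snippet, not the actual FLORIS inputs.

```python
import numpy as np
import numexpr as ne

# Hypothetical stand-ins for the vortex-term inputs; shapes and
# values are illustrative only.
Gamma_top = 2.0
zT = np.random.default_rng(0).random((3, 4))
rT = zT + 1.0  # offset to avoid division by zero
core_shape = np.ones_like(zT)
pi = np.pi

# Plain-NumPy form of one term (mirrors the original code):
v_plain = (Gamma_top * zT) / (2 * np.pi * rT) * core_shape

# Same term evaluated by numexpr in one fused pass, avoiding
# the intermediate temporaries NumPy would allocate:
v_ne = ne.evaluate("(Gamma_top * zT) / (2 * pi * rT) * core_shape")

assert np.allclose(v_plain, v_ne)
```

This keeps the term-by-term correspondence with the paper while still cutting allocations for large arrays.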

RHammond2
Collaborator Author

Thanks for the feedback, @rafmudaf! I'm totally fine with switching this back given that it helps align the methods with the paper that most users would be familiar with. If you're feeling particularly motivated, then you're more than welcome to put this back in place, otherwise, I'm ok with switching it back myself.

Also removes the implementation of the model that doesn't use numexpr. This was helpful in initial development of v3, but it's no longer practical.
ground=True,
)

V = np.add.reduce((V1, V2, V3, V4, V5, V6))
rafmudaf
Collaborator

@RHammond2 Can you explain the benefit of using np.add.reduce with the array sequence instead of np.add or the np.array + operator?

Collaborator Author

@RHammond2 RHammond2 Jul 5, 2023


At this point, I can't remember why that's there. My hunch is that it was faster, or more convenient for some use case I was working on, but I've lost that thread entirely at this point. However, after testing out the snippet on small and large arrays, I found it's quite a bit slower compared to V1 + V2 + V3 + V4 + V5 + V6. The only reason to use np.add.reduce instead of np.add is to ensure the addition is handled consistently.

My vote is to put it back to original arithmetic, and if whatever I was attempting to do at that point comes back to me, it can be handled then.
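For reference, the two forms being compared are numerically equivalent: `np.add.reduce` stacks the sequence into one array and sums along the first axis, which matches the plain chained addition. The arrays below are hypothetical placeholders (no timing claims asserted, only equivalence).

```python
import numpy as np

rng = np.random.default_rng(42)
V1, V2, V3, V4, V5, V6 = (rng.random((2, 3)) for _ in range(6))

# np.add.reduce treats the tuple as a stacked array and sums
# along axis 0...
total_reduce = np.add.reduce((V1, V2, V3, V4, V5, V6))

# ...which is equivalent to the plain arithmetic that, per the
# discussion above, was measured to be faster here.
total_plain = V1 + V2 + V3 + V4 + V5 + V6

assert total_reduce.shape == (2, 3)
assert np.allclose(total_reduce, total_plain)
```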

rafmudaf
Collaborator

Ok cool, thanks for following up. I'll revert that now but we can keep it on our radar for the future

This change should be revisited. There’s good changes, especially putting the vortex calculations in functions, but it should more directly relate to the model papers
@rafmudaf
Collaborator

rafmudaf commented Jul 5, 2023

@RHammond2 I removed a few more places where np.array is redundant. Could you please review this commit: 879e613?

@rafmudaf
Collaborator

rafmudaf commented Jul 5, 2023

I've also timed a 3-turbine array on Eagle, and we see pretty good results:

Commit hash         Jensen   Gauss    GCH      CC
16628a0 (v3.3)      0.2694   0.9297   1.4587   1.6214
5e5bb7f (v3.4)      0.2553   0.8701   1.4324   1.5948
c7a3912 (this PR)   0.2915   0.8588   1.0687   1.2973

This pull request speeds up GCH by about 25% and CC by about 18%.

@rafmudaf rafmudaf added this to the v3.5 milestone Jul 5, 2023
@rafmudaf rafmudaf self-assigned this Jul 5, 2023
@paulf81
Collaborator

paulf81 commented Jul 6, 2023

Really nice result!

@RHammond2
Collaborator Author

@RHammond2 I removed a few more places where np.array is redundant. Could you please review this commit: 879e613?

@rafmudaf this looks good to me. Basically as long as one of the items is an array, the result is always going to be an array.
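A small illustration of that rule with placeholder values: under NumPy broadcasting, any arithmetic in which at least one operand is an `ndarray` already produces an `ndarray`, so an enclosing `np.array(...)` is redundant.

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])

# No np.array(...) wrapper needed: as soon as one operand is an
# ndarray, broadcasting makes the result an ndarray as well.
result = 2.0 * a + 1.0

assert isinstance(result, np.ndarray)
```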

@rafmudaf
Collaborator

rafmudaf commented Jul 6, 2023

Ok got ya. I made note of that in the Proposed Style Guide Discussion. The other changes here are less syntactic and more algorithmic, but this one seems easy to enforce more broadly.

@rafmudaf rafmudaf merged commit 67104dd into NREL:develop Jul 6, 2023
5 checks passed
@RHammond2 RHammond2 deleted the test/speedups branch July 6, 2023 16:30
@rafmudaf rafmudaf mentioned this pull request Oct 26, 2023
4 tasks
Successfully merging this pull request may close these issues.

Remove unnecessary variable assignment in model functions and core simulation architecture