cosine bell in pr test suite "takes too long" #241
On Chrysalis on 2 nodes, the cosine bell test takes 3:48 out of 16:22 total for the pr test suite, which seemed acceptable to me. However, if we need to reduce the highest resolution(s) from 60 km to 90 km or 120 km, that might be okay. @mark-petersen, can you clarify what would need to change for this test to be usable for you? @vanroekel, would reducing the highest resolution be acceptable or would it negate the usefulness of the test? Obviously, we will improve performance in the long run using parsl but that project will need at least several more months.
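For reference, the share of the suite taken by this one test can be worked out from the two timings quoted above. This is only a small arithmetic sketch; the helper name `to_seconds` is made up for illustration and is not part of any tool discussed here.

```python
def to_seconds(mmss):
    """Convert a 'M:SS' timing string to seconds (hypothetical helper)."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)

# Timings quoted in the comment above (Chrysalis, 2 nodes).
cosine_bell = to_seconds("3:48")
pr_suite = to_seconds("16:22")

# Fraction of the pr suite wall-clock time spent in the cosine bell test.
fraction = cosine_bell / pr_suite
print(f"cosine bell share of pr suite: {fraction:.1%}")
```

So the test accounts for a bit under a quarter of the suite's wall-clock time on that machine.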
@xylar thanks for posting this. My preference is the 150km. Then we still have four resolutions, which is fine for a convergence test:
The QU120 forward step takes 2:43 in debug and 1:00 with gnu optimized on Grizzly. That doesn't seem worth it for a fifth point.
The problem is that the error at coarse resolution saturates near 1.0 so this statement isn't really accurate. You might determine that the order of convergence is lower than it really is if you use resolutions that are too coarse. As a result, you might be more tolerant of errors that reduce the order of convergence in the future. So I hesitate to rely on such coarse resolutions. I think 3 minutes is still an acceptable amount of time for the test to take.
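To illustrate the saturation argument, here is a minimal sketch with synthetic numbers, not output from the actual cosine bell test. It assumes a made-up error model that is second order at fine resolution and levels off at 1.0 at coarse resolution, plus an assumed list of resolutions in km. The function names and the `scale_km` parameter are hypothetical.

```python
import math

def fit_order(resolutions, errors):
    """Least-squares slope of log(error) against log(resolution)."""
    xs = [math.log(h) for h in resolutions]
    ys = [math.log(e) for e in errors]
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    num = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    den = sum((x - x_mean) ** 2 for x in xs)
    return num / den

def synthetic_error(h_km, scale_km=120.0):
    """Hypothetical error model: ~h^2 for small h, saturating at 1 for large h."""
    return 1.0 - math.exp(-((h_km / scale_km) ** 2))

# Assumed resolutions in km; the errors are synthetic, not measured.
resolutions = [60, 90, 120, 150, 180, 210, 240]
errors = [synthetic_error(h) for h in resolutions]

order_all = fit_order(resolutions, errors)               # all seven points
order_coarse = fit_order(resolutions[-4:], errors[-4:])  # 150 km and coarser
order_fine = fit_order(resolutions[:3], errors[:3])      # 60, 90, 120 km

print(f"all: {order_all:.2f}  coarse four: {order_coarse:.2f}  fine three: {order_fine:.2f}")
```

In this toy model the underlying order is 2, yet the fit over the four coarsest points comes out well below the fit over the finest points, which is the kind of underestimate described above. How strong the effect is in the real test depends on where the error actually saturates.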
I don't think we want to run the @vanroekel, could you weigh in? @ambrad, do you have comments?
Sorry to be a bit pedantic/negative, but I personally don't think 3:48 is a huge burden for a test. My view of this suite is that it only gets run when a PR is made and only needs to be run once. 16 min of testing doesn't seem onerous to me, especially given it's only 2 nodes, so it seems it won't get in the way of other simulations/testing. I also agree with @xylar that we don't want to run the pr suite in debug. It seems more appropriate for the things nightly catches. Finally, looking at the plot here -- #111 -- I'm not convinced either that we can get away with fewer resolutions, especially just 4.
This plot shows what I'm talking about with regard to error saturating at coarse resolution: And I fully agree with @vanroekel having just looked at the plot at #111 (comment) that I don't think the 4 coarsest resolutions can be trusted to give an accurate estimate of the rate of convergence.
@mark-petersen, I realize Grizzly is slower than Chrysalis. What kind of timing do you see in optimized for the full test case? If it's too expensive, maybe it's time to move your testing to Chrysalis or use more nodes or something?
OK, looks like I should change my testing method to use a light-weight scheme up until the final one, and try to use
I'd think there's no harm in using
@mark-petersen mentioned in E3SM-Project/E3SM#4552 (comment)
We should make the necessary changes so that the cosine bell test doesn't have to be skipped.