Cleaning up tests for efficiency #1990

Closed
tomchor opened this issue Sep 19, 2021 · 5 comments
Labels
performance 🏍️ So we can get the wrong answer even faster

Comments

@tomchor
Collaborator

tomchor commented Sep 19, 2021

I was going quickly through some tests for a PR and found more than one instance of something like this:

for arch in archs
    # Some tests can reuse this same grid and model.
    topo = (Periodic, Periodic, Bounded)
    grid = RegularRectilinearGrid(topology=topo, size=(4, 4, 4), extent=(1, 1, 1))
    model = NonhydrostaticModel(architecture=arch, grid=grid)

    @testset "WindowedTimeAverage [$(typeof(arch))]" begin
        @info " Testing WindowedTimeAverage [$(typeof(arch))]..."
        @test instantiate_windowed_time_average(model)
        @test time_step_with_windowed_time_average(model)
        @test_throws ArgumentError AveragedTimeInterval(1.0, window=1.1)
    end
end

include("test_netcdf_output_writer.jl")
include("test_jld2_output_writer.jl")
include("test_checkpointer.jl")

for arch in archs
    topo = (Periodic, Periodic, Bounded)
    grid = RegularRectilinearGrid(topology=topo, size=(4, 4, 4), extent=(1, 1, 1))
    model = NonhydrostaticModel(architecture=arch, grid=grid)

    @testset "Dependency adding [$(typeof(arch))]" begin
        @info " Testing dependency adding [$(typeof(arch))]..."
        test_dependency_adding(model)
    end

    @testset "Time averaging of output [$(typeof(arch))]" begin
        @info " Testing time averaging of output [$(typeof(arch))]..."
        test_windowed_time_averaging_simulation(model)
    end
end

Here, unless I'm missing something, we run a couple of unnecessary loops: I believe we're creating four models when we could be creating only two (one per architecture).
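
For illustration, a sketch of what the consolidated version might look like, merging the two loops so each architecture builds a single grid and model. This assumes none of these tests mutate the model in a way the later testsets care about, and that running the included files after (rather than between) the loops is harmless:

for arch in archs
    # One grid and model per architecture, shared by all three testsets below.
    topo = (Periodic, Periodic, Bounded)
    grid = RegularRectilinearGrid(topology=topo, size=(4, 4, 4), extent=(1, 1, 1))
    model = NonhydrostaticModel(architecture=arch, grid=grid)

    @testset "WindowedTimeAverage [$(typeof(arch))]" begin
        @test instantiate_windowed_time_average(model)
        @test time_step_with_windowed_time_average(model)
        @test_throws ArgumentError AveragedTimeInterval(1.0, window=1.1)
    end

    @testset "Dependency adding [$(typeof(arch))]" begin
        test_dependency_adding(model)
    end

    @testset "Time averaging of output [$(typeof(arch))]" begin
        test_windowed_time_averaging_simulation(model)
    end
end

# The output-writer test files, unchanged; assuming their position
# relative to the loops doesn't matter.
include("test_netcdf_output_writer.jl")
include("test_jld2_output_writer.jl")
include("test_checkpointer.jl")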

Since the tests take a considerable amount of time to run (around 2 hours on the CI servers, I think), it would be a good idea for us to tackle these as time permits. Not necessarily all at once, which would take a huge amount of effort, but maybe one PR here and there when we catch these things. (Although I'm also not opposed to reorganizing all the tests if it would significantly improve performance.)

@glwagner
Member

glwagner commented Sep 19, 2021

Yes, we definitely need to speed up the tests. I think 95% of the time it takes to run tests is spent in compilation, however, so creating fewer models may not help much. Instead we may need to be more strategic about how we run tests, and perhaps also work on speeding up compilation time using something like SnoopCompile.jl.
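
For reference, a minimal sketch of how SnoopCompile.jl's inference profiling could be pointed at a test-like workload (this assumes SnoopCompile 2.x on Julia 1.6+; the workload below is illustrative, not an actual test from the suite):

using SnoopCompileCore
tinf = @snoopi_deep begin
    # Illustrative workload: build and step a small model, as the tests do.
    grid = RegularRectilinearGrid(topology=(Periodic, Periodic, Bounded),
                                  size=(4, 4, 4), extent=(1, 1, 1))
    model = NonhydrostaticModel(architecture=CPU(), grid=grid)
    time_step!(model, 1.0)
end

using SnoopCompile
println(tinf)                                # total time, split into inference and everything else
ranked = accumulate_by_source(flatten(tinf)) # methods ranked by inference cost
show(last(ranked, 10))                       # the ten most expensive methods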

On #1962 we add bors capability, which will allow us to separate the tests into a few categories: fast-running, crucial tests will run on every commit to a PR, and slower, more comprehensive integration tests will run only when bors try or bors r+ is invoked. This might help streamline the development workflow. Also, if we are using Caltech's central cluster for CI, we can potentially split the jobs amongst more workers, which might help speed up tests overall.
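
As a rough sketch, the split could look something like this in test/runtests.jl (the env variable, group names, and the "test_time_stepping.jl" file are hypothetical, just to illustrate the gating; the output-writer files are the ones from the snippet above):

# Hypothetical grouping of test files by cost.
group = get(ENV, "TEST_GROUP", "fast")

if group in ("fast", "all")
    # Fast, crucial tests: run on every commit to a PR.
    include("test_time_stepping.jl")
end

if group in ("integration", "all")
    # Slower, comprehensive integration tests: run only under bors try / bors r+.
    include("test_netcdf_output_writer.jl")
    include("test_jld2_output_writer.jl")
    include("test_checkpointer.jl")
end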

Even more important is simplifying the test implementation. Right now, updating tests and validation experiments is a time sink for developers who change the API, and it has really slowed development down lately. So we shouldn't rewrite tests in a way that makes development more difficult (e.g., keeping the maintenance burden of the test infrastructure small is more important than decreasing the computational cost of tests). This is really a side comment --- we should be able to improve the tests so they are both faster and easier to maintain, if we are careful.

@tomchor
Collaborator Author

tomchor commented Sep 19, 2021

On #1962 we add bors capability, which will allow us to separate the tests into a few categories: fast-running, crucial tests will run on every commit to a PR, and slower, more comprehensive integration tests will run only when bors try or bors r+ is invoked. This might help streamline the development workflow. Also, if we are using Caltech's central cluster for CI, we can potentially split the jobs amongst more workers, which might help speed up tests overall.

Definitely looking forward to that PR.

Even more important is simplifying the test implementation. Right now, updating tests and validation experiments is a time sink for developers who change the API, and it has really slowed development down lately. So we shouldn't rewrite tests in a way that makes development more difficult (e.g., keeping the maintenance burden of the test infrastructure small is more important than decreasing the computational cost of tests). This is really a side comment --- we should be able to improve the tests so they are both faster and easier to maintain, if we are careful.

I definitely agree with that. I just don't see how to do it. Being thorough with the tests (which I believe is something we want) necessarily comes with using the API many times, making changes to it a bit slower to implement, no?

@glwagner
Member

I definitely agree with that. I just don't see how to do it. Being thorough with the tests (which I believe is something we want) necessarily comes with using the API many times, making changes to it a bit slower to implement, no?

I think many of the tests can probably be designed / written in a more maintainable way. There's a lot of boilerplate and copy-pasted code in the tests.
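
For example, the repeated grid/model setup could be factored into one helper (the function name and defaults here are made up for illustration):

# Hypothetical helper to replace the copy-pasted grid/model setup.
function default_test_model(arch; topology=(Periodic, Periodic, Bounded),
                            size=(4, 4, 4), extent=(1, 1, 1))
    grid = RegularRectilinearGrid(topology=topology, size=size, extent=extent)
    return NonhydrostaticModel(architecture=arch, grid=grid)
end

# Then each testset shrinks to its actual assertions:
for arch in archs
    model = default_test_model(arch)
    @testset "Dependency adding [$(typeof(arch))]" begin
        test_dependency_adding(model)
    end
end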

tomchor added the performance 🏍️ So we can get the wrong answer even faster label Sep 20, 2021
@glwagner
Member

Should we close this? There's no real specific action we can take to close it, though certainly improving the tests should be an ongoing priority...

@tomchor
Collaborator Author

tomchor commented Mar 23, 2023

Sure

tomchor closed this as completed Mar 23, 2023