Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bugfix for non traditional beta plane #2877

Merged
merged 12 commits into from
Feb 24, 2023
Merged

Conversation

simone-silvestri
Copy link
Collaborator

@simone-silvestri simone-silvestri commented Jan 24, 2023

cc @francispoulin

closes #2876

Copy link
Collaborator

@francispoulin francispoulin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The change you made makes pefect sense and I am happy to say that now it runs!

Thank you for finding the fix so quickly!

@simone-silvestri
Copy link
Collaborator Author

It seemed quite easy but now all tests are broken for a weird reason 😅

@francispoulin
Copy link
Collaborator

It seemed quite easy but now all tests are broken for a weird reason sweat_smile

Very strange. I can't believe this one simple change would cause all these problems. Is there something else going on here?

@glwagner
Copy link
Member

@ali-ramadhan @suyashbire1 can you take a look at this change?

@francispoulin
Copy link
Collaborator

Would it be worth retrying the tests to see if the problems persist on a second attempt?

@navidcy navidcy added the bug 🐞 Even a perfect program still has bugs label Jan 27, 2023
@francispoulin
Copy link
Collaborator

Is anyone able to restart the tests to see if that fixes the problem?

@francispoulin
Copy link
Collaborator

@simone-silvestri , any idea what might be going on here? Can we try restarting the tests to see if it was just a glitch the first time?

@simone-silvestri
Copy link
Collaborator Author

I restarted them, sorry it took a while

@francispoulin
Copy link
Collaborator

I restarted them, sorry it took a while

No problem and thanks a lot for doing it!

@simone-silvestri
Copy link
Collaborator Author

This is very weird

@francispoulin
Copy link
Collaborator

Thanks for the additional changes and it seems to be doing much better! Still three tests that fail. I looked ath the initialize environments and saw what I copied below. It seems to have probelms with Oceananigans?



✗ Oceananigans
--
  | 105 dependencies successfully precompiled in 112 seconds
  | 1 dependency errored. To see a full report either run `import Pkg; Pkg.precompile()` or load the package
  | Precompiling project...
  | ✗ Oceananigans
  | 0 dependencies successfully precompiled in 22 seconds. 105 already precompiled.
  |  
  | ERROR: The following 1 direct dependency failed to precompile:
  |  
  | Oceananigans [9e8cae18-63c1-5223-a75c-80ca9d6e9a09]
  |  
  | Failed to precompile Oceananigans [9e8cae18-63c1-5223-a75c-80ca9d6e9a09] to /storage5/buildkite-agent/.julia-9773/compiled/v1.8/Oceananigans/jl_yggB5x.
  | [NVBLAS] No Gpu available
  | [NVBLAS] NVBLAS_CONFIG_FILE environment variable is NOT set : relying on default config filename 'nvblas.conf'
  | [NVBLAS] Cannot open default config file 'nvblas.conf'
  | [NVBLAS] Config parsed
  | [NVBLAS] CPU Blas library need to be provided
  | ERROR: LoadError: syntax: missing comma or ) in argument list
  | Stacktrace:
  | [1] top-level scope
  | @ ~/builds/tartarus-1/clima/oceananigans/src/Coriolis/non_traditional_beta_plane.jl:75
  | [2] include(mod::Module, _path::String)
  | @ Base ./Base.jl:419
  | [3] include(x::String)
  | @ Oceananigans.Coriolis ~/builds/tartarus-1/clima/oceananigans/src/Coriolis/Coriolis.jl:1
  | [4] top-level scope
  | @ ~/builds/tartarus-1/clima/oceananigans/src/Coriolis/Coriolis.jl:28
  | [5] include(mod::Module, _path::String)
  | @ Base ./Base.jl:419
  | [6] include(x::String)
  | @ Oceananigans ~/builds/tartarus-1/clima/oceananigans/src/Oceananigans.jl:5
  | [7] top-level scope
  | @ ~/builds/tartarus-1/clima/oceananigans/src/Oceananigans.jl:230
  | [8] include
  | @ ./Base.jl:419 [inlined]
  | [9] include_package_for_output(pkg::Base.PkgId, input::String, depot_path::Vector{String}, dl_load_path::Vector{String}, load_path::Vector{String}, concrete_deps::Vector{Pair{Base.PkgId, UInt64}}, source::Nothing)
  | @ Base ./loading.jl:1554
  | [10] top-level scope
  | @ stdin:1
  | in expression starting at /var/lib/buildkite-agent/builds/tartarus-1/clima/oceananigans/src/Coriolis/non_traditional_beta_plane.jl:75
  | in expression starting at /var/lib/buildkite-agent/builds/tartarus-1/clima/oceananigans/src/Coriolis/Coriolis.jl:1
  | in expression starting at /var/lib/buildkite-agent/builds/tartarus-1/clima/oceananigans/src/Oceananigans.jl:1
  | in expression starting at stdin:1
  | Stacktrace:
  | [1] pkgerror(msg::String)
  | @ Pkg.Types /storage5/buildkite-agent/julia-1.8.2/share/julia/stdlib/v1.8/Pkg/src/Types.jl:67
  | [2] precompile(ctx::Pkg.Types.Context, pkgs::Vector{String}; internal_call::Bool, strict::Bool, warn_loaded::Bool, already_instantiated::Bool, kwargs::Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
  | @ Pkg.API /storage5/buildkite-agent/julia-1.8.2/share/julia/stdlib/v1.8/Pkg/src/API.jl:1432
  | [3] precompile
  | @ /storage5/buildkite-agent/julia-1.8.2/share/julia/stdlib/v1.8/Pkg/src/API.jl:1063 [inlined]
  | [4] #precompile#225
  | @ /storage5/buildkite-agent/julia-1.8.2/share/julia/stdlib/v1.8/Pkg/src/API.jl:1062 [inlined]
  | [5] precompile (repeats 2 times)
  | @ /storage5/buildkite-agent/julia-1.8.2/share/julia/stdlib/v1.8/Pkg/src/API.jl:1062 [inlined]
  | [6] top-level scope
  | @ none:1
  | 🚨 Error: The command exited with status 1
  | user command error: exit status 1


@navidcy
Copy link
Collaborator

navidcy commented Feb 5, 2023

I rebuild the CI -- let's see.

@francispoulin
Copy link
Collaborator

When I looked at the error I see there is a problem with yode?


Coriolis: Error During Test at /var/lib/buildkite-agent/builds/tartarus-10/clima/oceananigans/test/test_time_stepping.jl:280
  | Test threw exception
  | Expression: time_stepping_works_with_coriolis(arch, FT, Coriolis)
  | TaskFailedException
  |  
  | nested task error: MethodError: no method matching ynode(::Type{Center}, ::Int64, ::RectilinearGrid{Float64, Periodic, Periodic, Bounded, Float64, Float64, Float64, OffsetVector{Float64, StepRangeLen{Float64, Base.TwicePrecision{Float64}, Base.TwicePrecision{Float64}, Int64}}, OffsetVector{Float64, StepRangeLen{Float64, Base.TwicePrecision{Float64}, Base.TwicePrecision{Float64}, Int64}}, OffsetVector{Float64, StepRangeLen{Float64, Base.TwicePrecision{Float64}, Base.TwicePrecision{Float64}, Int64}}, CPU})
  | Closest candidates are:
  | ynode(::Any, ::Any, ::Any, ::AbstractField) at ~/builds/tartarus-10/clima/oceananigans/src/Fields/abstract_field.jl:79
  | ynode(::Any, ::Any, ::Any, ::Any, ::Any, ::Any, ::ImmersedBoundaryGrid) at ~/builds/tartarus-10/clima/oceananigans/src/ImmersedBoundaries/ImmersedBoundaries.jl:233
  | ynode(::Any, ::Any, ::Any, ::Any, ::Any, ::Any, ::Any) at ~/builds/tartarus-10/clima/oceananigans/src/Grids/grid_utils.jl:228
 

@navidcy
Copy link
Collaborator

navidcy commented Feb 12, 2023

Let's see if 56c9b53 does the job.

@francispoulin
Copy link
Collaborator

Thanks @navidcy for making these changes, this is much better!

I looked in the regression tests and it seems that the v velocity is not as close as it used to be. I wonder if this is a matter of the error being slightly bigger?

Shallow Water Bickley jet simulation [GPU, ConservativeFormulation]: Test Failed at /net/ocean/home/data44/data5/glwagner/.buildkite-agent/builds/sverdrup-10/clima/oceananigans/test/regression_tests/shallow_water_bickley_jet_regression.jl:93
  | Expression: all(test_fields.v .≈ truth_fields.v)
  | Stacktrace:
  | [1] macro expansion
  | @ /net/ocean/home/data44/data5/glwagner/julia-1.8.5/share/julia/stdlib/v1.8/Test/src/Test.jl:464 [inlined]
  | [2] run_shallow_water_regression(arch::GPU, formulation::ConservativeFormulation; regenerate_data::Bool)
  | @ Main /net/ocean/home/data44/data5/glwagner/.buildkite-agent/builds/sverdrup-10/clima/oceananigans/test/regression_tests/shallow_water_bickley_jet_regression.jl:93
 

@navidcy
Copy link
Collaborator

navidcy commented Feb 12, 2023

hm... before we start pondering on that I just rerun the test and see what happens :)

@navidcy
Copy link
Collaborator

navidcy commented Feb 12, 2023

OK, it fails. We should run manually and figure out where the problem comes from?

@navidcy
Copy link
Collaborator

navidcy commented Feb 12, 2023

If it was related to the rewrite of the terms then shouldn't been affecting CPU regression test as well?

@simone-silvestri
Copy link
Collaborator Author

This is a problem on all PRs. I am not sure about our regression test data. It might be faulty.

@navidcy
Copy link
Collaborator

navidcy commented Feb 19, 2023

Let's hold on merging until #2922 is resolved.

@navidcy navidcy merged commit 5f45abe into main Feb 24, 2023
@navidcy navidcy deleted the ss/bugfix_non_traditional_coriolis branch February 24, 2023 10:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug 🐞 Even a perfect program still has bugs
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Problems with NonTraditionalBetaPlane
4 participants