Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support file split in NetCDFOutputWriter when max_filesize is exceeded #3506

Closed
wants to merge 18 commits into from

Conversation

josuemtzmo
Copy link
Collaborator

@josuemtzmo josuemtzmo commented Mar 13, 2024

This PR aims to allow the user to split netCDF files into smaller files when the filesize exceeds the max_filesize. This addresses #2967

@josuemtzmo
Copy link
Collaborator Author

josuemtzmo commented Mar 13, 2024

Although the new test_netcdf_file_splitting is working, I'm currently having issues in with the test_netcdf_time_averaging, where this test fails.

@test all(isapprox.(ds["c2"][:, n+1], c̄2(averaging_times), rtol=rtol))

I've tested the PR & the main upstream branch in my computer and I have the same error:

Stacktrace:
 [1] macro expansion
   @ ~/.julia/juliaup/julia-1.10.2+0.aarch64.apple.darwin14/share/julia/stdlib/v1.10/Test/src/Test.jl:672 [inlined]
 [2] test_netcdf_time_averaging(arch::CPU)
   @ Main ~/github/Oceananigans.jl/test/test_netcdf_output_writer.jl:727
 [3] macro expansion
   @ ~/github/Oceananigans.jl/test/test_netcdf_output_writer.jl:889 [inlined]
 [4] macro expansion
   @ ~/.julia/juliaup/julia-1.10.2+0.aarch64.apple.darwin14/share/julia/stdlib/v1.10/Test/src/Test.jl:1577 [inlined]
 [5] top-level scope
   @ ~/github/Oceananigans.jl/test/test_netcdf_output_writer.jl:880
NetCDF output writer [CPU]: Test Failed at /Users/jmtzmo/github/Oceananigans.jl/test/test_netcdf_output_writer.jl:727
  Expression: all(isapprox.((ds["c2"])[:, n + 1], c̄2(averaging_times), rtol = rtol))

@navidcy navidcy linked an issue Mar 13, 2024 that may be closed by this pull request
josuemtzmo and others added 2 commits March 13, 2024 15:10
Co-authored-by: Navid C. Constantinou <navidcy@users.noreply.github.com>
Co-authored-by: Navid C. Constantinou <navidcy@users.noreply.github.com>
Co-authored-by: Navid C. Constantinou <navidcy@users.noreply.github.com>
josuemtzmo and others added 2 commits March 13, 2024 15:12
Co-authored-by: Navid C. Constantinou <navidcy@users.noreply.github.com>
Co-authored-by: Navid C. Constantinou <navidcy@users.noreply.github.com>
josuemtzmo and others added 7 commits March 13, 2024 15:16
Co-authored-by: Navid C. Constantinou <navidcy@users.noreply.github.com>
Co-authored-by: Navid C. Constantinou <navidcy@users.noreply.github.com>
Co-authored-by: Navid C. Constantinou <navidcy@users.noreply.github.com>
@navidcy
Copy link
Collaborator

navidcy commented Mar 13, 2024

Although the new test_netcdf_file_splitting is working, I'm currently having issues in with the test_netcdf_time_averaging, where this test fails.

@test all(isapprox.(ds["c2"][:, n+1], c̄2(averaging_times), rtol=rtol))

I've tested the PR & the main upstream branch in my computer and I have the same error:

Stacktrace:
 [1] macro expansion
   @ ~/.julia/juliaup/julia-1.10.2+0.aarch64.apple.darwin14/share/julia/stdlib/v1.10/Test/src/Test.jl:672 [inlined]
 [2] test_netcdf_time_averaging(arch::CPU)
   @ Main ~/github/Oceananigans.jl/test/test_netcdf_output_writer.jl:727
 [3] macro expansion
   @ ~/github/Oceananigans.jl/test/test_netcdf_output_writer.jl:889 [inlined]
 [4] macro expansion
   @ ~/.julia/juliaup/julia-1.10.2+0.aarch64.apple.darwin14/share/julia/stdlib/v1.10/Test/src/Test.jl:1577 [inlined]
 [5] top-level scope
   @ ~/github/Oceananigans.jl/test/test_netcdf_output_writer.jl:880
NetCDF output writer [CPU]: Test Failed at /Users/jmtzmo/github/Oceananigans.jl/test/test_netcdf_output_writer.jl:727
  Expression: all(isapprox.((ds["c2"])[:, n + 1], c̄2(averaging_times), rtol = rtol))

me and @josuemtzmo sorted this out; things should be OK now!

@navidcy navidcy changed the title Support to netCDF split Support to netCDF file split when max_filesize is exceeded Mar 13, 2024
@navidcy navidcy changed the title Support to netCDF file split when max_filesize is exceeded Support file split in NetCDFOutputWriter when max_filesize is exceeded Mar 13, 2024
Co-authored-by: Navid C. Constantinou <navidcy@users.noreply.github.com>
@glwagner
Copy link
Member

Cool! Very nice!

@navidcy navidcy self-requested a review March 13, 2024 18:29
@navidcy
Copy link
Collaborator

navidcy commented Mar 14, 2024

@Sbozzolo, why is the distributed CI not starting here? Because this PR is from a fork?

@Sbozzolo
Copy link
Member

The job is not showing up in the buildkite pipeline at all...

It should build PRs, can you try commiting again to see if it picked up?

@glwagner
Copy link
Member

glwagner commented Mar 14, 2024

We also had this problem on Thermodynamics and it was fixed by re-opening the PR from the repo rather than a fork. There is a security issue because you may not want to run buildkite jobs from random people (despite that we definitely do want to run buildkite for PRs from @josuemtzmo <3).

@glwagner
Copy link
Member

glwagner commented Mar 14, 2024

Screenshot 2024-03-14 at 9 00 50 AM

@Sbozzolo is this enabled? (This is in the pipeline settings on buildkite)

@Sbozzolo
Copy link
Member

Now it is.

(But I can turn it off if you prefer)

@navidcy
Copy link
Collaborator

navidcy commented Mar 14, 2024

hm... OK -- thanks all.

Not sure whether we want that setting or not... if we keep it disable it implies that every PR from a fork will never be merged?)

In the meantime, @josuemtzmo any chance you can re-open this as a branch of the main repo? we have the convention that branches are named with initials/my-branch-name (dashes), e.g., ncc/the-best-new-feature. I invited you as a collaborator in the repo.

@josuemtzmo
Copy link
Collaborator Author

Ok, I will push to the upstream, I'm closing this PR in preference to that other one.

@josuemtzmo
Copy link
Collaborator Author

This PR was replaced by #3512

@josuemtzmo josuemtzmo closed this Mar 14, 2024
@glwagner
Copy link
Member

Now it is.

(But I can turn it off if you prefer)

Well it is enabled for Oceananigans too. So I guess there is a different reason that forks are problematic.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Splitting NetCDF output with NetCDFOutputWriter
4 participants