Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add extensions feature from Julia 1.9 with backward compatibility #98

Merged
merged 16 commits into from
Jan 21, 2024

Conversation

b-fg
Copy link
Member

@b-fg b-fg commented Jan 10, 2024

This allows the conditional loading of the CUDA, AMDGPU, and WriteVTK packages. Code directly depending on these packages is now located in ext/. For example, now the L₂ for CuArray is only loaded when using CUDA is called in addition to using WaterLily.

This is currently a first implementation without propers checks. To do:

  • Test actual backward compatibility: Tested on 1.8.5 and 1.9.4
  • Figure out why having the optional packages in the main Julia installation (ie. release) still messes up the initialization
  • Check CI tests compat and implement do-nothing @allowscalar version so that the CUDA extension can actually extend it: now using GPUArrays.@allowscalar which works for both CUDA and AMD.

This allows the conditional loading of the CUDA, AMDGPU, and WriteVTK packages.
Code directly depending on these packages is now located in ext/.
This is currently a first implementation without propers checks.
ToDo:
 - Test actual backward compatibility
 - Figure out why having the optional packages in the main Julia installation (ie. release) still messes up the initialization
@b-fg b-fg added the enhancement New feature or request label Jan 10, 2024
@b-fg
Copy link
Member Author

b-fg commented Jan 11, 2024

This is looking better and close to finishing. Some comments:

  • CI now only triggers when a new (normal, not draft) PR is created and when a push is done into master, such as a merge from a PR. For new draft PRs, CI is not triggered. Within a PR (normal or draft) new pushes do not trigger the CI - triggering can always be done manually if necessary. This should reduce notifications spam of CI failing and reduce the number of tests that we actually launch (important when migrating to a GPU-enabled CI system).

  • The new extensions packaging can make the developer workflow a little bit different. Before this, I used to just to develop within the WaterLily.jl environment (cd WaterLily.jl; julia --project; include(...)) and test new features there. Now, if new features make use of a weak dependency such as CUDA or WriteVTK, testing code within the WaterLily.jl environment is problematic since we cannot do using CUDA in there, as the package manager detects that CUDA is no within [deps] (it's in [weakdeps]) and asks to install such package. Instead, I create a new project and environment called WaterLilyTests.jl, I add WaterLily, mark it as development package using Pkg; Pkg.develop(PackageSpec(path="<WaterLily.jl path>"));, and also add whatever packages I need to test development such as CUDA: add CUDA. When WaterLily detects that CUDA is loaded (using CUDA) it will compile the code in ext/WaterLilyCUDAExt.jl and make the extensions available.

@b-fg
Copy link
Member Author

b-fg commented Jan 12, 2024

So everything is pretty much done. I am locally running the test suite and the only current issue is that while tests are correctly passing for 1.8.5 and 1.9.4, for 1.10.0 are not. Specifically, this test is failing on 1.10.0:
https://github.com/weymouth/WaterLily.jl/blob/fe9827338df7c6e53bfefb749fe32d5a0cf93e74/test/runtests.jl#L298

WaterLily.jl: Test Failed at /home/b-fg/Workspace/tudelft1/WaterLily.jl/test/runtests.jl:298
  Expression: sim_time(sim)  0.1 > (sum(sim.flow.Δt[1:end - 2]) * sim.U) / sim.L
   Evaluated: 0.07563669234514236  0.1 > 0.03125

Any idea why this might be happening @weymouth?

@b-fg
Copy link
Member Author

b-fg commented Jan 12, 2024

Huh, I just did a merge from master and now is throwing some NaNs. Adding the verbose flag in the test I see:

tU/L=0.0312, Δt=0.355
tU/L=0.0756, Δt=NaN
tU/L=NaN, Δt=NaN

All other test sections are passing in 1.10.0 though.

@b-fg
Copy link
Member Author

b-fg commented Jan 12, 2024

Seems like exitBC=true is needed in that test so that the simulation does not blow up in 1.10.0. But then again the other tests after this one which use exitBC=false for Array and CuArray work fine... So for the moment I will modify this test and maybe we could compat to 1.9 only until this is figured out.

@b-fg b-fg marked this pull request as ready for review January 12, 2024 23:13
Copy link

codecov bot commented Jan 12, 2024

Codecov Report

Attention: 3 lines in your changes are missing coverage. Please review.

Comparison is base (b267c6b) 90.42% compared to head (2453e76) 90.55%.

Files Patch % Lines
ext/WaterLilyAMDGPUExt.jl 50.00% 1 Missing ⚠️
ext/WaterLilyCUDAExt.jl 50.00% 1 Missing ⚠️
ext/WaterLilyWriteVTKExt.jl 0.00% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master      #98      +/-   ##
==========================================
+ Coverage   90.42%   90.55%   +0.13%     
==========================================
  Files           9       11       +2     
  Lines         428      434       +6     
==========================================
+ Hits          387      393       +6     
  Misses         41       41              

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@weymouth
Copy link
Collaborator

That's weird. Also notice that it's failing on macOS again, even for 1.9. We must have something not quite right in that test.

@b-fg
Copy link
Member Author

b-fg commented Jan 13, 2024

Yeap!

@weymouth
Copy link
Collaborator

I'll try it on my home laptop, which is a mac.

@b-fg
Copy link
Member Author

b-fg commented Jan 19, 2024

Did you have a chance to look at this @weymouth?

@weymouth
Copy link
Collaborator

Nope, but I'll check this weekend.

Apologies if I've misunderstood, but I don't think we need plotting routines in the test dependancies list, do we?
@weymouth
Copy link
Collaborator

When I run ] test on this PR, the CUDA and AMDGPU both throw errors when they try to load (since this laptop doesn't have a GPU) but the tests still run and pass. Is that the correct behaviour?

@b-fg
Copy link
Member Author

b-fg commented Jan 21, 2024

Yes, that is expected since we have it hardcoded in tests hence it tries to load those. Otherwise it would never try to pass them for GPUs, even if CUDA or AMDGPU was functional. I think there is no better way to this at the moment. The main benefit is that for a normal usage (outside of tests) users now do not download and install CUDA/AMDGPU by default.

Project.toml Show resolved Hide resolved

L₂ norm of ROCArray `a` excluding ghosts.
"""
L₂(a::ROCArray,R::CartesianIndices=inside(a)) = mapreduce(abs2,+,@inbounds(a[R]))
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

GPU tests cannot really be covered in the current Github CI.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's clear.

…w). GPU extensions already annotate the L2 function arguments type.
@weymouth
Copy link
Collaborator

This ready?

@b-fg
Copy link
Member Author

b-fg commented Jan 21, 2024

Yes, did a last check and all looks good. Also I just passed tests locally on 1.8.5 and 1.10.0 and worked fine. Good to go!

@b-fg b-fg merged commit bcb925c into master Jan 21, 2024
13 of 14 checks passed
marinlauber pushed a commit to marinlauber/WaterLily that referenced this pull request Jan 22, 2024
@weymouth weymouth mentioned this pull request Mar 29, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants