[CI][ITensorGPU] Update to latest Metal.jl, fix ITensorGPU CI #1383

Merged on Apr 18, 2024 (20 commits)

Collaborator

@kmp5VT kmp5VT commented Apr 12, 2024

Description

This PR makes some modifications to deal with changes from the Metal.jl version bump. First, Metal now has a resize! function, so the block-sparse DMRG code works with that backend; this PR removes the is_broken function accordingly.

Next, there seems to be an issue with copy and subarrays, so I opened an issue with Metal. In the meantime, I marked the test as broken for Metal in the Expose library. I am currently running the tests to determine whether anything else needs to be fixed for Metal; if so, I will make those changes here too.

Checklist:

  • Metal DMRG tests pass
  • Metal CI is no longer failing

@kmp5VT kmp5VT marked this pull request as draft April 12, 2024 21:44
@kmp5VT kmp5VT requested a review from mtfishman April 14, 2024 15:02
@kmp5VT kmp5VT marked this pull request as ready for review April 14, 2024 15:02
Review comment on jenkins/Jenkinsfile (outdated, resolved)
@mtfishman mtfishman changed the title [NDTensors] Address Metal with version bump. [NDTensors] Fix issues caused by new version of Metal Apr 18, 2024
@mtfishman mtfishman changed the title [NDTensors] Fix issues caused by new version of Metal [NDTensors] Fix issues caused by new version of Metal.jl Apr 18, 2024
@kmp5VT
Collaborator Author

kmp5VT commented Apr 18, 2024

@mtfishman regarding this issue: I tried deleting everything in my .julia folder, then ran the following on my Mac:

$ julia
julia> using Pkg; Pkg.activate(temp = true);
julia> Pkg.add(;name="Metal", version="1.0.0")
julia> exit()
$ julia
julia> using Pkg; Pkg.activate(temp=true)
julia> Pkg.dev("/path/to/NDTensors"); Pkg.dev("/path/to/ITensors");
julia> Pkg.test("NDTensors"; test_args=["metal"])

and it looks like Metal 1.1.0 is added, as it should be:

Running /Users/kpierce/Workspace/julia_deps_41824/dev/ITensors/NDTensors/test/test_blocksparse.jl
    Updating registry at `~/.julia/registries/General.toml`
   Resolving package versions...
   Installed LLVMDowngrader_jll ─ v0.1.0+1
   Installed ObjectiveC ───────── v2.1.1
   Installed CodecBzip2 ───────── v0.8.2
   Installed TranscodingStreams ─ v0.10.7
   Installed Metal ────────────── v1.1.0
  Downloaded artifact: LLVMDowngrader
    Updating `/private/tmp/jl_OeKJm5/Project.toml`
  [dde4c033] + Metal v1.1.0
    Updating `/private/tmp/jl_OeKJm5/Manifest.toml`

So the problem we experienced is most likely related to Jenkins. I added a [compat] entry for Metal in NDTensors/test/Project.toml in case Pkg.test honors it. My thought is that we might just have to manually update Metal on Jenkins when there is a version bump, but it's hard to tell until there is another version bump.
Another thing I could try on Jenkins, before Julia is started, is to cd into ~/.julia/packages/ and rm Metal/.
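
The cleanup idea above could be sketched as a small shell helper to run on the Jenkins machine before Julia starts. This is only a sketch under assumptions: the function name `clear_cached_package` is hypothetical, and it assumes the Jenkins user keeps its depot at the default `~/.julia`. It only forces the package to be reinstalled; whether that resolves the stale Metal version on Jenkins is untested.

```shell
#!/bin/sh
# Hypothetical helper (not part of this PR): remove every cached copy of one
# package from a Julia depot so that the next Pkg operation reinstalls it.
clear_cached_package() {
    pkg="$1"
    depot="${2:-$HOME/.julia}"   # assumption: default depot location

    # Refuse an empty name; otherwise the whole packages/ folder would match.
    if [ -z "$pkg" ]; then
        echo "usage: clear_cached_package <package> [depot]" >&2
        return 1
    fi

    target="$depot/packages/$pkg"
    if [ -d "$target" ]; then
        rm -rf -- "$target"
        echo "removed $target"
    else
        echo "nothing to remove at $target"
    fi
}
```

Calling `clear_cached_package Metal` before launching Julia would correspond to the `rm Metal/` step described above; the optional second argument points the helper at a non-default depot.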

@mtfishman
Member

@kmp5VT I think you have to add Metal to [extras] in test/Project.toml in older versions of Julia in order to add a compat entry for it.
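
For illustration, the suggestion above might look like the following fragment of test/Project.toml. This is a sketch, not the file from this PR: the UUID is left elided after the `dde4c033` prefix shown in the Pkg log, and the compat bound is an assumed value.

```toml
# Sketch of test/Project.toml entries (illustrative only).

[extras]
# Listing Metal here lets a [compat] entry refer to it.
# UUID elided; copy the full value from Metal.jl's own Project.toml.
Metal = "dde4c033-..."

[compat]
# Assumed bound, chosen to match the v1.1.0 seen in the log above.
Metal = "1.1"
```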

@mtfishman mtfishman changed the title [NDTensors] Fix issues caused by new version of Metal.jl [NDTensors] Update to latest Metal.jl, fix ITensorGPU CI Apr 18, 2024
@mtfishman
Member

@kmp5VT can you change the ITensorGPU julia compat entry to 1.6 - 1.9?

@mtfishman
Member

Also can you remove HDF5 as a dependency of ITensorGPU?

@kmp5VT
Collaborator Author

kmp5VT commented Apr 18, 2024

@mtfishman I removed HDF5 and added the compat info for Julia in ITensorGPU

@mtfishman
Member

Looks good, thanks for sorting that out.

@mtfishman mtfishman merged commit fcf1e94 into ITensor:main Apr 18, 2024
18 checks passed
@mtfishman mtfishman changed the title [NDTensors] Update to latest Metal.jl, fix ITensorGPU CI [CI][ITensorGPU] Update to latest Metal.jl, fix ITensorGPU CI Apr 18, 2024
@kmp5VT kmp5VT deleted the kmp5/refactor/test_mtl_dmrg branch April 18, 2024 18:04
2 participants