
[NDTensors] JLArrays Extension #1508

Draft · wants to merge 32 commits into base: main
Conversation

kmp5VT (Collaborator) commented Jun 21, 2024

Description

Introduce the JLArrays extension to test GPU code paths for ITensors without GPU hardware.

Checklist:

  • Introduce JLArrays and verify that all NDTensors tests pass with the JLArray backend
  • Potentially move functions from GPU-array-specific libraries to GPUArraysCoreExt
  • Update the ITensors test suite so JLArray exercises GPU-adjacent array code on the CPU

kmp5VT (Collaborator, Author) commented Jun 21, 2024

There is an issue in JLArrays: the resize! function is missing (JuliaGPU/GPUArrays.jl#541).

Comment on lines 9 to 10
TypeParameterAccessors.position(::Type{<:JLArray}, ::typeof(eltype)) = Position(1)
TypeParameterAccessors.position(::Type{<:JLArray}, ::typeof(ndims)) = Position(2)
Member

These shouldn't be needed since I made generic definitions in #1505.

Collaborator Author

Good point, I removed these definitions for all GPU backends. Thanks!

NDTensors/Project.toml (outdated; resolved)
Comment on lines 6 to 8
function TypeParameterAccessors.default_type_parameters(::Type{<:JLArray})
return (Float64, 1)
end
Member

We could consider making this a generic AbstractArray definition.

Collaborator Author

This is something I was thinking about. One issue with that is that wrappers don't always have ndims as the second type parameter, for example:

julia> a = Matrix{Float32}(undef, (2, 2));

julia> typeof(transpose(a))
LinearAlgebra.Transpose{Float32, Matrix{Float32}}

Member

Right, so for those types we would need to make a custom overload of TypeParameterAccessors.default_type_parameters, which would be required anyway.
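Such an overload might look like the following hypothetical sketch (the Transpose defaults chosen here are an illustration, not part of this PR):

```julia
# Hypothetical sketch: LinearAlgebra.Transpose stores (eltype, parenttype)
# rather than (eltype, ndims), so it would need its own overload, e.g.:
using LinearAlgebra: Transpose

function TypeParameterAccessors.default_type_parameters(::Type{<:Transpose})
  # defaults for (eltype, parenttype), by analogy with the JLArray definition
  return (Float64, Matrix{Float64})
end
```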

Comment on lines 26 to 29
if "jlarrays" in ARGS || "all" in ARGS
Pkg.add("JLArrays")
using JLArrays
end
Member

I don't think there is a reason not to run the tests for JLArrays, since they run on the CPU anyway. So I think it should just be added as a normal test dependency and inserted into the device list by default.

kmp5VT (Collaborator, Author) commented Jul 2, 2024

I am seeing an issue with the GPU tests when JLArrays is in the Project.toml. To fix it, I moved JLArrays to [extras] and add it to the project if isempty(ARGS) || "base" in ARGS. However, this would run into a problem if the test args are ["base", "metal"]. We aren't testing that way right now, but if you would prefer, I can include JLArrays only when no GPUs are being tested, i.e. only when isempty(ARGS). Let me know, thanks!
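For reference, the workaround described above amounts to something like this in the test runner (a sketch; the actual placement in the NDTensors test setup may differ):

```julia
# Sketch of the described workaround: JLArrays lives under [extras] in the
# test Project.toml and is only added when the "base" tests are requested.
using Pkg: Pkg
if isempty(ARGS) || "base" in ARGS
  Pkg.add("JLArrays")
  using JLArrays
end
```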

Member

What's the issue you see? It would be best to try to address that issue and have JLArrays as a dependency in Project.toml and run by default in the tests as we planned, so I would prefer to try to aim for that rather than work around an issue with a more convoluted solution.

Collaborator Author

The issue is that GPUCompiler fails to precompile during Pkg.test:

ERROR: LoadError: UndefVarError: `SimplifyCFGPassOptions` not defined
Stacktrace:
 [1] top-level scope
   @ ~/.julia/packages/GPUCompiler/nWT2N/src/optim.jl:57
 [2] include(mod::Module, _path::String)
   @ Base ./Base.jl:495
 [3] include(x::String)
   @ GPUCompiler ~/.julia/packages/GPUCompiler/nWT2N/src/GPUCompiler.jl:1
 [4] top-level scope
   @ ~/.julia/packages/GPUCompiler/nWT2N/src/GPUCompiler.jl:37
 [5] include
   @ Base ./Base.jl:495 [inlined]
 [6] include_package_for_output(pkg::Base.PkgId, input::String, depot_path::Vector{String}, dl_load_path::Vector{String}, load_path::Vector{String}, concrete_deps::Vector{Pair{Base.PkgId, UInt128}}, source::String)
   @ Base ./loading.jl:2216
 [7] top-level scope
   @ stdin:3
in expression starting at /home/jenkins/workspace/ITensor_ITensors.jl_PR-1508@tmp/.julia/packages/GPUCompiler/nWT2N/src/optim.jl:56
in expression starting at /home/jenkins/workspace/ITensor_ITensors.jl_PR-1508@tmp/.julia/packages/GPUCompiler/nWT2N/src/GPUCompiler.jl:1
in expression starting at stdin:3
ERROR: LoadError: Failed to precompile GPUCompiler [61eb1bfa-7361-4325-ad38-22787b887f55] to "/home/jenkins/workspace/ITensor_ITensors.jl_PR-1508@tmp/.julia/compiled/v1.10/GPUCompiler/jl_xlhZZG".

However, I think I was able to fix the problem by moving using JLArrays to after the GPU using statements.
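Concretely, the fix amounts to reordering the `using` statements along these lines (a sketch; the actual device handling in the test harness may differ):

```julia
using Pkg: Pkg
# GPU backends are loaded first ...
if "cuda" in ARGS || "all" in ARGS
  Pkg.add("CUDA")
  using CUDA
end
# ... and JLArrays is loaded only after the GPU `using` statements,
# which avoids the GPUCompiler precompilation failure shown above.
Pkg.add("JLArrays")
using JLArrays
```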

codecov-commenter commented Jun 23, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 77.38%. Comparing base (82cfd76) to head (c0ebacb).
Report is 23 commits behind head on main.

Current head c0ebacb differs from pull request most recent head d6e675d

Please upload reports for the commit d6e675d to get more accurate results.


Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1508      +/-   ##
==========================================
- Coverage   78.05%   77.38%   -0.68%     
==========================================
  Files         148      140       -8     
  Lines        9679     9103     -576     
==========================================
- Hits         7555     7044     -511     
+ Misses       2124     2059      -65     


@@ -0,0 +1,8 @@
module NDTensorsJLArraysExt
include("copyto.jl")
Collaborator Author

I mostly copied these from CUDA definitions. I am working to determine which functions are actually necessary.

@kmp5VT kmp5VT marked this pull request as draft June 27, 2024 21:16
kmp5VT (Collaborator, Author) commented Jul 2, 2024

For some reason I am having an issue with SparseArrays on Julia 1.6. All 1.6 tests fail with a compat issue:

   Resolving package versions...
Downstream tests for ITensor DMRG: Error During Test at /Users/kpierce/.julia/packages/SafeTestsets/raUNr/src/SafeTestsets.jl:30
  Got exception outside of a @test
  LoadError: LoadError: LoadError: LoadError: Unsatisfiable requirements detected for package SparseArrays [2f01184e]:
   SparseArrays [2f01184e] log:
   ├─possible versions are: 0.0.0 or uninstalled
   ├─SparseArrays [2f01184e] is fixed to version 0.0.0
   └─restricted to versions 1.6.0-1 by NDTensors [23ae76d9] — no versions left
     └─NDTensors [23ae76d9] log:
       ├─possible versions are: 0.3.39 or uninstalled
       ├─restricted to versions 0.3.34-0.3 by ITensors [9136182c], leaving only versions 0.3.39
       │ └─ITensors [9136182c] log:
       │   ├─possible versions are: 0.6.16 or uninstalled
       │   └─ITensors [9136182c] is fixed to version 0.6.16
       └─NDTensors [23ae76d9] is fixed to version 0.3.39

mtfishman (Member) commented

For some reason I am having an issue with SparseArrays with Julia 1.6. All 1.6 tests fail with a compat issue

That's strange, do you see that locally as well?

Since CUDA.jl doesn't support Julia 1.6 now anyway, maybe if there isn't an obvious solution to that we could just stop testing against Julia 1.6 in the Jenkins workflow and not worry about Julia 1.6 support for the GPU backends.

The oldest version of Julia compatible with CUDA.jl v5 is Julia 1.8, so what if we test against Julia 1.8 instead of Julia 1.6 in the Jenkins tests? If that works, I would vote for just doing that.

kmp5VT (Collaborator, Author) commented Jul 3, 2024

For some reason I am having an issue with SparseArrays with Julia 1.6. All 1.6 tests fail with a compat issue

That's strange, do you see that locally as well?

Yes, I was seeing this locally on both CPU and GPU. I am not sure why the NDTensors tests work on CI with version 1.6. I bumped the version for the CUDA tests to 1.8.

kmp5VT (Collaborator, Author) commented Jul 3, 2024

Also, JLArrays does not have JLMatrix defined on Julia v1.6:

julia> versioninfo()
Julia Version 1.6.7
Commit 3b76b25b64 (2022-07-19 15:11 UTC)
Platform Info:
  OS: macOS (x86_64-apple-darwin21.4.0)
  CPU: Apple M1 Max
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-11.0.1 (ORCJIT, westmere)

julia> using JLArrays

julia> JLArray.JLMatrix
ERROR: type UnionAll has no field JLMatrix
Stacktrace:
 [1] getproperty(x::Type, f::Symbol)
   @ Base ./Base.jl:28
 [2] top-level scope
   @ REPL[7]:1

kmp5VT (Collaborator, Author) commented Jul 3, 2024

And JLArrays also has no resize! function on Julia 1.8:

Running /Users/kpierce/.julia/dev/ITensors/NDTensors/test/test_blocksparse.jl
test device: jl, eltype: Float32: Error During Test at /Users/kpierce/.julia/dev/ITensors/NDTensors/test/test_blocksparse.jl:32
  Got exception outside of a @test
  MethodError: no method matching resize!(::JLArrays.JLArray{Float32, 1}, ::Int64)
  Closest candidates are:
    resize!(::Vector, ::Integer) at array.jl:1233
    resize!(::BitVector, ::Integer) at bitarray.jl:814
    resize!(::NDTensors.SmallVectors.SubMSmallVector, ::Integer) at ~/.julia/dev/ITensors/NDTensors/src/lib/SmallVectors/src/subsmallvector/subsmallvector.jl:69
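Until resize! is available upstream (JuliaGPU/GPUArrays.jl#541), one hypothetical non-mutating workaround is to allocate and copy (illustrative only; resize_copy is not a function from this PR):

```julia
# Hypothetical helper: emulate resize! for GPU-style vectors by
# allocating a buffer of the new length and copying the overlap.
function resize_copy(a::AbstractVector, n::Integer)
  b = similar(a, n)
  m = min(length(a), n)
  copyto!(view(b, 1:m), view(a, 1:m))
  return b
end
```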

3 participants