
Fix ones_like on GPUs #142

Merged · adrhill merged 1 commit from ah/lrp-fix-gpu into master on Sep 13, 2023
Conversation

@adrhill (Member) commented Sep 13, 2023

Closes #59.
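For context, a `ones_like` that builds a plain CPU `Array` of ones breaks once the rest of the computation lives on the GPU. A minimal sketch of a device-agnostic version is below; the helper name and exact formulation are assumptions for illustration, not the PR's literal diff:

```julia
# Hypothetical sketch of a device-agnostic ones_like.
# `similar` allocates an uninitialized array of the same type and size as `x`
# (a CPU Array stays an Array, a CuArray stays a CuArray), and `fill!`
# writes ones in place, so no host-side allocation is forced.
ones_like(x::AbstractArray) = fill!(similar(x), one(eltype(x)))
```

On CPU arrays this behaves like `ones(eltype(x), size(x))`; on a `CuArray`, `similar` keeps the result on the device, which is what makes the LRP rules GPU-compatible.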

Benchmarks

I had to run tests on a GPU with only 2 GB of VRAM, so I only benchmarked very small batch sizes:

using CUDA, cuDNN
CUDA.functional() || error("CUDA not functional")
CUDA.versioninfo()
CUDA.allowscalar(false)

using Flux
using ExplainableAI
using Metalhead
using Test 

# Load model
model = VGG(16, pretrain=true).layers
model = strip_softmax(model)

# Create analyzer
composite = EpsilonPlusFlat()
analyzer = LRP(model, composite)

model_gpu = model |> gpu;
analyzer_gpu = LRP(model_gpu, composite)

function time_analyzer(batchsize)
    input = rand(Float32, 224, 224, 3, batchsize)
    input_gpu = input |> gpu

    @info "Timing CPU with batchsize $batchsize..."
    analyze(input, analyzer)           # warm-up run to exclude compilation time
    @time analyze(input, analyzer)

    @info "Timing GPU with batchsize $batchsize..."
    analyze(input_gpu, analyzer_gpu)   # warm-up run
    @time analyze(input_gpu, analyzer_gpu)
    return nothing
end
julia> time_analyzer(1)
[ Info: Timing CPU with batchsize 1...
  2.211998 seconds (4.32 k allocations: 3.342 GiB, 6.10% gc time)
[ Info: Timing GPU with batchsize 1...
  0.810851 seconds (13.94 k allocations: 778.767 KiB, 4.53% gc time)

julia> time_analyzer(2)
[ Info: Timing CPU with batchsize 2...
  3.751037 seconds (4.36 k allocations: 3.998 GiB, 1.97% gc time)
[ Info: Timing GPU with batchsize 2...
  1.433466 seconds (12.08 k allocations: 657.875 KiB, 2.17% gc time)

julia> time_analyzer(5)
[ Info: Timing CPU with batchsize 5...
  9.260670 seconds (4.37 k allocations: 5.964 GiB, 1.66% gc time)
[ Info: Timing GPU with batchsize 5...
  3.676279 seconds (16.99 k allocations: 994.595 KiB, 9.84% gc time)

julia> CUDA.versioninfo()
CUDA runtime 12.1, artifact installation
CUDA driver 12.2
NVIDIA driver 535.104.5

CUDA libraries: 
- CUBLAS: 12.1.3
- CURAND: 10.3.2
- CUFFT: 11.0.2
- CUSOLVER: 11.4.5
- CUSPARSE: 12.1.0
- CUPTI: 18.0.0
- NVML: 12.0.0+535.104.5

Julia packages: 
- CUDA: 4.4.1
- CUDA_Driver_jll: 0.5.0+1
- CUDA_Runtime_jll: 0.6.0+0

Toolchain:
- Julia: 1.9.3
- LLVM: 14.0.6
- PTX ISA support: 3.2, 4.0, 4.1, 4.2, 4.3, 5.0, 6.0, 6.1, 6.3, 6.4, 6.5, 7.0, 7.1, 7.2, 7.3, 7.4, 7.5
- Device capability support: sm_37, sm_50, sm_52, sm_53, sm_60, sm_61, sm_62, sm_70, sm_72, sm_75, sm_80, sm_86

1 device:
  0: NVIDIA GeForce 940MX (sm_50, 1.953 GiB / 2.000 GiB available)

@codecov bot commented Sep 13, 2023

Codecov Report

Patch coverage: 100.00% and project coverage change: +0.27% 🎉

Comparison: base (3aa10ba) 94.92% vs. head (52f6dc6) 95.19%.
Report is 3 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master     #142      +/-   ##
==========================================
+ Coverage   94.92%   95.19%   +0.27%     
==========================================
  Files          18       19       +1     
  Lines         689      687       -2     
==========================================
  Hits          654      654              
+ Misses         35       33       -2     
Files Changed               Coverage Δ
src/ExplainableAI.jl        100.00% <ø> (ø)
src/lrp/rules.jl             94.16% <ø> (+0.96%) ⬆️
ext/TullioLRPRulesExt.jl    100.00% <100.00%> (ø)
src/lrp/canonize.jl         100.00% <100.00%> (ø)
src/utils.jl                 95.45% <100.00%> (ø)


@adrhill adrhill merged commit 7a57f3a into master Sep 13, 2023
7 checks passed
@adrhill adrhill deleted the ah/lrp-fix-gpu branch September 13, 2023 19:22
Linked issue this pull request may close: Test GPU support