Rationalize and try to fix failing ldiv tests #2809
Open
kshyatt wants to merge 2 commits into master from ksh/interfaces_fix
Your PR requires formatting changes to meet the project's style guidelines. Suggested changes:
diff --git a/test/libraries/cusparse/interfaces.jl b/test/libraries/cusparse/interfaces.jl
index fa25d8330..34f9d75f8 100644
--- a/test/libraries/cusparse/interfaces.jl
+++ b/test/libraries/cusparse/interfaces.jl
@@ -258,7 +258,7 @@ nB = 2
end
end
@testset "\\ -- CuMatrix" begin
- C = triangle(opa(A)) \ opb(B)
+ C = triangle(opa(A)) \ opb(B)
dC = triangle(opa(dA)) \ opb(dB)
@test C ≈ collect(dC)
if CUSPARSE.version() < v"12.0"
CUDA.jl Benchmarks
| Benchmark suite | Current: 5db6744 | Previous: fb0a528 | Ratio |
|---|---|---|---|
| latency/precompile | 43353735535.5 ns | 43329572726 ns | 1.00 |
| latency/ttfp | 7064329610 ns | 7031277701 ns | 1.00 |
| latency/import | 3452525802 ns | 3465549489 ns | 1.00 |
| integration/volumerhs | 9609113.5 ns | 9597721 ns | 1.00 |
| integration/byval/slices=1 | 146862 ns | 147057 ns | 1.00 |
| integration/byval/slices=3 | 425687 ns | 426256 ns | 1.00 |
| integration/byval/reference | 144957 ns | 145230 ns | 1.00 |
| integration/byval/slices=2 | 286192.5 ns | 286757 ns | 1.00 |
| integration/cudadevrt | 103494 ns | 103670 ns | 1.00 |
| kernel/indexing | 14279 ns | 14524 ns | 0.98 |
| kernel/indexing_checked | 14912 ns | 15000.5 ns | 0.99 |
| kernel/occupancy | 754.2209302325581 ns | 741.6906474820144 ns | 1.02 |
| kernel/launch | 2269.8888888888887 ns | 2251.3333333333335 ns | 1.01 |
| kernel/rand | 18036 ns | 14919 ns | 1.21 |
| array/reverse/1d | 19937 ns | 19819 ns | 1.01 |
| array/reverse/2d | 24174 ns | 25017 ns | 0.97 |
| array/reverse/1d_inplace | 10330 ns | 10579 ns | 0.98 |
| array/reverse/2d_inplace | 11826 ns | 12305 ns | 0.96 |
| array/copy | 21210 ns | 21174 ns | 1.00 |
| array/iteration/findall/int | 159709.5 ns | 158557 ns | 1.01 |
| array/iteration/findall/bool | 140129 ns | 139738 ns | 1.00 |
| array/iteration/findfirst/int | 162678 ns | 157479 ns | 1.03 |
| array/iteration/findfirst/bool | 163518.5 ns | 158520.5 ns | 1.03 |
| array/iteration/scalar | 73054 ns | 73433 ns | 0.99 |
| array/iteration/logical | 215506.5 ns | 217515.5 ns | 0.99 |
| array/iteration/findmin/1d | 45938 ns | 46904 ns | 0.98 |
| array/iteration/findmin/2d | 95981 ns | 96886 ns | 0.99 |
| array/reductions/reduce/Int64/1d | 42305 ns | 43179 ns | 0.98 |
| array/reductions/reduce/Int64/dims=1 | 55110.5 ns | 47755.5 ns | 1.15 |
| array/reductions/reduce/Int64/dims=2 | 61831 ns | 62189 ns | 0.99 |
| array/reductions/reduce/Int64/dims=1L | 89151 ns | 89100 ns | 1.00 |
| array/reductions/reduce/Int64/dims=2L | 86599.5 ns | 87187.5 ns | 0.99 |
| array/reductions/reduce/Float32/1d | 34392 ns | 35302 ns | 0.97 |
| array/reductions/reduce/Float32/dims=1 | 51824 ns | 41632 ns | 1.24 |
| array/reductions/reduce/Float32/dims=2 | 59683 ns | 59745 ns | 1.00 |
| array/reductions/reduce/Float32/dims=1L | 53027 ns | 53265 ns | 1.00 |
| array/reductions/reduce/Float32/dims=2L | 69910.5 ns | 70365 ns | 0.99 |
| array/reductions/mapreduce/Int64/1d | 41808 ns | 42192.5 ns | 0.99 |
| array/reductions/mapreduce/Int64/dims=1 | 45499.5 ns | 55413.5 ns | 0.82 |
| array/reductions/mapreduce/Int64/dims=2 | 61577 ns | 62472 ns | 0.99 |
| array/reductions/mapreduce/Int64/dims=1L | 89162 ns | 89096 ns | 1.00 |
| array/reductions/mapreduce/Int64/dims=2L | 86429 ns | 87031 ns | 0.99 |
| array/reductions/mapreduce/Float32/1d | 34423 ns | 34963.5 ns | 0.98 |
| array/reductions/mapreduce/Float32/dims=1 | 41823 ns | 42410 ns | 0.99 |
| array/reductions/mapreduce/Float32/dims=2 | 60232 ns | 60141 ns | 1.00 |
| array/reductions/mapreduce/Float32/dims=1L | 53392 ns | 53371 ns | 1.00 |
| array/reductions/mapreduce/Float32/dims=2L | 70498 ns | 70585 ns | 1.00 |
| array/broadcast | 20874 ns | 21207 ns | 0.98 |
| array/copyto!/gpu_to_gpu | 11337 ns | 11289 ns | 1.00 |
| array/copyto!/cpu_to_gpu | 215316 ns | 217320 ns | 0.99 |
| array/copyto!/gpu_to_cpu | 284614 ns | 283093 ns | 1.01 |
| array/accumulate/Int64/1d | 125256 ns | 125220 ns | 1.00 |
| array/accumulate/Int64/dims=1 | 83748 ns | 83742 ns | 1.00 |
| array/accumulate/Int64/dims=2 | 157761 ns | 158355 ns | 1.00 |
| array/accumulate/Int64/dims=1L | 1709202.5 ns | 1709296 ns | 1.00 |
| array/accumulate/Int64/dims=2L | 966206 ns | 966418.5 ns | 1.00 |
| array/accumulate/Float32/1d | 109184 ns | 109589 ns | 1.00 |
| array/accumulate/Float32/dims=1 | 80881.5 ns | 80692 ns | 1.00 |
| array/accumulate/Float32/dims=2 | 147791 ns | 148042 ns | 1.00 |
| array/accumulate/Float32/dims=1L | 1618440 ns | 1618330 ns | 1.00 |
| array/accumulate/Float32/dims=2L | 698196 ns | 698511 ns | 1.00 |
| array/construct | 1283.9 ns | 1255.2 ns | 1.02 |
| array/random/randn/Float32 | 47374 ns | 43548 ns | 1.09 |
| array/random/randn!/Float32 | 24964 ns | 25021 ns | 1.00 |
| array/random/rand!/Int64 | 27404 ns | 27406 ns | 1.00 |
| array/random/rand!/Float32 | 8755.666666666666 ns | 8860.333333333334 ns | 0.99 |
| array/random/rand/Int64 | 29882 ns | 38250 ns | 0.78 |
| array/random/rand/Float32 | 13077.5 ns | 13162 ns | 0.99 |
| array/permutedims/4d | 61086 ns | 61907 ns | 0.99 |
| array/permutedims/2d | 54866.5 ns | 55440 ns | 0.99 |
| array/permutedims/3d | 56006 ns | 56158 ns | 1.00 |
| array/sorting/1d | 2756544 ns | 2757412.5 ns | 1.00 |
| array/sorting/by | 3343162 ns | 3361155 ns | 0.99 |
| array/sorting/2d | 1084151 ns | 1088259 ns | 1.00 |
| cuda/synchronization/stream/auto | 1052.4 ns | 1030.8 ns | 1.02 |
| cuda/synchronization/stream/nonblocking | 7722.1 ns | 8005.9 ns | 0.96 |
| cuda/synchronization/stream/blocking | 828.0882352941177 ns | 853.1914893617021 ns | 0.97 |
| cuda/synchronization/context/auto | 1215.8 ns | 1178.1 ns | 1.03 |
| cuda/synchronization/context/nonblocking | 7555.7 ns | 7403.6 ns | 1.02 |
| cuda/synchronization/context/blocking | 913.1162790697674 ns | 914.6428571428571 ns | 1.00 |
This comment was automatically generated by workflow using github-action-benchmark.
maleadt approved these changes Jul 3, 2025
Trying to fix intermittently failing CI. It doesn't make sense to have these checks for only one of the in-place/not-in-place versions. Hopefully this helps stability.
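
For illustration only, here is a minimal CPU sketch of running the same checks against both the `\` and the `ldiv!` versions of a triangular solve. It is not the code from this PR: it assumes dense matrices from `LinearAlgebra` rather than CUSPARSE arrays, and `solve_both` is a hypothetical helper name.

```julia
using LinearAlgebra

# Hypothetical helper (not from the PR): solve T * X = B both ways.
function solve_both(T, B)
    C = T \ B        # not-in-place: allocates the result
    D = copy(B)
    ldiv!(T, D)      # in-place: overwrites D with the solution
    return C, D
end

n, nB = 10, 2
A = rand(n, n) + n * I   # dense, diagonally dominant, so both triangles are well conditioned
B = rand(n, nB)

for triangle in (LowerTriangular, UpperTriangular)
    T = triangle(A)
    C, D = solve_both(T, B)
    # Identical checks for both versions, so neither is held to a different standard.
    @assert C ≈ D
    @assert T * C ≈ B
    @assert T * D ≈ B
end
```

Keeping the assertions in one loop body means any guard (for example, a library version check) would apply to both versions or to neither.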