
Speed #31

Closed
ivanslapnicar opened this issue Oct 23, 2018 · 5 comments

Comments

@ivanslapnicar

The package DoubleDouble.jl, which is now deprecated in favor of DoubleFloats.jl, seems to run 10 times faster. Here are some data for a 500 × 500 matrix multiply.
For DoubleDouble on Julia 0.6.2:
The code:

using DoubleDouble
srand(123)
n=500
A=rand(n,n)
B=rand(n,n)
@time A*B
@time C=A*B

Ad=map(Double,A)
Bd=map(Double,B)

@time Ad*Bd
@time Cd=Ad*Bd

Ab=map(BigFloat,A)
Bb=map(BigFloat,B)

@time Ab*Bb
@time Cb=Ab*Bb;

vecnorm(Cb-C),vecnorm(Cb-Cd)

The output:

  0.760771 seconds (246.38 k allocations: 13.529 MiB)
  0.020722 seconds (7 allocations: 1.908 MiB)
  4.113227 seconds (336.95 k allocations: 17.923 MiB, 2.83% gc time)
  3.102003 seconds (13 allocations: 3.815 MiB)
 92.818986 seconds (502.24 M allocations: 24.324 GiB, 40.85% gc time)
115.261513 seconds (502.00 M allocations: 24.313 GiB, 44.28% gc time)
(1.805819154666807124971283783712203336688579278627334328304525314230137409530732e-11, 6.165584841387163497016967114443636899352678465783015830879065544438127870571583e-28)

The results on JuliaBox are similar.
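A large part of the BigFloat timings above (500 million allocations, ~24 GiB) is heap traffic rather than arithmetic. A sketch of why a pair-of-Float64 type avoids this, on Julia 1.x; `TwoFloat` is a hypothetical stand-in for a double-double type, not the actual `Double64` definition:

```julia
# A struct of two Float64s is an isbits type: arrays of it are one
# contiguous allocation, with elements stored inline. Every BigFloat,
# by contrast, is a separate heap-allocated MPFR object, so each
# intermediate in a matrix multiply allocates.
struct TwoFloat   # hypothetical stand-in for a double-double type
    hi::Float64
    lo::Float64
end

println(isbitstype(TwoFloat))   # true  -- stored inline in arrays
println(isbitstype(BigFloat))   # false -- one heap object per element
```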

For DoubleFloats on Julia 1.0:
The code:

using DoubleFloats, Random, LinearAlgebra
Random.seed!(123)
n=500
A=rand(n,n)
B=rand(n,n)
@time A*B
@time C=A*B

Ad=Double64.(A)
Bd=Double64.(B)

@time Ad*Bd
@time Cd=Ad*Bd

Ab=map(BigFloat,A)
Bb=map(BigFloat,B)

@time Ab*Bb
@time Cb=Ab*Bb;

norm(Cb-C),norm(Cb-Cd)

The results:

  0.080844 seconds (6 allocations: 1.908 MiB, 79.15% gc time)
  0.022727 seconds (6 allocations: 1.908 MiB)
 31.637581 seconds (12 allocations: 3.815 MiB)
 32.924434 seconds (12 allocations: 3.815 MiB)
 72.552449 seconds (502.00 M allocations: 26.183 GiB, 38.97% gc time)
 76.039207 seconds (502.00 M allocations: 26.183 GiB, 38.76% gc time)
(1.80177057469210812126496564089303771446532868324947837791160773801388809191472e-11,
  5.5466913217631572054798245925465557e-28)

is this the expected behavior or am I doing something wrong?

N.B. DoubleFloats cannot be used on JuliaBox with 1.0 yet.
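For what it's worth, the error magnitudes above (~1e-11 against Float64, ~1e-28 against the double-double result) are roughly what unit-roundoff arguments predict. A quick check, assuming a double-double type carries about 106 significand bits:

```julia
# Unit roundoff of Float64 vs. an (assumed) 106-bit double-double
# significand. Each entry of the product is a 500-term dot product,
# whose roundoff grows roughly proportionally to n, so errors near
# 1e-11 (Float64) and 1e-28 (double-double) are in the expected range.
u64 = eps(Float64) / 2    # ≈ 1.1e-16
udd = 2.0^-106            # ≈ 1.2e-32
println(u64)
println(udd)
println(500 * u64)        # crude per-entry bound for n = 500
```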

@JeffreySarnoff
Member

First, to the last point: I have no experience with JuliaBox. What needs to happen to get it running there (if you know)?
I expect DoubleFloats to run calculations that are mostly arithmetic operations very quickly. I don't think you are doing anything wrong; the package is supposed to work the way you prefer to work.

I will look at this now. Thank you for the info -- a few others have been using the package without mentioning this.
When I have some information, I will add it to this issue. Meanwhile, would you run similar comparative timings with componentwise addition and componentwise multiplication (vec_a .* vec_b) and let me know if there is a similar discrepancy?
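For anyone reproducing these componentwise timings, a minimal stdlib-only harness (no BenchmarkTools) that keeps compilation out of the measurement; the Float64 vectors here are placeholders for the Double64/BigFloat versions, and `elementwise_mul` is just an illustrative name:

```julia
# A fairer elementwise timing than a bare @time: call once so the
# method is compiled, then take the minimum of several @elapsed runs.
using Random
Random.seed!(123)

a = rand(250_000)
b = rand(250_000)

elementwise_mul(x, y) = x .* y

elementwise_mul(a, b)   # warm-up: compilation is not timed
times = [@elapsed(elementwise_mul(a, b)) for _ in 1:5]
println("min elapsed: ", minimum(times), " s")
```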

@JeffreySarnoff
Member

JeffreySarnoff commented Oct 23, 2018

with 100 being the fastest relative time and numbers > 100 being proportionately slower:

type           addition   multiplication
DoubleDouble      100          135
DoubleFloats      115          100

Thank you for bringing this to my attention. I will discuss it with the author of DoubleDouble; the algorithm for addition is the same in both packages, so this should be resolvable. The next tagged version will address this specific issue. I assume that is what is influencing the matrix multiply; the only way to know is to retest using the next release. I should have it ready at the end of the week, and I will post a note here at that time.
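For reference, the error-free transformations that double-double addition and multiplication are built on, in the textbook TwoSum/TwoProd form (a sketch of the shared technique, not necessarily the exact code in either package):

```julia
# TwoSum (Knuth): s + e == a + b exactly, with s = fl(a + b).
function two_sum(a::Float64, b::Float64)
    s = a + b
    v = s - a
    e = (a - (s - v)) + (b - v)
    return s, e
end

# TwoProd: p + e == a * b exactly; fma captures the rounding error.
function two_prod(a::Float64, b::Float64)
    p = a * b
    e = fma(a, b, -p)
    return p, e
end

println(two_sum(1.0, 1e-17))   # → (1.0, 1.0e-17): the error term keeps
                               # the bit that plain Float64 addition drops
```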

@ivanslapnicar
Author

Thanks for the quick reply. Here are the results of componentwise multiplications.
For DoubleDouble on Julia 0.6.3
The code:

using DoubleDouble
srand(123)
n=500
A=vec(rand(n,n))
B=vec(rand(n,n))
@time A.*B
@time C=A.*B

Ad=map(Double,A)
Bd=map(Double,B)

@time Ad.*Bd
@time Cd=Ad.*Bd

Ab=map(BigFloat,A)
Bb=map(BigFloat,B)

@time Ab.*Bb
@time Cb=Ab.*Bb;

vecnorm(Cb-C),vecnorm(Cb-Cd)

The results:

  0.134839 seconds (51.38 k allocations: 4.639 MiB)
  0.002283 seconds (32 allocations: 1.909 MiB)
  2.068475 seconds (130.60 k allocations: 10.743 MiB)
  0.006234 seconds (33 allocations: 3.816 MiB)
  0.348915 seconds (549.64 k allocations: 29.218 MiB, 42.67% gc time)
  0.182759 seconds (500.03 k allocations: 26.704 MiB, 60.18% gc time)
(7.702377128382391575778039987032993114656508612230634616468044635748578028159773e-15, 0.000000000000000000000000000000000000000000000000000000000000000000000000000000)

For DoubleFloats on Julia 1.0.1
The code:

using DoubleFloats, Random, LinearAlgebra
Random.seed!(123)
n=500
A=vec(rand(n,n))
B=vec(rand(n,n))
@time A.*B
@time C=A.*B

Ad=Double64.(A)
Bd=Double64.(B)

@time Ad.*Bd
@time Cd=Ad.*Bd

Ab=map(BigFloat,A)
Bb=map(BigFloat,B)

@time Ab.*Bb
@time Cb=Ab.*Bb;

norm(Cb-C),norm(Cb-Cd)

The results:

  0.168818 seconds (228.20 k allocations: 13.357 MiB, 30.40% gc time)
  0.002246 seconds (9 allocations: 1.908 MiB)
  0.206088 seconds (280.16 k allocations: 17.720 MiB)
  0.061515 seconds (9 allocations: 3.815 MiB)
  0.182106 seconds (743.01 k allocations: 40.933 MiB)
  0.059376 seconds (500.01 k allocations: 28.611 MiB)
(7.707771385268457071516998432512981433426423641139552101397606654180100693386613e-15, 0.0)

Again, DoubleFloats is 10 times slower than DoubleDouble, and not faster than BigFloat. Is this expected?

@JeffreySarnoff
Member

JeffreySarnoff commented Oct 23, 2018

I do not see that. When I have a stable change, I will re-benchmark it and post the results.

@JeffreySarnoff
Member

Using the current master (v0.3.2) and BenchmarkTools (preferred to @time):

using BenchmarkTools
using DoubleFloats
using DoubleDouble # a version indelicately altered to run here

setprecision(BigFloat, 106)

d64a = reshape(rand(Double64,100*100),100,100);
d64b = reshape(rand(Double64,100*100),100,100);

biga = BigFloat.(d64a);
bigb = BigFloat.(d64b);

dbla = Double.(biga);
dblb = Double.(bigb);

@btime biga * bigb;
  122.196 ms (4080002 allocations: 217.97 MiB)
@btime d64a * d64b;
  12.504 ms (8 allocations: 156.66 KiB)
@btime dbla * dblb;
  8.038 ms (8 allocations: 156.66 KiB)

122.2/12.5 # BigFloat is slower than DoubleFloats by a factor of
9.8

12.5/8.0 # DoubleFloats is slower than DoubleDouble by a factor of
1.6

DoubleDouble does not handle all values (it has no infinities; all Infs become NaNs), and there are other differences. So DoubleFloats is expected to run somewhat less quickly than DoubleDouble, but it does run quickly relative to equally capable alternatives.
