Speeding up copy() #121

@ViralBShah

Create a deep_copy() so that it can be explicitly used where necessary.
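For context, here is a minimal sketch of the distinction the request is about. It is written against current Julia, where the deep variant is spelled `deepcopy` rather than the `deep_copy()` proposed here; the variable names are only illustrative.

```julia
# Nested array: the outer array holds references to the inner arrays.
a = [[1.0, 2.0], [3.0, 4.0]]

shallow = copy(a)      # new outer array; inner arrays are shared with `a`
deep    = deepcopy(a)  # new outer array and new inner arrays

a[1][1] = 99.0         # mutate an inner array in place

println(shallow[1][1]) # 99.0: the shared inner array reflects the change
println(deep[1][1])    # 1.0: the deep copy is unaffected
```

For flat arrays of numbers, such as the `ones(100)` vectors in the tests below, the two behave the same, so the timings there concern only the shallow case.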

Also, copy() and copy_to() should be optimized to dispatch to the fastest implementation for the array size. The tests below (Mac, Intel Core 2 Duo) suggest:

  1. memcpy for large copy and copy_to
  2. Native julia for small copy
  3. BLAS for small copy_to

Case 2 may be omitted to keep things simple: for small copies, BLAS is close enough to native Julia (see Test 1) that BLAS could serve both small cases. These tests also need to be repeated on other architectures.
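The size-based choice described above could be sketched roughly as follows. This is an illustration under stated assumptions, not the implementation proposed in this issue. It is written against current Julia, where the in-place copy is spelled `copyto!` rather than `copy_to`. The name `dispatch_copy_to!` and the constant `SMALL_COPY_THRESHOLD` are hypothetical: the tests only bracket the crossover between 100 and 1,000,000 elements, so the value used here is a placeholder. `unsafe_copyto!` stands in for the memcpy path.

```julia
using LinearAlgebra: BLAS

# Hypothetical crossover point. The issue does not give one; it would
# have to be measured on each architecture.
const SMALL_COPY_THRESHOLD = 10_000

# Copy `src` into `dest`, picking an implementation by length.
function dispatch_copy_to!(dest::Vector{Float64}, src::Vector{Float64})
    n = length(src)
    length(dest) >= n || throw(ArgumentError("destination is too short"))
    if n < SMALL_COPY_THRESHOLD
        # Small arrays: BLAS DCOPY with unit strides (case 3).
        BLAS.blascopy!(n, src, 1, dest, 1)
    else
        # Large arrays: block copy of the underlying memory (case 1).
        unsafe_copyto!(dest, 1, src, 1, n)
    end
    return dest
end

a = ones(100)
b = zeros(100)
dispatch_copy_to!(b, a)   # takes the BLAS branch
```

A single threshold keeps the dispatch cheap, which matters because the small-array timings are dominated by per-call overhead.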

!##### Test 1 #######

julia> a = ones(100)
[1.0,1.0,1.0,1.0,1.0,...,1.0,1.0,1.0,1.0,1.0]

!## Julia implementation (FASTEST)
julia> @time for i=1:1e5; jcopy(a); end;
elapsed time: 0.38377809524536133 sec

!## This is DCOPY from BLAS; OpenBLAS implements it in assembly language
julia> @time for i=1:1e5; bcopy(a); end;
elapsed time: 0.41508984565734863 sec

!## This one dispatches to memcpy
julia> @time for i=1:1e5; copy(a); end;
elapsed time: 0.61258411407470703 sec

!##### Test 2 #######

I have now implemented copy_to() for all three cases, which removes the allocation/GC costs from the measurements:

julia> a = ones(100)
[1.0,1.0,1.0,1.0,1.0,...,1.0,1.0,1.0,1.0,1.0]

julia> b = ones(100)
[1.0,1.0,1.0,1.0,1.0,...,1.0,1.0,1.0,1.0,1.0]

!## Julia implementation
julia> @time for i=1:1e5; jcopy_to(b,a); end;
elapsed time: 0.04522800445556641 sec

!## BLAS (FASTEST)
julia> @time for i=1:1e5; bcopy_to(b,a); end;
elapsed time: 0.01788496971130371 sec

!## memcpy
julia> @time for i=1:1e5; copy_to(b,a); end;
elapsed time: 0.27470088005065918 sec

!##### Test 3 #######

And now, a larger size:

julia> a = ones(1000000)
[1.0,1.0,1.0,1.0,1.0,...,1.0,1.0,1.0,1.0,1.0]

julia> b = ones(1000000)
[1.0,1.0,1.0,1.0,1.0,...,1.0,1.0,1.0,1.0,1.0]

!## Julia implementation
julia> @time for i=1:100; jcopy_to(b,a); end;
elapsed time: 0.5620429515838623 sec

!## BLAS
julia> @time for i=1:100; bcopy_to(b,a); end;
elapsed time: 0.5299229621887207 sec

!## memcpy (FASTEST)
julia> @time for i=1:100; copy_to(b,a); end;
elapsed time: 0.35404396057128906 sec
