cudaf/malloc-test

malloc() is a function for dynamic memory allocation in C.

testMalloc():
Testing performance of 100 memory copy operations
between CPU memory allocated with malloc().
testCudaMalloc():
Testing performance of 100 memory copy operations
between CPU memory allocated with malloc() and
GPU memory allocated with cudaMalloc(). Because
memory allocated with malloc() is pageable, it is
first copied to a page-locked `staging` area before
being transferred to the GPU by DMA. Pinning all host
memory is not a cure-all, however: allocating too
much pinned memory can slow the system down, or even
crash it, due to lack of usable memory.
testCudaHostAlloc():
Testing performance of 100 memory copy operations
between CPU memory allocated with cudaHostAlloc()
and GPU memory allocated with cudaMalloc(). Memory
allocated with cudaHostAlloc() is page-locked
(pinned), which means the memory can be directly
copied by DMA into the GPU.
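
The difference between the pageable and pinned paths described above can be sketched as follows. This is a minimal illustration, not the code from main.cu: the helper `timeCopies`, the `CHECK` macro, and the 64 MiB buffer size are assumptions made for the example, and only the 100-copy count comes from the text.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cstring>
#include <cuda_runtime.h>

// Hypothetical helper macro: abort with a message if a CUDA runtime call fails.
#define CHECK(call)                                            \
  do {                                                         \
    cudaError_t err_ = (call);                                 \
    if (err_ != cudaSuccess) {                                 \
      fprintf(stderr, "%s:%d: %s\n", __FILE__, __LINE__,       \
              cudaGetErrorString(err_));                       \
      exit(EXIT_FAILURE);                                      \
    }                                                          \
  } while (0)

// Hypothetical helper: total time in milliseconds for `repeat` copies
// of `bytes` bytes from `src` to `dst`, measured with CUDA events.
float timeCopies(void *dst, const void *src, size_t bytes,
                 cudaMemcpyKind kind, int repeat) {
  cudaEvent_t start, stop;
  CHECK(cudaEventCreate(&start));
  CHECK(cudaEventCreate(&stop));
  CHECK(cudaEventRecord(start, 0));
  for (int i = 0; i < repeat; ++i)
    CHECK(cudaMemcpy(dst, src, bytes, kind));
  CHECK(cudaEventRecord(stop, 0));
  CHECK(cudaEventSynchronize(stop));
  float ms = 0.0f;
  CHECK(cudaEventElapsedTime(&ms, start, stop));
  CHECK(cudaEventDestroy(start));
  CHECK(cudaEventDestroy(stop));
  return ms;
}

int main() {
  const size_t bytes = 64u << 20;  // assumed buffer size (64 MiB)
  const int repeat = 100;          // copy count stated in the README

  void *pageable = malloc(bytes);  // ordinary pageable host memory
  if (pageable == nullptr) {
    fprintf(stderr, "malloc failed\n");
    return EXIT_FAILURE;
  }
  void *pinned = nullptr;          // page-locked host memory
  void *device = nullptr;          // GPU memory
  CHECK(cudaHostAlloc(&pinned, bytes, cudaHostAllocDefault));
  CHECK(cudaMalloc(&device, bytes));

  // Touch both host buffers so their pages are actually backed by RAM.
  memset(pageable, 1, bytes);
  memset(pinned, 1, bytes);

  printf("CPU malloc -> GPU cudaMalloc: %.1f ms\n",
         timeCopies(device, pageable, bytes, cudaMemcpyHostToDevice, repeat));
  printf("CPU malloc <- GPU cudaMalloc: %.1f ms\n",
         timeCopies(pageable, device, bytes, cudaMemcpyDeviceToHost, repeat));
  printf("CPU cudaHostAlloc -> GPU cudaMalloc: %.1f ms\n",
         timeCopies(device, pinned, bytes, cudaMemcpyHostToDevice, repeat));
  printf("CPU cudaHostAlloc <- GPU cudaMalloc: %.1f ms\n",
         timeCopies(pinned, device, bytes, cudaMemcpyDeviceToHost, repeat));

  CHECK(cudaFreeHost(pinned));     // pinned memory is released with cudaFreeHost()
  CHECK(cudaFree(device));
  free(pageable);
  return 0;
}
```

The absolute timings depend on the buffer size, the GPU, and the PCIe link, so this sketch should not be expected to reproduce the numbers reported for main.cu.
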
$ nvcc -std=c++17 -Xcompiler -O3 main.cu
$ ./a.out

# CPU malloc -> CPU malloc: 0.0 ms
#
# CPU malloc -> GPU cudaMalloc: 208.9 ms
# CPU malloc <- GPU cudaMalloc: 178.7 ms
#
# CPU cudaHostAlloc -> GPU cudaMalloc: 86.1 ms
# CPU cudaHostAlloc <- GPU cudaMalloc: 80.4 ms

See main.cu for code.



References