Closed
Description
Describe the bug
When conducting tests, I found that the speed of calculations on the GPU is extremely low (even lower than when calculating on the CPU).
Here are my results:
------ CPU Benchmark ------
0 iteration duration: 0.55710983276367 seconds.
1 iteration duration: 0.56731915473938 seconds.
2 iteration duration: 0.60753011703491 seconds.
------ GPU Benchmark ------
0 iteration duration: 0.95671892166138 seconds.
1 iteration duration: 0.94768095016479 seconds.
2 iteration duration: 0.94867181777954 seconds.
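To put these timings in perspective, they can be converted to an effective throughput. The sketch below assumes the usual cost estimate of about 2·n³ floating-point operations for a dense n × n matrix product; `matmulGflops` is a hypothetical helper, not part of NumPower.

```php
<?php
// Hypothetical helper: effective throughput of an n x n by n x n matmul.
function matmulGflops(int $n, float $seconds): float
{
    // Assumption: a dense n x n matmul costs about 2 * n^3 floating-point operations.
    $flops = 2.0 * $n ** 3;
    return $flops / $seconds / 1e9;
}

// First-iteration timings copied from the results above.
$timings = [
    'CPU' => 0.55710983276367,
    'GPU' => 0.95671892166138,
];

foreach ($timings as $device => $seconds) {
    printf("%s: %.1f GFLOP/s" . PHP_EOL, $device, matmulGflops(4096, $seconds));
}
```

Under that estimate, the CPU run works out to roughly 247 GFLOP/s and the GPU run to roughly 144 GFLOP/s.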
To Reproduce
Execute the following script:
<?php
use \NDArray as nd;

// Build a 4096 x 4096 matrix of random integers in [0, 10].
$matrix = [];
for ($i = 0; $i < 4096; $i++) {
    for ($j = 0; $j < 4096; $j++) {
        $matrix[$i][] = random_int(0, 10);
    }
}

// CPU benchmark: time three matrix multiplications.
$x_cpu = nd::array($matrix)->cpu();
$y_cpu = nd::array($matrix)->cpu();
echo PHP_EOL;
echo "------ CPU Benchmark ------" . PHP_EOL . PHP_EOL;
for ($i = 0; $i < 3; $i++) {
    $startTime = microtime(true);
    nd::matmul($x_cpu, $y_cpu);
    // Parentheses make the subtraction explicit; PHP 8 already evaluates it before ".".
    echo "$i iteration duration: " . (microtime(true) - $startTime) . " seconds." . PHP_EOL;
}

// GPU benchmark: same operands, moved to the GPU before timing starts.
$x_gpu = nd::array($matrix)->gpu();
$y_gpu = nd::array($matrix)->gpu();
echo PHP_EOL;
echo "------ GPU Benchmark ------" . PHP_EOL . PHP_EOL;
for ($i = 0; $i < 3; $i++) {
    $startTime = microtime(true);
    nd::matmul($x_gpu, $y_gpu);
    echo "$i iteration duration: " . (microtime(true) - $startTime) . " seconds." . PHP_EOL;
}
echo PHP_EOL;
Expected behavior
Computations on the GPU are faster than on the CPU.
Here are my results for PyTorch:
------ CPU Benchmark ------
0 iteration duration: 0.49809885025024414 seconds
1 iteration duration: 0.5133359432220459 seconds
2 iteration duration: 0.5611481666564941 seconds
------ GPU Benchmark ------
0 iteration duration: 0.052846670150756836 seconds
1 iteration duration: 0.05301022529602051 seconds
2 iteration duration: 0.04915332794189453 seconds
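When comparing backends like this, it can help to exclude warm-up runs and report a median rather than individual iterations. The following is a minimal sketch of such a harness; `benchmark` and its parameters are hypothetical names, and the workload is a pure-PHP stand-in so the example runs without NumPower installed.

```php
<?php
/**
 * Hypothetical helper: time a callable, discarding warm-up runs.
 * Returns the min, median and max duration in seconds.
 */
function benchmark(callable $fn, int $warmup = 2, int $runs = 5): array
{
    if ($runs < 1) {
        throw new InvalidArgumentException('runs must be at least 1');
    }
    // Warm-up calls are executed but not timed.
    for ($i = 0; $i < $warmup; $i++) {
        $fn();
    }
    $durations = [];
    for ($i = 0; $i < $runs; $i++) {
        $start = hrtime(true);                     // monotonic clock, nanoseconds
        $fn();
        $durations[] = (hrtime(true) - $start) / 1e9;
    }
    sort($durations);
    return [
        'min'    => $durations[0],
        'median' => $durations[intdiv($runs, 2)],  // upper median for even counts
        'max'    => $durations[$runs - 1],
    ];
}

// Stand-in workload (an assumption for illustration only).
$stats = benchmark(function (): void {
    $sum = 0;
    for ($i = 0; $i < 100000; $i++) {
        $sum += $i;
    }
});

printf("median: %.6f s (min %.6f, max %.6f)" . PHP_EOL,
    $stats['median'], $stats['min'], $stats['max']);
```

With the script from "To Reproduce", the callable would wrap the call being measured, for example `fn () => nd::matmul($x_gpu, $y_gpu)`. This sketch only measures wall-clock time around the call; whether the GPU call returns before the computation has finished is not established here.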
Dumps
If applicable, add NDArray low-level dumps of the relevant arrays.
$x_gpu->dump();
=================================================
NDArray.uuid 3
NDArray.ndim 2
NDArray.dims [ 4096 4096 ]
NDArray.strides [ 16384 4 ]
NDArray.device (1) GPU
NDArray.refcount 1
NDArray.descriptor.elsize 4
NDArray.descriptor.numElements 16777216
NDArray.descriptor.type float32
NDArray.iterator.current_index 0
=================================================
$y_gpu->dump();
=================================================
NDArray.uuid 4
NDArray.ndim 2
NDArray.dims [ 4096 4096 ]
NDArray.strides [ 16384 4 ]
NDArray.device (1) GPU
NDArray.refcount 1
NDArray.descriptor.elsize 4
NDArray.descriptor.numElements 16777216
NDArray.descriptor.type float32
NDArray.iterator.current_index 0
=================================================
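The dumped values are consistent with a contiguous row-major layout of float32 elements. As a quick sanity check, the sketch below recomputes the byte strides and total size from `dims` and `elsize`; `rowMajorStrides` is a hypothetical helper, and the row-major layout is an assumption inferred from the dump rather than something confirmed by NumPower's documentation.

```php
<?php
// Hypothetical helper: byte strides of a contiguous row-major array.
function rowMajorStrides(array $dims, int $elsize): array
{
    $strides = [];
    $stride = $elsize;
    // Walk from the last axis to the first; each stride is the size of one step along that axis.
    for ($axis = count($dims) - 1; $axis >= 0; $axis--) {
        $strides[$axis] = $stride;
        $stride *= $dims[$axis];
    }
    ksort($strides);  // restore axis order 0, 1, ...
    return $strides;
}

// Values copied from the dump above.
$dims = [4096, 4096];
$elsize = 4;

$strides = rowMajorStrides($dims, $elsize);
$bytes = array_product($dims) * $elsize;

echo implode(' ', $strides), PHP_EOL;         // 16384 4
echo $bytes / (1024 ** 2), " MiB", PHP_EOL;   // 64 MiB
```

Each operand therefore occupies 64 MiB, so both matrices fit comfortably in the 4 GB of GPU memory listed under Environment.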
Environment:
- OS: Ubuntu 20.04
- RAM: 20GB
- CPU: Intel(R) Core(TM) i7-4710HQ CPU @ 2.50GHz
- GPU: NVIDIA GeForce GTX 980M 4GB
- Nvidia driver version: 550.67
- CUDA version: 11.6.2
- PHP version: 8.3.0
- NumPower version: 0.5.0
Optional: PHP Information
PHP Version Info (php -v)
PHP 8.3.0 (cli) (built: Jun 19 2024 03:38:22) (NTS)
Copyright (c) The PHP Group
Zend Engine v4.3.0, Copyright (c) Zend Technologies
PHP Modules:
[PHP Modules]
Core
date
libxml
openssl
pcre
sqlite3
zlib
bz2
ctype
curl
dom
fileinfo
filter
gd
hash
iconv
json
mbstring
SPL
session
PDO
standard
mysqlnd
pdo_sqlite
Phar
posix
random
readline
Reflection
pdo_mysql
SimpleXML
tokenizer
xml
xmlreader
xmlwriter
zip
NumPower
[Zend Modules]
Labels: bug (Something isn't working)
Status: Done