GPU is not used on Jetson TX2 #5

Closed
ingenieroariel opened this issue May 11, 2017 · 3 comments

Comments

@ingenieroariel

My machine has 8 GB of RAM and CUDA 8 on Ubuntu 16.04, and I've set the following env vars:

LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH
LD_LIBRARY_PATH=/usr/lib/jvm/java-8-openjdk-arm64/jre/lib/aarch64/server:$LD_LIBRARY_PATH
LD_LIBRARY_PATH=/usr/local/mapd-deps/lib:$LD_LIBRARY_PATH

When running mapd the following error appears in the console:

E0511 11:10:42.262192 16533 MapDHandler.cpp:282] No GPUs detected, falling back to CPU mode

Note: the web-based query editor / chart generator is amazing. Here is a screenshot with uname -a and the warning I see:
[screenshot: mapd-on-tegra]

@ingenieroariel
Author

Here is what deviceQuery[1] from the CUDA samples says:

 CUDA Device Query (Runtime API) version (CUDART static linking)

Detected 1 CUDA Capable device(s)

Device 0: "GP10B"
  CUDA Driver Version / Runtime Version          8.5 / 8.0
  CUDA Capability Major/Minor version number:    6.2
  Total amount of global memory:                 7854 MBytes (8235356160 bytes)
  ( 2) Multiprocessors, (128) CUDA Cores/MP:     256 CUDA Cores
  GPU Max Clock rate:                            1301 MHz (1.30 GHz)
  Memory Clock rate:                             13 Mhz
  Memory Bus Width:                              64-bit
  L2 Cache Size:                                 524288 bytes
  Maximum Texture Dimension Size (x,y,z)         1D=(131072), 2D=(131072, 65536), 3D=(16384, 16384, 16384)
  Maximum Layered 1D Texture Size, (num) layers  1D=(32768), 2048 layers
  Maximum Layered 2D Texture Size, (num) layers  2D=(32768, 32768), 2048 layers
  Total amount of constant memory:               65536 bytes
  Total amount of shared memory per block:       49152 bytes
  Total number of registers available per block: 32768
  Warp size:                                     32
  Maximum number of threads per multiprocessor:  2048
  Maximum number of threads per block:           1024
  Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
  Max dimension size of a grid size    (x,y,z): (2147483647, 65535, 65535)
  Maximum memory pitch:                          2147483647 bytes
  Texture alignment:                             512 bytes
  Concurrent copy and kernel execution:          Yes with 1 copy engine(s)
  Run time limit on kernels:                     No
  Integrated GPU sharing Host Memory:            Yes
  Support host page-locked memory mapping:       Yes
  Alignment requirement for Surfaces:            Yes
  Device has ECC support:                        Disabled
  Device supports Unified Addressing (UVA):      Yes
  Device PCI Domain ID / Bus ID / location ID:   0 / 0 / 0
  Compute Mode:
     < Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >

deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 8.5, CUDA Runtime Version = 8.0, NumDevs = 1, Device0 = GP10B
Result = PASS

[1] https://gist.github.com/ingenieroariel/c4e8e1299be58a5b852d91d85ba7da24

@andrewseidl
Contributor

andrewseidl commented May 11, 2017

The most likely cause is that the GPU detection in startmapd isn't properly handling the Jetson. This detection is actually not required anymore now that mapd_server can handle it, so I've removed it.
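For reference, the kind of runtime check mapd_server can do on its own looks roughly like the sketch below. This is only a hypothetical illustration using the CUDA runtime API (cudaGetDeviceCount), not the actual detection code, and the file and binary names are made up:

// gpu_detect.cu -- minimal sketch of runtime GPU detection with a CPU fallback.
// Hypothetical example, not the actual startmapd/mapd_server logic.
// Build on the TX2 (assuming CUDA 8 is installed): nvcc -o gpu_detect gpu_detect.cu
#include <stdio.h>
#include <cuda_runtime.h>

int main(void) {
    int device_count = 0;
    cudaError_t err = cudaGetDeviceCount(&device_count);

    if (err != cudaSuccess || device_count == 0) {
        // Same behavior as the log line above: warn and continue in CPU mode.
        fprintf(stderr, "No GPUs detected (%s), falling back to CPU mode\n",
                err != cudaSuccess ? cudaGetErrorString(err) : "zero devices");
        return 0;
    }

    printf("Detected %d CUDA capable device(s)\n", device_count);
    for (int i = 0; i < device_count; ++i) {
        cudaDeviceProp prop;
        cudaGetDeviceProperties(&prop, i);
        printf("  Device %d: \"%s\", compute capability %d.%d\n",
               i, prop.name, prop.major, prop.minor);
    }
    return 0;
}

On the TX2 this should report the same GP10B device that deviceQuery shows above.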

Could you do a git pull to pick up the new version of startmapd and see if that helps? The change is in commit f694c47.


A second potential cause is not having permission to access the GPU devices, which I've seen happen in older versions of L4T when not using the default user. This is probably not the issue in your case, since the samples work and you're using the default nvidia user.
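If you ever need to rule the permission issue in or out, a quick check is to try opening the Tegra GPU device nodes directly and look for EACCES. The sketch below is a rough, hypothetical helper; the /dev/nvhost-ctrl-gpu and /dev/nvmap paths are assumptions based on a typical L4T install, and the usual fix when they fail with "permission denied" is adding the user to the video group:

// devnode_check.c -- rough check for access to the Tegra GPU device nodes.
// Hypothetical helper; the device node paths are assumptions for a typical L4T setup.
// Build: gcc -o devnode_check devnode_check.c
#include <stdio.h>
#include <string.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

int main(void) {
    const char *nodes[] = {"/dev/nvhost-ctrl-gpu", "/dev/nvmap"};
    for (size_t i = 0; i < sizeof(nodes) / sizeof(nodes[0]); ++i) {
        int fd = open(nodes[i], O_RDWR);
        if (fd >= 0) {
            printf("%s: accessible\n", nodes[i]);
            close(fd);
        } else if (errno == EACCES) {
            // Permission problem: the user likely needs to be in the "video" group.
            printf("%s: permission denied\n", nodes[i]);
        } else {
            printf("%s: %s\n", nodes[i], strerror(errno));
        }
    }
    return 0;
}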

@ingenieroariel
Author

I get no warnings now. Thanks!

fexolm referenced this issue in fexolm/omniscidb Oct 3, 2019
Temporary fix for NONE ENCODED strings
andrewseidl pushed a commit that referenced this issue Nov 9, 2021
* Register local / global hint in Calcite

* Support g_ prefix for global query hint name

* Translate global hint in analyzer

* Add tests

* Apply comments #1: global hint registration

* Apply comments #2: global hint flag identification

* Apply comments #3: global hint translation

* Apply comments #4: remove unnecessary virtual keyword

* Fixup a bug on allow_gpu_hashtable build hint for overlaps join

* Fixup a bug related to a query having multiple identical subqueries

* Add global hint tests related to overlaps join hashtable

* Apply comments #5: misc cleanup
misiugodfrey pushed a commit that referenced this issue Aug 27, 2024
* Add buffer holders for GPU execution

* Rename structures used to codegen

* Introduce WindowFunctionCtx namespace

* Add preparation for GPU execution in window ctx

* Cleanup & improve WindowFunctionContext::compute()

* Improve a logic to build aggregate tree w/ supporting reusing

* Improve segment tree constructor

* Rebase

* Address comments #1

* Address comments #2: refactor bool param functions

* Address comments #3

* Address comments #4: tbb

* Address comments #5

* Address comments #6

* Fixup test failures

Signed-off-by: Misiu Godfrey <misiu.godfrey@kraken.mapd.com>