Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

I tested Alenka on TPC-H on my GPU GTX 460SE, 1GB, 288 Cores, SM2.0. #33

Closed
AlexeyAB opened this issue Jun 3, 2013 · 10 comments
Closed

Comments

@AlexeyAB
Copy link
Contributor

AlexeyAB commented Jun 3, 2013

Hi Anton!
I tested Alenka on TPC-H on my GPU GTX 460SE, 1GB, 288 Cores, SM2.0.

Can you show the results on your GPU and on the others GPU, on which you tested?

I have such results:

TPC-H GPU SM2.0 GTX 460SE 1GB 288 Cores
Volume of DB: 10GB 100GB
Launch number: 1st launch 2nd launch 1st launch 2nd launch
q1.sql 5,3 3,8 90 90
q2.sql 1 0,36 9,3 2,3
q3.sql 1,8 1,7 71 67
q4.sql 4,25 1,3 45 13
q5.sql - - - -
q6.sql 1,77 0,9 39 8
q7.sql 8,7 2 - -

Regards, Alexey

@AlexeyAB
Copy link
Contributor Author

AlexeyAB commented Jun 3, 2013

An additional, i can't run q5 and get an error.

For 100 GB TPC-H:

F:\GPU_test_100GB_new>AlenkaDB.exe q5.sql
GeForce GTX 460 SE : 1296.000 Mhz   (Ordinal 0)
6 SMs enabled. Compute Capability sm_21
FreeMem:    867MB   TotalMem:   1023MB.
Mem Clock: 1700.000 Mhz x 256 bits   (108.800 GB/s)
ECC Disabled

Process count = 6200000
BINARY LOAD: R region
Reading 5 records
...
JOIN J1 l_orderkey o_orderkey 829939712
segment 96 560193536
join count 18210128
join time 56.555 801366016
JOIN J2 l_suppkey s_suppkey 829939712
Cuda error in file 'bison.y' in line 1461 : out of memory.

And for TPC-H 10GB i get an exception:

F:\GPU_test_10GB_new>AlenkaDB.exe q5.sql
GeForce GTX 460 SE : 1296.000 Mhz (Ordinal 0)
6 SMs enabled. Compute Capability sm_21
FreeMem: 858MB TotalMem: 1023MB.
Mem Clock: 1700.000 Mhz x 256 bits (108.800 GB/s)
ECC Disabled

Process count = 6200000
BINARY LOAD: R region
Reading 5 records
FILTER RF R 900055040
MAP CHECK segment 0 R
filter is finished 1 899006464
filter time 0.004 900055040
BINARY LOAD: N nation
Reading 25 records
BINARY LOAD: O orders
Reading 15000000 records
FILTER OFI O 900055040
MAP CHECK segment 2 R
filter is finished 2275919 861650944
filter time 0.065 886554624
BINARY LOAD: S supplier
Reading 100000 records
BINARY LOAD: C customer
Reading 1500000 records
BINARY LOAD: L lineitem
Reading 59986052 records
JOIN J11 n_regionkey r_regionkey 886554624
segment 0 886554624
join count 5
join time 0.007 886554624
JOIN J12 c_nationkey n_nationkey 886554624
segment 0 874496000
join count 300270
join time 0.035 876593152
JOIN J13 o_custkey c_custkey 886554624
segment 2 801226752
join count 456771
join time 0.154 822460416
JOIN J1 l_orderkey o_orderkey 828882944
segment 9 723501056
join count 1825856
join time 1.253 793100288
JOIN J2 l_suppkey s_suppkey 820625408
segment 0 803848192

<- got an exception

@antonmks
Copy link
Owner

antonmks commented Jun 3, 2013

Hi Alexey
I do not have a 100GB database right now, only 300GB so I can tell you my 300x results.
I do not count the disk read time because it depends only on your disk subsystem so these are run from a host memory ("second run") results :
So here it is for 300GB (in seconds) :
Q1 - 33
Q2 - 4
Q3 -16
Q5
Q6 - 2
Q7 -30

Q5 doesn't quite work at this scale because of a large intermediate data set. It is easy to fix, I just need to add partitioning routine to a join, just like in groupby.
My setup is :
Pentium G620
GTX 580 3GB of DDR5
16GB of DDR3

Before creating datafiles I presorted the lineitem.tbl on date_ordered column, this way Alenka can skip entire segments in filter ops.
I should be able to run the tests next week on GTX Titan, I expect the results to be significantly faster.

Regards, Anton

@AlexeyAB
Copy link
Contributor Author

AlexeyAB commented Jun 3, 2013

Now Alenka uses segment's (storage) indexes in DSM, like in Oracle ExatadaV2 in PAX?

You'll get access to Titan at some university, in a cloud or will buy it?

And why you exclude q4.sql?

Regards, Alexey

@antonmks
Copy link
Owner

antonmks commented Jun 3, 2013

Alenka doesn't use any indexes.
Alenka stores min and max values of every column for every segment so when running operations it is often possible to figure out that none of the values in a segment match, even without reading the segment itself.
GTX Titan - I got it as a present from Israeli guys who use Alenka :-)
Q4 - I exclude a lot of queries, supporting all TPC-H would require implementing many additional features ...

@AlexeyAB
Copy link
Contributor Author

AlexeyAB commented Jun 3, 2013

This method in Oracle is called the Storage Index, which uses in Smart Scans :)
http://www.oracle.com/technetwork/issue-archive/2011/11-may/o31exadata-354069.html

Storage Index

Figure 1 shows how the data inside a storage index is maintained.
...
Each storage index entry covers a physical region of the table, contains minimum and maximum values 
of the columns in that region, and also indicates whether any of the rows in that region contain nulls.

And If interested in HCC Mechanics, three paragraphs about it (half a page), the structure of the hybrid columnar compression (what sounds a whole lot like PAX) in Oracle ExadataV2.
They use a sorting of data in the first place, for the effective compression ratio.
http://books.google.ru/books?id=1Aqw6mrxz5AC&lpg=PA104&ots=T9pNopWGoA&hl=ru&pg=PA72&redir_esc=y#v=onepage&q&f=false

Discussion and what said Kevin Closson about HCC:
http://www.dbms2.com/2009/09/03/oracle-11g-exadata-hybrid-columnar-compression/

@antonmks
Copy link
Owner

antonmks commented Jun 3, 2013

Yeah, hybrid compression becomes popular. If you are into this, take a look
at

http://www.dbms2.com/2013/05/27/ibm-blu

Pretty interesting system and it executes queries on compressed data too !

@AlexeyAB
Copy link
Contributor Author

AlexeyAB commented Jun 3, 2013

Thanks!

About presorted the lineitem.tbl on date_ordered column. Can you write that SQL?

This has no effect:

A := LOAD 'lineitem.tbl' USING ('|') AS (orderkey{1}:int,  l_suppkey{3}:int, qty{5}:int, 
 price{6}:decimal, discount{7}:decimal, tax{8}:decimal, rf{9}:varchar(1),
 lf{10}:varchar(1), shipdate{11}:int, commitdate{12}:int, receiptdate{13}:int);

OA := ORDER A BY shipdate ASC;

STORE OA INTO 'lineitem' BINARY;

Outpup:
Process count = 6200000
LOAD: A lineitem.tbl 11 |
order: OA A
STORE: OA lineitem
LOADING lineitem.tbl |
SQL scan parse worked
cycle time 0.81

And no files created.

@antonmks
Copy link
Owner

antonmks commented Jun 4, 2013

Well, you need to presort the entire file :
sort -t "|" -k11 lineitem.tbl > lineitems.tbl

Regards, Anton

@AlexeyAB
Copy link
Contributor Author

AlexeyAB commented Jun 4, 2013

Thanks!
This is bash, but I on Windows, here sort can't do this :)
Will have to run a virtual machine.

@antonmks
Copy link
Owner

antonmks commented Jun 4, 2013

Just install Cygwin, it is really convenient.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants