Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Performance Tests and Benchmarking #103

Open
vitcpp opened this issue Nov 7, 2023 · 3 comments
Open

Performance Tests and Benchmarking #103

vitcpp opened this issue Nov 7, 2023 · 3 comments

Comments

@vitcpp
Copy link
Contributor

vitcpp commented Nov 7, 2023

Dear Sirs,

There is a suggestion to implement performance tests and benchmark pgSphere. One of the important pgSphere features is the fast access and search to spherical objects using indexes. To improve existing functionality, we need a way to identify bottlenecks and compare modification results. We should benefit from performance testing.

There are some but not all the questions related to performance testing:
- Which data patterns to use?
- Where to get test data?
- How to configure PostgreSQL?

I would like to start discussion of the pgSphere performance testing. I welcome any feedback and suggestions on the subject.

@esabol
Copy link
Contributor

esabol commented Nov 8, 2023

Our most typical use-case:

Our tables have longitudes and latitudes in columns ra and dec, respectively, and we search for matches that appear in a user-specified circle.

First, we create an index on spoint_deg(ra,dec). Then, in response to user requests, we query

select * from sometable where spoint_deg(ra,dec) @ scircle_deg(spoint_deg(1.2345,6.7890),0.166667)

to find all the entries in sometable which have points inside the user-specified circle centered at (1.2345,6.7890) with a radius of 0.166667 degrees.

For test data, any large public domain astronomical catalog would work. I'm not sure which ones are public domain though.

@df7cb
Copy link
Contributor

df7cb commented Nov 8, 2023

@esabol, that is the use case I tried to improve in #80, but I'm not 100% sure if it also works with the _deg variants of the functions, could you give that a real-life check?

@esabol
Copy link
Contributor

esabol commented Nov 8, 2023

@esabol, that is the use case I tried to improve in #80, but I'm not 100% sure if it also works with the _deg variants of the functions, could you give that a real-life check?

I'm not sure, but I think it is? It says the index is being used. If I add a column to the table named s_pos, populate it with the result of spoint_deg(ra,dec), create an index on s_pos, the query is 3.1 times faster for this very large table we have (approximately 8 million rows). But it should be faster since it's not calculating spoint_deg(ra,dec) for each row at query time, right?

Here's the output of explain analyze for such a scenario:

db=> explain analyze select * from obscore where spoint_deg(s_ra,s_dec) @ scircle_deg(spoint_deg(1.2345,6.7890),0.166667);
                                                             QUERY PLAN                                 
                            
--------------------------------------------------------------------------------------------------------
----------------------------
 Bitmap Heap Scan on obscore  (cost=318.85..29276.98 rows=8056 width=518) (actual time=2608.485..2608.70
2 rows=77 loops=1)
   Recheck Cond: (spoint_deg(s_ra, s_dec) @ '<(0.02154608961587 , 0.118490402917895) , 0.002908887904421
39>'::scircle)
   Heap Blocks: exact=11
   ->  Bitmap Index Scan on s_point_deg_idx  (cost=0.00..316.84 rows=8056 width=0) (actual time=2553.354
..2553.355 rows=77 loops=1)
         Index Cond: (spoint_deg(s_ra, s_dec) @ '<(0.02154608961587 , 0.118490402917895) , 0.00290888790
442139>'::scircle)
 Planning Time: 5481.711 ms
 Execution Time: 3027.972 ms
(7 rows)

db=> explain analyze select * from obscore where s_pos @ scircle_deg(spoint_deg(1.2345,6.7890),0.166667);
                                                         QUERY PLAN                                     
                    
--------------------------------------------------------------------------------------------------------
--------------------
 Bitmap Heap Scan on obscore  (cost=318.85..29256.84 rows=8056 width=518) (actual time=975.301..975.472 
rows=77 loops=1)
   Recheck Cond: (s_pos @ '<(0.02154608961587 , 0.118490402917895) , 0.00290888790442139>'::scircle)
   Heap Blocks: exact=11
   ->  Bitmap Index Scan on s_pos_idx  (cost=0.00..316.84 rows=8056 width=0) (actual time=975.271..975.2
73 rows=77 loops=1)
         Index Cond: (s_pos @ '<(0.02154608961587 , 0.118490402917895) , 0.00290888790442139>'::scircle)
 Planning Time: 0.252 ms
 Execution Time: 975.583 ms
(7 rows)

Partial list of indexes we have on this table:

    "s_circle_idx" gist (s_circle)
    "s_circle_s_region" gist (scircle_intext(s_region))
    "s_point_deg_idx" gist (spoint_deg(s_ra, s_dec))
    "s_poly_idx" gist (s_poly)
    "s_poly_s_region" gist (spoly_intext(s_region))
    "s_pos_idx" gist (s_pos)

If I repeat the above, they are much faster on subsequent attempts (or maybe the database was busy doing something else when I tried the first time):

db=> explain analyze select * from obscore where s_pos <@ scircle_deg(spoint_deg(1.2345,6.7890),0.166667);
                                                       QUERY PLAN                                       
                
--------------------------------------------------------------------------------------------------------
----------------
 Bitmap Heap Scan on obscore  (cost=318.85..29256.84 rows=8056 width=518) (actual time=0.180..0.452 rows
=77 loops=1)
   Recheck Cond: (s_pos <@ '<(0.02154608961587 , 0.118490402917895) , 0.00290888790442139>'::scircle)
   Heap Blocks: exact=11
   ->  Bitmap Index Scan on s_pos_idx  (cost=0.00..316.84 rows=8056 width=0) (actual time=0.147..0.148 r
ows=77 loops=1)
         Index Cond: (s_pos <@ '<(0.02154608961587 , 0.118490402917895) , 0.00290888790442139>'::scircle
)
 Planning Time: 0.516 ms
 Execution Time: 0.582 ms
(7 rows)

db=> explain analyze select * from obscore where spoint_deg(s_ra,s_dec) <@ scircle_deg(spoint_deg(1.2345,6.7890),0.166667);
                                                          QUERY PLAN                                    
                      
--------------------------------------------------------------------------------------------------------
----------------------
 Bitmap Heap Scan on obscore  (cost=318.85..29276.98 rows=8056 width=518) (actual time=0.240..0.412 rows
=77 loops=1)
   Recheck Cond: (spoint_deg(s_ra, s_dec) <@ '<(0.02154608961587 , 0.118490402917895) , 0.00290888790442
139>'::scircle)
   Heap Blocks: exact=11
   ->  Bitmap Index Scan on s_point_deg_idx  (cost=0.00..316.84 rows=8056 width=0) (actual time=0.213..0
.214 rows=77 loops=1)
         Index Cond: (spoint_deg(s_ra, s_dec) <@ '<(0.02154608961587 , 0.118490402917895) , 0.0029088879
0442139>'::scircle)
 Planning Time: 0.391 ms
 Execution Time: 0.543 ms
(7 rows)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants