Parallel support #235

Open
gabri94 opened this issue Nov 18, 2018 · 6 comments

Comments

@gabri94

gabri94 commented Nov 18, 2018

Hi, I'd like to take advantage of the parallelization offered by the latest versions of Postgres.
I tried running the following query to check whether the planner would execute it in parallel.

 SELECT pa
    FROM lidar_table
    WHERE PC_Intersects(pa, ST_GeomFromText('POINT(43.1 11.8)', 4326))

However, I realized that this wasn't the case, even though the function PC_Intersects is marked as 'PARALLEL SAFE'.
What could be the problem?

@autcrock

Hi all.

I'm interested in this issue too.

I'll aim to work through analyzing the planner output, but if there is any advice on what I need to consider with respect to the pointcloud context, I would appreciate it.

Cheers

Mike Thomas

@gabri94
Author

gabri94 commented Apr 27, 2020

In the end, to achieve better performance, I dropped postgis and pointcloud for raster management altogether.
Now I am working directly with raster files from Python (using rasterio) and my performance is 100x to 1000x better.

@Remi-C
Contributor

Remi-C commented Apr 27, 2020

Hey @gabri94, this is typically the type of query that should have been using the index.
To make it short, you should have small pa (up to a few millions), indexed.
I'm not sure I understand the use of PC_Intersects in this case.
Most likely you don't need a pc function here, but rather a postgis function
(overlap of the pa bounding box and your point bounding box).
If you don't use indexes in your workflow, postgres/postgis/pgpointcloud is pretty much useless; you might as well use a brute-force solution.
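
For readers unfamiliar with the bounding-box test mentioned above, here is a minimal sketch in plain Python. It is not pgpointcloud or PostGIS code: the patch ids, the `(xmin, ymin, xmax, ymax)` tuple layout, and both function names are assumptions made for illustration. It only shows what "the point falls inside the patch bounding box" means; a spatial index answers the same question without scanning every patch, which is what this linear scan does.

```python
def bbox_contains_point(bbox, x, y):
    """Return True if (x, y) lies inside or on the edge of bbox.

    bbox is a hypothetical (xmin, ymin, xmax, ymax) tuple.
    """
    xmin, ymin, xmax, ymax = bbox
    return xmin <= x <= xmax and ymin <= y <= ymax


def candidate_patches(patch_bboxes, x, y):
    """Brute-force scan: ids of all patches whose bounding box contains the point."""
    return [
        patch_id
        for patch_id, bbox in patch_bboxes.items()
        if bbox_contains_point(bbox, x, y)
    ]


# Made-up bounding boxes for two patches; the point is the one from the query above.
patch_bboxes = {
    1: (43.0, 11.0, 43.5, 12.0),
    2: (44.0, 11.0, 44.5, 12.0),
}
matches = candidate_patches(patch_bboxes, 43.1, 11.8)
```

Only the patches returned by such a filter would then need an exact point-level check.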

@gabri94
Author

gabri94 commented Apr 27, 2020

I was using the index, obviously, but since I had to run 2 billion queries on the same DSM, it would have taken 30 days to finish.
I ended up removing the DBMS from the equation and loading the whole DSM in memory at once.
I don't think there's a problem with pointcloud, rather that my use case was not suited for it.
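
A minimal sketch of the in-memory approach described above, assuming a north-up raster with square pixels. The thread mentions rasterio for reading the files; to stay self-contained this sketch uses plain Python lists instead, and the class name `InMemoryDSM`, its parameters, and the tiny grid are all hypothetical. Once the grid is in memory, each lookup is just index arithmetic, with no query planning or round trip to a database.

```python
import math


class InMemoryDSM:
    """Hypothetical in-memory elevation grid for a north-up raster."""

    def __init__(self, heights, origin_x, origin_y, pixel_size):
        self.heights = heights        # list of rows; row 0 is the northern edge
        self.origin_x = origin_x      # x coordinate of the top-left corner
        self.origin_y = origin_y      # y coordinate of the top-left corner
        self.pixel_size = pixel_size  # edge length of one square pixel

    def sample(self, x, y):
        """Return the height of the cell containing (x, y), or None if outside."""
        col = math.floor((x - self.origin_x) / self.pixel_size)
        row = math.floor((self.origin_y - y) / self.pixel_size)
        if 0 <= row < len(self.heights) and 0 <= col < len(self.heights[0]):
            return self.heights[row][col]
        return None


# Made-up 2x2 grid covering x in [0, 2) and y in (0, 2].
dsm = InMemoryDSM(
    heights=[[10.0, 11.0], [12.0, 13.0]],
    origin_x=0.0,
    origin_y=2.0,
    pixel_size=1.0,
)
top_left = dsm.sample(0.5, 1.5)
bottom_right = dsm.sample(1.5, 0.5)
outside = dsm.sample(5.0, 5.0)
```

With a real raster, the grid and the origin/pixel-size values would come from the file's array and georeferencing rather than being typed in by hand.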

@Remi-C
Contributor

Remi-C commented Apr 27, 2020

Sure, point clouds and databases are only useful in some situations.
2 billion queries is a lot!

@autcrock

Thanks Gabriel and Remi.


3 participants