Slow Performance on 360k Records - Running as Node.js function #55

Closed
stevenyix opened this issue Dec 14, 2018 · 7 comments
Labels
question Further information is requested

Comments

@stevenyix

I'm using geofirestore in a Firestore Cloud Function with an HTTP endpoint to perform a radius search: return the records within a given radius of a lat/lon coordinate.

This query runs against a document collection with 360k records.

The performance is quite slow: queries are taking 12-14 seconds according to the Cloud Functions log. In comparison, I created a small Cloud SQL Postgres database and Express.js web API, and it returns results for identical queries in 300 ms.

I have one composite index created; otherwise it’s using the default settings. The composite index I created is: "d.locationType ASC g ASC".

Below is a screenshot of the document structure. I added a few child fields in the 'd' field created by Geofirestore.

I posted this question on the google-cloud-firestore-discuss@googlegroups.com discussion group, and one of the Firestore engineers suggested reaching out to you. He would also be willing to talk with the geofirestore developers and possibly help. I'll point him to this issue.

Any guidance on how to optimize this to run in, ideally, under 1 second?

[image: screenshot of the document structure]

@samtstern

cc'ing myself here: curious to know how this library works and if there are possible optimizations!

@MichaelSolati
Owner

@stevenyix hopefully I can help figure out what is going on (and we can go from there). There may be a lot of moving parts here, so I would love to see a code snippet showing how your cloud function works. From there I can hopefully address any performance issues (and maybe v3 might be better suited for you). Issue #38 addressed a similar concern: it seemed the data was being modified all the time, so the ready event never triggered. Knowing what's going on here would be very important.

Hey @samtstern, real quickly: this works almost exactly how geofire works, except the guts have been reworked to use Firestore. Effectively we generate geohashes for points around the center, including the center itself, and then run onSnapshot on each query generated. Since the geohash should have severely limited the items being returned, we just do a quick check on the distance between the origin and the queried doc, and if it's in range we fire the callback to return the doc to the user.
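
For readers following along, the final distance check described above can be sketched roughly as follows. This is a minimal illustration using the haversine formula, not geofirestore's actual code; the function names distanceKm and filterByRadius and the { lat, lon } document shape are assumptions made for the example.

```javascript
// Great-circle distance in kilometers between two { lat, lon } points
// (haversine formula). Hypothetical helper, not part of geofirestore.
function distanceKm(a, b) {
  const R = 6371; // mean Earth radius in km
  const toRad = (deg) => (deg * Math.PI) / 180;
  const dLat = toRad(b.lat - a.lat);
  const dLon = toRad(b.lon - a.lon);
  const h =
    Math.sin(dLat / 2) ** 2 +
    Math.cos(toRad(a.lat)) * Math.cos(toRad(b.lat)) * Math.sin(dLon / 2) ** 2;
  return 2 * R * Math.asin(Math.sqrt(h));
}

// Keep only the candidates (already narrowed down by geohash) that are
// really inside the search radius.
function filterByRadius(center, radiusKm, candidates) {
  return candidates.filter((doc) => distanceKm(center, doc) <= radiusKm);
}

// Usage: only the first candidate lies within 50 km of the center.
console.log(
  filterByRadius({ lat: 0, lon: 0 }, 50, [
    { id: "near", lat: 0.1, lon: 0.1 },
    { id: "far", lat: 1, lon: 1 },
  ]).map((doc) => doc.id)
);
```

The point of the two-stage design is that this per-document check only runs on the small candidate set the geohash queries return, so its cost should be negligible next to the network reads.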

@MichaelSolati MichaelSolati self-assigned this Dec 14, 2018
@MichaelSolati MichaelSolati added the question Further information is requested label Dec 14, 2018
@stevenyix
Author

@MichaelSolati thanks for taking a look at this. When I ran my queries, none of the data was being modified.

Here are links to 2 gists:

This just creates my cloud function endpoint, takes the request and makes the call to a geofirestore query:
https://gist.github.com/stevenyix/448f407cf14579f0f641bbd2348ed3ca

This contains the call to geofirestore. geoSearch() is the primary function. The implementation is a bit janky because it doesn't appear that geofirestore supports async/await or promises, so I put in a 1ms sleep() function that prevents the function from exiting until the 'ready' event changes a boolean flag to indicate the query is complete. If there's a more efficient way to do it - please let me know!

https://gist.github.com/stevenyix/e2e3e06ba5e574c45d159beb7925ea09
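
One possible alternative to the sleep() polling loop is to wrap the event-based query in a Promise that resolves when 'ready' fires, which can then be awaited. The sketch below is only an illustration: it uses Node's built-in EventEmitter as a stand-in for the query object, and collectUntilReady, fakeQuery, and the 'key_entered' payload shape are hypothetical names chosen for the example, not the contents of either gist.

```javascript
const { EventEmitter } = require("events");

// Collects every 'key_entered' payload until 'ready' fires, then resolves
// with the collected results. `query` is assumed to behave like an
// EventEmitter; a real geo query object may expose a different API.
function collectUntilReady(query) {
  return new Promise((resolve) => {
    const results = [];
    const onEntered = (key, doc, distance) =>
      results.push({ key, doc, distance });
    query.on("key_entered", onEntered);
    query.once("ready", () => {
      query.removeListener("key_entered", onEntered);
      resolve(results);
    });
  });
}

// Stand-in for a geo query: emits two hits asynchronously, then 'ready'.
function fakeQuery() {
  const q = new EventEmitter();
  setImmediate(() => {
    q.emit("key_entered", "a", { name: "A" }, 0.4);
    q.emit("key_entered", "b", { name: "B" }, 1.2);
    q.emit("ready");
  });
  return q;
}

// Usage: no sleep loop, the handler simply waits on the promise.
collectUntilReady(fakeQuery()).then((hits) => console.log(hits.length)); // logs 2
```

With a wrapper like this, an HTTP handler can `await` the results and send the response once, instead of spinning on a boolean flag.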

Thanks again.

@MichaelSolati
Owner

Hey @stevenyix, I'm not 100% sure the issue lies with the library, but I made some small changes to your function that will hopefully optimize it (let me know if it helps).

https://gist.github.com/MichaelSolati/c44d60126044778b5ee15054efc17d44

@alexandregiordanelli

#63

The problem is that the library is downloading all 360k records and filtering afterwards on the client side.

@MichaelSolati
Owner

@alexandregiordanelli the library does not in fact download all 360k records. It does an initial filter based on geohashes via Firestore's startAt and endAt functions, and then does another check on the client in case the document is off a little (as is the nature of geohashes).
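
To illustrate why a startAt/endAt range on a geohash field only touches a slice of the collection, here is a small in-memory sketch. It is not the library's query code: rangeScan, the sample documents, and the use of "~" as an upper bound are assumptions for the example, and the field name g is taken from the composite index mentioned earlier in the thread.

```javascript
// In-memory stand-in for an ordered range scan over a geohash field `g`.
// Every geohash that starts with `prefix` sorts between `prefix` and
// `prefix + "~"`, because "~" sorts after all geohash base32 characters.
function rangeScan(docs, prefix) {
  const start = prefix;
  const end = prefix + "~";
  return docs.filter((d) => d.g >= start && d.g <= end);
}

// Hypothetical sample documents keyed by geohash.
const sampleDocs = [
  { id: 1, g: "9q8yyk" },
  { id: 2, g: "dr5ru7" },
  { id: 3, g: "dr5rvz" },
  { id: 4, g: "dr5x00" },
];

// Only documents whose geohash starts with "dr5r" fall inside the range.
console.log(rangeScan(sampleDocs, "dr5r").map((d) => d.id)); // logs [ 2, 3 ]
```

In a real database the range is served from an index, so the number of documents read depends on how many share the prefix rather than on the size of the whole collection.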

@MichaelSolati
Owner

@stevenyix closing this as it's been almost two weeks since I last heard any update from you. Please feel free to comment back if my solution didn't work for you.

This issue was closed.