Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add support for libpostal over http service, adapter pattern et al. #249

Draft
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

missinglink
Copy link
Member

@missinglink missinglink commented Mar 17, 2020

this code is branched off #244, please merge that PR first!
▶️ see only the diff ◀️

this draft PR is all the best bits of #146 & #240 rebased on top of #244

See #146 for more info on the functionality and how to configure the service.

closes #146
closes #240

@orangejulius
Copy link
Member

orangejulius commented Mar 19, 2020

Okay, I noticed that the demo didn't work with this PR, so I tried to figure out why. It was pretty difficult and actually exposed an issue in the error handling for the server.

I was using Portland Metro data, and was using the demo to explore, which generated the following query:

http://localhost:4300/extract/geojson?lat=45.537858113762816&lon=-122.62214183807374&names=ne%20tillamook%20saint

This URL returns a 400 error, but also an empty JSON response, as shown below:

julian@rockaway ~/repos/pelias/interpolation $ curl -I "http://localhost:4300/extract/geojson?lat=45.537858113762816&lon=-122.62214183807374&names=ne%20tillamook%20saint"
HTTP/1.1 400 Bad Request
X-Powered-By: Express
Content-Type: application/json; charset=utf-8
Content-Length: 2
ETag: W/"2-vyGp6PvFo4RvsFtPoIWeCReyIC8"
Date: Thu, 19 Mar 2020 19:49:27 GMT
Connection: keep-alive

julian@rockaway ~/repos/pelias/interpolation $ curl "http://localhost:4300/extract/geojson?lat=45.537858113762816&lon=-122.62214183807374&names=ne%20tillamook%20saint"
{}

Diving deper, I took a look at the relevant express code:

conn.extract.query( point, names, function( err, data ){
if( err ){ return res.status(400).json( err ); }
if( !data ){ return res.status(200).json({}); }
res.json( pretty.geojson( data ) );
});

This code doesn't properly print the error to the response body, as JSON.stringify(err) ends up with an empty object.

Printing the error to stdout reveals the following:

interpolation_1  | SqliteError: near "/": syntax error
interpolation_1  |     at DynamicQueryCache.getStatement (/code/pelias/interpolation/query/DynamicQueryCache.js:70:28)
interpolation_1  |     at Object.module.exports [as extract] (/code/pelias/interpolation/query/extract.js:36:22)            
interpolation_1  |     at Object.q [as query] (/code/pelias/interpolation/api/extract.js:35:26)
interpolation_1  |     at processTicksAndRejections (internal/process/task_queues.js:94:5)

So there are two issues here:

  1. Higher priority: fix the error messaging for all endpoints.
  2. Improve this PR so it doesn't cause errors.

cmd/oa.js Outdated Show resolved Hide resolved
query/extract.js Outdated Show resolved Hide resolved
@missinglink missinglink force-pushed the better-sqlite3-v2-plus-libpostal-service branch from 705290c to 9db959d Compare March 20, 2020 14:58
@missinglink
Copy link
Member Author

missinglink commented Mar 20, 2020

Demo issues fixed in 6a5e084 and rebased to all 3 branches.

@missinglink missinglink force-pushed the better-sqlite3-v2-plus-libpostal-service branch 2 times, most recently from 449d1b4 to 538bc3c Compare March 20, 2020 15:18
@missinglink
Copy link
Member Author

Screenshot 2020-03-20 at 15 53 55

@orangejulius orangejulius force-pushed the better-sqlite3-v2-plus-libpostal-service branch from 538bc3c to de9206b Compare March 30, 2020 15:25
@orangejulius
Copy link
Member

I just did a quick rebase of this now that #244 is merged. We can continue to test this out and hopefully merge it soon!

@orangejulius orangejulius force-pushed the better-sqlite3-v2-plus-libpostal-service branch from c09b84f to 6e33f9e Compare November 4, 2021 19:54
@orangejulius
Copy link
Member

Well its been a while for this one but after some of the cleanup and maintenance work we've been doing in the interpolation repository, I took another look at this.

With all the better-sqlite3 stuff long since merged, this PR is a lot simpler now. I've rebased it on top of the latest changes from the master branch and it was all straightforward.

The development workflow for updating the mock libpostal responses went smoothly (set up libpostal and configure the service entry in pelias.json, run SEED_MOCK_LIBPOSTAL=true npm run ci, tests shoudl pass after that), and everything else seems to still work.

Maybe we take a look at getting this merged, since it's been quite helpful for avoiding the extra memory usage of having libpostal data loaded twice.

My current questions are:

  • Will an interpolation build in the docker projects still work? Will it run a lot slower using a remote libpostal service?
  • What should we do around messaging that a new configuration section is now recommended? Maybe add some deprecation notices to pelias/config? We'd want the Pelias API to start preferring the new services section too.
  • Should we add a pelias libpostal wait command or something similar to the docker CLI, and update the documentation on when to use it? Without something like that, running pelias compose up libpostal; pelias prepare interpolation might fail as the libpostal service takes a few seconds to start (though nothing like elasticsearch)

@missinglink
Copy link
Member Author

missinglink commented Nov 8, 2021

I agree it would be great to merge this functionality, since it provides a lot of flexibility going-forward.

Regarding which adapter is appropriate for default => (HTTP | NPM) is yet to be decided.

One of the complexities here is that this PR couples the adapter pattern code with a change to the default adapter to change it from NPM Module => HTTP.

So.. while I'm not totally against that, the HTTP and NPM Module adapters both have pros and cons, neither is a clearly superior method to the other, but I just wanted to note that its two changes in one, which is making this harder to merge.

Coupled with the unknown topic of performance (and consequently) build times it might be better to merge the code changes without changing the default adapter and then measuring the performance of different methods?

Alternatively we can use the existing code to measure the performance of a full planet build using HTTP and compare it to NPM Module which should give us some more confidence that changing the default adapter is the right choice, despite the additional complexity of network service discovery and I/O.

It might be worth running that build twice, since changes to how the NPM method is used in this PR vs. prior may also affect the performance of build times using the NPM method.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants