New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make partial postcode searches space sensitive #44

Closed
cblanc opened this Issue Jul 8, 2015 · 11 comments

Comments

Projects
None yet
2 participants
@cblanc
Member

cblanc commented Jul 8, 2015

Update the autocomplete and query API to be sensitive to space in the search term

@tomredpepper

This comment has been minimized.

tomredpepper commented May 10, 2017

I've just run into this issue quite badly trying to install and test postcodes.io for election sites - it works great until someone types in 'M1', as in the example here, and the result in autocomplete and query mode is currently an M16 or M11 postcode (the result seems to vary between requests?)

I've been looking for a workaround but haven't spotted one yet... just being able to specify 'M1 ' or 'M1.' to indicate that I really mean M1 (not M1*) is the first part of the postcode would help a lot.

@cblanc

This comment has been minimized.

Member

cblanc commented May 10, 2017

Hi Tom, thanks for the feedback. Will get this done over the coming days

@cblanc

This comment has been minimized.

Member

cblanc commented May 16, 2017

Quick update @tomredpepper - I've created a test branch here with a new search methodology that impacts the autocomplete and /postcodes?q= APIs

Hopefully addresses 2 key issues, specifically 1) results aren't deterministic and 2) outcode matches are not handled well currently

Will be testing it on and off for a few days before considering to merge into master. If you have the time, feel free to give it a go

@tomredpepper

This comment has been minimized.

tomredpepper commented May 16, 2017

Thanks so much - I'll give it a try in the next couple of days.

@cblanc

This comment has been minimized.

Member

cblanc commented May 19, 2017

Just a quick update, seems like that branch is broken on linux because postgresql uses the system default collation (macos seems to use C for some reason). This caused the results to come out in a less useful order on linux

I'm afraid I mangled that branch trying to debug the issue on travis. I've made a new test branch here: https://github.com/ideal-postcodes/postcodes.io/tree/fixed-space-sensitive-search

There are a couple of significant schema changes. A couple of columns now use an explicit collation. A missing index is also added to the postcodes table

Am looking to get this out on Monday, but will likely mean a backward breaking release. i.e. rebuilding the database to accommodate new collation and index

@tomredpepper

This comment has been minimized.

tomredpepper commented May 19, 2017

Gave it a go but got the same results as before (eg M1 returns M11) - however it's entirely possible I set something up wrongly. Here's what I did:

  1. pm2 stop all
  2. re-cloned the git repo and checked out fixed-space-sensitive-search branch
  3. dropdb 'postcodesiodb'
  4. npm run setup (to rebuild the db)
  5. node server.js

I don't know if perhaps it's to do with npm run setup still importing https://postcodesio.s3.amazonaws.com/public/postcodesio-2017-03-06-1509.sql.gz? (I'm assuming this is a SQL dump that doesn't reflect your schema changes, if it needs to.)

@cblanc

This comment has been minimized.

Member

cblanc commented May 19, 2017

Ah yes, sorry I should have been clearer there. I meant rebuild it from the ONSPD source with ./bin/import.js /path/to/ONSPD.csv. The pg_dump itself will need to be rebuilt and uploaded at a later time

I think the simplest option for you is to manually do the schema changes:

  • CREATE UNIQUE INDEX ON postcodes (postcode); (create a missing index to speed up certain searches)
  • ALTER TABLE postcodes ALTER postcode TYPE varchar(10) COLLATE "C";
  • ALTER TABLE postcodes ALTER pc_compact TYPE varchar(9) COLLATE "C";

The last 2 commands explicitly set those columns to use the right collation.

These are the results I'm getting for:
M1

[
  "M1 1AD",
  "M1 1AE",
  "M1 1AF",
  "M1 1AG",
  "M1 1AH",
  "M1 1AJ",
  "M1 1AN",
  "M1 1BA",
  "M1 1BB",
  "M1 1BE"
]

"M1 "

[
  "M1 1AD",
  "M1 1AE",
  "M1 1AF",
  "M1 1AG",
  "M1 1AH",
  "M1 1AJ",
  "M1 1AN",
  "M1 1BA",
  "M1 1BB",
  "M1 1BE"
]

M11

[
  "M11 0AG",
  "M11 0AJ",
  "M11 0AT",
  "M11 0AY",
  "M11 1AA",
  "M11 1AD",
  "M11 1AE",
  "M11 1AF",
  "M11 1AG",
  "M11 1AH"
]

M1 1

[
  "M1 1AD",
  "M1 1AE",
  "M1 1AF",
  "M1 1AG",
  "M1 1AH",
  "M1 1AJ",
  "M1 1AN",
  "M1 1BA",
  "M1 1BB",
  "M1 1BE"
]

M11 1

[
  "M11 1AA",
  "M11 1AD",
  "M11 1AE",
  "M11 1AF",
  "M11 1AG",
  "M11 1AH",
  "M11 1AJ",
  "M11 1AL",
  "M11 1AN",
  "M11 1AP"
]
@tomredpepper

This comment has been minimized.

tomredpepper commented May 20, 2017

Hmm. I imported ONSPD_FEB_2017_UK.csv and the script gave all the success messages. Also ran the SQL commands you gave. And I'm definitely on fixed-space-sensitive-search branch (double checked git status). Yet still something's wrong.
http://46.101.44.176:8000/postcodes?q=m1
http://46.101.44.176:8000/postcodes?q=sk1
http://46.101.44.176:8000/postcodes?q=n1

I'm hoping this is just a config error somewhere on my part - don't want to distract from getting the fix out there.

Edit to add: this is Ubuntu 14.04.5 LTS, on DigitalOcean.

@cblanc

This comment has been minimized.

Member

cblanc commented May 20, 2017

Thanks for that, I'll run it on ubuntu today and see if I can replicate

@cblanc

This comment has been minimized.

Member

cblanc commented May 20, 2017

@tomredpepper

This comment has been minimized.

tomredpepper commented May 21, 2017

Yep - I spun up a new droplet and installed from the wiki instructions but using the fixed-space-sensitive-search branch, and it now gives exactly the same results as your three links there :)

@cblanc cblanc closed this May 31, 2017

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment