Regression in Autocomplete sorting #1327

Open
NickStallman opened this issue Jun 26, 2019 · 7 comments
Comments

@NickStallman

I've just performed a full rebuild and update and I've found a regression in the ordering of autocomplete queries.

E.g. https://pelias.github.io/compare/#/v1/autocomplete%3Fboundary.country=AU&layers=locality%252Cpostalcode%252Clocaladmin%252Cregion%252Ccounty&text=hobart

This query is asking for results on the layers: locality, postalcode, localadmin, region, county
However only localities are being returned. The locality is the least broad layer in the list so I would have expected it to be last.
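
For anyone reproducing this outside the compare tool, the same request can be built by hand. This is only a minimal sketch: the `text`, `layers` and `boundary.country` parameters come from the URL above, while the host name and the `buildAutocompleteUrl` helper are placeholders invented for illustration.

```javascript
// Hypothetical host; substitute your own Pelias API endpoint.
const PELIAS_HOST = 'https://pelias.example.com';

// Hypothetical helper that assembles a /v1/autocomplete URL.
function buildAutocompleteUrl(text, layers, country) {
  const params = new URLSearchParams({
    text,
    layers: layers.join(','), // layers are sent as a comma-separated list
    'boundary.country': country,
  });
  return `${PELIAS_HOST}/v1/autocomplete?${params.toString()}`;
}

const url = buildAutocompleteUrl(
  'hobart',
  ['locality', 'postalcode', 'localadmin', 'region', 'county'],
  'AU'
);
console.log(url);
```

Dropping entries from the `layers` array reproduces the second and third queries.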

Removing it from the list gives this: https://pelias.github.io/compare/#/v1/autocomplete%3Fboundary.country=AU&layers=postalcode%252Clocaladmin%252Cregion%252Ccounty&text=hobart
The result is similar: now all we get is localadmin.

After removing localadmin as well, we finally get the Hobart county.
https://pelias.github.io/compare/#/v1/autocomplete%3Fboundary.country=AU&layers=county%252Cregion%252Cpostalcode&text=hobart

It's as if the layer hierarchy is upside down: results are sorted with the most specific first, not the broadest first.

This was working correctly with my last update in April.

@orangejulius
Member

orangejulius commented Jun 27, 2019

Hey Nick,
Preferring "more granular" layers like locality over region has been core behavior in Pelias for a while, at least several years. The only change I can think of recently is that we improved de-duplication to look at records that share a hierarchy, and remove the ones that are a parent of another result.
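
To make the effect concrete, here is a rough sketch of that kind of parent-removal step. It is not the actual Pelias code: the record shape (`id`, `name`, `layer`, `parentIds`) and the function name `dropParentsOfOtherResults` are assumptions made for illustration only.

```javascript
// Hypothetical, simplified record shape: { id, name, layer, parentIds }.
// parentIds lists the ids of every ancestor in the record's hierarchy.
function dropParentsOfOtherResults(results) {
  return results.filter((candidate) => {
    // A record is dropped when another result has the same name
    // and lists this record among its ancestors.
    const isParentOfSameName = results.some(
      (other) =>
        other !== candidate &&
        other.name === candidate.name &&
        other.parentIds.includes(candidate.id)
    );
    return !isParentOfSameName;
  });
}

const results = [
  { id: 'county:1', name: 'Gosford', layer: 'county', parentIds: ['region:1'] },
  { id: 'locality:1', name: 'Gosford', layer: 'locality', parentIds: ['county:1', 'region:1'] },
  { id: 'locality:2', name: 'West Gosford', layer: 'locality', parentIds: ['county:1', 'region:1'] },
];

const deduped = dropParentsOfOtherResults(results);
console.log(deduped.map((r) => `${r.name} (${r.layer})`));
```

In this sample the county record is removed because a locality of the same name sits beneath it, which would explain why only the more granular record shows up.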

We know of lots of examples where this makes sense: San Francisco, Barcelona, Tokyo, and Berlin are all cities within a region of the same name (either with the same boundaries or encompassing a bigger area), and in all cases most people refer to the city over the region without further clarification. In the states, someone looking specifically for a county would usually say "Hobart county" as you have done. Many counties have alternate names in WOF to handle this, although I'm sure some don't.

Do you think it's very common in Australia for a placename without any further qualification to refer to a county rather than a city in that county that shares its name? If so, we'll have to figure something out, but it could be difficult considering how deeply this assumption is built into Pelias.

@NickStallman
Author

Hey @orangejulius

That sounds like that's it - the broader regions are being deduplicated out.
That might make sense for a search query, but not so much for autocomplete perhaps.

In Australia the word "county" is never used. We don't have counties; the official equivalent term is Local Government Area (LGA), but that's not something an end user would ever specify in a search.
It's also very common for the LGA to have the same name as the primary locality, so deduplicating them is essentially the same as filtering the county/localadmin layers out entirely: it's impossible for them to appear in the results.

Since autocomplete is supposed to give a list of options, not try to get the most exact match, I think this deduplication should be disabled for autocomplete but left enabled for search. At the very least, it could be attached to a flag that allows that behaviour to be configured.

The results we were getting previously were great, along the lines of:
Gosford Local Government Area
Gosford Suburb
West Gosford Suburb
North Gosford Suburb
etc....
As it's autocomplete, the options given to the user are perfect, they can select broad results or precise results.

@orangejulius
Member

orangejulius commented Jun 28, 2019

Hey Nick,
That's a very good point that autocomplete is about giving a list of options. I agree that's how it should work when there are several possible matching results.

The reason we undertook the changes to improve deduping is that, based on reports from many customers and users, people get very confused when they see duplicate results from autocomplete.

So one of the first things we would have to do is ensure the labels are distinct and clear for the different results. We've talked about doing this dynamically since 2016 in pelias/labels#8. We might need something that fancy, or we might just need to alter the names for records that we would not want deduped. The deduplication code already treats records as different if their names are different, so that would actually be very simple.

For example, the county record of Gosford doesn't have something like an official name or preferred variant that looks like "Gosford Local Government Area", so with some discussion with the WOF team, we could add the appropriate names.

Then by looking at a combination of the name, whether the record's geometry is coterminous with another similarly named record, and perhaps a bit of configuration for preferences in different countries or regions, we can make Pelias do the right thing.

@NickStallman
Author

Ah, if the Pelias results are being used raw by the autocomplete, then yep, I can see how that would be mostly useless.

In my use case we are formatting the autocomplete results in JS before presenting them, so we get something nice like this:
[image: screenshot of the formatted autocomplete results]
Since we are presenting the layer (with localised names instead of the raw Pelias names) it's immediately clear what the user is looking at.
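
As a rough illustration of that formatting step (not our actual code), something along these lines works on the GeoJSON features Pelias returns. The `name` and `layer` properties are part of the Pelias response; the `AU_LAYER_LABELS` table and `formatFeature` are hypothetical, and the "State" and "Postcode" labels in particular are assumptions.

```javascript
// Hypothetical mapping from Pelias layer names to localised display terms.
const AU_LAYER_LABELS = {
  county: 'Local Government Area',
  localadmin: 'Local Government Area',
  locality: 'Suburb',
  region: 'State', // assumed label
  postalcode: 'Postcode', // assumed label
};

// Build a display string from a Pelias GeoJSON feature.
function formatFeature(feature) {
  const { name, layer } = feature.properties;
  // Fall back to the raw layer name when no localised term is defined.
  const layerLabel = AU_LAYER_LABELS[layer] || layer;
  return `${name} ${layerLabel}`;
}

const sampleFeatures = [
  { properties: { name: 'Gosford', layer: 'county' } },
  { properties: { name: 'Gosford', layer: 'locality' } },
  { properties: { name: 'West Gosford', layer: 'locality' } },
];

console.log(sampleFeatures.map(formatFeature));
```

Because the layer label is added at display time, two records both named "Gosford" stay distinguishable without changing the underlying names.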

I'm not sure changing the WOF data is the right answer in this case. While the government's official terminology is LGA (Local Government Area), this isn't used in everyday usage. "Gosford" or "Gosford Council" would be the everyday terms, with just plain "Gosford" being most common.

So would it be accurate to say that the deduplication of different layers in autocomplete is certainly the incorrect behaviour, and the actual problem is how to distinguish them in the results?
That's trickier, especially with different localisation terms.
It could be added to WOF like you suggested, but that's a bit of work to do and would interfere with formatting the results the way my screenshot shows. It would be like burning subtitles into a video file - it makes it so you can't remove them.

@NickStallman
Author

@orangejulius do you know which commit made this change? I'd like to undo it on my end until a better permanent solution can be found.

@orangejulius
Member

@NickStallman the changes were merged in #1230. I'd suggest avoiding reverting the entire PR if you can, but you might have a look at the list of layers to dedupe created by that PR and try modifying it to suit your needs.

@NickStallman
Author

Perfect, thanks @orangejulius. Commenting out localadmin, county and region has done the trick.

A long-term solution could be passing postProc.dedupe() a flag on the autocomplete route to disable such aggressive deduping, while leaving the behaviour intact for search queries?
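
Very roughly, the idea would look like the sketch below. This is not the real `postProc.dedupe()` signature or implementation: `createDedupe`, the `parentLayers` option, the record shape and the default layer list are all hypothetical, chosen only to show a per-route switch.

```javascript
// Hypothetical default: layers whose records may be dropped as "parents".
const DEFAULT_PARENT_LAYERS = ['localadmin', 'county', 'region'];

// Hypothetical factory; each route would create its own dedupe function.
function createDedupe(options = {}) {
  const parentLayers = Array.isArray(options.parentLayers)
    ? options.parentLayers
    : DEFAULT_PARENT_LAYERS;

  return function dedupe(results) {
    return results.filter((candidate) => {
      // Layers not in the list are never removed as parents.
      if (!parentLayers.includes(candidate.layer)) {
        return true;
      }
      const isParentOfSameName = results.some(
        (other) =>
          other !== candidate &&
          other.name === candidate.name &&
          other.parentIds.includes(candidate.id)
      );
      return !isParentOfSameName;
    });
  };
}

// Search keeps the current behaviour; autocomplete opts out of parent removal.
const searchDedupe = createDedupe();
const autocompleteDedupe = createDedupe({ parentLayers: [] });

const sample = [
  { id: 'county:1', name: 'Gosford', layer: 'county', parentIds: [] },
  { id: 'locality:1', name: 'Gosford', layer: 'locality', parentIds: ['county:1'] },
];

console.log(searchDedupe(sample).length);
console.log(autocompleteDedupe(sample).length);
```

With the sample data, the search variant keeps only the locality, while the autocomplete variant keeps both records.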
