Allow Geo Search #426

Closed
qdequele opened this issue Dec 17, 2019 · 14 comments

Comments

@qdequele
Member

qdequele commented Dec 17, 2019

Geo Search is a way to filter and sort results by distance or around certain geographical locations. You can limit results to a street, to a city or cities, or to one or more parts of the world. You can also sort results according to how near or far they are from a given geolocation.

Geo offers you various possibilities:

  • Filter and sort around a set of latitude and longitude coordinates
  • Filter by one or more box-shaped geographical areas (bounding boxes)
  • Filter by one or more freely-drawn geographical areas (polygons)
  • Sort around a set of latitude and longitude coordinates
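
To make the bounding-box and polygon filters in the list above concrete, here is a minimal Python sketch. It is not MeiliSearch code: the names `in_bounding_box` and `in_polygon` are hypothetical, the box is assumed not to cross the antimeridian, and the polygon test treats latitude and longitude as planar coordinates, which is only reasonable for small areas.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (lat, lng) in degrees


def in_bounding_box(p: Point, south_west: Point, north_east: Point) -> bool:
    """True if p lies inside the box (assumes the box does not cross the antimeridian)."""
    lat, lng = p
    return (south_west[0] <= lat <= north_east[0]
            and south_west[1] <= lng <= north_east[1])


def in_polygon(p: Point, polygon: List[Point]) -> bool:
    """Ray-casting test on planar lat/lng coordinates (an approximation)."""
    lat, lng = p
    inside = False
    n = len(polygon)
    for i in range(n):
        lat1, lng1 = polygon[i]
        lat2, lng2 = polygon[(i + 1) % n]
        # Only edges that straddle the point's latitude can be crossed.
        if (lat1 > lat) != (lat2 > lat):
            # Longitude at which this edge crosses the point's latitude.
            crossing = lng1 + (lat - lat1) * (lng2 - lng1) / (lat2 - lat1)
            if lng < crossing:
                inside = not inside
    return inside


square = [(0.0, 0.0), (0.0, 10.0), (10.0, 10.0), (10.0, 0.0)]
print(in_polygon((5.0, 5.0), square))
print(in_bounding_box((48.8566, 2.3522), (48.0, 2.0), (49.0, 3.0)))
```

A real implementation would more likely use spherical geometry (for example the S2 cells mentioned in this issue) than a planar test.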

This could be done by implementing the S2 geometry and Hilbert curve algorithms.

We can require users to reformat their data by adding a geo field to their documents:

"_geoloc": {
  "lat": 48.733333,
  "lng": -3.466667
}

Alternatively, we could add a setting that lets users select which fields hold the latitude and the longitude.
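
The two options can be related with a small client-side sketch: given the names of the latitude and longitude fields (what the setting would hold), build the `_geoloc` object shown above. The helper name `with_geoloc` and the field names `latitude` and `longitude` are assumptions for illustration only, not an existing MeiliSearch API.

```python
from typing import Any, Dict


def with_geoloc(doc: Dict[str, Any], lat_field: str, lng_field: str) -> Dict[str, Any]:
    """Return a copy of doc with a `_geoloc` object built from the two named fields."""
    lat = float(doc[lat_field])
    lng = float(doc[lng_field])
    # Reject coordinates outside the valid latitude/longitude ranges.
    if not (-90.0 <= lat <= 90.0 and -180.0 <= lng <= 180.0):
        raise ValueError(f"coordinates out of range: lat={lat}, lng={lng}")
    out = dict(doc)  # shallow copy, the input document is left untouched
    out["_geoloc"] = {"lat": lat, "lng": lng}
    return out


doc = {"id": 1, "latitude": "48.733333", "longitude": -3.466667}
print(with_geoloc(doc, "latitude", "longitude"))
```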

@qdequele qdequele added the enhancement New feature or improvement label Dec 17, 2019
@qdequele qdequele added R_2020_Q4 and removed enhancement New feature or improvement labels Mar 15, 2020
@westurner

@martin-juul

Postgres also offers geometric types. They cover other use cases too, such as sizes, which could come in handy should MeiliSearch opt for feature compatibility (simpler indexing when ingesting from a Postgres store).

https://www.postgresql.org/docs/12/datatype-geometric.html

@flother

flother commented Apr 19, 2020

There's also PostGIS. PostgreSQL's built-in geometry fields are two-dimensional geometries that use unprojected Cartesian coordinates: while they would work as expected at the equator, they would become more and more distorted closer to the poles.
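
A quick way to see the distortion is to compute how much ground one degree of longitude covers at different latitudes. This is a rough sketch assuming a spherical Earth with a mean radius of 6371.0088 km; the function name is made up for the example.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius, spherical approximation


def km_per_degree_longitude(lat_deg: float) -> float:
    """Ground distance covered by one degree of longitude at the given latitude."""
    return math.radians(1.0) * EARTH_RADIUS_KM * math.cos(math.radians(lat_deg))


for lat in (0.0, 45.0, 60.0, 80.0):
    print(f"lat {lat:>4}: {km_per_degree_longitude(lat):.1f} km per degree of longitude")
```

One degree of longitude is roughly 111 km at the equator but only half that at 60° latitude, so a distance computed directly on unprojected degrees overstates east-west distances more and more towards the poles.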

What MeiliSearch needs is geospatial geometries. PostGIS implements the Open Geospatial Consortium's Simple Features. That's where the widely-used well-known text (WKT) and well-known binary (WKB) markup formats come from.

@ofebles

ofebles commented Jun 16, 2020

When will meili include geo search? Date planned? We need it a lot for new projects. :) Thanks

@waelbettayeb

I hope this feature will be planned for the next release

@JexPY

JexPY commented Oct 15, 2020

I hope we will have this feature soon. I would like to use meilisearch and nothing more for indexed data.

@tchartron

tchartron commented Nov 29, 2020

Is there any chance of seeing this released any time soon? I see the 2020_Q4 label has been removed.
Thanks for the great work on Meilisearch 💪🏼👍🏼 🇫🇷

@ManyTheFish
Member

Hi @tchartron, thanks for your support!
About geo-search: we will not release it at the end of Q4_2020 because we are working on enhancing our tokenizer to better support other languages, such as Chinese. I hope it will enhance your search experience too!

For Q1_2021, we don't know if we will work on geo-search or distributed systems, but I think it is a feature that we want to release in 2021.

I hope my answer was satisfactory,

Thanks again for your support, and stay tuned!

@tchartron

tchartron commented Dec 4, 2020

Great, thanks for this information. I'll keep an eye on this until it's released 🚀😉

@qdequele
Member Author

qdequele commented Dec 5, 2020

Hello,

I will close this issue in favor of our public roadmap.

I invite everyone interested in this feature to upvote it on the roadmap.

@qdequele qdequele closed this as completed Dec 5, 2020
@tomasweigenast

For the moment, the only way is to store the latitude and the longitude as additional JSON fields and, when the results are retrieved, skip those that aren't in a certain area. It will be slow, but it's the only way.
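
A minimal sketch of that post-filtering workaround, assuming each hit carries `lat` and `lng` fields in degrees (the field names and the `within_radius` helper are hypothetical). It uses the haversine great-circle distance on a spherical Earth.

```python
import math
from typing import Any, Dict, Iterable, List

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius, spherical approximation


def haversine_km(lat1: float, lng1: float, lat2: float, lng2: float) -> float:
    """Great-circle distance in kilometres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lng2 - lng1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    # min() guards against rounding pushing the value slightly above 1.
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))


def within_radius(hits: Iterable[Dict[str, Any]], lat: float, lng: float,
                  radius_km: float) -> List[Dict[str, Any]]:
    """Keep only the hits whose stored coordinates are within radius_km of (lat, lng)."""
    return [h for h in hits
            if haversine_km(lat, lng, h["lat"], h["lng"]) <= radius_km]


hits = [
    {"id": "near", "lat": 51.48, "lng": 0.0},
    {"id": "far", "lat": 48.8566, "lng": 2.3522},
]
print([h["id"] for h in within_radius(hits, 51.47773, 0.0, 5.0)])
```

Every hit has to be fetched and checked, which is why this approach is slow: the search engine cannot discard distant documents before returning them.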

@flother

flother commented May 17, 2021

@TomasWeg It's not the only way. One option would be to compute the geohash for each record, store the geohashes in the MeiliSearch index, and then do a prefix search.

As an example, the old observatory at Greenwich in London has a longitude of 0.0 and latitude of 51.47773. This location has the geohash u10hb50p80, and every object within ~2.5km will have a geohash with the same five-letter prefix, u10hb. Every object within ~20km will have the same four-letter prefix, u10h. And so on.

Geohashes (and space-filling curves in general) aren't perfect — it's possible they won't match some nearby objects — but this will be a much faster way of finding things nearby.
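
Here is a small, self-contained sketch of the standard geohash encoding, to show how the prefix idea works in practice. The function name `geohash_encode` is just for this example; in a real project an existing geohash library would probably be a better choice.

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # geohash alphabet (no a, i, l, o)


def geohash_encode(lat: float, lng: float, precision: int = 10) -> str:
    """Encode a latitude/longitude pair (degrees) as a geohash of `precision` characters."""
    lat_lo, lat_hi = -90.0, 90.0
    lng_lo, lng_hi = -180.0, 180.0
    chars = []
    bits, bit_count = 0, 0
    use_lng = True  # bits alternate between longitude and latitude, longitude first
    while len(chars) < precision:
        if use_lng:
            mid = (lng_lo + lng_hi) / 2
            if lng >= mid:
                bits, lng_lo = (bits << 1) | 1, mid
            else:
                bits, lng_hi = bits << 1, mid
        else:
            mid = (lat_lo + lat_hi) / 2
            if lat >= mid:
                bits, lat_lo = (bits << 1) | 1, mid
            else:
                bits, lat_hi = bits << 1, mid
        use_lng = not use_lng
        bit_count += 1
        if bit_count == 5:  # every 5 bits become one base-32 character
            chars.append(BASE32[bits])
            bits, bit_count = 0, 0
    return "".join(chars)


greenwich = geohash_encode(51.47773, 0.0, 5)
just_west = geohash_encode(51.47773, -0.001, 5)
print(greenwich, just_west)
```

The second point is only a few dozen metres west of the first, yet it falls on the other side of the prime meridian, which is a cell boundary at every precision level, so its hash starts with a different character. This is the kind of nearby object a pure prefix search misses; a common mitigation is to also search the neighbouring cells.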

@shivaylamba
Contributor

@qdequele Geo Search is live in the latest release: https://github.com/meilisearch/MeiliSearch/releases/tag/v0.23.1

bors bot added a commit that referenced this issue Jan 16, 2023
426: Fix search highlight for non-unicode chars r=ManyTheFish a=Samyak2

# Pull Request

## What does this PR do?
Fixes #1480
<!-- Please link the issue you're trying to fix with this PR, if none then please create an issue first. -->

## PR checklist
Please check if your PR fulfills the following requirements:
- [x] Does this PR fix an existing issue?
- [x] Have you read the contributing guidelines?
- [x] Have you made sure that the title is accurate and descriptive of the changes?

## Changes

The `matching_bytes` function takes a `&Token` now and:
- gets the number of bytes to highlight (unchanged).
- uses `Token.num_graphemes_from_bytes` to get the number of grapheme clusters to highlight.

In essence, the `matching_bytes` function now returns the number of matching grapheme clusters instead of bytes.

Added proper highlighting in the HTTP UI:
- requires dependency on `unicode-segmentation` to extract grapheme clusters from tokens
- `<mark>` tag is put around only the matched part
    - before this change, the entire word was highlighted even if only a part of it matched

## Questions

Since `matching_bytes` does not return number of bytes but grapheme clusters, should it be renamed to something like `matching_chars` or `matching_graphemes`? Will this break the API?

Thank you very much `@ManyTheFish` for helping 😄 

Co-authored-by: Samyak S Sarnayak <samyak201@gmail.com>