AscentCapstone

This is the backend for our capstone project in the VMware Ascent Program (bootcamp), July to September 2022.

My group was in charge of the search feature, developed in Java / Spring Boot and Elasticsearch (ELK stack).

I was in charge of developing an autocomplete feature for the search.

In Elasticsearch there are different ways to build an autocomplete feature. For example, we can configure a field, such as breed, as a ‘completion suggester’ in our index mapping. Alternatively, we can simply search all documents every time a character is typed and populate our dropdown element with data from the returned JSON objects. It was this latter approach that we implemented.
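For comparison, here is a minimal sketch of what the first option (the one we did not implement) could look like. It only builds the JSON bodies as strings: an index mapping that declares a field as type completion, and a suggest request against that field. The class name, method names and the suggestion name pet-suggest are hypothetical, chosen for illustration, and the inputs are assumed to be plain identifiers that need no JSON escaping.

```java
// Sketch only: builds the request bodies, it does not talk to Elasticsearch.
final class CompletionMapping {
    private CompletionMapping() {}

    // Index mapping that declares `field` as a completion field.
    static String mappingFor(String field) {
        return "{\"mappings\":{\"properties\":{\""
                + field + "\":{\"type\":\"completion\"}}}}";
    }

    // Suggest request asking for completions of `prefix` on `field`.
    // "pet-suggest" is an arbitrary (hypothetical) name for the suggestion.
    static String suggestRequest(String field, String prefix) {
        return "{\"suggest\":{\"pet-suggest\":{\"prefix\":\"" + prefix
                + "\",\"completion\":{\"field\":\"" + field + "\"}}}}";
    }

    public static void main(String[] args) {
        System.out.println(mappingFor("breed"));
        System.out.println(suggestRequest("breed", "be"));
    }
}
```

The mapping has to be in place before documents are indexed, which is one reason the query-per-keystroke approach was simpler to adopt on an existing index.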

The magic happens in the queries we build. I used Kibana's visual interface extensively for testing. My main challenge at this point was to build a query that retrieved only pets that are ‘up for adoption’, while running a fuzzy search across multiple fields and preserving the score of each match.
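As an illustration of that kind of query, the sketch below builds a bool query whose must clause is a fuzzy multi_match over several fields and whose filter clause restricts results to adoptable pets. It is not the project's actual query: the field names name and adoptable are assumptions (breed and description come from the text above), and the hand-rolled escaping only covers quotes and backslashes, so a real service would rather use a JSON library or the Elasticsearch Java client.

```java
// Sketch only: returns the value for the "query" key of a _search request body.
final class PetSearchQuery {
    private PetSearchQuery() {}

    // Minimal escaping for quotes and backslashes (control characters are not handled).
    static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    // Fuzzy search across several fields, restricted to adoptable pets.
    // "name" and "adoptable" are hypothetical field names.
    static String fuzzyAdoptableQuery(String typed) {
        return "{\"bool\":{"
                + "\"must\":{\"multi_match\":{"
                + "\"query\":\"" + escape(typed) + "\","
                + "\"fields\":[\"name\",\"breed\",\"description\"],"
                + "\"fuzziness\":\"AUTO\"}},"
                + "\"filter\":{\"term\":{\"adoptable\":true}}"
                + "}}";
    }

    public static void main(String[] args) {
        // What would be sent after the user has typed b-e-l-g.
        System.out.println(fuzzyAdoptableQuery("belg"));
    }
}
```

Putting the adoption condition in filter keeps it a yes/no restriction, while the multi_match in must is the part that produces a relevance score.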

A fuzzy query returns documents that contain terms similar to the search term, as measured by the Damerau–Levenshtein edit distance: https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-fuzzy-query.html

The edit distance is the number of one-character changes needed to turn one term into another (changing, deleting or inserting a character, or transposing two adjacent characters). To find similar terms, the fuzzy query creates a set of all possible variations, or expansions, of the search term within a specified distance. The query then returns exact matches for each expansion.

This is how, when you type b-e-l-g, you can find similar matches from different fields like breed and description… and eventually Bertrand, the iguana, shows up as a match!
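To make the metric concrete, here is a small sketch of an edit distance that counts substitutions, deletions, insertions and adjacent transpositions as one edit each (the "optimal string alignment" variant of Damerau–Levenshtein). It illustrates the metric only and is not meant to show how Elasticsearch evaluates fuzzy queries internally; the class name is made up for this example.

```java
// Sketch only: dynamic-programming edit distance with adjacent transpositions.
final class EditDistance {
    private EditDistance() {}

    static int distance(String a, String b) {
        // d[i][j] = edits needed to turn the first i chars of a into the first j chars of b
        int[][] d = new int[a.length() + 1][b.length() + 1];
        for (int i = 0; i <= a.length(); i++) d[i][0] = i; // delete everything
        for (int j = 0; j <= b.length(); j++) d[0][j] = j; // insert everything

        for (int i = 1; i <= a.length(); i++) {
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                int best = Math.min(
                        Math.min(d[i - 1][j] + 1,      // deletion
                                 d[i][j - 1] + 1),     // insertion
                        d[i - 1][j - 1] + cost);       // match or substitution
                // Two adjacent characters swapped count as a single edit.
                if (i > 1 && j > 1
                        && a.charAt(i - 1) == b.charAt(j - 2)
                        && a.charAt(i - 2) == b.charAt(j - 1)) {
                    best = Math.min(best, d[i - 2][j - 2] + 1);
                }
                d[i][j] = best;
            }
        }
        return d[a.length()][b.length()];
    }

    public static void main(String[] args) {
        System.out.println(distance("beagle", "beagel")); // 1 (one transposition)
        System.out.println(distance("poodle", "podle"));  // 1 (one deletion)
        System.out.println(distance("cat", "cot"));       // 1 (one substitution)
    }
}
```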

Finally, we can also apply a function score to our query. That is useful when we are also filtering data, because clauses that run in filter context do not contribute to the score. In our autocomplete, I used a default script score to wrap the query, but I could have defined any mathematical expression, any weights, or even combined functions. Practically speaking, the score allows us to sort the retrieved data by relevance.
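A rough sketch of that wrapping step, assuming the script_score query form (a function_score query with a script_score function would be another way to express it). The helper takes any query as a JSON string and a script source; passing _score as the script keeps the inner query's score unchanged, which is the simplest possible script. The class and method names are hypothetical, and the script source is assumed to contain no characters that need JSON escaping.

```java
// Sketch only: wraps an existing query body in a script_score query.
final class ScoreWrapper {
    private ScoreWrapper() {}

    // innerQueryJson: any valid query object, e.g. a bool query.
    // scriptSource:   scoring expression, e.g. "_score" to keep the inner score as-is.
    static String withScriptScore(String innerQueryJson, String scriptSource) {
        return "{\"script_score\":{"
                + "\"query\":" + innerQueryJson + ","
                + "\"script\":{\"source\":\"" + scriptSource + "\"}"
                + "}}";
    }

    public static void main(String[] args) {
        // match_all stands in for the real search query here.
        System.out.println(withScriptScore("{\"match_all\":{}}", "_score"));
    }
}
```

Replacing _score with an expression that also reads document fields is where custom weights would come in.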

