hardikSinghBehl/ngram-search-API

Search API to handle Spelling-Corrections

Based on N-gram index algorithm: using MySQL Ngram Full-Text Parser

Sample Screen-Recording

Screen.Recording.2021-10-21.at.1.20.43.PM.mov

Tech stack

  • MySQL Ngram Text Parser (REFERENCE)
  • EntityManager (JpaRepository can also be used)
  • Flyway
  • Spring-boot to expose REST API
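To show why an ngram index tolerates misspellings, here is a minimal sketch of ngram tokenization in plain Java. It is not the MySQL parser itself: the class `NgramTokenizer` and its methods are hypothetical names used for illustration, it handles a single word only, and the token size of 2 is an assumption matching MySQL's default `ngram_token_size`. Counting shared tokens is only a rough stand-in for the relevance score MySQL actually computes.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not part of this repository or of MySQL.
final class NgramTokenizer {

    // Splits a single word into overlapping tokens of the given size.
    static List<String> tokenize(String text, int size) {
        List<String> tokens = new ArrayList<>();
        for (int i = 0; i + size <= text.length(); i++) {
            tokens.add(text.substring(i, i + size));
        }
        return tokens;
    }

    // Counts the distinct tokens of `a` that also occur in `b`.
    static long sharedTokens(String a, String b, int size) {
        List<String> right = tokenize(b, size);
        return tokenize(a, size).stream().distinct().filter(right::contains).count();
    }

    public static void main(String[] args) {
        System.out.println(tokenize("potter", 2));              // [po, ot, tt, te, er]
        System.out.println(sharedTokens("potter", "poter", 2)); // 4
    }
}
```

The misspelled "poter" still shares four of the five bigrams of "potter", which is why a bigram index can surface the intended record.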

Important files

Customizing response through score values

With the ngram full-text index in place, MySQL computes a relevance score for every record against the provided keyword. The sample screenshot below shows the results (see the score column).

Screenshot 2021-10-21 at 6 43 37 PM

Depending on the use case and the functionality a particular application needs, we can add a condition to the native SQL query so that it returns only those records whose computed score exceeds a chosen threshold (for example 1.5 or 0.7).

Sample query:

SELECT id, name, bio,
    MATCH(name, bio) AGAINST('James Potter') AS score
FROM characters
WHERE MATCH(name, bio) AGAINST('James Potter') > 1.4
ORDER BY score DESC;

Screenshot 2021-10-21 at 6 52 41 PM

Adding the score condition reduced the number of returned records significantly. The threshold can also be made dynamic, either derived from the number of records being matched or supplied by the frontend as a request parameter to the API.
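A frontend-supplied threshold could be handled roughly as follows. This is a sketch under assumptions, not the repository's code: the class `ScoreThreshold`, the `:keyword` and `:minScore` placeholders, and the fallback rule are all hypothetical. The default of 1.4 is taken from the sample query above.

```java
// Hypothetical helper for illustration; names are not taken from this repository.
final class ScoreThreshold {

    // Fallback threshold, borrowed from the sample query.
    static final double DEFAULT_MIN_SCORE = 1.4;

    // Same query as the sample, with the keyword and threshold as named placeholders.
    static final String SEARCH_SQL =
            "SELECT id, name, bio, MATCH(name, bio) AGAINST(:keyword) AS score "
          + "FROM characters "
          + "WHERE MATCH(name, bio) AGAINST(:keyword) > :minScore "
          + "ORDER BY score DESC";

    // Uses the requested value when it is present and sensible, else the default.
    static double resolve(Double requested) {
        if (requested == null || requested.isNaN() || requested < 0) {
            return DEFAULT_MIN_SCORE;
        }
        return requested;
    }

    public static void main(String[] args) {
        System.out.println(resolve(null)); // 1.4
        System.out.println(resolve(0.7));  // 0.7
    }
}
```

With an EntityManager, the query could then be run via `createNativeQuery(ScoreThreshold.SEARCH_SQL)` followed by `setParameter("keyword", keyword)` and `setParameter("minScore", ScoreThreshold.resolve(minScore))`. Binding the values as parameters rather than concatenating them keeps user input out of the SQL text.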

Local Setup

  • Install Java 17 (recommended to use SdkMan)

sdk install java 17-open

  • Install Maven (recommended to use SdkMan)

sdk install maven

  • Clone the repo, configure the datasource values in the application.properties file, and run the command below

mvn clean install

  • To start the application

mvn spring-boot:run &

  • Access the sole API on the path(s) below

    • to view all character records
    http://localhost:8080/characters
    
    • to view character records based on a searchable keyword (page starts from index 1, and count defaults to 10 if no count param is provided)
    http://localhost:8080/characters?keyword=harry&count=20&page=1
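The `page` and `count` parameters above map naturally onto an offset and a limit. The following is a minimal sketch of that arithmetic, not the repository's implementation: the class `PageWindow` is a hypothetical name, and treating a missing or invalid `page` as 1 is an assumption (the README only states the 1-based index and the default count of 10).

```java
// Hypothetical helper for illustration; names are not taken from this repository.
final class PageWindow {

    // Default page size stated in the README.
    static final int DEFAULT_COUNT = 10;

    final int offset; // number of records to skip
    final int limit;  // maximum number of records to return

    private PageWindow(int offset, int limit) {
        this.offset = offset;
        this.limit = limit;
    }

    // Converts a 1-based page and an optional count into an offset/limit pair.
    static PageWindow of(Integer page, Integer count) {
        int limit = (count == null || count < 1) ? DEFAULT_COUNT : count;
        int pageIndex = (page == null || page < 1) ? 1 : page; // assumption: default to page 1
        return new PageWindow((pageIndex - 1) * limit, limit);
    }

    public static void main(String[] args) {
        PageWindow first = PageWindow.of(1, 20);
        System.out.println(first.offset + " " + first.limit); // 0 20
        PageWindow third = PageWindow.of(3, null);
        System.out.println(third.offset + " " + third.limit); // 20 10
    }
}
```

On a JPA query, the resulting values could be applied with `setFirstResult(window.offset)` and `setMaxResults(window.limit)`.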
    

About

Search API with spelling correction using ngram-index algorithm: implementation using Java Spring-boot and MySQL ngram full text search index
