Simish is a sentence similarity and matching microservice service built using golang. https://golang.org/
Installing Simish is simple. Assuming you have golang installed simply run.
go get github.com/tiltfactor/simish
To update to new version add the -u flag
go get -u github.com/tiltfactor/simish
Assuming that a MySQL database is being used as storage you can run
simish init
To generate a configuration file. See Databases section for information on extending support for other databases.
{
"username": "db_username",
"password": "db_password",
"ip_addr": "db_ipAddress eg 127.0.0.1",
"database": "db_name",
"db_port": "9000",
"server_port": "9000"
}
Once this file is generated
simish start
will start the simish service. To run it as a background process:
nohup simish start &
Simish comes with support for MySQL but can be easily extended to support other databases and storage methods.
To extend Simish with support for other storage systems simply implement the InputOutputStore interface (found in domain/InputResponse.go).
type InputOutputStore interface {
SaveInputOutput(InputOutput) error
Response(string, int64) (InputOutput, float64)
}
If you are extending the features supported by the storage (eg adding upvotes and downvotes storage) please be sure to also extend the InputOutputStore interface. This will make sure that the contract stays the same between storage backends and that the main program does not have to worry about the implementation details of the storage mechanism
To change the soft match algorithm edit the SoftMatch function in the domain/InputResponse file. Currently the service uses the JaroWinklerDistance to calculate the closeness of two sentences but this can easily be changed or extended by editing the SoftMatch function in domain/InputResponse.go
GET http://localhost:8000/api/v1/response?input=Hello&room=1
{
"input": "Hello",
"response": "#splat",
"match": "Hello",
"room": "1",
"score":1
}
The response consists of the provided input, the response that was found for that input, the existing input that was found for it, the room number used for searching, and the score (closeness) of the two inputs.