Author: Lars Yencken
Date: 21st Jan 2011


SimSearch is a dictionary search-by-similarity interface for Japanese kanji, providing a nice front-end for Kanjidic. It lets you find a kanji you don't know, using kanji that are visually similar.

If you're viewing this source code, you should be a developer, or someone at least a little comfortable with Python.


This is a quick guide to getting SimSearch up and running locally.


SimSearch uses MongoDB as its database backend, so install MongoDB first if you don't already have it. By default, SimSearch will create and use a MongoDB database called simsearch.
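Before building, it can help to confirm that MongoDB is actually listening. The following is a minimal sketch, not part of SimSearch: port_is_open is a hypothetical helper, and it assumes MongoDB is running on the same machine on its default port, 27017. It only checks that the TCP port accepts connections, not that the simsearch database exists.

```python
import socket


def port_is_open(host="localhost", port=27017, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds.

    Assumption: 27017 is MongoDB's default port; change it if your
    installation listens elsewhere.
    """
    try:
        conn = socket.create_connection((host, port), timeout=timeout)
    except (socket.error, socket.timeout):
        # Connection refused, host unreachable, or timed out.
        return False
    conn.close()
    return True
```

Calling port_is_open() with no arguments checks the default local MongoDB port. A False result usually means the MongoDB daemon has not been started.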

Next, you need Python (2.6/2.7), pip and virtualenv. Then you can install the necessary packages in an environment for simsearch:

$ pip -E ss-env install ./simsearch

Occasionally a dependency will fail to install cleanly (e.g. NLTK). In that case, you will need to download a package for it, enter the virtual environment and install the package from there:

$ tar xfz nltk-v2.08b.tgz
$ cd nltk-v2.08b
$ source /path/to/simsearch/ss-env/bin/activate
(ss-env) $ python setup.py install
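If you are unsure whether the environment is active before installing a package by hand, you can ask the interpreter itself. This is a small sketch with a hypothetical helper name, in_virtualenv. It relies on the fact that older virtualenv releases set sys.real_prefix, while newer virtualenv and the venv module make sys.base_prefix differ from sys.prefix.

```python
import sys


def in_virtualenv():
    """Return True if this interpreter is running inside a virtual environment."""
    # Older virtualenv releases set sys.real_prefix on the interpreter.
    if hasattr(sys, "real_prefix"):
        return True
    # Newer virtualenv and venv keep the system prefix in sys.base_prefix.
    return getattr(sys, "base_prefix", sys.prefix) != sys.prefix
```

Running this with the python found inside ss-env should return True. With the system python it should return False.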

Building and running

Once installed, build the database with:

$ python -m simsearch.models
Building similarity matrix
Building neighbourhood graph

You can then run the debug server. Once it is running, the server will be available at http://localhost:5000/.
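To check from a script that the debug server is answering, a sketch like the one below can be used. It is not part of SimSearch: server_status is a hypothetical helper, and the only detail taken from this guide is the URL http://localhost:5000/. The import fallback is there so the same code runs on Python 2.6/2.7 and on Python 3.

```python
try:
    # Python 3
    from urllib.request import urlopen
    from urllib.error import HTTPError, URLError
except ImportError:
    # Python 2.6/2.7
    from urllib2 import urlopen, HTTPError, URLError


def server_status(url="http://localhost:5000/", timeout=2.0):
    """Return the HTTP status code for url, or None if nothing answers."""
    try:
        response = urlopen(url, timeout=timeout)
    except HTTPError as exc:
        # The server answered, but with an error status such as 404 or 500.
        return exc.code
    except (URLError, IOError):
        # Connection refused, timed out, or host not found.
        return None
    try:
        return response.getcode()
    finally:
        response.close()
```

A return value of 200 means the front page loaded. None means nothing is listening at that address, so the server is probably not running.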


For deployment, please see the Flask documentation on deployment options. Feel free to email me as well if you run into any issues.
