prototype of a session-based semantic search engine


epic search is the prototype of a session-based semantic search engine.

Session-based search means that we fold previous queries from the user's query history into an expanded search query, so that we can interpret the user's requests in context. Semantic means that we don't rely on keyword matching alone to get search results, but also use probabilistic topic models to find relevant documents. Our goal is to support the user in complex exploratory search tasks.
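
To illustrate the session-based idea, here is a minimal sketch of query expansion. It is not the prototype's actual algorithm: the function name `expand_query`, the whitespace tokenization, and the exponential `decay` weighting are assumptions made for this example only.

```python
from collections import Counter


def expand_query(current_query, history, decay=0.5):
    """Combine the current query with earlier queries from the same session.

    Terms of the current query get weight 1.0. Terms of the i-th most recent
    earlier query get weight decay**i (i = 1 for the previous query), so
    older queries contribute less context.
    """
    weights = Counter()
    for term in current_query.lower().split():
        weights[term] += 1.0
    # Walk the history from the most recent query to the oldest one.
    for age, past_query in enumerate(reversed(history), start=1):
        for term in past_query.lower().split():
            weights[term] += decay ** age
    return dict(weights)


# Hypothetical session: two earlier queries, then the current one.
session = ["jaguar speed", "jaguar habitat"]
expanded = expand_query("diet", session)
```

Here the term "jaguar" never appears in the current query, yet it carries the highest weight among the history terms because it occurs in both earlier queries. That is the kind of context a session-based engine can use to interpret the ambiguous query "diet".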

The concepts are described in detail in the thesis Towards Collaborative Session-based Semantic Search by Sebastian Straub.

The prototype of the search engine consists of a web application written in Python 3 that uses the micro web framework Flask to process queries and serve search results.


To install the application with all required dependencies:

python3 install --user

Now it can be started with


Point your browser to http://localhost:1337 to see the homepage of the search engine. You won't get any search results until a search index is up and running (see next section).

You may pass a configuration file with the --config parameter. Its values override the default options defined in search_ui/res/base.conf.
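
The override behaviour can be pictured as layering one set of options on top of another. The sketch below models this with nested dictionaries. It says nothing about how the application parses its config files; the helper `apply_overrides` is hypothetical, and the example keys and default values are the ones mentioned in this README.

```python
def apply_overrides(defaults, overrides):
    """Return a copy of `defaults` with the values from `overrides` on top.

    Nested dictionaries are merged key by key; any other value in
    `overrides` replaces the default value.
    """
    merged = dict(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = apply_overrides(merged[key], value)
        else:
            merged[key] = value
    return merged


# Defaults as described in this README, written as nested dictionaries.
defaults = {
    "elastic": {"port": 9200},
    "data": {"topics": {"file": "data/topics.json"}},
}
# A custom configuration that only changes the Elasticsearch port.
custom = {"elastic": {"port": 9201}}
config = apply_overrides(defaults, custom)
```

Options that the custom configuration does not mention, such as the topic model path, keep their default values.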

Search Index

Please refer to the documentation of the search index builder to build your own search index.

If you have a working search index, make sure Elasticsearch is running on the same machine on port 9200, or set elastic.port in the configuration to match your setup.
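
If you are unsure whether Elasticsearch is reachable, a quick check like the following may help. It is a sketch that is not part of the application: the function name `elasticsearch_info` is made up for this example, and it relies only on the fact that Elasticsearch answers HTTP requests to its root URL with a JSON document.

```python
import json
import urllib.request


def elasticsearch_info(host="localhost", port=9200, timeout=2.0):
    """Return the JSON document served at the Elasticsearch root URL.

    Returns None if nothing answers on that host and port, or if the
    response is not valid JSON.
    """
    url = f"http://{host}:{port}/"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return json.loads(response.read().decode("utf-8"))
    except (OSError, ValueError):
        # OSError covers refused connections and timeouts (URLError is a
        # subclass); ValueError covers responses that are not JSON.
        return None


if __name__ == "__main__":
    info = elasticsearch_info()
    print("Elasticsearch is reachable" if info else "Elasticsearch is not reachable")
```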

There is one more thing you need for the search engine to work: a topic model. You can get it from the search index builder, which stores it in a file named *-topics.json, where * is the prefix you pass to the topic model script. By default, the search engine expects the file at data/topics.json, but you can set a different path in the config under the key data.topics.file.

This topic model is parsed on first launch, which can take about a minute. It is then cached in a binary format, so launching the search engine with a cached topic model should never take more than a few hundred milliseconds.
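
The parse-once, cache-afterwards pattern can be sketched as follows. This is an illustration only: the README does not say which binary format the application uses, so `pickle`, the `.cache` suffix, and the function name `load_topic_model` are assumptions, and the tiny JSON file is a stand-in for a real topic model.

```python
import json
import os
import pickle
import tempfile


def load_topic_model(json_path, cache_path=None):
    """Load a JSON file, keeping a binary copy next to it for fast reloads."""
    cache_path = cache_path or json_path + ".cache"
    # Use the cache only if it is at least as new as the JSON source.
    if (os.path.exists(cache_path)
            and os.path.getmtime(cache_path) >= os.path.getmtime(json_path)):
        with open(cache_path, "rb") as f:
            return pickle.load(f)
    # Slow path: parse the JSON text, then write the binary cache.
    with open(json_path, "r", encoding="utf-8") as f:
        model = json.load(f)
    with open(cache_path, "wb") as f:
        pickle.dump(model, f, protocol=pickle.HIGHEST_PROTOCOL)
    return model


# Demonstration with a small stand-in file in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "topics.json")
    with open(path, "w", encoding="utf-8") as f:
        json.dump({"topics": [["search", "query"], ["index", "model"]]}, f)
    first = load_topic_model(path)   # parses the JSON and writes the cache
    cache_written = os.path.exists(path + ".cache")
    second = load_topic_model(path)  # served from the binary cache
```

Comparing modification times means the cache is rebuilt automatically when the JSON file is replaced by a newer one.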


On some distributions, additional packages might be required that are not installed by default. E.g. on Debian:

apt install python3-dev python3-pip hunspell


Copyright 2016-2017 Sebastian Straub

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.

Different licensing conditions may apply to bundled libraries and web fonts.
