Cloud Mining automatically builds exploratory faceted search systems.
JavaScript Python CSS
branch: master

Building the DBLP instance

- updated the data with 2.4M documents
- improved documentation
- improved scripts required to build the instance
- clean up and typos (see diff)
latest commit 8eba3b597d
Alex Ksikes authored
cloudmining Updated for changes in SimSearch
examples Building the DBLP instance
scraping Building the DBLP instance
tests Minor commit changes
tools first commit
INSTALL.md Minor commit changes
LICENSE first commit
README.md documentation

README.md

Cloud Mining automatically builds exploratory faceted search systems. It uses Sphinx as the full-text retrieval engine, fSphinx for faceted search, and SimSearch for item-based search. The aim is to provide an interface that encourages nonlinear search and data exploration. Facets support different visualizations, such as tag clouds, histogram counts, or a rose diagram, and can be extended with plugins.
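For readers new to faceted search, here is a minimal, self-contained sketch of the idea in plain Python. It is not how Cloud Mining, Sphinx, or fSphinx compute facets. The documents, field names, and the `search` and `facet_counts` helpers are all hypothetical and exist only for illustration. Each facet reports how many matching documents carry each value, and these are the counts a tag cloud or histogram would display.

```python
from collections import Counter

# Hypothetical toy documents; the field names are made up for illustration.
DOCUMENTS = [
    {"title": "Alien", "year": 1979, "genres": ["Horror", "Sci-Fi"]},
    {"title": "Blade Runner", "year": 1982, "genres": ["Sci-Fi", "Thriller"]},
    {"title": "The Thing", "year": 1982, "genres": ["Horror", "Sci-Fi"]},
    {"title": "Heat", "year": 1995, "genres": ["Crime", "Thriller"]},
]


def _matches(doc, field, value):
    """True if the document's field equals `value` or, for list fields, contains it."""
    field_value = doc[field]
    if isinstance(field_value, list):
        return value in field_value
    return field_value == value


def search(documents, query="", filters=None):
    """Return documents whose title contains `query` and that satisfy every facet filter."""
    filters = filters or {}
    return [
        doc
        for doc in documents
        if query.lower() in doc["title"].lower()
        and all(_matches(doc, field, value) for field, value in filters.items())
    ]


def facet_counts(hits, field):
    """Count how often each value of `field` occurs among the hits, most frequent first."""
    counts = Counter()
    for doc in hits:
        value = doc[field]
        counts.update(value if isinstance(value, list) else [value])
    return counts.most_common()


if __name__ == "__main__":
    # Refine by the "Sci-Fi" genre, then show the remaining facet values.
    hits = search(DOCUMENTS, filters={"genres": "Sci-Fi"})
    print(facet_counts(hits, "year"))
    print(facet_counts(hits, "genres"))
```

Selecting a facet value narrows the result set, and the counts are then recomputed over the remaining hits. This loop of refining and recounting is what makes nonlinear exploration possible.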

Create a file called application.py with the following lines:

from cloudmining import CloudMiningApp
# FSphinxClient is provided by the fSphinx package
from fsphinx import FSphinxClient

# create a new CloudMining web application
app = CloudMiningApp()

# create a FSphinxClient from a configuration file
cl = FSphinxClient.FromConfig('/path/to/config/sphinx_client.py')

# set the fsphinx client of the app
app.set_fsphinx_client(cl)

Run application.py and point your browser to http://localhost:8080:

python application.py

On data from IMDb, you obtain the following interface:

Cloud Mining Generic Interface

And after customization, you get:

Cloud Mining Customized Interface

Check out some of the live instances. Have a look at the API for customization, and look into the example instances provided.

Thank you to Andy Gott for the logo design, to FAMFAMFAM and Fugue for the icons, and to RGraph for the rose diagram.
