Arya - Web search prototype built on Mongo

Overview

This project was a way to explore MongoDB and its Map Reduce system. It seemed like a fun project, and it is far from complete. It has a basic Indexer that, given a URL, creates a document in Mongo, indexes the content of the page, and makes it searchable. The Searcher provides a simple interface for running a query against the index.
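To make the index-then-search flow concrete, here is a minimal in-memory sketch. It is not the project's code: `TinyIndex`, `index_text`, and `tokenize` are hypothetical names, the page text is passed in directly instead of being fetched from a URL, nothing is stored in Mongo, and the score is assumed to be a plain sum of term frequencies. Only the shape of the results (a `document` with a `title`, plus a `score`) follows the console session in the Install section.

```python
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


class TinyIndex:
    """Hypothetical in-memory stand-in for the Mongo-backed indexer and searcher."""

    def __init__(self):
        self.documents = {}                # url -> {"url", "title"}
        self.postings = defaultdict(dict)  # term -> {url: term frequency}

    def index_text(self, url, title, text):
        """Store the document and record how often each term occurs in it."""
        self.documents[url] = {"url": url, "title": title}
        for term, count in Counter(tokenize(text)).items():
            self.postings[term][url] = count

    def search(self, query):
        """Score each document by summing the frequencies of the query terms."""
        scores = Counter()
        for term in tokenize(query):
            for url, count in self.postings.get(term, {}).items():
                scores[url] += count
        # most_common() returns the highest-scoring documents first.
        return [{"document": self.documents[url], "score": score}
                for url, score in scores.most_common()]
```

A real indexer would also need to fetch and parse the page, and to clear old postings when a URL is indexed again; both are left out of this sketch.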

Install

Download, install, and run MongoDB: http://www.mongodb.org/downloads

pip install -r requirements.txt

With that installed, you are ready to interact with it from the Python console:

>>> import indexing.indexer as indexer
>>> import searchers.searcher as s
>>> idx = indexer.Indexer()
>>> idx.index_url('http://www.mongodb.org/display/DOCS/Philosophy')
>>> idx.index_url('http://www.mongodb.org/display/DOCS/Use+Cases')
>>> [(x['document']['title'], x['score']) for x in s.Searcher().search('cloud')]
>>> [(x['document']['title'], x['score']) for x in s.Searcher().search('database power')]
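The Overview mentions Mongo's Map Reduce system. The sketch below shows the general map/reduce pattern for counting terms across documents, written in plain Python so it runs without a database. It is an illustration only: the project's actual map and reduce functions are not shown in this README, and the `body` field and the function names here are assumptions.

```python
import re
from collections import defaultdict


def map_terms(doc):
    """Map step: emit a (term, 1) pair for every token in the document body."""
    for term in re.findall(r"[a-z0-9]+", doc["body"].lower()):
        yield term, 1


def reduce_counts(term, values):
    """Reduce step: combine all values emitted for one term into a total."""
    return sum(values)


def map_reduce(docs):
    """Group the mapped values by key, then reduce each group."""
    grouped = defaultdict(list)
    for doc in docs:
        for key, value in map_terms(doc):
            grouped[key].append(value)
    return {key: reduce_counts(key, values) for key, values in grouped.items()}


# "body" is a hypothetical field name holding the page text.
docs = [{"body": "Cloud database"}, {"body": "database power"}]
term_counts = map_reduce(docs)
```

In Mongo the map and reduce functions run inside the database over a collection; the grouping done here by `map_reduce` is what the server handles for you.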