Solr object mapping and schema generation without all the XML

albarrentine/bebop

                                          .-'''-.                           
                                         '   _    \                         
/|              __.....__     /|       /   /` '.   \_________   _...._      
||          .-''         '.   ||      .   |     \  '\        |.'      '-.   
||         /     .-''"'-.  `. ||      |   '      |  '\        .'```'.    '. 
||  __    /     /________\   \||  __  \    \     / /  \      |       \     \
||/'__ '. |                  |||/'__ '.`.   ` ..' /    |     |        |    |
|:/`  '. '\    .-------------'|:/`  '. '  '-...-'`     |      \      /    . 
||     | | \    '-.____...---.||     | |               |     |\`'-.-'   .'  
||\    / '  `.             .' ||\    / '               |     | '-....-'`    
|/\'..' /     `''-...... -'   |/\'..' /               .'     '.             
'  `'-'`                      '  `'-'`              '-----------'           

----------------------------------------------------------------------------

Bebop is a Solr abstraction library written for the busy developer.

pysolr is great, and is in fact a dependency of this library, so I'm not trying to duplicate effort on that front. I'd say pysolr is to MySQLdb as bebop is to SQLAlchemy. Bebop is designed to be a full-fledged ORM (OIM? Object-Index Mapper?) for translating between your domain models and your Solr index.

Similar projects include Django's haystack, although that is not designed specifically with Solr in mind, and it can neither generate Solr's various XML schemas nor produce queries as tailored to Solr as bebop can. I'm not particularly concerned with Whoosh, Xapian, or Sphinx.

Despite the dynamic duo reference, bebop is in no way dependent on rocksteady and is agnostic as to which domain models you're trying to create.

I've had bad experiences in the past with trying to do a Solr import all in one SQL query. I'll try to provide some examples with server-side cursors to show what a chunked import looks like.
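
In the meantime, here is a rough sketch of the chunked idea. It is not bebop's API: iter_chunks, index_in_chunks, and the add and to_doc parameters are hypothetical names made up for illustration. It only assumes a DB-API cursor (anything with fetchmany) and some callable that accepts a list of documents.

```python
def iter_chunks(cursor, chunk_size=1000):
    """Yield lists of rows from a DB-API cursor, at most chunk_size at a time."""
    while True:
        rows = cursor.fetchmany(chunk_size)
        if not rows:
            break
        yield rows


def index_in_chunks(cursor, add, to_doc, chunk_size=1000):
    """Send rows to the index one chunk at a time.

    cursor -- a DB-API cursor that has already executed the import query
    add    -- hypothetical callable taking a list of documents
    to_doc -- hypothetical callable turning one row into one document (a dict)

    Returns the total number of documents sent.
    """
    total = 0
    for rows in iter_chunks(cursor, chunk_size):
        docs = [to_doc(row) for row in rows]
        add(docs)  # one request per chunk instead of one giant request
        total += len(docs)
    return total
```

With psycopg2, a named cursor (conn.cursor(name="solr_import")) is a server-side cursor, so fetchmany pulls rows from the database in batches rather than loading the whole result set into memory. With pysolr, the add method of a pysolr.Solr instance could plausibly be passed as the add callable, since it takes a list of dicts.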

TODO: USAGE EXAMPLES!!!!

Some dependency stuff:

1) The schema generation stuff uses lxml. It's a dependable and super-fast library, but it requires C compilation, which is always a bitch, so make sure you have the following packages installed (using Debian/Ubuntu names because that's what I know): libxml2, libxslt1, libxml2-dev, libxslt1-dev

2) pysolr is currently the underlying driver for this project. That's the kind of thing that might benefit from a C implementation, although I suspect that if you need that at the "driver" level you're probably doing something wrong.

Running the tests:

cd bebop
nosetests ./test

On generated schemas:

I've debated how much deploy stuff to put in this lib, and have decided to stay out of it wherever possible. I'm not going to touch setting up Jetty/Tomcat/Resin and where you put your data vs. schema etc. In the simplest case, just copy the example stuff from your Solr distro to /opt/solr and replace the "solr" directory with what bebop generates for you. Then you should be able to start your servlet container (e.g. "java -jar start.jar" in the case of Jetty) pretty seamlessly.

Word.
