When indexing a small academic college library's MARC records, I ran into more Unicode issues related to misencoded MARC records. Some of these problems resulted from the string versus unicode handling in Python 2.6+, which I was able to eliminate because in Python 3 all strings are Unicode.
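As a rough illustration of the kind of problem described above, the following sketch shows how Python 3 separates raw bytes from text and how a fallback decode might cope with a field that is not valid UTF-8. The `decode_field` helper is hypothetical and is not part of py3-marc-indexer or pymarc; it also assumes the alternative encoding is Latin-1, which will not be true for records in legacy MARC-8.

```python
def decode_field(raw: bytes) -> str:
    """Hypothetical helper: decode raw field bytes into a Python 3 str.

    Tries UTF-8 first, then falls back to Latin-1 (an assumption made
    for this sketch; real records may use MARC-8 instead).
    """
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # Latin-1 maps every byte to a code point, so this never raises,
        # but the result is only correct if the data really is Latin-1.
        return raw.decode("latin-1")


# The same title encoded two different ways, as might happen in a
# file that mixes correctly and incorrectly encoded records.
utf8_title = "Café".encode("utf-8")
latin1_title = "Café".encode("latin-1")

print(decode_field(utf8_title))
print(decode_field(latin1_title))
```

Because Python 3 will not silently mix `bytes` and `str`, an encoding mismatch like this surfaces as an explicit `UnicodeDecodeError` at the decode step rather than later during indexing.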
The py3-marc-indexer project is licensed under the Apache 2 open source license.
The recommended way to install py3-marc-indexer is to create a Python 3 virtualenv first.
- Clone the forked pymarc project and install pymarc with the following commands. (Assumes you are running a Linux VM.)
- $ git clone firstname.lastname@example.org:jermnelson/pymarc.git
  $ cd pymarc
  $ git branch py3
  $ python setup.py install
- Clone the py3-marc-indexer project with these commands.
- $ git clone email@example.com:jermnelson/py3-marc-indexer.git
  $ cd py3-marc-indexer
  $ git branch multiprocess
To index MARC records, run the index command along with a path to your MARC records file. If your Solr server resides on a different machine, you'll need to change the stub_conf located in the index.py file. The next refactoring of this code base will likely integrate it into a Python 3 version of the FRBR-Redis-Datastore's Aristotle Django environment.
- $ python index.py /path-to-marc-file/test.mrc