Reading MARC record files
After all of the command line options and their respective arguments, the remainder of the command line specifies one or more MARC record files to be read and processed. These files can be:
- Binary MARC records using the MARC8 character encoding
- Binary MARC records using the UTF-8 character encoding
- Binary MARC records using some other character encoding
- MARCXML record files
- MarcInJSON record files
- MarcBreaker (.mrk) ASCII-encoded MARC record files as produced by MarcEdit
- MarcBreaker (.mrk8) UTF-8 encoded plain-text MARC record files as produced by MarcEdit
The MarcReaderFactory is called for each filename to determine what specific type of MarcReader should be created. Then all of the MarcReaders are placed in a MarcMultiplexReader object that will return records from the first reader until it is empty and then switch to the next reader, and so on, until all records from all files have been read.
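The sequential hand-off described above can be illustrated with a small, self-contained sketch. It is not the SolrMarc implementation: `ChainedReader` is a hypothetical class written for this example, and plain `Iterator<String>` objects stand in for the per-file MarcReaders so that the code runs without Marc4j on the classpath.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

/**
 * Hypothetical illustration of a multiplexing reader: it drains the first
 * underlying reader, then moves on to the next, until all are exhausted.
 * This is an assumption-based sketch, not the MarcMultiplexReader source.
 */
public class ChainedReader<T> implements Iterator<T> {
    private final Iterator<Iterator<T>> readers; // remaining readers, in command line order
    private Iterator<T> current = null;          // reader currently being drained

    public ChainedReader(List<Iterator<T>> readers) {
        this.readers = readers.iterator();
    }

    @Override
    public boolean hasNext() {
        // Skip past exhausted (or empty) readers until one still has a record.
        while (current == null || !current.hasNext()) {
            if (!readers.hasNext()) {
                return false; // every reader has been fully consumed
            }
            current = readers.next();
        }
        return true;
    }

    @Override
    public T next() {
        if (!hasNext()) {
            throw new NoSuchElementException("all readers are exhausted");
        }
        return current.next();
    }

    /** Convenience helper for the demo: collect every remaining record. */
    public List<T> drain() {
        List<T> out = new ArrayList<>();
        while (hasNext()) {
            out.add(next());
        }
        return out;
    }

    public static void main(String[] args) {
        // Three stand-in "files"; the second one contains no records.
        List<Iterator<String>> files = new ArrayList<>();
        files.add(Arrays.asList("a1", "a2").iterator());
        files.add(new ArrayList<String>().iterator());
        files.add(Arrays.asList("c1").iterator());

        ChainedReader<String> reader = new ChainedReader<>(files);
        System.out.println(reader.drain()); // prints [a1, a2, c1]
    }
}
```

Because the readers are consumed strictly one after another, records come out in the order the files were listed, and an empty file is skipped without ending the read early.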
It might be possible to have multiple readers running in parallel, each sending records to the readQ, but the Marc4j classes are likely not thread-safe, and the reading of records doesn't seem to be the bottleneck in any case.