A reader for MARC 21 files

MARC 21

MAchine-Readable Cataloging record.

marc21.php: a class to iterate over the records of a file in MARC 21 format.
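As a rough illustration of what iterating over MARC 21 records involves, here is a minimal sketch based on the published record structure: a 24-byte leader, a directory of 12-byte entries, and fields ending in the field terminator 0x1E. The function names readMarcRecords and parseMarcRecord are hypothetical and are not the API of marc21.php; see index.php for how the actual class is used.

```php
<?php
// Hypothetical helpers for illustration only; not the API of marc21.php.

/** Yields one parsed record (a list of [tag, raw field data] pairs) per MARC record. */
function readMarcRecords(string $path): Generator
{
    $handle = fopen($path, 'rb');
    if ($handle === false) {
        throw new RuntimeException("Cannot open $path");
    }
    try {
        // Leader positions 00-04 hold the total record length as five ASCII digits.
        while (($lengthField = fread($handle, 5)) !== false && strlen($lengthField) === 5) {
            $length = (int) $lengthField;
            if ($length < 24) {
                throw new RuntimeException("Implausible record length: $lengthField");
            }
            yield parseMarcRecord($lengthField . fread($handle, $length - 5));
        }
    } finally {
        fclose($handle);
    }
}

function parseMarcRecord(string $record): array
{
    // Leader positions 12-16 hold the base address of the data area.
    $baseAddress = (int) substr($record, 12, 5);
    // The directory lies between the leader and the field terminator that precedes the data.
    $directory = substr($record, 24, $baseAddress - 24 - 1);
    if ($directory === '') {
        return [];
    }
    $fields = [];
    foreach (str_split($directory, 12) as $entry) {
        $tag    = substr($entry, 0, 3);        // 3-character field tag
        $length = (int) substr($entry, 3, 4);  // field length, including the terminator
        $start  = (int) substr($entry, 7, 5);  // offset relative to the base address
        // Drop the trailing field terminator (0x1E) from the field data.
        $fields[] = [$tag, substr($record, $baseAddress + $start, $length - 1)];
    }
    return $fields;
}
```

Looping with foreach over readMarcRecords($path) would then deliver each record's fields in directory order. Like the reader itself, this sketch does no real validation of the record contents.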

A description of the MARC 21 format can be found here: https://www.loc.gov/marc/umb/

The test data in the mrc directory are derived from http://datendienst.d-nb.de/

NOTE: The data files hold characters in decomposed UTF-8 form. Each string is therefore normalized with the PHP function normalizer_normalize to bring it back into composed UTF-8 form before it is stored.
The reason is that strings from the data file are compared against strings used by PHP.
In decomposed form, special characters such as the umlauts ä ö ü take more bytes than in the composed form that PHP strings typically hold, so the two forms do not compare as equal.
In addition, some fonts may have trouble rendering decomposed UTF-8 correctly.
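The effect described in this note can be reproduced with a short sketch. It assumes the intl extension is installed (it provides normalizer_normalize) and PHP 7 or later for the \u{...} escapes; the variable names are made up for this example.

```php
<?php
// Requires the intl extension, which provides normalizer_normalize().

// "ä" in decomposed form: "a" (U+0061) followed by COMBINING DIAERESIS (U+0308).
$decomposed = "a\u{0308}";
// "ä" in composed form: the single code point U+00E4.
$composed = "\u{00E4}";

var_dump(strlen($decomposed));          // int(3)
var_dump(strlen($composed));            // int(2)
var_dump($decomposed === $composed);    // bool(false)

// Normalize to composed form (NFC) before storing or comparing.
$normalized = normalizer_normalize($decomposed, Normalizer::FORM_C);
var_dump($normalized === $composed);    // bool(true)
```

Both strings display as the same character, but only the normalized one matches the composed string byte for byte.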



More on decomposed and composed UTF-8: unicode.org
More excellent information on this subject: http://kunststube.net/encoding/
Even more interesting if you have to store UTF-8 in a database: http://kunststube.net/frontback/




Added code to skip over the NON-SORT BEGIN and NON-SORT END characters.
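A minimal sketch of such a skip follows. It assumes that the data is UTF-8, where NON-SORT BEGIN and NON-SORT END are the control characters U+0098 and U+009C, and that only the markers are dropped while the enclosed text is kept. The function name stripNonSortMarkers is hypothetical; the code in marc21.php may work differently.

```php
<?php
// Hypothetical helper for illustration; not taken from marc21.php.
// Assumes UTF-8 data: NON-SORT BEGIN = U+0098, NON-SORT END = U+009C.
function stripNonSortMarkers(string $value): string
{
    // Remove only the two marker characters; the text between them stays.
    return str_replace(["\u{0098}", "\u{009C}"], '', $value);
}

// The article "Der " is marked as non-sorting in this made-up title.
$title = "\u{0098}Der \u{009C}Zauberberg";
echo stripNonSortMarkers($title), PHP_EOL;   // prints "Der Zauberberg"
```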
 

For usage of the reader, see index.php.
A demo is available at https://vz139.worldserver.net/marc21/

NOTE: I do not check for any syntax errors within a file. I just assume
that the file exists, is readable, and is syntactically correct.
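Since the reader makes these assumptions, a caller may want a cheap check before handing a file over. The following is only a sketch: checkMarcFile is a hypothetical function, not part of marc21.php, and it verifies nothing beyond readability and the five-digit record length that opens a MARC 21 leader.

```php
<?php
// Hypothetical pre-flight check; not part of marc21.php.
// Returns null when the file passes, otherwise a short description of the problem.
function checkMarcFile(string $path): ?string
{
    if (!is_file($path) || !is_readable($path)) {
        return "File does not exist or is not readable: $path";
    }
    // A MARC 21 record starts with its length as five ASCII digits.
    $start = file_get_contents($path, false, null, 0, 5);
    if ($start === false || preg_match('/\A[0-9]{5}\z/', $start) !== 1) {
        return "File does not start with a five-digit record length: $path";
    }
    return null;
}
```

A caller could report the returned message and skip the file instead of passing it to the reader.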

NOTE: NEVER ASSUME