eiro/p5-marc-mir

What is MARC::MIR

This repo contains 2 things that must be clearly separated at some point:

  • A specification of an in-memory, agnostic representation of MARC records.
  • A Perl implementation of this spec, with
    • a very fast ISO2709 serialization/deserialization module
    • a very nice set of helpers that make MIR manipulations very easy.

See lib/MARC/MIR/Tutorial.pod for further information.

MARC::MIR out of the Perl world

  • A Python port of MARC::MIR was written during pycon.fr 2013. I have also seen a prototype of a MARC::MIR::Template port.

  • I wrote (and have yet to release) a port of MARC::MIR in ... Java :) because it was an excuse to learn about NIO2 and it would give us the opportunity to use MIR from Clojure and Perl6. As I no longer have any use for MARC::MIR, motivation is hard to find.

Quo vadis, MARC::MIR ?

The first steps of this project came quickly and looked promising before it was discontinued about 2 years ago (the last CPAN release was in January 2013).

  • All this code was preliminary work for a big project to merge 4 ILS at Strasbourg University. The project was canceled when the university decided to use commercial solutions for this work.

  • Since then, I occasionally work on MARC records, and MARC::MIR is usable enough.

  • I never had the occasion to work directly with librarians to set up a complex use case and ensure the whole toolchain can be used without the help of a programmer.

  • As I really expect MARC (as well as some librarians) to die, I had no interest in working on this project, even in my spare time.

So if someone wants to take over my modules, please feel free. Also, there are 2 ways to increase my own motivation:

  • invite me to a workshop with some librarians to work on use cases (the mechanics are in good enough shape to go to the battlefield)
  • donate, or hire me as a contractor (the whole todo list, including documentation and test suites, is about 1 full-time month; some tasks take just a few hours).

Todo

  • make a clear separation between specs and implementation
  • complete the reference documentation
  • write/improve tutorials
  • complete test suites
    • it SHOULD be done in an agnostic way
    • it MUST be battle-tested on records written from scratch (a case I almost never had to handle)
  • finish canceled projects

Canceled projects

normalize, validate and query MARC records

As a MIR is just an array of arrays, very simple validations can be written using any existing tool that can query or validate a data structure. I gave Kwalify a try; you can find the documented source in eg/validate.
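Because a MIR is plain nested arrays, such a check can also be sketched with nothing but core Perl. The example below assumes a record shaped as [ leader, [ fields ] ], where a control field is [ tag, value ] and a data field is [ tag, [ [ code, value ], ... ] ]. That layout, the sample record and the helper name missing_tags are assumptions made for illustration; lib/MARC/MIR/Tutorial.pod remains the reference for the actual structure.

```perl
use strict;
use warnings;

# Hypothetical record; the nested-array layout is an assumption, not the spec.
my $record = [
    '00000nam  2200000   450 ',
    [
        [ '001', 'ID0001' ],
        [ '200', [ [ a => 'rome' ], [ f => 'anonymous' ] ] ],
    ],
];

# Return the required tags that do not appear in the record.
sub missing_tags {
    my ( $rec, @required ) = @_;
    my %seen = map { $_->[0] => 1 } @{ $rec->[1] };
    return grep { !$seen{$_} } @required;
}

my @missing = missing_tags( $record, qw( 001 200 700 ) );
print "missing: @missing\n";    # prints "missing: 700"
```

A dedicated validator such as Kwalify expresses the same kind of rule declaratively; the point here is only that no MARC-specific tooling is needed to walk the structure.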

Some other pages deserve further reading and testing, though:

But I guess this method would consume a lot of resources to validate or query a large set. Plus, rules can be really tedious to write for cases rarely seen outside the MARC world. What if you want to find a title matching "rome", "romulus" or "remus" (assuming the title can be in 200$a, 200$b or 200$c)?

For this range of problems, I use MARC::MIR helpers directly from Perl:

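use MARC::MIR;
use Perlude;

# Print the current raw ISO2709 record ($_) when at least one
# 200$a, 200$b or 200$c value matches one of the three names.
sub keep_roman_books {
    print if
            grep $_
            , map_values { /rome|romulus|remus/ }
                [ 200, [qw( a b c )] ]
                ,  from_iso2709
}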

However, I can't expect librarians to do so. That's why we need more tools, which

  • MUST be agnostic (usable from any technology or programming language)
  • MUST be GUI driven (it should be possible to write rules from a web interface)
  • MAY come with a DSL: intensive users will probably go crazy with web interfaces.

S-expressions would be perfect, not only because they would be very easy to implement and port but also because they would be very pleasant to read and edit:

(any 200$[abc] (~ "rome|romulus|remus"))
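To give an idea of how little code such a syntax needs, here is a minimal sketch of a reader that turns the expression above into nested Perl arrays. It is not the prototype mentioned in this README: the function names tokenize and read_expr are hypothetical, and the grammar (parentheses, double-quoted strings, bare atoms) is only what the single example suggests.

```perl
use strict;
use warnings;

# Split the source into parentheses, quoted strings and bare atoms.
sub tokenize {
    my ($src) = @_;
    return $src =~ / \( | \) | "(?:[^"\\]|\\.)*" | [^\s()"]+ /gx;
}

# Build nested array references from a token list (consumed in place).
sub read_expr {
    my ($tokens) = @_;
    die "unexpected end of input\n" unless @$tokens;
    my $tok = shift @$tokens;
    if ( $tok eq '(' ) {
        my @list;
        while ( @$tokens && $tokens->[0] ne ')' ) {
            push @list, read_expr($tokens);
        }
        die "missing closing parenthesis\n" unless @$tokens;
        shift @$tokens;    # drop the ')'
        return \@list;
    }
    die "unexpected ')'\n" if $tok eq ')';

    # Strip the quotes of a string literal; escapes are left untouched.
    return $tok =~ /^"(.*)"$/s ? $1 : $tok;
}

my @tokens = tokenize('(any 200$[abc] (~ "rome|romulus|remus"))');
my $query  = read_expr( \@tokens );

# $query is now [ 'any', '200$[abc]', [ '~', 'rome|romulus|remus' ] ]
```

Interpreting atoms such as 200$[abc] (tag plus subfield class) is left out on purpose: that is where the real design decisions of the DSL would be made.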

Both the GUI and the DSL would store a query as a data structure that can be used by validation tools. The first idea that comes to mind would be (written in YAML):

any:
  - field: 200
  - subfield: [a, b, c]
  - value: [match, "rome|romulus|remus" ]
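As a rough illustration of how a validation tool could consume that structure, the sketch below evaluates it against one record in plain Perl. The query is written as the Perl data a YAML loader would return; the record layout ([ leader, [ fields ] ] with data fields holding [ code, value ] pairs), the sample record and the function name match_any are assumptions for the example, not part of MARC::MIR.

```perl
use strict;
use warnings;

# The YAML query above, as a YAML loader would hand it to Perl.
my $query = {
    any => [
        { field    => 200 },
        { subfield => [qw( a b c )] },
        { value    => [ match => 'rome|romulus|remus' ] },
    ],
};

# Hypothetical record; the nested-array layout is an assumption.
my $record = [
    '00000nam  2200000   450 ',
    [
        [ '001', 'ID0001' ],
        [ '200', [ [ a => 'romulus et remus' ], [ f => 'anonymous' ] ] ],
    ],
];

# True when at least one selected subfield value matches the pattern.
sub match_any {
    my ( $rec, $clauses ) = @_;
    my %rule;
    %rule = ( %rule, %$_ ) for @$clauses;    # merge the one-key hashes
    my %wanted = map { $_ => 1 } @{ $rule{subfield} };
    my ( $op, $pattern ) = @{ $rule{value} };
    die "unsupported operator: $op\n" unless $op eq 'match';

    for my $field ( @{ $rec->[1] } ) {
        my ( $tag, $content ) = @$field;
        next unless $tag eq $rule{field} && ref $content;    # skip control fields
        for my $sub (@$content) {
            return 1 if $wanted{ $sub->[0] } && $sub->[1] =~ /$pattern/;
        }
    }
    return 0;
}

print match_any( $record, $query->{any} ) ? "keep\n" : "skip\n";    # prints "keep"
```

Only the match operator is handled; other operators would be added as further branches on $op.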

I gave a lightning talk on a prototype at the French Perl Workshop 2013; a nicer version, which I now use daily for other purposes, is on my GitHub.

ISO-5426 from/to UTF-8

I wrote an ICU Unicode charmap and tested it using Perl's encoding system. It was very fast, but we had to deal with some unexpected symbols, so I didn't release the whole thing.

It is now pretty clear that the unexpected symbols come from our ILS's extension of the charmap. We came up with the idea of writing an 'ISO-5426-loose-unistra' table, but we never took the time to do it.

MARC::MIR::Template

MARC::MIR::Template works very well if you don't care about field indicators. I never had to deal with those indicators, so "it worked for me".

Also, there is a known bug, but I documented the workaround and never found time to investigate it. I wrote a test in the suite that can be uncommented in case someone fixes it.

About

Perl lib for MARC Intermediate Representation
