Customize PICA::Writer::Plus format #4

Closed
nichtich opened this issue Sep 23, 2013 · 8 comments

@nichtich
Member

Some use \N{LINE FEED} and some use \N{INFORMATION SEPARATOR THREE} as record separator, so this should be configurable.

nichtich added a commit that referenced this issue Sep 23, 2013
This closes issue #2. For customization of PICA::Writer::Plus I created issue #4.
@ghost ghost assigned jorol Sep 24, 2013
@nichtich
Member Author

PICA::Record contains several test files in various dialects of PICA+.

@jorol
Contributor

jorol commented Sep 24, 2013

Some of these examples contain only a single record, so I'm not sure how multiple records are separated.

Can we categorize the different dialects?

  • Plus (/t/files/dumpformat)
    subfield separator: \N{INFORMATION SEPARATOR ONE}
    field separator: \N{INFORMATION SEPARATOR TWO}
    record separator: \N{INFORMATION SEPARATOR THREE} or \N{LINE FEED}
  • Plain (/t/files/graveyard.pica, /t/files/minimal.pica)
    subfield separator: \N{DOLLAR SIGN}
    field separator: \N{LINE FEED}
    record separator: \N{LINE FEED}\s*\N{LINE FEED} (empty line)
  • ??? (/t/files/kochbuch.pica)
    subfield separator: \N{INFORMATION SEPARATOR ONE}
    field separator: \N{LINE FEED}
    record separator: ?

Should we implement a separate Parser/Writer (Plus, Plain, ...) for each dialect?
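
To make the question concrete, here is a minimal sketch of the alternative: one parser parameterized by a table of separators instead of one class per dialect. It is written in Python only to stay self-contained; `Dialect`, `DIALECTS` and `parse` are hypothetical names, not part of PICA::Record or PICA::Data. It assumes every field looks like `TAG subfields`, ignores escaping (such as `$$` in plain PICA), and leaves out the third dialect because its record separator is unknown.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Dialect:
    subfield_sep: str  # literal string placed before each subfield code
    field_sep: str     # literal string between (or after) fields
    record_sep: str    # regular expression matching a record boundary


# Separators as listed in the categorization above.
DIALECTS = {
    # IS1 = \x1f, IS2 = \x1e, IS3 = \x1d
    "plus": Dialect("\x1f", "\x1e", r"[\x1d\n]"),
    "plain": Dialect("$", "\n", r"\n\s*\n"),
}


def parse(text, dialect):
    """Return a list of records; a record is a list of (tag, subfields)."""
    records = []
    for chunk in re.split(dialect.record_sep, text):
        fields = []
        for line in chunk.split(dialect.field_sep):
            if not line.strip():
                continue  # skip empty pieces, e.g. after a trailing separator
            tag, _, rest = line.partition(" ")
            # Each non-empty piece is a one-character code followed by its value.
            subfields = [(p[0], p[1:]) for p in rest.split(dialect.subfield_sep) if p]
            fields.append((tag, subfields))
        if fields:
            records.append(fields)
    return records


plain_text = "003@ $0123\n021A $aTitle\n\n003@ $0456\n"
plus_text = "003@ \x1f0123\x1e021A \x1faTitle\x1e\n003@ \x1f0456\x1e\n"

plain_records = parse(plain_text, DIALECTS["plain"])
plus_records = parse(plus_text, DIALECTS["plus"])
```

Both inputs yield the same two records here, which suggests the dialects differ mainly in their separators. Whether that holds for the WinIBW files is a separate question.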

@nichtich
Member Author

BTW, in addition to a general PICA exporter/importer, one could also define modules for each type (e.g. Catmandu::Importer::PICAPlus, equivalent to type=picaplus) for less typing.

@nichtich
Member Author

PICA::Data 0.28 adds "binary" PICA with INFORMATION SEPARATOR THREE as record separator. I checked all examples from https://metacpan.org/source/VOJ/PICA-Record-0.584/t/files and could parse all of them with the existing dialects (XML, plain, plus, binary) except dumpformat, winibwsave.example, winibwsave.example2 and winibwsave.example3. Let's keep this issue open until these formats have been dealt with.

@jorol
Contributor

jorol commented Jan 29, 2018

Added PICA::Writer::Generic for custom field, subfield and record separators.
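
For readers who want the idea without reading the module, the following is a rough, language-neutral sketch in Python of what a writer with configurable separators does. It is not the PICA::Writer::Generic implementation or its interface: the class name `GenericWriter`, its parameter names and the record layout (a list of `(tag, subfields)` pairs) are assumptions made for illustration, and no escaping of separator characters inside values is attempted.

```python
import io


class GenericWriter:
    """Hypothetical writer: serializes records with caller-chosen separators."""

    def __init__(self, fh, subfield_sep="\x1f", field_sep="\x1e", record_sep="\n"):
        self.fh = fh
        self.subfield_sep = subfield_sep  # written before each subfield code
        self.field_sep = field_sep        # written after each field
        self.record_sep = record_sep      # written after each record

    def write(self, record):
        for tag, subfields in record:
            self.fh.write(tag + " ")
            for code, value in subfields:
                self.fh.write(self.subfield_sep + code + value)
            self.fh.write(self.field_sep)
        self.fh.write(self.record_sep)


record = [("003@", [("0", "123")]), ("021A", [("a", "Title")])]

# Defaults: IS1 / IS2 separators with LINE FEED between records.
plus_out = io.StringIO()
GenericWriter(plus_out).write(record)

# Same field layout, but INFORMATION SEPARATOR THREE between records.
binary_out = io.StringIO()
GenericWriter(binary_out, record_sep="\x1d").write(record)

# Dollar sign and line feeds, with an empty line between records.
plain_out = io.StringIO()
GenericWriter(plain_out, subfield_sep="$", field_sep="\n", record_sep="\n").write(record)
```

With separators passed in as plain arguments, the LINE FEED versus INFORMATION SEPARATOR THREE choice from the original report is a one-argument change rather than a new writer class.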

ToDo:

  • Add PICA::Writer::Generic to Catmandu::Exporter::PICA

  • PICA::Parser::WinIBW

  • PICA::Writer::WinIBW

@cKlee
Member

cKlee commented Feb 7, 2018

PICA::Writer::WinIBW ?? What should this be good for?

@jorol
Contributor

jorol commented Feb 7, 2018

Good question ;-) I'll drop this until someone has a use case for it.

@nichtich
Member Author

@cKlee @jorol can I close this issue? PICA::Data supports all variants listed at http://format.gbv.de/pica. Only PICA JSON could still be added, to simplify access to the short form without the record => wrapping element, if this simplifies anyone's use case.

@jorol jorol closed this as completed May 31, 2018