Suggestion of renaming the specification elements, and make it more clear #25

Open
pkiraly opened this issue Jun 8, 2018 · 4 comments

Comments

@pkiraly

pkiraly commented Jun 8, 2018

While learning the specification and working on an implementation, I reached several conclusions that I would like to share with you.

Comments on existing features:

  • The standard has two main parts: one specifies a given part of a MARC record, the other provides a condition. For the first one I suggest using "path" or "address".
  • The "spec" suffix is used several times in the specification. Because of XPath and JSONPath, I suggest using "path" instead, or "address", or even no suffix at all, which is what I use in my suggestion below.
  • subspec is not a very expressive name; I suggest "conditions" and "conditionSet".
  • There are two kinds of conditions: existential (? and !) and comparisons. In comparison-based conditions, leftSide and rightSide are not very expressive names; I suggest "marcPath" (or whichever name replaces it) and "value".
  • The value can be a reference or a literal. To denote literal values I suggest the traditional single or double quotation marks rather than the unusual backslash (\) character. Use the backslash for escaping only.

Here is my formalized suggestion for renaming the elements of the specification:

alphaupper         = %x41-5A
                     ; A-Z
alphalower         = %x61-7A
                     ; a-z
DIGIT              =  %x30-39
                     ; 0-9
VCHAR              =  %x21-7E
                     ; visible (printing) characters
positiveDigit      = %x31-39
                     ;  "1" / "2" / "3" / "4" / "5" / "6" / "7" / "8" / "9"
positiveInteger    = "0" / positiveDigit [1*DIGIT]

; field
fieldTag           = 3(DIGIT / ".")
                      / "LDR"
                      / "LEADER"
position           = positiveInteger / "#"
range              = position "-" position
positionOrRange    = range
                      / position
characterSpec      = "/" positionOrRange
index              = "[" positionOrRange "]"
shortField         = index [characterSpec]
                      / characterSpec
field              = fieldTag [index] [characterSpec]

; subfield
subfieldChar       = alphaupper
                      / alphalower
                      / DIGIT
subfieldCode       = "$" subfieldChar
subfieldCodeRange  = "$" ( (alphaupper "-" alphaupper)
                      / (alphalower "-" alphalower)
                      / (DIGIT "-" DIGIT) )
                      ; [A-Z]-[A-Z] / [a-z]-[a-z] / [0-9]-[0-9]
shortSubfield      = (subfieldCode / subfieldCodeRange) [index] [characterSpec]
subfield           = fieldTag [index] shortSubfield

; indicator
shortIndicator     = [index] "^" ("1" / "2")
indicator          = fieldTag shortIndicator

; condition
comparisonString   = ("'" *VCHAR "'")
                      / ('"' *VCHAR '"')
operator           = "=" / "!=" / "~" / "!~" / "!" / "?"
                      ; equal / unequal / includes / not includes / not exists / exists
abbreviation       = shortField
                      / shortSubfield
                      / shortIndicator
conditionTerm      = field
                      / subfield
                      / indicator
                      / comparisonString
                      / abbreviation
condition          = [ [conditionTerm] operator ] conditionTerm
conditionSet       = "{" condition *( "|" condition ) "}"

; the whole together
marcPath           = field *conditionSet
                     / (subfield *conditionSet *(shortSubfield *conditionSet))
                     / indicator *conditionSet
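
To make the `field` rule above concrete, here is a minimal Python sketch that translates `fieldTag [index] [characterSpec]` into a regular expression. The function name `parse_field` and the group names `tag`, `index` and `chars` are my own choices for illustration and are not part of the specification; subfields, indicators and condition sets are deliberately left out.

```python
import re

# position = positiveInteger / "#"
# positiveInteger = "0" / positiveDigit [1*DIGIT]  (so no leading zeros)
POSITION = r"(?:0|[1-9][0-9]*|#)"
# positionOrRange = range / position
POSITION_OR_RANGE = rf"{POSITION}(?:-{POSITION})?"
# fieldTag = 3(DIGIT / ".") / "LDR" / "LEADER"
FIELD_TAG = r"(?:[0-9.]{3}|LEADER|LDR)"

# field = fieldTag [index] [characterSpec]
FIELD = re.compile(
    rf"(?P<tag>{FIELD_TAG})"
    rf"(?:\[(?P<index>{POSITION_OR_RANGE})\])?"
    rf"(?:/(?P<chars>{POSITION_OR_RANGE}))?"
)


def parse_field(text):
    """Return the parts of a field path as a dict, or None if it does not match."""
    match = FIELD.fullmatch(text)
    return match.groupdict() if match else None


print(parse_field("008/18"))
print(parse_field("245[0-#]/0-3"))
print(parse_field("24"))
```

A regular expression is enough here because the `field` rule is not recursive; the `conditionSet` part would probably need a real parser.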

Besides that, the relationship between the "path" and the "condition" is not clear to me. The conditions can be interpreted in two ways, and both have valid use cases:

  • the condition should be true somewhere in the record
  • the condition should be true inside the context the path specifies
008/18{LDR/6=\t}

Here the situation is clear: 008 and LDR are two different fields, so the first interpretation applies.

880$a{100$6~880$6/3-5}
020$c{020$a}

Suppose we have two 880 fields. Should we take both if the condition is true for either of them, or only the 880 for which the condition is true? The same question applies to 020 (which is a repeatable field).
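
The difference between the two readings can be shown with a small Python sketch for `020$c{020$a}`. The record layout (a list of tag and subfield-dict pairs), the sample values and both function names are assumptions made up for this illustration only.

```python
# Hypothetical toy record with a repeated 020 field; only the first has $a.
TOY_RECORD = [
    ("020", {"a": "isbn-1", "c": "price-1"}),
    ("020", {"c": "price-2"}),
]


def select_record_wide(record, tag, target, required):
    """Reading 1: the condition only has to hold somewhere in the record."""
    condition_holds = any(t == tag and required in sf for t, sf in record)
    if not condition_holds:
        return []
    return [sf[target] for t, sf in record if t == tag and target in sf]


def select_same_field(record, tag, target, required):
    """Reading 2: the condition has to hold in the same field instance."""
    return [
        sf[target]
        for t, sf in record
        if t == tag and target in sf and required in sf
    ]


print(select_record_wide(TOY_RECORD, "020", "c", "a"))  # both $c values
print(select_same_field(TOY_RECORD, "020", "c", "a"))   # only the first $c
```

With this record the first reading returns both `$c` values and the second returns only the one from the field that also has `$a`, so an implementation has to pick one.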

I would like to see constraints in which the context is defined explicitly. We can use the following notation for the leftHandSide (or path) part:

  • self or . means the current context
    • 020$c{.="something"} - get 020$c if its value is "something"
  • parent or .. means the parent
    • 020$c{..?$a} - get 020$c if the same 020 field has subfield $a
  • implicit path or any other explicit path: the context is the record
    • 020$c{020$a} - get 020$c if there is 020$a anywhere in the record
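
A rough Python sketch of how an implementation might evaluate the three contexts. It assumes the conditions have already been parsed into small tuples; the tuple kinds `self_equals`, `parent_has` and `record_has`, the record layout and the function names are all hypothetical and only serve the illustration.

```python
# Hypothetical toy record: a list of (tag, subfields) pairs.
SAMPLE_RECORD = [
    ("020", {"a": "isbn-1", "c": "something"}),
    ("020", {"c": "other"}),
]


def holds(condition, value, field, record):
    """Evaluate one pre-parsed condition in its explicit context."""
    kind = condition[0]
    if kind == "self_equals":   # {.="literal"}: the current subfield value
        return value == condition[1]
    if kind == "parent_has":    # {..?$x}: the enclosing field instance
        return condition[1] in field
    if kind == "record_has":    # {020$a}: anywhere in the record
        _, tag, code = condition
        return any(t == tag and code in sf for t, sf in record)
    raise ValueError(f"unknown condition kind: {kind}")


def select(record, tag, code, condition):
    """Return every tag$code value whose condition holds."""
    return [
        sf[code]
        for t, sf in record
        if t == tag and code in sf and holds(condition, sf[code], sf, record)
    ]


print(select(SAMPLE_RECORD, "020", "c", ("self_equals", "something")))
print(select(SAMPLE_RECORD, "020", "c", ("parent_has", "a")))
print(select(SAMPLE_RECORD, "020", "c", ("record_has", "020", "a")))
```

Only the record-level condition selects the `$c` of the second 020 field, because that field has no `$a` of its own.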

I admit that "make it more clear" is a very subjective statement, as we have no absolute scale for semantic clarity. So this comment is meant to open a discussion rather than to be a final suggestion.

@cKlee
Contributor

cKlee commented Jun 11, 2018

Hi Peter!

These are all great suggestions. I'm a fan of making things more clear. Let's do it.

I also love the concept of self and parent. This was one of the weaknesses of the spec. I think your proposal could work. And there is no risk of confusing . and .. with the ... wildcard in a field description, because field descriptions are always three characters long.

I'm thinking a bit about the quotation marks. I thought I had omitted them for a good reason, but I can no longer recall what it was.

I would suggest that I alter the specification in a dev repo, so that we can discuss the changes.

If you have any other suggestions or ideas, let me know.

@cKlee
Contributor

cKlee commented Jun 11, 2018

@pkiraly see #26 for discussion of the fieldTag definition.

@cKlee
Contributor

cKlee commented Jun 18, 2018

@pkiraly you specified

subfieldChar       = alphaupper
                      / alphalower
                      / DIGIT

Does MARC21 allow alphaupper chars for subfield codes?

@cKlee
Contributor

cKlee commented Jun 18, 2018

@pkiraly Same question as in #26:

Should MARCspec support ANSI/ISO for subfield codes?

My suggestion:

subfieldChar      = %x21-3F / %x5B-7B / %x7D-7E
                      ; ! " # $ % & ' ( ) * + , - . / 0-9 : ; < = > ? [ \ ] ^ _ ` a-z { } ~

Your suggestion

subfieldChar       = alphaupper
                      / alphalower
                      / DIGIT
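
To see where the two definitions actually differ, here is a small Python sketch that compares them over the visible ASCII range %x21-7E. The names `ALNUM_RANGES`, `WIDE_RANGES` and `in_ranges` are made up for this comparison.

```python
# Code point ranges copied from the two subfieldChar definitions above.
ALNUM_RANGES = [(0x41, 0x5A), (0x61, 0x7A), (0x30, 0x39)]  # A-Z, a-z, 0-9
WIDE_RANGES = [(0x21, 0x3F), (0x5B, 0x7B), (0x7D, 0x7E)]


def in_ranges(ch, ranges):
    """True if the character's code point falls into any of the ranges."""
    return any(lo <= ord(ch) <= hi for lo, hi in ranges)


visible = [chr(code) for code in range(0x21, 0x7F)]

only_alnum = [c for c in visible
              if in_ranges(c, ALNUM_RANGES) and not in_ranges(c, WIDE_RANGES)]
only_wide = [c for c in visible
             if in_ranges(c, WIDE_RANGES) and not in_ranges(c, ALNUM_RANGES)]
neither = [c for c in visible
           if not in_ranges(c, ALNUM_RANGES) and not in_ranges(c, WIDE_RANGES)]

print("only in alphaupper/alphalower/DIGIT:", "".join(only_alnum))
print("only in the wider definition:", "".join(only_wide))
print("in neither:", "".join(neither))
```

The uppercase letters are accepted only by the alphanumeric definition, the punctuation characters only by the wider one, and `@` and `|` by neither.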
