Record types, document types, authorities, fields and relations in Muscat

Ferran Jorba edited this page May 9, 2021 · 1 revision

This page is a draft of documentation I am writing for inclusion upstream.

Muscat can be safely considered a fairly generic Marc application, with default music values. This tutorial will teach you how to customise Marc tags and bibliographic templates.

Marc record types and document types

Muscat uses several standard Marc record types (https://www.loc.gov/marc/): bibliographic, authority and holdings.

Bibliographic record types are split between sources (original, unique, manuscript works that exist physically in a library or archive) and publications or secondary literature (scholarly works about composers, compositions, or music catalogs). The sources type is so specialised in musical manuscripts that the 100 tag is always called, and indexed as, composer. The secondary literature type, instead, is generic and can be used for any document type.

Authority records are split among personal, institutional and title authorities; holdings are a separate record type.

For Marc bibliographic records there are different document types, or document templates. However, they all share a single Marc definition, as we’ll see.

It is important to keep those different categories in mind. When creating our own set of cataloguing templates, we must be aware of which file has to be modified. Sometimes it is just the template; sometimes we have to add something to the generic Marc tag and subfield definitions.

In this document we’ll keep this terminology: record types for Marc, and document types for cataloguing templates, although Muscat is not always consistent with this.

As Muscat’s primary goal is to catalogue musical manuscripts, the sources bibliographic record is richer and has more features. So, in our examples, we will use publications instead, but often with samples extracted from sources.

Where to modify Muscat default values

Starting from top to bottom, there are a couple of generic variables, each with a default value of default, in the config/application.rb file. As we are going to perform small modifications, we will not change them. However, for major customisations, it may be useful to know that they can be changed. The configuration file also contains pointers to documentation:

  # Marc tag and subfield definition list for all record types, available       
  # templates for new records, and default tags for each new record.            
  # In principle you shouldn't need to modify this. Documented in               
  # https://github.com/rism-ch/muscat/blob/develop/6-%20MARC_CONFIG.rdoc        
  MARC = "default"
  
  # Marc editor layouts, display and behaviour, autocompletion and              
  # validation rules. You may want to modify it if you need to add              
  # other editor configurations, different from upstream.  Documented           
  # in https://github.com/rism-ch/muscat/blob/develop/4-%20CONFIG.rdoc          
  EDITOR_PROFILE = "default"

The rules are described in two directory trees, one for Marc record types, and the other for document types or cataloguing templates.

Marc record types and other authority record types

In config/marc/, there are 6 configuration files ending in .yml, one for each of the Marc types. Their names correspond to the so-called Rails models and match the database tables and a good part of the software, so they cannot be changed without major changes in Muscat. It is important to know the list of visible fields because they are the ones used to make relations with other records. They can be found in db/schema.rb; to trace how they are created from the Marc tags and subfields, look for the set_object_fields method of each model in app/models/, which in turn calls the corresponding marc_*.rb library in lib/:

  • sources (bibliographic): https://github.com/rism-ch/muscat/blob/master/config/marc/tag_config_source.yml, with those visible fields (source.rb):
    • source_id: 001 tag (as Muscat is used for the RISM catalog, manuscript source_id matches id)
    • record_type: from the Marc leader (see match_leader in lib/marc_source.rb)
    • std_title: 240 $a; 240 $k; 240 $o; 383 $b; 690 $a, each truncated to 50 chars
    • std_title_d: same as std_title, lowercased and with diacritics removed
    • composer: 100 $a
    • composer_d: same as composer, lowercased and with diacritics removed
    • title: 245 $a or 240 $a
    • title_d: same as title, lowercased and with diacritics removed
    • shelf_mark: 852 $c
    • language: (not used any more?)
    • date_from: 260 $c
    • date_to: 260 $c
    • lib_siglum: 852 $a
    • marc_source: the whole Marc record as a string
    • parent: 973 $u + 973 $3 (?)
  • publications (bibliographic): https://github.com/rism-ch/muscat/blob/master/config/marc/tag_config_publication.yml, with those visible fields (publication.rb: https://github.com/rism-digital/muscat/blob/master/app/models/publication.rb#L176):
    • short_name: 210 $a, truncated to 255 chars
    • author: 100 $a, truncated to 255 chars
    • description (→ title): 240 $a
    • revue_title (→ journal): 760 $t
    • volume: (unused?)
    • place: 260 $a
    • date: 260 $c
    • pages: (unused?)
    • marc_source: the whole Marc record as a string
    • parent: 773 $w
  • persons (personal authority): https://github.com/rism-ch/muscat/blob/master/config/marc/tag_config_person.yml, with those visible fields. Note that the table is called people, following Rails conventions:
    • full_name: 100 $a, truncated to 128 chars
    • full_name_d: same as full_name, lowercased and with diacritics removed
    • life_dates: 100 $d, truncated to 24 chars
    • birth_place: 370 $a
    • gender: 375 $a
    • composer: ???
    • source (of the bio): 670 $a
    • alternate_names: 400 $a
    • alternate_dates: 400 $d
    • comments: 680 $i
    • marc_source: the whole Marc record as a string
  • institutions (institutional authority): https://github.com/rism-ch/muscat/blob/master/config/marc/tag_config_institution.yml, with those visible fields:
    • siglum: 110 $g, truncated to 32 chars
    • name: 110 $a
    • address: 371 $a, truncated to 128 chars
    • url: 371 $u, truncated to 24 chars
    • phone: ???
    • email: ???
    • place: 110 $c
    • marc_source: the whole Marc record as a string
  • works (personal works authority): https://github.com/rism-ch/muscat/blob/master/config/marc/tag_config_work.yml, with those visible fields:
    • person_id: got from 100 $a
    • title: 100 $t
    • form: ???
    • notes: ???
    • marc_source: the whole Marc record as a string
  • holdings: https://github.com/rism-ch/muscat/blob/master/config/marc/tag_config_holding.yml, with those visible fields:
    • source_id
    • lib_siglum
    • collection_id
    • marc_source
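Several of the visible fields above are described as truncated to a fixed length, or as lowercased with diacritics removed (the *_d columns). The following Ruby sketch shows one way such values could be derived. The helper names are hypothetical and this is not Muscat's own code; the real logic lives in the set_object_fields methods and may differ in detail.

```ruby
# Hypothetical helpers, for illustration only (not taken from Muscat).

# Keep at most +limit+ characters of a subfield value.
def truncate_field(value, limit)
  value.to_s[0, limit]
end

# Decompose accented characters (NFD), drop the combining marks,
# then lowercase what is left.
def searchable_form(value)
  value.to_s.unicode_normalize(:nfd).gsub(/\p{Mn}/, "").downcase
end

std_title   = truncate_field("Die Zauberflöte", 50)
std_title_d = searchable_form(std_title)
puts std_title_d   # die zauberflote
```

Storing a normalised copy next to the original value is a common way to make searching and sorting insensitive to case and accents.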

There are other record types that currently do not have a Marc representation, but can be linked, as authority records, to some of the previous Marc types. All those records have a numeric id field that can be used, typically as $0, to link them to a Marc bibliographic record.

  • standard_titles (uniform titles), with those visible fields:
    • title
    • variants
    • notes
    • is latin text?
    • published status
  • standard_terms (subject headings), with those visible fields:
    • term
    • alternate terms
    • notes
    • published status
  • places, with those visible fields:
    • name
    • alternate terms
    • topic
    • subtopic
    • country
    • district
    • notes
  • liturgical_festivals, with those visible fields:
    • name
    • alternate terms
    • notes
    • published status
  • digital_objects, with
    • XXX TODO

The occurrence codes that indicate, for each tag and each subfield, whether it is optional or compulsory, and whether it is repeatable or not, are (https://github.com/rism-ch/muscat/blob/develop/6-%20MARC_CONFIG.rdoc#label-Configuration):

  • ? optional and not repeatable (none or one; this subtag need not be present, but at most one occurrence may exist)
  • * optional and repeatable (none or more; none, one or many may exist)
  • 1 compulsory and not repeatable (one occurrence must exist; no more than one is allowed)
  • + compulsory and repeatable (one or more must exist; at least one occurrence must be present, more occurrences are allowed)
  • 0 not allowed (none; used only to keep tag/subtag configurations that are unused and therefore discarded on import)
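The five occurrence codes above can be read as minimum and maximum counts. The following Ruby sketch makes that reading explicit; the table and the method name are hypothetical and only illustrate the rules, they are not part of Muscat.

```ruby
# Hypothetical lookup table, derived from the list above:
# code => [minimum, maximum] number of occurrences (nil = no upper limit).
OCCURRENCE_LIMITS = {
  "?" => [0, 1],
  "*" => [0, nil],
  "1" => [1, 1],
  "+" => [1, nil],
  "0" => [0, 0]
}.freeze

# True when +count+ occurrences satisfy the given occurrence code.
def occurrences_valid?(code, count)
  min, max = OCCURRENCE_LIMITS.fetch(code)
  count >= min && (max.nil? || count <= max)
end

puts occurrences_valid?("?", 2)   # false
puts occurrences_valid?("+", 3)   # true
```

Using fetch means an unknown code raises a KeyError instead of being silently accepted.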

Let’s take an example from config/marc/tag_config_source.yml and annotate it following the documentation. First, a simple tag with no relations, like 041:

  "041":                                # 041 tag definition
    :master: a                          # ?? TODO
    :indicator: ["0#", "1#"]            # indicators can be either 0# or 1#
    :occurrences: "*"                   # 041 is optional and repeatable
    :fields:                            # Allowed subfield lists
    - - a                               # $a subfield
      - :occurrences: "*"               # $a is optional and repeatable
    - - e                               # $e subfield
      - :occurrences: "*"               # $e is optional and repeatable
    - - h                               # $h subfield
      - :occurrences: "*"               # $h is optional and repeatable
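To see what structure this YAML produces, the fragment can be loaded with Ruby's standard yaml library. This is a minimal sketch on an inlined copy of the 041 definition; the constant and method names are hypothetical, and it makes no claim about how Muscat itself loads its configuration files.

```ruby
require "yaml"

# The 041 definition from above, inlined so the sketch is self-contained.
TAG_041_TEXT = <<~CONFIG
  "041":
    :master: a
    :indicator: ["0#", "1#"]
    :occurrences: "*"
    :fields:
    - - a
      - :occurrences: "*"
    - - e
      - :occurrences: "*"
    - - h
      - :occurrences: "*"
CONFIG

# Keys such as :occurrences become Ruby symbols, so Symbol must be permitted.
TAG_041 = YAML.safe_load(TAG_041_TEXT, permitted_classes: [Symbol])["041"]

# Return { subfield code => occurrence code } for one tag definition.
# Each entry of :fields is a pair: [code, options hash].
def subfield_occurrences(tag_config)
  tag_config[:fields].to_h { |code, options| [code, options[:occurrences]] }
end

puts TAG_041[:occurrences]   # *
puts subfield_occurrences(TAG_041).keys.join(", ")   # a, e, h
```

Note that the keys written with a leading colon are symbols, while the tag number and the subfield codes are plain strings.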

Let’s take a look at the main 100 author field from config/marc/tag_config_person.yml. It is also quite simple. Then we’ll learn to link it with a bibliographic record.

  "100":                                # 100 tag definition
    :master: a                          # ?? TODO
    :indicator: "1#"                    # The only valid indicators are 1#
    :occurrences: "?"                   # 100 tag is optional and not repeatable
    :fields:                            # Allowed subfield lists
    - - a                               # Subfield $a
      - :occurrences: "?"               # $a is optional and not repeatable
    - - c                               # Subfield $c
      - :occurrences: "*"               # $c is optional and repeatable
    - - d                               # Subfield $d
      - :occurrences: "?"               # $d is optional and not repeatable
        :browse_inline: true            # TODO
    - - w                               # Subfield $w
      - :occurrences: "?"               # $w is optional and not repeatable
        :no_show: true                  # TODO
    - - y                               # Subfield $y
      - :occurrences: "?"               # $y is optional and not repeatable

Relations between bibliographic and authority records

Now let’s look, in config/marc/tag_config_source.yml, at a personal author tag that relates to the person authority record we have just seen. We’ll comment only the new parameters, which concern links to authority records:

  "100":
    :master: "0"                        # $0 is the subfield under authority control (link to another record)
    :indicator: "1#"
    :occurrences: "?"
    :fields:
    - - "0"
      - :occurrences: "?"
        :foreign_class: Person          # If it exists, it is related to Person authority
        :foreign_field: id              # $0 in this tag is related to id column in the Person table ($0 subfield)
        :no_show: true                  # TODO :no_show useless?
    - - a
      - :occurrences: "?"
        :foreign_class: ^0              # From Person recid with this 100 $0 id ...
        :foreign_field: full_name       # ... take full_name ($a) subfield and copy to this 100 $a
    - - d
      - :occurrences: "?"
        :browse_inline: true            # TODO
        :foreign_class: ^0              # From the same Person recid with this 100 $0 id ...
        :foreign_field: life_dates      # ... take life_dates ($d) subfield and copy it here
        :foreign_alternates: alternate_dates # XXX: foreign_alternates useless?
    - - j
      - :occurrences: "?"
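The effect of :foreign_class: ^0 together with :foreign_field can be pictured as a lookup followed by a copy. The Ruby sketch below only illustrates that idea: plain hashes stand in for the Person table and for the subfields of a 100 tag, the sample person is made up, and all names are hypothetical. It is not how Muscat implements the link.

```ruby
# Stand-in for the Person table: id => visible fields (sample data is invented).
PEOPLE = {
  42 => { id: 42, full_name: "Example, Composer", life_dates: "1700-1750" }
}.freeze

# Subfield => Person column to copy, mirroring the :foreign_class: ^0 and
# :foreign_field entries of the 100 tag above.
COPIED_FROM_PERSON = { "a" => :full_name, "d" => :life_dates }.freeze

# Follow $0 to the person record and fill in the dependent subfields.
def resolve_100(subfields, people)
  person = people.fetch(Integer(subfields.fetch("0")))
  copied = COPIED_FROM_PERSON.transform_values { |column| person.fetch(column) }
  subfields.merge(copied)
end

resolved = resolve_100({ "0" => "42" }, PEOPLE)
puts resolved["a"]   # Example, Composer
puts resolved["d"]   # 1700-1750
```

In this picture the $0 subfield is the only value the cataloguer controls; $a and $d always follow the authority record.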

We can see that both the $a and $d subfields from the person authority record are copied to the bibliographic one.

Now let’s explore a title authority link. In the current (7.x) version of Muscat, standard titles do not have a Marc representation. They appear as authorities and are edited like the others, but internally, in the database, they have a simpler structure; you’ll see that in the Muscat editor there are no tag numbers. (Works, instead, are related with a 100 tag to a composer or author, and do have a Marc representation.)

  "240":
    :master: "0"                        # $0 is the subfield under authority control (link to another record)
    :indicator: "10"
    :occurrences: "?"
    :fields:
    - - "0"
      - :occurrences: "?"
        :foreign_class: StandardTitle   # If it exists, it is related to standard titles (not a Marc record!)
        :foreign_field: id              # $0 in this tag is related to id field in the StandardTitle table
        :no_show: true                  # This subfield is not going to be shown (see always_hide?)
    - - a
      - :occurrences: "?"
        :foreign_class: ^0              # From StandardTitle id 240 $0 id ...
        :foreign_field: title           # ... take title field and copy from there to here
    - - k
      - :occurrences: "?"
    - - m
      - :occurrences: "*"
    - - o
      - :occurrences: "?"
    - - r
      - :occurrences: "?"

In the subdirectory config/marc/default live the minimum default tags and subfields for each of the record types. Constant values can be defined here. Those files are always called default.marc for all record types, except for bibliographic (sources), for which there is one per document type (https://github.com/rism-ch/muscat/tree/master/config/marc/default/source).

Those Marc record definitions are quite extensive, and our modifications will be minimal. But some additions will be needed, such as 245 $b, which is not defined, or research ids like DOI, ORCID, etc. The linked fragment illustrates how some title tags, indicators and subfields are described: which are accepted, repeatable, etc. (https://github.com/rism-ch/muscat/blob/master/config/marc/tag_config_source.yml#L214).

To make our changes, let’s copy the template configuration to our own repository directories, so we can work on our tag set without modifying upstream values.

$ mkdir -pv config/marc/repository/source/
$ cp -piv   config/marc/default/source/template_configuration.yml config/marc/repository/source/

Document types and cataloguing templates

Which tags and subfields are required for each document type is defined in config/editor_profiles/.

The file config/marc/default/source/template_configuration.yml lists which templates or document types (both terms are synonyms) are shown in the /admin/sources/select_new_template menu (https://github.com/rism-ch/muscat/blob/develop/app/views/admin/sources/select_new_template.html.erb).

TODO: At the same time, we will have to modify the document types to take away the uniform title (240) tag.

Document types have an associated id that appears in several places in the Muscat source code, so it is better not to change these ids unless there is a good reason. They appear as:

  • a field in the Sources table (https://github.com/rism-ch/muscat/blob/develop/db/schema.rb#L442), although there it is called record_type. Keep that in mind, because when importing bibliographic records (sources) this value will be zero unless modified later.
  create_table "sources", id: :integer, force: :cascade, options: "ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci" do |t|
    t.integer "source_id"
    t.integer "record_type", limit: 1, default: 0
  • in an internal list of ids in https://github.com/rism-ch/muscat/blob/develop/lib/marc_source.rb
  # record_type mapping
  RECORD_TYPES = {
    unspecified: 0,
    collection: 1,
    source: 2,
    edition_content: 3,
    libretto_source: 4,
    libretto_edition: 5,
    theoretica_source: 6,
    theoretica_edition: 7,
    edition: 8,
    libretto_edition_content: 9,
    theoretica_edition_content: 10,
    composite_volume: 11,
  }
  • the same list, using abbreviations, in https://github.com/rism-ch/muscat/blob/develop/config/locales/en.yml#L765
######################################################################################################
# Record types, English only
      
  record_types_codes:
    "0": "UNK"
    "1": "COL"
    "2": "MSR"
    "3": "SUB"
    "4": "LSR"
    "5": "LEC"
    "6": "TSR"
    "7": "TEC"
    "8": "EDT"
    "11": "CMP"
  • in https://github.com/rism-ch/muscat/blob/develop/config/marc/default/source/template_configuration.yml#L60
default_mapping:
  collection: "000_collection"
  source: "002_source"
  edition_content: "013_edition_content"
  libretto_source: "004_libretto_source"
  libretto_edition: "015_libretto_edition"
  theoretica_source: "006_theoretica_source"
  theoretica_edition: "017_theoretica_edition"
  edition: "011_edition"
  libretto_edition_content: "015_libretto_edition"
  theoretica_edition_content: "017_theoretica_edition"
  composite_volume: "019_composite"

Incidentally, those equivalences are the ones that create the minimum records and the default values in https://github.com/rism-ch/muscat/tree/develop/config/marc/default/source.
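To make the relation between these lists concrete, the following Ruby sketch goes from the numeric record_type stored in the sources table to the default template name. Both tables are copied from the excerpts above; the lookup method is hypothetical and only shows how the pieces fit together.

```ruby
# Copied from the RECORD_TYPES excerpt of lib/marc_source.rb shown above.
RECORD_TYPES = {
  unspecified: 0, collection: 1, source: 2, edition_content: 3,
  libretto_source: 4, libretto_edition: 5, theoretica_source: 6,
  theoretica_edition: 7, edition: 8, libretto_edition_content: 9,
  theoretica_edition_content: 10, composite_volume: 11
}.freeze

# Copied from the default_mapping excerpt of template_configuration.yml.
DEFAULT_MAPPING = {
  collection: "000_collection",
  source: "002_source",
  edition_content: "013_edition_content",
  libretto_source: "004_libretto_source",
  libretto_edition: "015_libretto_edition",
  theoretica_source: "006_theoretica_source",
  theoretica_edition: "017_theoretica_edition",
  edition: "011_edition",
  libretto_edition_content: "015_libretto_edition",
  theoretica_edition_content: "017_theoretica_edition",
  composite_volume: "019_composite"
}.freeze

# Hypothetical helper: numeric record_type => default template name,
# or nil when there is no default template (for example 0, unspecified).
def default_template_for(record_type_id)
  DEFAULT_MAPPING[RECORD_TYPES.key(record_type_id)]
end

puts default_template_for(2)           # 002_source
puts default_template_for(0).inspect   # nil
```

The nil result for 0 matches the remark above that imported sources keep record_type zero until it is set to a real document type.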

Given that those names and ids are also found in several places in Muscat, we will start configuring our templates reusing those document types.

Let’s make a copy of the default files as repository:

$ mkdir -pv config/editor_profiles/repository/configurations/
$ cp -piv config/editor_profiles/default/profiles.yml config/editor_profiles/repository/
$ cp -piv config/editor_profiles/default/configurations/Source*.yml config/editor_profiles/repository/configurations/