Missing issue title and identifier in the METS-files of issues #3634

Closed
andre-hohmann opened this issue May 18, 2020 · 17 comments · Fixed by #3856 or #4076

@andre-hohmann
Collaborator

andre-hohmann commented May 18, 2020

Problem

In the exported METS files of issues, the issue title (for example "01-Orchesterkonzert" or "02-Abendausgabe") is missing from the following METS and MODS elements:

  1. MODS: <dmdSec>
    mods:titleInfo/mods:title

  2. METS: <mets:structMap TYPE="LOGICAL">
    LABEL

In addition, the identifier (catalogIDDigital) is not created, which leads to:

  1. missing record identifier, PURL and URN in the <dmdSec> section
  2. missing values in <dv:presentation> and <dv:reference> in the <amdSec>

In all cases, the values are missing in the internal METS-file, too. 

Solution

The issue title and CatalogIDDigital should be written in the METS-file. 

Example

Examples 3.x

DresKr_880547324-1850100402_02-o.xml

Exported METS file (excerpt):
<mods:mods>
	<mods:accessCondition type="use and reproduction" xlink:href="http://creativecommons.org/publicdomain/mark/1.0/">Public Domain Mark 1.0</mods:accessCondition>
	<mods:accessCondition displayLabel="Access Status" type="restriction on access" xlink:href="http://purl.org/coar/access_right/c_abf2">Open Access</mods:accessCondition>
	<mods:language>
		<mods:scriptTerm type="code" authority="iso15924">Latf</mods:scriptTerm>
		<mods:languageTerm type="code" authority="rfc3066">ger</mods:languageTerm>
	</mods:language>
	<mods:relatedItem type="series">
		<mods:titleInfo>
			<mods:title lang="ger">Saxonica</mods:title>
		</mods:titleInfo>
	</mods:relatedItem>
	<mods:relatedItem type="series">
		<mods:titleInfo>
			<mods:title lang="ger">Musik</mods:title>
		</mods:titleInfo>
	</mods:relatedItem>
</mods:mods>
...
<mets:digiprovMD ID="uuid-a8d76717-1c34-3ceb-95e2-ad1f3fc2a366">
    <mets:mdWrap>
        <mets:xmlData>
            <dv:links>
                <dv:presentation>https://digital.slub-dresden.de/id</dv:presentation>
                <dv:reference>http://dienste.slub-dresden.de/cgi-bin/FOZK.pl?PPN=</dv:reference>
            </dv:links>
        </mets:xmlData>
    </mets:mdWrap>
</mets:digiprovMD>
...
<mets:div ID="uuid-dab20ebd-0e62-4f15-9221-c3966f92449c" DMDID="uuid-4c1d89f2-41e0-33f1-bb06-4b40038d5c74" TYPE="issue" ORDER="1">

Examples 2.x

Exported METS file (excerpt):
<mods:mods>
...
	<mods:titleInfo>
		<mods:title>02-Vesper</mods:title>
	</mods:titleInfo>
	<mods:part order="2" type="host"></mods:part>
...
	<mods:relatedItem type="host" >
		<mods:recordInfo>
			<mods:recordIdentifier source="http://digital.slub-dresden.de/oai/" >oai:de:slub-dresden:db:id-880547324</mods:recordIdentifier>
		</mods:recordInfo>
	</mods:relatedItem>
	<mods:recordInfo>
		<mods:recordIdentifier source="http://digital.slub-dresden.de/oai/" >oai:de:slub-dresden:db:id-880547324-1889062901</mods:recordIdentifier>
	</mods:recordInfo>
	<mods:physicalDescription>
		<mods:digitalOrigin>reformatted digital</mods:digitalOrigin>
	</mods:physicalDescription>
	<mods:identifier type="purl" >http://digital.slub-dresden.de/id880547324-1889062901</mods:identifier>
	<mods:identifier type="urn" >urn:nbn:de:bsz:14-db-id880547324-18890629010</mods:identifier>
</mods:mods>
...
<mets:digiprovMD ID="DIGIPROV" >
	<mets:mdWrap MDTYPE="OTHER" MIMETYPE="text/xml" OTHERMDTYPE="DVLINKS" >
		<mets:xmlData>
			<dv:links>
				<dv:reference>http://dienste.slub-dresden.de/cgi-bin/FOZK.pl?PPN=880547324-1796</dv:reference>
				<dv:presentation>https://digital.slub-dresden.de/id880547324-1889062901</dv:presentation>
			</dv:links>
		</mets:xmlData>
	</mets:mdWrap>
</mets:digiprovMD>
...
<mets:div ADMID="AMD" DMDID="DMDLOG_0182" ID="LOG_0313" LABEL="02-Vesper" ORDER="2" TYPE="issue" ></mets:div>
@matthias-ronge
Collaborator

I suspect they are not present in the internal METS file either?

The following is necessary:

  • the metadata is entered manually, or
  • the metadata is added automatically when the processes are created using the newspaper editor, and
  • the metadata is mapped in XSLT so that it is written to the export.

@andre-hohmann
Collaborator Author

@matthias-ronge:
Yes, you are right. The metadata is missing in the internal METS file as well. We need to talk about your proposal.

From my point of view, the metadata should be added "automatically" when the processes are created using the newspaper editor, as was done in Kitodo.Production 2.x.

Entering it manually is not feasible, and a mapping in XSLT is quite complicated, if it is possible at all without the metadata.

@andre-hohmann
Collaborator Author

Suggested "quick" solution:

  1. Enter the issue's name in the metadata as the main title
  2. Enter the identifier of the title in the metadata
  3. Use XSLT to generate the identifier of the issue from the ORDERLABEL of the div and the identifier of the title
  4. Use XSLT to write the issue's identifier to the dmdSec and amdSec and to generate PURL, URN, ...

Although this might be possible, it is very inconvenient in terms of usability and configuration. It also still needs to be shown that XSLT is capable of all these transformations.

@andre-hohmann
Collaborator Author

I do not think I can find an XSL transformation that generates the recordIdentifier, for example 880547324-1850100402 for issues or 880547324-1850 for years.

Is there really no way to extract it from the process title, or at least to extract the process title itself?
How was it possible in 2.x?

In theory, the problem can be solved by the measures described above, so the "blocking" label could be removed from this issue. I would not be comfortable with that, and everyone should be aware of the consequences:

  • complicated and error-prone use of the calendar plugin
  • someone needs to write the XSLT to create the identifier

@matthias-ronge
Collaborator

matthias-ronge commented Jun 8, 2020

If something needs to be generated, such as record identifier, PURL, or URN, we have a generator interface for this. Someone would have to implement that.

@andre-hohmann
Collaborator Author

andre-hohmann commented Jul 9, 2020

Comment regarding missing identifier

After consulting @henning-gerhardt, I found out that in Kitodo.Production 2.x the identifiers of the year and issue processes are created by copyData.onExport. So I was wrong to assume that they are generated by Kitodo.Production itself.

Nevertheless, it is still a problem, and the following questions have to be answered:

  1. Is the copyData.onExport still available in Kitodo.Production 3.x? I'd hoped that we could do it without.
  2. Is it possible to use it for several processes, as the information PPN, date, ... is stored in several processes?
  3. Is it possible to extract the process-title and copy it into a metadata-element in the MODS-section?

@andre-hohmann
Collaborator Author

Comment regarding missing title

The content of the field "Issue" in "Enter course of appearance" is copied in Kitodo.Production 2.x to the metadata fields "TitleDocMain" and "PeriodicalIssue".

Ausgabe_Ephemera

<goobi:goobi xmlns:goobi="http://meta.goobi.org/v1.5.1/">
    <goobi:metadata name="TitleDocMain">01-Frühausgabe</goobi:metadata>
    <goobi:metadata name="PeriodicalIssue">01-Frühausgabe</goobi:metadata>
</goobi:goobi>

It can be added to the metadata manually, but that is additional effort.
A procedure like the one in Kitodo.Production 2.x would be appreciated.

@matthias-ronge
Collaborator

matthias-ronge commented Jul 20, 2020

[…] the following questions have to be answered:

  1. Is the copyData.onExport still available in Kitodo.Production 3.x? I'd hoped that we could do it without.

The functionality was not explicitly removed, but it died quietly. It cannot be revived in its previous form and must be implemented again. See #3368 for details.

  2. Is it possible to use it for several processes, as the information PPN, date, ... is stored in several processes?

CopyData can be used as a KitodoScript command; in that case, several processes can be manipulated at once. What is currently not possible is copying metadata from one process to another. Such a function could be added with a new implementation.

  3. Is it possible to extract the process-title and copy it into a metadata-element in the MODS-section?

This is already possible with the existing syntax. Example: /@KitodoProcessTitle = $process.title
(At the moment this is not possible because the function as a whole is broken, but if it were intact it would work that way.)

@andre-hohmann
Collaborator Author

Which function can be repaired more quickly?
Which option is the most realistic one to implement? Both options are quite complicated, and the XSL transformation of elements from several METS sections is not easy.

Regarding the option with the process title:

  • Is the process title split internally as in the ruleset? processTitle="+'-'+#YEAR+#MONTH+#DAY+#is+'_'+#issu">
  • Is it possible to extract parts of it?
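
To make the question concrete, here is a minimal Python sketch of the kind of splitting meant here. It only illustrates the idea and says nothing about how Kitodo.Production handles the title internally. The assumed layout (prefix, underscore, PPN, hyphen, eight-digit date, two-digit issue number, underscore, issue suffix) is inferred solely from the example titles in this thread, and the function name `split_process_title` is hypothetical.

```python
import re

# Assumed layout, inferred only from the example titles in this thread:
#   <prefix>_<PPN>-<YYYYMMDD><NN>_<issue suffix>
TITLE_PATTERN = re.compile(
    r"^(?P<prefix>[^_]+)_(?P<ppn>[0-9Xx]+)-(?P<date>\d{8})(?P<number>\d{2})_(?P<suffix>.+)$"
)


def split_process_title(title):
    """Split an issue process title into its assumed parts (hypothetical helper)."""
    match = TITLE_PATTERN.match(title)
    if match is None:
        raise ValueError("unexpected process title layout: %r" % title)
    parts = match.groupdict()
    # Identifier of the issue: PPN, date and issue number.
    parts["issue_id"] = "%s-%s%s" % (parts["ppn"], parts["date"], parts["number"])
    # Identifier of the year: PPN and the four-digit year.
    parts["year_id"] = "%s-%s" % (parts["ppn"], parts["date"][:4])
    return parts


parts = split_process_title("DresKr_880547324-1850100402_02-o")
print(parts["issue_id"])  # 880547324-1850100402
print(parts["year_id"])   # 880547324-1850
```

A title that does not follow the assumed layout raises a ValueError instead of yielding a wrong identifier.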

@matthias-ronge
Collaborator

We plan the following solution: In order to write the process title in the METS file, a metadata key must be defined as use="processTitle" in the ruleset. For example:

<key id="CatalogIDDigital" use="processTitle">
    <label>PPN (digital)</label>
</key>

Then, when the process is created, a metadata entry with this key is created for the topmost logical <div>, with the process title as its value. You can then use it freely in XSLT for the PURL or whatever else you need. @andre-hohmann, would that be a viable solution?


The values for <dv:presentation> and <dv:reference> are easier. They are still configured in the project settings, as was the case in version 2. You can use the variable (processtitle) here. Example:

Example values for Digiprov fields

The values are written as <mets:amdSec><mets:digiprovMD><kitodo:metadata name="presentation"> and <kitodo:metadata name="reference"> and just need to be mapped to <dv:presentation> and <dv:reference> in XSLT.


Automatically setting the issue label as a metadata entry for each issue could be done in the same way (with a use attribute), but I would put that into a separate ticket to avoid confusion here, provided you agree with the solution in general.

@andre-hohmann
Collaborator Author

As the export does not work at the moment, I cannot write a comprehensive report, but I have two remarks.

Metadata in the wrong structure element

The new metadata field for the processtitle is not written in the structure element "PeriodicalIssue", but in the structure element "Unknown structure type". In my opinion, that metadata has to be written in "PeriodicalIssue", as it describes and identifies the "PeriodicalIssue".

    <mets:dmdSec ID="uuid-b00e152f-205e-3ff9-8c36-2adb3dd8fb93">
        <mets:mdWrap>
            <mets:xmlData>
                <kitodo:kitodo>
                    <kitodo:metadata name="CatalogIDDigitalIssue">AnzefRiS_1667232932-1872012901_01-f</kitodo:metadata>
                </kitodo:kitodo>
            </mets:xmlData>
        </mets:mdWrap>
    </mets:dmdSec>
    <mets:dmdSec ID="uuid-672510b5-5793-3b83-a617-be8f0bfacdb8">
        <mets:mdWrap>
            <mets:xmlData>
                <kitodo:kitodo>
                    <kitodo:metadata name="DocLanguage">ger</kitodo:metadata>
                    <kitodo:metadata name="LegalNoteAndTermsOfUse">PDM1.0</kitodo:metadata>
                    <kitodo:metadata name="slub_script">Fraktur</kitodo:metadata>
                    <kitodo:metadata name="singleDigCollection">Saxonica</kitodo:metadata>
                </kitodo:kitodo>
            </mets:xmlData>
        </mets:mdWrap>
    </mets:dmdSec>

project settings <dv:presentation> and <dv:reference>

For the other document types the "CatalogIDDigital" is still necessary in the project settings. It should not be necessary that the processtitle of the other document types is converted to "CatalogIDDigital", although the "CatalogIDDigital" is available.

Copying the CatalogIDDigital from the dmdSec does not seem to work.

@matthias-ronge
Collaborator

Metadata in the wrong structure element: I assumed that the process title describes the process, so I assigned it to the top structural element of the process. However, it can also be assigned only to lower structural elements when the elements above them are typeless containers. I would guess that this is more difficult to track in XSLT, especially since you would then have to distinguish between the cases where it is a newspaper and where it is not. I doubt that this is the easiest way.
Question: what should we do if there are multiple issues in a process? Is it enough if only the first issue gets the process title, or should all issues get it?

Project settings: I would avoid mixing that up. In this case, I would not save the process title in CatalogIDDigital, but under its own metadata key to avoid confusion. For example:

<key id="ProductionProcessId" use="processTitle">
    <label>Process identifier name</label>
</key>

@andre-hohmann
Collaborator Author

Metadata in the wrong structure element:
We always create one process per issue. This is required by the DFG. @subhhwendt does not know anyone who aggregates issues in processes either. That is why I always forget that, in theory, there can be processes with several issues. Logically, there should then be only one process title, on the superordinate element.

It may not be straightforward, but I suggest writing the process title to each issue. Then it is at least clear which process an issue belongs to. Nevertheless, I hope that nobody is affected by this.

Project settings:
I have created a separate metadata key:

<key id="CatalogIDDigitalIssue" use="processTitle">
    <label>PPN issue (digital copy)</label>
    <label lang="de">PPN Ausgabe (Digitalisat)</label>
</key>

As I will use it only to generate the CatalogIDDigital for issues, I chose that name. The CatalogIDDigital is still the old one:

<key id="CatalogIDDigital">
    <label>PPN (digital copy)</label>
    <label lang="de">PPN (Digitalisat)</label>
</key>

For monograph, manuscript, volumes, periodicals, ... i want to use the common configuration, because it contains the CatalogIDDigital:

  • METS Digiprov Presentation: $(meta.CatalogIDDigital)
  • METS Digiprov Reference: $(meta.CatalogIDDigital)

In that case, I cannot use $(processtitle) or $(meta.CatalogIDDigitalIssue) for the newspaper issues and, later on, the newspaper years. As far as I know, the project settings do not depend on the document type.

@andre-hohmann
Collaborator Author

project settings <dv:presentation> and <dv:reference>

It is now possible to transform the process title from the dmdSec into the <dv:presentation> and <dv:reference> elements of the amdSec with the export XSL.

Therefore, this problem can be solved individually. Thus, this part of the issue is solved.
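
For readers who want to follow the lookup outside XSLT, here is a rough Python sketch of the same idea: read the process title from the kitodo metadata and append its identifier part to a base URL. It is not the export XSL itself. The namespace URI bound to the kitodo: prefix, both helper names, and the choice of the middle underscore-separated segment as identifier are assumptions made for this illustration.

```python
import xml.etree.ElementTree as ET

# Assumption: the namespace URI bound to the kitodo: prefix in the internal METS file.
KITODO_NS = "http://meta.kitodo.org/v1/"

SAMPLE = """<mets:mets xmlns:mets="http://www.loc.gov/METS/"
                       xmlns:kitodo="http://meta.kitodo.org/v1/">
  <mets:dmdSec ID="uuid-b00e152f-205e-3ff9-8c36-2adb3dd8fb93">
    <mets:mdWrap>
      <mets:xmlData>
        <kitodo:kitodo>
          <kitodo:metadata name="CatalogIDDigitalIssue">AnzefRiS_1667232932-1872012901_01-f</kitodo:metadata>
        </kitodo:kitodo>
      </mets:xmlData>
    </mets:mdWrap>
  </mets:dmdSec>
</mets:mets>"""


def find_kitodo_metadata(mets_xml, name):
    """Return the text of the first kitodo:metadata entry with this name, or None."""
    root = ET.fromstring(mets_xml)
    for element in root.iter("{%s}metadata" % KITODO_NS):
        if element.get("name") == name:
            return (element.text or "").strip()
    return None


def build_link(process_title, base_url):
    """Append the middle segment of the process title to base_url (hypothetical helper)."""
    segments = process_title.split("_")
    if len(segments) < 3:
        raise ValueError("unexpected process title layout: %r" % process_title)
    return base_url + segments[1]


title = find_kitodo_metadata(SAMPLE, "CatalogIDDigitalIssue")
print(build_link(title, "https://digital.slub-dresden.de/id"))
# https://digital.slub-dresden.de/id1667232932-1872012901
```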

@matthias-ronge
Collaborator

Reported not yet working. Must be inspected.

@matthias-ronge
Collaborator

This does not work if the metadata entry is defined with minOccurs="1" maxOccurs="1". In that case the entry is not considered addable, because the metadata editor already shows an input field for keys defined with minOccurs="1". It needs to be clarified whether this must be changed.

@andre-hohmann
Collaborator Author

Thanks! I changed it to <permit key="processTitle" maxOccurs="1"/> and it works now.

I will check whether minOccurs="1" is necessary. I added it to enable validation, as in the case of the catalogIDDigital.
