Migrate AUDIT datastream #917

Closed
mjordan opened this issue Sep 1, 2018 · 5 comments

@mjordan
Contributor

mjordan commented Sep 1, 2018

Following up on #904 and related to #847. I'll start work on this after Sept. 12.

@mjordan
Contributor Author

mjordan commented Oct 7, 2018

Coming back to this issue: the following request retrieves the "public" context FOXML export from Fedora 3.8, from which we can parse out the AUDIT datastream:

curl -u fedoraAdmin:fedoraAdmin "http://192.168.0.120:8080/fedora/objects/testing:1/export"
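
For anyone who wants to script this rather than use curl, here is a minimal Python sketch of the same request. It assumes the host, PID, and credentials from the curl command above; `build_export_request` is a hypothetical helper name for illustration, not part of Fedora or Islandora.

```python
import base64
import urllib.request


def build_export_request(base_url, pid, user, password):
    """Build an authenticated GET request for an object's FOXML export.

    base_url is the Fedora 3.x root, e.g. "http://192.168.0.120:8080/fedora".
    """
    url = f"{base_url}/objects/{pid}/export"
    request = urllib.request.Request(url)
    # HTTP Basic auth, equivalent to curl's -u user:password.
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    request.add_header("Authorization", f"Basic {token}")
    return request
```

Passing the returned request to `urllib.request.urlopen` and reading the response body should yield the same FOXML that curl prints.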

Some options we can consider:

  1. Add a fourth migration to https://github.com/Islandora-Devops/migrate_7x_claw, e.g. "Basic Image Objects AUDIT", that uses API-M to leverage what @whikloj has done so far
  2. Bypass Drupal Migrate and fetch the audit log directly using the Riprap plugin at https://github.com/mjordan/riprap/blob/master/src/Command/PluginPostCheckMigrateFedora3AuditLog.php (see https://github.com/mjordan/riprap/blob/master/tests/Command/PluginPostCheckFedora3AuditLogTest.php for working code that parses out the fixity audit events)

The first option is preferable if we want to migrate the AUDIT datastream and store it intact (or even the entire public FOXML export), much like we're doing with MODS. The second option might be simpler if all we're interested in is the fixity events stored in the AUDIT datastreams of Islandora 7.x repos that have Checksum Checker running, since it doesn't require storing the AUDIT or FOXML XML for each 7.x object.

Any thoughts? In case anyone is interested, here's the entire AUDIT datastream XML, copied from the FOXML "public" export, for a simple object that has had Checksum Checker run against it:

<audit:auditTrail xmlns:audit="info:fedora/fedora-system:def/audit#">
  <audit:record ID="AUDREC1">
    <audit:process type="Fedora API-M"/>
    <audit:action>addDatastream</audit:action>
    <audit:componentID>TECHMD</audit:componentID>
    <audit:responsibility>admin</audit:responsibility>
    <audit:date>2018-10-07T17:40:14.693Z</audit:date>
    <audit:justification></audit:justification>
  </audit:record>
  <audit:record ID="AUDREC2">
    <audit:process type="Fedora API-M"/>
    <audit:action>addDatastream</audit:action>
    <audit:componentID>TN</audit:componentID>
    <audit:responsibility>admin</audit:responsibility>
    <audit:date>2018-10-07T17:40:15.737Z</audit:date>
    <audit:justification></audit:justification>
  </audit:record>
  <audit:record ID="AUDREC3">
    <audit:process type="Fedora API-M"/>
    <audit:action>addDatastream</audit:action>
    <audit:componentID>MEDIUM_SIZE</audit:componentID>
    <audit:responsibility>admin</audit:responsibility>
    <audit:date>2018-10-07T17:40:16.790Z</audit:date>
    <audit:justification></audit:justification>
  </audit:record>
  <audit:record ID="AUDREC4">
    <audit:process type="Fedora API-M"/>
    <audit:action>modifyObject</audit:action>
    <audit:componentID></audit:componentID>
    <audit:responsibility>admin</audit:responsibility>
    <audit:date>2018-10-07T17:49:01.795Z</audit:date>
    <audit:justification>PREMIS:file=testing:1+MODS+MODS.0; PREMIS:eventType=fixity check; PREMIS:eventOutcome=SHA-1 checksum validated.</audit:justification>
  </audit:record>
  <audit:record ID="AUDREC5">
    <audit:process type="Fedora API-M"/>
    <audit:action>modifyObject</audit:action>
    <audit:componentID></audit:componentID>
    <audit:responsibility>admin</audit:responsibility>
    <audit:date>2018-10-07T17:49:01.859Z</audit:date>
    <audit:justification>PREMIS:file=testing:1+OBJ+OBJ.0; PREMIS:eventType=fixity check; PREMIS:eventOutcome=SHA-1 checksum validated.</audit:justification>
  </audit:record>
  <audit:record ID="AUDREC6">
    <audit:process type="Fedora API-M"/>
    <audit:action>modifyObject</audit:action>
    <audit:componentID></audit:componentID>
    <audit:responsibility>admin</audit:responsibility>
    <audit:date>2018-10-07T18:00:03.571Z</audit:date>
    <audit:justification>PREMIS:file=testing:1+MODS+MODS.0; PREMIS:eventType=fixity check; PREMIS:eventOutcome=SHA-1 checksum validated.</audit:justification>
  </audit:record>
  <audit:record ID="AUDREC7">
    <audit:process type="Fedora API-M"/>
    <audit:action>modifyObject</audit:action>
    <audit:componentID></audit:componentID>
    <audit:responsibility>admin</audit:responsibility>
    <audit:date>2018-10-07T18:00:03.627Z</audit:date>
    <audit:justification>PREMIS:file=testing:1+OBJ+OBJ.0; PREMIS:eventType=fixity check; PREMIS:eventOutcome=SHA-1 checksum validated.</audit:justification>
  </audit:record>
</audit:auditTrail>

In this case, the only events recorded in the AUDIT datastream that were not put there by Checksum Checker are the addDatastream events for the TECHMD, TN, and MEDIUM_SIZE datastreams (this is a Basic Image object).
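
To illustrate the second option, here is a rough Python sketch of pulling only the fixity events out of an audit trail like the one above. It is not the Riprap plugin's code; `fixity_events` is a hypothetical helper, and it assumes the justification string always follows the `PREMIS:key=value; ...` pattern seen in this sample.

```python
import xml.etree.ElementTree as ET

AUDIT_NS = {"audit": "info:fedora/fedora-system:def/audit#"}


def fixity_events(audit_xml):
    """Return one dict per audit record whose justification is a fixity check."""
    root = ET.fromstring(audit_xml)
    events = []
    for record in root.findall("audit:record", AUDIT_NS):
        justification = record.findtext(
            "audit:justification", default="", namespaces=AUDIT_NS
        )
        # Split "PREMIS:file=...; PREMIS:eventType=...; ..." into key/value pairs.
        fields = {}
        for part in (justification or "").split(";"):
            key, sep, value = part.strip().partition("=")
            if sep and key.startswith("PREMIS:"):
                fields[key[len("PREMIS:"):]] = value
        # Records with an empty justification (e.g. addDatastream) are skipped.
        if fields.get("eventType") == "fixity check":
            fields["date"] = record.findtext(
                "audit:date", default="", namespaces=AUDIT_NS
            )
            events.append(fields)
    return events
```

Each returned dict carries the `file`, `eventType`, `eventOutcome`, and `date` values of one fixity check, which is roughly the information a fixity-only migration would need to keep.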

@dannylamb
Contributor

The selectors need to be tweaked now that we're actually using XPath with simplexml. But all of that is going to change again when I roll it into Islandora-Devops/migrate_7x_claw#10, so maybe I should just do that and address your feedback there.

@dannylamb
Contributor

Resolved via Islandora-Devops/migrate_7x_claw@052e2cc

@rosiel
Member

rosiel commented Apr 30, 2019

For the record, resolved via Option 1, and migrate_7x_claw "gets" (extracts) the audit datastream and stores it intact. (right?)

@dannylamb
Contributor

@rosiel It extracts it from the FOXML with this XPath: 'foxml:datastream[@ID = "AUDIT"]/foxml:datastreamVersion/foxml:xmlContent/*' and then stores it as a file+media in Drupal associated with your node.

If you're curious, you can configure that XPath here: https://github.com/Islandora-Devops/migrate_7x_claw/blob/master/modules/islandora_migrate_7x_claw_feature/config/install/migrate_plus.migration.islandora_audit_file.yml#L42
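
To see what that XPath selects outside of Drupal, here is a small Python sketch that applies the same path to a FOXML export with the standard library. It is only an illustration, not how migrate_7x_claw runs it; `extract_audit` is a hypothetical name, and the predicate is written without spaces and with single quotes to suit ElementTree's limited XPath support.

```python
import xml.etree.ElementTree as ET

NAMESPACES = {"foxml": "info:fedora/fedora-system:def/foxml#"}

# Same path as above, evaluated relative to the root foxml:digitalObject.
AUDIT_XPATH = (
    "foxml:datastream[@ID='AUDIT']/foxml:datastreamVersion/foxml:xmlContent/*"
)


def extract_audit(foxml):
    """Return the child elements of the AUDIT datastream's xmlContent."""
    root = ET.fromstring(foxml)
    return root.findall(AUDIT_XPATH, NAMESPACES)
```

For an export like the one earlier in this thread, the result should be a single `audit:auditTrail` element, which `ET.tostring` can serialize back to XML for storage.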
