New feature for 1.0.0 : Metadata patching via VBO #101

Open
DiegoPino opened this issue Aug 20, 2020 · 7 comments

@DiegoPino
Member

DiegoPino commented Aug 20, 2020

With VBO (Views Bulk Operations) this feature is quite simple. This module needs to provide a few action plugins, starting with a base one that allows strings to be replaced by other strings in the JSON.

As simple as that, with one exception: the end result needs to be valid JSON before it is saved. Beyond that, I would love to start using more powerful options, since we are JSON fans.
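To make the "replace strings, but never break the JSON" rule concrete, here is a minimal sketch. It is written in Python only to stay self-contained (the module itself is PHP/Drupal), and the function name `replace_in_json` is hypothetical, not something the module provides.

```python
import json


def replace_in_json(raw: str, search: str, replacement: str) -> str:
    """Replace `search` with `replacement` in serialized JSON.

    The result is parsed again before it is returned, so a replacement
    that would produce invalid JSON is rejected instead of being saved.
    """
    candidate = raw.replace(search, replacement)
    try:
        json.loads(candidate)
    except json.JSONDecodeError as exc:
        raise ValueError(f"replacement would break the JSON: {exc}") from exc
    return candidate


# A harmless replacement goes through unchanged in structure.
print(replace_in_json('{"label": "Dan"}', "Dan", "Allison"))
```

A replacement containing an unescaped double quote would fail the final parse and raise, which is the safety net described above.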

JSON Patching and also JSON Diffs. Why? Because the order of things in JSON cannot be ensured when dealing with properties, but also because a JSON Patch allows greater complexity. My main concern is the interface. I would love to have something similar to a Webform (if not Webform itself) to apply a change to a certain field and, through that, build the JSON Patch.

Also, we have help for that: https://github.com/swaggest/json-diff

@DiegoPino DiegoPino self-assigned this Aug 20, 2020
@DiegoPino DiegoPino added Documentation Document Drupal Views JSON Integration with VIEWS enhancement New feature or request JSON Preprocessors Drupal Plugins that do stuff before JSON data is saved and can shape it question Further information is requested Symfony Services Typed Data and Search labels Aug 20, 2020
@DiegoPino DiegoPino added this to the 1.0.0 milestone Aug 20, 2020
@DiegoPino
Member Author

I will add an example here taken from the swaggest docs (FYI we use swaggest a lot, especially for the JSON Schema validations we do, so this is an implicit dependency we already have).

$originalJson = <<<'JSON'
{
    "key1": [4, 1, 2, 3],
    "key2": 2,
    "key3": {
        "sub0": 0,
        "sub1": "a",
        "sub2": "b"
    },
    "key4": [
        {"a":1, "b":true, "subs": [{"s":1}, {"s":2}, {"s":3}]}, {"a":2, "b":false}, {"a":3}
    ]
}
JSON;

$newJson = <<<'JSON'
{
    "key5": "wat",
    "key1": [5, 1, 2, 3],
    "key4": [
        {"c":false, "a":2}, {"a":1, "b":true, "subs": [{"s":3, "add": true}, {"s":2}, {"s":1}]}, {"c":1, "a":3}
    ],
    "key3": {
        "sub3": 0,
        "sub2": false,
        "sub1": "c"
    }
}
JSON;

$patchJson = <<<'JSON'
[
    {"value":4,"op":"test","path":"/key1/0"},
    {"value":5,"op":"replace","path":"/key1/0"},
    
    {"op":"remove","path":"/key2"},
    
    {"op":"remove","path":"/key3/sub0"},
    
    {"value":"a","op":"test","path":"/key3/sub1"},
    {"value":"c","op":"replace","path":"/key3/sub1"},
    
    {"value":"b","op":"test","path":"/key3/sub2"},
    {"value":false,"op":"replace","path":"/key3/sub2"},
    
    {"value":0,"op":"add","path":"/key3/sub3"},

    {"value":true,"op":"add","path":"/key4/0/subs/2/add"},
    
    {"op":"remove","path":"/key4/1/b"},
    
    {"value":false,"op":"add","path":"/key4/1/c"},
    
    {"value":1,"op":"add","path":"/key4/2/c"},
    
    {"value":"wat","op":"add","path":"/key5"}
]
JSON; 
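To show how such a patch document is evaluated, here is a minimal sketch of a patch applier. It is not swaggest/json-diff and not the module's code: it is a Python stand-in covering only the `add`, `remove`, `replace` and `test` operations, it does not handle the root path `""`, and the names `parse_pointer` and `apply_patch` are hypothetical.

```python
import copy
import json


def parse_pointer(pointer):
    """Split an RFC 6901 JSON Pointer into unescaped reference tokens."""
    if not pointer.startswith("/"):
        raise ValueError(f"unsupported JSON Pointer: {pointer!r}")
    return [t.replace("~1", "/").replace("~0", "~") for t in pointer[1:].split("/")]


def _key(container, token):
    """Lists are addressed by integer index, objects by string key."""
    return int(token) if isinstance(container, list) else token


def _same(a, b):
    """Compare as JSON, so that 1 and true are not treated as equal."""
    return json.dumps(a, sort_keys=True) == json.dumps(b, sort_keys=True)


def apply_patch(document, patch):
    """Apply the operations in order on a copy; any failure aborts the whole patch."""
    doc = copy.deepcopy(document)
    for op in patch:
        *parents, last = parse_pointer(op["path"])
        target = doc
        for token in parents:
            target = target[_key(target, token)]
        kind = op["op"]
        if kind == "test":
            if not _same(target[_key(target, last)], op["value"]):
                raise ValueError(f"test failed at {op['path']}")
        elif kind == "remove":
            del target[_key(target, last)]
        elif kind == "replace":
            key = _key(target, last)
            target[key]  # raises if the location does not exist yet
            target[key] = op["value"]
        elif kind == "add":
            if isinstance(target, list):
                target.insert(len(target) if last == "-" else int(last), op["value"])
            else:
                target[last] = op["value"]
        else:
            raise ValueError(f"unsupported op: {kind}")
    return doc


original = {"key1": [4, 1, 2, 3], "key2": 2}
patch = [
    {"value": 4, "op": "test", "path": "/key1/0"},
    {"value": 5, "op": "replace", "path": "/key1/0"},
    {"op": "remove", "path": "/key2"},
]
print(apply_patch(original, patch))
```

Because the work happens on a copy, a failing `test` leaves the stored document untouched, which is what makes the test-then-replace pairs in the example above safe.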

@alliomeria @giancarlobi in case this is new for you: I feel the replacement logic is quite intuitive. And with VBO and some UX tool we could have a VERY powerful metadata cleanup/fixing/improving tool here; actually I feel this feature can become KEY.

See https://www.drupal.org/project/views_bulk_operations

PS: there is a quite simplistic per-field edit submodule at that link (we use a single field, so it is not of much use here). I tested it: the UI/UX is terrible and it has many other issues. So we can even think of NOT doing it like that.

@DiegoPino
Member Author

While I code this (right now) I found some extra interesting use cases:

1.- I do not want certain users or even roles to be able to add/remove/update certain JSON Paths (in this context they are really JSON Pointers).

This intersects directly with the ACL permission layer I'm working on. So JSON patching would become another access plugin that would interact with a given resource (in this case aNODE:aFIELD:THEJSON).

2.- I want to exclude certain operations globally for certain keys. For example, I do not want a JSON Patch operation to act on certain required elements, e.g. NO "op":"remove","path":"/type"

3.- I want certain operations to set values only under a controlled scenario (vocabularies or dictionaries).

Ideas:

A.- Build this into the ACL permission layer as a parsing service plugin. I'm thinking that each type of operation, for ACL purposes, becomes a plugin that parses a given resource path and a given operation and gives me back neutral, negative or positive access. This way I do not need to add anything to this functionality; I just evaluate whatever the input of this action is against the ACL machinery. CONS: I need to write that complex ACL machinery!

B.- Start simple. Add a configuration entity with a list of denied JSON Pointers per operation (the relevant operations are add, remove and replace). Then, every time I want to apply a JSON Patch, I maybe add extra test operations? Or remove some operations upfront, so the ones that need to run will fail and the whole patch operation will be skipped?
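Idea B can be sketched as a plain lookup done before the patch runs. This is a Python stand-in under stated assumptions: the `DENIED` mapping plays the role of the proposed configuration entity, and `is_denied` and `check_patch` are hypothetical names. It only looks at exact pointers and their children, so removing a parent of a denied pointer is not caught here.

```python
# Hypothetical deny list: JSON Pointers that each operation may not touch.
DENIED = {
    "add": [],
    "remove": ["/type"],
    "replace": ["/type"],
}


def is_denied(op, path, denied=DENIED):
    """True if `path` is a denied pointer for `op`, or lies beneath one."""
    return any(
        path == pointer or path.startswith(pointer + "/")
        for pointer in denied.get(op, [])
    )


def check_patch(patch, denied=DENIED):
    """Return the offending operations; an empty list means the patch may run."""
    return [o for o in patch if is_denied(o["op"], o["path"], denied)]


patch = [
    {"op": "remove", "path": "/type"},
    {"op": "replace", "path": "/label", "value": "New label"},
]
print(check_patch(patch))
```

Rejecting the whole patch when the returned list is non-empty keeps the all-or-nothing behaviour described above, without having to inject extra test operations.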

@DiegoPino
Member Author

Since I'm basically speaking to myself here: another thing I wanted is, of course, to validate the JSON Patch operation upfront (syntax). Since I know my code, I'm doing it via a JSON Schema validation, as I do with other JSONs in our code base. In case someone was "wondering" how that works.
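The kind of rules such a schema would encode can be illustrated with a hand-rolled structural check. This is explicitly a swapped-in technique: the thread uses JSON Schema validation, while the sketch below is plain Python with a hypothetical `patch_errors` function. The required members per operation follow RFC 6902.

```python
# Members RFC 6902 requires for each operation.
REQUIRED = {
    "add": {"path", "value"},
    "remove": {"path"},
    "replace": {"path", "value"},
    "move": {"path", "from"},
    "copy": {"path", "from"},
    "test": {"path", "value"},
}


def patch_errors(patch):
    """Return a list of human-readable syntax problems; empty means well-formed."""
    if not isinstance(patch, list):
        return ["patch document must be a JSON array"]
    errors = []
    for i, op in enumerate(patch):
        if not isinstance(op, dict):
            errors.append(f"operation {i}: must be an object")
            continue
        name = op.get("op")
        if not isinstance(name, str) or name not in REQUIRED:
            errors.append(f"operation {i}: unknown or missing 'op'")
            continue
        missing = REQUIRED[name] - op.keys()
        if missing:
            errors.append(f"operation {i}: missing {sorted(missing)}")
        for member in ("path", "from"):
            value = op.get(member)
            if member in op and not (
                isinstance(value, str) and (value == "" or value.startswith("/"))
            ):
                errors.append(f"operation {i}: '{member}' is not a JSON Pointer")
    return errors


print(patch_errors([{"op": "replace", "path": "/label"}]))
```

Only syntax is checked here; whether a pointer actually resolves in a given document is still decided when the patch is applied.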

@DiegoPino
Member Author

DiegoPino commented Aug 27, 2020

As i continue my journey here i think i will add additional actions since the experience here is good!

  1. Action to reconcile LoD. Take a Subject label, pass it through WIKIDATA, AAT and LoC (configurable), then feed the result back into a JSON Patch operation.

  2. Action that does a JSON DIFF (compares two docs, generates a JSON Patch document out of the difference, and applies it to another!)

  3. Simple regular expression or text/replacement Action, but one that validates at the end so breaking the JSON is not an option.

    • 👀 Got a simple action that replaces a value working. Simple and good.
  4. This same Action but with an extra feature: don't patch the whole document, but first pass a selector, JMESPATH or something dynamic. E.g.

    • 👀 JMESPATH into static JSON Pointers. That is what I'm going to do. Like a preparsing.
    • Start inside as:images, and apply a patch operation for EACH image found, adding a tag = ['preservation'] IF pronom contains one of the RAW JPEG ids (so it cannot be used to show in realtime). I LOVE THIS, haha. Because actions can also fire as event subscribers.
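The JSON DIFF action (item 2) boils down to turning two documents into a patch. Below is a minimal Python sketch of that idea, not the swaggest/json-diff algorithm: it recurses into objects only and treats arrays and scalars as whole values, and the names `escape` and `diff` are hypothetical.

```python
import json


def escape(token):
    """Escape a property name for use inside a JSON Pointer (RFC 6901)."""
    return str(token).replace("~", "~0").replace("/", "~1")


def diff(old, new, path=""):
    """Build a list of JSON Patch operations that turns `old` into `new`."""
    if isinstance(old, dict) and isinstance(new, dict):
        ops = []
        for key in old:
            child = f"{path}/{escape(key)}"
            if key not in new:
                ops.append({"op": "remove", "path": child})
            else:
                ops.extend(diff(old[key], new[key], child))
        for key in new:
            if key not in old:
                ops.append({"op": "add", "path": f"{path}/{escape(key)}", "value": new[key]})
        return ops
    # Compare as serialized JSON so property order is ignored and 1 != true.
    if json.dumps(old, sort_keys=True) != json.dumps(new, sort_keys=True):
        return [
            {"op": "test", "path": path, "value": old},
            {"op": "replace", "path": path, "value": new},
        ]
    return []


old = {"key2": 2, "key3": {"sub0": 0, "sub1": "a"}}
new = {"key3": {"sub1": "c", "sub3": 0}, "key5": "wat"}
for operation in diff(old, new):
    print(operation)
```

Emitting a `test` before every `replace` means the generated patch only applies cleanly to another document that holds the same old values, which is a reasonable guard when a diff taken from one pair of documents is applied to a third.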

Documentation, a lot of that.

@alliomeria
Contributor

I feel the replacement logic is quite intuitive. And with the VBO and some UX tool we could have here a VERY powerful metadata cleanup/fixing/improving tool here, actually i feel this feature can become KEY.

As i continue my journey here i think i will add additional actions since the experience here is good!

All this is great! Contains the essential ingredients for a key feature that empowers individuals to perform metadata work efficiently and safely. Who doesn't love batch metadata editing capabilities?

Potential food for thought for form UI: https://github.com/UCSCLibrary/BulkMetadataEditor/blob/master/libraries/BulkMetadataEditor/Form/Main.php
Thinking of the workflow process steps in particular, select items-->select fields-->define changes/actions (option to preview)-->apply changes/actions

Looking forward to testing this tool out and seeing it in action.

@DiegoPino
Member Author

DiegoPino commented Oct 27, 2020

@alliomeria @giancarlobi

I have this working now but I'm still wondering about how to make JSONPOINTERS better suited for our needs. Let me explain:
Let's assume we have an SBF JSON (piece) like this:

    "label": "My First Digital Object",
    "term_aat_getty": [],
    "ap:entitymapping": {
        "entity:file": [
            "images",
            "warcs",
            "documents",
            "audios",
            "videos",
            "models"
        ],
        "entity:node": [
            "ismemberof"
        ]
    },
    "local_identifier": "",
    "subject_wikidata": [
        {
            "uri": "http:\/\/www.wikidata.org\/entity\/Q1158971",
            "label": "Dan"
        }
    ]

And a JSON Patch to replace the FIRST subject_wikidata entry if the Object has the "My First Digital Object" label would look like this:

[
  { "op": "test", "path": "/label", "value": "My First Digital Object" },
  { "op": "replace", "path": "/subject_wikidata/0", "value": { "uri": "http:\/\/www.wikidata.org\/entity\/Q19819764", "label": "Allison" } }
]

The problem with this is that normally we won't have any control over the order in which a specific Wikidata entry (or any list element) appears.
JSON Pointer is a strict specification: https://tools.ietf.org/html/rfc6901

So what we really want is another option that we can prepare and then expand into a JSON Pointer specific to our case.

Ideas:

[
  {
    "pre": "find", "path": "/subject_wikidata[*].label", "match": "Dan", "op": "replace", "value": { "uri": "http:\/\/www.wikidata.org\/entity\/Q19819764", "label": "Allison" }
  }
]

This is of course not a valid JSON Patch operation. Will continue explaining after a call I have. TBC!

OK, so back here. I envision this as an out-of-spec operation that I can preparse.
Let's say (pseudo code):

Foreach ADO
  Foreach operation, check for a "pre" key that has "find" as value
    get "path"
    split "path" and get the position of the wildcard/expansion, save everything before the wildcard expansion into "future_path"
    JMESPATH resolve "path" and check if the return contains "match", keep the positions of "match" as "index[]"
    foreach "index[]" as "index", add "index" at the end of "future_path"
      recreate a valid JSON PATCH operation that points exactly to this static path
      add the new valid one to the JSONPATCH document
  remove the custom operation
  Run the valid new JSONPATCH
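The preparse step above can be sketched as follows. Several assumptions apply: this is Python rather than the module's PHP, it does not use JMESPath but a hand-rolled resolver that only understands the single shape `/<key>[*].<field>`, it does not escape special characters in keys, and the name `expand_find` is hypothetical. The extra `test` operation it emits is an addition of this sketch, not part of the pseudo code.

```python
import re

# Only the shape used in the example is supported: /<key>[*].<field>
FIND_PATH = re.compile(r"^/(?P<key>[^/\[\]]+)\[\*\]\.(?P<field>[^/\[\].]+)$")


def expand_find(document, operations):
    """Rewrite custom "pre": "find" operations into static JSON Patch operations."""
    expanded = []
    for op in operations:
        if op.get("pre") != "find":
            expanded.append(op)  # already a regular JSON Patch operation
            continue
        parsed = FIND_PATH.match(op["path"])
        if parsed is None:
            raise ValueError(f"unsupported find path: {op['path']!r}")
        key, field = parsed["key"], parsed["field"]
        future_path = f"/{key}"
        items = document.get(key, [])
        # enumerate() keeps the real position even when some entries are null.
        matches = [
            index
            for index, item in enumerate(items)
            if isinstance(item, dict) and item.get(field) == op["match"]
        ]
        # Highest index first, so a "remove" does not shift the remaining targets.
        for index in reversed(matches):
            static = {k: v for k, v in op.items() if k not in ("pre", "match")}
            static["path"] = f"{future_path}/{index}"
            # Guard added by this sketch: the entry must still hold the matched value.
            expanded.append({"op": "test", "path": f"{static['path']}/{field}", "value": op["match"]})
            expanded.append(static)
    return expanded


ado = {"subject_wikidata": [None, {"uri": "http://www.wikidata.org/entity/Q1158971", "label": "Dan"}]}
custom = [{
    "pre": "find", "path": "/subject_wikidata[*].label", "match": "Dan", "op": "replace",
    "value": {"uri": "http://www.wikidata.org/entity/Q19819764", "label": "Allison"},
}]
for operation in expand_find(ado, custom):
    print(operation)
```

The output is an ordinary JSON Patch document pointing at `/subject_wikidata/1`, so everything downstream (validation, access checks, applying) can stay unaware of the custom `pre` syntax.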

@DiegoPino
Member Author

We did some testing today with @alliomeria and there may be a few edge cases/needs (e.g. JMESPATH filters out NULLs, so you do not get the actual index as it is). Also the UI (gosh, the UI!) is needed. I can write JSONPATCH. Not sure 👀 people want to write JSONPATCH! So fields and logic need to be exposed even if internally ALL is JSONPATCH. Did I mention JSONPATCH?
