perpk/json-xform

JSON transform 🤖

Overview

A library that transforms one JSON structure into another by means of an intermediate JSON DSL. It is meant for cases where an application (or an API) has to process JSON files generated by other software clients and transform them into a common internal format for further processing.

An example of such a case is an application that takes reports, perhaps from testing tools, and generates tickets or tasks from them (in Jira, for instance). To avoid writing code each time a new tool has to be integrated, the user can provide a mapping written in the DSL alongside the JSON report to process.

The DSL

The DSL is implemented in JSON. Its core vocabulary is limited to the following words; the mapping example further down additionally uses toArray and via.

  1. fieldset - defines an array of objects, each object encapsulates a mapping declaration.
  2. from - defines the field to get the value from the source JSON.
  3. to - defines the field to write the value to the target JSON.
  4. withTemplate - defines an arbitrary string with template placeholders which hold references to fields in the source object to get values from.
  5. fromEach - defines an object which addresses an array in the source JSON and provides the possibility to pick particular source fields to write to the target JSON by using the fieldset again.
  6. field - defines the field in the fromEach block to get the value from.
  7. flatten - defines whether the collection addressed by a fromEach block is flattened.

Dependencies

The essential libraries used by this project are jsonpath and jsonschema.

How it works

Two functions are exposed. Each accepts two JSON inputs: one contains the mapping, the other contains the source object.

mapToNewObject - Accepts JSON objects.

mapWithTemplate - Accepts JSON files.

Both functions return the transformed JSON object.

Mapping

Let's say you have a JSON file which looks something like the following:

    const source = {
      highLevel: [
        {
          fieldOne: 1,
          fieldTwo: 2,
          thisDate: '1981-03-10',
          lowLevel: [
            {
              fieldThree: 3,
              fieldFour: 4,
              basement: [
                {
                  this: {
                    thing: {
                      there: 'here I am'
                    }
                  }
                }
              ]
            }
          ]
        }
      ]
    };

And for some reason (only you can know) you'd like to have something like this instead:

    const target = {
      flat: [
        {
          fieldOne: 1,
          fieldTwo: [2],
          newProp: '1 is not 2',
          thatDate: '10/03/1981',
          fieldThree: 3,
          fieldFour: 4,
          that: {
            here: 'here I am'
          }
        }
      ]
    };

You can do so by providing this mapping:

    const mapping = {
      fieldset: [
        {
          fromEach: {
            field: 'highLevel',
            to: 'flat',
            flatten: true,
            fieldset: [
              {
                from: 'fieldOne'
              },
              {
                from: 'fieldTwo',
                toArray: true
              },
              {
                to: 'newProp',
                withTemplate: '${fieldOne} is not ${fieldTwo}'
              },
              {
                from: 'thisDate',
                to: 'thatDate',
                via: {
                  type: 'date',
                  sourceFormat: 'yyyy-MM-dd',
                  format: 'dd/MM/yyyy'
                }
              },
              {
                fromEach: {
                  field: 'lowLevel',
                  flatten: true,
                  fieldset: [
                    {
                      from: 'fieldThree'
                    },
                    {
                      from: 'fieldFour'
                    },
                    {
                      fromEach: {
                        field: 'basement',
                        fieldset: [
                          {
                            from: 'this.thing.there',
                            to: 'that.here'
                          }
                        ]
                      }
                    }
                  ]
                }
              }
            ]
          }
        }
      ]
    };

Whoa! Easy there, pilgrim! Let's break it down and take it from top to bottom...

  1. It all must start with a fieldset (array). It's a wrapper property which contains further mapping declarations.

  2. Next comes a fromEach (object) property, which is a wrapper for arrays of objects.

  3. The fromEach property must include at least a field (string) property, which declares the property in the source JSON object to get the value from. It always refers to an array type, since it may only appear in the context of a fromEach block.

  4. The to (string) property may be found in a fromEach block as well as in a fieldset. It defines the field in the target object where the value shall be written to. It isn't mandatory if a from field is present; if it's missing, the field in the target object will have the same name as its source counterpart.

  5. The flatten (boolean) property is optional. It can be used when the objects in the array the fromEach refers to shall be unwrapped, meaning that their properties are extracted and placed one level above. The flatten property is only valid within the scope of a single fromEach, so other, nested fromEach blocks aren't affected. If such nested blocks must be "flattened" as well, the flatten property may be set to true again in their own context. The default is false.

  6. Then, there's fieldset again. Here the fieldset contains a precise mapping declaration for particular fields.

  7. The from property denotes the key in the source object where the value shall be taken from. We can use chaining via dot (.) to cherry-pick values from nested object structures.

  8. The to property is the key in the target object where the value shall be written to. It is not mandatory if a from property exists. If it is left out, the default applies, which means that the 'write-to' property in the target will have the same name as the 'from' property in the source. Chaining can be applied here as well. It creates a nested object structure in which the last property carries the value.

  9. The toArray property defines whether the referenced value from the source shall be placed into an array in the target object. This might come in handy when you want to perform further processing on the transformed JSON and need arrays for particular properties.

  10. The withTemplate property defines a template: an arbitrary string that may embed references to props from the source. That comes in handy if you want to construct a new property out of several source fields and maybe some more text. If withTemplate is defined, the from property must not exist in the same scope; the two fields withTemplate and from are mutually exclusive. In addition, the to property becomes mandatory and must be provided, because no implicit to field can be derived when from is not allowed. Referenced properties may be nested and may contain non-word characters, just like the props referenced by from.

  11. It is possible to define a format for values in the target object. Currently this is only supported for dates. Formatting is declared with the via property, an object that holds the type of the value, the source format to parse from and the target format to re-format to. Formatting can also be combined with the withTemplate property, though only one formatting option can be defined for all values referenced in the template. If, for example, two date values are referenced, both will be re-formatted with the format defined in the via property.
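To make the from, to, toArray and withTemplate rules above more tangible, here is a minimal sketch of how a single flat fieldset could be applied to one source object. It is not json-xform's implementation (the library relies on jsonpath); the helpers getPath, setPath, renderTemplate and applyFieldset are hypothetical names made up for this illustration, and the sketch assumes plain dot-separated paths without special characters in the keys.

```javascript
// Illustrative helpers only; none of these names come from json-xform.

// Read a value by following a dot-separated path, e.g. 'this.thing.there'.
function getPath(obj, path) {
  return path.split('.').reduce(
    (current, key) => (current == null ? undefined : current[key]),
    obj
  );
}

// Write a value at a dot-separated path, creating nested objects as needed.
function setPath(obj, path, value) {
  const keys = path.split('.');
  const last = keys.pop();
  let current = obj;
  for (const key of keys) {
    if (typeof current[key] !== 'object' || current[key] === null) {
      current[key] = {};
    }
    current = current[key];
  }
  current[last] = value;
  return obj;
}

// Replace every ${path} placeholder with the value found at that path.
function renderTemplate(template, source) {
  return template.replace(/\$\{([^}]+)\}/g, (_, path) =>
    String(getPath(source, path.trim()))
  );
}

// Apply one flat fieldset (from / to / toArray / withTemplate) to one object.
function applyFieldset(fieldset, source) {
  const target = {};
  for (const entry of fieldset) {
    if (entry.withTemplate !== undefined) {
      // withTemplate excludes 'from' and makes 'to' mandatory.
      if (entry.from !== undefined || entry.to === undefined) {
        throw new Error('withTemplate requires "to" and excludes "from"');
      }
      setPath(target, entry.to, renderTemplate(entry.withTemplate, source));
      continue;
    }
    const value = getPath(source, entry.from);
    // Without 'to', the target key defaults to the 'from' key.
    setPath(target, entry.to || entry.from, entry.toArray ? [value] : value);
  }
  return target;
}

const result = applyFieldset(
  [
    { from: 'fieldOne' },
    { from: 'fieldTwo', toArray: true },
    { to: 'newProp', withTemplate: '${fieldOne} is not ${fieldTwo}' },
    { from: 'this.thing.there', to: 'that.here' }
  ],
  { fieldOne: 1, fieldTwo: 2, this: { thing: { there: 'here I am' } } }
);

console.log(JSON.stringify(result));
// {"fieldOne":1,"fieldTwo":[2],"newProp":"1 is not 2","that":{"here":"here I am"}}
```

The sketch deliberately leaves out fromEach, flatten and via, so it only covers the per-field part of a mapping.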

That's a rather complex yet complete example, since it makes use of the whole range of the currently implemented DSL vocabulary.
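The date conversion declared through via in the mapping (sourceFormat 'yyyy-MM-dd', format 'dd/MM/yyyy') can be pictured with the following sketch. It is an assumption-laden illustration, not the library's code: reformatDate and escapeRegExp are hypothetical helpers, and only the tokens yyyy, MM and dd are understood.

```javascript
// Hypothetical helper: escape characters that are special in a regular expression.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical helper: parse `value` according to `sourceFormat` and
// re-emit it according to `targetFormat`. Supports yyyy, MM and dd only.
function reformatDate(value, sourceFormat, targetFormat) {
  const tokenPattern = /yyyy|MM|dd/g;
  const order = [];
  let pattern = '';
  let lastIndex = 0;

  // Build a regular expression with one digit group per format token.
  for (const match of sourceFormat.matchAll(tokenPattern)) {
    pattern += escapeRegExp(sourceFormat.slice(lastIndex, match.index));
    pattern += `(\\d{${match[0].length}})`;
    order.push(match[0]);
    lastIndex = match.index + match[0].length;
  }
  pattern += escapeRegExp(sourceFormat.slice(lastIndex));

  const found = new RegExp(`^${pattern}$`).exec(value);
  if (!found) {
    throw new Error(`"${value}" does not match format "${sourceFormat}"`);
  }

  // Remember which digits belong to which token.
  const parts = {};
  order.forEach((token, index) => {
    parts[token] = found[index + 1];
  });

  // Tokens missing from the source format are left untouched in the output.
  return targetFormat.replace(tokenPattern, (token) =>
    parts[token] !== undefined ? parts[token] : token
  );
}

console.log(reformatDate('1981-03-10', 'yyyy-MM-dd', 'dd/MM/yyyy'));
// 10/03/1981
```

The sketch only rearranges digits and does not validate that the result is a real calendar date.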