
Explore: Add transformations to correlation data links #61799

Merged: 36 commits merged into main from kristina/transformation on Feb 22, 2023

Conversation

@gelicia (Contributor) commented Jan 19, 2023

What is this feature?

Transformations act as a lens through which we focus on specific pieces of the source data so they can be used by the target query. This PR implements the regex and logfmt transformations as a first pass.
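
As a rough sketch of the idea (hypothetical helper and type names; the PR's real implementation lives in the explore/correlations code and may differ), a transformation takes the value of the configured source field and turns it into variables that the target query can interpolate:

// Sketch only, not the PR's actual code.
type TransformationConfig = {
  type: 'regex' | 'logfmt';
  expression?: string;
  variable?: string;
};

function getTransformationVars(config: TransformationConfig, fieldValue: string): Record<string, string> {
  const vars: Record<string, string> = {};
  if (config.type === 'regex' && config.expression) {
    const match = fieldValue.match(new RegExp(config.expression));
    if (match?.groups) {
      // Named capture groups become variables keyed by their group names.
      for (const [name, value] of Object.entries(match.groups)) {
        if (value !== undefined) {
          vars[name] = value;
        }
      }
    } else if (match && config.variable && match[1] !== undefined) {
      // A single unnamed capture group maps to the configured variable name.
      vars[config.variable] = match[1];
    }
  } else if (config.type === 'logfmt') {
    // Deliberately naive logfmt parsing, for illustration only.
    for (const pair of fieldValue.match(/(\w+)=("[^"]*"|\S+)/g) ?? []) {
      const [key, ...rest] = pair.split('=');
      vars[key] = rest.join('=').replace(/^"|"$/g, '');
    }
  }
  return vars;
}

// e.g. getTransformationVars(
//   { type: 'regex', expression: '(Superman|Batman)', variable: 'name' },
//   'This is a news article about Superman.'
// ) yields { name: 'Superman' }, which the data link can substitute for ${name}.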

Why do we need this feature?

We have the ability to correlate data from one datasource to another, but most of the time we will need to do something to the source data to make it suitable for the target data source. Transformations are the answer to that and this is the first example of how they may work.

Who is this feature for?

Anyone using correlations.

Which issue(s) does this PR fix?:

Fixes #60023

Special notes for your reviewer:

Example 1

Example datasource provisioning yaml with regex only

  - name: testData-correlations
    isDefault: false
    editable: true
    type: testdata
    correlations:
      - targetUID: WyFv5154z
        label: "Superhero freeform"
        description: "this is a test correlation from provisioning"
        config:
          type: query
          target:
            editorMode: "code"
            format: "table"
            rawQuery: "true"
            rawSql: "SELECT * FROM superhero WHERE name=''${name}''"
            refId: "A"
          field: "text"
          transformations:
            - type: "regex"
              expression: "(Superman|Batman)"
              variable: "name"

You will need to edit the target datasource to be one that is available in your environment. I have a postgres datasource running with a table of superhero data.

Using the above provisioned datasource, select the 'CSV Content' scenario and use the following CSV:

date,text
1674078628,This is a news article about Superman. Batman was not involved at all.

This will create a link to the defined target datasource, and the query will contain "Superman" in place of the name variable.

Example 2

Example datasource provisioning yaml with regex and logfmt

  - name: testData-correlations
    isDefault: false
    editable: true
    type: testdata
    correlations:
      - targetUID: WyFv5154z
        label: "Superhero 2 transformations"
        description: "2nd test transformations"
        config:
          type: query
          target:
            editorMode: "code"
            format: "table"
            rawQuery: "true"
            rawSql: "SELECT * FROM superhero WHERE name='$${name}' AND alignment='$${align}'"
            refId: "A"
          field: "text"
          transformations:
            - type: "logfmt"
            - type: "regex"
              expression: "text=.*(good|bad).*"
              variable: "align"

You will need to edit the target datasource to be one that is available in your environment. I have a postgres datasource running with a table of superhero data.

Using the above provisioned datasource, select the 'CSV Content' scenario and use the following CSV:

date,text
1674078628,"name=Superman text=""he did a good thing"""
1674078628,"name=Thanos text=""he did a bad thing"""
1674078628,"name=Batman text=""he did a bad thing"""

This will create a link to the defined target datasource for each row, with the name and align variables filled in from the source text.
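
For instance, for the first CSV row the logfmt transformation yields name=Superman (plus the text key), and the regex captures "good" into align. Assuming the $$ escaping resolves to ${name} and ${align} at provisioning time, the interpolated query for that row would look roughly like:

SELECT * FROM superhero WHERE name='Superman' AND alignment='good'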

@github-actions

Backend code coverage report for PR #61799
No changes

github-actions bot commented Jan 19, 2023

Frontend code coverage report for PR #61799

Plugin         Main     PR       Difference
correlations   94.9%    87.23%   -7.67%
explore        86.26%   86.28%   +0.02%

@ifrost (Contributor) left a comment:

Nice, with some small changes around regexp capturing (see the comment) I was able to extract some pieces of the line to get something like this working:

[Screenshot attached: 2023-01-24 at 23:34:29]

public/app/features/explore/utils/links.ts (resolved review thread)
@gelicia gelicia requested a review from ifrost January 30, 2023 14:03
devenv/datasources.yaml (resolved review thread)
pkg/services/correlations/models.go (resolved review thread)
if (link.internal?.transformations) {
  const fieldValue = field.values.get(rowIndex);
  link.internal?.transformations.forEach((transformation) => {
    if (transformation.type === 'regex' && transformation.expression) {
@gelicia (PR author) commented:

Should we implement this in a cleaner way? It has the potential to grow quite a bit; maybe break the transformation logic out into its own file?

A contributor replied:

We will definitely need it in the future and may want to move it to feature/correlations, but we can do that later. You could also do it now, as it may make it easier to test just the transformation logic.

@ifrost (Contributor) left a comment:

Nice, I was able to extract fields from Loki with this change. Probably worth adding some tests to dataLink.ts / links.ts.

pkg/services/correlations/models.go (resolved review thread)
@gelicia gelicia added this to the 9.5.0 milestone Jan 31, 2023
@gelicia gelicia added the no-backport Skip backport of PR label Jan 31, 2023
@@ -3038,6 +3038,9 @@ exports[`better eslint`] = {
    [0, 0, 0, "Do not use any type assertions.", "0"],
    [0, 0, 0, "Do not use any type assertions.", "1"]
  ],
  "public/app/features/correlations/transformations.ts:5381": [
    [0, 0, 0, "Do not use any type assertions.", "0"]
A member commented:

opened DefinitelyTyped/DefinitelyTyped#64263 to get rid of this

@@ -192,7 +192,7 @@ export const safeParseJson = (text?: string): any | undefined => {
 };

 export const safeStringifyValue = (value: any, space?: number) => {
-  if (!value) {
+  if (value === undefined || value === null) {
A member commented:

This slightly changes the behaviour of a bunch of things, like the panel model inspector in dashboards and variables. Did we test that it's safe? (Especially for variables: false and 0 would previously have resulted in an empty string, but now produce "false" and "0"; not sure that's even possible, though.)

@gelicia (PR author) commented Feb 7, 2023:

I just couldn't imagine why we would want true to evaluate to "true" but false to evaluate to an empty string. Let's look at the usages of this function outside of how it is used in this PR:

  • features/dashboard/state/PanelModel.ts - passes the entire model into this function, and that model is a required parameter. I am extremely confident that PanelModel will never evaluate to false or 0, there are a lot of required fields in it.
  • features/variables/utils.ts - safe-stringifies the first arg, concatenates all args together split by a space except the last one and checks that string has matches for any of the three formats for variables. I do not believe that having false or 0 evaluate to an empty string or not will have any impact on this, because the safe-stringify only runs on the first argument, and in all cases a variable reference must start with either $ or [ which is not possible with a false or 0 scenario.
  • features/variables/inspect/utils.ts - Similar to the above, we safe-stringify a value and check if it matches any of the variable patterns. False or 0 or empty string will all not match. That value is not used again - it will use the matching groups instead.
  • plugins/datasource/prometheus/datasource.tsx - This will only run if the if statement depending on the same value is true, so if the value passed in is 0 or false, it will never run in the first place

I'm very confident based on those usages that this logic will not impact anything.
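
For reference, a minimal sketch of the behavioural difference being discussed (assuming the guard short-circuits to an empty string and the function otherwise falls through to JSON.stringify; this is not the actual safeStringifyValue body):

// Illustrative only.
const stringifyOld = (value: any) => (!value ? '' : JSON.stringify(value));
const stringifyNew = (value: any) =>
  value === undefined || value === null ? '' : JSON.stringify(value);

stringifyOld(false); // ''      (falsy values were swallowed)
stringifyNew(false); // 'false'
stringifyOld(0);     // ''
stringifyNew(0);     // '0'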

@gelicia gelicia requested a review from Elfo404 February 7, 2023 18:13
@gelicia (PR author) commented Feb 8, 2023

To document some discussion that happened off this PR: @Elfo404, @ifrost, and I looked at this solution against the original design doc and decided on a couple of things:

  • Going with this format of transformations, instead of a solution emulating promtail's parsing pipeline, is fine; it will be easier to display from a UI standpoint and is perhaps simpler to understand. We may introduce the parsing-pipeline system later.
  • The current regex limitation of only supporting one match is insufficient. JavaScript's regex support already allows multiple named capture groups that could be used to name variables. We should use that existing functionality out of the box.
  • We should not allow users to name a regex variable. In the event that multiple regex matches are made, having one variable name would be confusing.

To this effect, I have kept the existing solution but implemented logic to take advantage of named capture groups. I have added a test to show how this works, and an example provisioning and CSV pair will be available below.

However, while looking into this, I would really like reviewers to reconsider removing the regex variable name from what can be defined. In my opinion, in order of increasing complexity, regexes fall into the following categories:

  1. An unnamed single capture group with a variable name defined in the transformation
  2. A named single capture group
    • if a variable is defined, it is ignored in favor of the capture group name
  3. Multiple unnamed capture groups
    • not currently supported, but could be supported in the format of variableName[0] and so on
  4. Multiple named capture groups where order is enforced
    • This means your capture groups need to appear in the input in the order they were defined. So, for example, with (?<align>(good|bad)).*(?<name>(Superman|Batman)), "good Superman" would match but "Batman bad" would not.
    • I feel most people wanting to make correlations would not want to require a consistent ordering of their data since data ordering can change over time
  5. Multiple named capture groups where order does not matter
    • if a variable is defined, it is ignored in favor of the capture group name
    • It is worth noting that making multiple transformations of type 2 would be far easier than including backreferences to every capture group. I do think we should support this, but consider it an edge case.
    • An example regex for this is (?=.*(?<align>(good|bad)))(?=.*(?<name>(Superman|Batman))).

There could also be combinations of the above in the same regex, but the pattern would always be that named capture groups would get those names and be guaranteed to work, and any unnamed capture groups in the same expression would most likely not work.

I think we should prioritize the user experience of that first category. I predict that users who can figure out named capture groups will understand that the names they define override what is set in their configuration, versus a user who is struggling to understand why their variable has to match their field name, with no alternative. I also think that a variable definition gives us a reasonable path forward to supporting multiple unnamed capture groups.

In short (ha!) I think users will most likely take the path of creating 10 simple regex transformations rather than 1 complex regex expression, and we should prioritize making that as simple as possible.

Currently the PR keeps the transformation variable name option. If you disagree with the above, I can remove it.

Example datasource provisioning yaml with named regex capture groups

      - targetUID: WyFv5154z
        label: "Superhero complicated regex"
        description: "this is a test regex correlation from provisioning"
        config:
          type: query
          target:
            editorMode: "code"
            format: "table"
            rawQuery: "true"
            rawSql: "SELECT * FROM superhero WHERE name='$${name}' AND alignment='$${align}'"
            refId: "A"
          field: "text"
          transformations:
            - type: "regex"
              expression: "(?=.*(?<align>(good|bad)))(?=.*(?<name>(Superman|Batman)))"

Example CSV for the 'CSV Content' scenario

date,text
1674078628,"Superman did a good thing"
1674078628,"bad Batman did a thing"
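
To make this concrete, a quick sketch of how the lookahead-based expression above produces both named groups regardless of word order (standard JavaScript regex behaviour):

// Both named groups match independently of word order thanks to the lookaheads.
const expr = /(?=.*(?<align>(good|bad)))(?=.*(?<name>(Superman|Batman)))/;
'Superman did a good thing'.match(expr)?.groups; // { align: 'good', name: 'Superman' }
'bad Batman did a thing'.match(expr)?.groups;    // { align: 'bad', name: 'Batman' }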

@gelicia gelicia requested a review from a team as a code owner February 8, 2023 22:08
@gelicia (PR author) commented Feb 8, 2023

Also, none of this covers another use case that we are holding off on, which is when one capture group matches multiple times 🙃

@ifrost (Contributor) commented Feb 9, 2023

It's going to be difficult to have one transformation support all use cases. I like that we start with something powerful that can be simplified for the user later.

Also, we know that regex is complex for many users anyway. That's why we're providing logfmt, and we can provide more transformations in the future based on use cases (e.g. extracting key/value pairs and labels). We don't want users to have to rely only on regex.

I can imagine adding a new transformation (simpleregex) that would be the default, allowing users to write an expression and a name for the first match (basically what you had before the change).

A simpleregex won't prevent us from doing transformations in the future; I can imagine a data frame transformation that would behave this way. The only thing we need to remember is to keep the configuration decoupled from the current logic, so, as we said, we shouldn't use the "variable" property in the settings yet (but we could call it "name" to indicate it's the name of the first match).

Anyway, I feel it's powerful and flexible with multiple matches, and still open to going in many directions in the future.

@gelicia (PR author) commented Feb 9, 2023

The only thing we need to remember is to keep the configuration decoupled from the current logic, so, as we said, we shouldn't use the "variable" property in the settings yet

@ifrost In my opinion, adding a variable name option doesn't tie configuration to logic any more than any other definitions we specify, but I trust in your vision of this feature and will remove it.

Additionally, this limitation now means users cannot do multiple transformations with unnamed capture groups - it will override the fieldName variable with the last transformation. Again, I anticipate the most likely use case will be multiple simple regexes, and this will prohibit that from working without defining the capture group name.

@ifrost (Contributor) commented Feb 9, 2023

@ifrost In my opinion, adding a variable name option doesn't tie configuration to logic any more than any other definitions we specify, but I trust in your vision of this feature and will remove it.

I was only concerned about the naming. Let's say we have provisioning looking like:

  - name: testData-correlations
    correlations:
      - targetUID: WyFv5154z
        config:
          type: query
          target: ...
          field: "text"
          transformations:
            - type: "regex"
              expression: "(Superman|Batman)"
              variable: "name"

And say in the future we want to use real transformations where the regex creates not a variable but a new field, so we'd like to express that in the configuration and call it fieldName or name, e.g.:

  - name: testData-correlations
    correlations:
      - targetUID: WyFv5154z
        config:
          type: query
          target: ...
          field: "text"
          transformations:
            - type: "regex"
              expression: "(Superman|Batman)"
              fieldName: "name"

Only the name of the last property changes. From the user perspective and other configuration options, nothing changes and we can start adding more super-sophisticated transformations to the mix.

Of course, it's just a minor thing; we can deprecate the config and support variable for some time alongside fieldName.

Totally agree it "doesn't tie configuration to logic". It seems like a valid transformation that takes a simple regex expression and creates a field with a name provided by the user.

@gelicia (PR author) commented Feb 9, 2023

Alrighty, after another conversation with @ifrost and @Elfo404, we agreed that keeping an option that allows users to map a regex match to a variable name, but naming it mapValue, is the best way to keep the plurality and definition future-proof.
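
For illustration, the transformation from Example 1 would then be expressed along these lines (a sketch only; mapValue is the agreed property name, but the exact config shape may differ in the final code):

// Hypothetical config object mirroring Example 1 after the rename.
const transformation = {
  type: 'regex',
  expression: '(Superman|Batman)',
  mapValue: 'name',
};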

@ifrost (Contributor) left a comment:

Great stuff 🎉

I think it'd be great to have better provisioning validation, but we can add it in a separate PR.

For example, we should check whether the correct structure of the config is provided. I was testing with the example in the PR description and it created a link without transformations (they are incorrectly nested in the yaml example, so please update it). I also mistyped the transformation type (regexp instead of regex) and got a link with no transformations and no errors. We do this for some other properties (e.g. we validate that a correct correlation type, i.e. query, is provided) and return an error when the provisioning is parsed.

@@ -40,12 +40,21 @@ export interface DataLink<T extends DataQuery = any> {
  internal?: InternalDataLink<T>;
}

/** @internal */
export interface DataLinkTransformationConfig {
  type: 'regex' | 'logfmt';
A contributor commented:

Minor nit: It could be an enum
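
A minimal sketch of what the suggested enum might look like (name and values illustrative, not necessarily what was committed):

// Hypothetical enum replacing the 'regex' | 'logfmt' union.
export enum SupportedTransformationType {
  Regex = 'regex',
  Logfmt = 'logfmt',
}

export interface DataLinkTransformationConfig {
  type: SupportedTransformationType;
  // ...expression and the other transformation fields.
}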

@gelicia gelicia requested a review from a team as a code owner February 10, 2023 23:31
@gelicia gelicia requested a review from ifrost February 11, 2023 20:15
@gelicia (PR author) commented Feb 11, 2023

@ifrost I added the provisioning checks you asked for (although the errors do seem to come out a little garbled; not sure if there's something I'm missing there) and the enum change. I checked how one could do enums in Go and it didn't seem as straightforward as in TypeScript, so I left it as is. Let me know if you have any thoughts about that.

@gelicia (PR author) commented Feb 16, 2023

@ryantxu This is what we discussed yesterday - let me know what you think would be a better term for this feature

@ifrost (Contributor) left a comment:

Regex validation doesn't seem to check whether an expression is provided (it's required by the regex transformation to make it work).

@gelicia gelicia merged commit 06dfe21 into main Feb 22, 2023
@gelicia gelicia deleted the kristina/transformation branch February 22, 2023 12:53
ryantxu pushed a commit that referenced this pull request Mar 2, 2023
* bring in source from database

* bring in transformations from database

* add regex transformations to scopevar

* Consolidate types, add better example, cleanup

* Add var only if match

* Change ScopedVar to not require text, do not leak transformation-made variables between links

* Add mappings and start implementing logfmt

* Add mappings and start implementing logfmt

* Remove mappings, turn off global regex

* Add example yaml and omit transformations if empty

* Fix the yaml

* Add logfmt transformation

* Cleanup transformations and yaml

* add transformation field to FE types and use it, safeStringify logfmt values

* Add tests, only safe stringify if non-string, fix bug with safe stringify where it would return empty string with false value

* Add test for transformation field

* Do not add null transformations object

* Break out transformation logic, add tests to backend code

* Fix lint errors I understand 😅

* Fix the backend lint error

* Remove unnecessary code and mark new Transformations object as internal

* Add support for named capture groups

* Remove type assertion

* Remove variable name from transformation

* Add test for overriding regexes

* Add back variable name field, but change to mapValue

* fix go api test

* Change transformation types to enum, add better provisioning checks for bad type name and format

* Check for expression with regex transformations
Linked issue: Glue: Perform transformation/extraction when a DataLink is created