Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Flag categorize labels on streams response #10419

Merged
merged 38 commits into from
Oct 25, 2023

Conversation

salvacorts
Copy link
Contributor

@salvacorts salvacorts commented Sep 1, 2023

We recently introduced support for ingesting and querying structured metadata in Loki. This adds a new dimension to Loki's labels since now we arguably have three categories of labels: stream, structured metadata, and parsed labels.

Depending on the origin of the labels, they should be used in LogQL expressions differently to achieve optimal performance. stream labels should be added to stream matchers, structured metadata labels should be used in a filter expression before any parsing expression, and parsed labels should be placed after the parser expression extracting them.

The Grafana UI has a hard time dealing with this same problem. Before grafana/grafana#73955, the filtering functionality in Grafana was broken since it was not able to distinguish between stream and structured metadata labels. Also, as soon as a parser expression was added to the query, filters added by Grafana would be appended to the end of the query regardless of the label category. The PR above implements a workaround for this problem but needs a better API on Loki's end to mitigate all corner cases.

Loki currently returns the following JSON for log queries:

...
{
	"stream": {
	  "cluster": "us-central",
	  "container": "query-frontend",
	  "namespace": "loki",
	  "level": "info",
	  "traceID": "68810cf0c94bfcca"
	},
	 "values": [
                    [
                        "1693996529000222496",
                        "1693996529000222496 aaaaaaaaa.....\n"
                    ],
          ...
},
{
	"stream": {
	  "cluster": "us-central",
	  "container": "query-frontend",
	  "namespace": "loki",
	  "level": "debug",
	  "traceID": "a7116cj54c4bjz8s"
	},
	 "values": [
                    [
                        "1693996529000222497",
                        "1693996529000222497 bbbbbbbbb.....\n"
                    ],
          ...
},
...

As can be seen, there is no way we can distinguish the category of each label.

This PR introduces a new flag X-Loki-Response-Encoding-Flags: categorize-labels that makes Loki return categorized labels as follows:

...
{
        "stream": {
	  "cluster": "us-central",
	  "container": "query-frontend",
	  "namespace": "loki",
	},
	"values": [
                    [
                        "1693996529000222496",
                        "1693996529000222496 aaaaaaaaa.....\n",
                       {
                           "structuredMetadata": {
                                "traceID": "68810cf0c94bfcca"
                           },
                          "parsed": {
                                 "level": "info"
                          }
                       }
                    ],
                    [
                        "1693996529000222497",
                        "1693996529000222497 bbbbbbbbb.....\n",
                       {
                           "structuredMetadata": {
                                "traceID": "a7116cj54c4bjz8s"
                           },
                          "parsed": {
                                 "level": "debug"
                          }
                       }
                    ],
    ...
},
...

Note that this PR only supports log queries, not metric queries. From a UX perspective, being able to categorize labels in metric queries doesn't have any benefit yet. Having said that, supporting this for metric queries would require some minor refactoring on top of what has been implemented here. If we decide to do that, I think we should do it on a separate PR to avoid making this PR even larger.

I also decided to leave out support for Tail queries to avoid making this PR even larger. Once this one gets merged, we can work to support tailing.


Note to reviewers

This PR is huge since we need to forward categorized all over the codebase (from parsing logs all the way to marshaling), fortunately, many of the changes come from updating tests and refactoring iterators.

Tested out in a dev cell with query '{stream="stdout"} | label_format new="text".

  • Without the new flag:
$ http http://127.0.0.1:3100/loki/api/v1/query_range\?direction\=BACKWARD\&end\=1693996529322486000\&limit\=30\&query\=%7Bstream%3D%22stdout%22%7D+%7C+label_format+new%3D%22text%22\&start\=1693992929322486000 X-Scope-Orgid:REDACTED
{
    "data": {
        "result": [
            {
                "stream": {
                    "new": "text",
                    "pod": "loki-canary-986bd6f4b-xqmb7",
                    "stream": "stdout"
                },
                "values": [
                    [
                        "1693996529000222496",
                        "1693996529000222496 pppppppppppp...\n"
                    ],
                    [
                        "1693996528499160852",
                        "1693996528499160852 pppppppppppp...\n"
                    ],
...
  • With the new flag
$ http http://127.0.0.1:3100/loki/api/v1/query_range\?direction\=BACKWARD\&end\=1693996529322486000\&limit\=30\&query\=%7Bstream%3D%22stdout%22%7D+%7C+label_format+new%3D%22text%22\&start\=1693992929322486000 X-Scope-Orgid:REDACTED X-Loki-Response-Encoding-Flags:categorize-labels
{
    "data": {
        "encodingFlags": [
            "categorize-labels"
        ],
        "result": [
            {
                "stream": {
                    "pod": "loki-canary-986bd6f4b-xqmb7",
                    "stream": "stdout"
                },
                "values": [
                    [
                        "1693996529000222496",
                        "1693996529000222496 pppppppppppp...\n",
                        {
                            "parsed": {
                                "new": "text"
                            }
                        }
                    ],
                    [
                        "1693996528499160852",
                        "1693996528499160852 pppppppppppp...\n",
                        {
                            "parsed": {
                                "new": "text"
                            }
                        }
                    ],
...

@salvacorts salvacorts force-pushed the salvacorts/flag-group-labels-response branch from d67e257 to b3cff8a Compare September 5, 2023 10:53
@salvacorts salvacorts marked this pull request as ready for review September 7, 2023 06:55
@salvacorts salvacorts requested a review from a team as a code owner September 7, 2023 06:55
Copy link
Contributor

@vlad-diachenko vlad-diachenko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks brilliant 💎 . great job @salvacorts

@@ -710,6 +711,272 @@ func TestQueryTSDB_WithCachedPostings(t *testing.T) {

}

func TestCategorizedLabels(t *testing.T) {
clu := cluster.New(nil, cluster.SchemaWithTDSBAndNonIndexedLabels)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's rename it to SchemaWithTDSBAndStructuredMetadata

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Copy link
Member

@owen-d owen-d left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the idea, but I'd personally prefer something that's not a header to determine version (version parameter in query string|body, different api version prefix in path), but that's a nit.

I also think it makes more sense and will be more efficient for us to not treat each stream distinct based on (labels, structuredMetadata, parsed), but rather labels alone. It also feels more semantically correct as different log lines with separate structuredMetadata or parsed labels are still part of the same stream.

For instance:

{
  "stream": {
    "labels": {
      "cluster": "us-central",
      "namespace": "loki",
      "container": "query-frontend",
    },
    "structuredMetadata": {
        "traceID": "68810cf0c94bfcca"
    },
    "parsed": {
         "level": "info"
     }
  },
  "values": [...]
}

should become

{
  "stream": {
    "cluster": "us-central",
    "namespace": "loki",
    "container": "query-frontend",
  },
  "values": [
    {
      "ts": 0,
      "entry": "entry0",
      "structuredMetadata": {
        "traceID": "68810cf0c94bfcca"
      },
      "parsed": {
        "level": "info"
      }
    },
    ...
  ]
}

There are a few places in codecs, response merging (iirc) that will benefit from the reduced number of streams this will produces. Since it also feels more semantically correct, I think it's the way to go.

@sandeepsukhani
Copy link
Contributor

sandeepsukhani commented Oct 3, 2023

I also think it makes more sense and will be more efficient for us to not treat each stream distinct based on (labels, structuredMetadata, parsed), but rather labels alone. It also feels more semantically correct as different log lines with separate structuredMetadata or parsed labels are still part of the same stream.

Would it not mean a difference in behaviour between metric and log queries? I can't think of a way to do a similar categorization of labels in metric queries. I think stream should have all the labels merged together.

Copy link
Contributor

@sandeepsukhani sandeepsukhani left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice work Salva! I have left some minor suggestions. I would like to do another review once those are taken care of.

Comment on lines 49 to 59
tsdbShipperSchemaWithStructuredMetadataConfigTemplate = `
schema_config:
configs:
- from: {{.curPeriodStart}}
store: tsdb
object_store: filesystem
schema: v13
index:
prefix: index_
period: 24h
`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we need not add this, you can just do:

	clu := cluster.New(nil, cluster.SchemaWithTSDB, func(c *cluster.Cluster) {
		c.SetSchemaVer("v13")
	})

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done thank you!

now = time.Now()
require.NoError(t, cliDistributor.PushLogLineWithTimestamp("lineA", now.Add(-1*time.Second), map[string]string{"job": "fake"}))
require.NoError(t, cliDistributor.PushLogLineWithTimestampAndStructuredMetadata("lineB", now.Add(-2*time.Second), map[string]string{"traceID": "123", "user": "a"}, map[string]string{"job": "fake"}))
require.NoError(t, tIngester.Restart())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if you are restarting ingester to test your changes with data served from queriers, I think you need to disable WAL as well. You can do it by setting "-ingester.wal-enabled=false" flag on the ingesters.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done thank you!

pkg/loghttp/entry.go Show resolved Hide resolved
pkg/loghttp/entry.go Show resolved Hide resolved
pkg/loghttp/entry.go Show resolved Hide resolved
return cached
}

return NewLabelsResult(b.buf.String(), hash, stream, structuredMetadata, parsed)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we are not filling back the cache here, is it intentional?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch. Thank you!

// cached.
assertLabelResult(t, expected, b.LabelsResult())
assertLabelResult(t, expected, actual)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

doesn't look like it would test the cache

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right, thank you.

Comment on lines +49 to +52
repeated LabelPairAdapter parsed = 4 [
(gogoproto.nullable) = false,
(gogoproto.jsontag) = "parsed,omitempty"
];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would love to avoid doing this change since it adds an unexpected field to the push payload. I know it is hard but maybe leave a todo and let us try fixing it in a separate PR to avoid overcomplicating this one.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would love to avoid doing this chang

Likewise! The problem is this is the object we move around everywhere an throughout our components so we need to send it around somewhere.

it adds an unexpected field to the push payload

I don't think that's a problem. Protobuf is extensible and backwards compatible when adding fields to the end of an object definition. So in this case, it's still backwards compatible.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not worried about backwards compatibility, my ony concern is having an unwanted field on the proto definitions on the write path. I understand it is hard to do it so maybe leave a todo for now.

@@ -310,16 +346,30 @@ func encodeStream(stream logproto.Stream, s *jsoniter.Stream) error {
s.WriteRaw(`"`)
s.WriteMore()
s.WriteStringWithHTMLEscaped(e.Line)
if len(e.StructuredMetadata) > 0 {

if categorizeLabels && (len(e.StructuredMetadata) > 0 || len(e.Parsed) > 0) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think for consistency we should write empty object({}) instead of skipping it when we have not structured metadata and parsed labels.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you mean always serializing:

{
    "structuredMetadata": {},
    "parsed": {}
}

If so, I think it's better not to do it. It will add a lot of overhead to the response.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviving this conversation due to @svennergr expecting {} to be returned here.

I'll create a follow-up PR to return a third item {} when there is no structured metadata nor parsed labels.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

@sandeepsukhani sandeepsukhani left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the changes look great. Left just a minor comments and then we are good to merge it.

if err := jsonparser.ObjectEach(value, func(key []byte, value []byte, dataType jsonparser.ValueType, _ int) error {
if dataType != jsonparser.String {
return jsonparser.MalformedStringError
if dataType == jsonparser.Object {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you please leave a comment saying we parse both push request and query response json here to make it clear for others?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Comment on lines +535 to +541
builder := labels.NewBuilder(lbls)
for _, label := range entry.StructuredMetadata {
builder.Del(label.Name)
}
for _, label := range entry.Parsed {
builder.Del(label.Name)
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this is on a hot path, it would be good to avoid doing this and try building the final response in the iterators. Maybe leave a todo and try doing it in a separate PR.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a todo

Comment on lines +49 to +52
repeated LabelPairAdapter parsed = 4 [
(gogoproto.nullable) = false,
(gogoproto.jsontag) = "parsed,omitempty"
];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not worried about backwards compatibility, my ony concern is having an unwanted field on the proto definitions on the write path. I understand it is hard to do it so maybe leave a todo for now.

hash := b.hasher.Hash(buf)
if cached, ok := b.resultCache[hash]; ok {
return cached
}
res := NewLabelsResult(buf.Copy(), hash)

res := NewLabelsResult(buf.String(), hash, buf, nil, nil)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should be doing a copy of the buf since it is reused?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch, done

Copy link
Contributor

@sandeepsukhani sandeepsukhani left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@sandeepsukhani sandeepsukhani merged commit 52a3f16 into main Oct 25, 2023
4 checks passed
@sandeepsukhani sandeepsukhani deleted the salvacorts/flag-group-labels-response branch October 25, 2023 10:16
salvacorts added a commit that referenced this pull request Oct 31, 2023
**What this PR does / why we need it**:

This is a follow-up PR for #10419
adding support for tailing.

I tested it on a dev cell and works fine.
<img width="1296" alt="image"
src="https://github.com/grafana/loki/assets/8354290/6177e0ca-02ce-48cd-a17f-0739dc3caa0a">


**Note**: With these changes, the JSON marshal unmarshal functions for
the tail are no longer used ([example][1]) so I think we can remove
them. Also, the new Tail response is no longer used, so we can also make
it an alias to the _legacy_ one. Let's do it on a follow-up PR to avoid
making this one bigger.

[1]:
https://github.com/grafana/loki/blob/52a3f16039dd5ff655fc3681257d99794f620ec4/pkg/loghttp/entry.go#L210-L238
salvacorts added a commit that referenced this pull request Nov 28, 2023
…red metadata labels (#11325)

This is a followup PR for #10419 wrt
this comment
#10419 (comment).

If there is no parsed nor structured-metadata labels in a given entry,
and the categorize-labels header is set, add an empty object to the
entry array. For example, we currently return:

```json
{
    "data": {
        "encodingFlags": [
            "categorize-labels"
        ],
        "result": [
            {
                "stream": {
                    "agent": "promtail",
                    "filename": "/var/log/nginx/json_access.log",
                    "host": "appfelstrudel",
                    "job": "nginx_access_log"
                },
                "values": [
                    [
                        "1701096036028751750",
                        "example line"
                    ]
                ]
            }
        ]
    }
}
```

But we want:
```json
{
    "data": {
        "encodingFlags": [
            "categorize-labels"
        ],
        "result": [
            {
                "stream": {
                    "agent": "promtail",
                    "filename": "/var/log/nginx/json_access.log",
                    "host": "appfelstrudel",
                    "job": "nginx_access_log"
                },
                "values": [
                    [
                        "1701096036028751750",
                        "example line",
                        {} <<<--- WE WANT THIS
                    ]
                ]
            }
        ]
    }
}
```
grafanabot pushed a commit that referenced this pull request Nov 28, 2023
…red metadata labels (#11325)

This is a followup PR for #10419 wrt
this comment
#10419 (comment).

If there is no parsed nor structured-metadata labels in a given entry,
and the categorize-labels header is set, add an empty object to the
entry array. For example, we currently return:

```json
{
    "data": {
        "encodingFlags": [
            "categorize-labels"
        ],
        "result": [
            {
                "stream": {
                    "agent": "promtail",
                    "filename": "/var/log/nginx/json_access.log",
                    "host": "appfelstrudel",
                    "job": "nginx_access_log"
                },
                "values": [
                    [
                        "1701096036028751750",
                        "example line"
                    ]
                ]
            }
        ]
    }
}
```

But we want:
```json
{
    "data": {
        "encodingFlags": [
            "categorize-labels"
        ],
        "result": [
            {
                "stream": {
                    "agent": "promtail",
                    "filename": "/var/log/nginx/json_access.log",
                    "host": "appfelstrudel",
                    "job": "nginx_access_log"
                },
                "values": [
                    [
                        "1701096036028751750",
                        "example line",
                        {} <<<--- WE WANT THIS
                    ]
                ]
            }
        ]
    }
}
```

(cherry picked from commit 10210b8)
salvacorts added a commit that referenced this pull request Nov 28, 2023
…structured metadata labels (#11331)

Backport 10210b8 from #11325

---

This is a followup PR for #10419 wrt
this comment
#10419 (comment).

If there is no parsed nor structured-metadata labels in a given entry,
and the categorize-labels header is set, add an empty object to the
entry array. For example, we currently return:

```json
{
    "data": {
        "encodingFlags": [
            "categorize-labels"
        ],
        "result": [
            {
                "stream": {
                    "agent": "promtail",
                    "filename": "/var/log/nginx/json_access.log",
                    "host": "appfelstrudel",
                    "job": "nginx_access_log"
                },
                "values": [
                    [
                        "1701096036028751750",
                        "example line"
                    ]
                ]
            }
        ]
    }
}
```

But we want:
```json
{
    "data": {
        "encodingFlags": [
            "categorize-labels"
        ],
        "result": [
            {
                "stream": {
                    "agent": "promtail",
                    "filename": "/var/log/nginx/json_access.log",
                    "host": "appfelstrudel",
                    "job": "nginx_access_log"
                },
                "values": [
                    [
                        "1701096036028751750",
                        "example line",
                        {} <<<--- WE WANT THIS
                    ]
                ]
            }
        ]
    }
}
```

Co-authored-by: Salva Corts <salva.corts@grafana.com>
rhnasc pushed a commit to inloco/loki that referenced this pull request Apr 12, 2024
We recently introduced support for ingesting and querying structured
metadata in Loki. This adds a new dimension to Loki's labels since now
we arguably have three categories of labels: _stream_, _structured
metadata_, and _parsed_ labels.

Depending on the origin of the labels, they should be used in LogQL
expressions differently to achieve optimal performance. _stream_ labels
should be added to stream matchers, _structured metadata_ labels should
be used in a filter expression before any parsing expression, and
_parsed_ labels should be placed after the parser expression extracting
them.

The Grafana UI has a hard time dealing with this same problem. Before
grafana/grafana#73955, the filtering
functionality in Grafana was broken since it was not able to distinguish
between _stream_ and _structured metadata_ labels. Also, as soon as a
parser expression was added to the query, filters added by Grafana would
be appended to the end of the query regardless of the label category.
The PR above implements a workaround for this problem but needs a better
API on Loki's end to mitigate all corner cases.

Loki currently returns the following JSON for log queries:
```json
...
{
	"stream": {
	  "cluster": "us-central",
	  "container": "query-frontend",
	  "namespace": "loki",
	  "level": "info",
	  "traceID": "68810cf0c94bfcca"
	},
	 "values": [
                    [
                        "1693996529000222496",
                        "1693996529000222496 aaaaaaaaa.....\n"
                    ],
          ...
},
{
	"stream": {
	  "cluster": "us-central",
	  "container": "query-frontend",
	  "namespace": "loki",
	  "level": "debug",
	  "traceID": "a7116cj54c4bjz8s"
	},
	 "values": [
                    [
                        "1693996529000222497",
                        "1693996529000222497 bbbbbbbbb.....\n"
                    ],
          ...
},
...
```

As can be seen, there is no way we can distinguish the category of each
label.

This PR introduces a new flag `X-Loki-Response-Encoding-Flags:
categorize-labels` that makes Loki return categorized labels as follows:

```json
...
{
        "stream": {
	  "cluster": "us-central",
	  "container": "query-frontend",
	  "namespace": "loki",
	},
	"values": [
                    [
                        "1693996529000222496",
                        "1693996529000222496 aaaaaaaaa.....\n",
                       {
                           "structuredMetadata": {
                                "traceID": "68810cf0c94bfcca"
                           },
                          "parsed": {
                                 "level": "info"
                          }
                       }
                    ],
                    [
                        "1693996529000222497",
                        "1693996529000222497 bbbbbbbbb.....\n",
                       {
                           "structuredMetadata": {
                                "traceID": "a7116cj54c4bjz8s"
                           },
                          "parsed": {
                                 "level": "debug"
                          }
                       }
                    ],
    ...
},
...
```

Note that this PR only supports log queries, not metric queries. From a
UX perspective, being able to categorize labels in metric queries
doesn't have any benefit yet. Having said that, supporting this for
metric queries would require some minor refactoring on top of what has
been implemented here. If we decide to do that, I think we should do it
on a separate PR to avoid making this PR even larger.

I also decided to leave out support for Tail queries to avoid making
this PR even larger. Once this one gets merged, we can work to support
tailing.

---

**Note to reviewers**

This PR is huge since we need to forward categorized all over the
codebase (from parsing logs all the way to marshaling), fortunately,
many of the changes come from updating tests and refactoring iterators.

Tested out in a dev cell with query `'{stream="stdout"} | label_format
new="text"`.
- Without the new flag:
```
$ http http://127.0.0.1:3100/loki/api/v1/query_range\?direction\=BACKWARD\&end\=1693996529322486000\&limit\=30\&query\=%7Bstream%3D%22stdout%22%7D+%7C+label_format+new%3D%22text%22\&start\=1693992929322486000 X-Scope-Orgid:REDACTED
{
    "data": {
        "result": [
            {
                "stream": {
                    "new": "text",
                    "pod": "loki-canary-986bd6f4b-xqmb7",
                    "stream": "stdout"
                },
                "values": [
                    [
                        "1693996529000222496",
                        "1693996529000222496 pppppppppppp...\n"
                    ],
                    [
                        "1693996528499160852",
                        "1693996528499160852 pppppppppppp...\n"
                    ],
...
```

- With the new flag
```
$ http http://127.0.0.1:3100/loki/api/v1/query_range\?direction\=BACKWARD\&end\=1693996529322486000\&limit\=30\&query\=%7Bstream%3D%22stdout%22%7D+%7C+label_format+new%3D%22text%22\&start\=1693992929322486000 X-Scope-Orgid:REDACTED X-Loki-Response-Encoding-Flags:categorize-labels
{
    "data": {
        "encodingFlags": [
            "categorize-labels"
        ],
        "result": [
            {
                "stream": {
                    "pod": "loki-canary-986bd6f4b-xqmb7",
                    "stream": "stdout"
                },
                "values": [
                    [
                        "1693996529000222496",
                        "1693996529000222496 pppppppppppp...\n",
                        {
                            "parsed": {
                                "new": "text"
                            }
                        }
                    ],
                    [
                        "1693996528499160852",
                        "1693996528499160852 pppppppppppp...\n",
                        {
                            "parsed": {
                                "new": "text"
                            }
                        }
                    ],
...
```
rhnasc pushed a commit to inloco/loki that referenced this pull request Apr 12, 2024
**What this PR does / why we need it**:

This is a follow-up PR for grafana#10419
adding support for tailing.

I tested it on a dev cell and works fine.
<img width="1296" alt="image"
src="https://github.com/grafana/loki/assets/8354290/6177e0ca-02ce-48cd-a17f-0739dc3caa0a">


**Note**: With these changes, the JSON marshal unmarshal functions for
the tail are no longer used ([example][1]) so I think we can remove
them. Also, the new Tail response is no longer used, so we can also make
it an alias to the _legacy_ one. Let's do it on a follow-up PR to avoid
making this one bigger.

[1]:
https://github.com/grafana/loki/blob/52a3f16039dd5ff655fc3681257d99794f620ec4/pkg/loghttp/entry.go#L210-L238
rhnasc pushed a commit to inloco/loki that referenced this pull request Apr 12, 2024
…red metadata labels (grafana#11325)

This is a followup PR for grafana#10419 wrt
this comment
grafana#10419 (comment).

If there is no parsed nor structured-metadata labels in a given entry,
and the categorize-labels header is set, add an empty object to the
entry array. For example, we currently return:

```json
{
    "data": {
        "encodingFlags": [
            "categorize-labels"
        ],
        "result": [
            {
                "stream": {
                    "agent": "promtail",
                    "filename": "/var/log/nginx/json_access.log",
                    "host": "appfelstrudel",
                    "job": "nginx_access_log"
                },
                "values": [
                    [
                        "1701096036028751750",
                        "example line"
                    ]
                ]
            }
        ]
    }
}
```

But we want:
```json
{
    "data": {
        "encodingFlags": [
            "categorize-labels"
        ],
        "result": [
            {
                "stream": {
                    "agent": "promtail",
                    "filename": "/var/log/nginx/json_access.log",
                    "host": "appfelstrudel",
                    "job": "nginx_access_log"
                },
                "values": [
                    [
                        "1701096036028751750",
                        "example line",
                        {} <<<--- WE WANT THIS
                    ]
                ]
            }
        ]
    }
}
```
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants