Description
I would like to propose adding a new JQnl (or JQLines) filter to the script package. The filter would process newline-delimited JSON data by applying a JQ query to each JSON object in the input. It should work with both pretty-printed and compact (jq -c; one JSON object per line) JSON input.
Newline-delimited JSON is widely used for streaming data where each line represents a self-contained JSON object. The current JQ method, unlike the jq command line utility, only processes a single JSON object from the input, making it difficult to process streams of data.
Use cases
- Log Analysis: Application logs are often output as JSON objects, one per line. With JQnl, users could extract and transform specific fields from log files, filtering for specific log levels, error messages, etc.
- Processing Paginated API Results: When dealing with APIs that return paginated results, JQnl would allow processing paginated responses as part of a script pipeline.
- Data ETL Workflows: For Extract-Transform-Load workflows where each record is a separate JSON object, the method would streamline the processing of large data sets by applying the query to each record as it is read from the stream, rather than loading the entire input into memory before applying the JQ filter.
Concrete example for the Log Analysis use case
Extract all warning and error messages from a JSON log file, e.g. the output of slog.
/tmp/log.json - the mixed compact, pretty-printed, compact format is intentional, for illustration purposes.
{"time": "2025-03-17T18:04:26.534789-07:00", "level": "INFO", "msg": "info message"}
{
  "time": "2025-03-17T18:04:26.534946-07:00",
  "level": "WARN",
  "msg": "warn message"
}
{"time": "2025-03-17T18:04:26.534953-07:00", "level": "ERROR", "msg": "error message"}
jq - command line utility, for reference
$ cat /tmp/log.json | jq 'select(.level=="WARN" or .level=="ERROR") | .msg'
"warn message"
"error message"
JQ - script.Stdin().JQ(os.Args[1]).Stdout()
$ cat /tmp/log.json | ./scriptJQ 'select(.level=="WARN" or .level=="ERROR") | .msg'
$ # no output, since `JQ` only processes the first JSON object
JQnl - script.Stdin().JQnl(os.Args[1]).Stdout()
$ cat /tmp/log.json | ./scriptJQnl 'select(.level=="WARN" or .level=="ERROR") | .msg'
"warn message"
"error message"
Sample implementation of JQnl:
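// JQnl applies the JQ query to each JSON value read from the pipe and
// writes every result on its own line.
func (p *Pipe) JQnl(query string) *Pipe {
	return p.Filter(func(r io.Reader, w io.Writer) error {
		// Parse and compile the query once, before reading any input.
		q, err := gojq.Parse(query)
		if err != nil {
			return err
		}
		code, err := gojq.Compile(q)
		if err != nil {
			return err
		}
		// The decoder reads one JSON value at a time, so compact and
		// pretty-printed objects are handled alike.
		dec := json.NewDecoder(r)
		for dec.More() {
			var input interface{}
			err := dec.Decode(&input)
			if err != nil {
				return err
			}
			// A query may yield zero or more results per input value.
			iter := code.Run(input)
			for {
				v, ok := iter.Next()
				if !ok {
					break
				}
				if err, ok := v.(error); ok {
					return err
				}
				result, err := gojq.Marshal(v)
				if err != nil {
					return err
				}
				_, err = fmt.Fprintln(w, string(result))
				if err != nil {
					return err
				}
			}
		}
		return nil
	})
}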