ja2l

JSON array to lines. Expects a JSON array like this on standard input:

[
{"foo":"bar"},
{"abc":"xyz"},
{"whatever":0}
]

and converts it into a stream of JSON values on standard output:

{"foo":"bar"}
{"abc":"xyz"}
{"whatever":0}

This stream is suitable for processing with jq, and can also be split up for parallelized processing. (This format is also variously known as line delimited JSON (ldjson), newline delimited JSON (ndjson), or JSON lines (jsonl).)
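The splitting mentioned above can be sketched with standard tools alone. This is a minimal sketch, not part of ja2l: it assumes every JSON value sits on exactly one line (as in the example output), and the function name split_round_robin and the chunk file naming are made up for illustration.

```shell
#!/bin/sh
# Distribute a line-delimited JSON stream round-robin across N chunk files.
# Usage: split_round_robin N PREFIX   (reads the stream on standard input)
# Hypothetical helper: the name and the "PREFIX.<index>" naming are not from ja2l.
split_round_robin() {
    awk -v n="$1" -v prefix="$2" '
      {
        # NR starts at 1, so lines go to chunk 0, 1, ..., n-1, 0, ...
        print > (prefix "." ((NR - 1) % n))
      }
    '
}

# Example: the three values from above, spread over two chunk files.
tmpdir=$(mktemp -d)
printf '%s\n' '{"foo":"bar"}' '{"abc":"xyz"}' '{"whatever":0}' |
    split_round_robin 2 "$tmpdir/chunk"
```

Each chunk file is itself a valid line-delimited stream, so the chunks can be processed independently (for instance by separate jq processes) and the results combined afterwards.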

Project status

Functional, but more tests and various other goodies would be nice.

Build instructions

make && sudo make install. Needs glibc and GCC.

The Makefile follows standard GNU conventions; for instance, packagers can use something like make DESTDIR="$pkgdir" prefix=/usr install. (Note that standard $(bindir), $(man1dir), etc. directories are expected to exist, because the GNU Make manual says to not use the nonstandard mkdir -p and I don’t know how to create the directories without it.)
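Since the install directories have to exist beforehand, a packaging script may want to create them itself. The following is a sketch under assumptions: it presumes the GNU default layout (bindir is $(prefix)/bin, man1dir is $(prefix)/share/man/man1) and that these two are the only directories needed, so check the Makefile before relying on it. The function name prepare_install_dirs is made up.

```shell
#!/bin/sh
# Create the directories that "make install" expects to exist already.
# Assumption: GNU default layout; other directories may be needed as well.
# Usage: prepare_install_dirs DESTDIR PREFIX
prepare_install_dirs() {
    destdir=$1
    prefix=$2
    mkdir -p "$destdir$prefix/bin" "$destdir$prefix/share/man/man1"
}

# Example for a packaging script (pkgdir is the packager's staging directory):
#   prepare_install_dirs "$pkgdir" /usr
#   make DESTDIR="$pkgdir" prefix=/usr install
```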

Requirements

glibc (error, getopt_long)
Linux (/proc/sys/fs/pipe-max-size, fcntl(F_SETPIPE_SZ))
GCC (-fanalyzer; if you remove that from CFLAGS, clang may or may not work)
optional: dgsh

The reasons listed in parentheses are probably not exhaustive. (Most likely because I’ll forget to update them as I update the program.)
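To check whether the Linux-specific file from the list above is available on a given system, something like the following can be used. This is only a sketch: the function name show_pipe_max_size is made up, and how ja2l itself uses the value is not described here.

```shell
#!/bin/sh
# Print the system-wide limit (in bytes) up to which an unprivileged process
# may enlarge a pipe buffer, or a note if the file is missing (e.g. not on Linux).
show_pipe_max_size() {
    f=/proc/sys/fs/pipe-max-size
    if [ -r "$f" ]; then
        cat "$f"
    else
        echo "unavailable: $f is not readable"
    fi
}

show_pipe_max_size
```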

dgsh support

ja2l can be built with dgsh support. When used in a dgsh pipeline, it accepts zero or one inputs (depending on whether or not a file name was specified on the command line) and scatters the JSON values across any (nonzero) number of outputs.

To build ja2l with dgsh support, add -DUSE_DGSH to the CPPFLAGS and -ldgsh to the LDLIBS, e.g. like this:

make CPPFLAGS=-DUSE_DGSH LDLIBS=-ldgsh clean all

This can be used to speed up processing of the JSON data with jq, similar to this script:

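# Extract every entry of .elements and count identical lines.
# Prints "count<TAB>element" for each distinct element this invocation saw.
function countElements {
    jq -r '
      .elements |
      .[]
    ' | awk '
      {
        a[$0]++
      }
      END {
        for (k in a)
          print a[k] "\t" k
      }
    '
}

# Sum the partial counts from all countElements invocations and print
# the elements seen at least 1000 times, most common first.
function summarizeElements {
    awk -F'\t' '
      {
        a[$2] += $1
      }
      END {
        for (k in a)
          if (a[k] >= 1000)
            print a[k] "\t" k
      }
    ' |
    sort -nr
}

# Scatter the JSON values across four parallel countElements invocations,
# then gather their outputs with cat and merge the counts.
ja2l | {{
        countElements &
        countElements &
        countElements &
        countElements &
    }} |
    cat |
    summarizeElements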

This prints the most common “elements” in the JSON input, parallelizing the extraction and counting of elements across four countElements invocations. The results of those invocations are then aggregated into a single result list again. If the processing is CPU-bound, jq is the expensive part, and you have four processors or processor cores (without counting hyper-threading), this should speed up processing by about a factor of four.
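The aggregation step works because counts are additive: each worker reports how often it saw an element, and summing those partial counts gives the overall total. A small self-contained sketch of that merge, runnable without dgsh or jq, is shown here. The function name merge_counts, its threshold parameter, and the sample element names (Q5, Q42, Q1) are invented for illustration.

```shell
#!/bin/sh
# Merge per-worker "count<TAB>element" lists into overall totals.
# Same idea as summarizeElements, but with the threshold passed as $1
# (hypothetical variation for this sketch).
merge_counts() {
    awk -F'\t' -v min="$1" '
      { total[$2] += $1 }
      END {
        for (k in total)
          if (total[k] >= min)
            print total[k] "\t" k
      }
    ' | sort -nr
}

# Partial counts from two imaginary workers, concatenated into one stream.
printf '%s\t%s\n' 3 Q5 1 Q42 2 Q5 5 Q42 1 Q1 | merge_counts 2
```

With this sample data the merge prints a total of 6 for Q42 and 5 for Q5; Q1 has a total of 1 and falls below the threshold of 2.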

Attribution

The cleanup.h header file is based on systemd header files, which are published under the LGPL2.1+.

The install-related variables in the Makefile are copied from the GNU Make manual, which is published under the FDL1.3+.

License

The content of this repository is released under the AGPL3+ as provided in the LICENSE file that accompanied this code.

By submitting a “pull request” or otherwise contributing to this repository, you agree to license your contribution under the license mentioned above.
