Releases: ldn-softdev/jtc
Latest Builds location
Holding latest builds (the latest build: October 5, 2020)
Changes up till now:
- issues #16, #17, #18: no functional impact, code safety improvements
- compiling issues #19, #20
- issue #21: fixed an uncaught exception that might occasionally be thrown in peculiar walks (UT'd)
- issue #22: fixed a nasty performance regression noticeable on big JSONs for lexemes supporting interpolation: <..>R, <..>L, <..>D, <..>j (UT'd)
- fix for the generated auto-tokens issue introduced in the prior build (UT'd)
- fixed issues #27, #28 (affecting Linux only)
- fixed issues #29, #32, improved per-walk template behavior #31
- fixed/improved template-argument behavior in options -u/-i/-c: the behavior should match the behavior of the -T option (string-interpolation of iterables might have produced different results)
- fixed a crash that could potentially occur when a JSON root undergoes interpolation
- fixed an issue (#33) where a non-initial Regex lexeme might not get engaged (a regression from v1.76)
- added template auto-token $wuid, which refers to a deterministic walk's unique id for each walk given by the user (handy for making collections of JSON elements per walk)
- introduced flow-control for walk loops using <>f .. ><f pairs: this is a use-case to resolve recursive lookup chains
- improvements:
  - improved namespace passing between option chain-sets and for -p/-s options
  - improved trailing backslash parsing in all lexemes
  - enabled walks over a templated argument in -i/-u/-c options (also enabled namespace passing to the template)
  - reinstated namespace passing between interleaved walks (a regression: the functionality was lost after re-designing namespaces in v1.76)
  - improved behavior for lexeme <..>I when initializing a namespace
  - enhanced performance for tokens {{}}, {{..}} (when the tokens are used standalone, no interpolation is needed and the JSON can be retrieved directly from walks)
  - added a use-case for label interpolations when string-interpolating iterables (handy for generating headers from labels/indices for CSV output), e.g.: <<<'{"tbl":["a","b","c"]}' jtc -w'[tbl]<:>k' -qqT'"{}"' will generate 0, 1, 2 output (instead of a b c) - that is predicated by the last walked <:>k directive (lexeme spelling in this case is limited to :).
  - improved label ordering in JSON objects: now numerical labels (those made of digits only) are ordered numerically, while all other labels are ordered literally
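The label-ordering rule above can be modelled with a small sort key. This is a Python sketch for illustration only, not jtc's actual C++ code: the name `label_sort_key` is hypothetical, and placing numeric labels ahead of all other labels is an assumption that the notes do not state.

```python
def label_sort_key(label: str):
    # Labels made of ASCII digits only are compared by numeric value;
    # every other label is compared literally, as a plain string.
    # Assumption (not stated in the notes): numeric labels sort first.
    if label.isascii() and label.isdigit():
        return (0, int(label), "")
    return (1, 0, label)

labels = ["10", "b", "2", "a", "1"]
ordered = sorted(labels, key=label_sort_key)
print(ordered)
```

A purely literal sort would put "10" before "2"; the numeric key avoids that for digit-only labels.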
Items to accomplish before the next release:
- introduced $@ auto-REGEX namespace - it holds all the RE matches (entire matches, or group matches) in a JSON array. That way it's easy to split strings, e.g.: <<<'"abc, def, ghi"' jtc -w'<[^, ]+>R' -rT'{{$@}}' produces output: [ "abc", "def", "ghi" ]
- redesign and enhance internal template-interpolate logic: currently all interpolations are done via JSON serialization / deserialization, which is slow - rework the Json class to allow parsing templates and rewrite interpolation so that it's done via binary construction (the serdes way will remain only for string interpolations).
- introduce a couple of variants of the <..>v directive:
  - <var:<JSN/TMP>>v1 allows saving a JSON spelled literally, or out of a template, right into a namespace (currently any lexeme value in the <..>v directive is either a JSON or promoted to a JSON string)
  - <var:[{{$PATH}}, <JSON/TMP>]>v2 - the JSON in this form allows reconstructing a JSON in a namespace (i.e., incrementally building up a JSON in the namespace)
- implement streamed parsing of JSON (i.e. in a format similar to the one produced by this walk: jtc -rw'<>e:' -T'[{{$PATH}}, {{}}]') - this would allow processing virtually endless JSONs w/o any memory pressure (parsing of such streamed JSON will be done in a concurrent thread)
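For readers less familiar with jtc lexemes, the string-splitting idea behind the $@ example can be illustrated outside of jtc. The sketch below uses Python's standard `re.findall` as a stand-in for collecting all regex matches into an array; it is an analogy and says nothing about how jtc implements $@.

```python
import re

text = "abc, def, ghi"
# Collect every maximal run of characters that are neither commas nor
# spaces, mirroring the [^, ]+ expression from the jtc example above.
matches = re.findall(r"[^, ]+", text)
print(matches)
```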
1.76
Release Notes for jtc v.1.76
New features and enhancements:
- when multiple files are given, jtc now reads/parses all files concurrently (on a multi-core cpu); to disable multithreading (and process files sequentially) give option -a (normally, the option is implied and redundant when multiple files are given)
- a new lexeme directive <..>S - complements directive <..>W: walks the JSON tree as per the preserved path
- when the file argument for options -i/-u/-c contains a stream of JSONs, it's automatically converted into an array of JSONs
- template operations enhancements:
  - an argument for options -i/-u/-c can now additionally hold a template (e.g.: -u0 -T<template> could now be collapsed into -u<template>)
  - regex search lexemes (<..>R, <..>L, <..>D) are now subjected to template interpolation as well, though namespace usage in such lexemes is limited to alphabetical names only (because numeric names would clash with regex quantifiers) - template interpolation occurs before the regex is applied
  - auto-generated label tokens for template interpolation ($A, $B, etc) now also hold indices if the respective values are in an array (it used to work only for objects)
  - walked atomic values can now also be represented in templates using auto-generated tokens ($A and $a for a label/index and a value respectively) for easier template-interpolation operations
  - setting namespace $? to any value (even an empty one) in a walk triggers resetting of the respective auto-token $? (which holds historical values) to the default value "" (it's a user-controlled way to reset the token, in addition to the existing trigger - template interpolation failure)
  - when string-interpolating an iterable (array or object) via the "{}" token, all atomic values within the iterable get interpolated into the string recursively
  - improved template stringification (>{{}}<) - the operation is now consistent across all JSON types (null / bool / numeric used to behave differently)
  - limited usage of auto-generated tokens (e.g.: $abc) to 3 letters only (to avoid clashing with tokens like $file and all future tokens) - the use case for auto-generated tokens is template-interpolation for relatively short arrays / objects, thus 3 letters is sufficient to address iterables up to 18278 values in size
  - extended range of auto-tokens representation in iterables ($a, $b, etc): initially each token represents a respective top level JSON element of the iterable, beyond that range each next auto-token will represent an atomic value of the JSON tree as if it were walked recursively
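The figure of 18278 addressable values follows from counting all lowercase names of one, two and three letters: 26 + 26^2 + 26^3. The Python sketch below enumerates such names to check the count. The generator `auto_tokens` is hypothetical, and the enumeration order ($a..$z, then $aa..$zz, then $aaa..$zzz) is an assumption; only the total is taken from the notes.

```python
from itertools import product
from string import ascii_lowercase

def auto_tokens(max_len: int = 3):
    # Yield one-letter names first, then two-letter, then three-letter
    # ones (the ordering is an assumption made for illustration).
    for length in range(1, max_len + 1):
        for letters in product(ascii_lowercase, repeat=length):
            yield "$" + "".join(letters)

tokens = list(auto_tokens())
total = len(tokens)
print(total)  # 26 + 26**2 + 26**3
```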
Improvements, changes, fixes:
- behavior improvements:
  - redesigned and improved processing of options chain-sets logic: lifted a caveat of using -J/-j/-a in intermediate chain-sets (now it works in line with the expected option behavior in any of the option sets)
  - when unquoting strings with -qq, translation of UTF-8 code points (e.g.: \uD123), as well as correct processing of UTF-8 surrogate pairs, was added
  - improved label update operations: now any atomic value (null / boolean / numeric) can also update a label (before, labels could be updated only with string types)
  - improved namespace behavior for -p/-s operations (now namespaces from the respective walks are not lost in such operations and can be reused later)
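As background for the surrogate-pair item above: in JSON text, a code point above U+FFFF is written as two \uXXXX escapes, a high surrogate followed by a low one. The Python sketch below shows the standard arithmetic for combining such a pair; the function name `combine_surrogates` is hypothetical and the code is not taken from jtc.

```python
import json

def combine_surrogates(high: int, low: int) -> str:
    # A high surrogate lies in D800..DBFF and a low surrogate in DC00..DFFF.
    if not (0xD800 <= high <= 0xDBFF and 0xDC00 <= low <= 0xDFFF):
        raise ValueError("not a valid surrogate pair")
    code_point = 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)
    return chr(code_point)

# The JSON string "\uD83D\uDE00" denotes the single code point U+1F600.
ch = combine_surrogates(0xD83D, 0xDE00)
print(hex(ord(ch)))

# Cross-check against Python's own JSON parser.
decoded = json.loads('"\\ud83d\\ude00"')
print(decoded == ch)
```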
- performance improvements:
  - redesigned and improved namespaces storage policy so that it does not slow down walks (used to be the case, noticeable when storing big JSONs)
  - optimized performance for -e with -i/-u shell executions, where all such walks are attempted to be executed in a single run (popen session), otherwise defaulted to a legacy (slower) way (to enforce the legacy way give -ee)
- code design improvements:
  - added compile options:
    - -DBG_dTS (effective only in conjunction with -DBG_mTS or -DBG_uTS) - debug timestamps display delta instead of absolute stamps (handy for cpu profiling)
    - -DNDBG_PARSER: disables parsing debugs - handy when deep debugging huge JSONs (to skip the parsing part)
  - sped up template interpolations (by breaking away from catching JSON parsing exceptions towards processing parsing by return value)
  - improved performance when outputting walked elements (-w)
  - improved debug outputs when displaying JSONs longer than the term width (the same update ensures correct displaying of UTF-8 strings)
- various fixes:
  - fixed locality of <>q, <>Q searches: it accidentally became global after the last redesign of lexeme implementation, now it's local to the search tree (UT'ed)
  - fixed accidentally broken options translation in the built-in mini-guide (-g)
  - fixed a rogue debug level when debugging the -e option
  - fixed a very corner crash occurring upon -u/-i based source walks, predicated on -pp option usage and only when the resulting walks get invalidated by any of the prior walks (UT'ed of course)
  - fixed an issue when last walk control (-x0 or -x/-1) worked in the first JSON but did not work in any subsequent one, if there were multiple (UT'ed)
1.75d
Release Notes for jtc v.1.75d
New features:
- performance improvements and some more fixes:
Improvements, changes, fixes:
- completely reworked the logic of <>g, <>G, <>q, <>Q lexemes by externalizing their storages into standalone caches, which made them run as fast as a bare metal sort without slowing down walking
- removed some superfluous optimization in the interpolation logic (it was limiting some corner use-cases)
- a parsed quoted solidus (\/) is now always translated into an unquoted one (/), unless -q is given, which restricts the behavior to quoted-only
- option -nn does not engulf -n now (i.e. if both behaviors are required then both are to be spelled: -nnn)
- added a token $file holding the name of the currently processed input file, so that it could be interpolated if required
- improved $PATH token interpolation so that the namespace $# can also be utilized with it (upon interpolation into a string template)
- reinstated -mm behavior (advertised in the last version but missed)
- fixed engagement of lexeme <..>u in interim options sets
- fixed quite a rare misbehavior of branching lexemes <>f ... <>F
1.75c
Release Notes for jtc v.1.75c
New features:
- No new features, some more minor improvements and fixes:
Improvements, changes, fixes:
- for all iterables undergoing template interpolation, generate auto-tokens $a, $b, etc (and $A, $B, etc) for all values (and for objects' respective labels)
- for lexemes setting JSON in the namespace, e.g.: <ns:..>v, if parsing the JSON value fails - try promoting it to a JSON string first, and only if that fails too then throw an exception
- made options -z, -zz non-transient (i.e. to be used only in the final options set)
- some code fixes for MacPorts compatibility
- fixed issue: interpolation of the $? token should work even w/o the -x0 (-x/-1) option
1.75b
Release Notes for jtc v.1.75b
New features:
- Quick fixes for overlooks in a design of the new features, which sneaked past UT:
Improvements, changes, fixes:
- fixed issue: when shell evaluation fails, it might break options -ei/-eu logic
- fixed/improved handling of the ; char in shell eval operations -ei, -eu: treat only a standalone occurrence of \; as the terminating symbol (and not when it's a trailing character - to allow cli chaining in the argument)
- fixed issue: all non-transient output view options -qq, -r, -t and -f, plus a bare qualifier -, should be ignored in all the interim chain sets but the last one (except the bare qualifier -, which has a global scope, i.e. cited in any of the chained option sets it will force initial reading from stdin)
- fixed issue: accidentally broken bare qualifier - (input redirect)
- fixed issue: -f option for chained sets; also extended -f: now it forces any output to file, allowing redirecting even walks
- option -z now outputs size in a JSON compatible format, e.g.: { "size": 100 }
jtc v1.75 - JSON transformational chains
Release Notes for jtc v.1.75
New features:
- introduced a new semi-compact printing view. The view is engaged when the suffix c is appended to the -t option, e.g.: -t2c, -tc. The semi-compact view is a middle ground between compact (-r) and pretty-printed (-t, default) views: when a JSON iterable is made of atomic values only (and/or empty iterables {}, []), it will be printed in a compact (one-line) format, the rest is pretty-printed
- introduced operations chaining via delimiter /:
  - chaining delimiter(s) pretty much replace the jtc ... | jtc ... | jtc ... notation with jtc ... / ... / ... - the advantage is huge: jtc is now capable of processing multiple chained operations w/o printing-parsing interim JSONs (which is quite an expensive operation) - that speeds up operations and simplifies notation. Another benefit is that it becomes possible to pass namespace(s) from one chain set into another (which is impossible with piping notation)
  - chain-delimiter / only splits option notations; it does not work when cited among file arguments
- introduced an optional step notation in range subscripts and search lexeme quantifiers: [N:M:S], <..>N:M:S: S must be a strictly positive value. In search quantifiers <..>::{S}, if after interpolation the value happens to be negative (or zero) then the default step 1 is applied
- new search lexemes <..>g and <..>G allow going over JSON elements in a sorted order (ascending and descending respectively). When applied w/o quantifiers, they allow finding min and max values respectively
- a new directive <..>Z - preserves into a namespace a selected (walked) JSON entry size (recursive and non-recursive behaviors applied respectively). The <..>Z1 lexeme (i.e., with quantifier 1) saves into a namespace the currently walked JSON string size (if the walked JSON is not a string, the value -1 is saved)
- a new lexeme <..>W - preserves the current walk-path (as a JSON array) into a namespace variable
- introduced a new parsing behavior (-mm) allowing accepting ill-formed JSONs with clashing labels by collecting them into arrays (e.g.: { "a": 1, "a": 2 } will be parsed into { "a": [ 1, 2] })
- rebranded jtc into JSON transformational chains to better reflect the tool's purpose and capability
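The clashing-label behavior described for -mm can be imitated with Python's standard json module, which lets a caller decide what to do with repeated labels. This is an illustrative sketch only: `collect_clashes` is a hypothetical helper, and appending a third and later clashing value to the same array is an assumption (the notes show only the two-value case).

```python
import json

def collect_clashes(pairs):
    # Build an object from (label, value) pairs; when a label repeats,
    # gather its values into a list instead of overwriting.
    result = {}
    clashed = set()
    for label, value in pairs:
        if label not in result:
            result[label] = value
        elif label in clashed:
            result[label].append(value)
        else:
            result[label] = [result[label], value]
            clashed.add(label)
    return result

parsed = json.loads('{ "a": 1, "a": 2 }', object_pairs_hook=collect_clashes)
print(parsed)
```

Without the hook, Python's parser keeps only the last value, which matches the override behavior these notes describe for -jj.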
Improvements, changes, fixes:
- enhanced template interpolation (-T...):
  - removed prior limitations: now, application of templates is universal to all operations - executed as the last step for the respective walk(s)
  - extended template-interpolations of JSON iterables into strings: the former could be interpolated into the string values as enumerations: the enumeration separator value (default ", ") will be taken from the newly introduced namespace $#
- new namespaces added:
  - $#: holds the separator used when a JSON iterable is interpolated into a JSON string (default value ", ")
  - $_: holds the separator used when $path is interpolated to join path tokens (default value "_")
  - $$?: holds the separator used upon template expansion when interpolation token {$?} is used (default value ",")
- introduced quantifiers for the F directive (both recursive and non-recursive):
  - a new semantic for the <>Fn quantifier: if n > 0 (i.e. non-default), it lets walking continue past the <>Fn directive, skipping to the nth lexeme from F: e.g.: <>F1 will continue walking right from the immediately following lexeme, <>F2 will continue walking from the 2nd lexeme past <>Fn (i.e., skipping the first one), etc.
  - a new semantic for the ><Fn quantifier: if n > 0 (i.e. non-default), it allows additional replications of the entire walk (before the lexeme ><F) n times
- enhanced <..>I directive behavior:
  - initialization of the namespace value can now be done within the lexeme itself, e.g.: <c:100>I1 will initialize counter c with the value 100 before the directive executes (unlike the typical behavior where namespace initialization/preservation is applied as the last step at the end of lexeme walking)
  - a new additional semantic for the <..>In:m quantifier, where n is an increment step (as before, no changes here) and m is a new multiplier (integer only), e.g.: <a:10>I5:2, after the first walking the namespace a will hold (10 + 5) * 2 = 30 - in such notation, first the increment is applied and then the multiplier
  - the directive now also understands an empty token {} for the increment and/or the multiplier: <..>I{}:{} - the empty token will refer to the currently walked (numeric) value - this is the only lexeme where such an empty token notion makes sense and is supported
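The order of operations in the <..>In:m quantifier (increment first, then multiply) can be written out as plain arithmetic. The Python sketch below reproduces only the single step given in the notes; `apply_increment` is a hypothetical name, and how the value evolves on later walks is not modelled here.

```python
def apply_increment(value: int, n: int, m: int = 1) -> int:
    # First add the increment n, then multiply the sum by m.
    return (value + n) * m

a = 10                        # <a:10> initializes namespace a with 10
a = apply_increment(a, 5, 2)  # I5:2 applied on the first walk
print(a)
```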
- improved -jj option behavior: now the clashing labels will override each other (thus, only the last value will be retained); to collect even clashing labels (into an array) use the -m modifier
- improved behavior of the -ll toggle - now it gleans all the labels, not just the first one (as before) - typically used together with -j
performance improvements:
- in the JSON library, for
ARY
/OBJ
declarations stepped away fromstd::initializer_list
to variadic templated arguments (that permits use of move semantic now in the initialization notations, which simplified the usage and improved performance - improved performance of buffered read from
<stdin>
(now, it's almost as fast as the read from files) - same way improved performance of file read in options (
-i
,-u
,-c
) - drastically improved performance of
<>q
,<>Q
searches by making them cacheable: they are still quite memory hungry, still are the slowest among all searches, but now they are not prone to exponential decay and can be used on big JSONs with a predictable processing time
- in the JSON library, for
- added a few compilation options:
  - -DBG_FLOW: a new debug of the execution flows (tracing an entry and exit point of every DEBUGGABLE function/method). Add the -DBG_FLOW flag when compiling to effectuate such debugging - it nicely complements the -DBG_CC flag when debugging copy-constructors for optimization
  - -DBG_mTS: lets debugging output have time-stamps with millisecond accuracy
  - -DBG_uTS: lets debugging output have time-stamps with microsecond accuracy
- debuggability improvements:
  - added printing backtrace in the unlikely event of a crash (only when debug is enabled). On MacOS/BSD it will print demangled back-tracing
  - improved parsing output when debugged - now it'll be auto-adjusted to the terminal's width
- program design improvements:
  - simplified program design for all cases of source/destination walks - that also fixed the prior caveat with label updates through the shell evaluation (now even nested labels can be updated, the caveat is removed)
  - made directive handling more logical where applicable - now, a directive is activated only once per walk pass (applied to directives z, Z, W, v, k, I)
  - improved/fixed shell evaluation (-e with -i/-u) argument parsing behavior for Linux/GNU only (MacOS/BSD were fine - the GNU implementation of getopt() works differently than MacOS/BSD's)
- more fixes and enhancements:
  - fixed issue: directives <..>I and <..>u also must support interpolated name-spaced quantifiers: <..>I{ns}, <...>u{ns}
  - fixed issue: fail-safe <>f directive should not fire after there have been successful matches in iterables
  - fixed/improved parsing of the <..>j search lexeme when the content is a template
  - fixed Linux options parsing (to behave the same way as on MacOS/BSD)
  - fixed a corner crash when move semantics is applied on multiple walks and the prior walk deletes the object pointed to by the subsequent, interleaved walk
  - fixed a corner crash when a search lexeme (i, o, c, etc) was matching a root iterable (array/object) and at the same time attempted saving it into a namespace
  - fixed a crash when blank (or white space only) input was combined with the streamed read (-a)
JSON for performance tests
This is a sample JSON used in performance testing (the JSON was generated from the XML file)
jtc v1.74 - JSON test console
Release Notes for jtc v.1.74
New features:
- No new features, some enhancements and stability improvements
Improvements, changes, fixes:
- drastically improved handling of <>q and <>Q lexemes (performance and memory utilization-wise); also, those lexemes may now be empty (before, it was mandatory to give a namespace in the lexemes)
- option -t can now be used to control spacing for the compact (one-row) view, e.g.: -r -t0 will print a very compact one-liner JSON, w/o spaces; when used together (-r and -t), it will also control spacing in stringification of JSON in template operations
- introduced support for flags in Regular Expressions (namely: INOCESXAGP); flags can be given only as the trailing part of the RE (they will be removed from the RE itself after parsing), e.g.: <...\I\O>R:; also, flags ESXAGP facilitate various REGEX grammars, those flags will be processed only once (i.e., only the first grammar flag set will have an effect, all subsequent ones will be ignored)
- enhanced behavior of the empty <>k lexeme - now it also has an effect when placed in front of the ><F lexeme (i.e. the logical end of walking), not only at the syntactical end of the walk-path
- enhanced interpolation behavior of the {} token: when interpolation of a JSON object fails, it will be re-attempted by stripping the JSON object as an array - effectively allowing conversion of JSON objects into JSON arrays in templates.
- fixed an issue when a "move" semantic (-p) is applied to update (-u)/insert (-i) operations: if the walks of the latter fail entirely then a purge should not be applied on destination walks (UT'ed)
jtc v1.73 - JSON test console
Release Notes for jtc v.1.73
New features:
- No new features, some enhancements and stability improvements
Improvements, changes, fixes:
- lifted label update operation when -u is used to update a label (when a walk-path ends with an empty ...<>k lexeme): now it's possible to update/rewrite recursively even nested labels w/o failures
- converted walking (walk iteration) to a non-recursive loop, now walks are virtually endless (i.e. able to walk JSONs of virtually ANY size and depth) and not restricted by the depth of the stack
- -T processing for -i<walk> and -u<walk> operations is enhanced to match the same behavior as for -w<walk>: templates are interpolated per walk now (if the count of templates and walks matches), or in round-robin fashion otherwise (before, for some weird reason all templates were applied for each such walk)
- fixed insertion (-i): when the last lexeme of a walk is a non-empty <..>k then no label reinterpretation occurs (so it's now consistent with the same behavior of -u)
- removed support for the empty <>z notation form of the lexeme: erasing the entire namespace is idiomatically inconsistent with the walk design (and might lead to confusion or misunderstanding of the expected behavior), so only non-empty lexemes <..>z are supported now
- fixed a crash when debugging is on (quite a corner case though)
- fixed a programmatic error (rarely occurs, only in API calls) where the Json class would falsely expect <stdin> in the event when the parsing constructor throws
jtc v1.72a - JSON test console
Release Notes for jtc v.1.72a (NOTE: The Release is republished, as prior binaries were incorrect ones)
New features:
- introduced a new directive I which lets incrementing/decrementing numerical JSONs preserved in the namespace (and ignores other JSON types), e.g.: <var>I3, <var>I-1. If var wasn't defined before, the iteration begins with 0; however, it's possible to initialize it with values other than 0 - see User Guide
- introduced an auto-namespace variable $? to reference the last processed walk, this facilitates use-cases when converting input JSON to .csv format; see User Guide for more
- introduced new lexemes <..>P, <..>N to match any JSON strings and JSON numerical types respectively. Before, to facilitate the same, REGEX lexemes were used: <.*>R and <.*>D respectively, but the new lexemes work faster and allow storing matched values in the namespace
- Template-interpolation was enhanced with a new capability to jsonize JSON strings (containing embedded JSONs) and stringify JSONs - similar to the respective options -qq and -rr but now programmatically. See User Guide for the syntax and examples
- added a new semantic to the -x option: the -xN[/M] notation lets specifying a frequency of walks to be displayed (every Nth walk) starting from the optional offset M (zero based); e.g.: -x4 - display every 4th walk, while -x4/1 will do the same starting from the 2nd (index is zero based) walk. Also, note a special notation case: -x0/N will display the Nth (zero based) walk only once, this could be abbreviated to -x/N; N is positive, but the -1 value is also supported - to display the last walk
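The -xN/M selection rule reads naturally as a slice over the sequence of walks. The Python sketch below models only the cases where the offset is given explicitly; `select_walks` is a hypothetical helper, and the case of -xN without an offset is deliberately left out because the notes do not spell out which walk is displayed first.

```python
def select_walks(walks, n, m):
    # -x0/M (or -x/M): display only the walk at zero-based index M;
    # M == -1 selects the last walk.
    if n == 0:
        if m == -1:
            return walks[-1:]
        return walks[m:m + 1]
    # -xN/M: every Nth walk, starting from zero-based offset M.
    return walks[m::n]

walks = ["w0", "w1", "w2", "w3", "w4", "w5", "w6", "w7", "w8", "w9"]
every_fourth_from_second = select_walks(walks, 4, 1)   # models -x4/1
third_only = select_walks(walks, 0, 2)                 # models -x0/2
last_only = select_walks(walks, 0, -1)                 # models -x/-1
print(every_fourth_from_second, third_only, last_only)
```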
Improvements, changes, fixes:
- improved -jl options combination behavior: in some cases it wasn't robust and failed to provide the expected result. Plus, introduced a new merge format: -jlnn - all clashing values will be aggregated (disrespecting JSON structured grouping, as opposed to the case of -jl)
- lifted handling of atomic JSONs - simplified the code to allow applying walk-paths now even onto atomic JSON values
- extended null-interpolation for JSON strings (before it was applied for JSON arrays and JSON objects only). Now, the empty variable interpolation in the string, following either of , or ; will be taken into account, e.g.: -T'"{}, "' - if {} is empty, then the result of interpolation will be empty too: ""
- improved buffered file read speed (3 times faster) and stdin buffered speed (1.5-2 times faster), improved handling of non-existent/bad file-arguments (when multiple given)
- enhanced move semantic of -u, -i operations, so that when used together with -pp it also works as expected with those options (before it was only working for -p, and the -pp notation was ignored)
- a last in the walk k lexeme (e.g.: -w'... <>k') is now only subjected to re-interpretation of the label if it's empty (<>k); the non-empty <..>k lexeme is not (the value is preserved in the namespace, so there is no need to re-interpret it)
- the template-pertain-per-walk feature used to work only for interleaved walks and -n was cancelling it (templates then were applied round-robin). Now, even with -n (i.e., for sequenced walk processing) templates also pertain per walk; in the unlikely event when round-robin behavior is required, the -nn notation will support it. Also, the feature is enhanced to be engaged only when the number of templates (-T) matches the number of provided walks (-w), otherwise round-robin template application behavior is engaged.
- put a hard cutoff on too deep recursion should any unforeseen case (while walking) occur in the future; the same enhancement has fixed a case of too deep recursion (with subsequent stack overflow) for a corner case of lexeme <>F usage occurring in processing really huge JSONs only
- the message "notice: option -J cancels streaming input" is now printed only when -a + <stdin> were selected explicitly together with -J (and not when -a is implicitly imposed upon -J)
- fixed parsing debug (offset for a streamed read now shows a correct value - from the beginning of a stream, instead of the beginning of an internal circular buffer; other read debugs (buffered, cin) are unaffected)
- fixed empty <>b lexeme - it was not working (as documented), now it matches any boolean JSON value (UT'ed)