C API: jv
jv is jq's internal JSON library. All jv objects are immutable, which is a requirement if you want to implement jq's backtracking while remaining approximately sane.
This means that functions operating on jv values tend to be referentially transparent: you can't pass an empty array to a function and expect it to be filled in when the function returns. If you want a function to return some information, it has to actually return a new object since it can't go modifying its arguments.
As a result, some of the API usage will look a little odd. For instance, the functions jv_array_get and jv_array_set can be used to get and set elements of an array. The usage of jv_array_get is fairly standard:
jv elem = jv_array_get(array, 42);
But to use jv_array_set, you have to know that it returns the new array. You can't ignore the return value:
array = jv_array_set(array, 42, elem);
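The "kind" of a jv value is one of the following, defined in the enum jv_kind:
JV_KIND_INVALID, JV_KIND_NULL, JV_KIND_FALSE, JV_KIND_TRUE, JV_KIND_NUMBER, JV_KIND_STRING, JV_KIND_ARRAY, JV_KIND_OBJECT
All but the first represent normal JSON values. The next section explains invalid objects. You can check the kind of an object by calling jv_get_kind.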
As well as the normal kinds of JSON values (array, bool, string, etc.), jv supports objects of kind JV_KIND_INVALID. Such objects are used to signal errors. Some of them carry error messages, which may be an arbitrary JSON value (you can check with jv_invalid_has_msg and retrieve it with jv_invalid_get_msg).
Generally, the kind-specific functions like jv_array_get require that their argument be of the correct kind, and trigger an assertion failure, aborting the program, if it is not. That is, the program will crash if you pass a string to jv_array_get: you must check that the argument is an array before using this function.
The functions are forgiving as long as the kinds are right. If you call jv_array_get with an out-of-bounds index, then you will get an object of kind JV_KIND_INVALID back. This definitely indicates an invalid index; it is impossible to store an object of kind JV_KIND_INVALID in an array (or in anything else, for that matter).
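The two failure modes described above (wrong kind is a programming error, bad index is reported through an invalid value) can be sketched with a toy model. This is only an illustration under stated assumptions: the `toy_` types and functions below are hypothetical and are not part of jv, whose real definitions live in jv.h.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy types for illustration only; not jv's real layout. */
typedef enum { TOY_INVALID, TOY_NUMBER, TOY_ARRAY } toy_kind;

typedef struct {
  toy_kind kind;
  double number;        /* meaningful when kind == TOY_NUMBER */
  const double *items;  /* meaningful when kind == TOY_ARRAY */
  int length;           /* meaningful when kind == TOY_ARRAY */
} toy_value;

static toy_value toy_invalid(void) {
  toy_value v = { TOY_INVALID, 0.0, NULL, 0 };
  return v;
}

static toy_value toy_number(double x) {
  toy_value v = { TOY_NUMBER, x, NULL, 0 };
  return v;
}

static toy_value toy_array(const double *items, int length) {
  toy_value v = { TOY_ARRAY, 0.0, items, length };
  return v;
}

/* Kind-specific primitive: passing the wrong kind is a programming error
   (assertion failure); an out-of-bounds index yields an invalid value. */
static toy_value toy_array_get(toy_value arr, int idx) {
  assert(arr.kind == TOY_ARRAY);
  if (idx < 0 || idx >= arr.length) return toy_invalid();
  return toy_number(arr.items[idx]);
}

/* Caller-side pattern: check the kind first, then check the result.
   Returns 1 and stores the element in *out on success, 0 otherwise. */
static int toy_try_get(toy_value v, int idx, double *out) {
  if (v.kind != TOY_ARRAY) return 0;
  toy_value elem = toy_array_get(v, idx);
  if (elem.kind == TOY_INVALID) return 0;
  *out = elem.number;
  return 1;
}
```

The design point is that the primitive stays cheap and strict, while a small wrapper such as the hypothetical `toy_try_get` does the runtime checking for callers that cannot guarantee the kind in advance.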
You may find it more convenient to use the higher-level functions from jv_aux.h, which do more runtime error-checking and are implemented in terms of the primitives in jv.h. For instance, jv_get from jv_aux.h takes a value and an index. If the value is an array and the index is in-bounds, it returns the corresponding element. If the value is an object and the index is a valid string key, it returns the corresponding entry. Otherwise, it returns a value of kind JV_KIND_INVALID with a suitable error message.
jv refcounts all heap-allocated objects. The usual objection to refcounting is that it fails when objects contain cycles. This is true. Luckily, due to the immutability of jv objects, it's impossible to create a cycle.
This is a pleasant property; as well as getting rid of pointer aliasing (a fertile source of bugs), it also limits us to acyclic heap structures. Since JSON does not support cyclic structures, this means that any jv object can be rendered as JSON.
Most jv functions are said to "consume" their arguments. That is, once you have passed the arguments to the function you may no longer use them and their memory may be reused. For instance, in the jv_array_get example above, it is invalid to use the variable array after that line has executed. If you need to reuse a jv value, you can call jv_copy to get a second copy of it. jv_copy does not consume its argument.
It may seem like jv_copy does a deep copy of the object. It certainly behaves in this way, and if you keep that model in mind when writing jv code you'll get the right answer. However, jv_copy is in fact very cheap; see below for how it works.
You must consume every jv value, otherwise there may be memory leaks (the tests won't pass if so, as they're run under valgrind). If you have nothing else to do with a value, pass it to jv_free, which consumes its argument and does nothing with it.
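The "consume every value exactly once" rule can be illustrated with a toy refcounted string. This is a sketch under assumptions, not jv's implementation: the `toy_` names and the `toy_live` allocation counter are hypothetical and exist only to make leaks observable.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical toy type: a refcounted heap string. Not jv's real layout. */
typedef struct { int refcnt; size_t len; char data[]; } toy_box;
typedef struct { toy_box *box; } toy_str;

static int toy_live = 0;  /* number of heap blocks currently allocated */

static toy_str toy_string(const char *s) {
  size_t len = strlen(s);
  toy_box *b = malloc(sizeof *b + len + 1);
  assert(b != NULL);
  b->refcnt = 1;
  b->len = len;
  memcpy(b->data, s, len + 1);
  toy_live++;
  return (toy_str){ b };
}

/* Does NOT consume its argument: both handles stay usable afterwards. */
static toy_str toy_copy(toy_str s) {
  s.box->refcnt++;
  return s;
}

/* Consumes its argument and does nothing else with it. */
static void toy_free(toy_str s) {
  if (--s.box->refcnt == 0) {
    free(s.box);
    toy_live--;
  }
}

/* Consumes its argument and returns a plain C value. */
static size_t toy_length(toy_str s) {
  size_t n = s.box->len;
  toy_free(s);
  return n;
}

/* Every handle is consumed exactly once, so nothing leaks. */
static int toy_demo(void) {
  toy_str s = toy_string("hello");
  size_t n = toy_length(toy_copy(s));  /* pass a copy to keep s alive */
  size_t m = toy_length(s);            /* last use: s is consumed here */
  return (int)(n + m);
}
```

In this model, forgetting the final consuming call in `toy_demo` would leave `toy_live` at 1, which is the kind of leak a tool like valgrind would report.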
The jv API can be used as though every operation copied the entire object and jv_copy did a deep-copy. That's a useful mental model to program with, but it would be horrendously slow. Instead, jv uses a copy-on-write scheme for all objects.
In the worst case, the jv functions will need to copy their input object. However, most of the time there's no reason to keep the old version around as it will never be used again. In this case, the refcount of the input object will be 1 (only one reference) and the function would have to free it. So, all of the functions that return a new version of an object (e.g. jv_array_set) first check whether the refcount is 1. If so, they know they can safely modify the object in-place without allocating any new memory. Thus, most of the time, jv_array_set won't copy anything.
jv_copy is then implemented by increasing the reference count by 1. This means that the object won't be modified by future calls to jv_array_set and the like. Instead, jv_array_set will copy the object and modify the copy.
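The copy-on-write scheme described here can be sketched with a toy fixed-length integer array. This is a minimal model under assumptions, not jq's actual code: the `toy_arr` type and its functions are hypothetical, and real jv arrays hold jv values and can grow.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical toy type: a refcounted, fixed-length int array. */
typedef struct { int refcnt; int len; int items[]; } toy_arr_box;
typedef struct { toy_arr_box *box; } toy_arr;

/* Allocates a zero-filled array with a single reference. */
static toy_arr toy_arr_new(int len) {
  toy_arr_box *b = calloc(1, sizeof *b + (size_t)len * sizeof(int));
  assert(b != NULL);
  b->refcnt = 1;
  b->len = len;
  return (toy_arr){ b };
}

/* "Copying" only bumps the reference count; no data is duplicated. */
static toy_arr toy_arr_copy(toy_arr a) {
  a.box->refcnt++;
  return a;
}

/* Drops one reference and releases the block when none remain. */
static void toy_arr_free(toy_arr a) {
  if (--a.box->refcnt == 0) free(a.box);
}

/* Consumes a and returns the updated array. */
static toy_arr toy_arr_set(toy_arr a, int idx, int value) {
  assert(idx >= 0 && idx < a.box->len);
  if (a.box->refcnt > 1) {
    /* Shared with another handle: duplicate the data first. */
    toy_arr fresh = toy_arr_new(a.box->len);
    memcpy(fresh.box->items, a.box->items,
           (size_t)a.box->len * sizeof(int));
    toy_arr_free(a);  /* drop our reference to the shared block */
    a = fresh;
  }
  /* At this point we hold the only reference, so writing is safe. */
  a.box->items[idx] = value;
  return a;
}
```

With a single handle, `toy_arr_set` returns the same block it was given. After `toy_arr_copy`, the next `toy_arr_set` on either handle allocates a fresh block, so the other handle never observes the change.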