Documentation/Feature request: Clarify expressions/scope #1326
Comments
Hmm, OK. I suppose the docs do need some clarification. I do think that @stedolan was trying to go for an intuitive description of an intuitive language. However, jq is a rather powerful language with aspects that are not obvious at first glance.
Function arguments might be particularly confusing. It's best to think of functions as having ONE (and only one) value argument and zero, one, or more function arguments. E.g., Parentheses can also be used to group expressions. E.g., Parentheses can be important to deal with precedence issues. E.g., |
Thanks for your input! Keep it coming. It will make jq better. |
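To make the "one value argument, zero or more function arguments" model above concrete, here is a minimal sketch in Python. It is an illustrative model, not jq's implementation. A filter is treated as a function from one input value to a stream of outputs. The names `Filter`, `pipe`, `iterate`, `collect`, `jq_map`, and `add_one` are hypothetical helpers invented for this sketch. `jq_map(f)` mirrors jq's own definition of `map(f)` as `[.[] | f]`: the argument `f` is a filter, not a value.

```python
from typing import Any, Callable, Iterator

# A filter takes ONE input value and yields zero or more output values.
Filter = Callable[[Any], Iterator[Any]]


def pipe(f: Filter, g: Filter) -> Filter:
    """Model of `f | g`: feed every output of f to g as g's input."""
    def run(value: Any) -> Iterator[Any]:
        for intermediate in f(value):
            yield from g(intermediate)
    return run


def iterate(value: Any) -> Iterator[Any]:
    """Model of `.[]` on an array: emit each element as a separate output."""
    yield from value


def collect(f: Filter) -> Filter:
    """Model of `[f]`: gather all outputs of f into a single array."""
    def run(value: Any) -> Iterator[Any]:
        yield list(f(value))
    return run


def jq_map(f: Filter) -> Filter:
    """Model of `map(f)`, i.e. `[.[] | f]`. The argument f is itself a filter."""
    return collect(pipe(iterate, f))


def add_one(value: Any) -> Iterator[Any]:
    """Model of the filter `. + 1`."""
    yield value + 1


# Corresponds to running `map(. + 1)` on the input [1, 2, 3].
result = list(jq_map(add_one)([1, 2, 3]))
print(result)  # [[2, 3, 4]]: one output, which is the new array
```

In this model the value argument is the implicit input passed to the returned function, while `f` is passed explicitly when the filter is built, which is why `map(. + 1)` never names the array it operates on.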
@nicowilliams Thanks for the quick response. I get what they were going for, I just felt like it kinda tripped over itself a little to get there. The language is intuitive and simple, I just wish it had been explained better. I ignored
I definitely have more ideas, but they are more focused around enhancing the programming language aspects. I wanted to see how receptive the community is first, before going further. |
@lylemoffitt - I'm not sure this is relevant, but since you wrote:
I thought I'd mention that a jq documentation effort has just started at stackoverflow.com. Maybe it could be justified by adopting a "programming language" approach? An entry point: https://stackoverflow.com/documentation/jq/topics |
@pkoppstein - Thanks for mentioning that. Wasn't aware of that feature on stackoverflow. It isn't really what I had in mind though.
I think it would be better than the current one, but that's not really my call. I'm also not saying the current approach is bad either; I just don't think it's as effective as it could be. Like sed, jq is a great CLI tool with an embedded DSL. In sed's documentation (its man page), they took the approach of emphasizing the DSL over the CLI. This (IMO) is probably what led to the long-term success of sed as a tool, but it also has the downside of making it harder to approach. I myself only recently understood the deeper nature of sed beyond its TLDR - It's a tradeoff. |
@lylemoffitt - I have no idea how the jq documentation at stackoverflow.com will pan out, but I like the combination of brevity and accessibility that characterizes the current "manual", so in a way it would make sense for the more "programming language" orientation that you have in mind to have a home at stackoverflow.com, if there is to be additional documentation there. (Currently, as you may know, the home for the more technical aspects and details is the jq wiki. Maybe you'd like to start a "jq for Programmers" page there? The potential downside of that is the risk that things could get confusing with an official tutorial, an official manual, another manual on the jq wiki, and still another manual of sorts on stackoverflow ...) My orientation is heavily influenced by the documentation I worked on for a large proprietary language. There were three distinct volumes:
|
I'm inclined to agree with you here. I'm not 100% sure what the right approach is given that each has its own set of trade-offs.
I hadn't actually seen the wiki before. Like most projects on GitHub, I had assumed it was empty or full of incomplete/outdated information. This one has some good information that is appropriately placed there. A "jq for Programmers" page there would probably be better than stackoverflow. Either way, it's always second-class to the reference material provided with a distribution. Ideally, there should be a quick-reference that's just as accessible as the current man page, but aimed at more experienced users. Perhaps a good solution would be to have two separate man-pages? The current |
@pkoppstein What's the copyright licensing associated with SO docs? |
@nicowilliams - As best I can tell, the rules are elaborated in Section 3 ("Subscriber Content") of http://stackexchange.com/legal. The key point seems to be "all Subscriber Content that You contribute to the Network is perpetually and irrevocably licensed to Stack Exchange under the Creative Commons Attribution Share Alike license." My (somewhat cursory) reading is that the contributor retains copyright and is not expected to grant an exclusive license. |
@pkoppstein Excellent. Thanks. |
I've pushed a partial fix for this, 6f9646a. |
In relation to operators precedence, I found this table at Rosetta code:
|
Yes, I will change "least interesting filter" with Two important predefined filters are "." (pass), the filter that does nothing, and "empty", the filter that never produces values. The main laws for those filters and the
By the way, for my sanity I decided to put names to all filters and operators
The manual seems to deliberately avoid naming all things! JJOR |
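The laws mentioned in this comment are not preserved in the thread, so the following is only a guess at the kind of property meant. It is a small Python model (not jq itself) in which `.` and `empty` are checked against the identity and zero properties that follow from their definitions. All names here (`identity`, `empty`, `pipe`, `comma`, `iterate`, `outputs`) are hypothetical helpers for this sketch.

```python
from typing import Any, Callable, Iterator

# A filter maps one input value to a stream of output values.
Filter = Callable[[Any], Iterator[Any]]


def identity(value: Any) -> Iterator[Any]:
    """Model of `.`: emit the input unchanged."""
    yield value


def empty(value: Any) -> Iterator[Any]:
    """Model of `empty`: produce no outputs at all."""
    return iter(())


def pipe(f: Filter, g: Filter) -> Filter:
    """Model of `f | g`: run g on every output of f."""
    def run(value: Any) -> Iterator[Any]:
        for intermediate in f(value):
            yield from g(intermediate)
    return run


def comma(f: Filter, g: Filter) -> Filter:
    """Model of `f, g`: all outputs of f, then all outputs of g, same input."""
    def run(value: Any) -> Iterator[Any]:
        yield from f(value)
        yield from g(value)
    return run


def iterate(value: Any) -> Iterator[Any]:
    """Model of `.[]` on an array, used here as an arbitrary sample filter f."""
    yield from value


def outputs(f: Filter, value: Any) -> list:
    """Run a filter on one input and collect its output stream."""
    return list(f(value))


sample = [1, 2, 3]
f = iterate
laws = {
    ". | f  ==  f": outputs(pipe(identity, f), sample) == outputs(f, sample),
    "f | .  ==  f": outputs(pipe(f, identity), sample) == outputs(f, sample),
    "empty | f  ==  empty": outputs(pipe(empty, f), sample) == [],
    "f | empty  ==  empty": outputs(pipe(f, empty), sample) == [],
    "empty, f  ==  f": outputs(comma(empty, f), sample) == outputs(f, sample),
    "f, empty  ==  f": outputs(comma(f, empty), sample) == outputs(f, sample),
}
for law, holds in laws.items():
    print(f"{law}: {holds}")
```

Under this model `.` behaves as the unit of `|`, while `empty` behaves as the zero of `|` and the unit of `,`.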
@fadado wrote:
Yes, that's one way the manual achieves a brilliant economy of expression and avoids the "cognitive burden" that comes with naming, especially if the names are potentially misleading, as is the case with "pass" for ".". Readers can be encouraged to pronounce the single-character punctuation operators in accordance with their preferences for pronouncing the punctuation characters themselves (e.g. "dot" for ".", "pipe" for "|", and "comma" for ","). Please note that The name "alternative" for "//" is appropriate as it is a two-character operator with a meaning that is unrelated to "/". |
That's interesting, and helpful, thanks. I was surprised to see that the alternator was right associative. Isn't it defined to evaluate left to right?
This. This is more of the kind of thing I was talking about. Helpful, clear, concise. Even if this is alien to a normal user, it's still worth putting in, because of how innocuous it is. |
Generally, easing cognitive burden goes hand in hand with low expressive power. The man page may come off as an easy read, but it does so at the cost of length and verbosity. If you're set on reading it, the length may not be important, but it's certainly off-putting. Part of the trade-off for writing to a low bar is that, while it makes on-boarding easier, it dampens the long-term effectiveness. Now that I understand the language better, I would much rather have a normal function reference, but my only choice is to scroll through a lot of text trying to remember which section the function I'm looking for is under.
The problem with "call it whatever you want" mentality is that you lack community agreement. Especially if you want people to be able to find reference materials on stack overflow, they are going to need a common name to google. Searching for "jq slash-slash" is going to end in a bad user experience. Moreover, all of this is done in the name of bowing to fear that users will flee because you made them learn the names for things. If you structure the man page uniformly, they won't even notice the names. Once they get the formatting their eyes will just jump to the section they care about.
I believe we are all in agreement. The man page uses the terms operator, filter, and function somewhat interchangeably. I believe, the general rule it follows is that filters have word-names, functions have word-names and explicit arguments in parens, and operators are symbols. When it comes to learning how to use a tool, none of this complexity really matters. All you really want to know is how to grep the fields out of the stupid json. But when it comes to learning how to use a language, it's all very important. As I said before, jq's problem is that it's both. I remain with my estimation that the best approach is to split the two aspects into their own pages. |
Ok, if it is a feature and not a bug I will reframe my mind, and I can say the dot operator is like an all-pass filter... |
@fadado wrote:
Thanks for the willingness to see it from another perspective.
Yes, readers of the English-language edition of the jq documentation will have no trouble understanding references of the form "the _ operator", where _ is "dot", "comma", "pipe", or "query", and writing "the dot operator" rather than "the As for describing "." as an all-pass filter --- I am wondering whether the audience who will benefit from such a description is largely the same audience who will understand https://en.wikipedia.org/wiki/All-pass_filter ? |
You are right, but while in XSLT we say " The phrase "
Can I say null filter? Or perhaps input value? |
Have you considered "Identity filter"? It is, after all, an identity function. |
I think identity filter is a great idea here, though, admittedly, I'm also
in the group that would understand "all pass filter".
As for the manual, I like that the main manual ignores some of the language
aspects in favor of brevity and clarity. It makes it easy to jump into
using jq. However, I think a second man page that focuses on the language
as a language is a great idea. I know there's a lot of work going on at
stack overflow, but to use that in the manual likely requires us to gain
licensing from the individual authors.
Anyways, I'd like to see us split out the manual into two parts, one on
`jq(1)` (the binary, and some basic usage), and another on `jq.lang(8?)`,
(the language, maybe also builtins)
|
@fadado - In explaining the identity filter, ".", it would be helpful to mention that it echoes each JSON value presented as input in turn. Indeed, in my opinion, the main area for improvement of the manual is explaining the stream-oriented aspect of jq. (See https://github.com/stedolan/jq/wiki/Advanced-Topics#streams) |
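The stream-oriented behaviour described above can be sketched in Python as a rough model (it is not how jq is implemented). The program is run once per JSON value in the input, and each run may emit zero, one, or many outputs. `parse_stream`, `run`, `identity`, and `iterate` are hypothetical names for this sketch; `json.JSONDecoder.raw_decode` is the standard-library call used to read one value at a time.

```python
import json
from typing import Any, Callable, Iterator

# A filter maps one input value to a stream of output values.
Filter = Callable[[Any], Iterator[Any]]


def parse_stream(text: str) -> Iterator[Any]:
    """Yield each JSON value from a string of whitespace-separated JSON texts."""
    decoder = json.JSONDecoder()
    position = 0
    while True:
        # raw_decode does not skip leading whitespace, so do it here.
        while position < len(text) and text[position].isspace():
            position += 1
        if position >= len(text):
            return
        value, position = decoder.raw_decode(text, position)
        yield value


def identity(value: Any) -> Iterator[Any]:
    """Model of `.`: echo the current input."""
    yield value


def iterate(value: Any) -> Iterator[Any]:
    """Model of `.[]` on an array: one output per element."""
    yield from value


def run(program: Filter, text: str) -> Iterator[Any]:
    """Apply the program to each input value in turn, chaining the outputs."""
    for value in parse_stream(text):
        yield from program(value)


# `.` echoes every input value, one after another.
echoed = list(run(identity, '{"a": 1} [2, 3] "four"'))
print(echoed)

# `.[]` shows that the number of outputs need not match the number of inputs.
flattened = list(run(iterate, "[1, 2] [] [3]"))
print(flattened)
```

Three inputs go in for both runs, but the second run yields two, zero, and one outputs respectively, which is the part of the model that a description of `.` as "echoes its input" helps introduce.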
I like "all pass filter" in part because of its intuitive interpretation, but it has a lot of namespace conflict, and should someone google it they'd be given a lot of misdirection. Using "identity function" is a good choice, on par with "current working directory", and "current node". The problem is more of which analogy you want to go with. Analogizing with the shell would be the best choice IMO here, because of the synergy with explaining the pipe operator, similarity in formatting and operation. I agree that "null filter" would probably be a source of confusion. Using "input value" probably works, but then you also need to explain that it's largely unnecessary to provide an input value, since it's automatically interpreted/provided for you most of the time. |
@lylemoffitt - Obviously "all-pass filter" is clear to some, but even for those with a signal-processing background, might not the bit about phase change be a potential source of confusion? More importantly, two of the primary meanings of "pass" are:
(Source: https://www.ahdictionary.com/word/search.html?q=pass) |
@pkoppstein -- We are in agreement. That's pretty much what I was getting at. Though, your point about "pass" is important, too. I was thinking from a more common understanding, e.g. "all things pass through it". Either way, it's probably not a good way to go. Drawing analogies to the shell and imperative/functional languages are probably the safest bets. |
@fadado |
I do like some of the suggestions here. Certainly a table of operator precedence would be nice, and some of the "laws" that @fadado proposes would be useful to include. I too would rather not "name everything". For now anyways. |
I'm inclined to agree with this, as there are more important issues IMO, but the push for Stack Overflow kinda necessitates that we have common pronounceable names for all the fundamental operations in jq. The rest of the discussion about what to call them should be focused on how to explain them first, and then suggest alternative operator names only by way of analogy. Currently, all of the functions are easily searchable, and most of the operators actually have explicit names given. But, a (perhaps) surprising number are without names. Instead, they are repeatedly referred to as "the
I digress... The following are all taken from the man page. Type denotes what noun is used with a given symbol. The suggested name attempts to find something close to what people colloquially call the given operation, while also avoiding name conflicts and providing a minimum of specificity.
|
I'm not sure that classification as "syntax" vs. "operator" makes sense. It's all syntactic. Some of these things are "operators" in the mathematical sense, but maybe all of them are (except for |
There's also |
Yup. Totally missed
Which? I believe, all remaining operators have explicit names already provided in the manual. It's not super obvious, but it is there or in the context. Double checking, these are the exceptions:
A table would be nice, but I think these names should also be put in the section labels. This is clear and consistent with the other operators that are named, like Addition and Array construction. |
@lylemoffitt Well, there's also the array-collect operator ( I'm going to have to learn whether the doc system supports tables... |
The
|
Looks like the answer is no (see ronn-format). Maybe another form would do? You could change the entry format for building the manpage to something like:
    f.puts "### #{entry['title']['symbol']} -- #{entry['title']['name']}\n"
And change the yaml to match:
    entries:
      - title:
          name: "Index Operator"
          symbol: "`.[EXP]`"
        body: |
          You can also look up fields of an object using syntax like...
|
I thought those were already named well enough by context, but thanks for adding them. Side note: The rules for what constitutes an acceptable |
Yes, there are places where not the full range of expressions is permitted, most notably the object constructor, for the subtle reason that it's impossible to avoid ambiguities in the grammar. Thanks for checking doc support for tables. Adding that is going to be a low priority for me for now, unless someone offers a PR. |
@nicowilliams -- If you're fine with my solution to the tables (or something like it), I can certainly put in that PR for you. I don't think we're set on the content yet, though. |
@lylemoffitt Can we get a preview of what a rendered manpage would look like? |
@lylemoffitt Er, actually, ronnformat does seem to support tables, since it claims that "[a]ll markdown(7) linking features are supported." |
That looks like they only have support for markdown's |
@lylemoffitt Oy, yes, I misread that. But elsewhere it says:
|
I tried it, and... no dice, ronn does not seem to support tables. rtomayko/ronn#99 |
Also, whatever is done for manpages has to work for the HTML-rendered manual as well. |
@lylemoffitt #1340 is a PR with some modest enhancements based on this issue and #1337. |
Exigent Question:
At any point in a jq script, what does the filter `.` return? It may be easy for an experienced user, but it's not clear from the documentation. Put another way: what defines an expression? What delimits scope? The answers to these questions are implied, but not explicitly or clearly stated by the documentation. It's ironic that the dot filter is referred to as the "least interesting filter", because it is the key to understanding the transformation of data through the script.
Problems:
The man page doesn't really say a whole lot about parentheses. They pretty much only show up in function signatures and in examples. Yet, they have a fundamental relationship with the dot filter, and thus a critical role in the functioning of the script. Their usage should be clarified. It would also be helpful to clarify their relationship with the object constructors, `[]` and `{}`, as all three are used to create sub-expressions and return objects.
The easy thing to do here would be to just create a section where you define `()` as an expression operator or scope operator, and then stick all the missing explanation there. This might solve the immediate issue, but you could do a lot better. I'm trying to stick with one problem here, but in general the manual could be a lot clearer. I don't know if you're trying to intentionally hide that jq is a full-blown language, but it would certainly be a lot cleaner if you approached explaining the query language like it was the pure-function programming language it is.
Suggested Solutions:
Define the operator `()` as a Value Constructor and put it in the Types and Values section. It constructs a value from the output of the contained expression. The only thing that would be needed to be changed about its existing functionality in order to bring it in line with the other constructor operators is that it must also work when the expression is empty. Analogous to `[]` and `{}`, this should be implemented to construct a `null` value.
Example:
Add a section Operator Precedence and Expression Evaluation (or something to that effect) with the following:
Define how filters and operators are composed into expressions and how the expressions are applied to the input JSON string to create the output JSON string. An explicitly codified type-transform like the following (written in pseudo-Haskell) would be one way to do it and be enormously helpful in terms of reasoning about a jq script.
Define operator precedence. I know it's basically just left to right and parentheses first, but it's important to explicitly state these things. This is where the type-transform will come in handy again, because it helps elucidate why different sets of operators have different semantics. For example, constructors (like `[]` and `{}`), which are called operators, are actually closures. This explains why they have totally different semantics.
Define scoping rules. The effect of `()` on scope is briefly mentioned in the Variables sub-section, but never talked about directly. The relationship between constructors and scope is never mentioned at all. The relationship between `.` and the concept of scope should also be discussed. Again, closures will help here.
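The pseudo-Haskell type-transform from the original issue is not preserved here, so the following is not a reconstruction of it. It is a hypothetical Python sketch of one way to state such a model: a filter has type "one input value to a stream of outputs", and what `.` refers to is determined by how filters are composed. The names `Filter`, `field`, `pipe`, `comma`, and `collect` are invented for illustration, and objects are assumed to be Python dicts.

```python
from typing import Any, Callable, Iterator

# The "type-transform": a filter maps ONE input value to a stream of outputs.
Filter = Callable[[Any], Iterator[Any]]


def field(name: str) -> Filter:
    """Model of `.name`: look up a key in whatever `.` currently is."""
    def run(value: Any) -> Iterator[Any]:
        yield value.get(name)  # a missing key gives None, modelling null
    return run


def pipe(f: Filter, g: Filter) -> Filter:
    """Model of `f | g`: inside g, `.` is each output of f."""
    def run(value: Any) -> Iterator[Any]:
        for intermediate in f(value):
            yield from g(intermediate)
    return run


def comma(f: Filter, g: Filter) -> Filter:
    """Model of `f, g`: both sides see the SAME `.`."""
    def run(value: Any) -> Iterator[Any]:
        yield from f(value)
        yield from g(value)
    return run


def collect(f: Filter) -> Filter:
    """Model of `[f]`: f sees the constructor's own `.`; outputs become one array."""
    def run(value: Any) -> Iterator[Any]:
        yield list(f(value))
    return run


doc = {"a": {"b": 1, "c": 2}, "c": 3}

# `.a | (.b, .c)`: inside the parentheses, `.` is the output of `.a`.
inner = list(pipe(field("a"), comma(field("b"), field("c")))(doc))

# `(.a | .b), .c`: here `.c` is outside the pipe, so it sees the original input.
outer = list(comma(pipe(field("a"), field("b")), field("c"))(doc))

# `[.a | (.b, .c)]`: the constructor collects the stream into a single value.
collected = list(collect(pipe(field("a"), comma(field("b"), field("c"))))(doc))

print(inner)      # [1, 2]
print(outer)      # [1, 3]
print(collected)  # [[1, 2]]
```

In this model parentheses introduce no new binding of their own. They only change which composition a sub-expression belongs to, and that alone is enough to change what `.` means inside it.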