jq integration #2319

Closed
wants to merge 11 commits into from

Conversation

Contributor

@tzz tzz commented Sep 16, 2015

see https://dev.cfengine.com/issues/7435

Here's a minimal example to test things:

bundle agent main
{
  vars:
      "items" data => '{"user":"stedolan","titles":["JQ Primer", "More JQ"]}';
      "items2" data => mapdata("jq", "{user, title: .titles[]}", items);
      "items2_str" string => format("%S", items2);

  reports:
      "we have item $(items)";
      "we have items2 $(items2_str)";
}

output:

R: we have item stedolan
R: we have items2 [{"title":"JQ Primer","user":"stedolan"},{"title":"More JQ","user":"stedolan"}]

@tzz
Contributor Author

tzz commented Sep 16, 2015

@jimis I'm not sure how to proceed, can you take a look?

@cfengine-review-bot

Can one of the admins verify this patch?

@jimis
Contributor

jimis commented Sep 17, 2015

Argh, linking statically is not an easy job. Both CFEngine and jq are using yacc and lex it seems:

3rdparty/jq-1.5/parser.y:131: multiple definition of `yyerror'
jq-1.5/parser.y:145: multiple definition of `yylex'
jq-1.5/parser.c:2066: multiple definition of `yyparse'

In this case I don't think it's worth making them cooperate.

If I abandon this path, the only other way is to link dynamically to libjq.so, which means that we should also install libjq.so, and that in turn means we should prefer the system's libjq.so if available, as I noted in my other comment.

@jimis
Contributor

jimis commented Sep 18, 2015

@tzz do you think you can try patching CFEngine to change the lexer/parser symbol names, for example yyerror() to cf_yyerror()? I'm quite sure this is supported through a simple configuration option in our lex and yacc files. If this proves easy, I can help you further along the static build route.

In addition, can you see if there is a way to pass options down to the ./configure execution in jq's directory? The following options will be needed: --disable-shared --with-pic to force static linking, and --disable-maintainer-mode so that bison is not required.

Dynamic linking seems to work for me, and "make install" installs both libpromises and libjq. But I'm skeptical of conflicts with systems that already have libjq installed.

@tzz
Contributor Author

tzz commented Sep 26, 2015

I think static linking is better for CFEngine specifically. We don't want to lose core functionality by surprise.

@tzz tzz force-pushed the feature/jq branch 2 times, most recently from 41a3255 to b5e1f07 Compare September 26, 2015 00:27
@tzz
Contributor Author

tzz commented Sep 26, 2015

I imported the actual Git contents of jq 1.5 and got this ./configure warning:

configure: WARNING: no configuration information is in 3rdparty/jq-1.5

I had to run setup.sh manually inside 3rdparty/jq-1.5. That built the library and the executable.

I was then able to immediately build evalfunction.c and the rest of CFEngine without changing the configure options or the Bison prefix or doing anything else special. I don't know what Autoconf did internally but it seemed to work: the jv_* functions were called properly. So this will almost certainly require more work from @jimis but I'm unblocked to work on the actual jq integration now.

@tzz
Contributor Author

tzz commented Sep 26, 2015

The example at the beginning was updated to show a successful invocation. This is very exciting!

The current data passing mechanism is inefficient, dumping the data to JSON back and forth. I'd love for someone else to hack on that.

The function always returns an array. I think that's OK. The real jq can return multiple JSON documents so it doesn't really map here.
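
To illustrate why an array is a natural return type: when jq runs with --compact-output it prints one JSON document per line, so several results can be folded into a single array by joining the lines with commas. The sketch below assumes that line-per-document layout; `wrap_documents_as_array` is a hypothetical helper written for this illustration and is not the code in this PR.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not from the PR): joins newline-separated JSON
 * documents into the text of one JSON array. Assumes each document sits
 * on its own line. Returns a malloc()ed string that the caller must
 * free(), or NULL if allocation fails. */
char *wrap_documents_as_array(const char *docs)
{
    assert(docs != NULL);

    /* Every comma written replaces at least one newline from the input,
     * so strlen(docs) plus '[', ']' and the NUL terminator is enough. */
    char *out = malloc(strlen(docs) + 3);
    if (out == NULL)
    {
        return NULL;
    }

    size_t n = 0;
    int count = 0;
    out[n++] = '[';

    const char *line = docs;
    while (*line != '\0')
    {
        size_t len = strcspn(line, "\n");   /* length up to the next newline */
        if (len > 0)                        /* skip empty lines */
        {
            if (count++ > 0)
            {
                out[n++] = ',';
            }
            memcpy(out + n, line, len);
            n += len;
        }
        line += len;
        if (*line == '\n')
        {
            line++;
        }
    }

    out[n++] = ']';
    out[n] = '\0';
    return out;
}
```

The helper does not validate the JSON itself; it only rearranges text, so malformed input would produce a malformed array.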

Still missing: acceptance tests and docs. If the current state of the code is acceptable I can write those next.

@jimis
Contributor

jimis commented Oct 26, 2015

As I commented in the ticket, I'm not so sure we want such a huge library imported into our source tree. It is a lot of code to import unreviewed, and it's not so widely used that we can simply trust it.

@tzz
Contributor Author

tzz commented Dec 2, 2015

@jimis I need your approval of the pipe approach here or in the ticket; then I will resubmit this PR and we can proceed.

@@ -2775,6 +2847,22 @@ static FnCallResult FnCallMapData(EvalContext *ctx, ARG_UNUSED const Policy *pol
return FnFailure();
}

if (mapdatamode && jqmode)
{
// TODO: make this customizable
Contributor Author

I need advice on how to make this customizable.

Contributor Author

Maybe the jq options should also be customizable.

Contributor Author

Another proposal I made in the ticket https://dev.cfengine.com/issues/7435 was to support the syntax mapdata("|/usr/bin/jq", ...) which would make this moot. The user decides what to pipe.

Contributor

I'm not sure what kind of customizations you would like to have in the jq command line. For a start, this looks OK. The pipe approach you brought up in the ticket is obviously the most extensible, but also the ugliest; it's up to policy writers to decide ... ;-)

Contributor Author

Well, some people may not have it in /usr/bin :)

Maybe it can be in def.jq so you can override it from def.json, if the general pipe proposal doesn't fly.

Contributor

Hmmm, you are right. Actually it would be preferable to execute just "jq" and have the system look it up in PATH. Unfortunately cf_popen_full_duplex() uses execv(); I think it would make sense to use execvp() instead. Quoting the man page:

The execlp(), execvp(), and execvpe() functions duplicate the actions of the shell in searching for an executable file if the specified filename does not contain a slash (/) character.

Maybe change cf_popen_full_duplex() or duplicate it in an alternative? Do you think this would cover most customization issues?

Contributor Author

Yeah, it should be enough for now. I just changed cf_popen_full_duplex() in a backwards-compatible way.

@tzz
Contributor Author

tzz commented Dec 21, 2015

@jimis @kacf @estenberg I didn't hear back for months, so I've rewritten this using pipes anyway.

The change is to:

  • move all the pipe-related code to pipes_unix.c from package_module.c and change options and function names accordingly (@pasinskim please check). NB: this also needs whitespace cleanup badly, lots of trailing space. I did some minor code cleanups here but it was mostly just moving things.
  • write an implementation that runs /usr/bin/jq (which should be customizable, advice welcome) with the right parameters to make the simple test above pass. It uses the pipe code mentioned above.

I can split the pipe code move from the rest of the commit if you want.

Note that this is now a very small change as far as jq is concerned, just a new function option. All the 3rdparty stuff is gone. So reviewing and maintaining this should be much easier.
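
For readers unfamiliar with the pipe approach, here is a minimal sketch of the general technique: fork a child, connect its stdin and stdout to two pipes, write the JSON in, and read the result back. `pipe_through` is a hypothetical name chosen for this illustration and is not CFEngine's cf_popen_full_duplex(). The sketch writes all input before reading any output, so it is only safe for inputs that fit in the pipe buffer, and a command that fails to start can still raise SIGPIPE in the parent.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: runs argv[0] (looked up in PATH), sends `input` to
 * its stdin and collects its stdout in `out` (NUL-terminated, at most
 * out_size - 1 bytes). Returns the number of bytes read, or -1 on error
 * or if the child did not exit with status 0. */
ssize_t pipe_through(char *const argv[], const char *input,
                     char *out, size_t out_size)
{
    assert(argv != NULL && argv[0] != NULL);
    assert(input != NULL && out != NULL && out_size > 0);

    int to_child[2], from_child[2];
    if (pipe(to_child) == -1)
    {
        return -1;
    }
    if (pipe(from_child) == -1)
    {
        close(to_child[0]); close(to_child[1]);
        return -1;
    }

    pid_t pid = fork();
    if (pid == -1)
    {
        close(to_child[0]); close(to_child[1]);
        close(from_child[0]); close(from_child[1]);
        return -1;
    }
    if (pid == 0)
    {
        /* Child: wire the pipe ends to stdin/stdout, then exec. */
        dup2(to_child[0], STDIN_FILENO);
        dup2(from_child[1], STDOUT_FILENO);
        close(to_child[0]); close(to_child[1]);
        close(from_child[0]); close(from_child[1]);
        execvp(argv[0], argv);
        _exit(127);                 /* only reached if exec failed */
    }

    /* Parent: keep only the write end to the child and the read end from it. */
    close(to_child[0]);
    close(from_child[1]);

    size_t len = strlen(input);
    ssize_t written = write(to_child[1], input, len);
    close(to_child[1]);             /* signals end of input to the child */

    size_t total = 0;
    ssize_t n = 0;
    while (total < out_size - 1 &&
           (n = read(from_child[0], out + total, out_size - 1 - total)) > 0)
    {
        total += (size_t) n;
    }
    out[total] = '\0';
    close(from_child[0]);

    int status = 0;
    if (waitpid(pid, &status, 0) == -1 || written != (ssize_t) len || n < 0)
    {
        return -1;
    }
    if (!WIFEXITED(status) || WEXITSTATUS(status) != 0)
    {
        return -1;
    }
    return (ssize_t) total;
}
```

Passing `{"cat", NULL}` as `argv` echoes the input back unchanged, which makes `cat` a convenient stand-in for jq when trying this out.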

@kacf
Contributor

kacf commented Dec 22, 2015

I have asked @jimis to look at this again.

*response = res;
return 0;
}

Contributor

@kacf you might want to review these package module changes since it is your domain. The functionality was moved to generic functions in pipes_unix.c.

Contributor Author

I can break that into a separate commit if it helps.

@tzz
Contributor Author

tzz commented Dec 22, 2015

I fixed those two minor issues (unused context and too-verbose log message).

@tzz tzz force-pushed the feature/jq branch 2 times, most recently from 41a9fae to e637e88 Compare December 22, 2015 16:34
@tzz
Contributor Author

tzz commented Dec 22, 2015

Changed to just run jq instead of /usr/bin/jq as suggested by @jimis

@@ -267,11 +269,16 @@ IOData cf_popen_full_duplex(const char *command, bool capture_stderr)
CloseChildrenFD();

char **argv = ArgSplitCommand(command);
-if (execv(argv[0], argv) == -1)
+if (require_full_path && execv(argv[0], argv) == -1)
Contributor

Please keep the logic clear, something like the following or similar:

if (require_full_path)
  res=execv();
else
  res=execvp();
if (res == -1)
  Log("err (exec: %s)");

Contributor Author

OK; done.
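
The structure requested in this review thread can be sketched as compilable C roughly as follows. This is an illustration under assumptions, not the code that was committed: `exec_command` and `run_and_wait` are hypothetical names, and the error goes to stderr rather than through CFEngine's Log().

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: replaces the current process image with argv[0].
 * It only returns (with -1) if the exec call failed. */
int exec_command(char *const argv[], bool require_full_path)
{
    int res;
    if (require_full_path)
    {
        res = execv(argv[0], argv);    /* argv[0] is used as a path as-is */
    }
    else
    {
        res = execvp(argv[0], argv);   /* searches PATH if argv[0] has no '/' */
    }
    if (res == -1)
    {
        fprintf(stderr, "exec of '%s' failed: %s\n", argv[0], strerror(errno));
    }
    return res;
}

/* Hypothetical helper: forks, runs the command in the child and returns
 * the child's exit status, 127 if the exec failed, or -1 on other errors. */
int run_and_wait(char *const argv[], bool require_full_path)
{
    assert(argv != NULL && argv[0] != NULL);

    pid_t pid = fork();
    if (pid == -1)
    {
        return -1;
    }
    if (pid == 0)
    {
        exec_command(argv, require_full_path);
        _exit(127);                    /* only reached if exec failed */
    }

    int status = 0;
    if (waitpid(pid, &status, 0) == -1)
    {
        return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

With `{"true", NULL}` as `argv`, the call should succeed when `require_full_path` is false because `true` is found via PATH, and fail when it is true unless a file named `true` exists in the current directory.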

@kacf
Contributor

kacf commented Apr 4, 2016

trigger build

@kacf
Contributor

kacf commented Apr 4, 2016

00_basics/def.json/state.cf.state.cf fails with:

R: DEBUG file_make: creating /home/jenkins/workspace/testing-enterprise-pr_core/label/PACKAGES_i386_linux_debian_4/core/tests/acceptance/workdir/__00_basics_def_json_state_cf/tmp/TESTDIR.cfengine/template with contents "{{%state}}"
R: DEBUG file_make: creating /home/jenkins/workspace/testing-enterprise-pr_core/label/PACKAGES_i386_linux_debian_4/core/tests/acceptance/workdir/__00_basics_def_json_state_cf/tmp/TESTDIR.cfengine/actual with contents ""
R: FILES DIFFER BUT SHOULD BE THE SAME
R: CONTENTS OF /home/jenkins/workspace/testing-enterprise-pr_core/label/PACKAGES_i386_linux_debian_4/core/tests/acceptance/workdir/__00_basics_def_json_state_cf/tmp/TESTDIR.cfengine/actual:
{
  "acl": [
    ".*mydomain note this will remain unexpanded!!!"
  ],
  "domain": "mydomain",
  "input_name_patterns": [
    ".*\\.cf",
    ".*\\.dat",
    ".*\\.txt",
    ".*\\.conf",
    ".*\\.mustache",
    ".*\\.sh",
    ".*\\.pl",
    ".*\\.py",
    ".*\\.rb",
    "cf_promises_release_id",
    ".*\\.json",
    ".*\\.yaml",
    ".*\\.js"
  ],
  "jq": "jq --compact-output --monochrome-output --ascii-output --unbuffered --sort-keys"
}

R: CONTENTS OF /home/jenkins/workspace/testing-enterprise-pr_core/label/PACKAGES_i386_linux_debian_4/core/tests/acceptance/./00_basics/def.json/state.cf.expected.json:
{
  "acl": [
    ".*mydomain note this will remain unexpanded!!!"
  ],
  "domain": "mydomain",
  "input_name_patterns": [
    ".*\\.cf",
    ".*\\.dat",
    ".*\\.txt",
    ".*\\.conf",
    ".*\\.mustache",
    ".*\\.sh",
    ".*\\.pl",
    ".*\\.py",
    ".*\\.rb",
    "cf_promises_release_id",
    ".*\\.json",
    ".*\\.yaml",
    ".*\\.js"
  ]
}

R: --- /home/jenkins/workspace/testing-enterprise-pr_core/label/PACKAGES_i386_linux_debian_4/core/tests/acceptance/./00_basics/def.json/state.cf.expected.json  2016-04-04 09:13:33.000000000 +0200
+++ /home/jenkins/workspace/testing-enterprise-pr_core/label/PACKAGES_i386_linux_debian_4/core/tests/acceptance/workdir/__00_basics_def_json_state_cf/tmp/TESTDIR.cfengine/actual   2016-04-04 09:27:03.000000000 +0200
@@ -17,5 +17,6 @@
     ".*\\.json",
     ".*\\.yaml",
     ".*\\.js"
-  ]
+  ],
+  "jq": "jq --compact-output --monochrome-output --ascii-output --unbuffered --sort-keys"
 }
R: /home/jenkins/workspace/testing-enterprise-pr_core/label/PACKAGES_i386_linux_debian_4/core/tests/acceptance/./00_basics/def.json/state.cf FAIL
R: augments policy = /home/jenkins/workspace/testing-enterprise-pr_core/label/PACKAGES_i386_linux_debian_4/core/tests/acceptance/./00_basics/def.json/state.cf
R: domain = mydomain

I assume this is because jq was added into that namespace.

In addition, 01_vars/02_functions/mapdata_json_pipe.cf is also failing on older Linuxes (RHEL 4 and Debian 4) as well as exotic Unixes, but I cannot tell why. There is no message at all, maybe it crashes? But then again I would have expected to see that in the output. Any ideas?

@tzz
Contributor Author

tzz commented Apr 5, 2016

I fixed up the last commit to include the state.cf fix and rebased.

I don't know why this is failing on exotics, sorry. If you can't replicate it, I'll try to spin up one of those and test there, but it will take me a while.

@kacf
Contributor

kacf commented Apr 7, 2016

I'll see if I can find anything on the problem.

@kacf
Contributor

kacf commented Apr 8, 2016

This is getting stranger by the minute: It only fails if run in parallel with other tests, not on its own. It's quite consistently failing though.

@kacf
Contributor

kacf commented Apr 8, 2016

At least I'm getting some output now:

----------------------------------------------------------------------
./01_vars/02_functions/mapdata_json_pipe.cf
----------------------------------------------------------------------
Broken Pipe
   error: Policy failed validation with command '"/home/jenkins/workspace/testing-kristian-privateBranch/label/PACKAGES_ULTRASPARC_SOLARIS_9/core/tests/acceptance/workdir/__01_vars_02_functions_mapdata_json_pipe_cf/bin/cf-promises" -c "./01_vars/02_functions/mapdata_json_pipe.cf"'
   error: CFEngine was not able to get confirmation of promises from cf-promises, so going to failsafe
   error: CFEngine failsafe.cf: /home/jenkins/workspace/testing-kristian-privateBranch/label/PACKAGES_ULTRASPARC_SOLARIS_9/core/tests/acceptance/workdir/__01_vars_02_functions_mapdata_json_pipe_cf/inputs /home/jenkins/workspace/testing-kristian-privateBranch/label/PACKAGES_ULTRASPARC_SOLARIS_9/core/tests/acceptance/workdir/__01_vars_02_functions_mapdata_json_pipe_cf/inputs/failsafe.cf
R: No public/private key pair is loaded, please create one by running "cf-key"
   error: Fatal CFEngine error: cf-agent aborted on defined class 'no_ppkeys_ABORT_kept'

Return code is 1.

  ==> FAIL (UNEXPECTED FAILURE)

Any idea is welcome here....

@kacf
Contributor

kacf commented Apr 8, 2016

It's clear that cf-promises is crashing somehow, but the core file is corrupted, so I'm not able to get a backtrace.

@jimis
Contributor

jimis commented Apr 8, 2016

I see the Broken pipe message, and I recall that we are doing quite a bit of magic with stdin/out in our acceptance tests. Maybe this messes up the jq subprocess somehow?

@tzz
Contributor Author

tzz commented Apr 8, 2016

The acceptance test doesn't use jq itself. It just does cat test.json to simulate it.

@kacf
Contributor

kacf commented Apr 8, 2016

It certainly can be stdin/stdout related, but it's more likely it happens in CFEngine's own handling of those descriptors, since this test in particular makes use of them (by calling cat).
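
As general background on the "Broken Pipe" message: on POSIX systems a write to a pipe with no remaining reader raises SIGPIPE, whose default action terminates the writer. If the signal is ignored, write() fails with EPIPE instead. The sketch below only demonstrates that mechanism and is not a diagnosis of the failure discussed here; `write_to_readerless_pipe` is a hypothetical name.

```c
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <unistd.h>

/* Hypothetical demo: writes one byte to a pipe whose read end is closed.
 * Returns the errno reported by write(), 0 if the write unexpectedly
 * succeeded, or -1 if the pipe could not be created. */
int write_to_readerless_pipe(void)
{
    /* With the default SIGPIPE action the process would be terminated
     * by the write below; ignoring it lets write() return an error. */
    signal(SIGPIPE, SIG_IGN);

    int fds[2];
    if (pipe(fds) == -1)
    {
        return -1;
    }
    close(fds[0]);                      /* no reader is left */

    ssize_t n = write(fds[1], "x", 1);
    int saved_errno = (n == -1) ? errno : 0;
    assert(n == -1 || n == 1);          /* a one-byte write cannot be partial */

    close(fds[1]);
    return saved_errno;
}
```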

This was referenced Apr 11, 2016
@kacf
Contributor

kacf commented Apr 12, 2016

I figured out the problem, see #2571 and #2572.

@tzz
Contributor Author

tzz commented Apr 12, 2016

Nice, thank you for working on this and finding the issue. I really appreciate it.

@kacf
Contributor

kacf commented Apr 20, 2016

Superseded by #2572.

@kacf kacf closed this Apr 20, 2016