Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

replace jsonnet with go-jsonnet #144

Closed
ramaro opened this issue Sep 3, 2018 · 31 comments
Closed

replace jsonnet with go-jsonnet #144

ramaro opened this issue Sep 3, 2018 · 31 comments
Assignees
Labels
gsoc GSoC ideas hard difficulty: hard

Comments

@ramaro
Copy link
Member

ramaro commented Sep 3, 2018

No description provided.

@uberspot uberspot added gsoc GSoC ideas medium difficulty: medium labels Jan 22, 2019
@mithil2311
Copy link

can you explain about this problem statement?

@adrianchifor
Copy link
Member

@mithil2311 We are currently using the direct Python library which uses the C++ version of jsonnet. To increase the speed of jsonnet compilation, we need to move to the Go version https://github.com/google/go-jsonnet and create Python bindings for it.

@sbarzowski
Copy link

(Jsonnet dev here)

FYI basic C bindings for go-jsonnet are probably going to be merged soon (PR: google/go-jsonnet#257). The main missing part is the support for native functions. Other than that it's a drop-in replacement.

My understanding is that the Python bindings just use C bindings internally, so no changes outside of setup.py should be necessary.

@uberspot uberspot added easy difficulty: easy and removed medium difficulty: medium labels Mar 8, 2019
@ramaro ramaro added hard difficulty: hard and removed easy difficulty: easy labels Mar 21, 2019
@yoshi-1224
Copy link
Contributor

HI @sbarzowski,

Please excuse my lack of understanding. Could you kindly explain what you mean by

the main missing part is the support for native functions

@sbarzowski
Copy link

sbarzowski commented Mar 24, 2019

Sure. I meant that this function from the libjsonnet.h is not implemented yet in the bindings:

/** Register a native extension.
 *
 * This will appear in Jsonnet as a function type and can be accessed from std.nativeExt("foo").
 *
 * DO NOT register native callbacks with side-effects!  Jsonnet is a lazy functional language and
 * will call your function when you least expect it, more times than you expect, or not at all.
 *
 * \param vm The vm.
 * \param name The name of the function as visible to Jsonnet code, e.g. "foo".
 * \param cb The PURE function that implements the behavior you want.
 * \param ctx User pointer, stash non-global state you need here.
 * \param params NULL-terminated array of the names of the params.  Must be valid identifiers.
 */
void jsonnet_native_callback(struct JsonnetVm *vm, const char *name, JsonnetNativeCallback *cb,
                             void *ctx, const char *const *params);

Native extensions are a feature which allows calling your own C function from Jsonnet - I can see that Kapitan uses it for a few functions (https://github.com/deepmind/kapitan/blob/master/kapitan/lib/kapitan.libjsonnet).

The jsonnet_json_* family of functions (which is only useful for native functions) is also not implemented in the bindings.

The other missing parts unrelated to native functions are evaluation variants *_stream and *_multi and import callbacks (jsonnet_import_callback function).

Just to be clear, we definitely intend to have all of these implemented on go-jsonnet side so that it becomes a full drop-in replacement for libjsonnet (that is for evaluation, not code reformatting). Any help is ofc welcome and will significantly speed things up.

@yoshi-1224
Copy link
Contributor

@sbarzowski
Thanks for the detailed explanation! I now see what the challenges are. If I have time to work on this issue I'll try my best to help.

@vnzongzna
Copy link
Contributor

@yoshi-1224 Can I take this issue for SoC if you aren't planning to work on it full time?
I have some knowledge of golang that might help in implemention on go-jsonnet.

@yoshi-1224
Copy link
Contributor

Hi @vaibhavk
Of course! I'm not submitting a proposal for this, but even if I were there's no need to ask because it's not first come first served.

@ramaro
Copy link
Member Author

ramaro commented Oct 17, 2019

These have now been fixed google/go-jsonnet#322 google/go-jsonnet#323

@alldroll
Copy link

I wonder, how you are going to use go-jsonnet shared library:

And if you are going to use the second approach, should we wait for go-jsonnet to add this code for python or we are going to add it in this repository?

@ramaro
Copy link
Member Author

ramaro commented Oct 21, 2019

@alldroll ctypes should work just fine - however the second approach is ideal, in which case, I agree that that it makes sense to add it to go-jsonnet

@joulaud
Copy link

joulaud commented Apr 8, 2020

Hello, is there any progress on this front? The optimisations of go-jsonnet are badly needed in some contexts.

@ramaro
Copy link
Member Author

ramaro commented Apr 8, 2020 via email

@joulaud
Copy link

joulaud commented Apr 8, 2020

if needed we can help. If you have some work in progress somewhere we can take a look at it.

@xunleii
Copy link

xunleii commented Oct 21, 2020

@ramaro We are trying to integrate gojsonnet with Kapitan, as we need it to improve our compile time (our usage of Kapitan is growing every weeks). Our work is available here.
After trying GoJsonnet bindings in python, I found no issues with it. But, when used with Kapitan, it seems to have some sort of "lock" (can be easily tested using the commit https://github.com/radiofrance/kapitan/commit/c718cd50828e88470425ad3afbfa9ad34f126328 with the files in DRAFT-GOJSONNET).

After some research, I deduce that these "deadlocks" are due to the way python handles multiprocessing by using fork. These "locks" disappear when I run multiprocessing with the spawn context, but introduces another problem; global variables are not copied (because the spawn mode creates an empty process).

Unfortunately, I didn't have time to do more investigation and I'm not sure if I'm on the right way (I'm begining in Python).

EDIT: I am a teammate of @joulaud

@ramaro
Copy link
Member Author

ramaro commented Oct 21, 2020

Hey @xunleii great to hear you're using Kapitan!

After some research, I deduce that these "deadlocks" are due to the way python handles multiprocessing by using fork. These "locks" disappear when I run multiprocessing with the spawn context, but introduces another problem; global variables are not copied (because the spawn mode creates an empty process).

That pretty much sums my findings and I started working improving multiprocessing but ran into exactly the same issues, but unfortunately haven't finalised it yet.
I will have a look at your work. Thanks

@sbarzowski
Copy link

Could it have something to do with this: google/jsonnet#814?

(It's not yet ported to go-jsonnet.)

@joulaud
Copy link

joulaud commented Dec 11, 2020

GIL handling was ported to go-jsonnet in last release
cf. google/go-jsonnet#474
and https://github.com/google/jsonnet/releases/tag/v0.17.0

@sbarzowski
Copy link

Yes. Unfortunately, it seems we still have some elusive deadlock issues: google/go-jsonnet#484. I wasn't able to dig deeply enough into it yet.

If anyone can share an idea about what that might be about, it will be really appreciated!

@joulaud
Copy link

joulaud commented Dec 13, 2020

When I took a stack-trace on the deadlock produced by @xunleii test-case above I find the exact same stack than in go-jsonnet ticket including Go runtime.gcStart, which obviously must wait for some other goroutine but there is only one thread in the affected process.

After digging a little I was arriving to the same conclusion as @githubaccount888 on google/go-jsonnet#484 (comment).
Go runtime is multithreaded by construction and have a concurrent mark-and-sweep GC. And trying to use multithread with python multiprocessing fork method instead of spawn being a pool full of shark I think that not-withstanding any improvement on go-jsonnet binding we must make kapitan work with the spawn method to have some chance to make it working reliably.

What are the real blockers to make this work? @xunleii you say there is some missing global variables in the spawn cases. Do you know what they are? Would it be difficult to add them to the "worker" closure created by "functools.partial" at https://github.com/deepmind/kapitan/blob/master/kapitan/targets.py#L101-L108 ?

EDIT: typo s/being of/being a/

@xunleii
Copy link

xunleii commented Dec 14, 2020

Unfortunately, I need to investigate another time to found them.

@joulaud
Copy link

joulaud commented Dec 17, 2020

I tried to advance on this once again.
By using the "spawn" method of multiprocessing I stumble on problems with callbacks I did not investigate further.

I just bypassed the whole multiprocessing thing in order to test with

diff --git a/kapitan/targets.py b/kapitan/targets.py
index bb952b0..9d3adbd 100644
--- a/kapitan/targets.py
+++ b/kapitan/targets.py
@@ -123,7 +125,9 @@ def compile_targets(
 
         # compile_target() returns None on success
         # so p is only not None when raising an exception
-        [p.get() for p in pool.imap_unordered(worker, target_objs) if p]
+        for x in target_objs:
+            worker(x)
 
         os.makedirs(compile_path, exist_ok=True)
 

and it mostly works with a lot of performance improvements.

On my test case target compilation itself is around 10 to 25 times faster with go-jsonnet than with cpp-jsonnet (my typical target takes around 5s to 20s with cpp-jsonnet and arount 0.10s to 0.80s with go-jsonnet).

This means than even disabling multiprocessing when launching kapitan compile on my 55 targets, I am better off with go-jsonnet and without multiprocessing than before.

  • cpp-jsonnet and multiprocessing: 228,19s user 1,95s system 337% cpu 1:08,28 total
  • go-jsonnet no multiprocessing: 53,92s user 0,95s system 112% cpu 48,861 total

This is not the case on single target compile as there seems to be some heavier startup time.

Note: I have problem with the yaml_load callback and I just disabled everything relying on it on my tests, I still must investigate on those callback problems.

@ramaro
Copy link
Member Author

ramaro commented Jan 8, 2021

Hey @joulaud happy new year!

I have very similar observations - disabling multiprocessing for compilation is actually faster and having it enabled hangs compilation when using go-jsonnet. I have yet to assess this better, but it feels to me that a future without multiprocessing (at least for compilation) will be much simpler to develop with.

I have committed my latest changes for go-jsonnet support at https://github.com/ramaro/kapitan/tree/gojsonnet
It disables multiprocessing as you suggest but also adds caching for go-jsonnet import paths and a new component to simply test the yaml_load() callback. Tests pass on Linux running Python 3.8.

Let me know if this works for you!

@xunleii
Copy link

xunleii commented Jan 19, 2021

Hi @ramaro!

Thank you very much for this New Years gift. I have been using it for a week and it works pretty well. The compilation works much better than before (16s to 0.50s on average), without making any changes after compilation.
Unfortunately on larger targets I didn't see huge differences but it's probably due to the Jsonnet code which can be optimized on my end (or a misconfiguration of which Kapitan I use).

EDIT: It's a misconfiguration on my side (I have both kapitan and for the largest one, I've used the "old" kapitan)

@uZer
Copy link

uZer commented Jan 19, 2021

From all the team and @joulaud, please accept our warmest thanks @ramaro !

@joulaud
Copy link

joulaud commented Jan 25, 2021

I repeat the thanks to @ramaro This change is a life-saver for our daily workflow and we just migrated fully to your branch.

What would be the next step to make this branch in state of being mergeable for next release of kapitan?
Can we help in anything?

@ramaro
Copy link
Member Author

ramaro commented Feb 25, 2021

@joulaud @uZer @xunleii hey apologies, I haven't been able to look at this further. I did spent sometime a few weeks ago on this and on my current infra (~140 jsonnet targets), kapitan with go-jsonnet seems to run out of memory.
So while I'm not sure go-jsonnet is ready for primetime yet (more debugging pending, could be an issue with Kapitan), maybe what we can do is have kapitan load go-jsonnet instead of jsonnet if the go-jsonnet module is present. So you could use it and help figure out any issues. How does that sound?

@sbarzowski
Copy link

If you come across anything that can be reproduced with go-jsonnet (either the command or the Python bindings), please let me know!

@Jean-Daniel
Copy link
Contributor

@joulaud @uZer @xunleii hey apologies, I haven't been able to look at this further. I did spent sometime a few weeks ago on this and on my current infra (~140 jsonnet targets), kapitan with go-jsonnet seems to run out of memory.

I did just try to compile 25 full kube-prometheus stacks and didn't notice any unbound increase in memory usage. The memory usage climb around 500MB in 4 or 5 runs and then remain at this level until end of compilation. This is with golang 1.16 and default gojsonnet release (pip install gojsonnet).

maybe what we can do is have kapitan load go-jsonnet instead of jsonnet if the go-jsonnet module is present. So you could use it and help figure out any issues. How does that sound?

That would be a great way to let more people experiment with it.

@pvanderlinden
Copy link
Contributor

@ramaro It would be great to get the option to use go-jsonnet. I'm currently in the progress of adding kube-prometheus to our stack to replace the old prometheus install, but I'm not to happy about the 2 minute extra compile time. I tested it and it solves my issue at least. Can I help you by forking your branch and make it automatically import gojsonnet over jsonnet if available so we can get it merged and release?

@pvanderlinden pvanderlinden mentioned this issue Apr 2, 2021
@MatteoVoges
Copy link
Contributor

resolved by #753

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
gsoc GSoC ideas hard difficulty: hard
Projects
None yet
Development

No branches or pull requests