Object field caching #280

sbarzowski · 2019-05-25T10:19:21Z

So far I have done this in a way that is intended to be minimally
invasive, even at the cost of worse performance and some weirdness.
Most importantly, valueCachedObject should probably replace the valueObject
interface to avoid some indirection, and it shouldn't be at the same
"level" as valueSimpleObject/valueExtendedObject (these should no longer
even be values on their own).

This is a reasonable proof of concept, which we can try benchmarking
on real-world code.

sbarzowski · 2019-05-25T10:22:02Z

Related: google/jsonnet#663

I think we can do object local caching in another change after this gets finished.

coveralls · 2019-05-25T10:25:12Z

Coverage increased (+0.06%) to 77.406% when pulling 7e5d7ff on sbarzowski:object-caching into 4996d46 on google:master.

davidreuss · 2019-06-06T13:43:03Z

We have a fairly large corpus of configuration that takes ~5s to render -- I read the article on sjsonnet (https://databricks.com/blog/2018/10/12/writing-a-faster-jsonnet-compiler.html), and realized that object caching might be what is causing our configuration to be slow to render.

We have a few native functions, so we can't use sjsonnet directly as a comparison, but there are very real benefits from this patch for us, so it's interesting to pursue.

We go from ~4.5s to ~2.8s with this patch applied, and it makes a noticeable difference when "developing" on the config and trying to expand the entire thing.

Is there anything hindering this from being merged?

sparkprime · 2019-06-06T14:02:52Z

Thanks for describing your situation. I haven't reviewed it yet but it's high up my priorities. I don't suppose your config is public is it? :)

davidreuss · 2019-06-06T14:42:49Z

No, it's not. It's not production-sensitive though (it generates a complete environment for our CI/e2e testing), so we can possibly figure something out if it would be of value in determining whether these perf changes work on a non-trivial config.

We do have a couple of custom functions, and currently we're using an in-house wrapper for that, but we can maybe figure something out if you're interested. We might also be able to strip or customize some specific bits so that a native jsonnet evaluator could be used.

sparkprime · 2019-06-06T15:10:24Z

I suspect it'd be possible to swap out the native bits with hardcoded values or jsonnet functions without changing the performance of it

sparkprime · 2019-06-06T15:10:42Z

Also might be possible to anonymise it by changing all the strings to "DDDDDDDDDDDDD" or whatever

davidreuss · 2019-06-06T18:22:21Z

i’ll get back to you when i have something simplified to try things out with :)

sbarzowski · 2019-06-07T09:21:59Z

FYI I expect to finish the refactoring and drop WIP mark today or tomorrow.

davidreuss · 2019-06-07T09:46:17Z

I've put together a container which allows expanding our configuration with a plain command line.

I tried just for kicks both with the current go version and with the c++ version ...

c++

$ time ./expand.sh >/dev/null
./expand.sh > /dev/null  24.18s user 1.19s system 99% cpu 25.562 total

go

$ time ./expand.sh >/dev/null
./expand.sh > /dev/null  4.07s user 0.20s system 139% cpu 3.069 total

I'm trying to figure out how much I'm allowed to share from the configuration, or to find other ways of anonymizing the data.

sbarzowski · 2019-06-07T22:05:09Z

value.go

+	valueBase
+	assertionError error
+	cache          map[objectCacheKey]value
+	underlying     objectBody


Any ideas for better names than "underlying" and "objectBody"?

We could rename the top-level one valueCachedObject and then the other one could just be object or valueObject or uncachedObject. I don't really like the word "body" as it usually implies the bit that isn't a header but there's no header here.

The objectBody is not a value on its own, so valueObject is out of the question. I like the uncachedObject name the most.

So I renamed objectBody to uncachedObject and underlying to uncached. I left valueObject as it was, because I think it's clear enough - it is the only representation of object values (uncachedObject is not a value on its own).

sbarzowski · 2019-06-07T22:14:03Z

value.go

@@ -647,10 +657,20 @@ func objectIndex(i *interpreter, trace TraceElement, sb selfBinding, fieldName s
 		return nil, i.Error(fmt.Sprintf("Field does not exist: %s", fieldName), trace)
 	}

+	if val, ok := sb.self.cache[objectCacheKey{field: fieldName, depth: foundAt}]; ok {


This function is where the actual caching happens.
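To illustrate the lookup in the diff above, here is a minimal sketch of a cache keyed by field name and depth. Only objectCacheKey and its two fields come from the diff; toyObject, index, evals and the compute callback are hypothetical names made up for this example, and the values are plain ints rather than interpreter values. Including the depth in the key presumably keeps definitions of the same field name at different inheritance levels from sharing a cache entry.

```go
package main

import "fmt"

// objectCacheKey mirrors the key shape in the diff: a field name plus the
// depth in the inheritance chain at which the field was found.
type objectCacheKey struct {
	field string
	depth int
}

// toyObject is a hypothetical stand-in for the real object value.
// evals counts how often the (possibly expensive) computation runs.
type toyObject struct {
	cache map[objectCacheKey]int
	evals int
}

// index returns the field value, running compute only on a cache miss.
func (o *toyObject) index(field string, foundAt int, compute func() int) int {
	key := objectCacheKey{field: field, depth: foundAt}
	if val, ok := o.cache[key]; ok {
		return val // served from the cache
	}
	o.evals++
	val := compute()
	o.cache[key] = val
	return val
}

func main() {
	obj := &toyObject{cache: map[objectCacheKey]int{}}
	expensive := func() int { return 6 * 7 }

	first := obj.index("x", 0, expensive)
	second := obj.index("x", 0, expensive)
	fmt.Println(first, second, obj.evals) // 42 42 1
}
```

The second reference to "x" at depth 0 does not run the computation again, which is the effect the benchmark numbers earlier in the thread depend on.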

sbarzowski · 2019-06-07T22:15:36Z

Ready for review

davidreuss · 2019-06-16T18:51:31Z

I’ll try and apply this tomorrow and test it out... i’ll also see if i can anonymize our code so you can see for yourselves, if you got time for it

sbarzowski · 2019-07-21T12:46:52Z

@sparkprime Are we going forward with this?

sparkprime · 2019-07-21T13:01:38Z

Yeah I started reviewing it last week

sparkprime · 2019-07-23T17:21:42Z

value.go

 }

-func (*valueObjectBase) getType() *valueType {
-	return objectType
+func (obj *valueObject) inheritanceSize() int {


Does this ever get called on the top level (or if it does, do you happen to have the inside object readily available)?

It does, but only in contexts where we are working with the object internals anyway, so we can reach for the uncachedObject. I removed it.

This comes with an added advantage: valueObject no longer conforms to the uncachedObject interface, so it cannot be passed there by mistake (there were some places where it was unnecessarily passed instead of an uncachedObject). This is now fixed.

Thanks for noticing this.
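The point about interface conformance can be shown with a small sketch. The type names valueObject and uncachedObject and the method inheritanceSize come from this thread; the one-method interface, simpleObject and describe are simplifications assumed for the example, not the actual definitions in value.go.

```go
package main

import "fmt"

// uncachedObject is a hypothetical, trimmed-down interface; the real one
// has more methods.
type uncachedObject interface {
	inheritanceSize() int
}

// simpleObject is an assumed minimal implementation of uncachedObject.
type simpleObject struct{}

func (simpleObject) inheritanceSize() int { return 1 }

// valueObject holds an uncachedObject in a named field (not embedded), so
// it has no inheritanceSize method and does not satisfy uncachedObject.
type valueObject struct {
	uncached uncachedObject
}

// describe only accepts the uncached representation.
func describe(o uncachedObject) string {
	return fmt.Sprintf("inheritance size %d", o.inheritanceSize())
}

func main() {
	obj := &valueObject{uncached: simpleObject{}}

	// describe(obj) would be rejected by the compiler, because *valueObject
	// does not implement uncachedObject. The caller has to be explicit:
	fmt.Println(describe(obj.uncached))

	// The same fact, checked at run time with a type assertion.
	_, ok := interface{}(obj).(uncachedObject)
	fmt.Println(ok) // false
}
```

Because the wrapper keeps the inner object in a named field instead of embedding it, none of the inner methods are promoted, and the mistake described above becomes a compile error instead of a silent bypass of the cache.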

sparkprime · 2019-07-23T17:25:01Z

LGTM

This change adds caching to object fields, i.e. subsequent references to an object field are now served from a cache. The cache is kept within the object. Objects created with operator + start with a clean cache (they have to, because in general all the fields may have changed their values due to late binding). This change naturally comes with a change to the structure of objects: valueObject is now a concrete struct which holds an "uncachedObject", which is roughly equivalent to the old objects.
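A toy model may help show why an object created with + cannot reuse the cache of its left-hand side. Everything in this sketch (field, object, newObject, get, extend) is hypothetical and much simpler than the go-jsonnet representation; it only assumes that a field is a function of self, which is what late binding amounts to.

```go
package main

import "fmt"

// field computes a value given self; taking self is what makes binding late.
type field func(self *object) int

// object is a hypothetical toy model with a per-object cache.
type object struct {
	fields map[string]field
	cache  map[string]int
}

func newObject(fields map[string]field) *object {
	return &object{fields: fields, cache: map[string]int{}}
}

// get serves repeated references to a field from the object's own cache.
func (o *object) get(name string) int {
	if v, ok := o.cache[name]; ok {
		return v
	}
	v := o.fields[name](o)
	o.cache[name] = v
	return v
}

// extend models the + operator: right-hand fields override left-hand ones,
// and the result starts with an empty cache instead of copying base.cache.
func extend(base *object, overrides map[string]field) *object {
	merged := map[string]field{}
	for k, f := range base.fields {
		merged[k] = f
	}
	for k, f := range overrides {
		merged[k] = f
	}
	return newObject(merged)
}

func main() {
	base := newObject(map[string]field{
		"a": func(*object) int { return 1 },
		"b": func(self *object) int { return self.get("a") + 1 },
	})
	fmt.Println(base.get("b")) // 2, now cached in base

	derived := extend(base, map[string]field{
		"a": func(*object) int { return 10 },
	})
	fmt.Println(derived.get("b")) // 11, recomputed against the new "a"
}
```

Field b is not overridden, yet its value differs in the derived object because it reads a through self. If extend had copied the cache from base, the derived object would have returned the stale value 2.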

sparkprime · 2019-07-23T19:56:42Z

Alright, merge it when you're ready. Nice job!

googlebot added the cla: yes label May 25, 2019

sbarzowski closed this Jun 7, 2019

sbarzowski force-pushed the object-caching branch from 4c26dac to 33b6dcf Compare June 7, 2019 22:01

sbarzowski reopened this Jun 7, 2019

sbarzowski changed the title ~~[WIP] Object field caching~~ Object field caching Jun 7, 2019

sbarzowski mentioned this pull request Jun 17, 2019

Why is jsonnet so slow ? google/jsonnet#672

Closed

sparkprime reviewed Jul 23, 2019

sbarzowski force-pushed the object-caching branch from 00c8c77 to 7e5d7ff Compare July 23, 2019 19:43

sbarzowski merged commit e5e27c0 into google:master Jul 23, 2019

sbarzowski mentioned this pull request Jul 24, 2019

Fix crash when using empty object comprehension #301

Merged

sbarzowski mentioned this pull request Feb 2, 2020

Object field caching (discussion needed) #113

Closed

moleike mentioned this pull request May 2, 2021

Object field caching moleike/haskell-jsonnet#30

Closed
