-
Notifications
You must be signed in to change notification settings - Fork 21
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fast attempt at serialization, using SimpleValue #2739
Conversation
|
The latest updates on your projects. Learn more about Vercel for Git ↗︎
1 Ignored Deployment
|
Apply Sweep Rules to your PR?
|
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #2739 +/- ##
==========================================
- Coverage 71.31% 70.15% -1.16%
==========================================
Files 118 119 +1
Lines 6581 6748 +167
Branches 1368 1436 +68
==========================================
+ Hits 4693 4734 +41
- Misses 1880 2006 +126
Partials 8 8 ☔ View full report in Codecov by Sentry. |
Connects to #2563 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So this adds JSON
in Squiggle language that turns Value
s into other simplified Value
s.
Without commenting on the implementation (which can be cleaned up, especially regarding TypeScript types):
JSON(...)
an alternative toSqValue.asJS
, I'm not sure which one is better, can see some benefits from doing it this way, there's a lot to unpack here (do we need the simple value to be available in Squiggle? do we mostly use it in DB and access it throughSqValue
?)- technically, JSON is a text format, so I initially expected that
JSON(...)
function would return a string; otherwise,toSimpleValue
name is more proper - OTOH, I don't like the idea of serializing things to strings only to parse them to JS objects again, so I'm happy that this new function doesn't do it
Another observation/reminder is that for storing things in the database, the serialization format won't be very human-readable.
One reason for this is lambdas; serialized lambdas would look like a deeply nested tree of AST nodes, at best, and eventually will look like a byte code.
Second, more important reason, is that Squiggle values are not trees but directed graphs without cycles, and serializing them to trees is too expensive; so we'll have to store them as lists of values which refer to each other by ids in that list.
So it's better to treat serialized values as black boxes. This is an argument against having JSON(...)
as a function in Squiggle: its output won't be very useful for viewing in the playground.
(Addressing your comment) I imagine we could also have stringify methods. Maybe we have:
I think the main use wouldn't be to deal with functions as much as to deal with distributions/calculators/plots and so on. Many of these map 1-1 with some JSON-like representation (minus lambdas, sometime). Some potential uses:
There is one interesting distinction to consider, that I'm not sure about. My guess is that we'll want to wrap Values, with Type information. Like,
The downside here is that this makes reading them in the Squiggle viewer a bit annoying, but the upside is that this would allow us to fully reconstruct them (minus lambdas), without otherwise knowing the type. |
Quick thought - this could also help with testing. As in, we test against the JSON(Calculator), as that would give us more information than our regular Calculator toString. |
* main: (180 commits) support 0.9.0 in ModelExportPage Bump versions after 0.9.0 release fix exec 0.0.1-0 print exec logs (should help with github releases) don't ignore versioned-components (it's private anyway) lint fix reformat changelogs improve changelog update vscode changelog entry fix unexpected character < update lockfile bundle tailwind-generated css with vscode ext simplify tailwind config for packages that use versioned-components Version Packages skip empty changesets cleanup useless changeset headers minor changelog script improvements rephase and update changesets ignore all vscode workspace files ...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are several things I'd change in this PR, but I'm not sure if it's worth the delay.
(I'll write these here instead of inlining, sorry; if I'll try to comment on the entire diff, this will be too long and I'd go into too many details)
Value
andSimpleValue
types could be converted on TypeScript level with conditional types, then we could replaceJSType
generic parameter inSqValue
with it, it's a bit complicated typewise, but I expect the result would be quite nice- Most of
any
types insimpleValue.ts
, I'd either replace withunknown
or improve in other ways - Maybe we don't need
ImmutableMap
in simple values, could be plain JS objects or builtin Maps? we have to useImmutableMap
inVDict
values because we want to modify Squiggle dicts quickly, but I expect that simple values will always be read-only. - I don't think you ever use
SqAbstractValue.toSimple
, and it could be moved toasJS
(like, makeasJS
non-abstract, then override it in some subclasses)
Some or all of these can be done when this PR is merged, it's not a big deal.
Another concern is that you've tagged custom object types with vType
, but dicts can have this field too; I see that you don't try to deserialize custom objects in toValue
, so I guess this format isn't stable yet and we can change it later?
But the only options that aren't ambiguous that I see here are:
- wrap all dicts in
{ type: "Dict", value: ... }
- or, replace
vType
with JS symbol; but this would work only forSimpleValue
, not for serialized JSON
I know there's also a third option, "reserve a rarely used Dict key for this", like __SQUIGGLE_TYPE__
, but I'd recommend against that workaround. It's hard to predict how serialization will be used, and doing this would mean that someone in the far future will be able to, for example, crash someone's webpage by passing this unique key, with unforeseeable consequences. (In other words, I hope that Squiggle Dicts will stay close to JS Maps in semantics, not JS Objects, which do have warts like this).
One last small request: toValue
-> simpleValueToValue
, fromValue
-> simpleValueFromValue
, etc. Short names look too generic when I look at them out of context, e.g. in SqValue/index.ts
which also has wrapValue
that's unrelated.
I'd also be fine with fromValue
being a method (toSimple()
) on Value
objects and toValue
being a static BaseVallue.fromSimple()
method, but that might entangle values and simple values too much.
* main: (31 commits) Hot fix for method scale remove import assertion Added back scale default constants Added changeset ScaleShift -> Method Added back defaults to SqScale First pass on refactoring Scale convert textmate-grammar to a public package; highlight squiggle code in markdown MarkdownViewer uses shiki instead of react-syntax-highlighter Finishing touches to Tag.doc Changing @description to @doc Fixes from CR fix version order; fix insertVersion script Fixed minor import error Refactored text size and color data to be in MarkdownViewer support decorators on exported vars Update packages/website/scripts/compileDocsForLLM.js Update tag.ts Final touches to LLMPrompt page Added LLMPrompt page ...
I want to do a few things:
similar.
This code does a first pass at (1). (1) makes (2) very easy, so that would be easy to do on top of this.
I'm unsure about how we want to go about the rest of this. Curious to get takes. My recommendation is that we keep it simple, even if kind of ugly, at this point. I'm okay with the syntax changing a bit later, my main goal is to expose a lot. Ideally many of the items in (1) could easily be de-serialized, but that's further work.
If we have (1) and (2), we could later have optional views for both, in our app.