-
Notifications
You must be signed in to change notification settings - Fork 40
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feature branch: zsharp #30
Conversation
51cf4d7
to
b1e23b9
Compare
- bring in ZoK 0.7.6 libraries - port in diffs from old thirdparty - first-cut const and literal support - add toposort for includes so that we can resolve const values in next pass - statement handling, include-walking fix - type should cover u64 case, too - structs: store for now - const handling - flatten import map up-front to make later derefs easier to handle - stash note - need array consts, too - rough in const array support, I think - error message - small cleanups: change type of ZGen::constants and make comments more meaningful - generics in fn calls - small: better error message - need to resolve exprs as const in const_type_ - wip simple typing pass - ast visitor wip - zokrates ast visitor 1st cut - double check that we cover all product types - const typing visitor ; visitor error handling - reorg visit_files ; unification infra wip - unification infra wip - need to walk accesses for an assignment - walk_accesses wip - small q - refactor: data structs should be Hash(Hash), not flat - walk_accesses wip - monomorphize structs on LHS of declarations - check identifiers when monomorphizing structs - unification wip - unification wip - inline struct members must declare all fields - unification for array_initializer - unify postfix - unify fdef with call - ZExpressionTyper wip - ZExpressionTyper wip - ZExpressionTyper first cut - type equality first cut - type equality improved - update struct handling - stash note - zok_fe trivial - error msg fix - params in scope during stmt visiting - stash note - and_then rather than unwrap in flatten_import_map - handle special names (e.g. EMBED) in toposort - types for EMBED - EMBED is now a valid file in stdlib - rewriter should handle Types, too - need to rewrite literals on LHS of def stmts - iteration: fix type of iter var - for now, allow indexing with both Field and u* - add warning for indexing with Field - impl pow - don't add ITE when unconditionally assigning - oops, Uint takes bv_lit not pf_lit - simplify - fix arg order to bv_lit, improve new_<integral> functions - use constant folding for term::const_int - EMBED rework wip - EMBED rework wip - EMBED rework wip - EMBED rework wip - EMBED rework wip - prep for generic inf push - generic inf wip - generic inf wip - generic inf wip - struct generic inference - struct array inference wip - make const_expr_ return Result<> rather than blowing up - finish first-cut generic inf for Array - turn on fn generic inf - rework struct monomorphization code - add support for const asserts - process declarations lexically previously we'd processed all consts, then structs, then fns. now we want to add const initialization via function call, which is easiest to implement if we simply process declarations in-order as they appear in the file and require that all uses follow defs lexically. This seems to rule out mutual recursion, so we may want to revisit. - add const initialization via ternary - const fn call wip - ZGen interior mutability this lets us call expr, const_expr_ without mut ref. this seems to be the right way to go here. - small touches after rebase - reorganize zvisit code - generic lookup: should go through whole stack! (and not panic if stack is empty) - wip bugfixes - const_identifier_ should consult generics ; generic_lookup shouldn't traverse the whole stack - expr type inf for array accesses - small - const_stmt_ infra - const_stmt_ most var infra in place - cvar_assign handles AssigneeAccesses now - split type_ and const_type_ using const generics - cvar_assign: build up list of accesses before resolving them, so that we don't try to double-borrow cvars_stack - interpreter intf - zoki --- zok interpreter front-end - const_expr direct access vs through term impl - very quick zsharp readme - feature gate ILP back-end this makes it easier to build CirC on M1 macs (otherwise, need to build coin-or from source, which is not hard but is annoying) - rename zokrates to zsharp - heavy hitting stuff here - update z# readme - turn off ci for zsharp branch - small - add cfg to switch to bn254 curve - zsharp readme quick - struct consts - handle literals on LHS of const decls - typechecker: ! can take U*, too - refactor typechecking in InlineStructExpression handling, const_eexpr_ (this is going to move, though) - support inline structs in ZConstLiteralRewriter - really turn of ci this time, don't just induce a failure - don't build circ or opa_bench examples when 'lp' option is off ; fix example builds given renamed zokrates module - unify_inline_array: respect array dims! - better InlineArray len handling - check fn return type - small, plus a few tiny test cases - redefinition is an error - update thinking / status on uglinesses - explicit generic literals are U32 - remove redundant typechecking in InlineStruct const expr handling - array and struct equality - generic-in-generic const test - inline array and struct generic tests - small - sticky notes - stash note - small cleanup in zstmtwalker - support ZSHARP_STDLIB_PATH envvar in ZStdLib - get_field_size in EMBED ; field comparisons - test runner - hex literal fix - inconsistent array test - literals test - TODO quick update
- keep plugging away at revamped generic inference - generic inf refactor wip - generic inf rework wip - stash note about divrem - generic inf wip - build up one term rather than a vec; walk struct members; remove old crap - small - generic inf rework wip - generic inf refactor wip - partially hook up new generic inf - zgenericinf: invoke solver, return result - ZGenericInf hooked up - stash a note - find_unique_model function - enable incremental mode for find_unique_model - hook up find_unique_model in zgenericinf
- unify function_call and const_function_call - unify stmt and const_stmt - type_impl_ returns Result<> - unify expr and const_expr
@alex-ozdemir I think this is getting to the point where we can think about merging it. This will also avoid continuing breakage as the branches evolve away from one another. I've just rebased onto master, but I haven't gone through all the testing yet. Probably have to do some CI fixes before we can merge, etc. Worth having a quick chat about strategy? |
I agree: better to merge sooner than later. Re: strategy talk: what did you
have in mind? I'm free in a few hours if you want to do a call, or we can
discuss it here.
…On Sun, Jan 30, 2022 at 11:19 AM Riad S. Wahby ***@***.***> wrote:
@alex-ozdemir <https://github.com/alex-ozdemir> I think this is getting
to the point where we can think about merging it. This will also avoid
continuing breakage as the branches evolve away from one another.
I've just rebased onto master, but I haven't gone through all the testing
yet. Probably have to do some CI fixes before we can merge, etc. Worth
having a quick chat about strategy?
—
Reply to this email directly, view it on GitHub
<#30 (comment)>, or
unsubscribe
<https://github.com/notifications/unsubscribe-auth/ABMKJ5W2QIH66S53QGNISUDUYWFL7ANCNFSM5J6QIQUA>
.
Triage notifications on the go with GitHub Mobile for iOS
<https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675>
or Android
<https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub>.
You are receiving this because you were mentioned.Message ID:
***@***.***>
|
On second thought, let me at least get CI passing again and then we can figure out if there are other things before reviewing and merging. I'll ping you again later this week. |
Sounds good.
…On Sun, Jan 30, 2022 at 6:06 PM Riad S. Wahby ***@***.***> wrote:
On second thought, let me at least get CI passing again and then we can
figure out if there are other things before reviewing and merging. I'll
ping you again later this week.
—
Reply to this email directly, view it on GitHub
<#30 (comment)>, or
unsubscribe
<https://github.com/notifications/unsubscribe-auth/ABMKJ5RQY52LM562EL5OUUDUYXVBBANCNFSM5J6QIQUA>
.
Triage notifications on the go with GitHub Mobile for iOS
<https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675>
or Android
<https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub>.
You are receiving this because you were mentioned.Message ID:
***@***.***>
|
Ah, one thing I've been looking at is avoiding runaway memory consumption. Things like the constant folding cache are an issue here. As an experiment, I've made the cfold cache bounded with an LRU eviction policy (though if the cache is too small this ends up triggering quadratic behavior, so it's not a slam-dunk). I'm guessing we may eventually want the same for the type cache, but I notice that it's using FxHashMap rather than what's implemented in hashconsing. Is there a reason to prefer this? (I see also that it's a map from weak pointers rather than strong pointers, but that seems orthogonal to the hashing strategy unless I'm missing something). Put another way: would it be possible to standardize on one hashing approach, or do these really need to be different? (And if so: can we document why in the code?) |
Huh. I'd like to know more about the kinds of memory problems created by the constant folding cache. I'm not sure that dynamically evicting items is the right fix. Re: hash tables: I guess there is a little heterogeneity. The Edit: if uniformity is the only benefit, that's enough benefit for me---I ask because I want to know if there is another problem with the current set-up. |
Well, the fundamental issue is that the constant folding cache has unbounded size (its key type is I kind of lied when I said it was a simple LRU eviction policy. The policy I'm playing with now is: the cache is unbounded during any particular invocation of fold_cache, but after the outermost fold_cache call returns the cache gets pruned to some maximum number of entries (which ends up dropping a lot of heap allocated intermediate terms). Probably the number of entries should be dynamically determined, but for now 64k seems to work. The TERM and TERM_TYPES tables also grow in an unbounded way. (I'm experimenting with garbage collection heuristics.) This causes both massive slowdowns (because of repeated re-allocations of the TERM and TERM_TYPES backing stores) and huge memory use---easily tens of gigs. |
I understand. Clearing between folds sounds like the right approach. I'm glad you're looking into GC for terms. I'm sure you've already found my starting point |
OK, 5e85e56 takes a first cut at automatically invoking garbage collection. Here's how it works: the term and type caches track how large they were after the last garbage collection. When In the zsharp front-end, In particular, there's a function call depth threshold; when returning from a call shallower than the threshold, I've tweaked the parameters of the various thresholds and leaky buckets based on one or two big example programs. The garbage collection overhead is nontrivial for small programs, but easily pays for itself in terms of avoiding massive cache reallocations and memory usage blow-up in big programs. I think all of this could be incorporated into the IR itself, likely keying on circify's function return machinery. That will have to wait until we pull the interpreter-for-"free" down into the IR, though. |
@alex-ozdemir there seems to have been a brokenness in the obliv pass that more aggresive constant folding revealed. I think this code is wrong: Lines 100 to 108 in 09c9f03
By this logic and the definition of
Commit 81a0757 fixes this and recovers correct behavior, by marking both the array and the parent term if the index is non-const. At least, I think this is right, based on what's going on in the corresponding Can you double check me on this? |
Ugh, my mistake. That still doesn't fix the issue. I'm not sure why yet, but I'll look into it. Not obvious to me why const arrays shouldn't be marked non-oblivious, though. |
Yeah, this seems wrong: Lines 219 to 228 in 81a0757
This tries to replace EDIT: in this case, should |
Oof! Yes, that was a bug. Thanks for looking into it, and for the patch. Your fix has the right idea here: we should include non-oblivious selects in the non-oblivious set (even though they may not be arrays!) so that we can easily tell whether to replace them. Looking at the non-oblivious-ness of their parent array isn't sufficient b/c we never flag constant arrays as non-oblivious. I added a bit more code to your patch to reliably flag non-oblivious selects. I also documented a bunch. I still suspect that some of our array passes have bugs. The problem is that we really want to treat functional arrays as RAM, but that is surprisingly hard. It might be a fun Coq project to:
I'm sure that we'd find bugs, and fix our algorithms. I'd sleep better at night too :). It might take some work to convince others that this project matters, but perhaps we could find a way. |
for i in ${TESTDIR}/*.zx; do | ||
${ZXI} "$i" &>/dev/null | ||
if [ "$?" != "0" ]; then | ||
echo "[failure: should-pass] $i" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since this script appears to be going into CI, we probably want to return a non-zero code on failure.
Sorry for the drive by comment. I just noticed this while testing.
This is a great idea. I'll add it to the TODO file! |
Thoughts on merging? Want to take a bit more time to review first? |
Yeah, I think we can merge soon. I’ll do a read-through this evening.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me. Feel free to squash & merge.
I left a few comments double-checking things.
Also, I benchmarked the (old) zokrates test script against the old and new binaries, to check for regression. Results:
Before:
Benchmark #1: ./scripts/zokrates_test.zsh
Time (mean ± σ): 2.325 s ± 0.064 s [User: 3.879 s, System: 0.379 s]
Range (min … max): 2.211 s … 2.424 s 10 runs
After:
Benchmark #1: ./scripts/zokrates_test.zsh
Time (mean ± σ): 2.576 s ± 0.069 s [User: 4.123 s, System: 0.386 s]
Range (min … max): 2.471 s … 2.707 s 10 runs
This looks fine to me. I'd expect a small slowdown given the new machinery we need for Z#.
Co-authored-by: Alex Ozdemir <aozdemir@hmc.edu>
* macos deps script (broken on M1 for now) * arch dep should be coin-or-cbc * ZoK 0.7.6 support megat status-commit - bring in ZoK 0.7.6 libraries - port in diffs from old thirdparty - first-cut const and literal support - add toposort for includes so that we can resolve const values in next pass - statement handling, include-walking fix - type should cover u64 case, too - structs: store for now - const handling - flatten import map up-front to make later derefs easier to handle - stash note - need array consts, too - rough in const array support, I think - error message - small cleanups: change type of ZGen::constants and make comments more meaningful - generics in fn calls - small: better error message - need to resolve exprs as const in const_type_ - wip simple typing pass - ast visitor wip - zokrates ast visitor 1st cut - double check that we cover all product types - const typing visitor ; visitor error handling - reorg visit_files ; unification infra wip - unification infra wip - need to walk accesses for an assignment - walk_accesses wip - small q - refactor: data structs should be Hash(Hash), not flat - walk_accesses wip - monomorphize structs on LHS of declarations - check identifiers when monomorphizing structs - unification wip - unification wip - inline struct members must declare all fields - unification for array_initializer - unify postfix - unify fdef with call - ZExpressionTyper wip - ZExpressionTyper wip - ZExpressionTyper first cut - type equality first cut - type equality improved - update struct handling - stash note - zok_fe trivial - error msg fix - params in scope during stmt visiting - stash note - and_then rather than unwrap in flatten_import_map - handle special names (e.g. EMBED) in toposort - types for EMBED - EMBED is now a valid file in stdlib - rewriter should handle Types, too - need to rewrite literals on LHS of def stmts - iteration: fix type of iter var - for now, allow indexing with both Field and u* - add warning for indexing with Field - impl pow - don't add ITE when unconditionally assigning - oops, Uint takes bv_lit not pf_lit - simplify - fix arg order to bv_lit, improve new_<integral> functions - use constant folding for term::const_int - EMBED rework wip - EMBED rework wip - EMBED rework wip - EMBED rework wip - EMBED rework wip - prep for generic inf push - generic inf wip - generic inf wip - generic inf wip - struct generic inference - struct array inference wip - make const_expr_ return Result<> rather than blowing up - finish first-cut generic inf for Array - turn on fn generic inf - rework struct monomorphization code - add support for const asserts - process declarations lexically previously we'd processed all consts, then structs, then fns. now we want to add const initialization via function call, which is easiest to implement if we simply process declarations in-order as they appear in the file and require that all uses follow defs lexically. This seems to rule out mutual recursion, so we may want to revisit. - add const initialization via ternary - const fn call wip - ZGen interior mutability this lets us call expr, const_expr_ without mut ref. this seems to be the right way to go here. - small touches after rebase - reorganize zvisit code - generic lookup: should go through whole stack! (and not panic if stack is empty) - wip bugfixes - const_identifier_ should consult generics ; generic_lookup shouldn't traverse the whole stack - expr type inf for array accesses - small - const_stmt_ infra - const_stmt_ most var infra in place - cvar_assign handles AssigneeAccesses now - split type_ and const_type_ using const generics - cvar_assign: build up list of accesses before resolving them, so that we don't try to double-borrow cvars_stack - interpreter intf - zoki --- zok interpreter front-end - const_expr direct access vs through term impl - very quick zsharp readme - feature gate ILP back-end this makes it easier to build CirC on M1 macs (otherwise, need to build coin-or from source, which is not hard but is annoying) - rename zokrates to zsharp - heavy hitting stuff here - update z# readme - turn off ci for zsharp branch - small - add cfg to switch to bn254 curve - zsharp readme quick - struct consts - handle literals on LHS of const decls - typechecker: ! can take U*, too - refactor typechecking in InlineStructExpression handling, const_eexpr_ (this is going to move, though) - support inline structs in ZConstLiteralRewriter - really turn of ci this time, don't just induce a failure - don't build circ or opa_bench examples when 'lp' option is off ; fix example builds given renamed zokrates module - unify_inline_array: respect array dims! - better InlineArray len handling - check fn return type - small, plus a few tiny test cases - redefinition is an error - update thinking / status on uglinesses - explicit generic literals are U32 - remove redundant typechecking in InlineStruct const expr handling - array and struct equality - generic-in-generic const test - inline array and struct generic tests - small - sticky notes - stash note - small cleanup in zstmtwalker - support ZSHARP_STDLIB_PATH envvar in ZStdLib - get_field_size in EMBED ; field comparisons - test runner - hex literal fix - inconsistent array test - literals test - TODO quick update * generic inf refactor wip * generic inf refactor - keep plugging away at revamped generic inference - generic inf refactor wip - generic inf rework wip - stash note about divrem - generic inf wip - build up one term rather than a vec; walk struct members; remove old crap - small - generic inf rework wip - generic inf refactor wip - partially hook up new generic inf - zgenericinf: invoke solver, return result - ZGenericInf hooked up - stash a note - find_unique_model function - enable incremental mode for find_unique_model - hook up find_unique_model in zgenericinf * go over TODOs and small cleanup * field % - field to bv should use full width - need to make sure MSB is 0 when lowering to R1CS\! * update todos * unify const and non-const code paths - unify function_call and const_function_call - unify stmt and const_stmt - type_impl_ returns Result<> - unify expr and const_expr * update TODO * rework after rebase * fix circ example and clippies * constant folding for array select and store * cfold: Tuple and extend Eq * more informative generic params error message in function call * array accesses should be Field if not otherwise typed * support Uxx array indexing (automatic type coercion) ; check array index and value for consistency * tuple const folding * stash note about array construction * todos * more todos * IR tuple repr: use boxed slices rather than vecs this enforces the invariant that tuple lengths cannot be changed closes circify#39 * IR tuple typing checks the value in a given tuple slot has a fixed type. this invariant isn't fully captured right now---it's up to front- and back-ends to enforce this. [ EDIT: I think the above is wrong. `ir::term::ty::rec_check_raw` appears to enforce this. ] I've added a couple extra safety checks for this. * todo update * const fold bvconcat and booltobv * IR array key_sort and bounds checks * array oob todos / tests * ZGenericInf early exit for monomorphized calls * self.unwrap cleanup * cache generic inference results * small debug/assert * array construction optimization when constructing an array, push leaf terms directly into the array rather than building up a huge term. This reduces memory pressure and reduces constant folding cost in the (common) case of large const arrays * todos * clippy * bit order consistency fix / tests to_bits and from_bits functions use msb0 ordering (i.e., index 0 of the bool array is the MSB) * clippy small * update TODOs * clippy for zsharp frontend * cargo fmt ; pretty-printing T * add span for error context in expr and stmt * update TODOs * add s_divisible and s_remainder in 'field' in stdlib * comment on signed field fns * superfluous front-end example * zxi unused imports * zxc first cut * zxc: in count mode, dump out constraints * sidestep stack blowup in ir::term::ty::check_raw * add option to skip linearity reduction in zxc with -L * debug messages... darn * lru caching for cfold * unbounded/bounded cache unbounded during a single fold_cache call, bounded between * rebase fixes feature gate aby back-end with lp changes in zxi/zxc, and some clippies * don't check-in non-top-level Cargo.lock * increase LRU cache size for cfold to avoid n^2 behavior * heuristic term/type cache collector * tidy * fmt * small bugfix * small fixes ; move zx_tests * re-enable ci * fmt * tentative obliv-fix * more obliv-fix * clippy (for tests) * Polish the obliv fix a bit. Document * Addressing my unsolicited comment * fix build * typo * stash Alex's idea about modeling RAM transformations * back to upstream hashconsing Co-authored-by: Alex Ozdemir <aozdemir@hmc.edu> Co-authored-by: Riad S. Wahby <rsw@jfet.org> Co-authored-by: Alex Ozdemir <aozdemir@hmc.edu>
This is a draft PR for the zsharp branch. Still wip, not ready to merge, etc etc etc.
Please drop zsharp-specific issues in this thread.
closes #3
closes #23