Allow for stacks to have parents #11654

rtpg · 2024-01-28T04:54:53Z

This is another attempt on #11288

This allows for a Stack to have a parent stack (behind an Arc). This is being added to avoid constant stack copying in REPL code.

Concretely the following changes are included here:

Stack can now have a parent_stack, pointing to another stack
variable lookups can fallback to this parent stack (env vars and everything else is still copied)
REPL code has been reworked so that we use parenting rather than cloning. A REPL-code-specific trait helps to ensure that we do not accidentally trigger a full clone of the main stack
A property test has been added to make sure that parenting "looks the same" as cloning for consumers of Stack objects

crates/nu-protocol/src/engine/stack.rs

crates/nu-cli/src/repl.rs

crates/nu-cli/src/prompt_update.rs

crates/nu-cli/src/repl.rs

IanManske · 2024-02-10T17:27:27Z

Thanks a lot for this! I will probably be using parent stacks to refactor a bunch of other code later. If I might make a few suggestions:

rtpg · 2024-02-11T08:45:33Z

@IanManske is it possible you didn't hit "confirm" on your review? Not seeing any comments

crates/nu-cli/src/repl.rs

rtpg · 2024-02-13T00:37:55Z

Thanks for pointing out Arc::into_inner and handling ownership that way! I reworked the code so that we can just rely on Rust's ownership and the single expect call to get what SharedStack was providing. I just pushed up a new commit that removes the trait entirely.

I also renamed stack to stack_arc when stack was actually an Arc<Stack>. This should make it more obvious that stack.clone() is not what we want here (but stack_arc.clone() will be about copying an Arc).

IanManske

Thanks for pointing out Arc::into_inner and handling ownership that way!

You're welcome!

I also renamed stack to stack_arc when stack was actually an Arc<Stack>.

Great, this will help make things more clear, thanks!

Just a few more things to cleanup:

crates/nu-cli/src/repl.rs

kubouch · 2024-02-13T19:51:01Z

From a cursory look, the general idea seems fine. Just two comments:

Can you get rid of the expect()s? You may use the report_error() function, for example, to print an error about the error state but let's not kill Nushell.
Is the proptest necessary? It introduces an extra dependency and the tests look quite confusing to me, it's not immediately clear what is being tested. Maybe there is a strong use case for it, but it seems to me like an overkill. It seems to me that proptest is used in similar cases like fuzzing (finding failing cases). This is interesting, but probably better kept for a separate PR as it concerns our testing strategy in general.

IanManske · 2024-02-13T20:29:26Z

Can you get rid of the expect()s? You may use the report_error() function, for example, to print an error about the error state but let's not kill Nushell.

Good point, maybe we could fall back to cloning the Stack instead of the Arc if the into_inner fails. I.e.:

Arc::try_unwrap(stack).unwrap_or_else(|stack| {
    // print error/warning or maybe just `debug_assert(false)` ?
    (*stack).clone()
})

Is the proptest necessary? ... probably better kept for a separate PR as it concerns our testing strategy in general.

Personally, I would like to see more use of property based testing. But I guess it would be better to open a separate PR.

kubouch · 2024-02-13T21:36:31Z

Yeah, fallback cloning could work.

We can definitely bring up the property-based testing, but it seems like a different topic than this PR. But yeah, more rigorous tests are always welcome.

rtpg · 2024-02-14T01:20:30Z

@kubouch the reasoning for the property tests:

the property test helped me find several bugs (basically around parent_deletions). Stuff I think more diligent people would find without the property test, but it feels useful (especially since this code is only being used in the REPL main loop, which is currently not run in any tests).
Originally I wanted to set this up and get environment variables to also be shared instead of cloned. I was having trouble figuring out the interaction with overlays so abandonned it for now. But if this work on environment variables is done, we might watn to tweak the existing property test a bit to do some overlay operations, but otherwise the test will work "as-is".

The property test itself is fast, the dependency is only pulled in if you're working on nu-protocol directly, and without it I would have very little confidence in my change being right (partly because an older version of this changeset was filled with bugs!)

EDIT: I would be OK sending in the proptest as a separate PR, but I would not want this PR to be merged without the proptest. I don't believe this changeset can be deemed any form of correct without it, absent some other refactoring of the variable storage.

rtpg · 2024-02-14T01:30:10Z

Regarding the panicking to avoid cloning... I do think we should treat an unintentional clone as a fatal error here. The performance is just so bad if you have cloning and anything big in the environment, that it's better for code that triggers a clone to be considered invalid from the first iteration.

That's just my opinion though. I am OK putting in just an error message, though I think either a debug assert or a test-only assert (is that a thing?) so that it blows up in some circumstances feels very valuable to short circuit work that might be doomed to fail

IanManske · 2024-02-15T00:07:39Z

I do think we should treat an unintentional clone as a fatal error here. The performance is just so bad if you have cloning and anything big in the environment, that it's better for code that triggers a clone to be considered invalid from the first iteration.

A debug assert should be fine (e.g., debug_assert(false, "failed to unwrap stack Arc")). The problem with panics in release mode is that this could potentially lock people out of their systems. I.e., nushell crashes on attempted login. Also, I don't think that a potentially slow clone of the stack should count as a fatal error.

crates/nu-protocol/src/engine/stack.rs

kubouch · 2024-02-15T20:31:19Z

We talked with the core team about the proptest, and it would be better if you could extract useful unit tests for this PR and leave proptest for a further discussion.

We used to have proptest but removed it, partially due to high execution time. More importantly, proptesting seems in principle similar to fuzzing, i.e., finding failure cases. As such, both fuzzing and proptest should run in a separate CI and discovered failure cases should be distilled as unit tests into the main test suite. We already have fuzzing in the nu-parser and nu-path crates. We could add proptest to the mix as well. And then set up a CI dedicated to these because we're not currently continuously fuzzing. But that is out of scope of this PR.

rtpg · 2024-02-16T09:26:11Z

Would you be against me checking in the property test into crates/nu-protocol/src/engine/property_test.rs, where if, say, the RUN_PROPERTY_TESTS env var isn't set, then the test just returns immediately? I'd like to keep this stuff in the codebase if possible, if only to not have to go dig it up later if the environment variable stuff isn't tackled.

rtpg · 2024-02-16T09:32:41Z

actually scratch that, I'll just delete the property test. Not that interested in fighting an uphill battle on this, and I've convinced myself the changes are OK. I'll generate some unit tests that should cover the problematic lines that I found through this.

kubouch · 2024-02-16T11:36:11Z

Adding the proptests into a separate file and putting it behind a feature (or env var, to me it seems feature would be cleaner) could be a good first step towards proptesting in Nushell in general, I wouldn't object that.

rtpg · 2024-02-16T13:36:20Z

OK I moved the property tests to a new file (property_tests.rs). I put this in crates/nu-protocol/src/engine, and made it only compile if you activate the proptest feature. It feels like this should belong in crates/nu-protocol/tests, but I don't quite know how stuff in tests even gets seen (where are the mod statements for that?), any pointers would be appreciated. Though if nobody really objects to it being in src, I don't really care that much.

All the outstanding comments should be handled, and I've added 5 unit tests directly to stack.rs.

kubouch · 2024-02-17T12:31:12Z

OK, sounds good. Could you split the proptest itself into a separate PR? We can investigate there what happens with the tests/ directory, it seems like a better fit than src/.

rtpg · 2024-02-27T13:48:53Z

Sorry, I was out and about last week.

I'm trying to rebase off of the changes in #11860, and the entire control flow got changed up. Will ruminate on this a bit, but I think everything should still flow nicely.

rtpg · 2024-02-27T14:04:43Z

Actually it's not clear to me if stack parents are workable in this unwinding model. I want to checkpoint the stacks for potential unwinds, but I want to apply modifications to the stack. If we had a panic in between "I have modified the stack" and "I have completed the loop", then we need to recover the previous iteration.

This seems to rule out the "simple" model in the current version of this branch (where we do a bunch of reads, then write at one point). I imagine this panicking is meant to catch issues with running commands, right?

Not to speak too out of turn here but would we be against massively shrinking how much code is inside the catch_unwind? Or maybe somebody smarter than me here can figure out how to untie the knot here of "I want an old version of the stack to go back to" and "I want to make changes to the stack". It feels like we have to do something at one point that involves making changes, and holding onto the older stack seems to imply that this would involve copies.

(One idea: we just panic_unwind the entire REPL. You panic? You get a full shell restart. No state management weirdness from having stale structs sticking around)

IanManske · 2024-02-27T14:36:38Z

Sorry, I was out and about last week.

All good, thanks for working on this!

Actually it's not clear to me if stack parents are workable in this unwinding model.

I think it should be possible. Here's one potential solution I have in mind:

The Stack in evaluate_repl will be the parent stack. We create new Stack with it as the parent, and pass this new Stack to loop_iteration.
This Stack passed to loop_iteration should work the same as the unique_stack in the current PR changes.
loop_iteration returns back this unique_stack, and the changes in it are applied to its parent in evaluate_repl. Not sure exactly sure how this should work. We could set the returned stack's parent to None, and then merge it into the parent stack.
If loop_iteration panicked, then we disregard the returned stack and simply reuse the parent stack.

Stacks with parents can avoid copying variables, so long as the parent is never mutated. This lets us avoid expensive cloning in the REPL

rtpg · 2024-02-28T05:27:21Z

Alright, successfully rebased. This involved adding yet another Stack operation (Stack::with_changes_from_child) to do the whole "apply changes from child back to the parent". So at a high level what is happening:

we make a child stack before entering a loop iteration, so that the parent can be our "previous stack"
that child gets used in further children as needed for the highlighter etc during the loop iteration
at the end we take the child's changes and apply them back to the parent

This involves a bit more fiddling around than before the panic catching tech, but I think it's still pretty reasonable, and at the end of the day less cloning than what we had before.

Long term... the shell is still doing a lot of shuffling of stuff around in memory. In particular I hadn't really noticed the engine state copies before. But I feel like giant env vars are less ocmmon, for example.

rtpg · 2024-02-28T07:43:56Z

the test failure seems to be unrelated? I don't know how to retrigger a build for this kind of thing though


---- commands::network::http::get::http_get_self_signed_override stdout ----
=== stderr
Error: nu::shell::network_failure

  × Network failure
   ╭─[source:1:21]
 1 │ http get --insecure https://self-signed.badssl.com/
   ·                     ───────────────┬───────────────
   ·                                    ╰── Cannot make request to https://self-signed.badssl.com/, there was an error establishing a connection.
   ╰────


thread 'commands::network::http::get::http_get_self_signed_override' panicked at crates/nu-command/tests/commands/network/http/get.rs:268:5:
assertion failed: actual.out.contains(\"<html>\")
stack backtrace:
   0: rust_begin_unwind
             at /rustc/a28077b28a02b92985b3a3faecf92813155f1ea1/library/std/src/panicking.rs:597:5
   1: core::panicking::panic_fmt
             at /rustc/a28077b28a02b92985b3a3faecf92813155f1ea1/library/core/src/panicking.rs:72:14
   2: core::panicking::panic
             at /rustc/a28077b28a02b92985b3a3faecf92813155f1ea1/library/core/src/panicking.rs:127:5
   3: main::commands::network::http::get::http_get_self_signed_override
   4: main::commands::network::http::get::http_get_self_signed_override::{{closure}}
   5: core::ops::function::FnOnce::call_once
   6: core::ops::function::FnOnce::call_once
             at /rustc/a28077b28a02b92985b3a3faecf92813155f1ea1/library/core/src/ops/function.rs:250:5
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.


failures:
    commands::network::http::get::http_get_self_signed_override

# Description Ignores some network tests that sometimes fail in CI. E.g., in [11953](#11953 (comment)) and [11654](#11654 (comment)).

IanManske · 2024-02-28T17:12:47Z

Nice work! This looks good.

the test failure seems to be unrelated?

Yep, some network tests started to become unreliable recently, these tests have been disabled.

Long term... the shell is still doing a lot of shuffling of stuff around in memory. In particular I hadn't really noticed the engine state copies before. But I feel like giant env vars are less ocmmon, for example.

Yeah, there are a lot of clones throughout the code base for EngineState, Value, etc. These clones should hopefully be eliminated over time with PRs like this, since the cloning can actually take up an unreasonably large amount of time.

crates/nu-protocol/src/engine/stack.rs

Co-authored-by: Ian Manske <ian.manske@pm.me>

sholderbach · 2024-03-09T16:55:28Z

Thanks for implementing this @rtpg ! Let's give it a go

rtpg force-pushed the stack-sharing branch from a3fea1f to 0db4ee9 Compare January 28, 2024 06:03