Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow for stacks to have parents #11654

Merged
merged 2 commits into from Mar 9, 2024
Merged

Conversation

rtpg
Copy link
Contributor

@rtpg rtpg commented Jan 28, 2024

This is another attempt on #11288

This allows for a Stack to have a parent stack (behind an Arc). This is being added to avoid constant stack copying in REPL code.

Concretely the following changes are included here:

  • Stack can now have a parent_stack, pointing to another stack
  • variable lookups can fallback to this parent stack (env vars and everything else is still copied)
  • REPL code has been reworked so that we use parenting rather than cloning. A REPL-code-specific trait helps to ensure that we do not accidentally trigger a full clone of the main stack
  • A property test has been added to make sure that parenting "looks the same" as cloning for consumers of Stack objects

crates/nu-cli/src/repl.rs Outdated Show resolved Hide resolved
crates/nu-cli/src/repl.rs Outdated Show resolved Hide resolved
crates/nu-cli/src/repl.rs Outdated Show resolved Hide resolved
crates/nu-cli/src/repl.rs Outdated Show resolved Hide resolved
crates/nu-cli/src/repl.rs Outdated Show resolved Hide resolved
crates/nu-cli/src/repl.rs Outdated Show resolved Hide resolved
@IanManske
Copy link
Member

Thanks a lot for this! I will probably be using parent stacks to refactor a bunch of other code later. If I might make a few suggestions:

@rtpg
Copy link
Contributor Author

rtpg commented Feb 11, 2024

@IanManske is it possible you didn't hit "confirm" on your review? Not seeing any comments

crates/nu-cli/src/repl.rs Outdated Show resolved Hide resolved
crates/nu-cli/src/repl.rs Outdated Show resolved Hide resolved
crates/nu-cli/src/repl.rs Outdated Show resolved Hide resolved
@rtpg
Copy link
Contributor Author

rtpg commented Feb 13, 2024

Thanks for pointing out Arc::into_inner and handling ownership that way! I reworked the code so that we can just rely on Rust's ownership and the single expect call to get what SharedStack was providing. I just pushed up a new commit that removes the trait entirely.

I also renamed stack to stack_arc when stack was actually an Arc<Stack>. This should make it more obvious that stack.clone() is not what we want here (but stack_arc.clone() will be about copying an Arc).

Copy link
Member

@IanManske IanManske left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for pointing out Arc::into_inner and handling ownership that way!

You're welcome!

I also renamed stack to stack_arc when stack was actually an Arc<Stack>.

Great, this will help make things more clear, thanks!

Just a few more things to cleanup:

crates/nu-cli/src/repl.rs Outdated Show resolved Hide resolved
crates/nu-cli/src/repl.rs Outdated Show resolved Hide resolved
crates/nu-cli/src/repl.rs Outdated Show resolved Hide resolved
crates/nu-cli/src/repl.rs Outdated Show resolved Hide resolved
crates/nu-cli/src/repl.rs Outdated Show resolved Hide resolved
crates/nu-cli/src/repl.rs Outdated Show resolved Hide resolved
crates/nu-cli/src/repl.rs Outdated Show resolved Hide resolved
@kubouch
Copy link
Contributor

kubouch commented Feb 13, 2024

From a cursory look, the general idea seems fine. Just two comments:

  1. Can you get rid of the expect()s? You may use the report_error() function, for example, to print an error about the error state but let's not kill Nushell.
  2. Is the proptest necessary? It introduces an extra dependency and the tests look quite confusing to me, it's not immediately clear what is being tested. Maybe there is a strong use case for it, but it seems to me like an overkill. It seems to me that proptest is used in similar cases like fuzzing (finding failing cases). This is interesting, but probably better kept for a separate PR as it concerns our testing strategy in general.

@IanManske
Copy link
Member

Can you get rid of the expect()s? You may use the report_error() function, for example, to print an error about the error state but let's not kill Nushell.

Good point, maybe we could fall back to cloning the Stack instead of the Arc if the into_inner fails. I.e.:

Arc::try_unwrap(stack).unwrap_or_else(|stack| {
    // print error/warning or maybe just `debug_assert(false)` ?
    (*stack).clone()
})

Is the proptest necessary? ... probably better kept for a separate PR as it concerns our testing strategy in general.

Personally, I would like to see more use of property based testing. But I guess it would be better to open a separate PR.

@kubouch
Copy link
Contributor

kubouch commented Feb 13, 2024

Yeah, fallback cloning could work.

We can definitely bring up the property-based testing, but it seems like a different topic than this PR. But yeah, more rigorous tests are always welcome.

@rtpg
Copy link
Contributor Author

rtpg commented Feb 14, 2024

@kubouch the reasoning for the property tests:

  • the property test helped me find several bugs (basically around parent_deletions). Stuff I think more diligent people would find without the property test, but it feels useful (especially since this code is only being used in the REPL main loop, which is currently not run in any tests).
  • Originally I wanted to set this up and get environment variables to also be shared instead of cloned. I was having trouble figuring out the interaction with overlays so abandonned it for now. But if this work on environment variables is done, we might watn to tweak the existing property test a bit to do some overlay operations, but otherwise the test will work "as-is".

The property test itself is fast, the dependency is only pulled in if you're working on nu-protocol directly, and without it I would have very little confidence in my change being right (partly because an older version of this changeset was filled with bugs!)

EDIT: I would be OK sending in the proptest as a separate PR, but I would not want this PR to be merged without the proptest. I don't believe this changeset can be deemed any form of correct without it, absent some other refactoring of the variable storage.

@rtpg
Copy link
Contributor Author

rtpg commented Feb 14, 2024

Regarding the panicking to avoid cloning... I do think we should treat an unintentional clone as a fatal error here. The performance is just so bad if you have cloning and anything big in the environment, that it's better for code that triggers a clone to be considered invalid from the first iteration.

That's just my opinion though. I am OK putting in just an error message, though I think either a debug assert or a test-only assert (is that a thing?) so that it blows up in some circumstances feels very valuable to short circuit work that might be doomed to fail

@IanManske
Copy link
Member

I do think we should treat an unintentional clone as a fatal error here. The performance is just so bad if you have cloning and anything big in the environment, that it's better for code that triggers a clone to be considered invalid from the first iteration.

A debug assert should be fine (e.g., debug_assert(false, "failed to unwrap stack Arc")). The problem with panics in release mode is that this could potentially lock people out of their systems. I.e., nushell crashes on attempted login. Also, I don't think that a potentially slow clone of the stack should count as a fatal error.

@kubouch
Copy link
Contributor

kubouch commented Feb 15, 2024

We talked with the core team about the proptest, and it would be better if you could extract useful unit tests for this PR and leave proptest for a further discussion.

We used to have proptest but removed it, partially due to high execution time. More importantly, proptesting seems in principle similar to fuzzing, i.e., finding failure cases. As such, both fuzzing and proptest should run in a separate CI and discovered failure cases should be distilled as unit tests into the main test suite. We already have fuzzing in the nu-parser and nu-path crates. We could add proptest to the mix as well. And then set up a CI dedicated to these because we're not currently continuously fuzzing. But that is out of scope of this PR.

@rtpg
Copy link
Contributor Author

rtpg commented Feb 16, 2024

Would you be against me checking in the property test into crates/nu-protocol/src/engine/property_test.rs, where if, say, the RUN_PROPERTY_TESTS env var isn't set, then the test just returns immediately? I'd like to keep this stuff in the codebase if possible, if only to not have to go dig it up later if the environment variable stuff isn't tackled.

@rtpg
Copy link
Contributor Author

rtpg commented Feb 16, 2024

actually scratch that, I'll just delete the property test. Not that interested in fighting an uphill battle on this, and I've convinced myself the changes are OK. I'll generate some unit tests that should cover the problematic lines that I found through this.

@kubouch
Copy link
Contributor

kubouch commented Feb 16, 2024

Adding the proptests into a separate file and putting it behind a feature (or env var, to me it seems feature would be cleaner) could be a good first step towards proptesting in Nushell in general, I wouldn't object that.

@rtpg
Copy link
Contributor Author

rtpg commented Feb 16, 2024

OK I moved the property tests to a new file (property_tests.rs). I put this in crates/nu-protocol/src/engine, and made it only compile if you activate the proptest feature. It feels like this should belong in crates/nu-protocol/tests, but I don't quite know how stuff in tests even gets seen (where are the mod statements for that?), any pointers would be appreciated. Though if nobody really objects to it being in src, I don't really care that much.

All the outstanding comments should be handled, and I've added 5 unit tests directly to stack.rs.

@kubouch
Copy link
Contributor

kubouch commented Feb 17, 2024

OK, sounds good. Could you split the proptest itself into a separate PR? We can investigate there what happens with the tests/ directory, it seems like a better fit than src/.

@rtpg
Copy link
Contributor Author

rtpg commented Feb 27, 2024

Sorry, I was out and about last week.

I'm trying to rebase off of the changes in #11860, and the entire control flow got changed up. Will ruminate on this a bit, but I think everything should still flow nicely.

@rtpg
Copy link
Contributor Author

rtpg commented Feb 27, 2024

Actually it's not clear to me if stack parents are workable in this unwinding model. I want to checkpoint the stacks for potential unwinds, but I want to apply modifications to the stack. If we had a panic in between "I have modified the stack" and "I have completed the loop", then we need to recover the previous iteration.

This seems to rule out the "simple" model in the current version of this branch (where we do a bunch of reads, then write at one point). I imagine this panicking is meant to catch issues with running commands, right?

Not to speak too out of turn here but would we be against massively shrinking how much code is inside the catch_unwind? Or maybe somebody smarter than me here can figure out how to untie the knot here of "I want an old version of the stack to go back to" and "I want to make changes to the stack". It feels like we have to do something at one point that involves making changes, and holding onto the older stack seems to imply that this would involve copies.

(One idea: we just panic_unwind the entire REPL. You panic? You get a full shell restart. No state management weirdness from having stale structs sticking around)

@IanManske
Copy link
Member

Sorry, I was out and about last week.

All good, thanks for working on this!

Actually it's not clear to me if stack parents are workable in this unwinding model.

I think it should be possible. Here's one potential solution I have in mind:

  • The Stack in evaluate_repl will be the parent stack. We create new Stack with it as the parent, and pass this new Stack to loop_iteration.
  • This Stack passed to loop_iteration should work the same as the unique_stack in the current PR changes.
  • loop_iteration returns back this unique_stack, and the changes in it are applied to its parent in evaluate_repl. Not sure exactly sure how this should work. We could set the returned stack's parent to None, and then merge it into the parent stack.
  • If loop_iteration panicked, then we disregard the returned stack and simply reuse the parent stack.

 Stacks with parents can avoid copying variables, so long as the parent
 is never mutated. This lets us avoid expensive cloning in the REPL
@rtpg
Copy link
Contributor Author

rtpg commented Feb 28, 2024

Alright, successfully rebased. This involved adding yet another Stack operation (Stack::with_changes_from_child) to do the whole "apply changes from child back to the parent". So at a high level what is happening:

  • we make a child stack before entering a loop iteration, so that the parent can be our "previous stack"
  • that child gets used in further children as needed for the highlighter etc during the loop iteration
  • at the end we take the child's changes and apply them back to the parent

This involves a bit more fiddling around than before the panic catching tech, but I think it's still pretty reasonable, and at the end of the day less cloning than what we had before.

Long term... the shell is still doing a lot of shuffling of stuff around in memory. In particular I hadn't really noticed the engine state copies before. But I feel like giant env vars are less ocmmon, for example.

@rtpg
Copy link
Contributor Author

rtpg commented Feb 28, 2024

the test failure seems to be unrelated? I don't know how to retrigger a build for this kind of thing though


---- commands::network::http::get::http_get_self_signed_override stdout ----
=== stderr
Error: nu::shell::network_failure

  × Network failure
   ╭─[source:1:21]
 1 │ http get --insecure https://self-signed.badssl.com/
   ·                     ───────────────┬───────────────
   ·                                    ╰── Cannot make request to https://self-signed.badssl.com/, there was an error establishing a connection.
   ╰────


thread 'commands::network::http::get::http_get_self_signed_override' panicked at crates/nu-command/tests/commands/network/http/get.rs:268:5:
assertion failed: actual.out.contains(\"<html>\")
stack backtrace:
   0: rust_begin_unwind
             at /rustc/a28077b28a02b92985b3a3faecf92813155f1ea1/library/std/src/panicking.rs:597:5
   1: core::panicking::panic_fmt
             at /rustc/a28077b28a02b92985b3a3faecf92813155f1ea1/library/core/src/panicking.rs:72:14
   2: core::panicking::panic
             at /rustc/a28077b28a02b92985b3a3faecf92813155f1ea1/library/core/src/panicking.rs:127:5
   3: main::commands::network::http::get::http_get_self_signed_override
   4: main::commands::network::http::get::http_get_self_signed_override::{{closure}}
   5: core::ops::function::FnOnce::call_once
   6: core::ops::function::FnOnce::call_once
             at /rustc/a28077b28a02b92985b3a3faecf92813155f1ea1/library/core/src/ops/function.rs:250:5
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.


failures:
    commands::network::http::get::http_get_self_signed_override

IanManske added a commit that referenced this pull request Feb 28, 2024
# Description
Ignores some network tests that sometimes fail in CI. E.g., in
[11953](#11953 (comment))
and
[11654](#11654 (comment)).
@IanManske
Copy link
Member

Nice work! This looks good.

the test failure seems to be unrelated?

Yep, some network tests started to become unreliable recently, these tests have been disabled.

Long term... the shell is still doing a lot of shuffling of stuff around in memory. In particular I hadn't really noticed the engine state copies before. But I feel like giant env vars are less ocmmon, for example.

Yeah, there are a lot of clones throughout the code base for EngineState, Value, etc. These clones should hopefully be eliminated over time with PRs like this, since the cloning can actually take up an unreasonably large amount of time.

crates/nu-protocol/src/engine/stack.rs Outdated Show resolved Hide resolved
Co-authored-by: Ian Manske <ian.manske@pm.me>
@sholderbach
Copy link
Member

Thanks for implementing this @rtpg ! Let's give it a go

@sholderbach sholderbach merged commit d8f13b3 into nushell:main Mar 9, 2024
19 checks passed
@hustcer hustcer added this to the v0.92.0 milestone Mar 10, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants