Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix ls color problem on SubExpression #7689

Closed
wants to merge 3 commits into from

Conversation

Mehrbod2002
Copy link
Contributor

@Mehrbod2002 Mehrbod2002 commented Jan 5, 2023

Description

Fixes #7368 , #4319 , #4501 , or #7169
As ls called as value on let or parentheses () , detect ls ( added Expression after detect ls to not interrupt evaluating ) and evaluate table on it , it fixes missed metadata and attach `lscolor' PipeLineMetadata easily without conflict to core .

@Mehrbod2002
Copy link
Contributor Author

Working on exact detection to bypass this problem ...

@rgwood rgwood marked this pull request as draft January 6, 2023 00:39
@Mehrbod2002
Copy link
Contributor Author

Mehrbod2002 commented Jan 6, 2023

@rgwood Can you review this failed tests ?
I can't reproduce this error .

I throw ls from table to avoid losing pipelineMetadata ,
As you know when we use let or () , converts to Value
Thanks

Screenshot_2023-01-06_12-37-56

@Mehrbod2002 Mehrbod2002 marked this pull request as ready for review January 6, 2023 09:05
@fdncred
Copy link
Collaborator

fdncred commented Jan 6, 2023

that was a weird ci error. i restarted the failed job on the ci to see if they correct themselves. sometimes the ci just pukes. let's see if that's the case here.

Update: Failed again.

---- tests::test_parser::alias_recursion stdout ----
stdout: 
stderr: Error: nu::shell::access_beyond_end (link)

  × Row number too large (empty content).


thread 'tests::test_parser::alias_recursion' panicked at 'assertion failed: output.status.success()', src/tests.rs:113:5
s

Looking at the test, i'm not sure what 'row number too large' is supposed to mean. it seems odd that this ls in the test would return nothing, including no spaces, which i guess it what it's checking for.

#[test]
fn alias_recursion() -> TestResult {
    run_test_contains(r#"alias ls = (ls | sort-by type name -i); ls"#, " ")
}

This seems to work from the repl
alias ls = (ls | sort-by type name -i); ls | table | str contains ' '

@Mehrbod2002
Copy link
Contributor Author

@fdncred
I think problem solved .
Something was happening by overlay in table .

@fdncred
Copy link
Collaborator

fdncred commented Jan 6, 2023

Something was happening by overlay in table.

Ok, glad you tracked it down.

Am I understanding your code correctly? It appears that, on subexpressions, you're checking if the pipeline contains ls or get, and if it does, you're adding table to the end of it. Is there no need to add select to this? I also wonder about other commands that weren't working, but maybe we've fixed all those by passing the metadata properly.

@Mehrbod2002
Copy link
Contributor Author

Mehrbod2002 commented Jan 6, 2023

Something was happening by overlay in table.

Ok, glad you tracked it down.

Am I understanding your code correctly? It appears that, on subexpressions, you're checking if the pipeline contains ls or get, and if it does, you're adding table to the end of it. Is there no need to add select to this? I also wonder about other commands that weren't working, but maybe we've fixed all those by passing the metadata properly.

Yes .it's possible easily with some lines, but how can we detect ls ?
But we can attach metadata to all of our tables, and means bad to me . Another way is :
If you want to avoid losing metadata you have to change let and () , return PipelineValue instead of Value , and it means conflict with 90% core codes .
We don't know our previous steps .
Or another way is detect ls by our available Middleware, something we have only is stack but not working if we implement vector on parsing we lose data and call eliminate it .
So I think it is good way if you think so .

@fdncred
Copy link
Collaborator

fdncred commented Jan 6, 2023

I'm fine with the way it is if other core-team members sign off on it. I'm super grateful just to have ls_colors work in more places.

@kubouch
Copy link
Contributor

kubouch commented Jan 7, 2023

Unfortunately, there are several problems with this approach.

First, it does not work reliably. Auto-wrapping with table causes let x = (ls) to save the output as a string, not as a table which prevents using $x further (can't do $x | get name anymore).

Second, manually checking the pipeline for ls and get seems quite hacky and error-prone. Also, manually constructing the spans is almost guaranteed to backfire at some point (e.g., if you have an extra space somewhere).

If I understand correctly, the problem is that the pipeline metadata gets lost during when the PipelineData becomes a Value. So, I think the solution would be to find a mechanism how to preserve the metadata during the transition. From the top of my head, I'm thinking of a hash map in the Stack storing the metadata with VarId as a key. I haven't looked into this problem in detail, though, I'm not sure if that would cover all the error cases. It probably needs a bit of design work to figure out a good solution.

@kubouch kubouch marked this pull request as draft January 7, 2023 15:31
@Mehrbod2002
Copy link
Contributor Author

Mehrbod2002 commented Jan 7, 2023

@kubouch
Thanks for review .
Agree with problem we will have to attach table on our results .
You can consider our new way . which we are working with stack and trying to add our lost metadata by only on ls command not processing on all evaluating we have.

@Mehrbod2002 Mehrbod2002 marked this pull request as ready for review January 7, 2023 17:39
@kubouch
Copy link
Contributor

kubouch commented Jan 7, 2023

I think now your current solution completely sidesteps the metadata from the pipeline. Now there are two overlapping systems for setting metadata from ls, one of which is redundant. The html metadata is also ignored so these two would behave differently. Also, it seems like it will set the metadata for all pipelines until table is called which might cause other pipelines to be labeled as coming from ls. Or the stack that had ls gets dropped which won't preserve the metadata for the value (I suspect let x = (ls); $x might not be working after your recent change but I might be wrong). It would be good to have tests to cover all these cases so we don't have to guess, but I suspect the current solution is too hand-wavy.

The key problem seems to be losing the metadata when evaluating a subexpression. So, we need a reliable way to store it and the only place to store it seems to me is the Stack (the caller stack, not the callee which gets dropped if I remember correctly). But it must be stored in such a way that correctly identifies the metadata with the pipeline that produced it. Maybe Span as the hash map key would work? But it might be too fragile, not sure. Another approach would be to return the metadata from eval_expression() and just carry it along until we need to create the PipelineData again. That would probably be the most robust but it might add quite a bit of hassle, e.g., storing it together with a Value when saved into a variable, that might not be worth the complexity... So, I'm not exactly sure what the best solution would be, it might also require significantly changing how we handle the metadata.

@kubouch
Copy link
Contributor

kubouch commented Jan 7, 2023

Yeah, I tried it and you can generate a failure by doing

> let x = (ls)

> [ a b c ] | table

> $x
# does not show colored `name` column

That's because the table on the second line discarded the metadata on unrelated stream.

@Mehrbod2002
Copy link
Contributor Author

@kubouch
If we are going to use Box , it won't drop until , stack declare on parsing . as we need some data and clone it , then won't work .
if we use it on eval_subexpression it will any way convert to value and metadata drops .
Anyway ls with any Type going through from table and handle_row_stream .
Because we can save anything and our middleware (stack) is useless here and don't want conflict with Value::List , so I think It could be good to have new type on our Value enum something called for example value_with_metadata or something else and its value can be Value and have extra element called metadata .
and easily can save any metadata in future and use it for ls without any problem .
What do you think ?

@Mehrbod2002 Mehrbod2002 marked this pull request as draft January 7, 2023 22:01
@kubouch
Copy link
Contributor

kubouch commented Jan 8, 2023

I'm not sure I understand everything but having a new Value type that's Value+Metadata seems like it would add a lot of complexity and possible bugs anywhere you match against a Value. But yeah, it might be one approach worth considering.

I'll bring it up to our core team to see what others think.

@Mehrbod2002
Copy link
Contributor Author

Mehrbod2002 commented Jan 8, 2023

I'm not sure I understand everything but having a new Value type that's Value+Metadata seems like it would add a lot of complexity and possible bugs anywhere you match against a Value. But yeah, it might be one approach worth considering.

I'll bring it up to our core team to see what others think.

Thanks .
Just sure about this , we have a problem on .into_value in PipelineData, from converting ListStream to Value List ...
The bad way I know , we can add our Metadata on list and pop it when we reach and detect if it has metadata or no , two commands will face it (ls and every) it much less than this :
So logical way we have is add some optional Metadata on Value::List or have new type in Value Enum to convert any data which has metadata too . I think it's worth too , and believe can help for future features we have if it will contain metadata too.

@fdncred
Copy link
Collaborator

fdncred commented Jan 8, 2023

Semi-related I've been thinking about adding some type of Value to do date math better. In order to do date math better we would need to temporarily store the begin date time and end date time so that leap years, months, etc can be calculated correctly. Translating nanoseconds to durations isn't good enough for date math.

I'm wondering if a separate value type is better for this or if we should have a value type that can carry more things like metadata.

@Mehrbod2002
Copy link
Contributor Author

I'm wondering if a separate value type is better for this or if we should have a value type that can carry more things like metadata.

One Value which contains Value + Metadata absolutely better with less and easily development than have a lot modules for our types .
Surely do this thing , save a lot energy and high volumn of coding .

Currently watching Value implementation , many methods we can recall it with match easily , I mean for implement of such a thing I don't think need a much time and could be as you said really useful for toward features metadata which NuShell have .

Don't know where Core Developers will decide , but will appreciate if you apprise me about it or assign me for this , like to solve this problem .

@kubouch
Copy link
Contributor

kubouch commented Jan 8, 2023

One problem with the Value + Metadata approach is that for every match Value, it adds one arm that needs to be checked. For example, if you want a list, you need to check for Value::List and Value::ValueWithMetadata(Value::List, _). Somebody will 100% forget to do this and it will be a source of bugs. Also, it's hard to add these retroactively relying on the compiler since some (most?) existing matches for Value are fall-through (use _), or use if let etc. But yeah, if there is a way to do it gracefully, it might be a good solution.

@Mehrbod2002
Copy link
Contributor Author

One problem with the Value + Metadata approach is that for every match Value, it adds one arm that needs to be checked. For example, if you want a list, you need to check for Value::List and Value::ValueWithMetadata(Value::List, _). Somebody will 100% forget to do this and it will be a source of bugs. Also, it's hard to add these retroactively relying on the compiler since some (most?) existing matches for Value are fall-through (use _), or use if let etc. But yeah, if there is a way to do it gracefully, it might be a good solution.

We don't need to use it everywhere ,
Yes matching in Value will be doable for its implementation, for example now we need a Value with metadata on table we use it here or date calculations , not every list we have , currently we have problem with Value::List only on ls and Html .

for ls if use it , we have to Match some commands related to ls like every or some other .

I think until we have these less commands and it is early , it's doable and necessary , because I think we will encounter new features which need a metadata .

@Mehrbod2002
Copy link
Contributor Author

@kubouch
One way we convert if from pipeline and it's easy, just detect which Data has a metadata .
Good news is we have only one Value on list for now and better news our lists d which maybe has a metadata are related to table, so for now we use our new Value+Metadata here only .
Some other commands which read ls for example are finger counting and we can match then easily , if you attention we only have List type for now so ... we walked most of the way on our related commands to ls too . :)

Further PRs if our developers need a metadata can use it easily then in future features too .

@kubouch
Copy link
Contributor

kubouch commented Jan 11, 2023

So we all agreed adding this new recursive Value::ValueWithMetadata variant would add too much maintenance overhead. We thought of a couple of variants:

  • Make Value an opaque type and basically disallow manual pattern matching on the Value (only via helper methods), then adding a new enum variant should not be such a PITA
  • Our longer term goal is to remove Spans from Values and replacing them with either an ID or a Rc<Span>, this could possibly open up space for being able to store the metadata on the together with the Span without adding a new field to Value.

In both cases, we'd need to do some work before being able to address the metadata problem.

One thing that could be tried wight now is to implement the data coming out of ls (or wherever the metadata is generated) as Value::CustomValue. Not sure if it's a good idea and what the outcome would be, I've never worked with custom values before, but maybe it could work.
@sholderbach says CutsomValue might not work either

@sholderbach
Copy link
Member

Value::CustomValue is in itself opaque and wouldn't have the match problem as a trait object, but the problem is the vastness of existing matches on Value. They wouldn't catch it if a Value::Int for example gets wrapped in your special custom value and throw a type error as soon as you encounter a Value::CustomValue(MetaWrapValue(Value::Int)).

@Mehrbod2002
Copy link
Contributor Author

So we all agreed adding this new recursive Value::ValueWithMetadata variant would add too much maintenance overhead. We thought of a couple of variants:

* Make `Value` an opaque type and basically disallow manual pattern matching on the `Value` (only via helper methods), then adding a new enum variant should not be such a PITA

* Our longer term goal is to remove `Span`s from `Value`s and replacing them with either an ID or a `Rc<Span>`, this could possibly open up space for being able to store the metadata on the together with the Span without adding a new field to `Value`.

In both cases, we'd need to do some work before being able to address the metadata problem.

One thing that could be tried wight now is to implement the data coming out of ls (or wherever the metadata is generated) as Value::CustomValue. Not sure if it's a good idea and what the outcome would be, I've never worked with custom values before, but maybe it could work. @sholderbach says CutsomValue might not work either

On first variant you mean create new Enum and cover Value Enum ?
And on second one you mean can we attach new element directly on Span ?

@kubouch
Copy link
Contributor

kubouch commented Jan 12, 2023

Yeah, in the first case, adding a new Value enum variant would have only internal effect without the need to refactor other code.

The second option is about removing Span from being a part of Value and replacing it with an identifier pointing at a global storage of all Spans (similar to how definitions are handled in EngineState). This would reduce the size of the Value and consequently the whole Stack. The same identifier could be used to retrieve metadata as well. I think handling metadata this way is cleaner than adding a new Value enum variant.

The two options are not mutually exclusive: The Value refactor would still be beneficial even if we use the second option for handling metadata.

So yeah, we're putting the metadata on hold for now since there are issues to be handled before that and we need to discuss it a bit more.

@Mehrbod2002
Copy link
Contributor Author

@kubouch
As I am testing now , if cover Value enum with new one , as right now we have two element which maybe contain Metadata , for now we it will be usable on table .
Is it okay to start process and implement it ?

@kubouch
Copy link
Contributor

kubouch commented Jan 14, 2023

I think we want to wait at this moment and first think of a good way forward. The issue is connected to other problems, as I described above, and we haven't yet decided how to proceed.

@fdncred
Copy link
Collaborator

fdncred commented Jan 14, 2023

I really wish we could move forward with this in some way. @Mehrbod2002 is motivated to work on it but the core-team hasn't really made a decision yet. I guess that's because this is "hard" to figure out. I'd like to find a way to move forward in the general right direction, even if we have to change it later. I'm just not sure what that is.

@Mehrbod2002
Copy link
Contributor Author

@fdncred @kubouch
Please consider to look the new commit ,
Just to make sure about we don't conflict with anything and test , added ValueWithMetadata to Value,
luckily just two commands conflict with it and metadata just for now in Nushell used for table and only ls command ... so for now we can rest with minimal codes they works .
and on rendering table just Nushell use flatten command for Value and it improved too .

@Mehrbod2002 Mehrbod2002 marked this pull request as ready for review January 15, 2023 22:19
@fdncred
Copy link
Collaborator

fdncred commented Jan 15, 2023

@Mehrbod2002 I didn't look at the code much but I tried it and it didn't seem to work.
image

@Mehrbod2002
Copy link
Contributor Author

@fdncred
Sure about right commit ? Please consider to check it .
Screenshot 2023-01-16 08:19:57

@fdncred
Copy link
Collaborator

fdncred commented Jan 16, 2023

You are right @Mehrbod2002, it works. I have issues checking out your PR now. I get this unless I use --force. I must of not seen the error previously.

gh pr checkout 7689
From https://github.com/nushell/nushell
 ! [rejected]            refs/pull/7689/head -> lscolor  (non-fast-forward)
failed to run git: exit status 1

image

@Mehrbod2002
Copy link
Contributor Author

You are right @Mehrbod2002, it works. I have issues checking out your PR now. I get this unless I use --force. I must of not seen the error previously.

gh pr checkout 7689
From https://github.com/nushell/nushell
 ! [rejected]            refs/pull/7689/head -> lscolor  (non-fast-forward)
failed to run git: exit status 1

image

Yes
It's because my Branch was not full update, can you help how can update brach with keep these changes ?

@fdncred
Copy link
Collaborator

fdncred commented Jan 16, 2023

I'm no git expert so this could be the wrong advice, but what i usually try to do is to make sure my main branch is up to date and then merge main on top of my changes.

git checkout main
git fetch upstream
git merge upstream/main
git push origin main
git checkout your_branch_name
git merge main

@Mehrbod2002
Copy link
Contributor Author

@fdncred Problem with commit solved now ?

@fdncred
Copy link
Collaborator

fdncred commented Jan 16, 2023

yup, it works better now.

@kubouch
Copy link
Contributor

kubouch commented Jan 17, 2023

This introduces the exact problem I described above. Adding a new enum variant adds too much developer overhead as we need to check every single pattern match on Value to see if it comes with Metadata. IMO this is too error-prone.

@Mehrbod2002
Copy link
Contributor Author

This introduces the exact problem I described above. Adding a new enum variant adds too much developer overhead as we need to check every single pattern match on Value to see if it comes with Metadata. IMO this is too error-prone.

Why do you think one condition is going to be too error-prone ? How many commands will use metadata now ?
If you are going to conver Enum with new one which contains Metadata you have check on it again and too much overhead too.
Only 2 commands not all of them , any implementations further or feature can be able to do it on any command too in future .

And this overhead is we use List and ValueWithMetadata on Ls which we will know it will be ValueWithMetadata .
So we don't need to check every command, we new the commands which maybe contains Metadata we use ValueWithMetadata not List for example :)

Anyway is there better way we can think on it appreciate to know it ( except change the whole core to cover Value with new enum )

@Mehrbod2002
Copy link
Contributor Author

@fdncred @kubouch
So by this agreement , I don't think this PR going to be solving this problem as least temporary .
Anyway no solution we have too .
Ready to close this PR .

@fdncred
Copy link
Collaborator

fdncred commented Jan 17, 2023

I'm sad to see this closed. Hopefully we can come up with a work-able solution soon. Thanks @Mehrbod2002 for working so hard on this.

@kubouch kubouch added the pipeline-metadata Additional information passed along that is not directly attached to the values (e.g. LSCOLORS) label Feb 3, 2023
@kubouch kubouch mentioned this pull request Feb 3, 2023
3 tasks
@Mehrbod2002 Mehrbod2002 deleted the lscolor branch February 4, 2023 21:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
pipeline-metadata Additional information passed along that is not directly attached to the values (e.g. LSCOLORS)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

changing default display_output hook disable ls_colors
5 participants