Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

v0.8.0 Fix wrapped filenames with hyperlinks #223

Merged
merged 16 commits into from Sep 26, 2022

Conversation

dandavison
Copy link
Contributor

@dandavison dandavison commented Sep 14, 2022

Hi @zhiburt, this patch fixes nushell bug nushell/nushell#6553 causing filenames in narrow tables to be incorrectly displayed, and fixes #203. The patch was made against tabled v0.8.0 (the version of tabled that nushell is currently using), but as you can see it's a tiny change, which can also be made against src/features/width/wrap.rs in your devel branch of tabled (here's a branch targeting your current master). I thought this PR targeting v0.8.0 might be more useful for getting the fix into nushell sooner. The crate it's using is here. It has unit tests and has been well tested by users, running in delta for a couple of years now.

I've added one example: cargo run --example hyperlink_keep_words displays a table which is incorrect without the patch:

On this branch

 〉cargo run --example hyperlink_keep_words
.-----------------------------------------------.
| name                           | is_hyperlink |
| Debian                         | true         |
| Debian                         | false        |
| DebianDebianDebianDebianDebian | true         |
| DebianDebianDebianDebianDebian |              |
| DebianDebianDebianDebian       |              |
| DebianDebianDebianDebianDebian | false        |
| DebianDebianDebianDebianDebian |              |
| DebianDebianDebianDebian       |              |
'-----------------------------------------------'

With the patch reverted

 〉git revert --no-edit HEAD
 .-----------------------------------------------.
| name                           | is_hyperlink |
| De | true         |
| bian                      |              |
| Debian                         | false        |
| De | true         |
| bianDebianDebianDebianDebianDe |              |
| bianDebianDebianDebianDebianDe |              |
| bianDebianDebianDebian    |              |
| DebianDebianDebianDebianDebian | false        |
| DebianDebianDebianDebianDebian |              |
| DebianDebianDebianDebian       |              |
'-----------------------------------------------'

Currently it's simply removing the hyperlinks. I have the code to re-apply the hyperlinks to the wrapped segments, but I didn't want to disturb your split_keeping_words and split functions.

@zhiburt
Copy link
Owner

zhiburt commented Sep 14, 2022

Hi @dandavison

But I was thinking we shall preserve the links?
Or we can strip them?

I mean in case of wrapping where text is just a link with no other content, all lines would be links.
What do you think?

@dandavison
Copy link
Contributor Author

But I was thinking we shall preserve the links?

Hi @zhiburt, good call! I agree. I didn't do that last night because I wanted to think about the API, but I'm going to have a go now. I'm thinking that the best way might be to return a prefix and a suffix, that can be reapplied to each fragment of text.

@dandavison
Copy link
Contributor Author

Hey @zhiburt, just a quick question: could you confirm that this is a bug in v0.8.0: 82e28c9 I think you may have fixed it in master.

@zhiburt
Copy link
Owner

zhiburt commented Sep 14, 2022

Hey @zhiburt, just a quick question: could you confirm that this is a bug in v0.8.0: 82e28c9 I think you may have fixed it in master.

I think you're correct with a fix.
I think it was not fixed in the master.

is_first_word = true;

Do you still working on this?
As I kind of stuck..... 😞 as I've noticed some issue in my logic (I was getting link indexes and strip the links via regex and then doing wrap while restoring links by offsets, but I forget about shift SGR gives (which we also "stripped" before)).

@dandavison
Copy link
Contributor Author

Thanks! Yes, I'm working on this. I'm almost ready to open a PR. Hopefully I can do it this evening. Here's my current diff:

dandavison/tabled@v0.8.0-vte-ansi-iterator-hyperlinks...v0.8.0-vte-ansi-iterator-hyperlinks-osc-partition

dandavison/vte-ansi-tools@main...osc-partition

@dandavison
Copy link
Contributor Author

Clearly we have to support stuff like this :)

printf '\e]8;;http://example.com\e\\\e[31mThis\e[0m \e[32mis\e[0m \e[33ma\e[0m \e[34mlink\e[0m\e]8;;\e\\\n'

image

(I wonder if Alacritty's even supporting that correctly? There's no link on "is a link")

@dandavison
Copy link
Contributor Author

dandavison commented Sep 15, 2022

Hi @zhiburt, OK, this is very close I think.

cargo run --features=color --example hyperlink_keep_words
image

So that is almost correct. There's just a small bug with the way the colors are starting too early, at the ends of the lines: image

The bug only occurs with split_keeping_words. Here it is using plain split:

image

I'm guessing we should look for this bug in the previously existing code, but I might be wrong :)

Anyway, what do you think? Shall we fix nushell tables with an update to tabled including this? And if you're able to help find that bug that would be awesome, but I'll look too.

@dandavison
Copy link
Contributor Author

nushell tables are looking correct. It looks like that bug doesn't affect unicolored file names:

image

@dandavison dandavison force-pushed the v0.8.0-vte-ansi-iterator-hyperlinks branch from a39052e to eef2e51 Compare September 15, 2022 04:16
src/width.rs Outdated Show resolved Hide resolved
@fdncred
Copy link
Contributor

fdncred commented Sep 15, 2022

loving this team work - excited to see a fix in nushell, although a light colored terminal is questionable. 🤣

Copy link
Owner

@zhiburt zhiburt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @dandavison sorry for the delay.

I think the PR accomplish a general case.
Where we have a string being considered to be a link.
But it may not be always the case, see my comments bellow.

Yes that's why it's not that an easy issue, and doing it with no allocations is even more complex.

What do you think @dandavison am I right here?

src/width.rs Outdated Show resolved Hide resolved
src/width.rs Outdated Show resolved Hide resolved
@dandavison
Copy link
Contributor Author

@zhiburt thanks very much for reviewing carefully and catching those two points. I've updated the branch to strip the hyperlinks in the cases that you identified.

I suggest that we leave it to future work to do the more sophisticated operations of reapplying parsed hyperlinks to the precise substrings that they should apply to. Meanwhile, nushell doesn't need that feature, and the current branch state will fix nushell. Does that sound OK to you?

@zhiburt
Copy link
Owner

zhiburt commented Sep 15, 2022

So you @dandavison return prefix/suffix only in case if 1 link is present and no additional text is there, otherwise you trim the links. Correct?

    let table = |text: &str| {
        let data = [Distribution {
            name: text.to_owned(),
            is_hyperlink: true,
        }];

        Table::from_iter(data)
            .with(
                Segment::all()
                    .modify()
                    .with(Width::wrap(30).keep_words())
                    .with(Alignment::left()),
            )
            .to_string()
    };

    let text = format!(
        "asd {} 2 links in a string {}",
        format_osc8_hyperlink("https://www.debian.org/", "Debian"),
        format_osc8_hyperlink("https://www.wikipedia.org/", "Debian"),
    );
    assert_eq!(
        table(&text),
        "+--------------------------------+--------------+\n\
         | name                           | is_hyperlink |\n\
         +--------------------------------+--------------+\n\
         | asd Debian 2 links in a string | true         |\n\
         | Debian                         |              |\n\
         +--------------------------------+--------------+"
    );

    let text = format!(
        "link: {}",
        format_osc8_hyperlink("https://www.debian.org/", "Debian"),
    );
    assert_eq!(
        table(&text),
        "+--------------------------------+--------------+\n\
         | name                           | is_hyperlink |\n\
         +--------------------------------+--------------+\n\
         | link: Debian                   | true         |\n\
         +--------------------------------+--------------+"
    );

    let text = format!(
        "{} :link",
        format_osc8_hyperlink("https://www.debian.org/", "Debian"),
    );
    assert_eq!(
        table(&text),
        "+--------------------------------+--------------+\n\
         | name                           | is_hyperlink |\n\
         +--------------------------------+--------------+\n\
         | Debian :link                   | true         |\n\
         +--------------------------------+--------------+"
    );

I suggest that we leave it to future work to do the more sophisticated operations of reapplying parsed hyperlinks to the precise substrings that they should apply to. Meanwhile, nushell doesn't need that feature, and the current branch state will fix nushell. Does that sound OK to you?

I think you may be right.
Doing it like you did is good enough in most cases.

Could you add tests?
And could you hide the dependency after color feature? (yes... it's better be called ansi but....)

Let me know if you don't have time.

@dandavison
Copy link
Contributor Author

Could you add tests?
And could you hide the dependency after color feature?

Will do!

src/width.rs Outdated Show resolved Hide resolved
dandavison added a commit to dandavison/tabled that referenced this pull request Sep 17, 2022
@zhiburt
Copy link
Owner

zhiburt commented Sep 19, 2022

Hi there,
sorry for the delay (as always)

So what do you think shall we merge?

If we want to, we can try to reduce allocations in subsequent PRs. E.g. it might be possible to return Cow from strip_osc and sometimes not allocate.

Personally I am OK with it, as it should not happen often.

It would also be possible to extend strip_osc so that it returns all URLs encountered; currently it returns the first only. It would then be possible in theory to reapply the URLs to the correct substrings of wrapped text.

As we've discussed already it may be left as a TODO, but if you wish welcome 😅

color starting too early, on the line before it should start

I could be intentional could you show the string you've used?

tabled is inserting spaces between the non-word ANSI boundaries due to a .push_str(' ')

True but where it may become an issue?

@dandavison
Copy link
Contributor Author

dandavison commented Sep 19, 2022

Hi @zhiburt, no problem!

color starting too early, on the line before it should start
could you show the string you've used?

You can see the issue in the output of

cargo run --features=color --example hyperlink_keep_words

tabled is inserting spaces between the non-word ANSI boundaries due to a .push_str(' ')

True but where it may become an issue?

You can see it in the spaces between the word "Debian". (Even if you get rid of the hyperlink, there are still spaces inserted)

image

@zhiburt
Copy link
Owner

zhiburt commented Sep 19, 2022

You can see it in the spaces between the word "Debian". (Even if you get rid of the hyperlink, there are still spaces inserted)

These spaces must be not colored right?

image

Thanks for pointing me it.
I'll work on it.

@dandavison
Copy link
Contributor Author

dandavison commented Sep 19, 2022

These spaces must be not colored right?

Right:

  1. The black and blue colored rectangles should not be there.
  2. It should be DebianDebianDebian, not Debian Debian Debian

I'll work on it.

Great!

I'm working on removing all usage of ansi_term from the vte-ansi-tools library that contains the AnsiIterator.

@zhiburt
Copy link
Owner

zhiburt commented Sep 21, 2022

About the issue with spaces.
I believe it was fixed on master earlier.

image
image

@dandavison
Copy link
Contributor Author

About the issue with spaces.
I believe it was fixed on master earlier.

OK, great. I don't think it's immediately necessary for nushell, since nushell links are a single color. So I think it would be OK to create an initial release fixing nushell that doesn't contain the fix in master. (But if you know how to apply the fix to this branch then please feel free to push commits to my branch)

(By the way, I've now removed all usage of ansi_term from the vte-ansi-tools library.)

@fdncred
Copy link
Contributor

fdncred commented Sep 21, 2022

The only time (i think) that things are multiple colors of ansi is when using find. Here's an example. This is in the current main. I'm not sure this applies here though.
image

@dandavison
Copy link
Contributor Author

dandavison commented Sep 21, 2022

The only time (i think) that things are multiple colors of ansi is when using find. Here's an example. This is in the current main. I'm not sure this applies here though.

Thank you @fdncred I think this is an excellent point, and very relevant. It gives a great real-world test that everything is working correctly.

On a first look, I sort of think there may be a bug: I think sometimes the way it is inserting the ANSI colors for the find result is breaking the hyperlink (even without wrapping).

For example, try searching for a single letter like ls | find t in the nushell repo root. I am seeing that some of the files retain functioning symlinks and some do not. E.g. check out the one under my cursor in the screenshot below: the hyperlink is only rendered on the first part of the word. (I don't have time to look into this any more now though!).

image

@fdncred
Copy link
Contributor

fdncred commented Sep 21, 2022

There could be a bug but I can't reproduce your findings on the latest main branch in Windows.
image

@dandavison
Copy link
Contributor Author

dandavison commented Sep 21, 2022

Ah, thanks. Looks like it's an Alacritty bug. Works on Wezterm. Do you see it on Alacritty? I can report it to Alacritty this evening. It's what I also mentioned above: #223 (comment)

@fdncred
Copy link
Contributor

fdncred commented Sep 21, 2022

I'm not sure how to enable links in alacritty. A quick search didn't reveal much. I enabled url: and laucher: but it doesn't seem to repond to it. so i'm not really sure what is wrong.

@zhiburt zhiburt self-requested a review September 25, 2022 21:15
@zhiburt
Copy link
Owner

zhiburt commented Sep 25, 2022

I guess it's finished right?

I hope I resolved the conflicts.

I wonder if this is correct?

tabled/tests/width_test.rs

Lines 2468 to 2483 in 42addd7

let text = format!(
"{} :link",
format_osc8_hyperlink("https://www.debian.org/", "Debian"),
);
assert_eq!(
table(&text),
"+-------+-------+\n\
| name | is_hy |\n\
| | perli |\n\
| | nk |\n\
+-------+-------+\n\
| Debia | true |\n\
| n | |\n\
| :link | |\n\
+-------+-------+"
);

Also the question do you plan to publish vte-ansi-tools on crates.io?

@dandavison
Copy link
Contributor Author

dandavison commented Sep 25, 2022

Great. What do you think about just copying strip_osc and AnsiIterator into a module in tabled for now, so that this fix can go ahead without waiting for the crate to be prepared? (documentation etc) I'd be happy with that as long as we can switch to a crate when one exists, and don't have much time right now for making the crate carefully.

@zhiburt
Copy link
Owner

zhiburt commented Sep 25, 2022

What do you think about just copying strip_osc and AnsiIterator into a module in tabled for now, so that this fix can go ahead without waiting for the crate to be prepared? (documentation etc) I'd be happy with that as long as we can switch to a crate when one exists, and don't have much time right now for making the crate carefully.

Sounds good, I'll work on it.
Thanks for the commitment.

But still is it correct?
I think a link must be there right?

tabled/tests/width_test.rs

Lines 2468 to 2483 in 42addd7

let text = format!(
"{} :link",
format_osc8_hyperlink("https://www.debian.org/", "Debian"),
);
assert_eq!(
table(&text),
"+-------+-------+\n\
| name | is_hy |\n\
| | perli |\n\
| | nk |\n\
+-------+-------+\n\
| Debia | true |\n\
| n | |\n\
| :link | |\n\
+-------+-------+"
);

Signed-off-by: Maxim Zhiburt <zhiburt@gmail.com>
…avison/tabled into arthurdent-v0.8.0-vte-ansi-iterator-hyperlinks
@zhiburt
Copy link
Owner

zhiburt commented Sep 26, 2022

But still is it correct?
I think a link must be there right?

It was correct,
When there's not only link we strip the link. Which is good enough as we agreed.

Thanks for your contribution once again.
@dandavison

@zhiburt zhiburt merged commit ffe1f55 into zhiburt:master Sep 26, 2022
@dandavison
Copy link
Contributor Author

Nice work finishing this off @zhiburt! I'm running your branch nushell/nushell#6286 in my local nushell to test it. I'm a bit busy the next two weeks but later on I'll look into getting that crate published, and I'd like to help out with benchmarking / optimizations if I get time.

@zhiburt
Copy link
Owner

zhiburt commented Oct 12, 2022

Hi @dandavison

Wanted to notice that;
I've spend a bit of time to try to switch to vte (basically your impl) in ansitok.

Some example of an usage.
https://gitlab.com/zhiburt/ansitok/-/blob/b963d7a83a8d928c78000a96e8bafe55a513205e/examples/parse_sgr.rs

I also would like to highlight 1 behavior.
I might be wrong but in an original logic.

Parsing this string \x1b\x1b\x1b\x1b\x1b would result in no yielded elements. Which in my mind must be 5 ESC?
I am pointing this out just in case it matters.

@dandavison
Copy link
Contributor Author

I've spend a bit of time to try to switch to vte (basically your impl) in ansitok.

Cool! My only request is: can we try to keep the VTE parser functions used in tabled and ansitok fairly cleanly isolated, so that we have the option in the future of moving them into a VTE ANSI Parser crate (like I started here: https://github.com/dandavison/vte-ansi-tools)

Parsing this string \x1b\x1b\x1b\x1b\x1b would result in no yielded elements. Which in my mind must be 5 ESC?

Yes I'm open to suggestions here. I'm not sure of the details right now but I think that one reason I ignore ESC like that is because it is used as a terminator in the OSC8 hyperlinks:

        format!(
            "{osc}8;;{url}{st}{text}{osc}8;;{st}",
            url = url,
            text = text,
            osc = "\x1b]",
            st = "\x1b\\"
        )

@dandavison
Copy link
Contributor Author

can we try to keep the VTE parser functions used in tabled and ansitok fairly cleanly isolated, so that we have the option in the future of moving them into a VTE ANSI Parser crate

To be clear: my concrete hope here is that delta and nushell and tabled / ansitok can share a single implementation of the VTE ANSIIterator, and share some of the associated utilities built using the ANSIIterator. I don't mind too much where code is located!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Recognize terminal links when doing wrap/truncate
3 participants