Linux: This operation requires the ICU library #172

Closed
@zcobol

Description

  • Install edit on Linux.
  • Open the editor and select Find.
  • The error "This operation requires the ICU library" is shown.

After this error, the Edit menu no longer shows the Find and Replace options:

Before selecting Find:

Image

After:

Image

Activity

Neo-vortex commented on May 21, 2025

Have you tried installing it?
apt-get install libicu-dev
zcobol commented on May 21, 2025

Author

@Neo-vortex The test was done on Fedora 42 (no WSL), and libicu was already installed. After adding libicu-devel it works. Thanks for the info!

However, the expectation was that all dependencies would be satisfied by the release at https://github.com/microsoft/edit/releases/download/v1.0.0/edit-1.0.0-x86_64-linux-gnu.xz

changed the title from "On Linux `Find` doesn't work" to "Linux: This operation requires the ICU library" on May 21, 2025
lhecker commented on May 21, 2025

Member

Adding a Unicode-aware regex engine (like the regex crate) to the project would increase the binary size from 200KB to over 700KB. Enabling its performance features would increase it to about 1100KB. Because of this, the editor depends on your OS to provide libicu. In the future it would be nice to have a fallback for the search that at least works with ASCII (and matches Unicode literally, byte-by-byte).
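To make the proposed fallback semantics concrete, here is a minimal sketch of a search that folds case only for ASCII letters and compares every other byte literally. The function name is hypothetical and this is not code from the editor; it is a naive scan that only illustrates the matching rule.

```rust
/// Returns the byte offset of the first match of `needle` in `haystack`.
/// ASCII letters compare case-insensitively; every other byte, including
/// the bytes of multi-byte UTF-8 sequences, must match exactly.
/// (Hypothetical helper, not part of the editor.)
fn find_ascii_case_insensitive(haystack: &[u8], needle: &[u8]) -> Option<usize> {
    if needle.is_empty() {
        // `windows(0)` would panic, so handle the empty needle up front.
        return Some(0);
    }
    haystack
        .windows(needle.len())
        .position(|window| window.eq_ignore_ascii_case(needle))
}
```

With this rule, "WORLD" matches "world", but "école" does not match "ÉCOLE", because "é" and "É" are different byte sequences in UTF-8.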

#52 reported the same issue but let's keep them separate and use this issue to track finding the right ICU version. For more info, see here: #52 (comment)

#52 can then be used to focus on the wrong sort order.

added labels on May 21, 2025:
I-bug: It shouldn't be doing this.
P-medium: Important issues, but not urgent. Example: UI doesn't work, but it's not crashing.
E-help-wanted: We encourage you to jump in on these!
kasini3000 commented on May 21, 2025

Same issue with my .ps1 file (UTF-16 LE with BOM).
I suggest adding the following to the error message:
apt-get install libicu-dev
dnf install libicu-devel

emk2203 commented on May 21, 2025

> Adding a Unicode-aware regex engine (like the regex crate) to the project would increase the binary size from 200KB to over 700KB. Enabling its performance features would increase it to about 1100KB. Because of this, the editor depends on your OS to provide libicu. In the future it would be nice to have a fallback for the search that at least works with ASCII (and matches Unicode literally, byte-by-byte).

I can confirm the issue exists also in Kubuntu 24.10 dev. Even with libicu76 installed, an installation of libicu-dev is needed.

Regarding the file sizes: I was thinking about using edit in rescue systems, but the need to install 52.6 MB of libicu-dev on top of 38.7 MB of libicu76 defeats the purpose. A 197 kB binary is meaningless if it needs 91.3 MB of libraries.

In contrast, the standard rescue system editor nano has a 285 kB binary and depends on libc6 (>= 2.38), libncursesw6 (>= 6), libtinfo6 (>= 6).

Is it possible to get a self-contained binary which includes the regex crate? Just the instructions to build on Linux would be fine if they lead to this 1.1 MB binary including everything. That would be better than needing 91.3 MB of libraries. In the future, a version including a search fallback would be great.

lhecker commented on May 21, 2025

Member

If anyone wants to send a PR that adds support for a different regex engine, I'll happily accept it. It has to be a compile-time feature though (i.e. a feature in Cargo.toml). Since the buffer is chunked, it needs to use something like the regex-cursor crate.
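As a rough illustration of what "a compile-time feature" could look like on the Rust side, here is a small sketch. The feature name `regex-fallback` and the function `backend_name` are hypothetical, and the feature would also have to be declared under `[features]` in Cargo.toml; nothing here reflects the project's actual layout.

```rust
// Exactly one of these two definitions is compiled, depending on whether
// the (hypothetical) Cargo feature `regex-fallback` is enabled.

#[cfg(feature = "regex-fallback")]
fn backend_name() -> &'static str {
    // Built with `--features regex-fallback`: use the bundled regex engine.
    "regex"
}

#[cfg(not(feature = "regex-fallback"))]
fn backend_name() -> &'static str {
    // Default build: rely on the ICU library provided by the OS.
    "icu"
}
```

A default build would report "icu", while `cargo build --features regex-fallback` would select the other definition, so the choice costs nothing at runtime.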

However, I'd prefer skipping that part and immediately going to the destination: We should have a fallback that works without a large regex crate. It would disable the regex button entirely (and perhaps even the whole-word button?) and perform only ASCII-case-insensitive matching (non-ASCII would get matched literally). It would be easy to build something like that with a basic Boyer–Moore search algorithm (probably needs a custom solution since the buffer is chunked).
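A sketch of how such a fallback might look, under stated assumptions: it uses the Boyer–Moore–Horspool simplification (bad-character shifts only) rather than full Boyer–Moore, and it models the chunked buffer as a plain list of byte slices. `ChunkedText` and `find_chunked` are hypothetical names, and the editor's real buffer type will differ.

```rust
/// Hypothetical read-only view over a chunked buffer.
struct ChunkedText<'a> {
    chunks: Vec<&'a [u8]>,
    starts: Vec<usize>, // starts[i] is the logical offset of chunks[i][0]
    len: usize,
}

impl<'a> ChunkedText<'a> {
    fn new(chunks: &[&'a [u8]]) -> Self {
        let (mut kept, mut starts, mut len) = (Vec::new(), Vec::new(), 0);
        for &chunk in chunks {
            if chunk.is_empty() {
                continue; // skip empty chunks so `starts` is strictly increasing
            }
            kept.push(chunk);
            starts.push(len);
            len += chunk.len();
        }
        Self { chunks: kept, starts, len }
    }

    /// Byte at logical offset `pos`. The caller guarantees `pos < self.len`.
    fn byte_at(&self, pos: usize) -> u8 {
        // Index of the last chunk whose start offset is <= pos.
        let i = self.starts.partition_point(|&s| s <= pos) - 1;
        self.chunks[i][pos - self.starts[i]]
    }
}

/// Horspool search: ASCII letters match case-insensitively,
/// all other bytes must match exactly.
fn find_chunked(text: &ChunkedText, needle: &[u8]) -> Option<usize> {
    let m = needle.len();
    if m == 0 {
        return Some(0);
    }
    if m > text.len {
        return None;
    }
    // Bad-character table keyed by the ASCII-lowercased byte.
    let mut shift = [m; 256];
    for (i, &b) in needle[..m - 1].iter().enumerate() {
        shift[b.to_ascii_lowercase() as usize] = m - 1 - i;
    }
    let mut pos = 0;
    while pos + m <= text.len {
        // Compare the current window right to left.
        let mut j = m;
        while j > 0 && text.byte_at(pos + j - 1).eq_ignore_ascii_case(&needle[j - 1]) {
            j -= 1;
        }
        if j == 0 {
            return Some(pos);
        }
        // Shift by the table entry for the window's last byte (always >= 1).
        let last = text.byte_at(pos + m - 1).to_ascii_lowercase();
        pos += shift[last as usize];
    }
    None
}
```

Matches that straddle a chunk boundary are found because all access goes through `byte_at`. The binary search per byte keeps the sketch short; a real implementation would more likely keep a cursor into the current chunk.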

diabloproject commented on May 26, 2025

Contributor

I was not able to replicate the issue, even in an Ubuntu Docker container, but maybe something like this will fix the "ICU not found" issue?

172.patch.zip

lhecker commented on May 27, 2025

Member

That's the approach used by C# as far as I know and I'd prefer not adopting it. ICU releases 2 (?) major versions each year and so the list of hardcoded version numbers will quickly run out.

DHowett was prototyping an alternative approach, I believe by peering into the /etc/ld.so.cache IIRC.
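Whatever the discovery mechanism ends up being (the linker cache, a directory listing, or something else), one piece it would likely need is picking the newest ICU from the library names it finds. A minimal sketch of just that piece, assuming the usual `libicuuc.so.<major>[.<minor>]` naming; it does not parse /etc/ld.so.cache, and both function names are hypothetical.

```rust
/// Extracts the major version from a file name such as `libicuuc.so.76`
/// or `libicuuc.so.76.1`. Returns `None` for anything else, including
/// the unversioned `libicuuc.so`.
fn icu_major_version(file_name: &str) -> Option<u32> {
    let rest = file_name.strip_prefix("libicuuc.so.")?;
    let major = rest.split('.').next()?;
    major.parse().ok()
}

/// Picks the entry with the highest major version from a list of file
/// names, however that list was obtained.
fn newest_icu<'a>(file_names: &[&'a str]) -> Option<(&'a str, u32)> {
    file_names
        .iter()
        .filter_map(|&name| icu_major_version(name).map(|v| (name, v)))
        .max_by_key(|&(_, v)| v)
}
```

Because the version is read from what is actually installed, nothing has to be updated when a new ICU release comes out.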

diabloproject commented on May 27, 2025

Contributor

Not every distribution/environment has it. Plus, pinning specific versions means that if a future release breaks the API/ABI somehow, the editor will not pick up the faulty version.

I agree that this is not the most elegant solution, but reading values that will be absent in many environments does not sound great either.

lhecker commented on May 27, 2025

Member

Yes, it's all around bad. Still, I'd like to avoid hardcoding a list of versions if possible. We should consider it an option of last resort. Your patch, for instance, will stop working in a few months already as ICU 77.1 rolls out (it was released 2 weeks ago).

diabloproject commented on May 27, 2025

Contributor

It should be possible to solve using a range of versions to test, or something similar. And use it as a fallback, while the primary solution could be ldconfig. Just none of the distros I use have ld.so.cache, so I have my concerns =/
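The "range of versions" idea could be sketched roughly like this: instead of a hardcoded list, generate candidate library names for a span of major versions, newest first, and try loading them in order. The function name and the bounds are hypothetical, and the actual loading step (for example via dlopen) is left out.

```rust
/// Builds the list of library names to try, in order: the unversioned
/// name first, then `libicuuc.so.<major>` from `max_major` down to
/// `min_major`. (Hypothetical helper; the bounds are up to the caller.)
fn icu_candidates(min_major: u32, max_major: u32) -> Vec<String> {
    let mut names = vec!["libicuuc.so".to_string()];
    names.extend(
        (min_major..=max_major)
            .rev()
            .map(|major| format!("libicuuc.so.{major}")),
    );
    names
}
```

Setting the upper bound well above the current ICU release would delay the "list runs out" problem but not remove it, which is why this probably only makes sense as a fallback.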

hiareigl commented on Jun 4, 2025

Would it be possible to use something like a Rust port of Lua pattern matching instead of regular expressions?
Programming in Lua - pattern matching

If one needs more, one could use an external tool. (In case filtering or highlighting via an external command is planned.)

MrDowntempo commented on Jun 17, 2025

> If anyone wants to send a PR that adds support for a different regex engine, I'll happily accept it. It has to be a compile-time feature though (i.e. a feature in Cargo.toml). Since the buffer is chunked, it needs to use something like the regex-cursor crate.
>
> However, I'd prefer skipping that part and immediately going to the destination: We should have a fallback that works without a large regex crate. It would disable the regex button entirely (and perhaps even the whole-word button?) and perform only ASCII-case-insensitive matching (non-ASCII would get matched literally). It would be easy to build something like that with a basic Boyer–Moore search algorithm (probably needs a custom solution since the buffer is chunked).

I'm not a Rust dev, so I can't provide the PR, but are you aware of the regex-lite crate, which aims to provide regex with a smaller impact on binary size? https://docs.rs/regex-lite/latest/regex_lite/ It still might be too heavy.

lhecker commented on Jun 17, 2025

Member

I think adding a non-Unicode regex library would be a bad trade-off. If anything, we should consider making the regex-cursor crate a compile-time option. A fallback Boyer–Moore or similar searcher is still useful in case an ICU build of this project fails to load ICU for some reason.

added a commit that references this issue on Jun 19, 2025
b277a1e