Update to syntect 4.1, fancy-regex support #900

Merged: 2 commits, Apr 11, 2020

sharkdp (Owner) commented Mar 31, 2020

This PR updates the syntect dependency to 4.1.

It also adds a regex-onig feature (enabled for the application) and a regex-fancy feature, which can be enabled to use fancy-regex as an engine in syntect.

With this change, applications that depend on bat as a library can use

[dependencies.bat]
version = ""
default-features = false
features = ["regex-fancy"]

to remove the onig dependency. The full dependency tree after this update is attached below.

One problem remains unresolved: the included set of syntaxes was built with a version of bat that uses regex-onig, so it is not fully compatible with regex-fancy (due to regex rewriting, as described in the syntect 4.0 release notes). Fixing this would probably require us to

  • Build two versions of bat (bat-onig and bat-fancy)
  • Adapt assets/create.sh to run bat cache --build … with both versions, creating two different binary dumps
  • Adapt assets.rs to read the respective binary, depending on the chosen feature:

bat/src/assets.rs

Lines 96 to 98 in 3355aeb

fn get_integrated_syntaxset() -> SyntaxSet {
    from_binary(include_bytes!("../assets/syntaxes.bin"))
}

Unfortunately, that is currently not possible (with the full syntax set) due to trishume/syntect#287.
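
If that upstream issue were resolved, the feature-dependent selection from the last step could look roughly like the sketch below. This is only an illustration, not code from this PR: the two file names and both helper functions are hypothetical, and the sketch returns the dump path rather than calling include_bytes!, so it compiles without the binary assets being present.

```rust
// Hypothetical asset names. The PR only proposes "two different binary
// dumps" and does not say what they would be called.
const ONIG_DUMP: &str = "../assets/syntaxes-onig.bin";
const FANCY_DUMP: &str = "../assets/syntaxes-fancy.bin";

/// Pure helper (hypothetical): maps the engine choice to a dump path.
fn dump_path(use_fancy_regex: bool) -> &'static str {
    if use_fancy_regex {
        FANCY_DUMP
    } else {
        ONIG_DUMP
    }
}

/// Picks the dump that matches the regex engine selected at compile time.
/// `cfg!(feature = "regex-fancy")` evaluates to `true` only when the crate
/// is built with that Cargo feature enabled; otherwise the onig dump is used.
fn integrated_syntaxset_path() -> &'static str {
    dump_path(cfg!(feature = "regex-fancy"))
}
```

In assets.rs itself, the same decision would more likely be expressed as two #[cfg(feature = "...")] variants of get_integrated_syntaxset, each passing its own literal path to include_bytes!, because that macro requires a string literal known at compile time.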

FYI: @dtolnay

Full dependency tree when using features = ["regex-fancy"]

└── bat
    ├── ansi_colours v1.0.1
    │   [build-dependencies]
    │   └── cc v1.0.50
    ├── ansi_term v0.12.1
    ├── console v0.10.0
    │   ├── clicolors-control v1.0.1
    │   │   ├── lazy_static v1.4.0
    │   │   └── libc v0.2.68
    │   ├── lazy_static v1.4.0 (*)
    │   ├── libc v0.2.68 (*)
    │   ├── regex v1.3.6
    │   │   ├── aho-corasick v0.7.10
    │   │   │   └── memchr v2.3.3
    │   │   ├── memchr v2.3.3 (*)
    │   │   ├── regex-syntax v0.6.17
    │   │   └── thread_local v1.0.1
    │   │       └── lazy_static v1.4.0 (*)
    │   ├── termios v0.3.1
    │   │   └── libc v0.2.68 (*)
    │   └── unicode-width v0.1.7
    ├── content_inspector v0.2.4
    │   └── memchr v2.3.3 (*)
    ├── encoding v0.2.33
    │   ├── encoding-index-japanese v1.20141219.5
    │   │   └── encoding_index_tests v0.1.4
    │   ├── encoding-index-korean v1.20141219.5
    │   │   └── encoding_index_tests v0.1.4 (*)
    │   ├── encoding-index-simpchinese v1.20141219.5
    │   │   └── encoding_index_tests v0.1.4 (*)
    │   ├── encoding-index-singlebyte v1.20141219.5
    │   │   └── encoding_index_tests v0.1.4 (*)
    │   └── encoding-index-tradchinese v1.20141219.5
    │       └── encoding_index_tests v0.1.4 (*)
    ├── error-chain v0.12.2
    │   [build-dependencies]
    │   └── version_check v0.9.1
    ├── globset v0.4.5
    │   ├── aho-corasick v0.7.10 (*)
    │   ├── bstr v0.2.12
    │   │   └── memchr v2.3.3 (*)
    │   ├── fnv v1.0.6
    │   ├── log v0.4.8
    │   │   └── cfg-if v0.1.10
    │   └── regex v1.3.6 (*)
    ├── syntect v4.1.0
    │   ├── bincode v1.2.1
    │   │   ├── byteorder v1.3.4
    │   │   └── serde v1.0.105
    │   ├── bitflags v1.2.1
    │   ├── fancy-regex v0.3.3
    │   │   ├── bit-set v0.5.1
    │   │   │   └── bit-vec v0.5.1
    │   │   └── regex v1.3.6 (*)
    │   ├── flate2 v1.0.14
    │   │   ├── cfg-if v0.1.10 (*)
    │   │   ├── crc32fast v1.2.0
    │   │   │   └── cfg-if v0.1.10 (*)
    │   │   ├── libc v0.2.68 (*)
    │   │   └── miniz_oxide v0.3.6
    │   │       └── adler32 v1.0.4
    │   ├── fnv v1.0.6 (*)
    │   ├── lazy_static v1.4.0 (*)
    │   ├── lazycell v1.2.1
    │   ├── plist v0.5.3
    │   │   ├── base64 v0.10.1
    │   │   │   └── byteorder v1.3.4 (*)
    │   │   ├── humantime v2.0.0
    │   │   ├── indexmap v1.3.2
    │   │   │   [build-dependencies]
    │   │   │   └── autocfg v1.0.0
    │   │   ├── line-wrap v0.1.1
    │   │   │   └── safemem v0.3.3
    │   │   ├── serde v1.0.105 (*)
    │   │   └── xml-rs v0.8.1
    │   ├── regex-syntax v0.6.17 (*)
    │   ├── serde v1.0.105 (*)
    │   ├── serde_derive v1.0.105
    │   │   ├── proc-macro2 v1.0.10
    │   │   │   └── unicode-xid v0.2.0
    │   │   ├── quote v1.0.3
    │   │   │   └── proc-macro2 v1.0.10 (*)
    │   │   └── syn v1.0.17
    │   │       ├── proc-macro2 v1.0.10 (*)
    │   │       ├── quote v1.0.3 (*)
    │   │       └── unicode-xid v0.2.0 (*)
    │   ├── serde_json v1.0.50
    │   │   ├── itoa v0.4.5
    │   │   ├── ryu v1.0.3
    │   │   └── serde v1.0.105 (*)
    │   ├── walkdir v2.3.1
    │   │   └── same-file v1.0.6
    │   └── yaml-rust v0.4.3
    │       └── linked-hash-map v0.5.2
    └── unicode-width v0.1.7 (*)

gilescope commented

The idea that people could use bat on a Windows machine with just a Rust install, without having to worry about configuring C++ compilers, is a win in my book even if not every syntax is supported.

sharkdp (Owner, Author) commented Apr 11, 2020

I guess there is no harm in merging this, even if the regex-fancy version is not fully compatible with all syntaxes.

@sharkdp sharkdp merged commit fccbe4f into master Apr 11, 2020
@sharkdp sharkdp deleted the syntect-4.1-update branch April 11, 2020 08:33