Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make index compatible with virtual drives on Windows #1843

Merged
merged 2 commits into from
Feb 14, 2023

Conversation

gyk
Copy link
Contributor

@gyk gyk commented Feb 9, 2023

Fixes #1724.

Cargo.toml Outdated
@@ -23,6 +23,7 @@ regex = { version = "1.5.5", default-features = false, features = ["std", "unico
aho-corasick = "0.7"
tantivy-fst = "0.4.0"
memmap2 = { version = "0.5.3", optional = true }
normpath = "1.0.0"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd rather not introduce a dependency for this. Can you remove it and stick to std's canonicalize?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you remove it and stick to std's canonicalize?

Unfortunately this is not possible currently as the PR attempting to fix the problem was closed, but in the future we can switch to std::path::absolute once it is stabilized.

IMHO normpath is quite lightweight, and its biggest dependency is windows-sys = { version = "0.45", features = ["Win32_Storage_FileSystem"] } (not the macro-heavy windows crate), which only adds some reasonable amount of compilation time and little extra size to the release build. Also Tantivy has already included dependencies of windows 0.39 and windows-sys 0.42.

cargo-bloat output
    Analyzing target\release\bloat-normpath.exe

 File  .text     Size    Crate Name
 1.9%   2.6%   3.2KiB      std rustc_demangle::demangle
 1.7%   2.4%   3.0KiB      std <rustc_demangle::legacy::Demangle as core::fmt::Display>::fmt
 1.4%   1.9%   2.4KiB normpath normpath::normalize::normalize_virtually
 1.3%   1.7%   2.2KiB      std core::str::pattern::impl$28::is_contained_in
 1.1%   1.5%   1.9KiB      std <std::sys_common::backtrace::_print::DisplayBacktrace as core::fmt::Display>::fmt
 1.0%   1.4%   1.8KiB      std rustc_demangle::v0::Printer::print_path
 1.0%   1.4%   1.7KiB      std std::sys_common::once::generic::Once::call
 1.0%   1.4%   1.7KiB      std std::sys::windows::path::maybe_verbatim
 1.0%   1.4%   1.7KiB      std rustc_demangle::v0::Printer::print_const
 1.0%   1.3%   1.7KiB      std core::str::count::do_count_chars
56.7%  77.7%  97.9KiB          And 443 smaller methods. Use -n N to show more.
73.0% 100.0% 126.0KiB          .text section size, the file size is 172.5KiB

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gyk I won't merge this with the extra dependency. Isn't this PR useful without the canonicalization?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, with std::fs::canonicalize the index still cannot be put on special drives. You might want to consider:

  1. Make normpath a Windows-only dependency, or
  2. Call std::fs::canonicalize like before, but in case it returns this kind of error but the path does exist, just use the original directory_path as a fallback.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah ok so canonicalize return an error for that kind of path. I see. I'm ok with solution 1 or solution 2. You can decide.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I'll update the code later.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@fulmicoton Updated using approach 2 above. PTAL.

Copy link
Collaborator

@fulmicoton fulmicoton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gyk can you remove the deps on normpath?

@codecov-commenter
Copy link

Codecov Report

Merging #1843 (abec685) into main (dab93df) will decrease coverage by 0.03%.
The diff coverage is 7.14%.

📣 This organization is not using Codecov’s GitHub App Integration. We recommend you install it so Codecov can continue to function properly for your repositories. Learn more

@@            Coverage Diff             @@
##             main    #1843      +/-   ##
==========================================
- Coverage   94.41%   94.39%   -0.03%     
==========================================
  Files         297      297              
  Lines       53269    53281      +12     
==========================================
  Hits        50294    50294              
- Misses       2975     2987      +12     
Impacted Files Coverage Δ
src/directory/mmap_directory.rs 87.28% <7.14%> (-2.64%) ⬇️
src/store/index/mod.rs 97.83% <0.00%> (-0.55%) ⬇️
src/query/boolean_query/block_wand.rs 97.06% <0.00%> (+0.20%) ⬆️

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

@fulmicoton fulmicoton merged commit dfe4e95 into quickwit-oss:main Feb 14, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Create index in virtual drives on Windows
3 participants