
Redesigned Components iterator to use front and back indexing instead of mutating and subslicing path field #156496

Open
asder8215 wants to merge 1 commit into rust-lang:main from asder8215:components_rewrite


@asder8215
Contributor

@asder8215 asder8215 commented May 12, 2026

This PR completely changes how Components<'_> is implemented. Currently, the Components<'_> iterator 'consumes' components by mutating its path field into a subslice that holds the remaining unconsumed path components (the consumed component is what Components::next or Components::next_back returns). This PR instead keeps the path field unmodified and uses front and back indices to separate consumed components from unconsumed ones.
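To make the idea concrete, here is a minimal sketch of the front/back indexing strategy on a plain `/`-separated string. It is not the code from this PR: the IndexedSegments type and its as_str method are hypothetical stand-ins, and real Components handling (prefixes, root, `.` and `..`, platform separators) is left out on purpose.

```rust
/// Hypothetical, simplified stand-in for the index-based design.
/// It splits a `/`-separated string and never reassigns `path`.
struct IndexedSegments<'a> {
    path: &'a str, // the full input, kept unmodified
    front: usize,  // byte offset of the first unconsumed byte
    back: usize,   // byte offset one past the last unconsumed byte
}

impl<'a> IndexedSegments<'a> {
    fn new(path: &'a str) -> Self {
        IndexedSegments { path, front: 0, back: path.len() }
    }

    /// The unconsumed remainder, borrowed straight from `path` (no clone).
    fn as_str(&self) -> &'a str {
        &self.path[self.front..self.back]
    }
}

impl<'a> Iterator for IndexedSegments<'a> {
    type Item = &'a str;

    fn next(&mut self) -> Option<&'a str> {
        while self.front < self.back {
            let rest = &self.path[self.front..self.back];
            // Take everything up to the next separator and step over it.
            let (seg, consumed) = match rest.find('/') {
                Some(i) => (&rest[..i], i + 1),
                None => (rest, rest.len()),
            };
            self.front += consumed;
            // Skip empty segments produced by repeated separators.
            if !seg.is_empty() {
                return Some(seg);
            }
        }
        None
    }
}

impl<'a> DoubleEndedIterator for IndexedSegments<'a> {
    fn next_back(&mut self) -> Option<&'a str> {
        while self.front < self.back {
            let rest = &self.path[self.front..self.back];
            // Take everything after the last separator and step back over it.
            let (seg, consumed) = match rest.rfind('/') {
                Some(i) => (&rest[i + 1..], rest.len() - i),
                None => (rest, rest.len()),
            };
            self.back -= consumed;
            if !seg.is_empty() {
                return Some(seg);
            }
        }
        None
    }
}
```

Only the two indices move in this sketch, so the remainder can always be rebuilt from the original slice without copying the iterator.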

This PR benefits functions like Components::as_path, which is used in multiple areas of the standard library. Previously, Components::as_path had to clone the iterator internally to present the unconsumed path, because the original consuming behavior on the path field would not allow the returned &'a Path to remain valid after a Components::next or Components::next_back call. Because the current Components iterator is 64 bytes, calling Components::as_path after each Components::next/Components::next_back call means cloning 64 bytes again and again, which is unfortunate when each path component is only a few bytes (e.g., "foo/bar/baz").
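The calling pattern described above looks roughly like the following sketch, which uses only the public std::path API. The helper name remaining_after_each_step is made up for illustration; whether each as_path call costs a clone depends on the standard library version in use.

```rust
use std::path::{Path, PathBuf};

/// Records what `Components::as_path` reports after each `next` call.
/// `as_path` is invoked once per consumed component, which is the
/// pattern the description singles out as paying for repeated clones.
fn remaining_after_each_step(path: &Path) -> Vec<PathBuf> {
    let mut comps = path.components();
    let mut remaining = Vec::new();
    while comps.next().is_some() {
        remaining.push(comps.as_path().to_path_buf());
    }
    remaining
}
```

For "foo/bar/baz" this yields the remainders "bar/baz", "baz", and an empty path, one per consumed component.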

On the point of size, the indexing strategy also reduces the size of Components<'_> from 64 bytes to 40 bytes. A large chunk of Components<'_> was taken up by the Option<Prefix> field (40 bytes). Instead of storing it, we determine whether a prefix exists and is still unconsumed by calling parse_prefix on the path field (which I think is inexpensive, since Windows prefixes are not that long) and checking whether our first_comp field is Some(_) or None. The front index is encoded with the prefix length if a prefix exists, so we don't need to parse the prefix again within Components::next or Components::next_back.
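The sizes quoted above can be checked on a given toolchain with std::mem::size_of. This is only a measurement sketch with a hypothetical helper name (layout_sizes); the numbers it returns depend on the target and on the standard library version, so the 64-byte and 40-byte figures should be read as the author's reported values rather than guarantees.

```rust
use std::mem::size_of;
use std::path::{Components, Prefix};

/// Returns (size of `Components`, size of `Option<Prefix>`) in bytes
/// for the toolchain and target this is compiled with.
fn layout_sizes() -> (usize, usize) {
    (
        size_of::<Components<'static>>(),
        size_of::<Option<Prefix<'static>>>(),
    )
}
```

Printing the pair before and after applying the patch is one way to confirm the reported reduction locally.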

Because Components<'_> no longer has the Option<Prefix> field, all the prefix helper functions in Components<'_> have been removed in favor of calling parse_prefix, Prefix::is_verbatim, Prefix::is_drive, etc.

I'm curious whether this redesign of Components<'_> improves Path equality, which @clarfonthey pointed out as being slow in #154521; I haven't benchmarked this, though.

Right now, when I tested it locally on my PC (Fedora), it passed all the standard library tests and rust-analyzer didn't crash on me (I had a few crash reports from rust-analyzer early on while I was experimenting with Components<'_>, related to threads using Path::components, but that's all resolved now). I have not tested this on Windows yet, and I would probably need someone to help me test on that platform, as my Windows VM is not working well enough to run the standard library test suite.

There's a lot being done here, and there may be better approaches, ways to improve this implementation, or neater ways to write the code. I am open to any advice or feedback on this approach.

@rustbot added the S-waiting-on-review (Status: Awaiting review from the assignee but also interested parties.) and T-libs (Relevant to the library team, which will review and decide on the PR/issue.) labels on May 12, 2026
@rustbot
Collaborator

rustbot commented May 12, 2026

r? @Mark-Simulacrum

rustbot has assigned @Mark-Simulacrum.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

Why was this reviewer chosen?

The reviewer was selected based on:

  • Owners of files modified in this PR: @ChrisDenton, libs
  • @ChrisDenton, libs expanded to 8 candidates


@asder8215 asder8215 force-pushed the components_rewrite branch from 1627e2f to 33e69e1 Compare May 12, 2026 09:09
@rustbot
Collaborator

rustbot commented May 12, 2026

This PR was rebased onto a different main commit. Here's a range-diff highlighting what actually changed.

Rebasing is a normal part of keeping PRs up to date, so no action is needed—this note is just to help reviewers.


@asder8215 asder8215 force-pushed the components_rewrite branch from 33e69e1 to ed9d33d Compare May 12, 2026 17:05

@asder8215 asder8215 force-pushed the components_rewrite branch from ed9d33d to 0b0f84c Compare May 12, 2026 17:19

… of mutating and subslicing path field; as a result, Components iterator memory size goes from 64 bytes to 40 bytes and as_path does not use cloning at all
@asder8215 asder8215 force-pushed the components_rewrite branch from 0b0f84c to 8ed33ea Compare May 12, 2026 22:05


4 participants