Expose Utf8Lossy as Utf8Chunks #99544

Merged 1 commit on Aug 20, 2022

Conversation

dylni
Contributor

@dylni dylni commented Jul 21, 2022

This PR changes the feature gate for Utf8Lossy from str_internals to utf8_lossy and improves the API, with the goal of eventually stabilizing it.

Proposal: rust-lang/libs-team#54
Tracking Issue: #99543

@rustbot rustbot added the T-libs Relevant to the library team, which will review and decide on the PR/issue. label Jul 21, 2022
@rustbot
Collaborator

rustbot commented Jul 21, 2022

Hey! It looks like you've submitted a new PR for the library teams!

If this PR contains changes to any rust-lang/rust public library APIs then please comment with @rustbot label +T-libs-api -T-libs to tag it appropriately. If this PR contains changes to any unstable APIs please edit the PR description to add a link to the relevant API Change Proposal or create one if you haven't already. If you're unsure where your change falls no worries, just leave it as is and the reviewer will take a look and make a decision to forward on if necessary.

Examples of T-libs-api changes:

  • Stabilizing library features
  • Introducing insta-stable changes such as new implementations of existing stable traits on existing stable types
  • Introducing new or changing existing unstable library APIs (excluding permanently unstable features / features without a tracking issue)
  • Changing public documentation in ways that create new stability guarantees
  • Changing observable runtime behavior of library APIs

@rust-highfive
Collaborator

r? @kennytm

(rust-highfive has picked a reviewer for you, use r? to override)

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Jul 21, 2022
@dylni
Contributor Author

dylni commented Jul 21, 2022

@rustbot label +T-libs-api -T-libs

@rustbot rustbot added T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. and removed T-libs Relevant to the library team, which will review and decide on the PR/issue. labels Jul 21, 2022
@dylni dylni mentioned this pull request Jul 21, 2022
4 tasks
@dylni
Contributor Author

dylni commented Aug 13, 2022

r? rust-lang/libs-api
@rustbot

@dylni
Contributor Author

dylni commented Aug 13, 2022

Based on rust-lang/highfive#419 and since the previous command isn't working, reassigning randomly:

r? @Mark-Simulacrum


f.write_char('"')
}
}
Member

It looks like the new Debug impl is just going to format to the byte slice, rather than the (nicer) view we previously had. Maybe we can keep that nicer Debug impl?

Contributor Author

The reason I avoided this was that I wasn't sure if the formatting should be Utf8Chunks("...") or "...". However, formatting as "..." could be helpful, since it would cover the primary use case of bstr::BStr.

Member

For a Debug impl we typically wrap in the struct name, and I think that makes sense in this case too. We could consider a Display impl that formats as "...", though.

Contributor Author

The difficulty with using a Display impl instead of Debug for this is that I don't think there's anywhere else in libstd where Display uses byte escapes (except obvious cases such as EscapeDefault), so it might be inconsistent. I would prefer to add something similar to Path::display that would return a new struct.

Based on your later comment, do you want me to squash commits and keep the PR as-is or add the private method described above and use debug_struct to include the struct name?

Member

Ah, I didn't quite realize this wasn't addressed in my quick re-review skim. Let's add the private method and use debug_struct; I think that'll be slightly better. That can be squashed with the other commits though.
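
The shape agreed on in this thread (a helper that renders the bytes as "..." with byte escapes, wrapped by a Debug impl built with debug_struct) could look roughly like the sketch below. This is not the code from the PR: the names `Escaped` and `Chunks` are hypothetical stand-ins, and it assumes Rust 1.79 or later for `<[u8]>::utf8_chunks`.

```rust
use std::fmt::{self, Write};

// Hypothetical helper: renders bytes as a quoted string, writing invalid
// UTF-8 bytes as `\xNN` escapes.
struct Escaped<'a>(&'a [u8]);

impl fmt::Debug for Escaped<'_> {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.write_char('"')?;
        for chunk in self.0.utf8_chunks() {
            for c in chunk.valid().chars() {
                // escape_debug handles quotes, backslashes, and control characters.
                write!(f, "{}", c.escape_debug())?;
            }
            for b in chunk.invalid() {
                write!(f, "\\x{:02X}", b)?;
            }
        }
        f.write_char('"')
    }
}

// Hypothetical stand-in for the real iterator type.
struct Chunks<'a> {
    source: &'a [u8],
}

impl fmt::Debug for Chunks<'_> {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Wrap the escaped view in the struct name, as is typical for Debug.
        f.debug_struct("Chunks")
            .field("source", &Escaped(self.source))
            .finish()
    }
}

fn main() {
    let chunks = Chunks { source: b"foo\xFFbar" };
    println!("{chunks:?}"); // prints: Chunks { source: "foo\xFFbar" }
}
```

Keeping the escaping in a separate type means the same view could later back a Path::display-style adapter without changing the Debug output.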

}

pub fn chunks(&self) -> Utf8LossyChunksIter<'_> {
Utf8LossyChunksIter { source: &self.bytes }
/// Returns the invalid character that follows the valid substring and
Member

nit: I guess this isn't really strictly an invalid character, but rather arbitrary bytes? (Of max length 4? Not sure on that, but seems like it might be worth noting).

Contributor Author

I clarified this to be "invalid sequence". The result would be a sequence that should have been a character, so calling it a character could be technically wrong.

I added a note about the maximum length as well, but I believe it should be 3 instead of 4. At 4 bytes, the sequence would always be parsed as a character.
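
The length bound discussed here can be checked with a small sketch. It assumes Rust 1.79 or later, where the chunk iterator is reachable as `<[u8]>::utf8_chunks`; the helper name `invalid_lens` is hypothetical.

```rust
// Assumption: Rust 1.79+ (`<[u8]>::utf8_chunks`). `invalid_lens` is a
// hypothetical helper that collects the length of every non-empty invalid
// sequence reported by the iterator.
fn invalid_lens(bytes: &[u8]) -> Vec<usize> {
    bytes
        .utf8_chunks()
        .map(|chunk| chunk.invalid().len())
        .filter(|&len| len > 0)
        .collect()
}

fn main() {
    // A byte that can never start a sequence: length 1.
    assert_eq!(invalid_lens(b"a\xFFb"), vec![1]);
    // A 3-byte sequence cut off after 2 bytes: length 2.
    assert_eq!(invalid_lens(b"\xE2\x82"), vec![2]);
    // A 4-byte sequence cut off after 3 bytes: length 3, the maximum.
    assert_eq!(invalid_lens(b"\xF0\x9F\x92"), vec![3]);
    // With the fourth byte present, the sequence is a valid character.
    assert!(invalid_lens("💖".as_bytes()).is_empty());
}
```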

@Mark-Simulacrum
Member

@rustbot author

Overall looks great, just a couple smaller nits and I think we can go ahead and merge.

@rustbot rustbot added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Aug 14, 2022
@dylni
Contributor Author

dylni commented Aug 16, 2022

Thanks for the quick review! All comments should be addressed.

@rustbot ready

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Aug 16, 2022
@Mark-Simulacrum
Member

r=me with commits squashed, thanks!

@Mark-Simulacrum Mark-Simulacrum added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Aug 20, 2022
@dylni
Contributor Author

dylni commented Aug 20, 2022

Sorry for the noise. The new method is not private, but it's as close as I can get, which caused way more issues than I expected.

Diff: https://github.com/rust-lang/rust/compare/fc6e13dd5fd9b46e3b10aca21d0567917236f4e6..f402f48d36af05d9782c8e0c433e1bac05335f9e

@rustbot ready

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Aug 20, 2022
@Mark-Simulacrum
Member

Seems OK -- I think we should consider a Display impl on Utf8Chunks or some other way to expose this beyond std properly, so maybe if you're interested that's a good idea for an ACP to follow up on this.

@bors r+ rollup

@bors
Contributor

bors commented Aug 20, 2022

📌 Commit f402f48d36af05d9782c8e0c433e1bac05335f9e has been approved by Mark-Simulacrum

It is now in the queue for this repository.

@bors bors removed the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Aug 20, 2022
@bors bors added the S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. label Aug 20, 2022
@dylni
Contributor Author

dylni commented Aug 20, 2022

Sorry about the formatting, it should work now. I didn't consider the rustfmt.toml file changing things when I tested locally.

@Mark-Simulacrum
Member

@bors r+ rollup

@bors
Contributor

bors commented Aug 20, 2022

📌 Commit e8ee0b7 has been approved by Mark-Simulacrum

It is now in the queue for this repository.

bors added a commit to rust-lang-ci/rust that referenced this pull request Aug 20, 2022
…iaskrgr

Rollup of 9 pull requests

Successful merges:

 - rust-lang#99415 (Initial implementation of REUSE)
 - rust-lang#99544 (Expose `Utf8Lossy` as `Utf8Chunks`)
 - rust-lang#100585 (Fix trailing space showing up in example)
 - rust-lang#100596 (Remove unnecessary stderr files)
 - rust-lang#100642 (Update fortanix-sgx-abi and export some useful SGX usercall traits)
 - rust-lang#100691 (Make `same_type_modulo_infer` a proper `TypeRelation`)
 - rust-lang#100693 (Add LLVM15-specific codegen test for `try`/`?`s that now optimize away)
 - rust-lang#100710 (Windows: Load synch functions together)
 - rust-lang#100807 (Add TaKO8Ki to translation-related mention groups)

Failed merges:

r? `@ghost`
`@rustbot` modify labels: rollup
@bors bors merged commit d499065 into rust-lang:master Aug 20, 2022
@rustbot rustbot added this to the 1.65.0 milestone Aug 20, 2022
Labels
S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.