Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use OsString everywhere #1193

Closed
wants to merge 1 commit into from
Closed

Conversation

CryZe
Copy link
Contributor

@CryZe CryZe commented Jan 3, 2024

This makes it so OsString / OsStr are used everywhere instead of String / str, because the operating systems make no guarantee that the values provided are valid UTF-8. The OsString / OsStr types are specifically provided by the standard library for these situations. This fixes situations where sysinfo either had to lossily lose some of the information, had some subtle undefined behavior, and partially rejected processes and other resources entirely, making them impossible to be accessed by the user of this crate.

Resolves #1190

@CryZe
Copy link
Contributor Author

CryZe commented Jan 3, 2024

This is of course a massive breaking change, but I'm opening this so we can figure out if we want to make this change everywhere or to what degree it should be done.

Comment on lines +358 to +370
let friendly_name = match Self::os_version().unwrap_or_default().as_bytes() {
f_n if f_n.starts_with(b"14.0") => "Sonoma",
f_n if f_n.starts_with(b"10.16")
| f_n.starts_with(b"11.0")
| f_n.starts_with(b"11.1")
| f_n.starts_with(b"11.2") =>
{
"Big Sur"
}
f_n if f_n.starts_with("10.15") => "Catalina",
f_n if f_n.starts_with("10.14") => "Mojave",
f_n if f_n.starts_with("10.13") => "High Sierra",
f_n if f_n.starts_with("10.12") => "Sierra",
f_n if f_n.starts_with("10.11") => "El Capitan",
f_n if f_n.starts_with("10.10") => "Yosemite",
f_n if f_n.starts_with("10.9") => "Mavericks",
f_n if f_n.starts_with("10.8") => "Mountain Lion",
f_n if f_n.starts_with("10.7") => "Lion",
f_n if f_n.starts_with("10.6") => "Snow Leopard",
f_n if f_n.starts_with("10.5") => "Leopard",
f_n if f_n.starts_with("10.4") => "Tiger",
f_n if f_n.starts_with("10.3") => "Panther",
f_n if f_n.starts_with("10.2") => "Jaguar",
f_n if f_n.starts_with("10.1") => "Puma",
f_n if f_n.starts_with("10.0") => "Cheetah",
f_n if f_n.starts_with(b"10.15") => "Catalina",
f_n if f_n.starts_with(b"10.14") => "Mojave",
f_n if f_n.starts_with(b"10.13") => "High Sierra",
f_n if f_n.starts_with(b"10.12") => "Sierra",
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I remove older versions, because they are unsupported by Rust now. Also I renamed it to macOS in the string, as that's the official way to write the name now.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please leave them as other rust backends still support them.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, this PR as a whole (I read that you want it to be split up, so I'll take that into account) would require 1.74, which does not support any older versions of macOS.

Comment on lines +731 to +732
s.rsplit(|&b| b == b':')
.next()
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rsplit.next should be faster considering we want the last one.

v.push(*i.offset(2));
v.push(*i.offset(3));
fn add_u32(v: &mut Vec<u8>, i: u32) {
v.extend(i.to_ne_bytes());
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can be safe by simply using to_ne_bytes().

@@ -105,7 +104,7 @@ impl CpusWrapper {
let mut parts = line.split(|x| *x == b' ').filter(|s| !s.is_empty());
if first {
self.global_cpu.inner.name =
to_str!(parts.next().unwrap_or(&[])).to_owned();
OsStr::from_bytes(parts.next().unwrap_or(&[])).to_owned();
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The previous to_str! also felt a little dubious here considering it just always assumed valid UTF-8, even though that's afaict not a guarantee that Linux gives here.

This makes it so `OsString` / `OsStr` are used everywhere instead of
`String` / `str`, because the operating systems make no guarantee that
the values provided are valid UTF-8. The `OsString` / `OsStr` types are
specifically provided by the standard library for these situations. This
fixes situations where `sysinfo` either had to lossily lose some of the
information, had some subtle undefined behavior, and partially rejected
processes and other resources entirely, making them impossible to be
accessed by the user of this crate.
@GuillaumeGomez
Copy link
Owner

Please only update the process name information to OsString. For other information, we can take a look one by one and make changes if it is proven problematic to use String.

@@ -1,10 +1,10 @@
task:
name: rust 1.69 on freebsd 13
name: rust 1.74 on freebsd 13
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OsString is really only usable since 1.74, so a MSRV bump is necessary. This PR isn't urgent, so feel free to delay this PR to whenever such a MSRV bump would be warranted.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updating the MSRV is fine since it'll be part of a "major" release in any case since it's a breaking change.

@CryZe
Copy link
Contributor Author

CryZe commented Jan 3, 2024

@GuillaumeGomez I'm going to split out the other changes, but are you rather tending towards not wanting them at all or is it just too large of a PR? Cause I do think that almost all of these should probably be OsString. There's definitely a bunch where UTF-8 is just unsafely assumed, which seems dubious and other situations where certain resources are just silently skipped entirely if they can't be parsed as UTF-8. Both of these feel like they should be addressed.

I do at least somewhat agree that OsString while actually being usable now since 1.74, is still lacking a bunch of methods on it directly for interacting with it. There's for example no way to just Display an OsStr lossily without going through String (i.e. a mandatory heap allocation) or external crates, despite Path, which really just is OsStr having such a method. I'm going to be advocating and possibly contributing some of these in std going forward probably.

Maybe similar to how std::env has functions for getting either OsString or String, could there be two functions each in sysinfo as well.

@GuillaumeGomez
Copy link
Owner

Too large for sure. I think overall we'll tend to replace all String with OsString, but I'd prefer for it to happen for one feature at a time so I can check if I'm not missing something out, etc.

@CryZe
Copy link
Contributor Author

CryZe commented Jan 4, 2024

Alright, I'll close this and do individual PRs in a few hours.

@CryZe CryZe closed this Jan 4, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Process List incomplete on long names
2 participants