Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
158 changes: 158 additions & 0 deletions crates/kernel_cmdline/src/bytes.rs
Original file line number Diff line number Diff line change
Expand Up @@ -128,6 +128,83 @@ impl<'a> Cmdline<'a> {
anyhow::anyhow!("Failed to find kernel argument '{key}'")
})
}

/// Add or modify a parameter to the command line
///
/// Returns `true` if the parameter was added or modified.
///
/// Returns `false` if the parameter already existed with the same
/// content.
pub fn add_or_modify(&mut self, param: Parameter) -> bool {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if for the use cases that require mutation we should just create an ordered hashmap from it?

That's basically how https://github.com/ostreedev/ostree/blob/main/src/libostree/ostree-kernel-args.c works.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess this relates to #1608 (comment) in that how duplicate keys should be handled when setting a parameter.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah in the current state (as in how this patch is at this very moment...) if you construct a Cmdline with duplicate parameters, we allow that and don't try to do any verification at build time. But if you remove a duplicate, all duplicates are removed. And if you add_or_modify a duplicate, it removes all of the old instances and replaces it with just the single new one. I think that makes sense.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

re: "ordered hashmap", we'd have to pull in something like IndexMap? I see ostree-ext already uses that...

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep indexmap is very widely used, I think it'd make sense here. Though we do need to preserve multiple keys so we should map from &str -> Vec<&str>

let mut new_params = Vec::new();
let mut modified = false;
let mut seen_key = false;

for p in self.iter() {
if p.key == param.key {
if !seen_key {
// This is the first time we've seen this key.
// We will replace it with the new parameter.
if p != param {
modified = true;
}
new_params.push(param.parameter);
} else {
// This is a subsequent parameter with the same key.
// We will remove it, which constitutes a modification.
modified = true;
}
seen_key = true;
} else {
new_params.push(p.parameter);
}
}

if !seen_key {
// The parameter was not found, so we append it.
let self_mut = self.0.to_mut();
if !self_mut.is_empty() && !self_mut.last().unwrap().is_ascii_whitespace() {
self_mut.push(b' ');
}
self_mut.extend_from_slice(param.parameter);
return true;
}
if modified {
self.0 = Cow::Owned(new_params.join(b" ".as_slice()));
true
} else {
// The parameter already existed with the same content, and there were no duplicates.
false
}
}
Comment on lines +138 to +179
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The current implementation of add_or_modify is a single-pass algorithm, which is efficient for modifications. However, when adding a new parameter (i.e., the key does not already exist), the implementation iterates through all existing parameters and builds a new_params vector, which is then discarded. This results in unnecessary allocation and processing.

A more optimal approach for the 'add' case would be to first check if the key exists. If it doesn't, you can take a fast path to append the new parameter without building the intermediate new_params vector. This change would improve performance for additions and make the code's intent clearer by separating the 'add' and 'modify' logic paths.

Here is a suggested refactoring that implements this approach:

    pub fn add_or_modify(&mut self, param: Parameter) -> bool {
        // By checking for existence first, we can fast-path the common case of adding a new parameter.
        if !self.iter().any(|p| p.key == param.key) {
            // The parameter was not found, so we append it.
            let self_mut = self.0.to_mut();
            if !self_mut.is_empty() && !self_mut.last().unwrap().is_ascii_whitespace() {
                self_mut.push(b' ');
            }
            self_mut.extend_from_slice(param.parameter);
            return true;
        }

        // The key exists, so we need to rebuild.
        let mut new_params = Vec::new();
        let mut modified = false;
        let mut seen_key = false;

        for p in self.iter() {
            if p.key == param.key {
                if !seen_key {
                    // This is the first time we've seen this key.
                    // We will replace it with the new parameter.
                    if p != param {
                        modified = true;
                    }
                    new_params.push(param.parameter);
                } else {
                    // This is a subsequent parameter with the same key.
                    // We will remove it, which constitutes a modification.
                    modified = true;
                }
                seen_key = true;
            } else {
                new_params.push(p.parameter);
            }
        }

        if modified {
            self.0 = Cow::Owned(new_params.join(b" ".as_slice()));
            true
        } else {
            // The parameter already existed with the same content, and there were no duplicates.
            false
        }
    }

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Meh, constructing the vec isn't that expensive, it's just copying the slice references. I purposefully tried to avoid iterating the whole thing twice, but kinda indifferent on this one.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah we shouldn't care about performance here, only correctness and maintainability


/// Remove parameter(s) with the given key from the command line
///
/// Returns `true` if parameter(s) were removed.
pub fn remove(&mut self, key: ParameterKey) -> bool {
let mut removed = false;
let mut new_params = Vec::new();

for p in self.iter() {
if p.key == key {
removed = true;
} else {
new_params.push(p.parameter);
}
}

if removed {
self.0 = Cow::Owned(new_params.join(b" ".as_slice()));
}

removed
}
}

impl<'a> AsRef<[u8]> for Cmdline<'a> {
fn as_ref(&self) -> &[u8] {
&self.0
}
}

/// A single kernel command line parameter key
Expand All @@ -144,6 +221,12 @@ impl<'a> std::ops::Deref for ParameterKey<'a> {
}
}

impl<'a, T: AsRef<[u8]> + ?Sized> From<&'a T> for ParameterKey<'a> {
fn from(s: &'a T) -> Self {
Self(s.as_ref())
}
}

impl PartialEq for ParameterKey<'_> {
/// Compares two parameter keys for equality.
///
Expand Down Expand Up @@ -171,6 +254,8 @@ impl PartialEq for ParameterKey<'_> {
/// A single kernel command line parameter.
#[derive(Debug, Eq)]
pub struct Parameter<'a> {
/// The full original value
parameter: &'a [u8],
/// The parameter key as raw bytes
key: ParameterKey<'a>,
/// The parameter value as raw bytes, if present
Expand Down Expand Up @@ -214,6 +299,7 @@ impl<'a> Parameter<'a> {

let ret = match equals {
None => Self {
parameter: input,
key: ParameterKey(input),
value: None,
},
Expand All @@ -232,6 +318,7 @@ impl<'a> Parameter<'a> {
};

Self {
parameter: input,
key,
value: Some(value),
}
Expand Down Expand Up @@ -537,4 +624,75 @@ mod tests {
assert_eq!(rd_args[2], param("rd.foo=a"));
assert_eq!(rd_args[3], param("rd.qux=c"));
}

#[test]
fn test_add_or_modify() {
let mut kargs = Cmdline::from(b"foo=bar");

// add new
assert!(kargs.add_or_modify(param("baz")));
let mut iter = kargs.iter();
assert_eq!(iter.next(), Some(param("foo=bar")));
assert_eq!(iter.next(), Some(param("baz")));
assert_eq!(iter.next(), None);

// modify existing
assert!(kargs.add_or_modify(param("foo=fuz")));
iter = kargs.iter();
assert_eq!(iter.next(), Some(param("foo=fuz")));
assert_eq!(iter.next(), Some(param("baz")));
assert_eq!(iter.next(), None);

// already exists with same value returns false and doesn't
// modify anything
assert!(!kargs.add_or_modify(param("foo=fuz")));
iter = kargs.iter();
assert_eq!(iter.next(), Some(param("foo=fuz")));
assert_eq!(iter.next(), Some(param("baz")));
assert_eq!(iter.next(), None);
}

#[test]
fn test_add_or_modify_empty_cmdline() {
let mut kargs = Cmdline::from(b"");
assert!(kargs.add_or_modify(param("foo")));
assert_eq!(kargs.0, b"foo".as_slice());
}

#[test]
fn test_add_or_modify_duplicate_parameters() {
let mut kargs = Cmdline::from(b"a=1 a=2");
assert!(kargs.add_or_modify(param("a=3")));
let mut iter = kargs.iter();
assert_eq!(iter.next(), Some(param("a=3")));
assert_eq!(iter.next(), None);
}

#[test]
fn test_remove() {
let mut kargs = Cmdline::from(b"foo bar baz");

// remove existing
assert!(kargs.remove("bar".into()));
let mut iter = kargs.iter();
assert_eq!(iter.next(), Some(param("foo")));
assert_eq!(iter.next(), Some(param("baz")));
assert_eq!(iter.next(), None);

// doesn't exist? returns false and doesn't modify anything
assert!(!kargs.remove("missing".into()));
iter = kargs.iter();
assert_eq!(iter.next(), Some(param("foo")));
assert_eq!(iter.next(), Some(param("baz")));
assert_eq!(iter.next(), None);
}

#[test]
fn test_remove_duplicates() {
let mut kargs = Cmdline::from(b"a=1 b=2 a=3");
assert!(kargs.remove("a".into()));
let mut iter = kargs.iter();
assert_eq!(iter.next(), Some(param("b=2")));
assert_eq!(iter.next(), None);
}
}
95 changes: 95 additions & 0 deletions crates/kernel_cmdline/src/utf8.rs
Original file line number Diff line number Diff line change
Expand Up @@ -107,6 +107,30 @@ impl<'a> Cmdline<'a> {
self.value_of(key)
.ok_or_else(|| anyhow::anyhow!("Failed to find kernel argument '{key}'"))
}

/// Add or modify a parameter to the command line
///
/// Returns `true` if the parameter was added or modified.
///
/// Returns `false` if the parameter already existed with the same
/// content.
pub fn add_or_modify(&mut self, param: Parameter) -> bool {
self.0.add_or_modify(param.0)
}

/// Remove parameter(s) with the given key from the command line
///
/// Returns `true` if parameter(s) were removed.
pub fn remove(&mut self, key: ParameterKey) -> bool {
self.0.remove(key.0)
}
}

impl<'a> AsRef<str> for Cmdline<'a> {
fn as_ref(&self) -> &str {
str::from_utf8(self.0.as_ref())
.expect("We only construct the underlying bytes from valid UTF-8")
}
}

/// A single kernel command line parameter key
Expand Down Expand Up @@ -525,4 +549,75 @@ mod tests {
let k2 = ParameterKey::from("a-c");
assert_ne!(k1, k2);
}

#[test]
fn test_add_or_modify() {
let mut kargs = Cmdline::from("foo=bar");

// add new
assert!(kargs.add_or_modify(param("baz")));
let mut iter = kargs.iter();
assert_eq!(iter.next(), Some(param("foo=bar")));
assert_eq!(iter.next(), Some(param("baz")));
assert_eq!(iter.next(), None);

// modify existing
assert!(kargs.add_or_modify(param("foo=fuz")));
iter = kargs.iter();
assert_eq!(iter.next(), Some(param("foo=fuz")));
assert_eq!(iter.next(), Some(param("baz")));
assert_eq!(iter.next(), None);

// already exists with same value returns false and doesn't
// modify anything
assert!(!kargs.add_or_modify(param("foo=fuz")));
iter = kargs.iter();
assert_eq!(iter.next(), Some(param("foo=fuz")));
assert_eq!(iter.next(), Some(param("baz")));
assert_eq!(iter.next(), None);
}

#[test]
fn test_add_or_modify_empty_cmdline() {
let mut kargs = Cmdline::from("");
assert!(kargs.add_or_modify(param("foo")));
assert_eq!(kargs.as_ref(), "foo");
}

#[test]
fn test_add_or_modify_duplicate_parameters() {
let mut kargs = Cmdline::from("a=1 a=2");
assert!(kargs.add_or_modify(param("a=3")));
let mut iter = kargs.iter();
assert_eq!(iter.next(), Some(param("a=3")));
assert_eq!(iter.next(), None);
}

#[test]
fn test_remove() {
let mut kargs = Cmdline::from("foo bar baz");

// remove existing
assert!(kargs.remove("bar".into()));
let mut iter = kargs.iter();
assert_eq!(iter.next(), Some(param("foo")));
assert_eq!(iter.next(), Some(param("baz")));
assert_eq!(iter.next(), None);

// doesn't exist? returns false and doesn't modify anything
assert!(!kargs.remove("missing".into()));
iter = kargs.iter();
assert_eq!(iter.next(), Some(param("foo")));
assert_eq!(iter.next(), Some(param("baz")));
assert_eq!(iter.next(), None);
}

#[test]
fn test_remove_duplicates() {
let mut kargs = Cmdline::from("a=1 b=2 a=3");
assert!(kargs.remove("a".into()));
let mut iter = kargs.iter();
assert_eq!(iter.next(), Some(param("b=2")));
assert_eq!(iter.next(), None);
}
}