Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

enhancement(remap): Add push and append functions #5750

Merged
merged 33 commits into from Jan 11, 2021
Merged
Show file tree
Hide file tree
Changes from 27 commits
Commits
Show all changes
33 commits
Select commit Hold shift + click to select a range
7c4ec31
Add append function
Dec 28, 2020
e69b2aa
Add CUE docs sources for append function
Dec 28, 2020
b7d4aa0
Add TOML unit test for append function
Dec 28, 2020
40b22f1
Fix formatting
Dec 28, 2020
fb0c35e
Fix CUE formatting
Dec 28, 2020
62957dc
Resolve md5/sha1 import ambiguity
Dec 29, 2020
7dc4407
Clarify append semantics in docs
Dec 29, 2020
0c80b9d
Merge branch 'master' into remap-append-function
Dec 29, 2020
f766a9f
Update function description re: return value
Dec 29, 2020
217f3de
Merge branch 'master' into remap-append-function
Jan 2, 2021
1b69a0f
Rename function to push
Jan 3, 2021
986a828
Add append function
Jan 3, 2021
ca017c3
Add docs for append function
Jan 3, 2021
9f7b832
Update behavior tests
Jan 3, 2021
336ff86
Fix formatting issue
Jan 3, 2021
1c7a4ed
Update docs
Jan 5, 2021
97d0758
Use lit and array macros
Jan 5, 2021
0dd8284
Make behavior test naming and placement consistent with other tests
Jan 5, 2021
14b28c9
Add inner type definitions plus tests
Jan 6, 2021
fe987c0
Add inner type definitions for push function
Jan 6, 2021
6c34790
Fix merge conflict in behavior tests
Jan 6, 2021
9da4ed5
Reformat
Jan 6, 2021
8095cf7
Add inner type defs back into push and append expressions
Jan 6, 2021
531934a
Fix test naming issue created by merge
Jan 6, 2021
f8ae749
Fix merge conflicts
Jan 7, 2021
e9cbf97
Merge branch 'master' into remap-append-function
Jan 7, 2021
e06dc3f
Fix behavior test array equality statements
Jan 7, 2021
58e8dcf
Fix merge conflict
Jan 7, 2021
85c0c9c
Fix array declaration issue in behavior test
Jan 7, 2021
1211fcd
Fix behavior test outputs
Jan 9, 2021
9f6c3ad
Fix merge conflict
Jan 10, 2021
95f4919
Merge branch 'master' into remap-append-function
Jan 11, 2021
7282233
Fix merge conflict
Jan 11, 2021
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Jump to
Jump to file
Failed to load files.
Diff view
Diff view
2 changes: 1 addition & 1 deletion docs/reference/remap.cue
Expand Up @@ -49,7 +49,7 @@ remap: {

arguments: [...#Argument] // Allow for empty list
return: [#RemapReturnTypes, ...#RemapReturnTypes]
category: "Check" | "Coerce" | "Encode" | "Enumerate" | "Event" | "Hash" | "IP" | "Map" | "Number" | "Parse" | "Random" | "String" | "Test" | "Timestamp"
category: "Array" | "Check" | "Coerce" | "Encode" | "Enumerate" | "Event" | "Hash" | "IP" | "Map" | "Number" | "Parse" | "Random" | "String" | "Test" | "Timestamp"
description: string
examples?: [#RemapExample, ...#RemapExample]
name: Name
Expand Down
40 changes: 40 additions & 0 deletions docs/reference/remap/append.cue
@@ -0,0 +1,40 @@
package metadata

remap: functions: append: {
arguments: [
{
name: "value"
description: "The array"
required: true
type: ["array"]
},
{
name: "items"
description: "The items to append"
required: true
type: ["array"]
},
]
return: ["array"]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This might warrant a larger discussion, but I think it would be clearer that append modifies the array in-place if it didn't return anything. The fact that it returns the modified array could lead users to believe it is returning a new array that is a copy of the old array plus the new element.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do know that Jean et al would eventually like to make del, for example, return either the deleted value or null if the field doesn't exist. Returning the new array would enable you to do things like unique(append(.fruits, "mango")) (mentioned in my other comment) and feels more consistent with some programming languages (Go et al) but not others (Rust et al). I don't have strong feelings there, especially on this first iteration. Happy to defer. For now I'll leave the current behavior in place (return the new array) and update the docs.

Copy link
Member

@jszwedko jszwedko Dec 29, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think Go is a bit special because it actually returns a new "struct" representing the array. Usually this is pointing to the same underlying memory, but it'll actually be different if it had to extend the array; this is a common gotcha for people coming to the language. Consider that Go's in-place sorting routines do not return the sorted array. I think this makes it clearer that it is happening in-place.

That being said, maybe I'm in the minority here. I know Ruby returns the same array to make it easier to chain though they also have sort create a copy while sort! sorts in-place.

I guess the most important thing is consistency in VRL. We can choose to always return the modified value, even if it happens in-place, and hopefully that consistency will make it easy to reason about.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would update the docs to make it clear that the array itself is modified though, the current wording:

Appends the specified item to an array and returns the new array

Might lead people to imagine it returns a new array. Specifically the word new there. Maybe:

Appends the specified item to an array and returns the modified array

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I see what you're saying now. I'll update.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jszwedko Every VRL function except del is a "pure" function that generates a value but doesn't modify the object. In this case, the value of the list and the item are merely extracted from the function arguments and used to produce a new array. Compare with the del function logic:

self.paths
    .iter()
    .try_for_each(|path| object.remove(path.as_ref(), false))?; // Object is manipulated

The Object type has methods that give you hooks into the event itself. If you don't invoke those then the underlying event is unchanged. So I'm not sure that tests verifying that the event is unchanged are any more necessary here than in all of the other functions (except del, of course).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True, I'm probably missing something in the plumbing of remap-lang. I would have assumed that complex types like the Vec and BTreeMap would have been accessed by reference to avoid unnecessary clones (though we do want the clone here).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm probably missing something in the plumbing of remap-lang.

You're not missing anything 😄. We opted to accept the clone, as a trade-off to keep the initial implementation simpler.

Since we're using the bytes crate, the cloning is actually relatively cheap, since it uses reference-counting internally, and other most other value types implement copy, and are thus cheap to clone (cloning a map is probably most expensive, since we're storing the keys as strings on the heap).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aha 🤔 thanks for the explanation!

@lucperkins I think it'd still be nice to add a test that asserts the original array was unchanged in case we change the behavior of always cloning values and need to add an explicit clone here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Every VRL function except del is a "pure" function that generates a value but doesn't modify the object.

merge is another one. merge(., .foo) will merge .foo into the root.

category: "Array"
description: """
Adds each item from an array to the end of another array. The expression
`append([1, 2, 3], [4, 5, 6])`, for example, would produce the array `[1, 2, 3, 4, 5, 6]`.
The items in both arrays can be of any VRL type.
"""
examples: [
{
title: "Mixed array"
input: {
kitchen_sink: [72.5, false, [1, 2, 3]]
items: ["booper", "bopper"]
}
source: """
.kitchen_sink = append(.kitchen_sink, .items)
"""
output: {
kitchen_sink: [72.5, false, [1, 2, 3], "booper", "bopper"]
}
},
]
}
58 changes: 58 additions & 0 deletions docs/reference/remap/push.cue
@@ -0,0 +1,58 @@
package metadata

remap: functions: push: {
arguments: [
{
name: "value"
description: "The array"
required: true
type: ["array"]
},
{
name: "item"
description: "The item to push"
required: true
type: ["any"]
},
]
return: ["array"]
category: "Array"
description: """
Adds the specified item to the end of an array and returns the resulting array. The item
can be of any VRL type and is added even if an item with the same value is already present
in the array.

The `push` function does *not* change the array in place. In this example, the `push`
function would return an array with `apple`, `orange`, and `banana`, but the value of
`fruits` would be unchanged:

```js
.fruits = ["apple", "orange"]
push(.fruits, "banana")
.fruits
["apple", "orange"]
```

In order to change the value of `fruits`, you would need to store the resulting array in
the field:

```js
.fruits = push(.fruits, "banana")
```
"""
examples: [
{
title: "Mixed array"
input: {
kitchen_sink: [72.5, false, [1, 2, 3]]
item: "booper"
}
source: """
.kitchen_sink = push(.kitchen_sink, .item)
"""
output: {
kitchen_sink: [72.5, false, [1, 2, 3], "booper"]
}
},
]
}
4 changes: 4 additions & 0 deletions lib/remap-functions/Cargo.toml
Expand Up @@ -35,6 +35,7 @@ anyhow = "1"
[features]

default = [
"append",
"assert",
"ceil",
"compact",
Expand Down Expand Up @@ -73,6 +74,7 @@ default = [
"parse_regex_all",
"parse_timestamp",
"parse_url",
"push",
"redact",
"replace",
"round",
Expand All @@ -97,6 +99,7 @@ default = [
"uuid_v4",
]

append = []
assert = []
ceil = []
compact = []
Expand Down Expand Up @@ -135,6 +138,7 @@ parse_key_value = ["nom"]
parse_syslog = ["syslog_loose"]
parse_timestamp = ["shared/conversion"]
parse_url = ["url"]
push = []
redact = []
replace = []
round = []
Expand Down
143 changes: 143 additions & 0 deletions lib/remap-functions/src/append.rs
@@ -0,0 +1,143 @@
use remap::prelude::*;

#[derive(Clone, Copy, Debug)]
pub struct Append;

impl Function for Append {
fn identifier(&self) -> &'static str {
"append"
}

fn parameters(&self) -> &'static [Parameter] {
&[
Parameter {
keyword: "value",
accepts: |v| matches!(v, Value::Array(_)),
required: true,
},
Parameter {
keyword: "items",
accepts: |v| matches!(v, Value::Array(_)),
required: true,
},
]
}

fn compile(&self, mut arguments: ArgumentList) -> Result<Box<dyn Expression>> {
let value = arguments.required("value")?.boxed();
let items = arguments.required("items")?.boxed();

Ok(Box::new(AppendFn { value, items }))
}
}

#[derive(Debug, Clone)]
struct AppendFn {
value: Box<dyn Expression>,
items: Box<dyn Expression>,
}

impl Expression for AppendFn {
fn execute(&self, state: &mut state::Program, object: &mut dyn Object) -> Result<Value> {
let mut value = self.value.execute(state, object)?.try_array()?;
let mut items = self.items.execute(state, object)?.try_array()?;

value.append(&mut items);

Ok(value.into())
}

fn type_def(&self, state: &state::Compiler) -> TypeDef {
use value::Kind;

let items_type = self
.items
.type_def(state)
.fallible_unless(Kind::Array)
.with_inner_type(self.items.type_def(state).inner_type_def);

self.value
.type_def(state)
.fallible_unless(Kind::Array)
.merge(items_type)
.with_constraint(Kind::Array)
.with_inner_type(
self.items
.type_def(state)
.merge(self.value.type_def(state))
.inner_type_def,
)
lucperkins marked this conversation as resolved.
Show resolved Hide resolved
}
}

#[cfg(test)]
mod tests {
use super::*;
use value::Kind;

test_type_def![
value_array_items_array_infallible {
expr: |_| AppendFn {
value: array!["foo", "bar", 142].boxed(),
items: array!["baq", "baz", true].boxed(),
},
def: TypeDef {
fallible: false,
kind: Kind::Array,
inner_type_def: Some(TypeDef {
kind: Kind::Boolean | Kind::Bytes | Kind::Integer,
..Default::default()
}.boxed())
},
}

value_non_array_fallible {
expr: |_| AppendFn {
value: lit!(27).boxed(),
items: array![1, 2, 3].boxed(),
},
def: TypeDef {
fallible: true,
kind: Kind::Array,
inner_type_def: Some(TypeDef {
kind: Kind::Integer,
..Default::default()
}.boxed())
},
}

items_non_array_fallible {
expr: |_| AppendFn {
value: array![1, 2, 3].boxed(),
items: lit!(27).boxed(),
},
def: TypeDef {
fallible: true,
kind: Kind::Array,
inner_type_def: Some(TypeDef {
kind: Kind::Integer,
..Default::default()
}.boxed())
},
}
];

test_function![
append => Append;

both_arrays_empty {
args: func_args![value: array![], items: array![]],
want: Ok(value!([])),
}

one_array_empty {
args: func_args![value: array![], items: array![1, 2, 3]],
want: Ok(value!([1, 2, 3])),
}

neither_array_empty {
args: func_args![value: array![1, 2, 3], items: array![4, 5, 6]],
want: Ok(value!([1, 2, 3, 4, 5, 6])),
}
];
}
20 changes: 16 additions & 4 deletions lib/remap-functions/src/lib.rs
@@ -1,5 +1,7 @@
mod util;

#[cfg(feature = "append")]
mod append;
#[cfg(feature = "assert")]
mod assert;
#[cfg(feature = "ceil")]
Expand Down Expand Up @@ -76,6 +78,8 @@ mod parse_syslog;
mod parse_timestamp;
#[cfg(feature = "parse_url")]
mod parse_url;
#[cfg(feature = "push")]
mod push;
#[cfg(feature = "redact")]
mod redact;
#[cfg(feature = "replace")]
Expand Down Expand Up @@ -127,6 +131,8 @@ mod uuid_v4;
pub use crate::md5::Md5;
#[cfg(feature = "sha1")]
pub use crate::sha1::Sha1;
#[cfg(feature = "append")]
pub use append::Append;
#[cfg(feature = "assert")]
pub use assert::Assert;
#[cfg(feature = "ceil")]
Expand Down Expand Up @@ -199,6 +205,8 @@ pub use parse_syslog::ParseSyslog;
pub use parse_timestamp::ParseTimestamp;
#[cfg(feature = "parse_url")]
pub use parse_url::ParseUrl;
#[cfg(feature = "push")]
pub use push::Push;
#[cfg(feature = "match")]
pub use r#match::Match;
#[cfg(feature = "redact")]
Expand Down Expand Up @@ -246,10 +254,8 @@ pub use uuid_v4::UuidV4;

pub fn all() -> Vec<Box<dyn remap::Function>> {
vec![
#[cfg(feature = "md5")]
Box::new(Md5),
#[cfg(feature = "sha1")]
Box::new(Sha1),
lucperkins marked this conversation as resolved.
Show resolved Hide resolved
#[cfg(feature = "append")]
Box::new(Append),
#[cfg(feature = "assert")]
Box::new(Assert),
#[cfg(feature = "ceil")]
Expand Down Expand Up @@ -294,6 +300,8 @@ pub fn all() -> Vec<Box<dyn remap::Function>> {
Box::new(IsNullish),
#[cfg(feature = "log")]
Box::new(Log),
#[cfg(feature = "md5")]
Box::new(Md5),
#[cfg(feature = "merge")]
Box::new(Merge),
#[cfg(feature = "now")]
Expand Down Expand Up @@ -322,6 +330,8 @@ pub fn all() -> Vec<Box<dyn remap::Function>> {
Box::new(ParseTimestamp),
#[cfg(feature = "parse_url")]
Box::new(ParseUrl),
#[cfg(feature = "push")]
Box::new(Push),
#[cfg(feature = "match")]
Box::new(Match),
#[cfg(feature = "redact")]
Expand All @@ -330,6 +340,8 @@ pub fn all() -> Vec<Box<dyn remap::Function>> {
Box::new(Replace),
#[cfg(feature = "round")]
Box::new(Round),
#[cfg(feature = "sha1")]
Box::new(Sha1),
#[cfg(feature = "sha2")]
Box::new(Sha2),
#[cfg(feature = "sha3")]
Expand Down