Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

chore: Remap v1 RFC #3134

Merged
merged 8 commits into from
Aug 3, 2020
Merged

chore: Remap v1 RFC #3134

merged 8 commits into from
Aug 3, 2020

Conversation

Jeffail
Copy link
Contributor

@Jeffail Jeffail commented Jul 21, 2020

Closes #2744

Rendered

Signed-off-by: Ashley Jeffs <ash@jeffail.uk>
Signed-off-by: Ashley Jeffs <ash@jeffail.uk>
Signed-off-by: Ashley Jeffs <ash@jeffail.uk>
Signed-off-by: Ashley Jeffs <ash@jeffail.uk>
Copy link
Contributor

@binarylogic binarylogic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is excellent. One comment on a proposed syntax change that I would like to resolve, but otherwise looks good.

inputs = [ "foo" ]
type = "vicscript"
mapping = """
foo = "hello"
Copy link
Contributor

@binarylogic binarylogic Jul 21, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this is the first presentation of this syntax, the thing that immediately jumps out to me is the lack of clarity around event field assignment. It looks like a bunch of variables are being set with no real action. Following precedence set by our lua transform, I'm curious what you think about this alternative syntax?

event.log.foo = "hello"
event.log.bar = event.log.bar + 10
event.log.baz_is_large = event.log.baz_is_large > 10

I realize the intent of your syntax is to avoid maintaining a runtime environment. So things like this would not be possible:

# not possible
baz = 1
event.log.bar = baz

And we could simply throw an error saying that baz is "unknown". The advantage with my syntax is:

  1. It's obvious you are setting an event level field.
  2. The log namespace means we can use this with metrics (unless you were thinking of a better way to solve this).
  3. It's closer in syntax to other transforms that manipulate events (like lua and wasm).

Disadvantages:

  1. It suggests that this is a runtime. Users might be surprised that my baz = 1 example doesn't work.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that'd be reasonable, one thing to note is that enforcing a path prefix would be stripped within a map definition, so you'd still get bare assignments like this:

map foo {
  type = (type.uppercase() | "UNKNOWN")
  name = name.titlecase()
}

event.log.metadata = event.log.metadata.apply("foo")

Another thing to note is that both IDML and Bloblang have an optional path prefix root (root.foo = "bar" would set the event field foo to "bar"), which is useful for when a path clashes with a keyword like if or match, but is also implicit when omitted as a convenience. It's assumed it usually won't be used unless you specifically want to assign directly to the root. On the right hand side we have a similar keyword this which represents the source value in the current context.

Adding them to the above example looks like this:

map foo {
  root.type = (this.type.uppercase() | "UNKNOWN")
  root.name = this.name.titlecase()
}

root.event.log.metadata = this.event.log.metadata.apply("foo")

I would opt to leave these in the spec as they are. We could choose to make them mandatory but I wouldn't rename root to something like event as it's somewhat misleading in the context of maps.

On the topic of variables, we can actually support them without a runtime or making the spec turing complete, it's just a mapping that isn't part of the output event. I originally omitted them from the RFC so we could consider them later as a more advanced feature, Bloblang has them and it's definitely useful, I already have an example here that uses them.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would opt to leave these in the spec as they are.

Sounds good. I've come around to leaving out the prefix, especially as we start to incorporate it in other places where specifying event.log would be confusing. Such as dynamic configuration values.

map foo {
  type = (type.uppercase() | "UNKNOWN")
 name = name.titlecase()
}

The map syntax feels a little strange to me as well. As I noted in my other comment, we should agree on these details in the form of a spec so that we can obtain consensus. I realize that's the part of what this RFC aims to accomplish, but it still feels hand-wavy to me.

On the topic of variables, we can actually support them without a runtime or making the spec turing complete, it's just a mapping that isn't part of the output event.

Sounds good to me.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree it's a little unclear to have assignment look like free variables, but I'm not a fan of the verbosity of event.log..

The jq approach is interesting here, where assignment to a foo key is just .foo = "bar".

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah we can do that. One thing that's open to confusion here is that . on the left hand side means something different from the right hand side.

Using root and this makes that distinction clear (even if it's not immediately obvious what they point to), whereas a very common pitfall of jq is that users will naturally assume in many places that a dot on the left and the right are the same thing. It means the docs for lots of features ends up needing to also explain that same concept continuously, making them quite verbose: https://stedolan.github.io/jq/manual/#Assignment.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One thing that's open to confusion here is that . on the left hand side means something different from the right hand side.

Yeah, this is interesting. It looks like in "normal" = assignment they are the same thing, but "most users will want to use modification assignment operators", where the RHS is a filter. I get the feeling there's some history there and I'm curious what led to the current state.

And supports coalescing fields within a path with a pipe operator `|`, where if the first value is null or missing then the next value is selected, and so on:

```coffee
foo = bar.(baz | buz).bev
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 really like this.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same! This is really nice and concise.

You can assign a value to the root of your event by writing to the `root` keyword:

```coffee
root = nested.object
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is another example where it's clearer to write:

event.log = nested.object

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In both of these cases I worry about the user data having root, event, or log fields. If we can sidestep that whole problem it could save us a lot of headache.

Vicscript supports float, int, boolean, string, null, array and object literals:

```coffee
root = [
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm 👍 on the primitive types you've laid out. But this reminds me, do we want to allow non "object" types at the root? I'm concerned this will break in other parts of Vector.

rfcs/2020-07-21-2744-vicscript.md Outdated Show resolved Hide resolved

It also seems like there are some key limitations with the language that we would need to contribute solutions for (or fork). The main one being simple coalescing expressions, I can't find anything in their docs to allow something like `foo = bar.(baz | buz).bev` without a nasty bunch of nested `match` expressions.

Similar to JQ the syntax is also far enough removed from basic c-style declarations that there's a learning curve and more difficulty in maintaining syntax highlighters.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree with all of the above regarding Tremor script. I think it's an impressive feature of Tremor, but would like to avoid it for the reasons you listed.


### It's more to support

A key role for us as Vector maintainers is to assist users with their configs and thereby encourage adoption. It stands to reason that making Vector as a project larger would conflict with our ability to do this. However, we already have a range of transforms for mapping and after reviewing some large example configs ([https://github.com/timberio/vector/issues/1965](https://github.com/timberio/vector/issues/1965)) I think scrapping them in favor of a more condensed and readable language would be overall beneficial to us when we're sporting our support hats.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree, I think this will aid us in support.


### It's a lot of effort

Vicscript is an entirely new language spec where we'll need to implement both the parser and executor, which is clearly going to be a significant chunk of work. The obvious mitigation to this is to simply try it out and put a time cap on it, then review how far along the project got. This is only a speculative drawback and it's only going to draw us back once (until we decide to rewrite it for fun).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The value exceeds the effort in my opinion.


Mapping events is a core feature of Vector, and once we introduce a performant and more powerful alternative to our existing mapping transforms it's reasonable to expect it to become the standard for small or moderate sized pipelines. As such, I think it's important for us to have strong ownership of this language. It will allow us to build it into exactly the right solution for Vector users. It will also allow us to guarantee its performance and stability in the future, and provide first class support for it.

Given most of the risk around this idea is simply how difficult it will be I'd say the obvious first step is to test that out with a proof of concept. We can start with a quick implementation of a small subset of the spec and build a parser and execution engine for review. If we end up with nothing, or with a hostile codebase, then we can instead look into adopting other projects and compare.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree with everything you've laid out as a rationale. I am big fan of starting with a small experiment.


1. Dip our toes and build a new transform with a parser using something like [nom](https://github.com/Geal/nom). This should only go as far as a tiny subset of the spec, where we allow basic operations that are roughly equivalent to our existing mapping transforms and work directly to/from our internal event type.
2. Review the parser and the codebase, and compare the performance of this transform with the existing mapping transforms. If it sucks then it goes in the bin and we investigate using other existing alternatives.
3. If instead we're happy with it then at this point we should consider breaking the project out of Vector. Since the spec is mostly a copy of Bloblang it would be sensible to share efforts for community tooling such as syntax highlighters, linters, documentation, etc. Breaking our parser out as a reference Rust implementation would help build that community.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd add a step before 3 that would outline an actual spec for the language. We need consensus on the details and priority, but this is not necessary for the initial experiment.

rfcs/2020-07-21-2744-vicscript.md Outdated Show resolved Hide resolved
rfcs/2020-07-21-2744-vicscript.md Outdated Show resolved Hide resolved
rfcs/2020-07-21-2744-vicscript/example1.coffee Outdated Show resolved Hide resolved
@Jeffail
Copy link
Contributor Author

Jeffail commented Jul 23, 2020

I'm thinking in the interest of keeping the RFC simpler in the absence of a more in-depth spec I should remove more advanced features such as map, match expressions and iterator methods for now. We can add them one-by-one once we're agreed on the fundamentals and if we need more discussion they can be broken out into their own RFCs.

Signed-off-by: Ashley Jeffs <ash@jeffail.uk>
rfcs/2020-07-21-2744-vicscript.md Outdated Show resolved Hide resolved
rfcs/2020-07-21-2744-vicscript.md Outdated Show resolved Hide resolved
rfcs/2020-07-21-2744-vicscript.md Outdated Show resolved Hide resolved
Signed-off-by: Ashley Jeffs <ash@jeffail.uk>
Copy link
Contributor

@binarylogic binarylogic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work!

Copy link
Member

@bruceg bruceg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This clears up a lot for me. It also explains why there was no "delete" operation that I had also forgotten to ask about. While mutating the original event may be more obvious for writers (think "I just want to modify this one field"), this approach leaves open opportunities for internal optimizations.

Copy link
Member

@lukesteensen lukesteensen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is awesome! My biggest questions revolve around what level of complexity we want to target.

If our primary target use case is on the lower end of the complexity spectrum, I'd be tempted to make this a bit more jq-like just to remove some ambiguity and hopefully take advantage of broad user knowledge of that syntax. It definitely doesn't scale as well to large programs, but it's still well within the ideal of "make simple things simple and complicated things possible".

If we're targetting higher on the complexity scale, I think it's reasonable to differ from the one-liner style and focus more on readability. I do worry that this is sacrificing ease of use for the simplest use cases, which are likely to be some of the most popular.

inputs = [ "foo" ]
type = "vicscript"
mapping = """
foo = "hello"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree it's a little unclear to have assignment look like free variables, but I'm not a fan of the verbosity of event.log..

The jq approach is interesting here, where assignment to a foo key is just .foo = "bar".

And supports coalescing fields within a path with a pipe operator `|`, where if the first value is null or missing then the next value is selected, and so on:

```coffee
foo = bar.(baz | buz).bev
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same! This is really nice and concise.

The most common query type is a dot separated path, describing how to reach a target field within the input event:

```coffee
foo = bar.baz.0.buz
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this differentiate between array and map indexing at all? Or would .0 work for both {"0": "foo"} and ["foo"]?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because it's on a right hand side we have the reference object to look at, so this path follows the same rules as JSON pointers where if we approach an array and the path segment can be parsed as a integer then we use it as an index. However, this obviously can't always work on the left hand side unless the array already exists from a previous assignment, so there's room for confusion there.

We could instead opt to force array indexes to be explicit all over and foo.0.bar would always mean an object foo with a key 0, and indexes are expressed as foo[0].bar.

You can assign a value to the root of your event by writing to the `root` keyword:

```coffee
root = nested.object
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In both of these cases I worry about the user data having root, event, or log fields. If we can sidestep that whole problem it could save us a lot of headache.

Out: {"id":"second","sorted_foo":["a","b","c","d"]}
```

The resulting event will only contain fields that are explicitly mapped, in order to begin your map with a full copy of the input event structure you can start by assigning `this` to `root`:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't love this pattern. Again, the jq approach here is interesting and doesn't require these special names. A full copy is the default with simple assignment like .foo = "bar", while {foo: "bar"} gives you a fresh object.

I'm also curious if there would be performance implications to this for simple cases like add_fields. With enough work we could definitely optimize away the copying, but I'm not sure we really want to sign ourselves up for that.


### JQ

JQ is a great tool and rightfully gets a lot of love. It also basically solves the same problem as Vicscript. Unfortunately, it's common knowledge that JQ doesn't score high on readability, and therefore it doesn't scale well at all in relation to mapping size and complexity (imagine trying to solve [https://github.com/timberio/vector/issues/1965](https://github.com/timberio/vector/issues/1965) with it).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are two important factors here, readability and scalability.

Readability is a tricky one. I definitely agree that jq can be hard to understand, but that's largely a non-issue if the user is already familiar with jq from somewhere else. Given that it's pretty popular, it's hard to say how those balance out.

Scalability is one where we need to think about the target use cases. If we're expecting for this transform to handle mostly light reshaping, then scalability is less of an issue. Users will likely have the opportunity to master the basics before diving into something more complicated and may not mind spending a bit more time to figure out something they recognize is pretty complicated. On the other hand, if we expect a meaningful number of users to jump straight into larger mapping declarations, it's more important that we have something that scales smoothly to that size and level of complexity.

@binarylogic binarylogic self-requested a review July 26, 2020 17:34
Copy link
Contributor

@binarylogic binarylogic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After thinking about this more, I'm starting to agree with @lukesteensen that the jq syntax is a better path forward:

  1. I don't particularly like the proposed syntax. I find it mildly confusing for the reasons specified above.
  2. The learning curve favors JQ.
    1. Users without JQ experience will likely find JQ easier to understand. Additionally, there are a lot of resources online for JQ.
    2. Users with JQ experience will have no learning curve.
  3. It removes the burden of defining a spec and all of the decisions that go along with that. I'm sure there are some tricky questions that JQ has already solved.
  4. It'll be easier to maintain and improve over time.

If others disagree, I'm happy to move forward with the proposed syntax. Otherwise, I would like to obtain consensus and update this RFC to reflect the change.

I'm also curious if we can build off of jq_rs.

@Jeffail
Copy link
Contributor Author

Jeffail commented Jul 27, 2020

This is worth digging into but I think there's two different mapping syntaxes being conflated here, and with a side order of the path syntax which is also a separate consideration.

Wholesale Mapping

Firstly, there's wholesale "this is the thing" mapping, which is basically just an object literal at the root, this is the traditional jq style:

{foo: .bar, baz: "something"}

And with our current proposal (assuming we eventually implement object literals), it looks like this:

root = {"foo": bar, "baz": "something"}

Or, as I've done with bloblang, the above can be simplified to a single expression, making it even more jq-like:

{"foo": bar, "baz": "something"}

In order to match jq syntax for these mappings we would just need to allow non-quoted keys, the main limitation with doing that is that it makes it awkward to have dynamic key values. In bloblang the mapping {key:value} would have a dynamic key determined by the field key of the input event, but I think it's totally reasonable to decide this is out of scope for vicscript.

Other than that minor distinction (and the rest of the language) this mapping style is basically the same.

Mapping with Assignments

The other style of mapping is a series of assignments that can optionally follow an initial wholesale mapping, this is the primary mapping style of IDML and Bloblang as it's more readable for expressing larger mappings, in jq it looks like this:

.tagPath = (.tagPath // (.key | split("/")))

And in this proposal it actually looks basically the same, the difference is in the other operators and path syntax:

tagPath = tagPath | key.split("/")

They also behave the same, where both make an explicit copy of the right hand side in order to preserve immutability of the right hand side context.

However, where they differ is in the initial state of the mapped object, where jq begins with the original event and this proposal starts with an empty object. The performance merits of either should be minor if any, starting with the input event doesn't automatically remove any copy overhead in practice because you still need an immutable reference of the input event during the sequence of assignments, optimizing ends up looking basically the same (citation needed).

In terms of safety starting with a full copy makes it less likely you'll accidentally drop fields from input events (or ones that are added later), an empty copy makes it less likely you'll accidentally include fields that you don't want to leak (or ones that are added later).

In terms of annoyance they can both be worked around with a single expression. For the current proposal starting with a full copy is root = this, but if we changed it to start with a copy and a user wants it to start empty then it's simple root = {}.

My personal take is that starting empty makes sense from the large mapping perspective, and it's nice to use the mapping as an explicit list of what the output events look like (although you can't quite derive an entire schema from it). However, I also agree that root = this looks naff and in our case it's worth focusing on the smaller use cases. Therefore I'm happy to go with either approach.

Path Syntax

I've mentioned above but will paste again here that we can go for a jq style path syntax but one thing that's open to confusion is that . on the left hand side means something different from the right hand side.

Using root and this makes that distinction clear (even if it's not immediately obvious what they point to), whereas a very common pitfall of jq is that users will naturally assume in many places that a dot on the left and the right are the same thing. It means the docs for lots of features ends up needing to also explain that same concept continuously, making them quite verbose: https://stedolan.github.io/jq/manual/#Assignment.

However, if we adopt this style we can simply reference their docs (which are pretty good) rather than have to explain it ourselves.

Going with JQ

In terms of going full-in on jq syntax we are only starting with very basic assignments and therefore it's totally possible to match it. However, the spec is huge and so ultimately we're either going to end up with a small subset of the spec or we'll end up using a third party implementation (which is great if we can find one that's suitable).

I'm also curious if we can build off of jq_rs.

We can try it, but from what I've understood this library simply calls out to the C-lib and we'd need to serialize/deserialize events, which will likely make it perform significantly worse than the existing transforms.

@lukesteensen
Copy link
Member

Wholesale Mapping

It sounds like the relevant questions here are:

  1. Assignment to root keyword vs bare object literal
  2. Quoted vs non-quoted keys

I'm relatively strongly in favor of going straight to the bare object literal syntax for (1), mostly to avoid the problem of keywords that may also exist as user keys. For (2), I don't feel strongly and it seems like something we could easily add later, so maybe we start with quoted keys only since they're unambiguous.

Mapping with Assignments

Here it seems like we just need to decide between defaulting to the original vs an empty object. Given that the transforms we're trying to replace are all mutation-based, I assumed a similar default would make sense. It seems like it doesn't make that big a difference since the mode can be switched quite easily.

Path Syntax

I agree that there's definitely some unfortunate complexity around jq's assignment operators, but I'm not sure that's necessarily a downside to adopting their path syntax. To me, removing the need for keywords is a big benefit. That's one less thing for users to have to learn, and I'd just be waiting for the GitHub issue asking why their data disappeared when they tried to assign to a "root" key.

Going with JQ

This seems like the biggest question. Do we go with an actual subset of jq (confusing assignment and all), or do we just borrow the parts of their syntax that we find convenient?

I would lean towards just taking the parts that we like, at least for the time being. It sacrifices some of the potential compatibility benefits, but true compatibility seems out of reach so maybe it's better not to pretend. For simple use cases we should be able to match jq for that basic level of familiarity, but we shouldn't feel like we need to extend that past what we think makes sense.

For example, jq is very much cli-focused and is designed for single line programs. Two assignments would look like a pipeline:

.foo = "bar" | .bar = "baz"

In our case, the mappings are written in in a file, where the more natural style would be to use multiple lines:

.foo = "bar"
.bar = "baz"

Given that there's almost no chance of us reaching true compatibility with jq, I think it'd make sense to make some early breaks like this to avoid setting the expectation of being exactly the same.

Signed-off-by: Ashley Jeffs <ash@jeffail.uk>
@binarylogic
Copy link
Contributor

Given that there's almost no chance of us reaching true compatibility with jq, I think it'd make sense to make some early breaks like this to avoid setting the expectation of being exactly the same.

Agree, we can note it's heavily inspired by jq.

Comment on lines 93 to 97
Remove fields by assigning them the `delete` keyword:

```text
.foo = delete
```
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Everything else here is great, but this still feels funky to me. I understand why it's this way, the left-hand side is the "target" and the right-hand side is the "action", but the traditional function syntax seems more intuitive. Going with our jq theme, this would just be del(.foo):

del(.foo)

Does this introduce significant complexity? I'm also curious to get @lukesteensen's opinion.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, this is another case where I think we can try to avoid keywords and use something less ambiguous like a builtin function.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got options here, bloblang and IDML use foo = deleted(), so it's a function rather than a keyword which we could copy. Implementing deletions as a right-hand query allows you to use it in all the same ways as regular mappings, which means they can be conditional:

.content = if .type == "bar" {
  .bar.body
} else {
  deleted()
}

And also mapped in an iterator in order to remove elements from arrays, etc:

.filtered_foos = .foo.map(if .ele.value < 10 { .ele.value + 10 } else { deleted() })

However, iterators aren't part of the RFC scope and if we're keeping the spec minimal then I think it's reasonable not to support those cases. It's easy to add a del(.foo) left-hand function, but if we want to enable conditional deletions the same as conditional mappings then we'd need to implement if as a statement you can put on the left-hand side:

if .type == "bar" {
  del(.bar.id)
}

At which point we need to decide whether we bother implementing it as a right-hand expression as they'd be parsed and implemented separately. Doing so is useful for expanding the language later, but not so much if we're keeping the language minimal.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see what you're saying, and it makes sense, but from a UX standpoint I find the following much more intuitive:

if .type == "bar" {
  del(.bar.id)
}

I feel like I might be suggesting something that is fundamentally at odds with the purpose of this language (performance). If that is the case, then I'd prefer to go with your proposal, otherwise, I find the above clearer. Performance is a key requirement of this language and I don't want to sacrifice that.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, apologies for the back and forth. I've been trying to build consensus off line on the syntax proposed here. Just confirming was we settled on for this RFC:

.foo = "bar"
del(.baz)

This would:

  1. Add a root foo key with the value "bar".
  2. Delete the root baz key.

@binarylogic
Copy link
Contributor

@Jeffail, given the above, can we:

  1. Wrap up the final changes and merge this?
  2. Rename this to Remapping Syntax RFC. It's likely we'll call this ou remapping language and call the transform remap.

@Jeffail
Copy link
Contributor Author

Jeffail commented Aug 3, 2020

Cool, I've updated to swap vicscript to remap and added the del(.foo) statement. I've also removed if statements for now and we can break that conversation out elsewhere. The syntax of the transform is currently:

[transforms.foo_remapped]
  inputs = [ "foo" ]
  type = "remap"
  mapping = """
.foo = "hello"
.bar = .bar + 10
.baz_is_large = .baz > 10
"""

Where we have a single field mapping which stutters slightly (remap.mapping), maybe it could be script or something instead.

Signed-off-by: Ashley Jeffs <ash@jeffail.uk>
@binarylogic binarylogic changed the title chore: Vicscript RFC chore: Remap RFC Aug 3, 2020
Copy link
Contributor

@binarylogic binarylogic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! Looks great.

@binarylogic binarylogic merged commit 8b0b1c8 into master Aug 3, 2020
@binarylogic binarylogic deleted the vicscript-rfc branch August 3, 2020 15:52
@binarylogic binarylogic changed the title chore: Remap RFC chore: Remap v1 RFC Aug 6, 2020
@vector-vic vector-vic mentioned this pull request Sep 16, 2020
mengesb pushed a commit to jacobbraaten/vector that referenced this pull request Dec 9, 2020
Signed-off-by: Brian Menges <brian.menges@anaplan.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

New remap transform RFC
4 participants