Enrich Radon by supporting new operators on maps, arrays and strings #2291

Open
guidiaz opened this issue Oct 25, 2022 · 7 comments
Labels
breaking change ⚠️ Introduces breaking changes enhancement 📈 New feature or request

Comments

@guidiaz
Contributor

guidiaz commented Oct 25, 2022

New Map operators:

  • MapMap: same as ArrayMap, but iterating on maps.
  • MapFilter: same as ArrayFilter, but iterating on maps.
  • MapAlter: expecting two parameters: "key" and SUBSCRIPT; it would return the same input map after applying SUBSCRIPT over the value of the specified entry, if found.

New Array operators:

  • ArrayAlter: expecting two parameters: "array index" and SUBSCRIPT; it would return the same input array after applying SUBSCRIPT over the value at the specified index, if found.
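
As a rough illustration of the "alter" semantics described for MapAlter and ArrayAlter, here is a minimal Rust sketch. The function names, the closure standing in for SUBSCRIPT, and the choice to return the input unchanged when the key or index is missing are all assumptions made for illustration; none of this is the actual Radon implementation.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for ArrayAlter: applies `subscript` to the element
/// at `index` and leaves every other element untouched. If `index` is out of
/// range, the input is returned unchanged (one reading of "if found").
fn array_alter<T>(mut input: Vec<T>, index: usize, subscript: impl Fn(&T) -> T) -> Vec<T> {
    if let Some(slot) = input.get_mut(index) {
        let altered = subscript(&*slot);
        *slot = altered;
    }
    input
}

/// Hypothetical stand-in for MapAlter: same idea, addressed by key.
fn map_alter<V>(
    mut input: BTreeMap<String, V>,
    key: &str,
    subscript: impl Fn(&V) -> V,
) -> BTreeMap<String, V> {
    if let Some(slot) = input.get_mut(key) {
        let altered = subscript(&*slot);
        *slot = altered;
    }
    input
}
```

Whether a missing key or index should be silently ignored or reported as an error is left open by the proposal; the sketch picks the former only to keep it short.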

New String operators:

  • StringSlice: expecting two parameters for simple slicing: "start index" and "slice length" (the latter possibly optional?).
  • StringHexToBytes: expecting no prefix of any kind (e.g. "0x"), convert hex string into an array of bytes.
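
A minimal Rust sketch of what StringHexToBytes could look like under the constraint above (no "0x" prefix accepted). The function names and the error type are hypothetical, and rejecting odd-length input is an assumption the proposal does not spell out.

```rust
/// Hypothetical sketch of StringHexToBytes: decodes an unprefixed hex string
/// into bytes. Any non-hex character (including the 'x' of a "0x" prefix) or
/// an odd number of digits is reported as an error.
fn string_hex_to_bytes(input: &str) -> Result<Vec<u8>, String> {
    if input.len() % 2 != 0 {
        return Err(format!("odd number of hex digits: {}", input.len()));
    }
    input
        .as_bytes()
        .chunks(2)
        .map(|pair| -> Result<u8, String> {
            let hi = hex_digit(pair[0])?;
            let lo = hex_digit(pair[1])?;
            Ok(hi << 4 | lo)
        })
        .collect()
}

/// Converts one ASCII hex digit (either case) into its numeric value.
fn hex_digit(byte: u8) -> Result<u8, String> {
    (byte as char)
        .to_digit(16)
        .map(|d| d as u8)
        .ok_or_else(|| format!("invalid hex digit: {:?}", byte as char))
}
```

With this sketch, `string_hex_to_bytes("0x01")` fails on the `x`, which matches the "no prefix of any kind" requirement.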
@guidiaz guidiaz added the enhancement 📈 New feature or request label Oct 25, 2022
@aesedepece
Member

New Map operators:

  • MapMap: same as ArrayMap, but iterating on maps.
  • MapFilter: same as ArrayFilter, but iterating on maps.
  • MapOne: expecting two parameters: "key" and SUBSCRIPT; it would return the same input map after applying SUBSCRIPT over the specified key, if found.

Does iterating on maps mean invoking the visiting function (subscript) on 2-item arrays with index 0 being occupied by each key and index 1 being occupied by each value? I see no other way to model it 🤔

@guidiaz
Contributor Author

guidiaz commented Oct 25, 2022

This idea came up yesterday while assessing with @tmpolaczyk a possible solution to a user request. What you describe would already be solved by converting a Map to an Array ("entries" operator?), I guess. The point here is to avoid converting to an array and then back to a map, and also to avoid dealing with the ordering of fields and so on.

So, "iterating on map" would mean to invoke the visiting subscript on every element of the JSON map, as is, with no conversion to an array of 2-element arrays involved. Surely, @tmpolaczyk can explain better.

@aesedepece
Member

I mean, regardless of the map → array → map conversion, the visiting function still needs to have a contract, and taking into account that scripts cannot receive multiple arguments, I see no other way than what I described above.

@guidiaz
Contributor Author

guidiaz commented Oct 25, 2022

Sorry, now I get it. The visiting subscript would receive one single argument: the value for each entry (not the key).

@tmpolaczyk
Contributor

MapFilter could accept an array as input, this way we can filter by key or by value:

// Filter by key
MapFilter [ ArrayGet(0), StringMatch(...) ]
// Filter by value
MapFilter [ ArrayGet(1), IntegerGreaterThan(...) ]
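
To make the entry contract concrete, here is a small Rust sketch in which the predicate receives each entry as a `(key, value)` tuple, standing in for the 2-item array `[key, value]` that `ArrayGet(0)` and `ArrayGet(1)` would index into. The function name `map_filter` and the use of a tuple and a `BTreeMap` are assumptions for illustration only, not the Radon types.

```rust
use std::collections::BTreeMap;

/// Hypothetical model of MapFilter: keeps the entries for which `predicate`
/// returns true. `entry.0` plays the role of ArrayGet(0) (the key) and
/// `entry.1` the role of ArrayGet(1) (the value).
fn map_filter<V>(
    input: BTreeMap<String, V>,
    predicate: impl Fn(&(String, V)) -> bool,
) -> BTreeMap<String, V> {
    input.into_iter().filter(|entry| predicate(entry)).collect()
}
```

Filtering by key would then be `map_filter(m, |entry| entry.0.starts_with("b"))`, and filtering by value `map_filter(m, |entry| entry.1 > 1_000)`.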

MapMap would be a bit more complicated, because mutating an array is more difficult. But it should also be possible with the proposed ArrayAlter operator:

// Convert map keys to lowercase:
MapMap [ ArrayAlter(0, [ StringToLowercase ] ) ]
// Convert map values to strings:
MapMap [ ArrayAlter(1, [ IntegerToString ] ) ]
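
The same idea can be sketched in Rust, with the subscript receiving and returning a whole `(key, value)` entry so that either position can be altered. The name `map_map`, the tuple representation of the 2-item array, and the `BTreeMap` container are hypothetical choices for this illustration.

```rust
use std::collections::BTreeMap;

/// Hypothetical model of MapMap: `subscript` receives each entry as a
/// (key, value) pair and returns the replacement entry. Altering position 0
/// rewrites the key, altering position 1 rewrites the value.
/// Note: if two rewritten keys collide, only one of the entries survives.
fn map_map<V, W>(
    input: BTreeMap<String, V>,
    subscript: impl Fn((String, V)) -> (String, W),
) -> BTreeMap<String, W> {
    input.into_iter().map(subscript).collect()
}
```

Lowercasing keys corresponds to `map_map(m, |(k, v)| (k.to_lowercase(), v))`, and converting values to strings to `map_map(m, |(k, v)| (k, v.to_string()))`. The key-collision case is something the operator's specification would need to settle.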

And MapAlter should have the same input as MapMap.

For StringSlice I propose to do it the same way as Python, allowing negative indices to mean "starting from the end".
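
A minimal Rust sketch of Python-like slicing follows. It assumes a `start` index plus an optional `end` index (as in Python's `s[start:end]`) rather than a slice length, counts characters rather than bytes, and clamps out-of-range indices; all of these are assumptions, and the name `string_slice` is hypothetical.

```rust
/// Hypothetical sketch of StringSlice with Python-like semantics: `start` and
/// `end` count characters, negative values count from the end, and
/// out-of-range values are clamped instead of raising an error.
fn string_slice(input: &str, start: i64, end: Option<i64>) -> String {
    let len = input.chars().count() as i64;
    // Map a possibly negative index into the range 0..=len.
    let resolve = |index: i64| -> usize {
        let absolute = if index < 0 { index + len } else { index };
        absolute.clamp(0, len) as usize
    };
    let from = resolve(start);
    let to = resolve(end.unwrap_or(len));
    input
        .chars()
        .skip(from)
        .take(to.saturating_sub(from))
        .collect()
}
```

For example, `string_slice("witnet", -3, None)` yields `"net"`, mirroring Python's `"witnet"[-3:]`.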

@tmpolaczyk tmpolaczyk added the breaking change ⚠️ Introduces breaking changes label Nov 24, 2022
@tmpolaczyk
Contributor

You can find a work in progress implementation in this branch.

@tmpolaczyk
Contributor

When implementing StringHexToBytes I also implemented the reverse, BytesToHexString, but it turns out that operator already exists under the name BytesToString. That was a bit strange because I expected that operator to be the UTF8 encode/decode, not the bytes to hex that it currently is. There is also an unimplemented StringToBytes operator, so to be consistent with BytesToString, that one should be the hex decode operator.

I guess there is a possibility to treat hexadecimal as an encoding like UTF8, so we could define a new enum like:

enum RadonStringEncoding {
    Hex = 0,
    Ascii = 1,
    Utf8 = 2,
}

And therefore StringHexToBytes would just be StringToBytes. And to convert strings to utf8 bytes, it would be StringToBytes(2), with 2 as the argument. While this solution would be a bit counter-intuitive, at least to me, it could be easily extended to support encodings like base64 or URL encoding (the %20 stuff from URLs).

One limitation is that the operators will always convert bytes to string and string to bytes, and it may not be clear which one is encode and which one is decode. For example to encode a string as base64 it would be [StringToBytes(utf8), BytesToString(base64)], while we could have a BytesToBase64EncodedString operator instead of BytesToString(base64).
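
The encoding-parameter idea can be sketched in Rust as follows. The enum mirrors the one proposed above; the function names `string_to_bytes` and `bytes_to_string`, their signatures, and the `String` error type are hypothetical and do not reflect the existing operators' code. Base64 is left out because it would need either an external crate or a hand-written codec.

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
enum RadonStringEncoding {
    Hex = 0,
    Ascii = 1,
    Utf8 = 2,
}

/// Hypothetical StringToBytes(encoding): interprets `input` as text in the
/// given encoding and returns the raw bytes it represents.
fn string_to_bytes(input: &str, encoding: RadonStringEncoding) -> Result<Vec<u8>, String> {
    match encoding {
        RadonStringEncoding::Hex => {
            // Check digits up front: from_str_radix alone would accept "+f".
            if input.len() % 2 != 0 || !input.bytes().all(|b| b.is_ascii_hexdigit()) {
                return Err(format!("not a valid hex string: {:?}", input));
            }
            (0..input.len())
                .step_by(2)
                .map(|i| u8::from_str_radix(&input[i..i + 2], 16).map_err(|e| e.to_string()))
                .collect()
        }
        RadonStringEncoding::Ascii if !input.is_ascii() => {
            Err(format!("not an ASCII string: {:?}", input))
        }
        // Rust strings are UTF-8, and ASCII is a subset of UTF-8.
        RadonStringEncoding::Ascii | RadonStringEncoding::Utf8 => Ok(input.as_bytes().to_vec()),
    }
}

/// Hypothetical BytesToString(encoding): renders raw bytes as text in the
/// given encoding.
fn bytes_to_string(input: &[u8], encoding: RadonStringEncoding) -> Result<String, String> {
    match encoding {
        RadonStringEncoding::Hex => Ok(input.iter().map(|b| format!("{:02x}", b)).collect()),
        RadonStringEncoding::Ascii if !input.is_ascii() => {
            Err("input contains non-ASCII bytes".to_string())
        }
        RadonStringEncoding::Ascii | RadonStringEncoding::Utf8 => {
            String::from_utf8(input.to_vec()).map_err(|e| e.to_string())
        }
    }
}
```

In this sketch the encoding always describes the string side of the conversion, which is the source of the encode/decode ambiguity mentioned above: "hex decode" is `string_to_bytes(s, Hex)`, while "UTF-8 encode" is `string_to_bytes(s, Utf8)`.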

3 participants