Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Maps with Infinite Number of Keys: Total Maps and Decorated maps #105

Open
dnovatchev opened this issue Dec 26, 2021 · 16 comments
Open

Maps with Infinite Number of Keys: Total Maps and Decorated maps #105

dnovatchev opened this issue Dec 26, 2021 · 16 comments
Labels
Discussion A discussion on a general topic. PRG-hard Categorized as "hard" at the Prague f2f, 2024 PRG-optional Categorized as "optional for 4.0" at the Prague f2f, 2024 XPath An issue related to XPath

Comments

@dnovatchev
Copy link
Contributor

dnovatchev commented Dec 26, 2021

Maps with Infinite Number of Keys: Total Maps and Decorated maps

1. Total Maps

Maps have become one of the most useful tools for creating readable, short and efficient XPath code. However, a significant limitation of this datatype is that a map can have only a finite number of keys. In many cases we might want to implement a map that can have more than a fixed, finite number of arguments.

Here is a typical example (Example 1):
A hotel charges per night differently, depending on how long the customer has been staying. For the first night the price is $100, for the second $90, for the third $80 and for every night after the third $75. We can immediately try to express this pricing data as a map, like this:

map {
1 : 100,
2 : 90,
3 : 80
(:  ??? How to express the price for all eventual next nights? :)
}

We could, if we had a special key, something like "TheRest", which means any other key-value, which is not one of the already specified key-values.

Here comes the first part of this proposal:

  1. We introduce a special key value, which, when specified in a map means: any possible key, different from the other keys, specified for the map. For this purpose we use the string: "\"

Adding such a "discard symbol" makes the map a total function on the set of any possible XPath atomic items.

Now we can easily express the hotel-price data as a map:

map {
1 : 100,
2 : 90,
3 : 80
'\' : 75
}

Another useful Example (2) is that now we can express any XPath item, or sequence of items as a map. Let's do this for a simple constant, like π:

let $π := map {
'\' : math:pi()
}
 return $π?*   (: produces 3.141592653589793  :)

the map above is empty (has no regular keys) and specifies that for any other key-value $k it holds that $π($k) eq math:pi()

Going further, we can express even the empty sequence (Example 3) as the following map:

let $Φ := map {
'\' : ()
}
 return $Φ?*   (: produces the empty sequence :)

Using this representation of the empty sequence, we can provide a solution for the "Forgiveness problem" raised by Jarno Jelovirta in the XML.Com #general channel in March 2021:

This expression will raise an error:

[map {"k0": 1}, map{"k0": [1, 2, 3]}]?*?("k0")?*

[XPTY0004] Input of lookup operator must be map or array: 1.

To prevent ("forgive", thus "Forgiveness Problem") the raising of such errors we could accept the rule that in XPath 4.0 any expression that evaluates to something different than a map or an array, could be coerced to the following map, which returns the empty sequence as the corresponding value for any key requested in a lookup:

map {
'\' : ()
}  (: produces the empty sequence  for any lookup:)

To summarize, what we have achieved so far:

  1. The map constructed in Example 1 is now a total function over the domain of all natural numbers. Any map with a "\" (discard key) is a total function over the value-space of all xs:anyAtomicType values
  2. We can represent any XPath 4.0 item or sequence in an easy and intuitive way as a map.
  3. It is now straight-forward to solve the "Forgiveness Problem" by introducing the natural and intuitive rule for coercing any non-map value to the empty map, and this allows to use anywhere the lookup operator ? without raising an error.

2. Decorated Maps

Although we already achieved a lot in the first part, there are still use-cases for which we don't have an adequate map solution:

  1. In the example (1) of expressing the hotel prices, we probably shouldn't get $75 for a key such as -1 or even "blah-blah-blah"
    But the XPath 4.0 language specification allows any atomic values to be possible keys and thus to be the argument to the map:get() function. If we want validation for the actually-allowed key-values for a specific given map, we need to have additional processing/functionality.

  2. With a discard symbol we can express only one infinite set of possible keys and group them under the same corresponding value. However, there are problems, the data for which needs several infinite sets of key-values to be projected onto different values. Here is one such problem:

Imagine we are the organizers of a very simple lottery, selling many millions of tickets, identified by their number, which is a unique natural number.

We want to grant prizes with this simple strategy.

  • Any ticket number multiple of 10 wins $10.
  • Any ticket number multiple of 100 wins $20
  • Any ticket number multiple of 1000 wins $100
  • Any ticket number multiple of 5000 wins $1000
  • Any ticket number which is a prime number wins $25000
  • Any other ticket number doesn't win a prize (wins $0)

None of the sets of key-values for each of the 6 categories above can be conveniently expressed with the map that we have so far, although we have merely 6 different cases!

How can we solve this kind of problem still using maps?

Decorators to the rescue...

What is decorator, what is the decorator pattern and when it is good to use one? According to Wikipedia:

What solution does it describe?
Define Decorator objects that

  • implement the interface of the extended (decorated) object (Component) transparently by forwarding all requests to it
  • perform additional functionality before/after forwarding a request.

This allows working with different Decorator objects to extend the functionality of an object dynamically at run-time.

The idea is to couple a map with a function (the decorator) which can perform any needed preprocessing, such as validation or projection of a supplied value onto one of a predefined small set of values (that are the actual keys of the map). For simplicity, we are not discussing post-processing here, though this can also be part of a decorator, if needed.

Let us see how a decorated-map solution to the lottery problem looks like:

let $prize-table := map {
  "ten" : 10,
  "hundred" : 20,
  "thousand" : 100,
  "five-thousand" : 1000,
  "prime" : 25000,
 "\" : 0
},
$isPrime := function($input as  xs:integer) as xs:boolean
{
  exists(index-of((2, 3, 5, 7, 11, 13, 17, 19, 23), $input)) (: simplified primality checker :)
},
$decorated-map := function($base-map as map(*), $input as xs:anyAtomicType) as item()*
{
  let $raw-result :=
         (
          let $key := 
           if(not($input castable as xs:positiveInteger)) then '\'  (: we can call the error() function here :) 
             else if($input mod 5000 eq 0) then 'five-thousand'
             else if($input mod 1000 eq 0) then 'thousand'
             else if($input mod 100 eq 0) then 'hundred'
             else if($input mod 10 eq 0) then 'ten'
             else if($isPrime($input)) then 'prime'
             else "\"
          return $base-map($key)
         ),
      $post-process := function($x) {$x},  (: using identity here for simplicity :)
      $post-processed := $post-process($raw-result)
    return $post-processed
},

$prizeForTicket := $decorated-map($prize-table, ?),       (: Note: this is exactly the lookup operator  ?    :)
$ticketNumbers := (1, 10, 100, 1000, 5000, 19, -3, "blah-blah-blah")

return $ticketNumbers ! $prizeForTicket(.)          (: produces 0, 10, 20, 100, 1000, 25000, 0, 0 :)

Conclusion

  1. In the 2nd part of this proposal, a new type/function -- the decorated-map was described.

  2. We defined the signature of a decorated-map and gave an example how to construct and use one in solving a specific problem. In particular, the proposal is to have a standard function:

    decorated-map ($base-map as map(*), $input as xs:anyAtomicType) as item()*

  3. Finally, we showed that the lookup operator ? on a decorated map $dm is identical to and should be defined as :

    $dm($base-map, ?)

What remains to be done?

The topic of decorators is extremely important, as a decorator may and should be possible to be defined on any function, not just on maps. This would be addressed in one or more new proposals. Stay tuned 😊

@dnovatchev dnovatchev changed the title Maps with Infinite Number of Keys: Total Maps and Decorated maps [XPath] Proposal: Maps with Infinite Number of Keys: Total Maps and Decorated maps Dec 26, 2021
@michaelhkay
Copy link
Contributor

Are there other languages that offer anything similar?

@dnovatchev
Copy link
Contributor Author

dnovatchev commented Dec 27, 2021

Are there other languages that offer anything similar?

Yes:

  1. Total maps: see for example Coq

  2. Decorators: present in many programming languages with different names:

And @michaelhkay , here is an excellent introduction (a module from the Pluralsight course: "Python: Beyond the Basics"):

"Closures and Decorators"

@michaelhkay
Copy link
Contributor

The essence of maps as currently defined is their simplicity, and I don't want to lose that.

I'm wondering if the most common use-case can be met by providing convenient ways to turn a map into a function:

map:with-default($map, $default) returns function($x){ if (map:contains($map, $x)) then $map($x) else $default }

Plus an extension to the semantics of the lookup operator '?' so that the LHS can be an arity-1 function. (But the RHS is atomised, so I think to maintain consistency with the current semantics it would have to be a function whose argument type is singleton atomic.)

I'm also wondering about a way of deriving a function from a map in such a way that the function is more strongly typed, but I'm finding it hard to come up with anything better than just writing a function that wraps the map explicitly:

function($x as xs:integer) as xs:string* {map:with-default($map, "")?$x}

@dnovatchev
Copy link
Contributor Author

dnovatchev commented Jan 20, 2022

The essence of maps as currently defined is their simplicity, and I don't want to lose that.

I'm wondering if the most common use-case can be met by providing convenient ways to turn a map into a function:

map:with-default($map, $default) returns function($x){ if (map:contains($map, $x)) then $map($x) else $default }

Using an explicit "default" symbol seems more simple and visually intuitive, like in the below:

let $prize-table := map {
  "ten" : 10,
  "hundred" : 20,
  "thousand" : 100,
  "five-thousand" : 1000,
  "prime" : 25000,
 "\" : 0
}

We already have too-many map:xxx() functions and many people find that the ? operator has too-many different roles.

But certainly a function like this is a possibility. I would prefer it to be named more like:

map:add-default(...)

This said, I prefer even more to have a symbol for a default key. In the proposal this is "\". I understand that someone could argue that there might be cases when we would need this specific value to be used just as a regular key.

To eliminate the possibility of such an issue, we could establish a new fn:default() that can be used only as the default-key and only within a map as a key. Thus the above would become:

let $prize-table := map {
  "ten" : 10,
  "hundred" : 20,
  "thousand" : 100,
  "five-thousand" : 1000,
  "prime" : 25000,
 default() : 0
}

@michaelhkay
Copy link
Contributor

If we were to do that, I would go for a keyword such as #default.

However, I'm strongly inclined towards having a self-imposed rule for this round: no data model changes.

@dnovatchev
Copy link
Contributor Author

If we were to do that, I would go for a keyword such as #default.

However, I'm strongly inclined towards having a self-imposed rule for this round: no data model changes.

This is exactly why instead of a new special keyword #default we can use the proposed function default() . This is just a function (constant, like the existing math:pi() ) and just adding a function doesn't change the data model.

@ChristianGruen
Copy link
Contributor

ChristianGruen commented Jan 20, 2022

I don't believe we should add a special-default-case solution just for maps. There are so many other expressions that could benefit from, but don't have a default branch.

I would rather recommend users to take advantage of the upcoming otherwise expression (or ?: as alternative writing, as currently realized in BaseX).

@pgfearo
Copy link

pgfearo commented Jan 20, 2022

I like the @michaelhkay proposal because, if I understand correctly, the following syntax seems quite neat:

let $prize-table := map {
  "ten" : 10,
  "hundred" : 20,
  "thousand" : 100,
  "five-thousand" : 1000,
  "prime" : 25000
} => map:with-default(0)

Regarding the ? lookup operator - given that we use * on the RHS to return all map values, perhaps we could use + on the RHS to return the default value, giving:

let $default := $prize-table?+

@dnovatchev
Copy link
Contributor Author

I think using the function with-default() as proposed by @michaelhkay makes sense and it seems quite visually appealing and intuitive if written in combination with the arrow operator:

let $myMap := map{"x" : 1, "y" : 2} => with-default( 0 )

It would be more visually appealing if this function could be called without specifying a prefix, so why not define it in the standard function namespace?

@dnovatchev
Copy link
Contributor Author

Plus an extension to the semantics of the lookup operator '?' so that the LHS can be an arity-1 function. (But the RHS is atomised, so I think to maintain consistency with the current semantics it would have to be a function whose argument type is singleton atomic.)

I believe this was already proposed in a separate proposal by @ChristianGruen : [XPath] Generalize lookup operator for function items

@dnovatchev
Copy link
Contributor Author

The essence of maps as currently defined is their simplicity, and I don't want to lose that.

I'm wondering if the most common use-case can be met by providing convenient ways to turn a map into a function:

map:with-default($map, $default) returns function($x){ if (map:contains($map, $x)) then $map($x) else $default }

@michaelhkay After careful consideration, I am satisfied with the function so proposed.
Just for readability I would prefer this to be in the standar fn; namespace, thus instead of having to write:

$someMap => map:withDefault(5)

we can omit any prefix and its namespace binding and simply write:

$someMap => withDefault(5)

Plus an extension to the semantics of the lookup operator '?' so that the LHS can be an arity-1 function. (But the RHS is atomised, so I think to maintain consistency with the current semantics it would have to be a function whose argument type is singleton atomic.)

We already have a similar proposal by @ChristianGruen : [XPath] Generalize lookup operator for function items

Also, in the starting proposal, we showed that the lookup operator ? for any decorated map $dm (and a total map is a special case of a decorated map) is identical to, and should be defined as:

$dm($base-map, ?)

I'm also wondering about a way of deriving a function from a map in such a way that the function is more strongly typed, but I'm finding it hard to come up with anything better than just writing a function that wraps the map explicitly:

function($x as xs:integer) as xs:string* {map:with-default($map, "")?$x}

Shouldn't this be:

function($x as xs:integer) as xs:string* {map:with-default($map, () )?$x}

@dnovatchev
Copy link
Contributor Author

I'm wondering if the most common use-case can be met by providing convenient ways to turn a map into a function:

map:with-default($map, $default) returns function($x){ if (map:contains($map, $x)) then $map($x) else $default }

@michaelhkay, In a previous comment I expressed approval for this idea. 💯

Please note however, that the users will certainly want to treat the result of the with-default as a map and not as just as a function.

Therefore, I would prefer the definition and implementation of the function as in the following expression:

let $withDefault := function($map as map(*), $default as item()*) as map(*)
{
   let $extenderKey := xs:float("-INF"),
       $props := if(map:contains($map, $extenderKey))
                    then  map:put($map($extenderKey), "$default", $default)
                    else map{ "$default" : $default }
    return
      map:put($map, $extenderKey, $props) 
}
  return
    (
      $withDefault(map{"x" : 2, "y" : 3}, 10),
      "==============",
       $withDefault($withDefault(map{"x" : 2, "y" : 3}, 10), 15)
     )

Evaluating this correctly produces:

map {
  xs:float("-INF"): map {
    "$default": 10
  },
  "x": 2,
  "y": 3
}
==============
map {
  xs:float("-INF"): map {
    "$default": 15
  },
  "x": 2,
  "y": 3
}

We will modify the behavior of the map:XXX functions not to deal with the $extenderKey and its values, except in the cases when $extenderKey is specifically provided as argument.

One could argue what specific value we should choose for the key $extenderKey. We can agree on any newly-generated GUID, or any other suitable value.

Different programming languages handle this in different ways. For example, in Python double-underscore mangling is the way to maintain distinct names, as explained here

@dnovatchev dnovatchev added XPath An issue related to XPath Feature A change that introduces a new feature labels Sep 14, 2022
@ChristianGruen
Copy link
Contributor

ChristianGruen commented Sep 25, 2022

Once again I’d like to draw the attention to the new otherwise expression, which I believe is a much more generic solution to lookups (not just in maps) that don't yield results. It's available in many other languages as well (https://en.wikipedia.org/wiki/Null_coalescing_operator). Do we really need and want a solution specific to maps? What about arrays and sequences, and what about lookups that return an empty sequence as value?

@dnovatchev
Copy link
Contributor Author

Do we really need and want a solution specific to maps? What about arrays and sequences, and what about lookups that return an empty sequence as value?

@ChristianGruen This is a very good question!

I completely agree that if we just wanted to provide a default-value capability to a map, there could be ways to achieve this with a null-coalescing function/operator.

However, this issue is about maps and ways to extend a map - not only with a default value, but potentially with new capabilities, that are not planned in advance. For example, a decorated map, that knows about the sequence of all decorators that have been applied to itself. Obviously the null-coalescing cannot help to achieve this, but placing all new-capabilities-related data within a nested map - can.

Addressing the concern that focusing to "just maps" is something not so generic, let us remember that maps in XPath are analogous to objects in the OOP world. Is there something even more generic than objects?

@ChristianGruen ChristianGruen changed the title [XPath] Proposal: Maps with Infinite Number of Keys: Total Maps and Decorated maps Maps with Infinite Number of Keys: Total Maps and Decorated maps Apr 27, 2023
@ChristianGruen ChristianGruen added Discussion A discussion on a general topic. and removed Feature A change that introduces a new feature labels Jun 29, 2023
@ChristianGruen
Copy link
Contributor

Copied from #707 (comment) (I think it’s better to discuss it here):

It is easy to see that this current proposal has a few shortcomings (the last one critical):

  • In many cases we may want just an error to be raised, thus returning the empty sequence will need more coding and checking (as above)

You could replace the default/fallback value with a function that creates the result (similar to https://qt4cg.org/specifications/xpath-functions-40/Overview.html#func-map-get);

@dnovatchev
Copy link
Contributor Author

Copied from #707 (comment) (I think it’s better to discuss it here):

It is easy to see that this current proposal has a few shortcomings (the last one critical):

  • In many cases we may want just an error to be raised, thus returning the empty sequence will need more coding and checking (as above)

You could replace the default/fallback value with a function that creates the result (similar to https://qt4cg.org/specifications/xpath-functions-40/Overview.html#func-map-get);

A programmer's proverb:

"Every problem can be solved with introducing one more level of indirection - except the problem of having too-many levels of indirection"

😄

@ndw ndw added PRG-hard Categorized as "hard" at the Prague f2f, 2024 PRG-optional Categorized as "optional for 4.0" at the Prague f2f, 2024 labels Jun 4, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Discussion A discussion on a general topic. PRG-hard Categorized as "hard" at the Prague f2f, 2024 PRG-optional Categorized as "optional for 4.0" at the Prague f2f, 2024 XPath An issue related to XPath
Projects
None yet
Development

No branches or pull requests

5 participants