Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

IntervalMultiDict #83

Closed
AlexandreDecan opened this issue Apr 30, 2023 · 3 comments
Closed

IntervalMultiDict #83

AlexandreDecan opened this issue Apr 30, 2023 · 3 comments
Labels
discussion Let's discuss about it! enhancement New feature or request long term Will be addressed later

Comments

@AlexandreDecan
Copy link
Owner

AlexandreDecan commented Apr 30, 2023

Hello,

We currently have an IntervalDict structure that enables to map values to Interval instances. One of the main limitation of this structure is its inability to store more than one value for each Interval. The current issue proposes to extent IntervalDict (or to replace it) with an IntervalMultiDict structure allowing to store more than one pair key/value with intervals.

Let's assume the following scenario where "teachers" need to book "rooms" for timeslot (and a timeslot is an Interval instance of datetime or, to make it simpler, of Integers). Currently, one possible option to store these kind of data is as follows:

>>> teacher = P.IntervalDict()
>>> room = P.intervalDict()
>>> teacher[P.closed(8, 10)] = 'Mr. Bean'
>>> teacher(P.openclosed(10, 12)] = 'Ms. Carrot'
>>> room[P.closed(8, 12) = 'Appropriate room'

The purpose of the current issue is to make it more "API-friendly" to deal with such use-cases:

>>> attribution = P.MultiIntervalDict()
>>> attribution[P.closed(8,12)]['room'] = 'Appropriate room'
>>> attribution[P.closed(8,10)]['teacher'] = 'Mr. Bean'
>>> attribution[P.openclosed(10,12)]['teacher'] = 'Ms. Carrot'

Or even:

>>> attribution['room'][P.closed(8, 12)] = 'Appropriate room'
>>> attribution['teacher'][P.closed(8, 10)] = 'Mr. Bean'
>>> attribution['teacher'][P.openclosed(10, 12)] = 'Ms. Carrot'

Among other features, this enables:

>>> attribution[P.closed(9, 11)]
{
  P.closed(9,10): {
    'teacher': 'Mr. bean', 
    'room': 'Appropriate room', 
  },
  P.openclose(10, 11): {
    'teacher': 'Ms. Carrot', 
    'room':  'Appropriate room', 
  }
}

In this case, partial views on the MultiIntervalDict instances would be returned.

The opposite behaviour is also an option:

>>> attribution['teacher']
{
  P.closed(8, 10): 'Mr. Bean', 
 P.openclosed(10 ,12): 'Ms. Carrot', 
}

In that case, the returned value type would be an IntervalDict.

Obviously, combinations of both examples are expected:

>>> attribution[P.closed(9, 11)]['teacher']
{
  P.closed(9, 10): 'Mr. bean', 
  P.openclosed(10, 12): 'Ms. Carrot', 
}

or

>>> attribution['teacher'][P.closed(9, 11)]
{
  'Mr. Bean' : P.closed(9, 10), 
  'Ms. Carrot': P.openclosed(10, 12), 
}

The underlying structure would be a dictionary of IntervalDict instances. Performance is not expected to be good, but the whole structure would be mostly provided for convenience, not for performance. One main question that remains is how to distinguish between keys that are Interval instances and keys that are "appropriate dict keys". Is it easy to do a check on the type of the key, but what should be done when a key is, e.g., a numerical or a string value ? (since an Interval instance can be composed of numerical or string values).

Another point to consider is how to handle updates on "partial" objects? For example, if one asks for d[x][y], we need to ensure that d is updated if d[x] is. Returning an IntervalDict for d[x] is an option (since IntervalDictinstances are mutable) but probably we need to create a new subclass of dict so that any modification to d[i][k] where i is an interval and k is a "traditional" key would be reflected on d. Another option, as seen in pandas, would be to go for a d[x, y] notation (i.e., a tuple has to be provided to make sure that we return an appropriate data structure to keep track of the changes, instead of allowing notations such as d[x][y]).

@AlexandreDecan AlexandreDecan added enhancement New feature or request help wanted Extra attention is needed labels Apr 30, 2023
@AlexandreDecan
Copy link
Owner Author

Here's an example of a potential API. Indexing accepts either an interval, a key, or a tuple (interval, key). Internally, the structure would be a dict of IntervalDict (performance might be an issue, but this new IntervalMultiDict would be mostly proposed for convenience).

>>> m = P.IntervalMultiDict()
>>> m[P.closed(8, 12), 'teacher'] = 't1'
>>> m[P.closed(8, 10), 'room'] = 'r1'
>>> m[P.closed(11, 12), 'room'] = 'r2'
>>> m['room']  # Providing a key returns a (new?) IntervalDict instance
IntervalDict({P.closed(8, 10): 'r1', P.closed(11, 12): 'r2'})
>>> m[P.closed(8, 10]  # Returns a (likely new) IntervalMultiDict
{'teacher': IntervalDict({P.closed(8,10): 't1'}), 'room': IntervalDict(....)}
>>> m[P.singleton(9)]  # If interval is a singleton, returns a classical dict
{'teacher': 't1', 'room': 'r1'}
>>> m[P.singleton(9), 'room']  # Singleton + key return a value
'r1'
>>> m[9, 'room']  # For convenience, if a tuple is passed and the interval is a singleton, its value can be provided directly
'r1'
>>> m[P.singleton(9), ['room', 'teacher']]  # A list of keys can be passed
{'room': 'r1', 'teacher': 't1'}
>>> m[P.closed(10, 12)] = { 'room': 'r3', 'course': 'c1'}  # Dict can be used for assignment
>>> m[P.singleton(11)]
{'teacher': 't1', 'room': 'r3', 'course': 'c1'}

Mutability would only we allowed through indexing (i.e., m[x] always return a new instance, either of an IntervalDict if x is a key or a list of keys, or an IntervalMultiDict if x is an Interval. Changes on these returned values has no impact on the initial structure. That means that m[p,k] = x mutates m while m[p][k] = x does not).

Since this API is quite distinct from the one of a traditional dictionary, perhaps this structure shouldn't be called "IntervalMultiDict". Using "Dict" in the name implicitly suggests we mimick the API of a dict (as was the case for IntervalDict) but this means we need to implement many more operations (such as keys(), values(), items() returning views), and this can be tricky at first

@AlexandreDecan AlexandreDecan added discussion Let's discuss about it! and removed help wanted Extra attention is needed labels May 3, 2023
@AlexandreDecan
Copy link
Owner Author

While convenient, these two examples are "incorrect" in the sense that one would expect the output type to be consistent for all intervals, including singletons:

>>> m[P.singleton(9)]  # If interval is a singleton, returns a classical dict
{'teacher': 't1', 'room': 'r1'}
>>> m[P.singleton(9), 'room']  # Singleton + key return a value
'r1'

In the first example, P.singleton(9) cannot be replaced by 9 since there's no way to detect in m[x] whether x aims to refer to a value from the interval domain, or a key from the dictionary.

@AlexandreDecan
Copy link
Owner Author

I'll keep this for later...

@AlexandreDecan AlexandreDecan closed this as not planned Won't fix, can't repro, duplicate, stale May 27, 2023
@AlexandreDecan AlexandreDecan added the long term Will be addressed later label May 27, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
discussion Let's discuss about it! enhancement New feature or request long term Will be addressed later
Projects
None yet
Development

No branches or pull requests

1 participant