Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add information about recurring event and the sequence number #90

Open
NicoHood opened this issue May 15, 2022 · 20 comments
Open

Add information about recurring event and the sequence number #90

NicoHood opened this issue May 15, 2022 · 20 comments

Comments

@NicoHood
Copy link

NicoHood commented May 15, 2022

Hi!
I am using your library with success for some time. I do have a feature request, or more a brainstorm on how to solve this issue.

With your library I am basically unflattening the recurring event to flat single-day events. However the information that this was a recurring event is actually lost then. What is also lost is the information which recurring sequence this event was (the first, seconds, third occurrence).

Why do I need this? a) I have "normal" and recurring events. Sometimes I want to filter out the recurring events, as they bloat my usecase. b) I need a unique ID for each event. Usually the events have this id, but recurring events all have the same id. So I need to calculate a new internal id from the id + the sequence numer. That's at leats the plan.

Now how to solve this? Currently when parsing I will look at the whole input ical file and check if that original event had an rrule set. This way I can determine if the event was recurring. I currently cannot determine the sequence number, I'd need an rrule tool for that, I guess.

But isnt't there a way to built this into the library? Maybe with a custom X-RECURRING property or so? I don't want to bloat the library with my special usecase, but I can imagine, that others also might run into this issue. What are your thoughts?


We're using [Polar.sh](https://polar.sh/niccokunzmann) so you can upvote and help fund this issue. We receive the funding once the issue is completed & confirmed by you. Thank you in advance for helping prioritize & fund our work. Fund with Polar
@NicoHood
Copy link
Author

NicoHood commented May 15, 2022

I successfully managed to calculate the sequence as well. So here is my code, so you can have a better feeling of what I am talking about:

ical = icalendar.Calendar.from_ical(response.text)

# A dict that checks if a given uid is a recurring event.
# This is required since we unfold all events and afterwards cannot tell if they are recurring or not.
uid_parent_map = {}
for component in ical.walk():
    if component.name == "VEVENT":
        # Only add the parent event, not a moved event (that has a recurrence id set)
        if component.get("RECURRENCE-ID") is None:
            uid_parent_map[component["UID"]] = component

# Get start and end date of week:
# Requires python 3.8: https://docs.python.org/3/library/datetime.html#datetime.date.fromisocalendar
start = date.fromisocalendar(year, week, 1)
end = start + timedelta(days=7)

events = recurring_ical_events.of(ical).between(start, end)

# Convert ical to our internal class formal
eventsData = list()
for event in events:
    parent_event = uid_parent_map[event["UID"]]
    recurring = parent_event.get('RRULE') is not None
    sequence = 0
    if recurring:
        # Recreate the rrule using rrule, exdate, rdate.
        # Note: Exrule is deprecated and thereby not used.
        event_rrule = rrulestr(parent_event.get('RRULE').to_ical().decode(), dtstart=parent_event["DTSTART"].dt)

        # Rrules are managed in an rruleset:
        # https://dateutil.readthedocs.io/en/stable/rrule.html#dateutil.rrule.rruleset
        set = rruleset()
        set.rrule(event_rrule)

        # NOTE: We will NOT cound the exdates, as they would disturb our sequence.
        #       This means excluded events will just never appear as sequence number.
        # # This is a bit complicated:
        # # EXDATE and be either an vDDDLists object or a list of vDDDLists objects.
        # all_exdates = parent_event.get('EXDATE')
        # if all_exdates is not None:
        #     # If it is not a list, simply create a list with that single entry for easier parsing.
        #     if not isinstance(all_exdates, list):
        #         all_exdates = [all_exdates]
        #
        #     # Now iterate through the list, and the nested elements (also in a list)
        #     for exdatelist in all_exdates:
        #         for exdate in exdatelist.dts:
        #             set.exdate(exdate.dt)

        # This is a bit complicated:
        # RDATE and be either an vDDDLists object or a list of vDDDLists objects.
        # Note: I have not seen this in the wild yet, so this code section is untested
        all_rdates = parent_event.get('RDATE')
        if all_rdates is not None:
            print('rdate', event["SUMMARY"])
            # If it is not a list, simply create a list with that single entry for easier parsing.
            if not isinstance(all_rdates, list):
                all_rdates = [all_rdates]

            # Now iterate through the list, and the nested elements (also in a list)
            for rdatelist in all_rdates:
                for rdate in rdatelist.dts:
                    set.rdate(rdate.dt)

        # Special case when an event was moved:
        # Search for the original occurence of the event (value inside recurrence-id)
        # in order to have the original sequence intact!
        after = parent_event["DTSTART"].dt
        before = event["DTSTART"].dt
        recurrence_id = event.get("RECURRENCE-ID")
        if recurrence_id is not None:
            before = recurrence_id.dt

        # We must expand the whole rrule in order to know which sequence number was used.
        # The sequence starts with 1 (not zero!)
        # TODO the library crashes for recurring all-day events, so we need to convert to datetime first:
        # https://github.com/dateutil/dateutil/issues/938#issuecomment-1127004769
        all_rrule_events = set.between(after if type(after) is datetime else datetime.combine(after, datetime.min.time()), before if type(before) is datetime else datetime.combine(before, datetime.max.time()), inc=True)
        sequence = len(all_rrule_events)

    data = self.get_event_data(event, recurring = recurring, sequence = sequence)
    if data is not None:
        eventsData.append(data)

return eventsData

And this is where I generate the guids:

# Generate a unique (but well-known) id for the event, by using the sequence of the recurring event.
guid = uuid.uuid5(uuid.NAMESPACE_URL, f"{event['UID']}-{sequence}")

@niccokunzmann
Copy link
Owner

niccokunzmann commented May 16, 2022

Hi, thanks for taking the time for the whole motivation of your use-case.

When I implemented this, I also needed to find out if a recurrence has a modification, so there is identity already specified by the RFC 5545:
An event is the same event as another event if and only if these apply:
(1) the UUID is the same
(2) the DTSTART is the same

Or else: if the same event happens twice at the exact same time, it is the same. (If you make a copy in your calendar, the UUID changes)

If the sequence number is changed, we know which version is latest.

So, it seems, that the library just looses this information. That should be easy to add.

On a side-note, I would encourage you to use the identity specified by the standard and not construct your own for it like [event no 10 with UUID X]

I would say that knowing event identity is a common use-case, maybe the other issue #29 about the event index is about that - but without the motivation that is hard to guess. It should at least be documented in a section. I would say that this holds true and we can make a test with all calendar files used by this library.

counts = {} # event identity : count
for event in recurring_ical_events.of(a_calendar).all():
    event_identity = (event["UID"], event.get("RECURRENCE-ID", event["DTSTART"])) # assume UUID given
    counts[event_identity] = counts.get(event_identity, 0) + 1
assert all(count == 1 for count in counts.values())

What do you think?

@NicoHood
Copy link
Author

Using the dtstart is not an option for me, as I want to track how events change. Let me explain the usecase again:

I have a stagging environment A that parses ical files.
And an event database B.

Whenever an event should move from A to B it must be confirmed by a human. And that'swhy I need to check if an event is already in the database, or if it changed. A dtstart change would be quite common, so I cannot rely on that. I can only rely on the uid, but for recurring events that is not enough. So for my usecase a combination of the uid and sequence is a must. I currently do not see any other solution to that.

Also aren't you talking about UID instead of UUID?

@niccokunzmann
Copy link
Owner

UID - yes.
Also, I remember now, the Attribute is called 'RECURRENCE-ID' to make sure that moving recurrences is possible. I would guess that a combination of DTSTART or if present RECURRENCE-ID will yield a good result. Numbering fails if a weekly event is moved by more than one week.

@NicoHood
Copy link
Author

Numbering fails if a weekly event is moved by more than one week.

That is not true, why should it be? The recurrence id is set to the date where the event should originally take place. I take that into account and search for this end date, rather than the new, moved end date. This way I get the sequence number in the right place. What you mean is possibly that the order is wrong (correct), but the number itself is still unique (which is most important for me).

My usecase is quite specific. Another idea would be to integrate my logic and use this to produce unique UIDs. Because this is the final usecase, and if your library wants to unflatten an ical file, shouldn't it also take the UIDs into account? I could imagine that recurring events unflattend will have duplicated, UIDs which is agains the spec, isn't it?

@niccokunzmann
Copy link
Owner

This way I get the sequence number in the right place.

Cool! You might have said that somewhere. I am catching up.

I need a unique ID for each event. Usually the events have this id, but recurring events all have the same id. So I need to calculate a new internal id from the id + the sequence numer.

Here, you are talking about the sequence number.

So. How would you like to go about it?

E.g. if you work on it in a fork, we can still be in discussion here and in a PR.

Requirements that you do not have could be Python version compatibility and zoneinfo compatibility, performance.

For performance, I have a wondering about whether generating all events from 1970 to 2050 may or may not be so bad if at least they are skipped easily.
Also, your code could run optionally. We can make the tests so they switch it on and off.

What do you do with the RDATE if you do not consider EXDATE?

@niccokunzmann
Copy link
Owner

I modified the code to include "RECURRENCE-ID": #90 (comment)

@NicoHood
Copy link
Author

It took me a long time to understand your code, but it is super smart. The idea is to use the id + date OR recurring id in combination, which is mostly equivalent to my code, just smaller. What is not handled yet is the rdate, but that could be added as well. I have no sample whith an rdate, but isn't that just like the sequence?

This way we could generate new UIDs that are truely unique. Smart, smart...

Now the next step for me is to ask one more question:
Why are RECURRENCE-ID and such information even included in the generated event? Since the library's goal is to unfold all events I would expect them to not have this number.

In this particular usecase it is very useful to know, but if we integrate the logic inside the library, we can strip all the recurring related information. But this would also mean, that we need to overwrite the UID and set a new one.

In the end we have to decide:

  • Do we want each outputted event to have a unique (new) id?
  • Do we want to remove all (not required) information about the recurring, as the event was flattened?

I'd say yes, what do you say?

@NicoHood
Copy link
Author

I have to correct my statement:
Using the date is not enough, as it would not work when the event was moved. Let's say you've added a recurring event on monday, but now it has moved from 19:00 to 20:00, then it would be treated as different UID. It would be better, if it still has the same uid, with the new date, so I can track that change and display it on my User Interface. That is the only option I have.

@niccokunzmann
Copy link
Owner

niccokunzmann commented May 19, 2022 via email

@NicoHood
Copy link
Author

Well the recurrence id is usually set for moved events, but not for normal recurring events, right?

Indicating if an event is recurring is important to me. If we should use the recurrence id, i am not sure. It could be a smart default, very close to the spec. But it might also be very non-standard, I don't know.

@niccokunzmann
Copy link
Owner

niccokunzmann commented May 19, 2022 via email

@NicoHood
Copy link
Author

So I suggest to always set the recurrence id to the start of the event, if it IS a recurring event. This covers one of my usecases.

For the second usecase I am not yet sure what is the best practice. Rewriting the uid could be an idea.

@niccokunzmann
Copy link
Owner

niccokunzmann commented May 20, 2022 via email

@NicoHood
Copy link
Author

The second usecase is to have unique IDs for each event. None of the event should have the same UID. I'd suggest to use a combination of the mentioned UID + Sequence of the recurring event (note: the "sequence" I mean is not the sequence in the ical spec)

@niccokunzmann
Copy link
Owner

niccokunzmann commented May 26, 2022 via email

@niccokunzmann
Copy link
Owner

niccokunzmann commented Oct 11, 2022 via email

@NicoHood
Copy link
Author

I think we are misunderstanding each other. In the meantime I learned and tried a lot and got some good feeling of what I really need now.

I have two requirements for the unfolded events:

  • Know if they were in a recurring series (yes/no)
  • Know the number of recurrence with in the series (1...n)

I have implemented it by scanning the unfolded events afterwards, but it would make sense to integrate that into the library itself. We can keep everything as it is, but we need to add one more custom property. I think there is no way around a custom property, as this information is nothing specified by the spec.

I've implemented it with a single unsigned integer.

  • If it is 0 the event is not recurring, if it is >0 it is (first requirement)
  • A recurring event will have the number within that series stored as int starting from 1..n

I would keep the UID and I would keep the RECURRENCE-ID a) to not break existing code and b) as they are both useful in its own way. The UID is the original id and thereby useful (should be self explaining), the RECURRENCE-ID is the date from where the event date was moved. I personally do not (yet) need this information, but others might need.

So how do we want to name the new property? I've named it internally in my code sequence, but i think in the spec this name is already used for something else. occurence would be another idea, or recurring-sequence? Maybe also with a X- prefix?

Is it more clear now? If not, we can also have a chat in german, that might be easier.

@niccokunzmann
Copy link
Owner

I think, this is very clear now.

  1. If an event is not recurring, nothing is changed
  2. If an event is recurring, a property is added, I propose to name it X-RECURRENCE-INDEX as
    • it is about recurrence
    • it is a custom property
    • it is an index in a sequence.

I am happy to see a PR with some tests. I think these will make sure that you do not get any problems in production:

  • no property
  • what is expected if RDATE is there
  • what is expected if only EXDATE is there
  • what is expected if RRULE is there
  • what happens if they are there but there is only one event in the sequence
  • between() and at() will sometimes not return the event with index 0.
  • create a test with about 10000 events (daily 1970 to 2000) and only use 1 of them
  • proper index for rrule, exdate and rdate mix also with edited event

Did I get it? Hab ich es endlich verstanden? :)

By the way: if you like me to implement it, I am happy to be funded. I am doing this in my free time, have two kids and we live below the poverty line according to this definition (I am in Wales), at the moment.

@NicoHood
Copy link
Author

That is correct so far. Just one difference: I've set the recurrence index to 0 for non recurring events. Just to indicate, that this feature is available (instead of leaving it out). Both options however should work for the usecase.

I need some more time to implement this, but we have a good starting point. Give me some time :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants