# Redundancy

As we've seen in data representations, redundancy can sometimes make operations more efficient. Stored data is redundant if we can remove some part of it and still reconstruct the removed part from what remains. For example, we might store two forms of a dictionary: `{parent_node: child_node}` and `{child_node: parent_node}`.

- Redundant storage can sometimes increase performance
- Redundant storage increases the risk of internal inconsistency and unexpected behaviour
  - Make sure to have a function that checks for internal inconsistency
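
As a minimal sketch of the two-dictionary idea above, the class below stores each link in both directions and checks that the two views agree after every update. The names `LinkedPairs`, `link`, `child_of`, `parent_of` and `_check_rep` are made up for illustration, and the sketch assumes each parent has exactly one child.

```python
class LinkedPairs:
    """Stores parent/child links in both directions (hypothetical example)."""

    def __init__(self):
        self.child_of = {}   # parent_node -> child_node
        self.parent_of = {}  # child_node -> parent_node

    def link(self, parent, child):
        # Drop any stale links first so the two dictionaries never disagree.
        old_child = self.child_of.pop(parent, None)
        if old_child is not None:
            del self.parent_of[old_child]
        old_parent = self.parent_of.pop(child, None)
        if old_parent is not None:
            del self.child_of[old_parent]

        # Always update both representations together.
        self.child_of[parent] = child
        self.parent_of[child] = parent
        self._check_rep()

    def _check_rep(self):
        # Every forward link needs a matching reverse link, and vice versa.
        for parent, child in self.child_of.items():
            assert self.parent_of.get(child) == parent
        for child, parent in self.parent_of.items():
            assert self.child_of.get(parent) == child
```

Looking up a child's parent is now a single dictionary access instead of a scan over every entry of `child_of`. The price is that every update has to touch both dictionaries.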

## Meeting Calendar

We use a meeting calendar which tracks all meetings of an organization. Each meeting has a `title`, `date` and `invitees`.

We can store meetings keyed either by `date` or by `invitees`. Storing both allows efficient access to meetings either way.

In [52]:
class Meeting():
    def __init__(self, title, date, invitees):
        self.title = title
        self.date = date
        self.invitees = invitees 
        
        # store the other dates on which this meeting occurs
        self.recurrences = set() 
            
    ########### ignore this method ###########
    def pprint(self, num_people_to_show=5):
        invitees = ""
        personi = 0
        for person in self.invitees:
            if personi <= num_people_to_show:
                invitees = invitees + person + ", "
                personi = personi + 1
            else:
                invitees = invitees[:-2] + "....."
                break
        result = f"{self.title} with {invitees[:-2]}"
        
        # add recurrences to printout
        if len(self.recurrences) != 0 : 
            recurrs = "; recurring on "
            for date in self.recurrences:
                recurrs = recurrs + date + ", "
            recurrs = recurrs[:-2]
            result = result + recurrs
        
        print(result)
        
        
class Calendar():
    def __init__(self):
        self.dates = {}
        self.people = {}
        
        self._checkRep()

    def _isrecurring(self, new_meeting):
        """Checks whether this meeting already
        exists with the same title and invitees.

        Recurring meetings occur on multiple dates with the same title and same set of invitees.
        """
        
        for meetings in self.dates.values():
            for meeting in meetings:
                if meeting.title == new_meeting.title and \
                    meeting.invitees == new_meeting.invitees:
                        return meeting
        return False
        
    def add_meeting(self, meeting):
        
        recurring_meeting = self._isrecurring(meeting)
            
        if not recurring_meeting: 
            # Implementation before support for recurring meetings:
            if meeting.date not in self.dates:
                self.dates[meeting.date] = [meeting]
            else: 
                self.dates[meeting.date].append(meeting)

            for invitee in meeting.invitees:
                if invitee not in self.people:
                    self.people[invitee] = [meeting]
                else: 
                    self.people[invitee].append(meeting)  
        else:
            # the recurring case
            recurring_meeting.recurrences.add(meeting.date)             
            
        # uncomment this line and our checkRep breaks---this meeting is only added to the people dictionary:
        # self.people["Adam"].append(Meeting("Meeting I forgot about!", "04-04-19", {"Adam"}))

        self._checkRep()

    def get_persons_meetings(self, person):
        if person in self.people:
            return self.people[person]
        else:
            raise LookupError("This person has no meetings.")
            
        # # First solution with looping needed, when we didn't store the people dictionary:
        # results = []
        # for meetings in self.dates.values():
        #     for meeting in meetings:
        #         if person in meeting.invitees:
        #             results.append(meeting)
        # return results


    def _checkRep(self):
        """Ensures that the internal representations of the meeting data
        are consistent; checks that all meetings in the people dictionary
        are also in the date dictionary, and vice versa. If not, raises 
        an AssertionError. """
        
        for dated_meetings in self.dates.values():
            for meeting in dated_meetings:
                for person in meeting.invitees:
                    # .get() so a missing person fails the assert instead of raising KeyError
                    assert meeting in self.people.get(person, [])
                
        for personed_meetings in self.people.values():
            for meeting in personed_meetings:
                # .get() so a missing date fails the assert instead of raising KeyError
                assert meeting in self.dates.get(meeting.date, [])
                

    ############## ignore this method ##############
    def pprint(self):
        for date, meetings in self.dates.items():
            print(date + ": ") 
            for meeting in meetings:
                meeting.pprint()
            print()
        self._checkRep()

### Illustrating Our Case

In [53]:
import random
from datetime import datetime, timedelta

def generate_test_calendar(num_meetings, num_staff):
    '''generates a synthetic calendar containing num_meetings with a total directory of num_staff'''
    calendar = Calendar()
    
    # Generate staff members
    staff = [f"staff{i}" for i in range(1, num_staff + 1)]
    
    # Generate meeting titles
    meeting_types = ["Team Meeting", "Project Review", "Training Session", "Client Call", "Brainstorming"]
    
    # Generate dates
    start_date = datetime(2023, 1, 1)
    date_range = [start_date + timedelta(days=i) for i in range(365)]  # One year range
    
    for _ in range(num_meetings):
        # Random meeting details
        title = random.choice(meeting_types)
        date = random.choice(date_range).strftime("%m-%d-%y")
        num_attendees = random.randint(2, min(num_staff, 10))  # Between 2 and 10 attendees, or max staff if less
        invitees = set(random.sample(staff, num_attendees))
        
        # Create and add the meeting
        calendar.add_meeting(Meeting(title, date, invitees))
    
    return calendar

# adjust as needed
calendar = generate_test_calendar(3000,
                                200)

# Print the generated calendar
calendar.pprint()

06-18-23: 
Training Session with staff156, staff155, staff144
Team Meeting with staff95, staff135, staff188, staff127, staff88, staff126...
Team Meeting with staff26, staff162, staff178, staff172, staff147, staff196...
Brainstorming with staff126, staff59, staff164, staff160
Project Review with staff115, staff132, staff144, staff3, staff1, staff90...
Team Meeting with staff37, staff134, staff19, staff25
Brainstorming with staff67, staff161

10-03-23: 
Brainstorming with staff161, staff62, staff138, staff65, staff27, staff182...
Project Review with staff95, staff99, staff177, staff153, staff181, staff156...
Training Session with staff13, staff190, staff145, staff97, staff28, staff138...
Team Meeting with staff4, staff141
Project Review with staff161, staff5, staff28, staff21, staff24, staff46...

05-24-23: 
Project Review with staff114, staff152, staff166
Team Meeting with staff124, staff76, staff79, staff113, staff154, staff20...
Project Review with staff82, staff93, staff161, staff153

### Accessing By People

Suppose Adam wants to know his schedule. In the original solution (with only `self.dates` and no `self.people`), we would have to loop through every meeting and check whether Adam is among its invitees.

After implementing `self.people`, it's much faster. Try the code below with both methods by commenting out the relevant portions of `get_persons_meetings`. The `self.people` version takes less than half the time of the looping version.

In [58]:
import time

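# perf_counter is a monotonic, high-resolution clock suited to timing short intervals
start_time = time.perf_counter()

adams_meetings = calendar.get_persons_meetings("staff2")

for mtg in adams_meetings:
    mtg.pprint()

end_time = time.perf_counter()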

execution_time = end_time - start_time
print(f"Execution time: {execution_time:.8f} seconds")

Team Meeting with staff120, staff198, staff2, staff200, staff109, staff152...
Brainstorming with staff124, staff16, staff2, staff37, staff78, staff139...
Client Call with staff80, staff2, staff194, staff153, staff19, staff6...
Client Call with staff124, staff95, staff106, staff187, staff2, staff200...
Team Meeting with staff72, staff2, staff176, staff144, staff78, staff121
Training Session with staff140, staff2, staff75, staff130, staff136
Training Session with staff59, staff2, staff194, staff11, staff173, staff116...
Brainstorming with staff31, staff2, staff109, staff142, staff88, staff70...
Brainstorming with staff2, staff127, staff61, staff125, staff195, staff163...
Training Session with staff94, staff13, staff2
Training Session with staff68, staff190, staff2, staff200, staff48, staff163...
Client Call with staff77, staff145, staff2, staff147, staff191, staff42...
Project Review with staff68, staff149, staff2, staff19, staff118, staff27...
Team Meeting with staff2, staff180, staff16
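
The timing above also includes the cost of printing. To compare only the two lookup strategies, a standalone sketch along the following lines may help. It does not use the `Calendar` class: meetings are reduced to plain sets of invitees, and the names `meetings`, `by_person`, `scan` and `lookup` are made up for illustration. Actual timings depend on the machine.

```python
import random
import timeit

random.seed(0)  # fixed seed so the synthetic data is reproducible

staff = [f"staff{i}" for i in range(1, 201)]
# Each meeting is just a set of invitees here; titles and dates are omitted.
meetings = [set(random.sample(staff, random.randint(2, 10))) for _ in range(3000)]

# Redundant index: person -> list of positions in `meetings`.
by_person = {}
for idx, invitees in enumerate(meetings):
    for person in invitees:
        by_person.setdefault(person, []).append(idx)

def scan(person):
    """Loop over every meeting (no index)."""
    return [idx for idx, invitees in enumerate(meetings) if person in invitees]

def lookup(person):
    """Single dictionary access using the redundant index."""
    return by_person.get(person, [])

# Both approaches must return the same meetings.
assert scan("staff2") == lookup("staff2")

scan_time = timeit.timeit(lambda: scan("staff2"), number=200)
lookup_time = timeit.timeit(lambda: lookup("staff2"), number=200)
print(f"scan:   {scan_time:.6f} s for 200 calls")
print(f"lookup: {lookup_time:.6f} s for 200 calls")
```

The scan does work proportional to the total number of meetings on every call, while the lookup is one dictionary access regardless of calendar size.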

## Ensuring Consistency

Since we have two dictionaries that represent the same information in different ways, we need an internal consistency check, `_checkRep`, to ensure they never disagree.

> Note the underscore at the beginning of the method name. This is a Python convention for methods that aren't meant to be accessed from outside the class.

We call `_checkRep` fairly frequently: every time we add a `Meeting` and every time we print the calendar. Each call walks every meeting in both dictionaries, so we don't want to call it more often than necessary.
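
One possible way to limit that cost, sketched here as an assumption rather than something the `Calendar` class above does, is to put the check behind a switch that can be turned off once the code is trusted. The class `Index` and the flag `CHECK_REP` are hypothetical names.

```python
class Index:
    """Two redundant views of the same (date, person) pairs (hypothetical)."""

    CHECK_REP = True  # assumption: class-level switch, set to False to skip checks

    def __init__(self):
        self.by_date = {}    # date -> set of people
        self.by_person = {}  # person -> set of dates

    def add(self, date, person):
        # Update both views together, then verify they still agree.
        self.by_date.setdefault(date, set()).add(person)
        self.by_person.setdefault(person, set()).add(date)
        self._check_rep()

    def _check_rep(self):
        if not self.CHECK_REP:
            return  # skip the full scan when checks are disabled
        for date, people in self.by_date.items():
            for person in people:
                assert date in self.by_person.get(person, set())
        for person, dates in self.by_person.items():
            for date in dates:
                assert person in self.by_date.get(date, set())
```

With `Index.CHECK_REP = False`, `add` no longer pays for a full scan on every call. Separately, Python strips `assert` statements when run with the `-O` flag, so assertion-based checks like these stop detecting anything in optimized mode.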