# Task type
- Free time gap/slot
- Recurring
- Calculation

## Time Gap
Find the available time gaps.


**Task**: Optimize the calendar so that it includes a 30-minute walk every day. The requirements and solution steps break down as follows:

### Steps and Breakdown 

1. **Pre-filter the DataFrame**:
   - Exclude canceled meetings from `filtered_calendar_data` since they do not affect your availability.
   - Include only confirmed and tentative meetings as these impact your schedule.

2. **Determine Available Time Slots**:
   - For each day within the dataset, calculate the intervals between meetings that are at least 30 minutes long. These are potential walk times.

3. **Daily Time Constraints**:
   - Walk times should fit between standard work hours (9:00 AM to 6:00 PM). We must avoid suggesting a walk time that doesn't fit within these hours.

4. **Sort and Aggregate Walk Times**:
   - Prioritize finding at least one 30-minute block per day. If a day already has an available 30-minute block, that's adequate. If not, look into rearranging shorter breaks into one continuous 30-minute block if possible.

5. **Output the Analysis**:
   - For days where it's impossible to arrange a 30-minute walking period due to back-to-back meetings, suggest potential meetings that could be shortened or rescheduled.

In short:

1. **Identify Daily Gaps:** For each day of the week within the data, calculate the time differences between consecutive meetings to find potential walk slots.
2. **Consider Work Hours**: Only consider gaps within the default work hours from "09:00:00" to "18:00:00".
3. **Filter for Adequate Duration**: Identify gaps that are at least 30 minutes long, which could be used for walking.
4. **Summarize Gaps**: For days where multiple shorter gaps exist, verify if they cumulatively amount to at least 30 minutes.
5. **Provide Recommendations**: Based on this analysis, determine which days already have enough time for walking and which days need schedule adjustments.
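
The aggregation in steps 3 to 5 above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used later in this document: the function name `summarize_gaps`, the input frame `gaps` with `date`, `start`, and `end` columns, and the three verdict labels are all hypothetical names introduced here.

```python
import pandas as pd

def summarize_gaps(gaps, min_minutes=30):
    """Classify each day by how its free gaps compare with `min_minutes`.

    `gaps` is assumed to hold one row per free gap with columns
    'date', 'start', and 'end' (the last two as datetimes).
    """
    gaps = gaps.copy()
    gaps["minutes"] = (gaps["end"] - gaps["start"]).dt.total_seconds() / 60

    # Longest single gap and total free time per day
    per_day = gaps.groupby("date")["minutes"].agg(longest="max", total="sum")

    # Hypothetical labels: start from the worst case and upgrade where possible
    per_day["verdict"] = "reschedule"   # not enough free time at all
    per_day.loc[per_day["total"] >= min_minutes, "verdict"] = "rearrange"  # short gaps add up
    per_day.loc[per_day["longest"] >= min_minutes, "verdict"] = "ready"    # one gap is enough
    return per_day.reset_index()
```

Days that have no gaps at all do not appear in the result, so a caller would need to treat missing dates as fully booked.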

For the project-scheduling variant of this task, the approach analyzes the gaps between scheduled events, that is, the intervals that no meeting overlaps:
1. **Filter valid meetings**: Only consider meetings that are not canceled. This could involve filtering out meetings where the status is 'cancelled'.
2. **Identify gaps between meetings**: By iterating through the days covered in the dataset and considering the start and end times of the meetings, we can identify intervals that are free between events. For this, we only consider days and time slots from 09:00:00 to 18:00:00.
3. **Consider gaps of a significant length**: Treat a gap as significant when it lasts at least a default duration (e.g., 30 minutes) with no meetings scheduled, so that the time can be used for the new project.
4. **Return available time slots**: The output should list the free slots between existing meetings that are long enough for effective work on a new project.
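
The gap search described in the steps above can be sketched as one small helper for a single day. It is a sketch under assumptions rather than a definitive implementation: `free_slots` and its parameters are hypothetical names, while the `status`, `start`, and `end` columns follow the calendar data used throughout this document.

```python
import pandas as pd

def free_slots(meetings, day, work_start=9, work_end=18, min_minutes=30):
    """Return (start, end) gaps of at least `min_minutes` within work hours on `day`."""
    day = pd.Timestamp(day).normalize()
    day_start = day + pd.Timedelta(hours=work_start)
    day_end = day + pd.Timedelta(hours=work_end)
    min_gap = pd.Timedelta(minutes=min_minutes)

    # Keep non-cancelled meetings that overlap the working window
    active = meetings[
        (meetings["status"] != "cancelled")
        & (meetings["start"] < day_end)
        & (meetings["end"] > day_start)
    ].sort_values("start")

    slots = []
    cursor = day_start  # latest point in time already covered by a meeting
    for start, end in zip(active["start"], active["end"]):
        if start - cursor >= min_gap:
            slots.append((cursor, start))
        # max() keeps the cursor correct when meetings overlap or nest
        cursor = max(cursor, end)

    # Remaining time until the end of the working day
    if day_end - cursor >= min_gap:
        slots.append((cursor, day_end))
    return slots
```

Tracking a single cursor with `max()` avoids a separate interval-merging pass: a meeting that starts before the cursor can only extend the busy period, never create a gap.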

### Python Implementation

In [None]:
task = "analyze my calendar this week and tell me how I can optimize it to have 30 min walk time daily"

answer = """
```python
import pandas as pd
from datetime import timedelta

def query_dataframe(filtered_calendar_data):
    # Keep only 'confirmed' and 'tentative' meetings; this also drops cancelled ones
    active_meetings = filtered_calendar_data[filtered_calendar_data['status'].isin(['confirmed', 'tentative'])]
    
    # Sort meetings by start time (reassign instead of sorting a filtered slice in place)
    active_meetings = active_meetings.sort_values('start')
    
    # Initialize output
    walk_times = []

    # Iterate over each day in the week
    for single_date in pd.date_range(start=filtered_calendar_data['start'].min().date(), periods=7):
        # Filter meetings by day
        daily_meetings = active_meetings[active_meetings['start'].dt.date == single_date.date()]
        
        # Define start and end of the day
        start_of_day = pd.Timestamp(single_date).replace(hour=9, minute=0)
        end_of_day = pd.Timestamp(single_date).replace(hour=18, minute=0)
        
        # Keep track of the previous meeting end time
        previous_end_time = start_of_day

        # Iterate through the day's meetings to find gaps
        for _, row in daily_meetings.iterrows():
            # Check if there is a sufficient gap before the current meeting starts,
            # capping the gap at the end of the working day
            gap_end = min(row['start'], end_of_day)
            if gap_end - previous_end_time >= timedelta(minutes=30):
                walk_times.append({'date': single_date.date(), 'start': previous_end_time, 'end': gap_end})

            # Track the latest end time seen so overlapping or early meetings are handled
            previous_end_time = max(previous_end_time, row['end'])

        # Check for a gap at the end of the day
        if end_of_day - previous_end_time >= timedelta(minutes=30):
            walk_times.append({'date': single_date.date(), 'start': previous_end_time, 'end': end_of_day})

    # Convert walk times to a DataFrame
    answer = pd.DataFrame(walk_times, columns=['date', 'start', 'end'])
    return answer
```
"""

In [2]:
task = "analyze my calendar this week and find me time slots to help me schedule yoga class on Tuesday and Friday"
answer = """
```python
import pandas as pd
from datetime import timedelta

def query_dataframe(filtered_calendar_data):
    # Reference "today" that frames 'this week' (hard-coded so the example is reproducible)
    today_date = pd.to_datetime('2023-10-02 08:25:30')

    # Monday 00:00 of the current week and Monday 00:00 of the following week;
    # normalize() drops the time of day so early Monday meetings are not excluded
    start_of_week = (today_date - pd.DateOffset(days=today_date.weekday())).normalize()
    end_of_week = start_of_week + pd.DateOffset(days=7)

    # Filter for relevant week (the end bound is exclusive)
    weekly_meetings = filtered_calendar_data[
        (filtered_calendar_data['start'] >= start_of_week) & 
        (filtered_calendar_data['start'] < end_of_week)
    ]

    # Filter for 'confirmed' or 'tentative' meetings
    valid_meetings = weekly_meetings[weekly_meetings['status'].isin(['confirmed', 'tentative'])]

    # Keep only Tuesday and Friday (dt.dayofweek: Monday=0, ..., Sunday=6)
    valid_meetings = valid_meetings[valid_meetings['start'].dt.dayofweek.isin([1, 4])]

    # Sort meetings by start time (reassign instead of sorting a filtered slice in place)
    valid_meetings = valid_meetings.sort_values(by='start')

    # List to hold available slots
    available_slots = []

    for day in [1, 4]:  # 1 is Tuesday, 4 is Friday
        current_day = start_of_week + pd.DateOffset(days=day)
        daily_meetings = valid_meetings[valid_meetings['start'].dt.date == current_day.date()]
        
        # Start of day (9 AM)
        cur_day_start = pd.Timestamp(current_day.date()).replace(hour=9, minute=0)
        # End of day (6 PM)
        cur_day_end = pd.Timestamp(current_day.date()).replace(hour=18, minute=0)

        # Initialize the previous end time to start of day
        previous_end_time = cur_day_start
        
        for index, row in daily_meetings.iterrows():
            # If there's a gap before this meeting starts, count it as available
            if row['start'] > previous_end_time and (row['start'] - previous_end_time >= timedelta(minutes=30)):
                available_slots.append({
                    'day': current_day.date(),
                    'start': previous_end_time,
                    'end': row['start']
                })
            # Update the end time to the latest
            previous_end_time = max(previous_end_time, row['end'])

        # Check for a slot at the end of the day
        if cur_day_end > previous_end_time and (cur_day_end - previous_end_time >= timedelta(minutes=30)):
            available_slots.append({
                'day': current_day.date(),
                'start': previous_end_time,
                'end': cur_day_end
            })

    # Convert available slots to DataFrame; explicit columns keep the shape stable when no slot is found
    available_slots_df = pd.DataFrame(available_slots, columns=['day', 'start', 'end'])
    
    return available_slots_df
```
"""


In [None]:
task = "Based on my upcoming calendar events, when should I schedule time for a new project"
answer = """
```python
import pandas as pd
from datetime import timedelta

def query_dataframe(filtered_calendar_data):
    # Get today's date, in case it influences the analysis.
    today_date = pd.to_datetime('2023-10-02 08:25:30', format='%Y-%m-%d %H:%M:%S')
    
    # Filter non-cancelled meetings
    valid_meetings = filtered_calendar_data[filtered_calendar_data['status'] != 'cancelled']
    valid_meetings = valid_meetings.sort_values(by='start')
    
    # List to store free time slots
    free_time_slots = []
    
    # Work Hours
    start_work_hour = 9
    end_work_hour = 18
    
    # Process each day in the dataset
    for day in pd.date_range(start=valid_meetings['start'].min().floor('D'), end=valid_meetings['end'].max().ceil('D')):
        day_meetings = valid_meetings[(valid_meetings['start'].dt.date <= day.date()) & (valid_meetings['end'].dt.date >= day.date())]
        
        # Times for the start and end of the working day
        start_of_day = pd.Timestamp(day).replace(hour=start_work_hour, minute=0)
        end_of_day = pd.Timestamp(day).replace(hour=end_work_hour, minute=0)
        
        # No meetings at all -> the entire day is free
        if day_meetings.empty:
            if (end_of_day - start_of_day) >= timedelta(minutes=30):
                free_time_slots.append([day.date(), start_of_day, end_of_day])
            continue
        
        # Sort meetings by starting time
        day_meetings = day_meetings.sort_values(by='start')
        
        # Check for free time before the first meeting
        first_meeting_start = day_meetings.iloc[0]['start']
        if (first_meeting_start - start_of_day) >= timedelta(minutes=30):
            free_time_slots.append([day.date(), start_of_day, first_meeting_start])
            
        # Check gaps between meetings
        previous_end = day_meetings.iloc[0]['end']
        for i in range(1, len(day_meetings)):
            current_start = day_meetings.iloc[i]['start']
            if (current_start - previous_end) >= timedelta(minutes=30):
                free_time_slots.append([day.date(), previous_end, current_start])
            previous_end = max(previous_end, day_meetings.iloc[i]['end'])
        
        # Check for free time after the latest-ending meeting; previous_end holds the
        # latest end time seen, which may not belong to the meeting that starts last
        if (end_of_day - previous_end) >= timedelta(minutes=30):
            free_time_slots.append([day.date(), previous_end, end_of_day])
    
    # Convert list to DataFrame
    time_slots_df = pd.DataFrame(free_time_slots, columns=['Day', 'Start Time', 'End Time'])
    
    return time_slots_df
```
"""


In [None]:
task = "Based on my upcoming calendar events, when should I schedule time for a new project next week except Monday and Tuesday?"
answer = """
    ```python
    import pandas as pd
    from datetime import timedelta

    def query_dataframe(filtered_calendar_data):
        # Reference "today" that frames 'next week' (hard-coded so the example is reproducible).
        today_date = pd.to_datetime('2023-10-02 08:25:30', format='%Y-%m-%d %H:%M:%S')

        # Monday 00:00 of next week.
        next_monday = (today_date - pd.Timedelta(days=today_date.weekday())).normalize() + pd.Timedelta(days=7)

        # Filter valid meetings (non-cancelled).
        valid_meetings = filtered_calendar_data[filtered_calendar_data['status'] != 'cancelled']
        valid_meetings = valid_meetings.sort_values(by='start')

        # Initialize a dictionary to store free time slots.
        project_time_slots = dict()

        # Walk through Wednesday to Friday of next week: Monday and Tuesday are excluded by the task,
        # and skipping the weekend is an assumption that follows from using work hours.
        for day in pd.date_range(start=next_monday + pd.Timedelta(days=2), periods=3).date:
            daily_meetings = valid_meetings[valid_meetings['start'].dt.date == day]
            availability_periods = []

            # Assume start of the day at "09:00:00" and end of day at "18:00:00".
            start_of_day = pd.Timestamp(day).replace(hour=9, minute=0)
            end_of_day = pd.Timestamp(day).replace(hour=18, minute=0)

            # Start with the first possible slot in the day.
            previous_end_time = start_of_day

            # Check each meeting to find gaps; a day without meetings skips this loop entirely.
            for index, row in daily_meetings.iterrows():
                # Cap the gap at the end of the working day.
                gap_end = min(row['start'], end_of_day)
                if gap_end - previous_end_time >= timedelta(minutes=30):
                    availability_periods.append((previous_end_time, gap_end))

                # Update the end time to the latest end time encountered.
                previous_end_time = max(previous_end_time, row['end'])

            # Consider time after the last meeting till the end of day (the whole day if there were no meetings).
            if end_of_day - previous_end_time >= timedelta(minutes=30):
                availability_periods.append((previous_end_time, end_of_day))
            
            # Store available periods per day if any.
            if availability_periods:
                project_time_slots[day] = availability_periods

        # Convert the dictionary to a DataFrame for easy manipulation and viewing.
        project_time_slots = pd.DataFrame([(k, v) for k, vals in project_time_slots.items() for v in vals], columns=['Day', 'Available Time Slot'])

        return project_time_slots

    ```
"""


## Recurring

In [None]:
task = "Is there any time slot where I consistently have meetings on Monday and Tuesday?"
answer = """
    ```python
    import pandas as pd

    def query_dataframe(filtered_calendar_data):
        # Step 1: Filter out cancelled meetings (copy so a new column can be added safely)
        active_meetings = filtered_calendar_data[filtered_calendar_data['status'] != 'cancelled'].copy()
        
        # Step 2: Extract day of the week
        active_meetings['day_of_week'] = active_meetings['start'].dt.dayofweek
        
        # Step 3: Create 30-minute windows during business hours; the last window starts at 17:30
        business_hours = pd.date_range(start="09:00", end="17:30", freq="30min")
        time_slots = [(start.time(), (start + pd.Timedelta(minutes=30)).time()) for start in business_hours]
        
        # Step 4: Filter for Mondays and Tuesdays
        monday_tuesday_meetings = active_meetings[active_meetings['day_of_week'].isin([0, 1])]
        
        # Step 5: Group by day of the week
        grouped_by_day = monday_tuesday_meetings.groupby('day_of_week')
        
        # Step 6: Analyzing consistency across Monday and Tuesday
        consistent_slots = []
        for slot in time_slots:
            slot_coverage = 0
            for name, day_group in grouped_by_day:
                if any((day_group['start'].dt.time <= slot[0]) & (day_group['end'].dt.time >= slot[1])):
                    slot_coverage += 1
            
            if slot_coverage == 2:  # both Monday and Tuesday have at least one meeting covering the slot
                consistent_slots.append(slot)
        
        # Step 7: Results Generation
        if consistent_slots:
            return pd.DataFrame(consistent_slots, columns=["Start Time", "End Time"])
        else:
            return pd.DataFrame(columns=["Start Time", "End Time"])
    ```
"""

In [None]:
task = "Is there any time slot where I consistently have meetings every weekday?"
answer = """
```python
    import pandas as pd

    def query_dataframe(filtered_calendar_data):
        # Step 1: Filter out cancelled meetings (copy so a new column can be added safely)
        active_meetings = filtered_calendar_data[filtered_calendar_data['status'] != 'cancelled'].copy()
        
        # Step 2: Extract day of the week
        active_meetings['day_of_week'] = active_meetings['start'].dt.dayofweek
        
        # Step 3: Create 30-minute windows during business hours; the last window starts at 17:30
        business_hours = pd.date_range(start="09:00:00", end="17:30:00", freq="30min")
        time_slots = [(start.time(), (start + pd.Timedelta(minutes=30)).time()) for start in business_hours]
        
        # Step 4: Filter for weekdays (Monday(0) to Friday(4))
        weekday_meetings = active_meetings[active_meetings['day_of_week'].isin(range(5))]
        
        # Step 5: Group by day of the week
        grouped_by_day = weekday_meetings.groupby("day_of_week")
        
        # Step 6: Analyzing consistency across weekdays
        consistent_slots = []
        for slot in time_slots:
            slot_coverage = 0
            for _, day_group in grouped_by_day:
                if any((day_group['start'].dt.time <= slot[0]) & (day_group['end'].dt.time >= slot[1])):
                    slot_coverage += 1
            
            if slot_coverage == 5:  # each weekday has at least one meeting covering the slot
                consistent_slots.append(slot)
        
        # Step 7: Results Generation
        if consistent_slots:
            return pd.DataFrame(consistent_slots, columns=["Start Time", "End Time"])
        else:
            return pd.DataFrame(columns=["Start Time", "End Time"])
```
"""

In [None]:
task = ""
answer = """
```python
    import pandas as pd

    def query_dataframe(filtered_calendar_data):
        # Step 1: Filter out cancelled meetings (copy so a new column can be added safely)
        active_meetings = filtered_calendar_data[filtered_calendar_data['status'] != 'cancelled'].copy()
        
        # Step 2: Extract day of the week
        active_meetings['day_of_week'] = active_meetings['start'].dt.dayofweek
        
        # Step 3: Create 30-minute windows during business hours; the last window starts at 17:30
        business_hours = pd.date_range(start="09:00:00", end="17:30:00", freq="30min")
        time_slots = [(start.time(), (start + pd.Timedelta(minutes=30)).time()) for start in business_hours]
        
        # Step 4: Filter for weekdays (Monday(0) to Friday(4))
        weekday_meetings = active_meetings[active_meetings['day_of_week'].isin(range(5))]
        
        # Step 5: Group by day of the week
        grouped_by_day = weekday_meetings.groupby("day_of_week")
        
        # Step 6: Analyzing consistency across weekdays
        consistent_slots = []
        for slot in time_slots:
            slot_coverage = 0
            for _, day_group in grouped_by_day:
                if any((day_group['start'].dt.time <= slot[0]) & (day_group['end'].dt.time >= slot[1])):
                    slot_coverage += 1
            
            if slot_coverage == 0:  # no weekday has a meeting that covers the whole slot
                consistent_slots.append(slot)
        
        # Step 7: Results Generation
        if consistent_slots:
            return pd.DataFrame(consistent_slots, columns=["Start Time", "End Time"])
        else:
            return pd.DataFrame(columns=["Start Time", "End Time"])
```
"""   

In [None]:
task = "At what hour of the day do I typically have the most meetings?"
answer = """
```python
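    import pandas as pd

    def query_dataframe(filtered_calendar_data):
        # Step 1: Filter out cancelled meetings (copy so a new column can be added safely)
        active_meetings = filtered_calendar_data[filtered_calendar_data['status'] != 'cancelled'].copy()
        
        # Step 2: Extract hour of the day from start times
        active_meetings['start_hour'] = active_meetings['start'].dt.hour
        
        # Step 3: Count meetings per hour
        meetings_per_hour = active_meetings['start_hour'].value_counts().sort_index()

        # With no active meetings, idxmax() would raise, so return an empty result instead
        if meetings_per_hour.empty:
            return pd.DataFrame(columns=['Hour of Day', 'Number of Meetings'])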
        
        # Step 4: Identify the hour with the maximum number of meetings
        max_meetings_hour = meetings_per_hour.idxmax()
        
        # Retrieve count for the hour with the most meetings
        max_count = meetings_per_hour[max_meetings_hour]
        
        # Step 5: Return a DataFrame with the result
        result = pd.DataFrame({
            'Hour of Day': [max_meetings_hour],
            'Number of Meetings': [max_count]
        })
        
        return result
```
"""
