In [1]:
import matplotlib
import numpy as np
import pandas as pd
import seaborn
%matplotlib inline

# Day 4: Repose Record
You've sneaked into another supply closet - this time, it's across from the prototype suit manufacturing lab. You need to sneak inside and fix the issues with the suit, but there's a guard stationed outside the lab, so this is as close as you can safely get.

As you search the closet for anything that might help, you discover that you're not the first person to want to sneak in. Covering the walls, someone has spent an hour starting every midnight for the past few months secretly observing this guard post! They've been writing down the ID of the one guard on duty that night - the Elves seem to have decided that one guard was enough for the overnight shift - as well as when they fall asleep or wake up while at their post (your puzzle input).

For example, consider the following records, which have already been organized into chronological order:

```
[1518-11-01 00:00] Guard #10 begins shift
[1518-11-01 00:05] falls asleep
[1518-11-01 00:25] wakes up
[1518-11-01 00:30] falls asleep
[1518-11-01 00:55] wakes up
[1518-11-01 23:58] Guard #99 begins shift
[1518-11-02 00:40] falls asleep
[1518-11-02 00:50] wakes up
[1518-11-03 00:05] Guard #10 begins shift
[1518-11-03 00:24] falls asleep
[1518-11-03 00:29] wakes up
[1518-11-04 00:02] Guard #99 begins shift
[1518-11-04 00:36] falls asleep
[1518-11-04 00:46] wakes up
[1518-11-05 00:03] Guard #99 begins shift
[1518-11-05 00:45] falls asleep
[1518-11-05 00:55] wakes up
```

Timestamps are written using year-month-day hour:minute format. The guard falling asleep or waking up is always the one whose shift most recently started. Because all asleep/awake times are during the midnight hour (00:00 - 00:59), only the minute portion (00 - 59) is relevant for those events.
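
As a minimal sketch of the record format described above, each line can be split into a timestamp, an optional guard ID, and the event text. The helper name `parse_record` and the two regular expressions are illustrative assumptions, not part of the puzzle or of the solution cells further down.

```python
import re
from datetime import datetime

# Hypothetical patterns for illustration: "[YYYY-MM-DD HH:MM] event text"
RECORD = re.compile(r"\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2})\] (.*)")
SHIFT = re.compile(r"Guard #(\d+) begins shift")


def parse_record(line):
    """Return (timestamp, guard_id_or_None, event_text) for one record."""
    stamp, event = RECORD.match(line).groups()
    when = datetime.strptime(stamp, "%Y-%m-%d %H:%M")
    shift = SHIFT.match(event)
    # Only "begins shift" records carry a guard ID.
    guard = int(shift.group(1)) if shift else None
    return when, guard, event


print(parse_record("[1518-11-01 00:00] Guard #10 begins shift"))
print(parse_record("[1518-11-01 00:05] falls asleep"))
```

Records without a guard ID return `None` in the second position, so the caller has to remember the most recent shift start to know who fell asleep.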

Visually, these records show that the guards are asleep at these times:

```
Date   ID   Minute
            000000000011111111112222222222333333333344444444445555555555
            012345678901234567890123456789012345678901234567890123456789
11-01  #10  .....####################.....#########################.....
11-02  #99  ........................................##########..........
11-03  #10  ........................#####...............................
11-04  #99  ....................................##########..............
11-05  #99  .............................................##########.....
```

The columns are Date, which shows the month-day portion of the relevant day; ID, which shows the guard on duty that day; and Minute, which shows the minutes during which the guard was asleep within the midnight hour. (The Minute column's header shows the minute's tens digit in the first row and the ones digit in the second row.) Awake is shown as `.`, and asleep is shown as `#`.

Note that guards count as asleep on the minute they fall asleep, and they count as awake on the minute they wake up. For example, because Guard #10 wakes up at 00:25 on 1518-11-01, minute 25 is marked as awake.
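
The rule above describes a half-open interval: the minute of falling asleep is included and the minute of waking up is excluded. Python's `range` has the same convention, which the small sketch below illustrates with Guard #10's first nap from the example (the variable names are illustrative).

```python
# Guard #10 on 1518-11-01: falls asleep at 00:05, wakes up at 00:25.
fell_asleep, woke_up = 5, 25

# range() excludes its end point, so minute 25 counts as awake.
asleep_minutes = list(range(fell_asleep, woke_up))

print(len(asleep_minutes))   # 20
print(asleep_minutes[-1])    # 24
print(25 in asleep_minutes)  # False
```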

If you can figure out the guard most likely to be asleep at a specific time, you might be able to trick that guard into working tonight so you can have the best chance of sneaking in. You have two strategies for choosing the best guard/minute combination.

Strategy 1: Find the guard that has the most minutes asleep. What minute does that guard spend asleep the most?

In the example above, Guard #10 spent the most minutes asleep, a total of 50 minutes (20+25+5), while Guard #99 only slept for a total of 30 minutes (10+10+10). Guard #10 was asleep most during minute 24 (on two days, whereas any other minute the guard was asleep was only seen on one day).

While this example listed the entries in chronological order, your entries are in the order you found them. You'll need to organize them before they can be analyzed.
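
One way to organize the entries, sketched here under the assumption that every timestamp is zero-padded and has the same width as in the example, is a plain string sort: in that format, lexicographic order and chronological order coincide.

```python
# A few example records in scrambled order.
shuffled = [
    "[1518-11-01 00:25] wakes up",
    "[1518-11-01 00:00] Guard #10 begins shift",
    "[1518-11-01 23:58] Guard #99 begins shift",
    "[1518-11-01 00:05] falls asleep",
]

# Fixed-width year-month-day hour:minute stamps sort correctly as text.
ordered = sorted(shuffled)

for record in ordered:
    print(record)
```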

What is the ID of the guard you chose multiplied by the minute you chose? (In the above example, the answer would be 10 * 24 = 240.)
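
Before running on the real input, Strategy 1 can be checked against the example. The following is a minimal standard-library sketch, separate from the pandas solution in the cells below; the function name `minutes_asleep` and the `EXAMPLE` constant are hypothetical names introduced for illustration.

```python
import re
from collections import Counter, defaultdict

EXAMPLE = """\
[1518-11-01 00:00] Guard #10 begins shift
[1518-11-01 00:05] falls asleep
[1518-11-01 00:25] wakes up
[1518-11-01 00:30] falls asleep
[1518-11-01 00:55] wakes up
[1518-11-01 23:58] Guard #99 begins shift
[1518-11-02 00:40] falls asleep
[1518-11-02 00:50] wakes up
[1518-11-03 00:05] Guard #10 begins shift
[1518-11-03 00:24] falls asleep
[1518-11-03 00:29] wakes up
[1518-11-04 00:02] Guard #99 begins shift
[1518-11-04 00:36] falls asleep
[1518-11-04 00:46] wakes up
[1518-11-05 00:03] Guard #99 begins shift
[1518-11-05 00:45] falls asleep
[1518-11-05 00:55] wakes up
"""


def minutes_asleep(records):
    """Map each guard ID to a Counter of minute -> number of nights asleep."""
    per_guard = defaultdict(Counter)
    guard = start = None
    for line in sorted(records):  # fixed-width stamps sort chronologically
        minute, event = re.match(r"\[[\d-]+ \d\d:(\d\d)\] (.*)", line).groups()
        shift = re.match(r"Guard #(\d+)", event)
        if shift:
            guard = int(shift.group(1))
        elif event == "falls asleep":
            start = int(minute)
        elif event == "wakes up":
            # Half-open interval: the wake-up minute counts as awake.
            per_guard[guard].update(range(start, int(minute)))
    return per_guard


per_guard = minutes_asleep(EXAMPLE.splitlines())

# Strategy 1: guard with the most total minutes, then that guard's top minute.
sleepiest = max(per_guard, key=lambda g: sum(per_guard[g].values()))
minute, nights = per_guard[sleepiest].most_common(1)[0]
print(sleepiest, minute, sleepiest * minute)  # 10 24 240
```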

In [2]:
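# Read the puzzle input, skipping blank lines such as a trailing newline.
with open("4.txt", "r") as f:
    lines = [l for l in f.read().splitlines() if l]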

In [3]:
import re

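# Split each record into (timestamp, event).
lines = [re.match(r"\[(.*)\] (.*)", l).groups() for l in lines]
lines = pd.DataFrame(lines).set_index(0)
# pandas timestamps cannot represent the year 1518, so shift the year to 2018.
lines.index = pd.DatetimeIndex(lines.index.map(lambda s: '20' + s[2:]))
# Sort chronologically and keep only the event text as a Series.
lines = lines.sort_index().iloc[:, 0]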

In [4]:
lines.tail()

0
2018-11-22 00:28:00                falls asleep
2018-11-22 00:43:00                    wakes up
2018-11-23 00:00:00    Guard #1123 begins shift
2018-11-23 00:12:00                falls asleep
2018-11-23 00:32:00                    wakes up
Name: 1, dtype: object

In [5]:
guard = 0
asleep = 0

sleeps = []

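# Series.iteritems() was removed in pandas 2.0; items() is the replacement.
for d, s in lines.items():
    re_shift = re.match(r"Guard #(\d+)", s)
    if re_shift:
        # A new shift: later sleep events belong to this guard.
        guard = int(re_shift.group(1))
    elif s == 'falls asleep':
        asleep = d
    elif s == 'wakes up':
        sleeps.append({'guard': guard, 'start': asleep, 'end': d})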
        
sleeps = pd.DataFrame(sleeps)
sleeps = sleeps.loc[:, ['guard', 'start', 'end']]
sleeps['duration'] = sleeps.end.sub(sleeps.start).map(lambda td: td.total_seconds() / 60)

In [6]:
# Sum only the duration column; datetime columns cannot be summed.
top = sleeps.groupby('guard')[['duration']].sum().sort_values('duration', ascending=False)
top.head()

       duration
guard
523       511.0
3463      461.0
1399      444.0
463       426.0
829       372.0


In [7]:
minutes = []

for _, sleep in sleeps[sleeps.guard == top.index[0]].iterrows():
    for i in range(sleep.start.minute, sleep.end.minute):
        minutes.append(i)

In [8]:
from collections import Counter

best_minutes = pd.Series(Counter(minutes)).sort_values(ascending=False)
best_minutes.head()

38    14
31    13
36    13
37    13
35    12
dtype: int64

In [9]:
top.index[0] * best_minutes.index[0]

19874

# Part Two

Strategy 2: Of all guards, which guard is most frequently asleep on the same minute?

In the example above, Guard #99 spent minute 45 asleep more than any other guard or minute - three times in total. (In all other cases, any guard spent any minute asleep at most twice.)

What is the ID of the guard you chose multiplied by the minute you chose? (In the above example, the answer would be 99 * 45 = 4455.)
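
Strategy 2 can also be checked on the example with a short standard-library sketch. Here the sleep intervals are transcribed by hand from the example records as `(guard, fell asleep, woke up)` tuples; the name `NAPS` is a hypothetical label used only for this illustration.

```python
from collections import Counter

# (guard, fell asleep, woke up), transcribed from the example records.
NAPS = [
    (10, 5, 25), (10, 30, 55),  # 11-01
    (99, 40, 50),               # 11-02
    (10, 24, 29),               # 11-03
    (99, 36, 46),               # 11-04
    (99, 45, 55),               # 11-05
]

# Count how many nights each (guard, minute) pair was spent asleep.
counts = Counter(
    (guard, minute)
    for guard, start, end in NAPS
    for minute in range(start, end)  # wake-up minute excluded
)

(guard, minute), nights = counts.most_common(1)[0]
print(guard, minute, nights, guard * minute)  # 99 45 3 4455
```

Counting `(guard, minute)` pairs directly avoids choosing a guard first, which is the only difference from Strategy 1.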

In [10]:
minutes = []

for _, sleep in sleeps.iterrows():
    for i in range(sleep.start.minute, sleep.end.minute):
        minutes.append(dict(guard=sleep.guard, minute=i))

minutes = pd.DataFrame(minutes)
minutes['count'] = 1

In [11]:
best = minutes.groupby(['guard', 'minute']).sum().sort_values('count', ascending=False)
best.head()

              count
guard minute
463   49         20
463   50         18
463   48         18
463   47         17
3463  23         14


In [12]:
best.index[0][0] * best.index[0][1]

22687