--- Day 6: Tuning Trouble --- <br>
The preparations are finally complete; you and the Elves leave camp on foot and begin to make your way toward the star fruit grove.

As you move through the dense undergrowth, one of the Elves gives you a handheld device. He says that it has many fancy features, but the most important one to set up right now is the communication system.

However, because he's heard you have significant experience dealing with signal-based systems, he convinced the other Elves that it would be okay to give you their one malfunctioning device - surely you'll have no problem fixing it.

As if inspired by comedic timing, the device emits a few colorful sparks.

To be able to communicate with the Elves, the device needs to lock on to their signal. The signal is a series of seemingly-random characters that the device receives one at a time.

To fix the communication system, you need to add a subroutine to the device that detects a start-of-packet marker in the datastream. In the protocol being used by the Elves, the start of a packet is indicated by a sequence of four characters that are all different.

The device will send your subroutine a datastream buffer (your puzzle input); your subroutine needs to identify the first position where the four most recently received characters were all different. Specifically, it needs to report the number of characters from the beginning of the buffer to the end of the first such four-character marker.

For example, suppose you receive the following datastream buffer:

mjqjpqmgbljsphdztnvjfqwrcgsmlb

After the first three characters (mjq) have been received, there haven't been enough characters received yet to find the marker. The first time a marker could occur is after the fourth character is received, making the most recent four characters mjqj. Because j is repeated, this isn't a marker.

The first time a marker appears is after the seventh character arrives. Once it does, the last four characters received are jpqm, which are all different. In this case, your subroutine should report the value 7, because the first start-of-packet marker is complete after 7 characters have been processed.
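
The check described above can be sketched as a small function. This is only an illustration of the idea, not the code used later in this notebook; the name `find_marker` and its `width` parameter are hypothetical.

```python
def find_marker(buffer, width=4):
    """Return how many characters have been processed when the first run of
    `width` distinct characters completes, or None if there is no such run."""
    for end in range(width, len(buffer) + 1):
        window = buffer[end - width:end]
        # a set drops duplicates, so its size equals `width` only when
        # every character in the window is different
        if len(set(window)) == width:
            return end
    return None


print(find_marker("mjqjpqmgbljsphdztnvjfqwrcgsmlb"))  # 7
```

Starting the loop at `width` means a window is only inspected once enough characters have arrived, and stopping at `len(buffer)` keeps every window at full length.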

Here are a few more examples:

bvwbjplbgvbhsrlpgdmjqwftvncz: first marker after character 5
nppdvjthqldpwncqszvftbrmjlhg: first marker after character 6
nznrnfrfntjfmvfwmzdfjlvtqnbhcprsg: first marker after character 10
zcfzfwzzqfrljwzlrfnpqdbhtmscgvjw: first marker after character 11

How many characters need to be processed before the first start-of-packet marker is detected?

In [31]:
import util

# signal data
signal = util.get_text_as_list('day6.txt', output = 'string')

# number of distinct letters that make up a start-of-packet marker
marker_length = 4

# slide a four-letter window along the signal
# letters_processed is the position where the window ends
for letters_processed in range(marker_length, len(signal) + 1):
    
    marker_window = signal[letters_processed - marker_length:letters_processed]
    
    # if the window holds no duplicates it is the marker
    if len(set(marker_window)) == marker_length:
        
        print(letters_processed)
        
        break

1987


--- Part Two --- <br>
Your device's communication system is correctly detecting packets, but still isn't working. It looks like it also needs to look for messages.

A start-of-message marker is just like a start-of-packet marker, except it consists of 14 distinct characters rather than 4.

Here are the first positions of start-of-message markers for all of the above examples:

mjqjpqmgbljsphdztnvjfqwrcgsmlb: first marker after character 19 <br>
bvwbjplbgvbhsrlpgdmjqwftvncz: first marker after character 23 <br>
nppdvjthqldpwncqszvftbrmjlhg: first marker after character 23 <br>
nznrnfrfntjfmvfwmzdfjlvtqnbhcprsg: first marker after character 29 <br>
zcfzfwzzqfrljwzlrfnpqdbhtmscgvjw: first marker after character 26 <br>
    
How many characters need to be processed before the first start-of-message marker is detected?
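
With a 14-character marker, rebuilding a set for every window still works, but the same answer can be found in a single pass by remembering where each character was last seen. The sketch below is an alternative illustration, not the approach taken in this notebook; `find_marker_linear` and its parameters are hypothetical names.

```python
def find_marker_linear(buffer, width=14):
    """Return how many characters have been processed when the first run of
    `width` distinct characters completes, or None if there is no such run."""
    last_seen = {}   # character -> index of its most recent occurrence
    start = 0        # first index of the current duplicate-free run
    for i, char in enumerate(buffer):
        # a repeat inside the current run forces the run to restart
        # just after the earlier occurrence
        if char in last_seen and last_seen[char] >= start:
            start = last_seen[char] + 1
        last_seen[char] = i
        if i - start + 1 >= width:
            return i + 1
    return None


print(find_marker_linear("mjqjpqmgbljsphdztnvjfqwrcgsmlb"))      # 19
print(find_marker_linear("mjqjpqmgbljsphdztnvjfqwrcgsmlb", 4))   # 7
```

Each character is visited once, so the cost no longer depends on the marker width.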

In [34]:
import util

# signal data
signal = util.get_text_as_list('day6.txt', output = 'string')

# number of distinct letters that make up a start-of-message marker
marker_length = 14

# slide a fourteen-letter window along the signal
# letters_processed is the position where the window ends
for letters_processed in range(marker_length, len(signal) + 1):
    
    marker_window = signal[letters_processed - marker_length:letters_processed]
    
    # if the window holds no duplicates it is the marker
    if len(set(marker_window)) == marker_length:
        
        print(letters_processed)
        
        break

3059


--- Day 7: No Space Left On Device --- <br>
You can hear birds chirping and raindrops hitting leaves as the expedition proceeds. Occasionally, you can even hear much louder sounds in the distance; how big do the animals get out here, anyway?

The device the Elves gave you has problems with more than just its communication system. You try to run a system update:

$ system-update --please --pretty-please-with-sugar-on-top
Error: No space left on device

Perhaps you can delete some files to make space for the update?

You browse around the filesystem to assess the situation and save the resulting terminal output (your puzzle input). For example:

$ cd /
$ ls
dir a
14848514 b.txt
8504156 c.dat
dir d
$ cd a
$ ls
dir e
29116 f
2557 g
62596 h.lst
$ cd e
$ ls
584 i
$ cd ..
$ cd ..
$ cd d
$ ls
4060174 j
8033020 d.log
5626152 d.ext
7214296 k

The filesystem consists of a tree of files (plain data) and directories (which can contain other directories or files). The outermost directory is called /. You can navigate around the filesystem, moving into or out of directories and listing the contents of the directory you're currently in.

Within the terminal output, lines that begin with $ are commands you executed, very much like some modern computers:

- cd means change directory. This changes which directory is the current directory, but the specific result depends on the argument:
  - cd x moves in one level: it looks in the current directory for the directory named x and makes it the current directory.
  - cd .. moves out one level: it finds the directory that contains the current directory, then makes that directory the current directory.
  - cd / switches the current directory to the outermost directory, /.
- ls means list. It prints out all of the files and directories immediately contained by the current directory:
  - 123 abc means that the current directory contains a file named abc with size 123.
  - dir xyz means that the current directory contains a directory named xyz.

Given the commands and output in the example above, you can determine that the filesystem looks visually like this:

- / (dir)
  - a (dir)
    - e (dir)
      - i (file, size=584)
    - f (file, size=29116)
    - g (file, size=2557)
    - h.lst (file, size=62596)
  - b.txt (file, size=14848514)
  - c.dat (file, size=8504156)
  - d (dir)
    - j (file, size=4060174)
    - d.log (file, size=8033020)
    - d.ext (file, size=5626152)
    - k (file, size=7214296)
Here, there are four directories: / (the outermost directory), a and d (which are in /), and e (which is in a). These directories also contain files of various sizes.
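
To make the link between the terminal output and the tree concrete, here is a minimal sketch that replays the example commands into a nested dictionary and prints it in the same layout. It is an illustration only and not the approach taken in this notebook; `build_tree`, `render` and `EXAMPLE` are hypothetical names.

```python
EXAMPLE = """\
$ cd /
$ ls
dir a
14848514 b.txt
8504156 c.dat
dir d
$ cd a
$ ls
dir e
29116 f
2557 g
62596 h.lst
$ cd e
$ ls
584 i
$ cd ..
$ cd ..
$ cd d
$ ls
4060174 j
8033020 d.log
5626152 d.ext
7214296 k
"""


def build_tree(lines):
    """Build a nested dict where a directory maps to a dict and a file to its size."""
    root = {}
    stack = [root]  # stack[-1] is always the current directory
    for line in lines:
        parts = line.split()
        if not parts:
            continue
        if parts[:2] == ["$", "cd"]:
            if parts[2] == "/":
                stack = [root]
            elif parts[2] == "..":
                if len(stack) > 1:
                    stack.pop()
            else:
                stack.append(stack[-1].setdefault(parts[2], {}))
        elif parts[0] == "$":
            continue  # "$ ls" itself carries no information
        elif parts[0] == "dir":
            stack[-1].setdefault(parts[1], {})
        else:
            stack[-1][parts[1]] = int(parts[0])
    return root


def render(name, node, depth=0):
    """Return the tree as a list of indented text lines."""
    indent = "  " * depth
    if isinstance(node, dict):
        lines = [f"{indent}- {name} (dir)"]
        for child, sub in node.items():
            lines.extend(render(child, sub, depth + 1))
        return lines
    return [f"{indent}- {name} (file, size={node})"]


print("\n".join(render("/", build_tree(EXAMPLE.splitlines()))))
```

Because Python dictionaries keep insertion order, the entries print in the order `ls` listed them, which reproduces the tree shown above.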

Since the disk is full, your first step should probably be to find directories that are good candidates for deletion. To do this, you need to determine the total size of each directory. The total size of a directory is the sum of the sizes of the files it contains, directly or indirectly. (Directories themselves do not count as having any intrinsic size.)

The total sizes of the directories above can be found as follows:

The total size of directory e is 584 because it contains a single file i of size 584 and no other directories.
The directory a has total size 94853 because it contains files f (size 29116), g (size 2557), and h.lst (size 62596), plus file i indirectly (a contains e which contains i).
Directory d has total size 24933642.
As the outermost directory, / contains every file. Its total size is 48381165, the sum of the size of every file.

To begin, find all of the directories with a total size of at most 100000, then calculate the sum of their total sizes. In the example above, these directories are a and e; the sum of their total sizes is 95437 (94853 + 584). (As in this example, this process can count files more than once!)

Find all of the directories with a total size of at most 100000. What is the sum of the total sizes of those directories?
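
One way to check the example figures is to keep the current path as a list and add every file's size to each directory on that path. The sketch below is an alternative illustration rather than the method used in the cells that follow; `directory_sizes` and `EXAMPLE` are hypothetical names, and it assumes each directory is listed with ls only once. Keying the totals by full path keeps directories that share a name under different parents separate.

```python
from collections import defaultdict

EXAMPLE = """\
$ cd /
$ ls
dir a
14848514 b.txt
8504156 c.dat
dir d
$ cd a
$ ls
dir e
29116 f
2557 g
62596 h.lst
$ cd e
$ ls
584 i
$ cd ..
$ cd ..
$ cd d
$ ls
4060174 j
8033020 d.log
5626152 d.ext
7214296 k
"""


def directory_sizes(lines):
    """Map each directory's full path (a tuple of names) to its total size."""
    sizes = defaultdict(int)
    path = []  # names from "/" down to the current directory
    for line in lines:
        parts = line.split()
        if parts[:2] == ["$", "cd"]:
            if parts[2] == "/":
                path = ["/"]
            elif parts[2] == "..":
                path.pop()
            else:
                path.append(parts[2])
        elif parts and parts[0].isdigit():
            # a file counts toward its own directory and every ancestor
            for depth in range(1, len(path) + 1):
                sizes[tuple(path[:depth])] += int(parts[0])
    return dict(sizes)


sizes = directory_sizes(EXAMPLE.splitlines())
print(sum(size for size in sizes.values() if size <= 100_000))  # 95437
```

A directory that holds no files at any depth never appears in `sizes`, which does not affect the sum because its total would be 0.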



In [36]:
import util
import re

# terminal output, one line per list item
commands = util.get_text_as_list('day7.txt', output = 'list')

In [37]:
def print_dict(dir_dict):
    '''print input directory in readable form'''
    for key, value in dir_dict.items():

        print("Parent: ", key)
        print("Contents: ", value)
        print("~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~")
        print()

In [38]:
def get_folder_contents(commands):
    '''takes in a list of terminal output lines
       returns a dictionary with keys representing each parent directory
       and values representing a list of child directory names and file sizes 
       directly held in the parent directory
       note: keys are bare directory names, not full paths, so entering a
       directory name a second time resets its entry
    '''
    # will hold dictionary containing each directory as a key 
    # and the contents of that directory as values in that key
    folders = {}

    for command in commands:

        # split command into a list by spaces
        command_list = command.split(" ")

        # if command is a cd command (other than cd ..) set the current directory to that directory
        if (command_list[:2] == ['$', 'cd']) and (command_list[-1] != '..'):

            cd = command_list[-1]

            folders[cd] = []

        # add ls of current directory contents to the value list for the current directory

        # for directories add the name of the directory
        elif command_list[0] == 'dir':

            folders[cd].append(command_list[-1])   

        # for files add the size of the file as an integer
        elif re.search(r"^[0-9]", command_list[0]):

            folders[cd].append(int(command_list[0])) 
        
    return folders

In [39]:
def sum_folders(folders):
    '''takes in a dictionary and returns a copy in which each value list that contains only
       numbers is replaced with the sum of those numbers (stored as a string)'''
    
    summed = {}
    
    # for each key value pair in folders
    for key, value in folders.items():

        # attempt to sum the items in value and add that key value pair to summed
        try: 

            file_size = str(sum(value))

            summed[key] = file_size

        # if the list still holds directory names it can't be summed, so keep the original key value pair
        except TypeError:

            summed[key] = value
            
    return summed
  

def move_dirs(complete, incomplete):
    ''' 
    takes in two dictionaries, complete and incomplete
    iterates through incomplete 
    moves key value pairs from incomplete to complete
    if the value in incomplete is not a list
    returns complete and a new dictionary holding the pairs that were not moved
    '''
    # holds unmoved values from incomplete 
    new_incomplete = {}
    
    # iterate through incomplete
    for key, value in incomplete.items():
        
        # if value is not a list add to complete
        if not isinstance(value, list):
            
            complete[key] = value
            
        # if value is a list add to new incomplete
        else:
            
            new_incomplete[key] = value
    
    return complete, new_incomplete


def replace(complete, incomplete):
    '''takes in two dictionaries: complete and incomplete
       iterates through each key value pair in incomplete and complete
       for each value list in incomplete, if a dir name in that list matches a key in complete
       that name is replaced by the corresponding directory size in complete
       returns dictionary: incomplete'''
    
    # compare each item in incomplete to each item in complete
    for un_key, un_value in incomplete.items():
        for com_key, com_value in complete.items():

            # if the name of a completed directory is in the list of values in incomplete 
            if com_key in un_value:

                # create a new list replacing the name of the directory with its size as an integer
                # only exact matches are replaced, so a name that is part of a longer name is left alone
                un_value = [int(com_value) if item == com_key else item for item in un_value]

                incomplete[un_key] = un_value
   
    return incomplete 

In [40]:
def get_dir_sizes(complete, incomplete):
    '''takes in two dictionaries: complete and incomplete
       calls sum_folders, move_dirs and replace recursively until there are no key value pairs left in incomplete:
       1) get total content size for directories by summing all value lists that contain only file sizes
       2) move successfully summed items from incomplete to complete
       3) for each value list in incomplete, if a dir name in that list matches a key in complete, replace that name with 
          the corresponding directory size in complete
       returns dictionary: complete
    '''
    
    if len(incomplete) == 0:
        
        return complete
    
    else:
          
        # get total content size for directories by summing all value lists that contain only file sizes
        incomplete = sum_folders(incomplete)

        # move successfully summed items from incomplete to complete 
        complete, incomplete = move_dirs(complete, incomplete)
        
        # replace every dir name in values in incomplete that matches a key in complete 
        # with the corresponding directory size in complete
        incomplete = replace(complete, incomplete)
         
        return get_dir_sizes(complete, incomplete)

In [41]:
# holds dict of dirs that have been successfully summed
complete = {}

# create dictionary with keys containing parent directories and values containing a list of contents 
# including directory names and file sizes  
incomplete = get_folder_contents(commands)

print("Incomplete: ", len(incomplete))
print_dict(incomplete)

Incomplete:  150
Parent:  /
Contents:  ['fnsvfbzt', 'hqdssf', 'jwphbz', 'lncqsmj', 'mhqs', 'trwqgzsb', 132067, 'wbsph']
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Parent:  fnsvfbzt
Contents:  [62158]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Parent:  hqdssf
Contents:  [44806]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Parent:  dlsmjsbz
Contents:  [9205]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Parent:  mhqs
Contents:  [133965]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Parent:  ctfl
Contents:  ['shzgg']
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Parent:  shzgg
Contents:  ['bnb']
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Parent:  wfbvtfmr
Contents:  [209356, 286105, 234687]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Parent:  hnqjmq
Contents:  [3307]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Parent:  jzjm
Contents:  [31719]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Parent:  lncqsmj
Contents:  ['jmsw', 'mcgm', 'mpc', 233050, 30757, 250575]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Parent:  rq

In [42]:
# get the sizes of the contents in each directory
complete = get_dir_sizes(complete, incomplete)

print("complete: ", len(complete))
print()
print_dict(complete)

complete:  150

Parent:  fnsvfbzt
Contents:  62158
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Parent:  hqdssf
Contents:  44806
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Parent:  dlsmjsbz
Contents:  9205
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Parent:  mhqs
Contents:  133965
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Parent:  wfbvtfmr
Contents:  730148
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Parent:  hnqjmq
Contents:  3307
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Parent:  jzjm
Contents:  31719
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Parent:  rqznrr
Contents:  105075
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Parent:  cqmgf
Contents:  202804
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Parent:  cssmfv
Contents:  13440
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Parent:  qzs
Contents:  156779
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Parent:  qfs
Contents:  234449
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Parent:  rzctqrgm
Contents:  99041
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Parent: 

In [43]:
small_dirs = [int(value) for value in complete.values() if int(value) <= 100_000]

small_dirs

[62158,
 44806,
 9205,
 3307,
 31719,
 13440,
 99041,
 47522,
 61689,
 92888,
 99280,
 19398,
 19937,
 37975,
 85806,
 31407,
 92092,
 16759,
 41383,
 78315,
 18049,
 18477,
 94389,
 73628,
 88439,
 36092,
 44806,
 57972,
 41383,
 44806,
 41383]

In [44]:
sum(small_dirs)

1547551