## Part 1 problem statement

(Adapted from [Advent of Code 2021, day 3](https://adventofcode.com/2021/day/3))

You are given a list of binary numbers.
You need to use the binary numbers in that list to generate two new binary numbers (called the **gamma rate** and the **epsilon rate**).

Each bit in the gamma rate can be determined by finding the most common bit in the corresponding position of all numbers in the list.
For example, given the following list:

```txt
00100
11110
10110
10111
10101
01111
00111
11100
10000
11001
00010
01010
```

Considering only the first bit of each number, there are five `0` bits and seven `1` bits. Since the most common bit is `1`, the first bit of the gamma rate is `1`.
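The count for the first column can be checked with a short sketch. The names `sample` and `first_bits` are illustrative and not part of the puzzle; the list is the one shown above.

```python
# The twelve sample numbers from the example above.
sample = [
    "00100", "11110", "10110", "10111", "10101", "01111",
    "00111", "11100", "10000", "11001", "00010", "01010",
]

# Collect the first character of every number and count each bit.
first_bits = [number[0] for number in sample]
zeroes = first_bits.count("0")
ones = first_bits.count("1")

print(zeroes, ones)  # 5 7
```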

The most common second bit of the numbers in the list is `0`, so the second bit of the gamma rate is `0`.

The most common values of the third, fourth, and fifth bits are `1`, `1`, and `0`, respectively, and so the final three bits of the gamma rate are `110`.

So, the gamma rate is the binary number `10110`, or `22` **in decimal**.

The epsilon rate is calculated in a similar way; rather than use the most common bit, the least common bit from each position is used. So, the epsilon rate is `01001`, or `9` **in decimal**. Multiplying the gamma rate (`22`) by the epsilon rate (`9`) gives `198`.

**Use the binary numbers in your input list to calculate the gamma rate and epsilon rate, then multiply them together.** What do you get? (Be sure to represent your answer in decimal, not binary.)

_Using the input file `input.txt`, the result should be 749376._

In [2]:
INPUT_FILE = "input.txt"

In [3]:
with open(INPUT_FILE, 'r') as f:
    lines = [line.strip() for line in f]

gamma, epsilon = "", ""
for col in range(len(lines[0])):
    # Count zeros and ones in this column
    zeroes, ones = 0, 0
    for line in lines:
        if line[col] == "0":
            zeroes += 1
        else:
            ones += 1
    # Update gamma and epsilon based on the most common bit
    if zeroes > ones:
        gamma += "0"
        epsilon += "1"
    else:
        gamma += "1"
        epsilon += "0"

print(int(gamma, 2) * int(epsilon, 2))


749376


In [4]:
with open(INPUT_FILE, 'r') as f:
    lines = [line.strip() for line in f]

gamma, epsilon = "", ""
for col in range(len(lines[0])):
    # Count zeros and ones 
    zeroes, ones = 0, 0
    for line in lines:
        if line[col] == "0":
            zeroes += 1
        else:
            ones += 1
    # Update gamma and epsilon based on the most common bit
    if zeroes > ones:
        gamma += "0"
        epsilon += "1"
    else:
        gamma += "1"
        epsilon += "0"
    
print(int(gamma, 2) * int(epsilon, 2))

749376


Even though this solution doesn't use many advanced techniques, there is one thing that is **super** useful already:
the built-in `int` can parse binary numbers:

In [5]:
int("101", 2)

5

That's something we will be using a lot here.
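As a small sketch of how the second argument of `int` works, here are the two rates from the worked example, plus a hexadecimal string for comparison. The variable names are illustrative only; `bin` and `format` are shown as one way to go back from an integer to its binary digits.

```python
# Parse strings written in a given base into regular integers.
gamma_rate = int("10110", 2)    # binary
epsilon_rate = int("01001", 2)  # leading zeroes are fine
hex_value = int("ff", 16)       # hexadecimal

print(gamma_rate, epsilon_rate, hex_value)  # 22 9 255
print(gamma_rate * epsilon_rate)            # 198

# Going the other way: from an integer back to binary digits.
print(bin(gamma_rate))              # 0b10110
print(format(epsilon_rate, "05b"))  # 01001
```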

Another thing worth pointing out is that, when we read the file, we used `.strip()` to get rid of the newline character `"\n"` that comes at the end of each line when we iterate over the file.
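To see the effect of `.strip()` without needing `input.txt`, here is a minimal sketch in which `io.StringIO` stands in for the opened file. The name `fake_file` and its three lines are made up for the illustration.

```python
import io

# io.StringIO behaves like a text file, so we can iterate over its lines.
fake_file = io.StringIO("00100\n11110\n10110\n")

raw = list(fake_file)
print(raw)  # ['00100\n', '11110\n', '10110\n']

# .strip() removes leading and trailing whitespace, including the newline.
stripped = [line.strip() for line in raw]
print(stripped)  # ['00100', '11110', '10110']
```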

Now we want to know how to improve our code.
The first thing I do is wonder: what would I change if the numbers in the problem changed a bit?
For example, what would I do if the problem wanted us to count digits in hexadecimal, instead of binary?

If we were dealing with hexadecimal digits (numbers `0` through `9` and letters `"a"` through `"f"`), then I wouldn't want to have 16 variables just for counting.

## Convenient counting

Thankfully, we can do this with ease: we just need to use a container (something like a list or a dictionary) to hold the counting results.

I tend to prefer dictionaries, because the key-value system makes it very easy to map any kind of value to its count:

In [6]:
with open(INPUT_FILE, 'r') as f:
    lines = [line.strip() for line in f]

gamma, epsilon = "", ""
for col in range(len(lines[0])):
    # Count zeroes and ones in this column:
    counting = {"0": 0, "1": 0}
    for line in lines:
        counting[line[col]] += 1
    # Update gamma and epsilon based on the most common bit:
    if counting["0"] < counting["1"]:
        gamma += "1"
        epsilon += "0"
    else:
        gamma += "0"
        epsilon += "1"

print(int(gamma, 2) * int(epsilon, 2))

749376


As you can see, this simplified the code a fair bit already.
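Going back to the hexadecimal question from earlier, the same dictionary idea scales to 16 digits without 16 variables. This is only a sketch: `hex_lines` is hypothetical data made up for the illustration, and `max` with `key=counting.get` is one possible way to pick the digit with the largest count.

```python
# Hypothetical hexadecimal input, made up for this sketch.
hex_lines = ["a3f", "a30", "b3f", "a1f"]

most_common = ""
for col in range(len(hex_lines[0])):
    # One counter per possible hexadecimal digit, all starting at zero.
    counting = {digit: 0 for digit in "0123456789abcdef"}
    for line in hex_lines:
        counting[line[col]] += 1
    # Pick the digit whose count is the largest in this column.
    most_common += max(counting, key=counting.get)

print(most_common)  # a3f
```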

If you are wondering why I initialised the dictionary `counting` as `{"0": 0, "1": 0}` instead of `{}`, think about this:
if `"0"` and `"1"` are not existing keys of the dictionary `counting`, the line `counting[line[col]] += 1` raises a `KeyError`.
We would have to write an `if` statement to cover the first time we add something to the dictionary:

In [7]:
with open(INPUT_FILE, 'r') as f:
    lines = [line.strip() for line in f]

gamma, epsilon = "", ""
for col in range(len(lines[0])):
    counting = {}
    for line in lines:
        if line[col] not in counting:
            counting[line[col]] = 0
        counting[line[col]] += 1
    if counting["0"] < counting["1"]:
        gamma += "1"
        epsilon += "0"
    else:
        gamma += "0"
        epsilon += "1"
    
print(int(gamma, 2) * int(epsilon, 2))

749376


This is a common pattern in programming: you “look before you leap” (LBYL).
In other words, you make sure you _can_ do what you wanted to do.
In this case, you make sure the key exists before accessing that key in the dictionary.

However, Python tends to follow another coding style, which says it's “easier to ask forgiveness than permission” (EAFP).
This code style suggests you should `try` to do what you want to do, and just fix the situation if you end up in trouble.

## EAFP versus LBYL

In Python, specifically, this generally means contrasting a preventive `if` with a `try` block.
For our example, something like this:

In [8]:
with open(INPUT_FILE, "r") as f:
    lines = [line.strip() for line in f]

gamma, epsilon = "", ""
for col in range(len(lines[0])):
    # Count zeroes and ones in this column:
    counting = {}
    for line in lines:
        try:
            counting[line[col]] += 1
        except KeyError:
            counting[line[col]] = 1
    # Update gamma and epsilon based on the most common bit:
    if counting["0"] < counting["1"]:
        gamma += "1"
        epsilon += "0"
    else:
        gamma += "0"
        epsilon += "1"
        
print(int(gamma, 2) * int(epsilon, 2))

749376


Using the EAFP approach is often the preferred way in Python, and this comparison was shown here for the sake of completeness.
You can read more about the choice between EAFP and LBYL [here](https://mathspp.com/blog/pydonts/eafp-and-lbyl-coding-styles).

In our case, we can avoid the debate altogether by initialising the counting dictionary in the appropriate way, as shown [above](#Convenient-counting).

## Dictionary with default value

This whole discussion about initialising the dictionary with the default values, versus using an `if` statement to ensure we can access the dictionary, versus a `try: ... except: ...` block, shows that in all three approaches we needed to give some default value to the dictionary.

Wouldn't it be great if there was some version of `dict` that assumed a default value?
Well, today is your lucky day, because there is!

`defaultdict`, from the `collections` module, is what we want.
`defaultdict` behaves just like a regular dictionary, except that you give it a “default value factory”: a function that returns the default values we care about.

In our case, we see that the count of a digit we haven't seen before should be `0`, so we just need a function that returns `0` to use with `defaultdict`.
As it turns out, `int` does the job:

In [9]:
int()

0

In [10]:
from collections import defaultdict
olympic_medals = defaultdict(int)
olympic_medals

defaultdict(int, {})

In [11]:
olympic_medals["Monu"]

0

Notice how, above, the dictionary knows that I have zero Olympic medals (the default value for any human being), even though I never told the dictionary explicitly how many medals I have.

We can use a similar thing for our counting:

In [12]:
from collections import defaultdict

with open(INPUT_FILE, 'r') as f:
    lines = [line.strip() for line in f]

gamma, epsilon = "", ""
for col in range(len(lines[0])):
    counting = defaultdict(int)

    for line in lines:
        counting[line[col]] += 1
    
    if counting["0"] < counting["1"]:
        gamma += "1"
        epsilon += "0"
    else:
        gamma += "0"
        epsilon += "1"

print(int(gamma, 2) * int(epsilon, 2))

749376


In our case, because we only had two digits, initialising the dictionary by hand or using `defaultdict` was approximately the same work.
`defaultdict` becomes more useful if we have a lot of different things we might want to count, or if we can't know in advance _what_ things will be counted.
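Here is a sketch of that situation, counting the characters of a string whose contents we pretend not to know in advance (the string itself is made up for the illustration). It also shows one behaviour worth keeping in mind: merely reading a missing key from a `defaultdict` inserts that key with the default value.

```python
from collections import defaultdict

counting = defaultdict(int)
for char in "a3f-a30-b3f":
    # No initialisation needed: unseen characters start at int() == 0.
    counting[char] += 1

snapshot = dict(counting)
print(snapshot)  # {'a': 2, '3': 3, 'f': 2, '-': 2, '0': 1, 'b': 1}

# Reading a missing key returns the default AND stores it in the dictionary.
print("z" in counting)  # False
print(counting["z"])    # 0
print("z" in counting)  # True
```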

However, Python has an even _better_ way to count things:

## Counter

The `collections` module has another useful tool for us: a `Counter`!
A `Counter` does exactly what it says on the tin, it counts things:

In [13]:
from collections import Counter
mississippi_letters = Counter("Mississippi")
mississippi_letters

Counter({'i': 4, 's': 4, 'p': 2, 'M': 1})

In [14]:
mississippi_letters["i"]

4

But it also includes other useful methods, like `most_common`:

In [15]:
mississippi_letters.most_common(1)

[('i', 4)]
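Called without an argument, `most_common` returns every element ordered from most to least frequent, so the last entry is a least common element. That is handy for a quantity like the epsilon rate. A small sketch, with `ranked` as an illustrative name:

```python
from collections import Counter

mississippi_letters = Counter("Mississippi")

# With no argument, every (element, count) pair is returned, most common first.
ranked = mississippi_letters.most_common()
print(ranked)      # [('i', 4), ('s', 4), ('p', 2), ('M', 1)]

# The last pair holds a least common element.
print(ranked[-1])  # ('M', 1)
```

Elements with equal counts, like `'i'` and `'s'` here, keep the order in which they were first encountered.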

`Counter`s also have a default value of `0`:

In [16]:
mississippi_letters["z"]

0

In [17]:
from collections import Counter

with open(INPUT_FILE, 'r') as f:
    lines = [line.strip() for line in f]

gamma, epsilon = "", ""
for col in range(len(lines[0])):
    counting = Counter()
    for line in lines:
        counting[line[col]] += 1
    # Update gamma and epsilon values
    bit, _ = counting.most_common(1)[0]
    if bit == "0":
        gamma += "0"
        epsilon += "1"
    else:
        gamma += "1"
        epsilon += "0"

print(int(gamma, 2) * int(epsilon, 2))

749376
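A `Counter` can also be built directly from an iterable, which removes the inner loop. The sketch below goes one step further and uses `zip(*lines)` to walk over the columns; that is a swapped-in technique, not something used in the cells above. It runs on the sample list from the problem statement instead of `input.txt`, and it assumes every column contains both bits and has no ties.

```python
from collections import Counter

# Sample list from the problem statement, used here instead of input.txt.
lines = [
    "00100", "11110", "10110", "10111", "10101", "01111",
    "00111", "11100", "10000", "11001", "00010", "01010",
]

gamma, epsilon = "", ""
# zip(*lines) yields one tuple per column: all first bits, all second bits, ...
for column in zip(*lines):
    ranked = Counter(column).most_common()
    gamma += ranked[0][0]     # most common bit in this column
    epsilon += ranked[-1][0]  # least common bit (assumes both bits occur)

print(gamma, epsilon)                   # 10110 01001
print(int(gamma, 2) * int(epsilon, 2))  # 198
```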
