# Exploration of the input data

In this notebook, I analyze the input data `rooms.txt`.

In [1]:
from collections import Counter, defaultdict
import re
from copy import deepcopy

## Read and transform the input data


From the `task_en.txt` file, we can already extract information about the chairs:

```
The different types of chairs are as follows:
W: wooden chair
P: plastic chair
S: sofa chair
C: china chair
```

Since these capital letters identify the chairs, we store them in a `set` called `chairs`:

In [2]:
chairs = {'W', 'P', 'S', 'C'}

### Import the data as a string

In [3]:
with open('../rooms.txt', 'r') as f:
    apartment_string = f.read()
Counter(apartment_string)

Counter({'+': 24,
         '-': 240,
         '\n': 49,
         '|': 124,
         ' ': 2005,
         '(': 8,
         'c': 4,
         'l': 5,
         'o': 10,
         's': 2,
         'e': 6,
         't': 5,
         ')': 8,
         'P': 7,
         'S': 3,
         'p': 1,
         'i': 6,
         'n': 4,
         'g': 2,
         'r': 3,
         'm': 3,
         'W': 14,
         'f': 2,
         'C': 1,
         'b': 2,
         'a': 2,
         'h': 2,
         'k': 1,
         '/': 4,
         'v': 1,
         'y': 1})

From the `Counter` output, we can already get a lot of information about the data:
- `'-'`, `'|'` and `'/'` are the characters that delimit the rooms. There are more horizontal than vertical wall characters (240 vs. 124): <br>```number_of('-') > number_of('|')```

- The number of `'+'` gives the number of corners in the apartment

- The number of `'('` (or equivalently `')'`) gives the number of rooms

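- The number of `'\n'` + 1 gives the number of lines of the plan, i.e. the length of the apartment in characters (here 50, assuming the file has no trailing newline)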

- **Most importantly:** gathering the capital-letter keys and their counts already gives the information needed for the first output line of the problem: **total**. Since we also need the distribution of chairs in each room, the problem is not yet solved, but we can definitely save the **total** result and use it later as a check once the per-room information has been gathered.
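
As a minimal sketch of the observations above, the snippet below applies the same counting to a small made-up plan. The `sample_plan` string and its room names are hypothetical and not taken from `rooms.txt`. It assumes that room names are the only text written between parentheses and that the plan has no trailing newline.

```python
import re
from collections import Counter

# Hypothetical miniature plan, for illustration only (not the real rooms.txt).
sample_plan = "\n".join([
    "+--------+-------+",
    "| (hall) | (den) |",
    "|  W  P  |   S   |",
    "|     W  |       |",
    "+--------+-------+",
])

sample_counts = Counter(sample_plan)

# Number of lines = number of newline characters + 1 (no trailing newline assumed).
n_lines = sample_counts["\n"] + 1

# Assumption: room names are the only text enclosed in parentheses.
room_names = re.findall(r"\(([^)]*)\)", sample_plan)

# Chair totals: look up only the capital letters that denote chairs.
# A Counter returns 0 for a letter that never occurs.
chairs = {"W", "P", "S", "C"}
sample_total = {chair: sample_counts[chair] for chair in sorted(chairs)}

print(n_lines, room_names, sample_total)
```

Using `re.findall` rather than only counting `'('` also returns the room names themselves, which may be useful later when attributing chairs to rooms.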

### Observation

According to the `task_en.txt` file, mistakes occurred in the past when *manually counting the various types of chairs*. This refers to the total number of chairs.

I imagine a mistake being for example: forgetting a china chair when delivering all the chairs.

Considering that:
- gathering only the total number of each type of chairs can be quickly implemented
- this implementation will solve most of the client's problems
- finding the distribution of the chairs in each room looks much more complex

If I were given this problem at work with a short deadline, I would explain to the client the technical implications of the result he would like to get, and I would suggest that he first try, as a temporary measure, the quick-and-easy solution:
- Use my program to know the total number of each type of chair
- Use his existing plans to locate the chairs in the room

In [4]:
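counts = Counter(apartment_string)  # count every character only once

# The number of '(' gives the number of rooms
n_rooms = counts['(']
print(n_rooms)

# Keep only the characters that denote chairs
total = {char: n for char, n in counts.items() if char in chairs}
total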

8


{'P': 7, 'S': 3, 'W': 14, 'C': 1}
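
The saved `total` can later serve as the check mentioned earlier. Here is a minimal sketch, assuming the per-room result ends up as a dictionary mapping each room name to its chair counts. The `per_room` structure, the room names and the helper `matches_total` are hypothetical and only illustrate the idea.

```python
from collections import Counter

def matches_total(total, per_room):
    """Return True if the per-room chair counts add up to the overall total."""
    summed = Counter()
    for room_counts in per_room.values():
        summed.update(room_counts)  # adds the counts key by key
    # Compare every chair type seen on either side; missing keys count as 0.
    all_chairs = set(summed) | set(total)
    return all(summed[chair] == total.get(chair, 0) for chair in all_chairs)

# Hypothetical example data, not taken from rooms.txt.
example_total = {"W": 2, "P": 1, "S": 1}
complete = {"hall": {"W": 2, "P": 1}, "den": {"S": 1}}
one_chair_missing = {"hall": {"W": 1, "P": 1}, "den": {"S": 1}}

print(matches_total(example_total, complete))
print(matches_total(example_total, one_chair_missing))
```

The comparison is done key by key rather than with `==` on the dictionaries, so a chair type with a count of zero on one side and absent on the other is still treated as matching.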

See the next notebook for an explanation of the solution I found for this problem.