# Exploratory Data Analysis (EDA)

In this notebook, I analyze the input data `rooms.txt` and walk you through my thought process for solving this problem.

In [12]:
from collections import Counter, defaultdict
import re
from copy import deepcopy

## Read and transform the input data


From the `task_en.txt` file, we can already extract information about the chairs:

```
The different types of chairs are as follows:
W: wooden chair
P: plastic chair
S: sofa chair
C: china chair
```

As we only need the capital letters from here on, we save them in a `set` called `chairs`.

In [13]:
chairs = {'W', 'P', 'S', 'C'}

### Import the data as a string

In [14]:
with open('rooms.txt', 'r') as f:
    rooms_string = f.read()
Counter(rooms_string)

Counter({'+': 24,
         '-': 240,
         '\n': 49,
         '|': 124,
         ' ': 2005,
         '(': 8,
         'c': 4,
         'l': 5,
         'o': 10,
         's': 2,
         'e': 6,
         't': 5,
         ')': 8,
         'P': 7,
         'S': 3,
         'p': 1,
         'i': 6,
         'n': 4,
         'g': 2,
         'r': 3,
         'm': 3,
         'W': 14,
         'f': 2,
         'C': 1,
         'b': 2,
         'a': 2,
         'h': 2,
         'k': 1,
         '/': 4,
         'v': 1,
         'y': 1})

The `Counter` output already tells us a lot about the data:
- The `'-'`, `'|'` and `'/'` characters delimit the areas of the rooms. In this plan there are more horizontal wall characters (240) than vertical ones (124): <br>```number_of('-') > number_of('|')```

- The number of `'+'` gives the number of corners in the apartment

- The number of `'('` or `')'` gives the number of rooms

- The number of `'\n'` + 1 gives the number of lines in the plan, i.e. the length of the apartment (assuming the file has no trailing newline)

- **Most importantly:** the capital-letter keys and their counts already give the information needed for the first output line of the problem: **total**. Since we also need the distribution of chairs per room, the problem is not yet solved, but we can definitely save the **total** result as a check for later, once the per-room information has been gathered.
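To make the observations above concrete, here is a minimal sketch on a made-up toy plan (it is not the real `rooms.txt`; the plan and the variable names are assumptions for illustration only). It reads the number of corners, rooms, rows and chairs straight from a `Counter`.

```python
from collections import Counter

# Hypothetical toy plan: one room, two chairs (NOT the real rooms.txt).
toy_plan = "\n".join([
    "+--------+",
    "| (hall) |",
    "| W    P |",
    "+--------+",
])

chair_types = {"W", "P", "S", "C"}
counts = Counter(toy_plan)

n_corners = counts["+"]      # each '+' marks a corner
n_rooms = counts["("]        # one '(' per room label
n_rows = counts["\n"] + 1    # valid here because toy_plan has no trailing newline
total = {c: n for c, n in counts.items() if c in chair_types}

print(n_corners, n_rooms, n_rows, total)
```

One caveat of this shortcut: it counts every capital `W`, `P`, `S` or `C` in the file, so it only works as long as room labels never contain those capital letters.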

### Observation

According to `task_en.txt`, mistakes occurred in the past when *manually counting the various types of chairs*. This refers to the total number of chairs.

I imagine a mistake being, for example, forgetting a china chair when delivering all the chairs.

Considering that:
- counting only the total number of each type of chair can be implemented quickly
- this implementation will solve most of the client's problems
- finding the distribution of the chairs across the rooms looks much more complex

If I were given this problem at work, I would explain to the client the technical implications of the result they would like to get, and I would suggest that they first try the quick-and-easy solution:
- Use my program to get the total number of each type of chair
- Use their existing maps to locate the chairs in the rooms

In [15]:
# Count the characters once and reuse the result
counts = Counter(rooms_string)

n_rooms = counts['(']
print(n_rooms)

total = {chair: n for chair, n in counts.items() if chair in chairs}
total

8


{'P': 7, 'S': 3, 'W': 14, 'C': 1}
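Since **total** is meant to serve as a check later on, here is a minimal sketch of what that check could look like. The helper name `check_totals` and the toy inputs are assumptions, not part of the original notebook; the per-room input is shaped like a room-to-list-of-chairs mapping.

```python
from collections import Counter

def check_totals(per_room, total):
    """Return True if the chairs found per room add up to the global totals.

    `per_room` maps a room name to a list of chair letters;
    `total` maps a chair letter to its expected global count.
    """
    summed = Counter()
    for chairs_in_room in per_room.values():
        summed.update(chairs_in_room)
    # Unary + drops zero counts so both sides compare cleanly.
    return +summed == +Counter(total)

# Toy data for illustration only (not results from rooms.txt).
toy_total = {"W": 2, "P": 1}
toy_per_room = {"hall": ["W", "P"], "office": ["W"]}
print(check_totals(toy_per_room, toy_total))
```

If a chair is assigned to no room, or to a wrong label such as `'not found'`, the sums still match, so this check only catches chairs that are lost or counted twice.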

As the input represents a 2D plan, I think the best way to locate things in it is to transform it into a 2D array.

In [16]:
# One list of characters per line of the plan
rooms = [list(line) for line in rooms_string.splitlines()]

def print_of(rooms):
    # Join each row back into a string for display
    return [''.join(row) for row in rooms]

print_of(rooms)

['+-----------+------------------------------------+',
 '|           |                                    |',
 '| (closet)  |                                    |',
 '|         P |                            S       |',
 '|         P |         (sleeping room)            |',
 '|         P |                                    |',
 '|           |                                    |',
 '+-----------+    W                               |',
 '|           |                                    |',
 '|        W  |                                    |',
 '|           |                                    |',
 '|           +--------------+---------------------+',
 '|                          |                     |',
 '|                          |                W W  |',
 '|                          |    (office)         |',
 '|                          |                     |',
 '+--------------+           |                     |',
 '|              |           |                     |',
 '| (toile

In [17]:
sep_chars = {'\\', '|', '/', '+', '-'}
sep_chars

{'+', '-', '/', '\\', '|'}

In [18]:
no_room_chars = {'\\', '|', '/', '+', '-', ' '}

## Move letters approach

In [19]:
dict_pos_chairs = {}
for i, row in enumerate(rooms):
    for j, element in enumerate(row):
        if element in chairs:
            dict_pos_chairs[(i, j)] = element
            # room = search_room(i, j)
            # rooms_chairs[room][element] += 1

print(dict_pos_chairs)
list_pos_chairs = list(dict_pos_chairs.keys())
list_pos_chairs.sort(key=lambda x: (x[0],x[1]))
list_pos_chairs

{(3, 10): 'P', (3, 41): 'S', (4, 10): 'P', (5, 10): 'P', (7, 17): 'W', (9, 9): 'W', (13, 44): 'W', (13, 46): 'W', (18, 41): 'P', (19, 4): 'C', (27, 34): 'W', (27, 38): 'W', (28, 34): 'W', (28, 38): 'W', (29, 8): 'P', (33, 38): 'W', (33, 43): 'W', (33, 47): 'W', (36, 2): 'S', (36, 38): 'W', (36, 43): 'W', (36, 47): 'W', (38, 2): 'S', (45, 46): 'P', (47, 45): 'P'}


[(3, 10),
 (3, 41),
 (4, 10),
 (5, 10),
 (7, 17),
 (9, 9),
 (13, 44),
 (13, 46),
 (18, 41),
 (19, 4),
 (27, 34),
 (27, 38),
 (28, 34),
 (28, 38),
 (29, 8),
 (33, 38),
 (33, 43),
 (33, 47),
 (36, 2),
 (36, 38),
 (36, 43),
 (36, 47),
 (38, 2),
 (45, 46),
 (47, 45)]

In [20]:
"""
Data Management
"""
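def remove_chairs(except_chair):
    """Return a fresh copy of the plan where every chair except `except_chair` is blanked out."""
    # Relies on the globals `rooms_init` (defined in a later cell) and `list_pos_chairs`
    rooms = deepcopy(rooms_init)
    for pair in list_pos_chairs:
        if pair != except_chair:
            i, j = pair
            rooms[i][j] = ' '
    return rooms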

In [None]:

        j -= 1
    # Initialize as a simple letter
    room_of_chair = row_str[j]
    j_left, j_right = j-1, j+1
    while (row_str[j_left] != '(') or (row_str[j_right] != ')'):
        if row_str[j_left] != '(':
            room_of_chair = row_str[j_left] + room_of_chair
            j_left -= 1
        if row_str[j_right] != ')':
            room_of_chair = room_of_chair + row_str[j_right] 
            j_right += 1
    return room_of_chair

In [22]:
"""
Horizontal check
"""

def is_in_room_same_row(i, j):
    chair = rooms[i][j]
    row_no_space = list(filter(lambda e: e != ' ', rooms[i]))
    return row_no_space[row_no_space.index(chair)+1] == '(' or row_no_space[row_no_space.index(chair)-1] == ')'


def find_room_on_same_row(i, j):
    chair = rooms[i][j]
    # Work on the row without spaces, so the chair sits right next to '(' or ')'
    row = list(filter(lambda e: e != ' ', rooms[i]))
    j = row.index(chair)
    if row[j+1]  == '(':
        # Room name is to the right of the chair
        str_to_inspect = ''.join(row[j+1:])
        return re.split(r'\(|\)', str_to_inspect)[1]
    elif row[j-1] == ')':
        # Room name is to the left of the chair
        str_to_inspect = ''.join(row[:j-1])
        return re.split(r'\(|\)', str_to_inspect)[-1]

def change_pos(i, j, new_i, new_j):
    rooms[i][j], rooms[new_i][new_j] = rooms[new_i][new_j], rooms[i][j]
    i, j = new_i, new_j
    return i, j
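To illustrate the horizontal check above without the notebook's global state, here is a self-contained sketch that applies the same idea to a single row string. The function name `room_name_next_to` and the sample rows are assumptions made up for this example.

```python
import re

def room_name_next_to(row, chair):
    """Return the room name directly beside `chair` on this row (spaces ignored), else None."""
    compact = [ch for ch in row if ch != " "]
    j = compact.index(chair)
    # Room label starts right after the chair
    if j + 1 < len(compact) and compact[j + 1] == "(":
        return re.split(r"\(|\)", "".join(compact[j + 1:]))[1]
    # Room label ends right before the chair
    if j > 0 and compact[j - 1] == ")":
        return re.split(r"\(|\)", "".join(compact[:j - 1]))[-1]
    return None

left = room_name_next_to("| (closet) P |", "P")
right = room_name_next_to("| W (office)     |", "W")
blocked = room_name_next_to("| W |        (office) |", "W")
print(left, right, blocked)
```

Because spaces are stripped before the lookup, a label such as `(sleeping room)` comes back as `sleepingroom`. That is harmless as a dictionary key but needs to be mapped back to the original label before printing results.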




In [28]:
rooms_init = [list(line) for line in rooms_string.splitlines()]
dict_rooms_chairs = defaultdict(list)



def explore_horizontal_moves(i_start: int, j_start: int):
    i, j = i_start, j_start

    # First step: one vertical check at the starting point
    if is_in_room_same_column(i, j):
        return find_room_on_same_column(i, j)

    # Second step: Move vertically

    # Initialize moving up
    direction = 'up'

    while True:
        # If the room name is not found in the row, we move vertically
        if not is_in_room_same_row(i, j):   

            # If we reached the top of the room ...
            if direction == 'up' and rooms[i-1][j] in sep_chars:
                # ...we set the direction to down for the next step...
                direction = 'down' 
                # ...and get back to the initial position
                i, j = change_pos(i=i, j=j, new_i=i_start, new_j=j)
            
            # If we reached the bottom of the room ...
            if direction == 'down' and rooms[i+1][j] in sep_chars:
                # The search has been unsuccessful
                return 'not found'

            # We move up
            if direction == 'up':
                i, j = change_pos(i=i, j=j, new_i=i-1, new_j=j)

            # We move down   
            elif direction == 'down':
                i, j = change_pos(i=i, j=j, new_i=i+1, new_j=j)

        # We found the room name
        else:
            return find_room_on_same_row(i,j)



def explore_vertical_moves(i_start: int, j_start: int):
    i, j = i_start, j_start

    # First step: one horizontal check at the starting point
    if is_in_room_same_row(i, j):
        return find_room_on_same_row(i, j)

    # Second step: Move horizontally 

    # Initialize moving left
    direction = 'left'
    count = 0
    while count < 50:
        print(i, j)
        print(direction)
        count += 1
        # If the room name is not found in the column, we move horizontally
        if not is_in_room_same_column(i, j):   

            # If we reached the left side of the room ...
            if direction == 'left' and rooms[i][j-1] in sep_chars:
                # ...we set the direction to right for the next step...
                direction = 'right' 
                # ...and get back to the initial position
                i, j = change_pos(i=i, j=j, new_i=i, new_j=j_start)
            
            print(rooms[i][j+1] in sep_chars)
            # If we reached the right side of the room ...
            if direction == 'right' and rooms[i][j+1] in sep_chars:
                print('ok')
                # The search has been unsuccessful
                return 'not found'

            # We move left
            if direction == 'left':
                i, j = change_pos(i=i, j=j, new_i=i, new_j=j-1)

            # We move right   
            elif direction == 'right':
                i, j = change_pos(i=i, j=j, new_i=i, new_j=j+1)

        # We found the room name
        else:
            return find_room_on_same_column(i,j)






for k, coordinates in enumerate(list_pos_chairs):
    if k+1 != 6:
        print(k+1)
        i, j = coordinates
        chair = dict_pos_chairs[coordinates]
        rooms = remove_chairs(except_chair=coordinates)
        room_of_chair = explore_vertical_moves(i, j)
        # Save it
        dict_rooms_chairs[room_of_chair].append(chair)



print(dict_rooms_chairs)
# print(dict_rooms_chairs)
print_of(rooms)

1
3 10
left
False
3 9
left
False
3 8
left
False
3 7
left
False
3 6
left
False
3 5
left
False
3 4
left
False
3 3
left
False
3 2
left
False
3 1
left
False
3 11
right
True
ok
2
3 41
left
False
3 40
left
False
3 39
left
False
3 38
left
False
3 37
left
False
3 36
left
False
3 35
left
False
3 34
left
False
3 33
left
False
3 32
left
False
3 31
left
False
3 30
left
False
3 29
left
False
3 28
left
False
3 27
left
False
3 26
left
False
3 25
left
False
3 24
left
False
3 23
left
False
3 22
left
False
3 21
left
False
3 20
left
False
3 19
left
False
3 18
left
False
3 17
left
False
3 16
left
False
3 15
left
False
3 14
left
False
3 13
left
False
3 42
right
False
3 43
right
False
3 44
right
False
3 45
right
False
3 46
right
False
3 47
right
False
3 48
right
True
ok
3
4 10
left
False
4 9
left
False
4 8
left
False
4 7
left
False
4 6
left
False
4 5
left
False
4 4
left
False
4 3
left
False
4 2
left
False
4 1
left
False
4 11
right
True
ok
4
5 10
left
False
5 9
left
False
5 8
left
False
5 7
left
False
5 6
le

['+-----------+------------------------------------+',
 '|           |                                    |',
 '| (closet)  |                                    |',
 '|           |                                    |',
 '|           |         (sleeping room)            |',
 '|           |                                    |',
 '|           |                                    |',
 '+-----------+                                    |',
 '|           |                                    |',
 '|           |                                    |',
 '|           |                                    |',
 '|           +--------------+---------------------+',
 '|                          |                     |',
 '|                          |                     |',
 '|                          |    (office)         |',
 '|                          |                     |',
 '+--------------+           |                     |',
 '|              |           |                     |',
 '| (toile

#### To check later
- In row exploration: the case where the letter sits directly under a `)`

##### Reasons to temporarily remove chairs

- is room on the same column

### Remaining steps

1. Long way to the room name
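For the "long way to the room name" case, one possible alternative is a flood fill. This is not the letter-moving approach developed in this notebook but a different technique, sketched here under assumptions: rooms are fully enclosed by the separator characters, each room contains exactly one `(label)` on a single line, and the names `chairs_per_room`, `WALLS` and `CHAIRS` are hypothetical.

```python
from collections import Counter, deque

WALLS = set("+-|/\\")
CHAIRS = set("WPSC")

def chairs_per_room(plan):
    """Group chairs by room label using a 4-neighbour flood fill."""
    grid = plan.splitlines()
    seen = set()
    result = {}
    for i, row in enumerate(grid):
        for j, ch in enumerate(row):
            if ch in WALLS or (i, j) in seen:
                continue
            # Collect every cell reachable from (i, j) without crossing a wall.
            seen.add((i, j))
            queue = deque([(i, j)])
            cells = []
            while queue:
                a, b = queue.popleft()
                cells.append((a, b))
                for na, nb in ((a + 1, b), (a - 1, b), (a, b + 1), (a, b - 1)):
                    inside = 0 <= na < len(grid) and 0 <= nb < len(grid[na])
                    if inside and (na, nb) not in seen and grid[na][nb] not in WALLS:
                        seen.add((na, nb))
                        queue.append((na, nb))
            # The room label is the text between '(' and ')' inside the region.
            name, label_cells = None, set()
            for a, b in cells:
                if grid[a][b] == "(":
                    end = grid[a].index(")", b)
                    name = grid[a][b + 1:end]
                    label_cells = {(a, c) for c in range(b, end + 1)}
                    break
            if name is None:
                continue  # unlabeled region: skipped in this sketch
            counts = Counter(
                grid[a][b] for a, b in cells
                if (a, b) not in label_cells and grid[a][b] in CHAIRS
            )
            result[name] = dict(counts)
    return result

# Hypothetical two-room toy plan (NOT the real rooms.txt).
toy_plan = "\n".join([
    "+-------+------+",
    "| (hall)|      |",
    "| W   P | (wc) |",
    "|       |   S  |",
    "+-------+------+",
])
print(chairs_per_room(toy_plan))
```

With this approach the distance between a chair and its room label no longer matters, and the chairs do not need to be removed temporarily. The trade-off is that it depends on the walls being closed: a gap in a wall would merge two rooms into one region.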