# Assignment 2.

The file `tms_mapping.csv` contains the results of TMS mapping of the left-hemisphere language areas of 10 participants. To perform the mapping, a grid of targets was used, similar to this one:

<img src="https://drive.google.com/uc?export=view&id=15fBf7Qk-f_tTb7pvyvn29MPWV4ZseCR9" style="height: 200px; width: auto">


The mapping procedure was performed on 10 individuals. Thus, each of the 10 grids (13 x 13 targets) contains one individual's responses to a single 10 Hz rTMS train. `1` means that the rTMS application to a particular site interfered with the participant's speech; `0` means that no language interference was observed.

Your task is: 
1. To overlap all grids so that they represent the probability (%) of speech interference at every site - use a NumPy array.
2. To replace all subthreshold probabilities with `0` (as a threshold use 70% probability).
3. To add **x** and **y** coordinates, so that the central point of the grid is represented as `x = 0` and `y = 0`:
    * Add x axis (ranging from -6 to 6) on the top of the array
    * Add y axis (ranging from -6 to 6) on the left side of the array<br><br>
    
   The final array should look like this:
    
    ```
    [
     [  0  -6  -5  -4  -3  -2  -1   0   1   2   3   4   5   6]
     [ -6   0   0   0   0   0   0   0   0   0   0   0   0   0]
     [ -5   0   0   0   0   0   0   0   0   0   0   0   0   0]
     [ -4   0   0   0   0   0   0   0   0   0   0   0   0   0]
     [ -3   0   0   0   0   0   0   0   0   0   0   0   0   0]
     [ -2   0   0   0   0   0   0   0   0   0   0   0   0   0]
     [ -1   0   0   0   0   0   0   0   0  70   0   0   0   0]
     [  0   0   0   0   0   0   0  70  70  80  70   0   0   0]
     [  1   0   0   0   0   0  70  70  90  90  90   0   0   0]
     [  2   0   0  80  90   0  70  90  70  90 100  90   0   0]
     [  3   0  90 100 100   0   0   0 100  90 100   0   0   0]
     [  4   0   0 100   0  70  70   0  90  90   0  80   0   0]
     [  5   0   0   0   0   0   0   0   0   0   0   0   0   0]
     [  6   0   0   0   0   0   0   0   0   0   0   0   0   0]
    ]
    ```
    
4. To display coordinates of all sites/targets where probabilities were maximal as a table:

    ```
    Table 1. Coordinates of sites with highest probability of speech interference:

    |______Nr_______|_______X_______|_______Y_______|_____Area______|
    |       1       |       3       |       2       |   Wernicke    |
    |       2       |      -4       |       3       |     Broca     |
    |       3       |      -3       |       3       |     Broca     |
    |       4       |       1       |       3       |   Wernicke    |
    |       5       |       3       |       3       |   Wernicke    |
    |       6       |      -4       |       4       |     Broca     |
    ```

    <br><br>For language areas use these coordinates:
    1. **Broca's area** is defined as a rectangular area with coordinates:
       - top left: x = -5, y = 2
       - bottom right: x = -3, y = 4<br><br>
    2. **Wernicke's area** is defined as a rectangular area with coordinates:
       - top left: x = 1, y = 1
       - bottom right: x = 3, y = 4<br><br>
    
HINTS:
* In order to read the file directly to a NumPy array you may use: 
```python
   my_array = np.genfromtxt(path_to_file, delimiter=';', dtype = int)
```
* Preview the CSV file and see how the data is arranged, before you start writing the code. You will probably have to slice the original array;
* Array vectorization (scalar and array-array), aggregation methods, as well as boolean indexing will be very helpful;
* To create a table, an f-string will be ideal.

In [212]:
### Configure IPython shell to print all outputs generated in a code cell
### --------------------------------------------------------------------------
from IPython.core.interactiveshell import InteractiveShell
InteractiveShell.ast_node_interactivity = "all"

import numpy as np

# this will ensure proper alignment of arrays in the output of jupyter notebook (without wrapping)
np.set_printoptions(linewidth = 120)


#============================== your code here: ==============================

path_to_file = "tms_mapping.csv"

my_array = np.genfromtxt(path_to_file, delimiter=';', dtype = int)

broca_topleft = (-5, 2)

broca_bottomright = (-3, 4)

wernicke_topleft = (1, 1)

wernicke_bottomright = (3, 4)

grid_size = 13

participants = 10

#=============================================================================


## Probability (%) of speech interference in all sites

### Data preparation

Separating the data into a list of NumPy arrays, where each array
holds the grid recorded from a single participant

In [213]:
def slice_matrix(sample_size, init_array, array_size):

    """
    :param sample_size: number of participants
    :param init_array: array which stores data gathered from all participants
    :param array_size: size of the grid
    :return: list of arrays where each array consists of data from one participant
    """

    step = array_size + 1

    sliced_data = []

    for i in range(sample_size):

        sliced_data.append(init_array[ (1 + step * i ) : step + step * i, :])

    return sliced_data


data = slice_matrix(participants, my_array, grid_size)


data

[array([[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
        [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
        [0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0],
        [0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0],
        [0, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0],
        [0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0],
        [0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 0, 0, 0],
        [0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 0, 0, 0],
        [0, 0, 1, 1, 0, 1, 1, 0, 1, 1, 1, 0, 0],
        [0, 1, 1, 1, 0, 0, 1, 1, 1, 1, 0, 0, 0],
        [0, 0, 1, 0, 1, 1, 0, 1, 1, 0, 1, 0, 0],
        [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
        [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]]),
 array([[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
        [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
        [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
        [0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0],
        [0, 0, 1, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0],
        [0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0],
        [0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0],
        [0, 0, 0, 

Overlapping all the arrays into one array in which a value at a
given point is a sum of all the values at that point across all
the arrays

In [214]:
def sum_arrays(data_lst, array_size):

    """
    :param data_lst: a list containing numpy arrays with the data
    :param array_size: size of the grid
    :return: a 2D numpy array, where all the arrays passed are summed into one
    """

    sol = np.zeros( (array_size, array_size) )

    for elt in data_lst:

        sol += elt

    return sol.astype(int)


data = sum_arrays(data, grid_size)

data


array([[ 0,  0,  0,  0,  1,  0,  0,  0,  0,  0,  0,  0,  0],
       [ 0,  0,  1,  0,  0,  1,  0,  0,  0,  0,  0,  0,  0],
       [ 0,  2,  0,  0,  4,  1,  1,  6,  1,  1,  0,  1,  0],
       [ 0,  0,  1,  1,  0,  0,  0,  1,  0,  4,  2,  0,  0],
       [ 0,  1,  4,  1,  0,  0,  4,  4,  6,  1,  1,  0,  0],
       [ 0,  1,  1,  5,  1,  1,  5,  6,  7,  6,  2,  0,  0],
       [ 1,  2,  0,  0,  0,  6,  7,  7,  8,  7,  1,  1,  0],
       [ 0,  1,  2,  1,  6,  7,  7,  9,  9,  9,  1,  1,  0],
       [ 0,  5,  8,  9,  2,  7,  9,  7,  9, 10,  9,  0,  0],
       [ 0,  9, 10, 10,  1,  0,  3, 10,  9, 10,  5,  0,  0],
       [ 2,  3, 10,  3,  7,  7,  2,  9,  9,  4,  8,  1,  0],
       [ 1,  1,  0,  0,  0,  0,  0,  0,  0,  1,  2,  2,  0],
       [ 0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0]])
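The slicing and summing steps above can also be sketched without Python loops. This is only an illustration on a small synthetic array, not the contents of `tms_mapping.csv`: it assumes the layout implied by `slice_matrix` (one non-data row followed by `grid_size` data rows per participant). The names `raw`, `blocks`, `separator`, `stacked` and `counts` are hypothetical.

```python
import numpy as np

grid_size, participants = 3, 2  # small hypothetical sizes for illustration

# Synthetic stand-in for the loaded file: each participant block is one
# separator row (filled with -1 here) followed by grid_size rows of 0/1 data.
rng = np.random.default_rng(0)
blocks = rng.integers(0, 2, size=(participants, grid_size, grid_size))
separator = np.full((participants, 1, grid_size), -1)
raw = np.concatenate([separator, blocks], axis=1).reshape(-1, grid_size)

# View the 2D array as (participant, row, column), drop the separator row
# of every block, then sum over the participant axis.
stacked = raw.reshape(participants, grid_size + 1, grid_size)[:, 1:, :]
counts = stacked.sum(axis=0)

print(counts)
```

Reshaping to three dimensions keeps the per-participant grids addressable (`stacked[i]`) while letting `sum(axis=0)` do the overlap in one call.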

### Calculating the probability

Calculating the probability (%) of speech interference occurring at each point of the grid

In [215]:
def calc_prob(data_array, sample_size):

    """
    :param data_array: numpy array with the stored data
    :param sample_size: number of participants
    :return: numpy array with the integer probability (%) of speech interference at each site
    """

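    # multiply before dividing and round, so that float error cannot truncate a value such as 28.999... to 28
    return np.rint( data_array * 100 / sample_size ).astype(int)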


data = calc_prob(data, participants)

data

array([[  0,   0,   0,   0,  10,   0,   0,   0,   0,   0,   0,   0,   0],
       [  0,   0,  10,   0,   0,  10,   0,   0,   0,   0,   0,   0,   0],
       [  0,  20,   0,   0,  40,  10,  10,  60,  10,  10,   0,  10,   0],
       [  0,   0,  10,  10,   0,   0,   0,  10,   0,  40,  20,   0,   0],
       [  0,  10,  40,  10,   0,   0,  40,  40,  60,  10,  10,   0,   0],
       [  0,  10,  10,  50,  10,  10,  50,  60,  70,  60,  20,   0,   0],
       [ 10,  20,   0,   0,   0,  60,  70,  70,  80,  70,  10,  10,   0],
       [  0,  10,  20,  10,  60,  70,  70,  90,  90,  90,  10,  10,   0],
       [  0,  50,  80,  90,  20,  70,  90,  70,  90, 100,  90,   0,   0],
       [  0,  90, 100, 100,  10,   0,  30, 100,  90, 100,  50,   0,   0],
       [ 20,  30, 100,  30,  70,  70,  20,  90,  90,  40,  80,  10,   0],
       [ 10,  10,   0,   0,   0,   0,   0,   0,   0,  10,  20,  20,   0],
       [  0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0]])

Filtering out the probability values below 70 (threshold)

In [216]:
data[data < 70] = 0

data

array([[  0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0],
       [  0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0],
       [  0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0],
       [  0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0],
       [  0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0],
       [  0,   0,   0,   0,   0,   0,   0,   0,  70,   0,   0,   0,   0],
       [  0,   0,   0,   0,   0,   0,  70,  70,  80,  70,   0,   0,   0],
       [  0,   0,   0,   0,   0,  70,  70,  90,  90,  90,   0,   0,   0],
       [  0,   0,  80,  90,   0,  70,  90,  70,  90, 100,  90,   0,   0],
       [  0,  90, 100, 100,   0,   0,   0, 100,  90, 100,   0,   0,   0],
       [  0,   0, 100,   0,  70,  70,   0,  90,  90,   0,  80,   0,   0],
       [  0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0],
       [  0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0]])
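The in-place assignment above overwrites the unfiltered probabilities. If they are still needed later, a non-mutating variant is possible with `np.where`. The sketch below uses a tiny hypothetical array (`prob`) rather than the real data.

```python
import numpy as np

prob = np.array([[10, 70],
                 [100, 60]])  # hypothetical probabilities in percent
threshold = 70

# Keep values at or above the threshold, replace everything else with 0.
# np.where builds a new array, so prob itself is left untouched.
filtered = np.where(prob >= threshold, prob, 0)

print(filtered)
```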

### Adding x and y coordinates to the grid

The axes are added as shown in the example from the task description.
Note that the y-axis follows the row order of the array, so negative
values are at the top and positive values are at the bottom.

In [217]:
def add_index(data_array, array_size):

    """
    :param data_array: numpy array with the stored data
    :param array_size: size of the grid
    :return: a numpy array with added x and y coordinates
    """

    start, end = 0, 0

    rows, columns = np.empty(0), np.empty(0)

    if array_size % 2 == 0:

        start = 0 - array_size/2

        end = array_size/2

        columns = np.concatenate( (np.arange(start, 0), np.arange( 1, end + 1) ) )

        rows = np.insert( columns, 0, 0 ).reshape(array_size + 1, 1)

    else:

        start = 0 - ( array_size - 1 ) / 2

        end = ( array_size - 1 ) / 2

        columns = np.arange(start, end + 1, 1)

        rows = np.insert( columns, 0, 0 ).reshape(array_size + 1, 1)

    sol = np.vstack( (columns, data_array) )

    sol = np.hstack( (rows, sol) )

    return sol.astype(int)


data = add_index(data, grid_size)

data

array([[  0,  -6,  -5,  -4,  -3,  -2,  -1,   0,   1,   2,   3,   4,   5,   6],
       [ -6,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0],
       [ -5,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0],
       [ -4,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0],
       [ -3,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0],
       [ -2,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0],
       [ -1,   0,   0,   0,   0,   0,   0,   0,   0,  70,   0,   0,   0,   0],
       [  0,   0,   0,   0,   0,   0,   0,  70,  70,  80,  70,   0,   0,   0],
       [  1,   0,   0,   0,   0,   0,  70,  70,  90,  90,  90,   0,   0,   0],
       [  2,   0,   0,  80,  90,   0,  70,  90,  70,  90, 100,  90,   0,   0],
       [  3,   0,  90, 100, 100,   0,   0,   0, 100,  90, 100,   0,   0,   0],
       [  4,   0,   0, 100,   0,  70,  70,   0,  90,  90,   0,  80,   0,   0],
       [  5,   0,   0,   0,   0,   0,   0,   0,   0,
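For comparison, the same labelling can be sketched by filling a preallocated array instead of stacking. This is a minimal sketch that assumes a square grid with an odd side length (as with the 13 x 13 grid here); `add_axes` is a hypothetical helper, not part of the solution above.

```python
import numpy as np

def add_axes(grid):
    """Return a copy of a square, odd-sized grid with x labels in the
    first row and y labels in the first column (0 in the corner)."""
    n = grid.shape[0]
    half = n // 2
    axis = np.arange(-half, half + 1)  # e.g. -6 .. 6 for n = 13

    out = np.zeros((n + 1, n + 1), dtype=int)
    out[0, 1:] = axis   # x axis along the top
    out[1:, 0] = axis   # y axis down the left side
    out[1:, 1:] = grid  # original values
    return out

print(add_axes(np.arange(9).reshape(3, 3)))
```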

### Displaying coordinates where probabilities were maximal

In [218]:
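# the first row and column hold axis labels, so the maximum is taken over the data block only
max_prob = data[1:, 1:].max()

coordinates = np.where(data == max_prob)

# stored as a list so that the (row, column) pairs can be iterated more than once
coordinates_zipped = list(zip(coordinates[0], coordinates[1]))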

header = ["Nr", "X", "Y", "Area"]

def wernicke_or_broca(row, col, data_array):

    """
    :param row: row index of a site in the array with axes
    :param col: column index of a site in the array with axes
    :param data_array: numpy array with the stored data and the x and y axes
    :return: "Broca" if the site is within Broca's area, "Wernicke" if the
    site is within Wernicke's area, or else an empty string
    """

    # the first row holds the x labels and the first column holds the y labels
    x, y = data_array[0, col], data_array[row, 0]

    if broca_topleft[0] <= x <= broca_bottomright[0] and \
            broca_topleft[1] <= y <= broca_bottomright[1]:

        return "Broca"

    elif wernicke_topleft[0] <= x <= wernicke_bottomright[0] and \
            wernicke_topleft[1] <= y <= wernicke_bottomright[1]:

        return "Wernicke"

    else:

        return ""


def print_table(data_array):

    """
    :param data_array: numpy array with the stored data
    :return: prints out the solution in a table-like format
    """

    print(" Table 1. Coordinates of sites with highest probability of speech interference:")

    print(f"|{header[0]:_^17}|{header[1]:_^17}|{header[2]:_^17}|{header[3]:_^17}|")

    i = 1

    for elt in coordinates_zipped:

        area = wernicke_or_broca(elt[0], elt[1], data_array)

        if area != "":

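            # read x and y from the axis labels instead of relying on a hard-coded offset
            print(f"|{i:_^17}|{data_array[0, elt[1]]:_^17}|{data_array[elt[0], 0]:_^17}|{area:_^17}|")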

            i += 1

print_table(data)


 Table 1. Coordinates of sites with highest probability of speech interference:
|_______Nr________|________X________|________Y________|______Area_______|
|________1________|________3________|________2________|____Wernicke_____|
|________2________|_______-4________|________3________|______Broca______|
|________3________|_______-3________|________3________|______Broca______|
|________4________|________1________|________3________|____Wernicke_____|
|________5________|________3________|________3________|____Wernicke_____|
|________6________|_______-4________|________4________|______Broca______|
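The table printed above pads every cell with underscores and uses a width of 17, while the example in the task description pads only the header with underscores and uses a width of 15. A minimal sketch of that layout is given below. It is an illustration only: the grid is rebuilt by hand from the six maximal sites listed above, and `grid`, `half`, `area_of` and `lines` are hypothetical names. Unlike `print_table`, it does not skip sites that lie outside both areas.

```python
import numpy as np

# Hypothetical 13 x 13 grid holding only the six maximal sites (row, column)
grid = np.zeros((13, 13), dtype=int)
for row, col in [(8, 9), (9, 2), (9, 3), (9, 7), (9, 9), (10, 2)]:
    grid[row, col] = 100

half = grid.shape[0] // 2  # index of the central site (x = 0, y = 0)

def area_of(x, y):
    """Return the language area containing (x, y), or an empty string."""
    if -5 <= x <= -3 and 2 <= y <= 4:
        return "Broca"
    if 1 <= x <= 3 and 1 <= y <= 4:
        return "Wernicke"
    return ""

# Header cells are centred with "_" as fill, data cells with spaces
lines = ["|" + "|".join(f"{name:_^15}" for name in ("Nr", "X", "Y", "Area")) + "|"]

# np.argwhere yields (row, column) pairs in row-major order
for nr, (row, col) in enumerate(np.argwhere(grid == grid.max()), start=1):
    x, y = int(col) - half, int(row) - half
    lines.append(f"|{nr:^15}|{x:^15}|{y:^15}|{area_of(x, y):^15}|")

print("\n".join(lines))
```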
