<center><h1> Introduction to Scientific Packages I (Numpy) </h1></center>


> We will cover:
> 1. What are Packages and Modules
> 2. Importing modules
> 3. Basics of Numpy
>> - Arrays
>> - Functions that create arrays
>> - Indexing and Slicing
>> - Math operations with arrays (array-wise vs element-wise)
>> - More advanced math operations
>> - Boolean operations and masking
>> - Fancy indexing



## Packages and Modules
Python code is organized into "modules" and "packages": libraries of premade, optimized functions that you can use. The number of functions can be overwhelming, but each library has documentation that you can find on its website. For now, we will work with data that I've preprocessed and focus on loading, transforming, and visualizing it. We will move to analysis later. This can be handled with a small number of packages that I've included in the next cell.

Visual analysis of raw data helps us build an intuition for phenomena that may be there. If we have sufficient domain knowledge, we can direct our deeper analyses based on our intuitions after just looking at the raw data. This is also important when working with physiologists who want to see an observed phenomenon first, before more abstract analyses.


<u>Main modules of interest</u>:
> <b>NumPy</b> : Essential functions for scientific programming. Very efficient for manipulating data vectors or "arrays". Much like Matlab, if you're familiar with it. This will be your most widely used package for data science in neuroscience. Most other scientific packages build on Numpy for their basic functionality, making Numpy a necessary package to be familiar with. (http://www.numpy.org)


Essentially, modules are used by specifying a package <i>object</i>'s name (e.g., numpy) and then an <i>attribute</i> (e.g., functions like numpy.ones(), which makes a 1D vector or n-dimensional array filled with ones). We can shorten function calls by storing an abbreviated module name, for example numpy as np, so that we can use np.ones() instead. It's not necessary, but it is common practice so that your code can be read easily by others.
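
As a minimal sketch of the object-then-attribute pattern described above (the variable names `v` and `m` are only for illustration):

```python
import numpy as np  # store the module under the short name "np"

# np.ones is an attribute (a function) of the numpy module object
v = np.ones(3)        # 1D array holding three ones
m = np.ones((2, 3))   # 2D array with 2 rows and 3 columns

print(v)        # [1. 1. 1.]
print(m.shape)  # (2, 3)
```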

## Importing Modules

In [6]:
# importing the whole package
import brain_package
import brain_package as bp

# importing modules within the package
import brain_package.utils as utils
from brain_package import utils
from brain_package.utils import * # importing all functions within the module


> <b>Note:</b> The above serve as examples of importing modules to access the same functions. If the functions that you want are within a submodule (or a submodule of a submodule, etc) then it's easiest to import through dot notation just that specific submodule within the package library.
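
Since `brain_package` is specific to this course, here is a small sketch of the same import styles using the standard library's `os` package instead, which should run anywhere. The name `same` is just an illustrative variable; the point is that every import route reaches the same function object.

```python
import os                  # import the whole package
import os.path as osp      # import a submodule under a short name
from os import path        # import the submodule by name
from os.path import join   # import a single function from the submodule

# every route above reaches the very same function object
same = (os.path.join is osp.join) and (osp.join is path.join) and (path.join is join)
print(same)  # True
```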

In [7]:
dir(brain_package)

['__doc__',
 '__file__',
 '__loader__',
 '__name__',
 '__package__',
 '__path__',
 '__spec__',
 'utils']

In [8]:
dir(bp)

['__doc__',
 '__file__',
 '__loader__',
 '__name__',
 '__package__',
 '__path__',
 '__spec__',
 'utils']

In [9]:
help(utils)

Help on module brain_package.utils in brain_package:

NAME
    brain_package.utils

FUNCTIONS
    gauss(x, y, center=(5, 5), sigma=3.5)
        Creates a 2-D gaussian.
        
        INPUT
        - x, y (float) = range of xs and ys
        - center (tuple/(float,float)) = mean of the gaussian
        - sigma (float) = standard deviation of gaussian
        
        OUTPUT
        - (float) = value of the gaussian at (x,y)
    
    get_spaced_colors(n)
        Function used for creating evenly spaced colors for n number of unique stimulus parameters 
        
        INPUT
        - n (int/float) = number of colors needed 
        
        OUTPUT
        - (list) = list of colors that are uniformly spaced visually

FILE
    /home/vrhaynes/PythonDataCamp/brain_package/utils.py




> Let's import the modules we need for the rest of this tutorial now.

In [10]:
# Import modules - stylistically, these are kept near the top of your file and grouped together
import numpy as np

import matplotlib.pyplot as plt 
# the percent sign below is something called Jupyter magic making the Jupyter Notebook interactive
# - (In this case, just plotting figures inside the notebook.)
%matplotlib inline 


## Basics of Numpy

### What are arrays?
***Arrays*** are a very useful concept shared by many programming languages. Arrays are:

<ol>
<li>Containers that hold many items, all of the same type. 
<li>Represented by a rectangular structure. Arrays may have any number of dimensions, but each dimension (axis) of the array has a fixed length.
</ol>

For example:

<ul>
<li>A 1D array of 10M samples recorded from an electrode that shows electrical potential over time

<table style='margin: 10px; margin-left: 50px; background-color: #FFF'>
<tr><td style="background-color: #fff; border: 1px solid #000;">0.0531</td><td style="background-color: #fff; border: 1px solid #000;">0.0547</td><td style="background-color: #fff; border: 1px solid #000;">0.0522</td><td style="background-color: #fff; border: 1px solid #000;">0.0525</td><td style="background-color: #fff; border: 1px solid #000;">0.0536</td><td style="background-color: #fff; border: 1px solid #000;">0.0531</td><td style="background-color: #fff; border: 1px solid #000;">&nbsp;&nbsp;.&nbsp;&nbsp;&nbsp;.&nbsp;&nbsp;&nbsp;.&nbsp;&nbsp;</td><td style="background-color: #fff; border: 1px solid #000;">0.0530</td></tr>
</table>

<li>A 5x5 (2D) matrix

<table style='margin: 10px; margin-left: 50px'>
<tr><td style="background-color: #fff; border: 1px solid #000;">&nbsp;1&nbsp;</td><td style="background-color: #fff; border: 1px solid #000;">&nbsp;0&nbsp;</td><td style="background-color: #fff; border: 1px solid #000;">&nbsp;0&nbsp;</td><td style="background-color: #fff; border: 1px solid #000;">&nbsp;0&nbsp;</td><td style="background-color: #fff; border: 1px solid #000;">&nbsp;0&nbsp;</td></tr>
<tr><td style="background-color: #fff; border: 1px solid #000;">&nbsp;0&nbsp;</td><td style="background-color: #fff; border: 1px solid #000;">&nbsp;1&nbsp;</td><td style="background-color: #fff; border: 1px solid #000;">&nbsp;0&nbsp;</td><td style="background-color: #fff; border: 1px solid #000;">&nbsp;0&nbsp;</td><td style="background-color: #fff; border: 1px solid #000;">&nbsp;0&nbsp;</td></tr>
<tr><td style="background-color: #fff; border: 1px solid #000;">&nbsp;0&nbsp;</td><td style="background-color: #fff; border: 1px solid #000;">&nbsp;0&nbsp;</td><td style="background-color: #fff; border: 1px solid #000;">&nbsp;1&nbsp;</td><td style="background-color: #fff; border: 1px solid #000;">&nbsp;0&nbsp;</td><td style="background-color: #fff; border: 1px solid #000;">&nbsp;0&nbsp;</td></tr>
<tr><td style="background-color: #fff; border: 1px solid #000;">&nbsp;0&nbsp;</td><td style="background-color: #fff; border: 1px solid #000;">&nbsp;0&nbsp;</td><td style="background-color: #fff; border: 1px solid #000;">&nbsp;0&nbsp;</td><td style="background-color: #fff; border: 1px solid #000;">&nbsp;1&nbsp;</td><td style="background-color: #fff; border: 1px solid #000;">&nbsp;0&nbsp;</td></tr>
<tr><td style="background-color: #fff; border: 1px solid #000;">&nbsp;0&nbsp;</td><td style="background-color: #fff; border: 1px solid #000;">&nbsp;0&nbsp;</td><td style="background-color: #fff; border: 1px solid #000;">&nbsp;0&nbsp;</td><td style="background-color: #fff; border: 1px solid #000;">&nbsp;0&nbsp;</td><td style="background-color: #fff; border: 1px solid #000;">&nbsp;1&nbsp;</td></tr>
</table>

In [11]:
# we can test the performance with the time module 
from time import perf_counter # don't worry about this! - you can also import just single functions!!!
import random # useful for getting random numbers



In [12]:
# Recall that arrays are similar to lists. Lists can hold several elements of a single type
# and these can be nested.
n_rows = 10
n_cols = 12

# start timer
tic = perf_counter()

list_array = []
for x in range(n_rows):
    row = []
    for y in range(n_cols):
        row.append(int(10*random.random()))
    list_array.append(row)
    
    
# stop timer   
toc = perf_counter()
print('The list array took %.6f seconds'%(toc-tic))
t1 = toc-tic

list_array



The list array took 0.000359 seconds


[[8, 4, 7, 4, 6, 6, 0, 8, 2, 7, 6, 1],
 [4, 1, 2, 7, 9, 0, 4, 5, 3, 0, 0, 0],
 [5, 1, 8, 5, 0, 8, 5, 2, 3, 0, 9, 3],
 [0, 4, 5, 7, 8, 9, 1, 5, 5, 2, 4, 4],
 [5, 8, 6, 1, 1, 2, 8, 4, 9, 7, 0, 9],
 [9, 8, 8, 3, 9, 6, 8, 2, 7, 5, 2, 7],
 [6, 4, 1, 1, 8, 4, 9, 7, 7, 4, 8, 2],
 [1, 9, 9, 2, 5, 9, 3, 8, 8, 4, 5, 5],
 [3, 2, 0, 4, 0, 7, 6, 1, 5, 3, 8, 0],
 [3, 8, 9, 1, 7, 8, 5, 3, 7, 2, 8, 4]]

In [13]:
list_array[2,0], list_array[0,2] # this doesn't work 



TypeError: list indices must be integers or slices, not tuple

In [14]:
list_array[2][0], list_array[0][2] # which row then which column



(5, 7)

In [15]:
# operations on this are even more awkward...
tic = perf_counter()
means = []
for y in range(n_cols):
    col = [row[y] for row in list_array]
    means.append(sum(col)/float(n_rows)) # this may get confusing


toc = perf_counter()
print('The awkward array took %.6f seconds'%(toc-tic))
t2 = toc-tic

means



The awkward array took 0.000381 seconds


[4.4, 4.9, 5.5, 3.5, 5.3, 5.9, 4.9, 4.5, 5.6, 3.4, 5.0, 3.5]

In [16]:
# Numpy!

tic = perf_counter()
array = np.random.randint(10,size=(n_rows, n_cols)) # creates a random array of integers from 0 to 9 (the upper bound, 10, is excluded) with size n_rows x n_cols
means = array.mean(axis=0) # computes the mean of each column (averaging along axis 0, i.e., down the rows)

toc = perf_counter()
print('The Numpy array took %.6f seconds vs %.6f seconds for list array'%(toc-tic,t1+t2)) # this would scale with the size of the array

array, means



The Numpy array took 0.000569 seconds vs 0.000740 seconds for list array


(array([[3, 3, 0, 2, 0, 8, 4, 3, 4, 6, 5, 7],
        [6, 8, 8, 6, 7, 2, 5, 3, 7, 0, 3, 7],
        [1, 5, 7, 7, 5, 1, 2, 6, 6, 8, 3, 9],
        [4, 2, 2, 2, 3, 1, 8, 4, 9, 7, 5, 0],
        [6, 0, 6, 2, 3, 1, 8, 7, 5, 1, 8, 7],
        [8, 5, 7, 9, 2, 9, 9, 4, 6, 1, 7, 6],
        [2, 2, 8, 9, 5, 2, 2, 1, 5, 7, 1, 8],
        [7, 4, 3, 7, 7, 7, 5, 5, 3, 8, 6, 7],
        [2, 3, 1, 5, 8, 5, 1, 1, 4, 5, 8, 9],
        [1, 9, 0, 8, 2, 1, 7, 1, 5, 8, 0, 3]]),
 array([4. , 4.1, 4.2, 5.7, 4.2, 3.7, 5.1, 3.5, 5.4, 5.1, 4.6, 6.3]))
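
The `axis` argument is easy to misread, so here is a small sketch on a 2x3 array (the names `demo`, `col_means` and `row_means` are illustrative). The axis you pass is the one that gets collapsed.

```python
import numpy as np

demo = np.arange(6).reshape(2, 3)  # 2 rows, 3 columns
# [[0 1 2]
#  [3 4 5]]

col_means = demo.mean(axis=0)  # collapse the rows    -> one mean per column
row_means = demo.mean(axis=1)  # collapse the columns -> one mean per row

print(col_means)  # [1.5 2.5 3.5]
print(row_means)  # [1. 4.]
```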

In [17]:
# arrays have their own instance attributes (they are instances of a class!)
array.ndim, array.shape, array.size, array.itemsize # this last one says that each item uses 8 bytes



(2, (10, 12), 120, 8)

In [18]:
array.dtype # 64-bit integers



dtype('int64')

> All elements in the array are of the same data type, here a 64-bit (8-byte) integer.
> 
> This is the default ***dtype*** for integers on most platforms (floating-point arrays default to float64). We may want to change the dtype if we are using large amounts of data and are worried about memory/speed.
> 
> This is a list of the different dtypes, their sizes, the numerical precision, and the range of values they can represent.

<table style="margin-left: 50px">
<tr><td> dtype  </td><td> bytes     </td><td> precision (decimal digits) </td><td> approx. range       </td></tr>
<tr><td>float64 </td><td> 8         </td><td> 16         </td><td> ±10<sup>308</sup>   </td></tr>
<tr><td>float32 </td><td> 4         </td><td> 7          </td><td> ±10<sup>38</sup>    </td></tr>
<tr><td>int64   </td><td> 8         </td><td> 0          </td><td> ±10<sup>18</sup>    </td></tr>
<tr><td>int32   </td><td> 4         </td><td> 0          </td><td> ±10<sup>9</sup>     </td></tr>
<tr><td>int16   </td><td> 2         </td><td> 0          </td><td> ±10<sup>4</sup>     </td></tr>
<tr><td>uint64  </td><td> 8         </td><td> 0          </td><td> 0 to 10<sup>19</sup></td></tr>
<tr><td>uint32  </td><td> 4         </td><td> 0          </td><td> 0 to 10<sup>9</sup> </td></tr>
<tr><td>uint16  </td><td> 2         </td><td> 0          </td><td> 0 to 10<sup>4</sup> </td></tr>
<tr><td>uint8   </td><td> 1         </td><td> 0          </td><td> 0-255               </td></tr>
<tr><td>bool    </td><td> 1         </td><td> 0          </td><td> 0-1                 </td></tr>
</table>
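
If you would rather query these limits than memorize the table, a sketch along these lines should work with `np.iinfo` (integers) and `np.finfo` (floats). The dictionary `int_ranges` is just an illustrative container, and the precision NumPy reports is approximate, so it may differ slightly from the rounded figures in the table.

```python
import numpy as np

# Integer dtypes: exact minimum and maximum representable values
int_ranges = {}
for int_type in (np.int16, np.int32, np.int64, np.uint8):
    info = np.iinfo(int_type)
    int_ranges[int_type.__name__] = (info.min, info.max)
    print(int_type.__name__, info.min, info.max)

# Floating-point dtypes: largest value and approximate decimal digits of precision
for float_type in (np.float32, np.float64):
    info = np.finfo(float_type)
    print(float_type.__name__, info.max, info.precision)
```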


> All dtypes: https://docs.scipy.org/doc/numpy-1.10.1/user/basics.types.html<br>
> All ndarray attributes: https://docs.scipy.org/doc/numpy/reference/generated/numpy.ndarray.html

----------------------

To convert an array to a different dtype, use the *astype()* method.

In [19]:
# compare a converted type
converted_array = array.astype('int32')

print(array.size*array.itemsize, 'bytes >', converted_array.size*converted_array.itemsize, 'bytes')



960 bytes > 480 bytes
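
Saving memory this way is only safe when the values fit in the smaller type. The sketch below (with illustrative names `big`, `small`, `fractions` and `truncated`) shows what can go wrong: astype() does not raise an error by default, it silently wraps or truncates.

```python
import numpy as np

big = np.array([100, 200, 300])   # default integer dtype
small = big.astype('uint8')       # uint8 can only hold 0-255

# 300 does not fit, so it wraps around modulo 256 (300 - 256 = 44)
print(small)

fractions = np.array([1.7, -1.7])
truncated = fractions.astype('int32')  # the fractional part is dropped (toward zero)
print(truncated)
```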


In [None]:
# check out others with tab completion
array.



### Functions that create basic arrays


In [20]:
# Our nested list can be remade into an array
list_to_array = np.array(list_array)

list_to_array



array([[8, 4, 7, 4, 6, 6, 0, 8, 2, 7, 6, 1],
       [4, 1, 2, 7, 9, 0, 4, 5, 3, 0, 0, 0],
       [5, 1, 8, 5, 0, 8, 5, 2, 3, 0, 9, 3],
       [0, 4, 5, 7, 8, 9, 1, 5, 5, 2, 4, 4],
       [5, 8, 6, 1, 1, 2, 8, 4, 9, 7, 0, 9],
       [9, 8, 8, 3, 9, 6, 8, 2, 7, 5, 2, 7],
       [6, 4, 1, 1, 8, 4, 9, 7, 7, 4, 8, 2],
       [1, 9, 9, 2, 5, 9, 3, 8, 8, 4, 5, 5],
       [3, 2, 0, 4, 0, 7, 6, 1, 5, 3, 8, 0],
       [3, 8, 9, 1, 7, 8, 5, 3, 7, 2, 8, 4]])

<h1></h1>
***Exercise 1*** Create an array with shape (2,3,4). Don't use random. Hardcode the structure of the array.

In [25]:
dim1 = 2
dim2 = 3
dim3 = 4

# here we have 2 3x4 lists
array = np.array([[[1,2,3,4],    # first 3x4 list
                   [5,6,7,8],
                   [9,10,11,12]],
                  [[13,14,15,16], # second 3x4 list
                   [17,18,19,20],
                   [21,22,23,24]]]
)

print(array)
print(array.shape)



[[[ 1  2  3  4]
  [ 5  6  7  8]
  [ 9 10 11 12]]

 [[13 14 15 16]
  [17 18 19 20]
  [21 22 23 24]]]
(2, 3, 4)


In [26]:
# creating zeros
array1 = np.zeros((2,2),dtype='uint16')

# creating ones
array2 = np.ones((3,3))

# creating zeros with the same shape and type as another array
array3 = np.zeros_like(array2)

# ... same for ones
array4 = np.ones_like(array1)



In [27]:
print('--1--\n',array1, '\n--2--\n',array2, '\n--3--\n',array3, '\n--4--\n',array4)

--1--
 [[0 0]
 [0 0]] 
--2--
 [[1. 1. 1.]
 [1. 1. 1.]
 [1. 1. 1.]] 
--3--
 [[0. 0. 0.]
 [0. 0. 0.]
 [0. 0. 0.]] 
--4--
 [[1 1]
 [1 1]]


In [28]:
array4, array3

(array([[1, 1],
        [1, 1]], dtype=uint16),
 array([[0., 0., 0.],
        [0., 0., 0.],
        [0., 0., 0.]]))

> Notice that the dtype is only printed when it is not the default: here, for the array we explicitly set it for (array1) and for the one that "copied" it (array4).

In [33]:
# Numpy produces ranges of values too (like Python's built-in range)
np.arange(4), np.arange(0,4)



(array([0, 1, 2, 3]), array([0, 1, 2, 3]))

In [30]:
np.arange(4,10)



array([4, 5, 6, 7, 8, 9])

### Indexing and slicing arrays
> Indexing and slicing are two ways of accessing values from arrays. Both are part of the built-in functionality of Python, but they work slightly differently with Numpy.

In [34]:
array = np.array([[i for i in range(4)],
         [j for j in range(4,8)],
         [k for k in range(8,12)]])



In [36]:
array

array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])

In [35]:
array.shape

(3, 4)

### Indexing

In [37]:
print(array)

# reading a single value, we can provide a comma-separated list of indices
print(array[0,3])

# ... or chain brackets like we would with lists - this works, but it indexes in two steps (row, then item), so prefer the form above!
print(array[0][3])

# we can also change values in an array
array[0,3] = -1
print(array)



[[ 0  1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]]
3
3
[[ 0  1  2 -1]
 [ 4  5  6  7]
 [ 8  9 10 11]]


<h1></h1>

***Exercise 2*** Create a 2D array of any size. Using a nested for-loop, add the value 3 to all of the elements in the array. 

In [41]:
n_rows = 2
n_cols = 2
X = np.ones((n_rows,n_cols))
print(X)

# example 1
for i,row in enumerate(X):
    for j,item in enumerate(row):
        X[i,j] = item+3

print(X)

# example 2
for i in range(n_rows):
    for j in range(n_cols):
        X[i,j]+=3
        
print(X)




[[1. 1.]
 [1. 1.]]
[[4. 4.]
 [4. 4.]]
[[7. 7.]
 [7. 7.]]


### Slicing
Unlike indexing, slicing an array allows us to access rectangular subregions of an array. You will be doing this a lot, particularly when you want to analyze labeled data!

When we access a <i>slice</i> of an array, we are NOT creating a copy of the data. Numpy creates a new array object that is a <i>view</i> onto the original data in memory, so when we modify the slice we are also modifying the original array! If you need an independent copy, call the slice's copy() method.

One major difference between a ***slice*** and an ***index*** is that a slice maintains the number of axes (dimensions), while an index removes the axis it is applied to.

<b>NOTE:</b> BEFORE you run the next cell, let's think about it together.

In [45]:
X = np.arange(10)
X[-3:]

array([7, 8, 9])

In [73]:
array = np.arange(100).reshape(10,10)

print(array)

[[ 0  1  2  3  4  5  6  7  8  9]
 [10 11 12 13 14 15 16 17 18 19]
 [20 21 22 23 24 25 26 27 28 29]
 [30 31 32 33 34 35 36 37 38 39]
 [40 41 42 43 44 45 46 47 48 49]
 [50 51 52 53 54 55 56 57 58 59]
 [60 61 62 63 64 65 66 67 68 69]
 [70 71 72 73 74 75 76 77 78 79]
 [80 81 82 83 84 85 86 87 88 89]
 [90 91 92 93 94 95 96 97 98 99]]


In [74]:
# here is an example of a slice
array_sliced = array[3:7,3:7]
print(array_sliced)



[[33 34 35 36]
 [43 44 45 46]
 [53 54 55 56]
 [63 64 65 66]]


In [46]:
X = np.random.randint(0,10,size=(100,100,50))

# we don't want to look at this whole thing. So let's slice and index it.
# Here, we want the same item values from the original array.
X_slice = X[:1,-1:,:20]
X_index = X[0,-1,:20]

print('Slice vs Index \n--------------')
print(X_slice,X_index)

# Different Python objects (id() identifies the array object, not the underlying data)
print(id(X_slice),id(X_index))

# These have the same number of elements
print('Number of elements:',X_slice.size,X_index.size)

# ... different lengths
print('Lengths:',len(X_slice),len(X_index))

# ... and only the slice maintains all axes (even those of length 1)
print('Shapes:',X_slice.shape,X_index.shape)


Slice vs Index 
--------------
[[[2 5 4 4 4 2 1 6 7 4 8 2 6 4 4 3 4 7 4 9]]] [2 5 4 4 4 2 1 6 7 4 8 2 6 4 4 3 4 7 4 9]
140092472927888 140092472927120
Number of elements: 20 20
Lengths: 1 20
Shapes: (1, 1, 20) (20,)


> <b>CAUTIONARY NOTE</b> Always know which you're working with! Array operations are sensitive to shape. Let's look at a confusing example.

In [48]:
# can also assign values to slices like when we index
array = np.arange(10)
print(array)

new_slice = array[3:7]
print(new_slice)

# assigning to the variable only rebinds the name new_slice; it does NOT change the array
new_slice = -1
print(new_slice)

# ...so like this
array[3:7] = -1
print(array)



[0 1 2 3 4 5 6 7 8 9]
[3 4 5 6]
-1
[ 0  1  2 -1 -1 -1 -1  7  8  9]


In [49]:
new_slice = array[3:7]
new_slice[0:3] = -1

In [51]:
array

array([ 0,  1,  2, -1, -1, -1, -1,  7,  8,  9])
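
The cell above leaves the array looking unchanged only because those elements were already -1. Here is a minimal sketch that makes the view behavior visible and contrasts it with an explicit copy (the names `original`, `view` and `independent` are illustrative).

```python
import numpy as np

original = np.arange(10)

view = original[3:7]                 # basic slice: a view onto original's data
independent = original[3:7].copy()   # explicit copy with its own data

view[0] = -99          # writes through to original[3]
independent[1] = 555   # leaves original[4] untouched

print(original[3], original[4])                 # -99 4
print(np.shares_memory(original, view))         # True
print(np.shares_memory(original, independent))  # False
```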

### ... and then there are strides.

In [52]:
X = np.arange(11)

# The basic layout of the stride for numpy arrays is [low:high:stride size]
print('X1:',X)
print('X2:',X[0:11:2]) # this third index shows every other
print('X3:',X[2::2])   # you don't need the upper bound
print('X4:',X[:8:2])   # you don't need the lower bound
print('X5:',X[::2])    # you don't actually need anything other than the stride size (NOTE: X5<=>X2)
print('X6:',X[2::])    # this is the same as the next
print('X7:',X[2:])     # ... so default high is the size of the axis and default stride is 1

X1: [ 0  1  2  3  4  5  6  7  8  9 10]
X2: [ 0  2  4  6  8 10]
X3: [ 2  4  6  8 10]
X4: [0 2 4 6]
X5: [ 0  2  4  6  8 10]
X6: [ 2  3  4  5  6  7  8  9 10]
X7: [ 2  3  4  5  6  7  8  9 10]
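
Strides can also be negative, which walks the array backwards. A short sketch (the variable names are illustrative):

```python
import numpy as np

X = np.arange(11)

reversed_X = X[::-1]     # the whole array, reversed
every_other = X[8:2:-2]  # from index 8 down to (but not including) index 2, in steps of 2
from_end = X[-1::-5]     # start at the last element and step back by 5

print(reversed_X)   # [10  9  8  7  6  5  4  3  2  1  0]
print(every_other)  # [8 6 4]
print(from_end)     # [10  5  0]
```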


### Math operations with arrays 

***"Array-wise"***
These are the conventional operations you may be familiar with from linear algebra. Multiplying two arrays (with the appropriate dimensions) produces inner and outer products, i.e., a scalar or another array. To do these, we need to call specific Numpy functions (for example, np.dot or np.outer) rather than use the * operator.


***Element-wise***
These operations apply to each of the elements independently. The operations could be identical or depend on the position within the array.
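
To make the distinction concrete, here is a small sketch contrasting the element-wise `*` with the linear-algebra products from `np.dot`, `np.outer` and the `@` operator (the arrays `a`, `b` and `M` are illustrative).

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

elementwise = a * b     # element-wise: [1*4, 2*5, 3*6]
inner = np.dot(a, b)    # inner product: 1*4 + 2*5 + 3*6, a single number
outer = np.outer(a, b)  # outer product: a 3x3 array with outer[i, j] = a[i] * b[j]

M = np.arange(6).reshape(2, 3)
matvec = M @ a          # matrix-vector product: one value per row of M

print(elementwise)  # [ 4 10 18]
print(inner)        # 32
print(outer.shape)  # (3, 3)
print(matvec)       # [ 8 26]
```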

Let's look at our nested for-loop exercise again.

In [53]:
array = np.array([[i for i in range(4)],
         [j for j in range(4,8)],
         [k for k in range(8,12)]])

print(array,'\n|->')
print(array+3,'\n<-|') # three is added element-wise
print(array)
array+=3 # if we want to edit it in-place
print(array) 


[[ 0  1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]] 
|->
[[ 3  4  5  6]
 [ 7  8  9 10]
 [11 12 13 14]] 
<-|
[[ 0  1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]]
[[ 3  4  5  6]
 [ 7  8  9 10]
 [11 12 13 14]]


In [56]:
# We can add two arrays together
array1 = np.arange(-4,1)
array2 = np.arange(0,5) # run next cell, then change this


In [57]:
# this only works if the arrays have the same shape (or shapes that Numpy can "broadcast" to match)
array1 + array2


array([-4, -2,  0,  2,  4])
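
Shapes do not always have to be identical: Numpy can "broadcast" a smaller array across a larger one when the trailing dimensions line up. A minimal sketch, with illustrative names, showing one case that works, one that fails, and a way to fix the failure:

```python
import numpy as np

grid = np.zeros((3, 4))
row = np.arange(4)  # shape (4,)
col = np.arange(3)  # shape (3,)

added_rows = grid + row  # works: (3, 4) + (4,), row is added to every row of grid

try:
    grid + col           # fails: trailing dimensions 4 and 3 do not match
    failed = False
except ValueError as err:
    failed = True
    print('ValueError:', err)

# giving col an explicit second axis makes the shapes compatible: (3, 4) + (3, 1)
added_cols = grid + col[:, np.newaxis]

print(added_rows)
print(added_cols)
```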

<h1></h1>

***Exercise 3*** Create an array that has values that are 1 less than powers of 3 for exponents ranging from 0 to 10.  (The only function you need is numpy.arange!!!)

In [59]:
x = 3**np.arange(0,11)-1

x


array([    0,     2,     8,    26,    80,   242,   728,  2186,  6560,
       19682, 59048])

In [60]:
# element wise arithmetic works for multiplication as well
print('Multiplication:',array1*array2) # NOTE: this is not the dot product

# also for division
print('Division:',array1/array2)

# be careful not to divide by zero
array2+=1 
print('Division:',array1/array2)



Multiplication: [ 0 -3 -4 -3  0]
Division: [       -inf -3.         -1.         -0.33333333  0.        ]
Division: [-4.         -1.5        -0.66666667 -0.25        0.        ]




In [62]:
# These work for a single array, too
array1*=2
print(array1)

# NOTE: /= doesn't work for integer arrays (the float result can't be stored in place), but does for float arrays
array2 = np.arange(0,5)
# array2/=2 
array2 = array2/2
print(array2)


[-8 -6 -4 -2  0]
[0.  0.5 1.  1.5 2. ]
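
The in-place division caveat comes down to dtype. A short sketch under that reading (the names `ints` and `floats` are illustrative): a float array accepts `/=`, an integer array rejects it, and `//=` is an integer-preserving alternative.

```python
import numpy as np

ints = np.arange(5)                 # integer dtype
floats = np.arange(5, dtype=float)  # float dtype

floats /= 2   # fine: the result fits in a float array
print(floats)

try:
    ints /= 2  # true division yields floats, which cannot be stored in an integer array
    raised = False
except TypeError as err:
    raised = True
    print('TypeError:', err)

ints //= 2    # floor division keeps the integer dtype
print(ints)
```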


### More advanced math operations with arrays

In [63]:
# this is an array of values from a continuous uniform distribution
# between -5 and 5 (i.e., centered at 0)
array = 10*np.random.random(size=(10,10))-5

# we have basic stats as instance methods
print('mean:',array.mean())
print('std:',array.std())
print('min:',array.min())
print('max:',array.max())

# we have other math functions
print('median:',np.median(array))
print('sum:',np.sum(array))
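print('index of max value:',np.argmax(array)) # NOTE: this is an index into the flattened array
print('index of min value',np.argmin(array))  # ... same here
print('array with only non-negative values:',np.abs(array[:4,0])) # absolute values of the first 4 rows of column 0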

# NOTE: This one is niche, but useful in neuroscience. Can you think of an example?
simple_array = np.arange(5)
print('increment sizes:',np.diff(simple_array)) # this one returns an array with elements 
                             # that are the difference between adjacent elements
                             

mean: 0.12021583740652103
std: 2.91980393970234
min: -4.914167291449758
max: 4.974396680667061
median: 0.46745938231627404
sum: 12.021583740652103
index of max value: 75
index of min value 1
array with only non-negative values: [3.99274649 0.24938535 3.76553528 3.43046668]
increment sizes: [1 1 1 1]
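
Because np.argmax and np.argmin return a position in the flattened array, it can help to convert that position back to a (row, column) pair. A sketch using `np.unravel_index` on a small illustrative array `demo`:

```python
import numpy as np

demo = np.array([[3, 9, 1],
                 [7, 2, 8]])

flat_index = np.argmax(demo)                         # position in the flattened array
row, col = np.unravel_index(flat_index, demo.shape)  # convert back to (row, column)

print(flat_index)      # 1
print(demo[row, col])  # 9

# argmax can also work along one axis: the column of the largest value in each row
per_row = demo.argmax(axis=1)
print(per_row)         # [1 2]
```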


In [92]:
# another useful math-related function is linspace. If you use MATLAB this should be familiar to you
W = np.linspace(0,1,1000) # returns 1000 evenly spaced values between 0 and 1 inclusive
X = np.linspace(0,1,1001) # returns 1001 evenly spaced values between 0 and 1 inclusive
Y = np.arange(0,1,0.001)  # returns values from 0 up to (but not including) 1, spaced at 0.001
Z = np.arange(0,1+0.001,0.001)  # returns values from 0 up to (but not including) 1.001, spaced at 0.001

print('----Using LINSPACE----')
print('--For W--')
print('The beginning:',W[:5])
print('The end:',W[-5:]) # the endpoint is always included; the spacing adapts to the number of points
print('size:',len(W))

print('\n--For X--')
print('The beginning:',X[:5])
print('The end:',X[-5:]) # the endpoint is always included; the spacing adapts to the number of points
print('size:',len(X))

print('\n----Using ARANGE----')
print('--For Y--')
print('The beginning:',Y[:5])
print('The end:',Y[-5:]) # the final value depends on the step size
print('size:',len(Y))

print('\n--For Z--')
print('The beginning:',Z[:5])
print('The end:',Z[-5:]) # the final value depends on the step size
print('size:',len(Z))


----Using LINSPACE----
--For W--
The beginning: [0.       0.001001 0.002002 0.003003 0.004004]
The end: [0.995996 0.996997 0.997998 0.998999 1.      ]
size: 1000

--For X--
The beginning: [0.    0.001 0.002 0.003 0.004]
The end: [0.996 0.997 0.998 0.999 1.   ]
size: 1001

----Using ARANGE----
--For Y--
The beginning: [0.    0.001 0.002 0.003 0.004]
The end: [0.995 0.996 0.997 0.998 0.999]
size: 1000

--For Z--
The beginning: [0.    0.001 0.002 0.003 0.004]
The end: [0.996 0.997 0.998 0.999 1.   ]
size: 1001
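
If you want linspace to behave like arange and leave out the endpoint, it accepts `endpoint=False`; it can also report the spacing it chose via `retstep=True`. A short sketch (the variable names are illustrative):

```python
import numpy as np

half_open = np.linspace(0, 1, 1000, endpoint=False)  # 1000 points, stopping before 1
stepped = np.arange(0, 1, 0.001)                     # same grid, defined by its step size

print(len(half_open), len(stepped))
print(np.allclose(half_open, stepped))  # equal up to floating-point rounding

# retstep=True also returns the spacing between neighboring points
points, step = np.linspace(0, 1, 1001, retstep=True)
print(step)
```

As a rule of thumb, linspace is the safer choice when you care about the number of points, and arange when you care about the exact step.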


### Boolean operations and masking

> Using boolean operations allows us to set true-false conditions for how we index an array. Numpy has several useful ways of doing this. Let's start with the simplest.


In [66]:
truth_array = np.array([[True, False, False, True, False],
  [False, False, True, False, False],
  [False, False, True, False, False]])

print(truth_array)
print(np.where(truth_array))    # returns a tuple of index arrays (one per axis) where the condition is True
print(np.argwhere(truth_array)) # returns a single 2D array with one row of indices per True element

[[ True False False  True False]
 [False False  True False False]
 [False False  True False False]]
(array([0, 0, 1, 2]), array([0, 3, 2, 2]))
[[0 0]
 [0 3]
 [1 2]
 [2 2]]


In [67]:
# one returns "x"s separate from "y"s
# the other as ordered pairs of "x"s and "y"s
type(np.where(truth_array)),type(np.argwhere(truth_array))

(tuple, numpy.ndarray)
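
The two return formats suit different jobs. In this sketch (with an illustrative array `data`), the tuple from np.where is used directly as an index, the pairs from np.argwhere are easier to loop over, and the three-argument form of np.where picks between two values element by element.

```python
import numpy as np

data = np.array([[5, -1, 7],
                 [-3, 4, -2]])

rows, cols = np.where(data < 0)  # tuple of index arrays, one per axis
negatives = data[rows, cols]     # can be used directly to pull out the values
print(negatives)                 # [-1 -3 -2]

pairs = np.argwhere(data < 0)    # one (row, column) pair per match
print(pairs.shape)               # (3, 2)

clipped = np.where(data < 0, 0, data)  # 0 where the condition holds, data elsewhere
print(clipped)
```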

In [68]:
array = np.arange(10)

# this works for mathematical conditions, too
np.argwhere(array>5)



array([[6],
       [7],
       [8],
       [9]])

<h1></h1>

***Exercise 4*** Create a "step" function array (values are either 0 or 1). Return the indices with values greater than 0.

In [70]:
X = np.zeros(10)
X[3:7] = 1

print(X)

np.argwhere(X>0)



[0. 0. 0. 1. 1. 1. 1. 0. 0. 0.]


array([[3],
       [4],
       [5],
       [6]])

### Masking and fancy indexing
> A ***mask*** is a boolean array, built from a condition on the array but stored separately from it, that we then use to select the elements of the array where the condition is True. ***Fancy indexing*** refers to obtaining values of the array with explicitly defined indices.

In [71]:
array = np.arange(10)

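# we define masks using parentheses and the element-wise boolean operators &, | and ~
mask1 = (array>2) & (array<4) # masking using and
mask2 = (array<2) | (array>4) # masking using or
mask3 = (array%2)             # NOTE: these are integers (0s and 1s), not booleans, so they act as fancy indices
mask4 = np.argwhere(~array%2) # example with negation (bitwise NOT on integers); argwhere returns the indices of the even values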

print(array[mask1])
print(array[mask2])
print(array[mask3])
print(array[mask4]) # notice the shape of this



[3]
[0 1 5 6 7 8 9]
[0 1 0 1 0 1 0 1 0 1]
[[0]
 [2]
 [4]
 [6]
 [8]]
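
For comparison, here is a sketch of selecting odd and even values with a true boolean mask, which is one way to get the values themselves rather than repeated 0s and 1s or a column of indices (the names `values`, `odd_mask` and `zeroed` are illustrative).

```python
import numpy as np

values = np.arange(10)

odd_mask = (values % 2 == 1)  # boolean array: True where the value is odd
odds = values[odd_mask]
evens = values[~odd_mask]     # ~ on a boolean array is a logical NOT

print(odds)   # [1 3 5 7 9]
print(evens)  # [0 2 4 6 8]

# masks also work for assignment: set every odd value to 0 in a copy
zeroed = values.copy()
zeroed[odd_mask] = 0
print(zeroed)  # [0 0 2 0 4 0 6 0 8 0]
```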


In [None]:
# fancy indices can be useful if we don't have 
# a simple way of defining the index conditions
array = array+10 

fancy = [1,3,4,5,8]
print(array[fancy])
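
Two details about fancy indexing are worth a quick sketch (all names here are illustrative): unlike a slice, the result is a copy, and on a 2D array a pair of index lists picks out individual (row, column) elements.

```python
import numpy as np

values = np.arange(10, 20)

picked = values[[1, 3, 4, 5, 8]]  # fancy indexing returns a copy, not a view
picked[0] = -1                    # so this does not change values
print(values[1])                  # 11

grid = np.arange(12).reshape(3, 4)
corners = grid[[0, 2], [1, 3]]    # elements at (0, 1) and (2, 3)
print(corners)                    # [ 1 11]

values[[0, 9]] = 0                # assigning THROUGH fancy indices does modify the array
print(values)
```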