# Intro to Python and Numpy

Welcome to the first notebook of this course! In this notebook, we are
going to learn how to represent different kinds of data in Python. We
are also taking our first look at creating arrays in Numpy and we are
going to analyze some actual neuroscience data. Finally, we are going to
explore the differences in performance between Numpy and built-in Python
functions.

Execute the cell below to install the packages required for this
notebook.

In [None]:
%pip install numpy

## 1 Storing Data in Variables

In the first section, we are going to learn how to represent different
kinds of data and store them in variables. We are going to encounter
four basic data types: integers, floating-point numbers, Boolean values
and text strings. We are also going to use lists, which are collections
of data. Data can be assigned to a variable using the `=` operator, which
takes the value on the right and assigns it to the variable on the left.
In this sense, a variable is simply a container that we can use to store
and access data. The data type of a variable can be determined with the
`type()` function. We can also convert variables from one type to
another - for example, the `int()` function will try to convert a
variable to an integer. Finally, Python provides operators for the
arithmetic operations like addition `+`, subtraction `-`, multiplication
`*` and division `/`. Let’s test how this works!

| Code | Description |
|------------------------------------|------------------------------------|
| `x = 3.14` | Assign the floating-point number `3.14` to the variable `x` |
| `x = True` | Assign the boolean value `True` to the variable `x` |
| `x = "hello"` | Assign the string `"hello"` to the variable `x` |
| `x = [1,2,3]` | Assign the list of integers `[1,2,3]` to the variable `x` |
| `type(x)` | Get the data type of variable `x` |
| `int(x)` | Convert the variable `x` to an integer, if possible |
| `+`, `-`, `*`, `/` | Add, subtract, multiply, divide values |
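
The operations in the table can be combined in a short sketch before starting the exercises. The variable names below (`count`, `ratio`, `flag`, `label`, `values`) are illustrative only and are not used anywhere else in this notebook.

```python
# Illustrative variable names; not part of the exercises.
count = 3            # integer
ratio = 3.14         # floating-point number
flag = True          # Boolean value
label = "hello"      # text string
values = [1, 2, 3]   # list of integers

# type() reports the data type of each variable
print(type(count), type(ratio), type(flag), type(label), type(values))

# int() truncates a float toward zero and maps True/False to 1/0
print(int(ratio))   # 3
print(int(flag))    # 1

# Division with / always returns a float, even for two integers
print(count / 3)    # 1.0

# Mixing an integer and a float in arithmetic yields a float
print(type(count + ratio))
```

Note that `int()` does not round: `int(3.99)` would also give `3`.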

------------------------------------------------------------------------

<span class="theorem-title">**Example 1**</span> Assign the integer
value `1` to a variable called `one` and print its `type()`.

In [11]:
one = 1
type(one)

int

<span class="theorem-title">**Exercise 1**</span> Subtract 0.5 from the
variable `one`

In [12]:
one - 0.5

0.5

<span class="theorem-title">**Exercise 2**</span> Assign the floating-point
value `0.001` to a variable called `small` and print its type.

In [14]:
small = 0.001
type(small)

float

In [15]:
one = 1.0
type(one)

float

<span class="theorem-title">**Exercise 3**</span> Assign the Boolean
value `False` to a variable called `this_is_false` and convert it to an
integer.

In [17]:
this_is_false = False
int(this_is_false)

0

<span class="theorem-title">**Exercise 4**</span> Assign the Boolean
value `True` to a variable called `this_is_true` and convert it to an
integer.

In [5]:
this_is_true = True
int(this_is_true)

1

<span class="theorem-title">**Exercise 5**</span> Assign the string
value `"goodbye"` to a variable called `goodbye` and print its type

In [27]:
goodbye = 'goodbye'
type(goodbye)

str

<span class="theorem-title">**Exercise 6**</span> Add the string
`"hello"` to the variable `goodbye`

In [28]:
goodbye = goodbye + 'hello'
goodbye

'goodbyehello'

<span class="theorem-title">**Exercise 7**</span> Create a list with the
numbers 1 through 6, assign it to a variable called `dice` and print its type.

In [34]:
dice = [1,2,3,4,5,6]
type(dice)

list

<span class="theorem-title">**Exercise 8**</span> Multiply the list
`dice` by 2. What happens?

In [35]:
dice*2

[1, 2, 3, 4, 5, 6, 1, 2, 3, 4, 5, 6]

<span class="theorem-title">**Exercise 9**</span> Try to add 1 to the list. What error message do you observe?

In [36]:
dice + 1

TypeError: can only concatenate list (not "int") to list
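
For lists, `*` means repetition and `+` means concatenation, which is why the two exercises above do not perform arithmetic on the individual elements. One possible way to apply arithmetic to every element of a plain list is a list comprehension. This is a minimal sketch, and the names `doubled`, `plus_one` and `extended` are illustrative.

```python
dice = [1, 2, 3, 4, 5, 6]

# Apply the operation to each element and collect the results in a new list
doubled = [value * 2 for value in dice]
plus_one = [value + 1 for value in dice]

print(doubled)   # [2, 4, 6, 8, 10, 12]
print(plus_one)  # [2, 3, 4, 5, 6, 7]

# Concatenation works as long as both operands are lists
extended = dice + [7]
print(extended)  # [1, 2, 3, 4, 5, 6, 7]
```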

## 2 Analyzing Neural Spiking Data with Numpy

Numpy offers many useful functions for data analysis - let’s test them
on some actual neuroscience data! In this section, we are going to load
and analyze the spiking of a neuron in the primary visual cortex of a
mouse. The spikes are represented as a sorted list of time points (in
seconds) at which spikes were observed. For example, `[0.05, 0.24, 1.5]`
indicates that spikes were observed 50, 240 and 1500 milliseconds after
the start of the recording. Using the functions below, we can answer some interesting
questions about the firing behavior of a given neuron.

| Code | Description |
|------------------------------------|------------------------------------|
| `import numpy as np` | Import the module `numpy` under the alias `np` |
| `x = np.load("data.npy")` | Load the file `"data.npy"` into an array and assign it to the variable `x` |
| `np.size(x)` | Get the total number of elements stored in the array `x` |
| `np.min(x)` | Get the minimum value of the array `x` |
| `np.max(x)` | Get the maximum value of the array `x` |
| `np.mean(x)` | Compute the mean of all values in the array `x` |
| `np.std(x)` | Compute the standard deviation of all values in the array `x` |
| `np.diff(x)` | Compute the difference between consecutive elements in the array `x` |
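
To see how these functions fit together, here is a small sketch that uses the three example spike times from the text instead of the recorded data. The array `toy_spikes` and the other variable names are made up for illustration; the real analysis in the exercises uses `"spikes.npy"`.

```python
import numpy as np

# Toy spike times in seconds, taken from the example in the text;
# this is NOT the recorded data from "spikes.npy".
toy_spikes = np.array([0.05, 0.24, 1.5])

n_spikes = np.size(toy_spikes)      # number of spikes: 3
duration = np.max(toy_spikes)       # time of the last spike: 1.5 s
firing_rate = n_spikes / duration   # 3 / 1.5 = 2.0 spikes per second
intervals = np.diff(toy_spikes)     # approximately [0.19, 1.26]

print(n_spikes, duration, firing_rate)
print(intervals, np.mean(intervals), np.std(intervals))
```

An array of `n` spike times always yields `n - 1` inter-spike intervals, because `np.diff()` subtracts each element from its successor.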

------------------------------------------------------------------------

<span class="theorem-title">**Exercise 10**</span> Import the Numpy
module under the alias `np`.

In [37]:
import numpy as np

<span class="theorem-title">**Exercise 11**</span> Load the file
`"spikes.npy"` into a Numpy array.

In [38]:
spikes = np.load('spikes.npy')

In [42]:
spikes[:10]

array([0.05400352, 0.10857034, 0.26107077, 0.5307382 , 0.56383829,
       0.83653906, 1.21854014, 1.30070704, 2.12090935, 2.26177641])

<span class="theorem-title">**Exercise 12**</span> What is the total
number of spikes in this recording?

In [39]:
np.size(spikes)

721

<span class="theorem-title">**Exercise 13**</span> What is the duration
of the recording (assuming the recording stopped after the last spike
was recorded)?

In [40]:
spikes.max()

np.float64(298.4843451836275)

<span class="theorem-title">**Exercise 14**</span> Compute the neuron’s
average firing rate (the total number of spikes divided by the duration
of the recording).

In [43]:
np.set_printoptions(legacy='1.25')

In [44]:
np.size(spikes)/spikes.max()

2.415537067970653

<span class="theorem-title">**Exercise 15**</span> Compute the
inter-spike intervals (i.e. the time differences between subsequent
spikes).

In [None]:
np.diff(spikes)

array([5.45668206e-02, 1.52500430e-01, 2.69667427e-01, 3.31000934e-02,
       2.72700769e-01, 3.82001077e-01, 8.21668984e-02, 8.20202313e-01,
       1.40867064e-01, 5.27234820e-01, 1.28067028e-01, 1.42000400e-01,
       1.48433752e-01, 7.44468766e-01, 4.85234702e-01, 9.23469271e-01,
       1.14633657e-01, 2.14933940e-01, 2.31833987e-01, 1.33067042e-01,
       8.11102288e-01, 1.93367212e-01, 5.40334857e-02, 2.30433983e-01,
       1.16866996e-01, 1.00566950e-01, 3.59667681e-02, 7.12702010e-01,
       3.39734291e-01, 1.78000502e+00, 8.98369200e-01, 4.10801159e-01,
       6.40335139e-01, 4.96501400e-01, 1.27600360e-01, 1.45180409e+00,
       6.80735253e-01, 6.49868499e-01, 1.13300320e-01, 2.46667362e-01,
       1.29233698e-01, 7.74435517e-01, 1.20443673e+00, 2.88034146e-01,
       1.49060420e+00, 1.80567176e-01, 3.03467523e-01, 1.12310317e+00,
       2.69567427e-01, 9.76002753e-02, 2.34933996e-01, 5.25801483e-01,
       1.39460393e+00, 1.12130316e+00, 8.93702521e-01, 2.91967490e-01,
      

<span class="theorem-title">**Exercise 16**</span> What is the average
inter-spike interval for this neuron?

In [47]:
isi = np.diff(spikes)
isi.mean()

0.4144865856420776

<span class="theorem-title">**Exercise 17**</span> What is the standard
deviation of inter-spike intervals for this neuron?

In [None]:
np.diff(spikes).std()

np.float64(0.47663480650055273)

<span class="theorem-title">**Exercise 18**</span> What is the shortest
time between two spikes?

In [None]:
np.diff(spikes).min()

np.float64(0.0005666682648097776)

## 3 Creating Arrays in Numpy

Numpy also offers many functions for generating arrays. The simplest way
to create an array is to convert a list but there are other functions
for specific purposes like generating arrays of random numbers or
numbers within a certain range. Like variables, Numpy arrays can have
different data types. The type of an array is stored in the `.dtype`
attribute. In this section, we are going to create and explore different
kinds of arrays.

| Code | Description |
|------------------------------------|------------------------------------|
| `x = np.array([2,5,3])` | Create an array from the list `[2,5,3]` and assign it to the variable `x` |
| `x = np.random.randn(100)` | Create an array with 100 normally-distributed random numbers and assign it to the variable `x` |
| `x = np.arange(2,7)` | Create an array with all integers between 2 and (not including) 7 and assign it to the variable `x` |
| `x = np.arange(2,7,0.3)` | Create an array with evenly spaced values between 2 and 7 with a step size of 0.3 and assign it to the variable `x` |
| `x = np.linspace(2,3,10)` | Create an array with 10 evenly spaced values between 2 and 3 and assign it to the variable `x` |
| `x.dtype` | Get the data type of the numpy array `x` |
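
The difference between `np.arange()` and `np.linspace()` is easy to mix up, so here is a brief sketch, with illustrative variable names, that contrasts the two and shows how `.dtype` reflects the kind of data stored in an array.

```python
import numpy as np

# np.arange: fixed step size, the stop value (7) is excluded
ints = np.arange(2, 7)
print(ints)          # [2 3 4 5 6]

# np.linspace: fixed number of values, both endpoints are included
floats = np.linspace(2, 3, 10)
print(floats.size, floats[0], floats[-1])   # 10 2.0 3.0
print(floats.dtype)  # float64

# Mixing integers and floats in one list produces a floating-point array
mixed = np.array([1, 2.5])
print(mixed.dtype)   # float64

# The exact integer dtype (e.g. int64) can depend on the platform
print(np.issubdtype(ints.dtype, np.integer))  # True
```

As a rule of thumb, `np.arange()` is convenient when the step size matters and `np.linspace()` when the number of values matters.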

------------------------------------------------------------------------

<span class="theorem-title">**Example 2**</span> Create an array from
the list `[1, 2, 3]`, assign it to the variable `a` and display its type.

In [54]:
a = np.array([1,2,3])
a

array([1, 2, 3])

<span class="theorem-title">**Exercise 19**</span> Multiply the array
`a` by 2 and add 1 to it

In [55]:
a * 2 + 1

array([3, 5, 7])

<span class="theorem-title">**Exercise 20**</span> Create an array from
the list `[0.1, 0.2, 0.3]`, assign it to the variable `b` and display its
type.

In [56]:
b = np.array([0.1,0.2,0.3])
type(b)

numpy.ndarray

<span class="theorem-title">**Exercise 21**</span> Create an array
from the list `[1, True, "text"]`, assign it to the variable `c` and display
its type.

In [58]:
list_of_different_elements = [1, 1.5, 'text', False]
list_of_different_elements

[1, 1.5, 'text', False]

In [59]:
c = np.array([1, True, "text"])
c

array(['1', 'True', 'text'], dtype='<U21')

In [61]:
type(c[0])

numpy.str_

<span class="theorem-title">**Exercise 22**</span> Try to add 1 to the variable `c`. What error message do you observe?

In [62]:
c+1

UFuncTypeError: ufunc 'add' did not contain a loop with signature matching types (dtype('<U21'), dtype('int64')) -> None

<span class="theorem-title">**Exercise 23**</span> Make an array
containing the integers from 1 to 15.

In [63]:
array_of_numbers = np.arange(1,16,1)
array_of_numbers

array([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15])

<span class="theorem-title">**Exercise 24**</span> Create an array that
contains all even numbers up to and including 100.

In [None]:
np.arange(0,100+2,2)

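array([  0,   2,   4,   6,   8,  10,  12,  14,  16,  18,  20,  22,  24,
        26,  28,  30,  32,  34,  36,  38,  40,  42,  44,  46,  48,  50,
        52,  54,  56,  58,  60,  62,  64,  66,  68,  70,  72,  74,  76,
        78,  80,  82,  84,  86,  88,  90,  92,  94,  96,  98, 100])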

<span class="theorem-title">**Exercise 23**</span> Make an array of only
6 evenly-spaced numbers between 1 and 10.

In [67]:
np.linspace(1,10,6)

array([ 1. ,  2.8,  4.6,  6.4,  8.2, 10. ])

<span class="theorem-title">**Exercise 24**</span> Create an array of 10
normally-distributed random numbers and compute its mean and standard
deviation.

In [None]:
x = np.random.randn(10)
x, x.mean(), x.std()

(array([ 2.34795829,  0.52005694,  1.36875553, -0.66261736,  0.04160531,
        -2.35199175,  0.16371802,  0.66095843,  0.34011498,  0.56342371]),
 np.float64(0.2991982080817157),
 np.float64(1.1675236800496918))

In [71]:
np.random.randn()

1.1873709433638768

<span class="theorem-title">**Exercise 25**</span> Now, create arrays
of 100 and 1000 normally-distributed random numbers and compute
their means and standard deviations.

In [72]:
x = np.random.randn(100)
x

array([ 0.09995222, -0.79858309, -0.58740017,  0.67346384, -0.87992126,
        0.13992849,  0.8490937 , -0.21670657,  0.26340046,  1.16125726,
       -0.71996176, -0.32060423, -0.21888624, -1.75468802,  0.07823674,
        0.59847714, -1.44610935, -0.14909564, -0.1903433 , -0.67847888,
       -1.08933325, -0.10057422,  1.10417093,  0.67961533,  1.91178015,
       -0.22458347,  2.18035083,  0.23734406, -1.24812332,  0.70920045,
       -1.13351337, -1.23826799,  0.55232649, -0.4131826 ,  0.73582418,
       -1.29748587,  0.47273285,  1.2661954 , -1.04886771,  0.3175788 ,
       -0.07841847, -0.0024405 , -1.08314188, -0.46205594,  1.25205768,
       -0.27237946,  0.30808409, -0.01569304, -1.09751897, -0.33345094,
       -1.10208373, -1.75513551,  1.53234599,  0.73935623,  0.33808026,
       -2.21620997,  0.80615859,  0.19841355, -0.29090015,  0.02971049,
        0.36636089, -0.69211102, -0.79704694, -0.87200123, -1.16067853,
        0.25867737,  0.89035821, -1.78806491, -0.82810299, -0.76

In [73]:
mean_of_x = np.mean(x)
mean_of_x

-0.09600715263400218

In [74]:
x.mean()

-0.09600715263400218

In [None]:
x = np.random.randn(1000)
x.mean(), x.std()

(np.float64(-0.023275449006644897), np.float64(0.9564360001195195))
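
Because the numbers are random, the means and standard deviations above change every time the cells are run. The following sketch shows one way to make such an experiment repeatable and to see how the estimates behave as the sample grows. It assumes NumPy's `np.random.default_rng()` generator, which is a different interface than the `np.random.randn()` function used in the exercises, and the seed value `0` is an arbitrary choice.

```python
import numpy as np

# A seeded generator produces the same random numbers on every run.
# The seed value 0 is arbitrary.
rng = np.random.default_rng(seed=0)

# Larger samples tend to give a mean closer to 0 and a standard deviation closer to 1
for n in (10, 100, 1000, 100_000):
    sample = rng.standard_normal(n)
    print(n, sample.mean(), sample.std())

# One large sample to inspect afterwards
large = rng.standard_normal(100_000)
print(large.mean(), large.std())
```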

## 4 Quantifying Numpy’s Performance

One of the key advantages of Numpy is that it is a lot faster than basic
Python. How much faster? Let’s find out! The code below creates an array
of ten thousand random numbers as well as a list with exactly the same
data. We can use these to test how Numpy compares to basic Python with
respect to performance.

In [75]:
my_array = np.random.randn(10000)
my_list = list(my_array)

In [76]:
sum(my_list)

-158.36010769431917

In [78]:
np.sum(my_array)

-158.36010769431957

To time our code, we are going to use the `%%timeit` magic command. Adding
`%%timeit` at the top of a cell makes it so that running that cell
displays the time it took to run the code. By default, the code is
executed many times in a loop (the number of loops is chosen
automatically so that the measurement is accurate enough) and the
duration is averaged over all loops. This procedure is repeated seven
times so that we get one average duration for each run. The reported
numbers are the mean and the standard deviation of these seven per-run
averages.
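
`%%timeit` only works inside IPython and Jupyter. As a rough sketch of the same idea in plain Python, the standard-library `timeit` module can repeat a measurement in a similar way. The loop count of 100 is an arbitrary choice here (unlike `%%timeit`, `timeit.repeat()` does not pick it automatically), and the variable names are illustrative.

```python
import timeit

import numpy as np

my_array = np.random.randn(10000)
my_list = list(my_array)

n_loops = 100  # arbitrary choice; %%timeit would determine this automatically

# repeat=7 mirrors the seven runs; each entry is the TOTAL time of one run
list_runs = timeit.repeat(lambda: sum(my_list), repeat=7, number=n_loops)
array_runs = timeit.repeat(lambda: np.sum(my_array), repeat=7, number=n_loops)

# Divide by the number of loops to get the average time per loop for each run
list_per_loop = np.array(list_runs) / n_loops
array_per_loop = np.array(array_runs) / n_loops

print(f"sum(my_list):     {list_per_loop.mean():.2e} s ± {list_per_loop.std():.2e} s per loop")
print(f"np.sum(my_array): {array_per_loop.mean():.2e} s ± {array_per_loop.std():.2e} s per loop")
```

The absolute numbers depend on the machine, so they will not match the timings shown in this notebook.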

------------------------------------------------------------------------

<span class="theorem-title">**Example 3**</span> Estimate the time for
computing the sum of `my_list` using Python’s built-in `sum()` function
with `%%timeit`.

In [77]:
%%timeit
sum(my_list)

1.42 ms ± 136 μs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)


<span class="theorem-title">**Exercise 27**</span> Use `%%timeit` to
estimate how long it takes to compute `np.sum()` of `my_array`.

In [79]:
%%timeit

np.sum(my_array)

16.6 μs ± 1.82 μs per loop (mean ± std. dev. of 7 runs, 10,000 loops each)


<span class="theorem-title">**Exercise 28**</span> Use `%%timeit` to
estimate how long it takes for Python’s built-in `max()` function to
find the maximum of `my_list`.

In [None]:
%%timeit

max(my_list)

200 μs ± 107 μs per loop (mean ± std. dev. of 7 runs, 10,000 loops each)


<span class="theorem-title">**Exercise 29**</span> Use `%%timeit` to
estimate how long it takes for the `np.max()` function to find the
maximum of `my_array`.

In [None]:
%%timeit

np.max(my_array)

4.78 μs ± 215 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)


<span class="theorem-title">**Exercise 30**</span> The code below
estimates the time it takes to multiply every element of `my_list` by 2.
Use `%%timeit` to test how long it takes to multiply `my_array` by 2
(hint: use the `*` operator).

In [None]:
%%timeit
[item*2 for item in my_list]

657 μs ± 38.8 μs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)


In [None]:
%%timeit

my_array*2

2.95 μs ± 63.4 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)


<span class="theorem-title">**Exercise 31**</span> What is faster:
multiplying an array by 2 or adding the array to itself?

In [None]:
%%timeit

my_array+my_array

2.62 μs ± 91.6 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
