# Intro to Python and Numpy

Welcome to the first notebook of this course! In this notebook, we are
going to learn how to represent different kinds of data in Python. We
are also taking our first look at creating arrays in Numpy, and we are
going to analyze some actual neuroscience data. Finally, we are going to
explore the differences in performance between Numpy and builtin Python
functions.

## Storing Data in Variables

In the first section, we are going to learn how to represent different
kinds of data and store them in variables. We are going to encounter four
basic data types: integers, floating-point numbers, boolean values and
text strings. We are also going to use lists, which are collections of
data. Data can be assigned to a variable using the `=` operator, which
takes the value on the right and assigns it to the variable on the left.
In this sense, a variable is simply a container that we can use to store
and access data.

| Code          | Description                                                |
|------------------------------------|------------------------------------|
| `x = 3.14`    | Assign the floating-point number `3.14` to the variable `x` |
| `x = True`    | Assign the boolean value `True` to the variable `x`        |
| `x = "hello"` | Assign the string `"hello"` to the variable `x`            |
| `x = [1,2,3]` | Assign the list of integers `[1,2,3]` to the variable `x`  |
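As a quick illustration of the table above, the following sketch assigns one value of each basic type and checks it with the built-in `type()` function. The variable names are made up for this example and do not appear elsewhere in the notebook.

```python
# Hypothetical variable names, chosen only for illustration
n_trials = 12       # integer
threshold = 3.14    # floating-point number
is_valid = True     # boolean
label = "hello"     # string
values = [1, 2, 3]  # list of integers

# type() reports the data type of the value stored in a variable
print(type(n_trials))   # <class 'int'>
print(type(threshold))  # <class 'float'>
print(type(is_valid))   # <class 'bool'>
print(type(label))      # <class 'str'>
print(type(values))     # <class 'list'>

# Assigning again replaces the old value; the variable now holds a string
threshold = "high"
print(type(threshold))  # <class 'str'>
```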

------------------------------------------------------------------------

<span class="theorem-title">**Exercise 1**</span> Assign the
floating-point value `0.001` to a variable called `small`

*Solution.*

In [1]:
small = 0.001

<span class="theorem-title">**Exercise 2**</span> Assign the boolean
value `False` to a variable called `this_is_false`

*Solution.*

In [3]:
this_is_false = False

<span class="theorem-title">**Exercise 3**</span> Assign the string
value `"goodbye"` to a variable called `goodbye`

*Solution.*

In [5]:
goodbye = "goodbye"

<span class="theorem-title">**Exercise 4**</span> Assign a list with the
numbers 0 through 9 to a variable called `digits`

*Solution.*

In [7]:
digits = [0,1,2,3,4,5,6,7,8,9]

<span class="theorem-title">**Exercise 5**</span> Assign a list with the
values `3`, `False` and `"red"` to a variable called `random`

*Solution.*

In [9]:
random = [3, False, "red"]

## Analyzing Neural Spiking Data with Numpy

Numpy offers many useful functions for data analysis - let’s test them
on some actual neuroscience data! In this section, we are going to load
and analyze the spiking of a neuron in the primary visual cortex of a
mouse. The spikes are represented as a sorted list of time points (in
seconds) at which spikes were observed. For example, `[0.05, 0.24, 1.5]`
indicates that a spike was observed 50, 240 and 1500 milliseconds after
the start of the recording. Using the functions below, we can answer some
interesting questions about the firing behavior of a given neuron.

| Code | Description |
|------------------------------------|------------------------------------|
| `import numpy as np` | Import the module `numpy` under the alias `np` |
| `x = np.load("data.npy")` | Load the file `"data.npy"` into an array and assign it to the variable `x` |
| `np.size(x)` | Get the total number of elements stored in the array `x` |
| `np.min(x)` | Get the minimum value of the array `x` |
| `np.max(x)` | Get the maximum value of the array `x` |
| `np.mean(x)` | Compute the mean of all values in the array `x` |
| `np.std(x)` | Compute the standard deviation of all values in the array `x` |
| `np.diff(x)` | Compute the differences between consecutive elements in the array `x` |
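To see what these functions return, here is a minimal sketch that applies them to the toy spike train `[0.05, 0.24, 1.5]` from the text above instead of a recorded file. The variable names (`spikes_demo`, `n_spikes`, `last_spike`, `intervals`) are introduced only for this example.

```python
import numpy as np

# Toy spike times in seconds, taken from the example in the text above
spikes_demo = np.array([0.05, 0.24, 1.5])

n_spikes = np.size(spikes_demo)   # number of spikes: 3
last_spike = np.max(spikes_demo)  # time of the last spike: 1.5
intervals = np.diff(spikes_demo)  # differences between consecutive spikes

print(n_spikes)
print(last_spike)
print(intervals)           # approximately [0.19, 1.26]
print(np.mean(intervals))  # approximately 0.725
```

Note that `np.diff()` returns one element fewer than its input, because three spikes are separated by only two intervals.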

------------------------------------------------------------------------

<span class="theorem-title">**Exercise 6**</span> Import the Numpy
module under the alias `np`

*Solution.*

In [11]:
import numpy as np

<span class="theorem-title">**Exercise 7**</span> Load the file
`"spikes.npy"` into a numpy array

*Solution.*

In [13]:
spikes = np.load("spikes.npy")

<span class="theorem-title">**Exercise 8**</span> What is the total
number of spikes in this recording?

*Solution.*

In [15]:
np.size(spikes)

721

<span class="theorem-title">**Exercise 9**</span> What is the duration
of the recording (assuming the recording stopped after the last spike
was recorded)?

*Solution.*

In [17]:
np.max(spikes)

np.float64(298.4843451836275)

<span class="theorem-title">**Exercise 10**</span> Compute the neuron’s
average firing rate (the number of spikes divided by the duration of the
recording)

*Solution.*

In [19]:
np.size(spikes)/np.max(spikes)

np.float64(2.415537067970653)

<span class="theorem-title">**Exercise 11**</span> Compute the
inter-spike intervals (i.e. the time differences between subsequent
spikes)

*Solution.*

In [21]:
isi = np.diff(spikes)

<span class="theorem-title">**Exercise 12**</span> What is the average
inter-spike interval for this neuron?

*Solution.*

In [23]:
np.mean(isi)

np.float64(0.4144865856420776)

<span class="theorem-title">**Exercise 13**</span> What is the standard
deviation of inter-spike intervals for this neuron?

*Solution.*

In [25]:
np.std(isi)

np.float64(0.47663480650055273)

<span class="theorem-title">**Exercise 14**</span> What is the shortest
time between two spikes?

*Solution.*

In [27]:
np.min(isi)

np.float64(0.0005666682648097776)

## Creating Arrays in Numpy

Numpy also offers many functions for generating arrays. The simplest way
to create an array is to convert a list, but there are other functions
for specific purposes like generating arrays of random numbers or
numbers within a certain range. In this section, we are going to create
different kinds of arrays.

| Code | Description |
|------------------------------------|------------------------------------|
| `x = np.array([2,5,3])` | Turn the list `[2,5,3]` into a numpy array and assign it to the variable `x` |
| `x = np.random.randn(100)` | Create an array with 100 normally-distributed random numbers and assign it to the variable `x` |
| `x = np.arange(2,7)` | Create an array with all integers from 2 up to (not including) 7 and assign it to the variable `x` |
| `x = np.arange(2,7,0.3)` | Create an array with evenly spaced values from 2 up to (not including) 7 with a step size of 0.3 and assign it to the variable `x` |
| `x = np.linspace(2,3,10)` | Create an array with 10 evenly spaced values between 2 and 3 and assign it to the variable `x` |
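The difference between `np.arange()` and `np.linspace()` is easy to miss: the former fixes the step size and excludes the stop value, the latter fixes the number of values and includes the stop value. A short sketch using the calls from the table (the variable names are chosen for this example only):

```python
import numpy as np

ints = np.arange(2, 7)          # stop value 7 is excluded
steps = np.arange(2, 7, 0.3)    # step size is fixed, the number of values follows
evenly = np.linspace(2, 3, 10)  # number of values is fixed, stop value 3 is included

print(ints)                   # [2 3 4 5 6]
print(np.size(steps))         # 17
print(np.size(evenly))        # 10
print(evenly[0], evenly[-1])  # 2.0 3.0
```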

------------------------------------------------------------------------

<span class="theorem-title">**Exercise 15**</span> Turn the list `a`
defined below into an array

In [29]:
a = [1, 2, 3]

*Solution.*

In [30]:
np.array(a)

array([1, 2, 3])

<span class="theorem-title">**Exercise 16**</span> Turn the list `b`
defined below into an array

In [32]:
b = [1, True, "a"]

*Solution.*

In [33]:
np.array(b)

array(['1', 'True', 'a'], dtype='<U21')
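In the output above, all three elements became strings. A Numpy array stores a single data type, so a mixed list is converted to one common type. The sketch below inspects the `dtype` attribute for a few lists; the exact integer and string types can differ between platforms and Numpy versions, so the comments only name the general kind.

```python
import numpy as np

ints_only = np.array([1, 2, 3])
int_and_float = np.array([1, 2.5])
int_and_bool = np.array([1, True])
with_string = np.array([1, True, "a"])

print(ints_only.dtype)      # an integer type, e.g. int64
print(int_and_float.dtype)  # float64: the integer is converted to a float
print(int_and_bool.dtype)   # an integer type: True is stored as 1
print(with_string.dtype)    # a string type, e.g. <U21
```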

<span class="theorem-title">**Exercise 17**</span> Make an array
containing the integers from 1 to 15.

*Solution.*

In [35]:
np.arange(1, 15+1)

array([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15])

<span class="theorem-title">**Exercise 18**</span> Create an array that
contains all even numbers up to and including 100 and compute its sum

*Solution.*

In [37]:
np.sum(np.arange(0, 100+2, 2))

np.int64(2550)

<span class="theorem-title">**Exercise 19**</span> Make an array of only
6 evenly-spaced numbers between 1 and 10.

*Solution.*

In [39]:
np.linspace(1, 10, 6)

array([ 1. ,  2.8,  4.6,  6.4,  8.2, 10. ])

<span class="theorem-title">**Exercise 20**</span> Create an array of 10
normally-distributed random numbers and compute its mean and standard
deviation

*Solution.*

In [41]:
x = np.random.randn(10)
print(np.mean(x))
print(np.std(x))

-0.32918759037023054
0.6836782688743029

<span class="theorem-title">**Exercise 21**</span> Now, create arrays
with 100 and 1000 normally-distributed random numbers and compute
their means and standard deviations

*Solution.*

In [43]:
x = np.random.randn(100)
print(np.mean(x))
print(np.std(x))

x = np.random.randn(1000)
print(np.mean(x))
print(np.std(x))

-0.0018101372644804747
0.9398487338454936
-0.04925505680788843
1.0208672737687117

<span class="theorem-title">**Exercise 22**</span> Assume we recorded a
signal with a duration of 20 seconds at a sampling rate of 100 Hz.
Create an array of time points for this signal using both `np.arange()`
and `np.linspace()`. The array should start at 0, end at 20 and contain
100 points for every second. Are the arrays generated with the two
methods identical?

*Solution.*

In [45]:
t1 = np.linspace(0, 20, 100*20+1)
t2 = np.arange(0, 20+1/100, 1/100)

The array created with `np.arange()` may be longer because of rounding
errors.
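One way to check this is to compare the lengths of the two arrays before comparing their values. The sketch below also shows a common alternative that avoids a fractional step size: count the samples with integers and divide by the sampling rate afterwards. The names `fs`, `duration`, `t_lin`, `t_arange` and `t_int` are introduced for this example only.

```python
import numpy as np

fs = 100       # sampling rate in Hz (from the exercise)
duration = 20  # duration in seconds (from the exercise)

t_lin = np.linspace(0, duration, fs * duration + 1)
t_arange = np.arange(0, duration + 1 / fs, 1 / fs)
# Alternative: count samples with integers, then convert to seconds
t_int = np.arange(fs * duration + 1) / fs

print(np.size(t_lin))     # 2001, fixed by the third argument
print(np.size(t_arange))  # 2001 or 2002, depending on floating-point rounding
print(np.size(t_int))     # 2001

# Compare the values only if the lengths match
if np.size(t_arange) == np.size(t_lin):
    print(np.allclose(t_arange, t_lin))
```

Because floating-point values rarely match bit for bit, `np.allclose()` is used here rather than an exact `==` comparison.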

## Quantifying Numpy’s Performance

One of the key advantages of Numpy is that it is a lot faster than basic
Python. How much faster? Let’s find out! The code below creates an array
of ten thousand random numbers as well as a list with exactly the same
data. We can use these to test how Numpy compares to basic Python with
respect to performance.

In [47]:
my_array = np.random.randn(10000)
my_list = list(my_array)

To time our code, we are going to use the `%%timeit` command. Adding
`%%timeit` at the top of a cell makes it so that running that cell
displays the time it took to run the code.

------------------------------------------------------------------------

<span class="theorem-title">**Example 1**</span> Estimate the time for
computing the sum of `my_list` using Python’s builtin `sum()` function
with `%%timeit`.

``` python
%%timeit
sum(my_list)
```

    743 μs ± 10.9 μs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)

By default, `%%timeit` picks the number of loops automatically (1,000 in
this case), executes the code that many times and averages the duration
over all loops. This procedure is repeated seven times so that we get one
average duration for each run. The reported numbers are the average
duration across the seven runs and its standard deviation.
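Outside a notebook, a similar measurement can be sketched with Python's standard `timeit` module. This is only an approximation of what `%%timeit` reports, not its actual implementation: the number of loops is set by hand here (and kept small so the example runs quickly), and `n_loops`, `n_runs`, `totals` and `per_loop` are names made up for this example.

```python
import timeit
import numpy as np

my_list = list(np.random.randn(10000))

n_loops = 100  # loops per run, chosen by hand for this sketch
n_runs = 7     # number of runs, matching the output above

# Each entry is the total time in seconds for n_loops executions
totals = timeit.repeat(lambda: sum(my_list), repeat=n_runs, number=n_loops)
per_loop = np.array(totals) / n_loops  # average time per loop for each run

# Mean and standard deviation across the seven runs, in microseconds
print(f"{np.mean(per_loop) * 1e6:.1f} us +- {np.std(per_loop) * 1e6:.1f} us per loop")
```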

------------------------------------------------------------------------

<span class="theorem-title">**Exercise 23**</span> Use `%%timeit` to
estimate how long it takes to compute `np.sum()` of `my_array`

*Solution.*

In [49]:
%%timeit
np.sum(my_array)

12.7 μs ± 800 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)

<span class="theorem-title">**Exercise 24**</span> Use `%%timeit` to
estimate how long it takes for Python’s builtin `max()` function to find
the maximum of `my_list`

*Solution.*

In [51]:
%%timeit
max(my_list)

212 μs ± 459 ns per loop (mean ± std. dev. of 7 runs, 1,000 loops each)

<span class="theorem-title">**Exercise 25**</span> Use `%%timeit` to
estimate how long it takes for the `np.max()` function to find the
maximum of `my_array`

*Solution.*

In [53]:
%%timeit
np.max(my_array)

12.5 μs ± 195 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)

<span class="theorem-title">**Exercise 26**</span> The code below
estimates the time it takes to multiply every element of `my_list` by 2.
Use `%%timeit` to test how long it takes to multiply `my_array` by 2
(hint: use the `*` operator)

In [55]:
%%timeit
[item*2 for item in my_list]

1.4 ms ± 3.35 μs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)

*Solution.*

In [56]:
%%timeit
my_array*2

7.32 μs ± 10.7 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)

<span class="theorem-title">**Exercise 27**</span> What is faster:
multiplying an array by 2 or adding the array to itself?

*Solution.*

In [58]:
%%timeit
my_array + my_array

7.03 μs ± 9.44 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)

In [59]:
%%timeit
my_array*2

7.31 μs ± 18 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)

They take roughly the same time.