<a id='list'></a>
## PART 1: The List
_Lists_ are ordered collections of data objects/items. Create lists using square brackets `[]` with objects separated by commas.

Here is an example: `my_list = [item1, item2, item3]`

When to use lists:
* Storing ordered strings (like file names or paths)
* Series of mixed data types
* Storing information that you want to iterate through (foreshadowing for loops)
* Storing a short, ordered group of numbers

When to look for alternatives:
* Storing large amounts of numbers: numpy package
* Storing multi-dimensional data: numpy package
* Grouped data: pandas package

In [None]:
list_numbers = [1, 3, 4, 2, 10]
list_numbers

In [None]:
sorted(list_numbers)

In [None]:
list_mixed = [2, 'one',  'three', 5]
list_mixed

In [None]:
sorted(list_mixed) # raises a TypeError in Python 3: integers and strings cannot be compared

In [None]:
list(range(0,10)) # use list() to explicitly create a list of consecutive numbers

In [None]:
list(range(0,10,2))

In [None]:
list_nested = [['first_sublist',1], ['second_sublist',2], ['third_sublist',3]]
print(list_nested[1])
list_nested[1][1]

In [None]:
# we can use set() to isolate the unique values in a list
# set() returns a set, which cannot be indexed or sliced, so we convert the result back to a list
list(set([1, 1, 2, 2, 2, 3]))
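
Note that a set does not keep the original order of the items, so `list(set(...))` may return the unique values in a different order than they first appeared. If order matters, one common approach (a sketch, not the only option) relies on the fact that dictionary keys are unique and keep their insertion order in Python 3.7+. The list name `values_with_repeats` is made up for illustration.

```python
# Hypothetical list with repeated values, deliberately not in sorted order
values_with_repeats = [3, 1, 3, 2, 1, 2, 2]

# dict.fromkeys() keeps one key per unique value, in first-seen order
unique_in_order = list(dict.fromkeys(values_with_repeats))
print(unique_in_order)
```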

### Recall that we can extract specific entries of a string using slicing; the same can be performed with lists

In [None]:
my_list = [2, 'one',  'three', 5, 'ten', 31, 'one hundred', 101]

print(my_list[0:5])
print(my_list[6:]) # go from index 6 (the 7th entry) onward; automatically uses last index as end
print(my_list[:-5]) # go up to, but not including, the entry 5th from the end

In [None]:
print(my_list[0:7:2]) # start_index:end_index:interval, get every other item
print(my_list[::2]) # automatically use first and last index
print(my_list[1::2]) # start on index 1, get every other item
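
Two more slicing patterns that follow the same `start_index:end_index:interval` rule are sketched below, using a shortened, made-up list: negative indices count from the end, and a negative interval steps backward.

```python
short_list = [2, 'one', 'three', 5]

last_item = short_list[-1]        # negative indices count from the end
last_two = short_list[-2:]        # the final two items
reversed_copy = short_list[::-1]  # an interval of -1 walks the list backward

print(last_item)
print(last_two)
print(reversed_copy)
```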

Lists are _mutable_: we can reassign elements of a list, or add items to and remove items from the list.

> __*mutable*__

> Objects whose value can change  
Contrast with _immutable_ objects, which cannot change, e.g. numbers, strings, tuples (covered shortly)

In [None]:
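# strings are immutable!
tmp = 'adsad'
try:
    tmp[0] = 'd' # item assignment is not allowed on strings
except TypeError as err:
    print('TypeError:', err)

# but lists are mutable!
tmp = [0, 1, 2, 3]
tmp[0] = 1
print(tmp)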

Variable assignment _binds_ an object to a variable name. This means that if the original variable is assigned to a second variable (seemingly a copy), both names refer to the same object, so any in-place modification made through the second will also show up in the original. This is in part for python efficiency and resource management. Be careful when working with mutable objects such as lists.

While this is a common theme across python data structures (we will see this phenomenon a few more times in later lessons), it does not usually cause problems in neuroscience computing. It is still good to keep in the back of your mind.

In [None]:
var2 = [0, 1, 1, 2, 3, 5, 8]
var3 = var2

In [None]:
var2

In [None]:
var3

In [None]:
var3[3] = 42
print(var2)
print(var3)

In [None]:
# Here's the fix: list() builds a new list (a copy) instead of a second name for the same list
var4 = [0, 1, 1, 2, 3, 5, 8]
var5 = list(var4)
var5[3] = 42
print(var4)
print(var5)
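
One caveat worth flagging: `list()` makes a _shallow_ copy, meaning only the outer list is duplicated. If the list contains other lists (like the nested list earlier), the inner lists are still shared. The sketch below uses the standard library's `copy.deepcopy` for that case, plus the `is` operator to check whether two names refer to the same object. The variable names are made up for illustration.

```python
import copy

nested = [[0, 1], [2, 3]]

alias = nested                 # same object, just a second name
shallow = list(nested)         # new outer list, but the inner lists are shared
deep = copy.deepcopy(nested)   # fully independent copy

print(alias is nested)         # both names point to one object
print(shallow is nested)       # the outer list was copied, so this is a new object

shallow[0][0] = 42             # edits an inner list that `nested` also holds
print(nested)                  # the change shows up in the original
print(deep)                    # the deep copy is unaffected
```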

### List "functions"

In [None]:
# use the append method to add an item to a list.
my_list.append('g')
print(my_list)

In [None]:
my_list = [2, 'one',  'three', 5, 'ten', 31, 'one hundred', 101]

my_list[0:1] = [200, 201] # slice assignment: replaces the first item with two new items
print(my_list)

In [None]:
popped_item = my_list.pop(1) # remove and return the item at index 1 (the second item)
print(my_list)
print("Popped item:", popped_item)

In [None]:
my_list.insert(1,201)
my_list

In [None]:
my_number_list = [2, 10, 30, 1]
my_number_list.sort()
print(my_number_list)
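
A common stumbling block here: `.sort()` rearranges the list in place and returns `None`, whereas `sorted()` (used earlier) leaves the original untouched and returns a new list. A small sketch with made-up variable names:

```python
numbers = [2, 10, 30, 1]

new_sorted = sorted(numbers)   # returns a new list; `numbers` is unchanged
print(numbers, new_sorted)

result = numbers.sort()        # sorts `numbers` in place and returns None
print(numbers, result)

numbers.sort(reverse=True)     # optional argument: largest first
print(numbers)
```

Because of this, writing `numbers = numbers.sort()` would replace the list with `None`, which is rarely what is intended.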

In [None]:
len(my_number_list)

In [None]:
max(my_number_list)

##### At this point, you might be wondering why in python we have:
1. Names followed by variables enclosed in parens, like `len(my_number_list)` or `max(my_number_list)`.
2. What seem to be functions as well, but which come after a period following the target variable, like `my_number_list.sort()`.
 
These are called __functions__ and __methods__, respectively. Let me break them down.

___
Let's focus on functions first. 

Functions are blocks of code that can be called (i.e. executed) and rerun with a single statement. It's good practice to make sure each function serves a single purpose. `print()` is one such function: it takes an argument in parens, typically a string of characters enclosed in quotes, and displays it as output.

There are two components to functions in implementation; in my own words and concepts, there are:
1. An inward-facing "function definition"
2. An outward-facing "function call" line

Let's first take a look at the function definition (instructions):

`def` _function_name_():  
&nbsp;&nbsp;&nbsp;&nbsp; _some computations_  
&nbsp;&nbsp;&nbsp;&nbsp; `return` _some_variable_

where _function_name_ is an arbitrary name that describes what the function will do, _some computations_ are a few lines of code to perform some analyses, and _some_variable_ is a variable calculated within the body of the function that you want to return to the main code as an output.

In some cases, __required or optional inputs can be indicated__ - these reside within the parens after the function name and are separated by commas.

__Importantly__, the syntax of writing a function must have `def` before the function name, and a colon at the end of the line. All following indented lines are a part of the function.
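
As a minimal sketch of required versus optional inputs, the hypothetical function below (`scale_value` is made up here and not from any package) takes one required input and one optional input with a default value:

```python
def scale_value(value, factor=2):
    """Multiply `value` by `factor`; `factor` is optional and defaults to 2."""
    scaled = value * factor
    return scaled

default_result = scale_value(5)            # uses the default factor of 2
custom_result = scale_value(5, factor=10)  # overrides the optional input

print(default_result)
print(custom_result)
```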


Let's look at an example of a pre-existing function. This one, from the numpy package, returns the angle of a complex number (in radians by default, or in degrees when `deg=True`). We can use the inspect module to help us identify and view the function source code.

In [None]:
import inspect
import numpy as np
lines = inspect.getsource(np.angle) 
print(lines)

In [None]:
print( np.angle(1+1j, deg=True) )
print( np.angle(2+4j, deg=False) ) 


Let's now see how this plays out in our own example. Let's say we wanted to multiply each element of a list by 2 and then print the output like so:

In [None]:
my_number_list = [1,2,3]

In [None]:
input_var = my_number_list[0]
mult_var = input_var * 2
print(mult_var)

input_var = my_number_list[1]
mult_var = input_var * 2
print(mult_var)

input_var = my_number_list[2]
mult_var = input_var * 2
print(mult_var)

If we wanted to perform that calculation just a few times, then the code above suffices (though it is still ugly), but what if we needed to do it 1000 times? The code would get bloated and cumbersome because you would have 1000 duplicates of the code above. Functions are useful here: they reduce redundant code by packaging stereotyped lines of code so they can be reused as a single line with dynamic input variables.

Here is the code above, but turned into a function:

Notice there is no output when you execute the cell that defines a function. That is because you are simply defining the function. It is not until you call the function (see cells further below) that any code within the function will execute.

In [None]:
def multiply_by_two(x):
    temp_var = x * 2
    print(temp_var)
    return temp_var

Most functions require some sort of input variable that the user supplies. They are specified within the parentheses. One user-defined input entry is called an argument, and individual arguments are separated by commas. The number of arguments required depends on the specific function and how the authors structured it. 

Functions can be arbitrarily named, can have any number of inputs, and any number of outputs.
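
For instance, a function can return more than one output by separating them with commas after `return`; the caller can then unpack them into separate variables. The function name `min_and_max` below is made up for illustration.

```python
def min_and_max(values):
    """Return the smallest and largest entries of a list."""
    smallest = min(values)
    largest = max(values)
    return smallest, largest   # two outputs, packed together

low, high = min_and_max([2, 10, 30, 1])   # unpack into two variables
print(low, high)
```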

The "outward-facing" component of functions is structured as follows:

_function_name_(`argument 1`, `argument 2`, ...)

Again, the number of arguments supplied (if any) is dependent on the function. Note many functions also have optional arguments that the user can supply. And if the function returns an output (as the above function does) using the return segment, it can be assigned to a variable like below:

In [None]:
function_output = multiply_by_two(20)
print('The number above gets printed from within the function')
print('function_output variable contains:', function_output)

Let's look at the strength of reusability in functions:

In [None]:
multiply_by_two(my_number_list[0])
multiply_by_two(my_number_list[1])
multiply_by_two(my_number_list[2]);

__Important__: Each time a function is called, the variables that are passed to the function and computed within it are self-contained. This means that any variable created within the function is not accessible to code outside of it. This is a good feature of functions, especially if you need to call the function multiple times and don't want variables from one call to conflict with those from another.

Also good to note: The input and output variable names used in the function definition's arguments and return line only matter within the function. 
The input variables in the outward function call can be named anything, so long as they are in the proper position. In other words, the input variable name(s) in the function call do not need to match the argument names in the inward-facing function definition. In fact, to reduce ambiguity, they should be distinctly named.
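
A small sketch of both points, using a made-up function `add_offset`: the variable passed in the call has a different name from the argument in the definition, and the variable created inside the function does not exist outside of it.

```python
def add_offset(value):
    offset = 10              # exists only while the function runs
    return value + offset

my_number = 5                        # name differs from `value`; that is fine
shifted = add_offset(my_number)
print(shifted)

# `offset` was created inside the function, so looking it up here fails
try:
    offset
    offset_is_visible = True
except NameError:
    offset_is_visible = False
print(offset_is_visible)
```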

In [None]:
# note the x in the function definition is not accessible outside of the function (this raises a NameError)
x

### Python objects and their methods

<img src="pics\method_air.jpg" width="300" />

We introduced the concept of the object-oriented nature of python. To recap, 
1. In python we are able to create blocks of code (containing a group of variables and functions that serve a unified purpose) that serve as templates for individual instantiated entities. These templates are called _classes_.
2. A declared entity based on a class is called an _object_; one is usually created when you assign a data item/structure to a variable.
3. You can apply actions, called _functions_, to objects.
4. You can also apply an action __specific__ to that kind of object, called a _method_.

A real world analogy would be:
1. Class: car blueprint (variables that are described by the class may include speed, tire size, engine type, etc.)
2. Object: a specific manufactured car, e.g. my 2012 Honda Civic
3. Function: an action that is applied to the object, e.g. oil change/rotate tires
4. Method: an innate action that the object can perform, e.g. accelerate/brake

Python example:
1. Class: List
2. Object: Specific list variable that you declared
3. Function: Print function
4. Method: Sort method

https://www.google.com/search?q=python+object+cars&tbm=isch&ved=2ahUKEwi20_Ka3KqAAxUCIH0KHfuBCwcQ2-cCegQIABAA&oq=python+object+cars&gs_lcp=CgNpbWcQA1AAWABg-gVoAHAAeACAATeIATeSAQExmAEAqgELZ3dzLXdpei1pbWfAAQE&sclient=img&ei=mTPAZPaiLYLA9AP7g644&bih=663&biw=1376&client=firefox-b-1-d

Methods are functions that belong to a Python object as attributes and act on that particular object instance.  
Like other attributes of an object, they are referenced by object name, period, method name: `object.method`
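
To make the car analogy concrete, here is a minimal sketch of a made-up `Car` class (not part of any package) with one method, alongside a standalone function that takes a car object as input:

```python
class Car:
    """Blueprint (class) describing what every car object stores and can do."""

    def __init__(self, model, speed=0):
        # __init__ runs automatically when a new object is created
        self.model = model   # attribute: data stored on the object
        self.speed = speed

    def accelerate(self, amount):
        """Method: an action the object performs on itself."""
        self.speed += amount
        return self.speed


def describe(car):
    """Function: a standalone action applied to an object passed in."""
    return f'{car.model} going {car.speed} mph'


my_car = Car('Honda Civic')   # object: one specific instance of the class
my_car.accelerate(30)         # called as object.method(...)
print(describe(my_car))       # called as function(object)
```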

____

To recap, functions and methods are indeed separate entities, but serve very similar purposes:
1. Starting with methods: put simply, methods are functions that are associated with specific python objects. 
    * For example, the `.sort()` method above is specifically associated with list objects. If `.sort()` is called on a string object, an `AttributeError` is raised.
2. Back to functions: Because functions can be called independently from objects, there is a little more flexibility in what input arguments can be passed to functions. 
    * Though incompatibilities between downstream code and the input argument can still occur (e.g. passing a string as `x` to the above example `multiply_by_two` function does not raise an error, but it repeats the string instead of doing arithmetic).
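
Both points in the recap can be checked quickly. The sketch below defines a print-free variant of the earlier function (named `multiply_by_two_quiet` here so it does not overwrite the original) and uses the built-in `hasattr` function to ask whether an object has a given method.

```python
def multiply_by_two_quiet(x):
    return x * 2

# Lists have a .sort() method, but strings do not
list_has_sort = hasattr([3, 1, 2], 'sort')
string_has_sort = hasattr('cab', 'sort')
print(list_has_sort, string_has_sort)

# The function accepts a string without complaint, but `*` on a string
# repeats it rather than doing arithmetic, which may surprise later code
number_result = multiply_by_two_quiet(4)
string_result = multiply_by_two_quiet('4')
print(number_result)
print(string_result)
```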

Practically, functions and methods can do the same exact thing; it is up to the original authors of the codebase if they want to organize their functions to be more streamlined and specific by associating the function to an object, thereby making them methods.

As an aside: Classes and methods are particularly useful in machine-learning packages. There, a class serves as a stereotyped template for instantiating a model, and it bundles the model's parameter values with a preset group of methods (for example, ones that help optimize the parameters), which keeps the variables and parameters associated with a particular model organized.
____

### Brief on tuples (you probably won't actively use these)

Recall that lists are mutable (entries can be altered). Tuples act similarly to lists; however, their entries are immutable. They are characterized by the parentheses that bound their entries. For example:

`(0, 'hi')`

In [None]:
tmp_tuple = (0, 'hi')

tmp_tuple[1]

In [None]:
tmp_tuple[1] = 'let me change hi to this string'
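
Even if you rarely create tuples yourself, they show up whenever several values are bundled together, and they can be _unpacked_ into separate variables in one line. A minimal sketch with made-up names and values:

```python
session_info = ('mouse_01', 10)   # hypothetical (animal ID, sampling rate) pair

animal_id, sampling_rate = session_info   # unpack both entries at once
print(animal_id)
print(sampling_rate)

# A tuple cannot be edited in place, but it can be converted to a list first
editable = list(session_info)
editable[1] = 20
print(editable)
print(session_info)   # the original tuple is unchanged
```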

## PART 2: Lists with loops

### Iteration and loops

Across your coding endeavors, you will likely need to modify certain variables repeatedly. A very simple example of this is adding `2` to a number repeatedly until it reaches a certain value (10). The code below is called a while loop.

In [None]:
i = 0
while i < 10:
    i += 2
    print(i)

Another form of iteration is a for loop; for loops help simplify and reduce the code required to perform an action. Here, if we wanted to add `2` to a number five times without a loop, it would look like this:

In [None]:
add_variable = 0
print(add_variable)

add_variable += 2
print(add_variable)
add_variable += 2
print(add_variable)
add_variable += 2
print(add_variable)
add_variable += 2
print(add_variable)
add_variable += 2
print(add_variable)

The above code has a lot of repeated lines that can be consolidated and made more readable using a for loop. A for loop is structured like:

`for` _incremented_variable_ `in` _iterable_`:`  
&nbsp;&nbsp;&nbsp;&nbsp; _some computations_

where _iterable_ is a collection of discrete items that will be looped through in sequence. For each iteration of the loop, all following lines that are indented will be run.

In addition to the for loop, we introduce another built-in python function called range, which, given an input integer, provides all integers from 0 up to, but not including, the specified number. This will be helpful to specify how many times we want the loop to run:

In [None]:
print(range(5))
print(list(range(5)))

When using range in a for loop, you don't need to convert it to a list (although you can if you want); leaving out the conversion keeps the loop more readable.

In [None]:
for i in range(5):
    print(i)

In [None]:
# here's a for loop that does a bit more

add_variable = 0

for increment in range(5):
    add_variable += 2
    print(add_variable)

We can use for loops to step through lists (which are a type of iterable python object).

In [None]:
my_list = [21, 'four', 10 ]

print(len(my_list))
print(range(len(my_list)))

In [None]:
for index in range(len(my_list)): # `index` is usually written as `i`
    print(index)
    print(my_list[index])

In [None]:
# we can do the above more simply by looping over the items directly:
for item in my_list:
    print(item)

In [None]:
# here's a pythonic way of getting the item and index within a for loop
for index, item in enumerate(my_list):
    print(index)
    print(item)
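
`enumerate` also accepts an optional `start` argument, which is handy when you want counting to begin at 1 (for example, trial numbers) rather than 0. The list below is made up for illustration.

```python
conditions = ['baseline', 'stimulus', 'reward']   # hypothetical labels

numbered = []
for trial_number, condition in enumerate(conditions, start=1):
    numbered.append(f'trial {trial_number}: {condition}')
    print(numbered[-1])   # show the entry that was just added
```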

Let's preview dataframes and the pandas package, and combine them with lists and for loops.

In [None]:
import pandas as pd

In [None]:
path_to_2p_events = r'sample_data\2022_06_10_abb12_events.csv'
data_2p_events = pd.read_csv(path_to_2p_events)
data_2p_events

We can use the code below to loop through each entry in the table.

In [None]:
for row, row_item in data_2p_events.iterrows():
    print(row_item)
    print(' ')
    

Switching gears slightly: we can use the list function to convert a pandas series to a list. This is mostly preference, as a pandas series can be used in place of a list in most cases (though a series carries some extra overhead, such as its index).

In [None]:
event_list = list(data_2p_events['event_id_char'])
print(event_list)

This isn't terribly informative since we have so many events that we can't get a good idea of how many unique event types there are.

In that case, we use the `set()` function (I basically googled "unique values in list") to obtain unique values in the list.

In [None]:
behav_conds = list(set(event_list)) # note, we use the list function again 
behav_conds
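
If you also want to know how many times each unique event occurs, the standard library's `collections.Counter` is one option. The sketch below uses a short made-up event list rather than the loaded data, so the names and counts are purely illustrative.

```python
from collections import Counter

# Hypothetical stand-in for event_list
example_events = ['cs_plus', 'cs_minus', 'cs_plus', 'lick', 'cs_plus', 'lick']

event_counts = Counter(example_events)   # maps each unique value to its count
print(event_counts)

# Sorting the unique values gives a repeatable order, unlike a bare set
unique_sorted = sorted(set(example_events))
print(unique_sorted)
```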

<a id='listcomp'></a>
## List comprehension
- Used to define lists by iteratively defining elements, much like a for loop
- Often more efficient than for loops
- Can be more readable (plus it's one line!)
- Can utilize if statements within (although can get a bit unwieldy)

Skeleton of list comprehension
```
[<expression> for <variable> in <values for variable>]
[<expression> for <variable> in <values for variable> if <condition often based on variable>]
[<expression 1> if <condition> else <expression 2> for <variable> in <values for variable>]
```
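
As a quick illustration of the three skeletons, in the same order, here is a sketch using a made-up list of numbers:

```python
values = [1, 12, 33, 14]

# 1) expression only
doubled = [v * 2 for v in values]

# 2) trailing if: keeps only the items that pass the condition
evens = [v for v in values if v % 2 == 0]

# 3) if/else up front: every item is kept, the condition picks the expression
labels = ['even' if v % 2 == 0 else 'odd' for v in values]

print(doubled)
print(evens)
print(labels)
```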

In [None]:
my_list = [1, 12, 33, 14]

In [None]:
my_list_squared = [] # preinitialize list for appending
for item in my_list:
    my_list_squared.append(item ** 2)

print(my_list_squared)

In [None]:
my_list_squared_compreh = [item ** 2 for item in my_list]

print(my_list_squared_compreh)

In [None]:
import timeit

statement = '''
my_list = [1, 12, 33, 14]
my_list_squared = []
for item in my_list:
    my_list_squared.append(item ** 2)
'''

timeit.timeit(stmt=statement, number = 10000)

In [None]:
statement = '''
my_list = [1, 12, 33, 14]
my_list_squared2 = [item ** 2 for item in my_list]
'''

timeit.timeit(stmt=statement, number = 10000)

Let's combine list comprehension with if statements.

Here, we'll use our behavioral condition list from our loaded pandas data. To add the if statement, just tack it on at the end of the list comprehension.
Below I demonstrate how to keep only the list entries that contain a substring ('period'). Note we could use the `.remove()` method on the original list itself; however, we would be limited because it only removes the first instance and only removes items that exactly match the supplied argument.

In [None]:
filtered_conds = []

for entry in behav_conds:
    if 'period' in entry:
        filtered_conds.append(entry)

filtered_conds

In [None]:
filtered_conds_compreh = [entry for entry in behav_conds if 'period' in entry]
filtered_conds_compreh
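
For comparison, here is a sketch of the `.remove()` limitation mentioned above, using a made-up condition list (the names are hypothetical and not taken from the data file):

```python
example_conds = ['cs_period', 'iti', 'cs_period', 'reward_period']

removed_version = list(example_conds)   # copy so the original stays intact
removed_version.remove('cs_period')     # removes only the first exact match
print(removed_version)

# A comprehension can instead drop every entry containing a substring
without_period = [c for c in example_conds if 'period' not in c]
print(without_period)
```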

Let's discuss a critical topic related to analyzing time-series data (which we will start to work with soon for the photometry data) that can be introduced using lists: how sampling rate relates to recording samples and time, and how to perform the necessary conversions.

Discussion topics:
* The __sampling rate's__ units are samples per second. In other words, it is how many data points are acquired within a 1 second time window. It is usually abbreviated as `fs`
* The __sampling period__ is the time between two successive sampled points (the reciprocal of the sampling rate)
* A __time vector (tvec)__ can be calculated given the sampling period, and describes the mapping of time points to samples, and vice versa. This is crucial for figuring out which sample in the recording corresponds to a behavioral event, which we probably have in units of time (as an aside, if the behavioral data are acquired using a different sampling rate, it is important to first convert those occurrences to time before mapping onto the neural data's samples)

<br />

* Ideally the data acquisition device saves timestamps that correspond to each data point - this simplifies things greatly because the conversion of sample to time points is basically given
* If this information is not readily available, the sampling rate is usually noted down by the experimenter. We can use the premise and code below to determine what times correspond to sample/data points.

<img src="pics\sampling_time_conversion.jpg" width="600" />

In [None]:
fs = 10 # samples per second

period = 1.0/fs

print(f'The sampling period at {fs} Hz is {period*1000} ms')
print(f'Another way to think about this is within the period of 1 second, there are {fs} blocks of {period*1000} ms ')

In [None]:
tvec = []
for sample in range(fs):
    tvec.append(sample*period)

print(tvec)
print(len(tvec))
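
Going the other direction, from a time to a sample index, is the conversion you will need when aligning behavioral events to the recording. A minimal sketch, assuming the recording starts at time 0 and the sampling rate is constant; the event time below is made up:

```python
fs = 10              # samples per second
period = 1.0 / fs    # seconds between successive samples

event_time = 0.3                        # hypothetical event time in seconds
event_sample = round(event_time * fs)   # index of the nearest sample
print(event_sample)

# Converting back: the time assigned to that sample
recovered_time = event_sample * period
print(recovered_time)

# The same one-second time vector as above, written as a list comprehension
tvec_compreh = [sample * period for sample in range(fs)]
print(len(tvec_compreh))
```

`round` is used because multiplying decimal numbers can leave tiny floating-point errors, so the product may not land exactly on a whole number.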


In [None]:
np.linspace(0, 1, 11) # 11 evenly spaced points from 0 to 1 inclusive, i.e. steps of 0.1