<a id='list'></a>
## PART 1: The List
_Lists_ are ordered collections of Python objects. Create lists using square brackets `[]` with objects separated by commas `,`.

When to use lists:
* Storing ordered strings (like file names or paths) or mixed data types
* Storing information that you want to iterate through (foreshadowing for loops)
* Storing a short, ordered group of numbers

When to look for alternatives:
* Storing large amounts of numbers
* Storing multi-dimensional data

In [None]:
list_numbers = [1, 3, 4, 2, 10]
sorted(list_numbers)

In [None]:
list_mixed = [2, 'one',  'three', 5]

In [None]:
list(range(0, 10)) # use list() to explicitly create a list of consecutive numbers

In [None]:
list_nested = [['first_sublist',1], ['second_sublist',2], ['third_sublist',3]]
print(list_nested[1])
list_nested[1][1]

In [None]:
# we can use set() to isolate the unique values in a list; set() doesn't return a list that we can slice, so we convert back to a list
list(set([1, 1, 2, 2, 2, 3]))

### Recall that we can extract specific entries of a string using slicing; the same can be performed with lists

In [None]:
my_list = [2, 'one',  'three', 5, 'ten', 31, 'one hundred', 101]

print(my_list[0:5])
print(my_list[6:]) # go from index 6 (the 7th entry) onward; automatically uses the end of the list as the stop
print(my_list[:-5]) # go up to, but not including, the entry 5th from the end

In [None]:
print(my_list[0:7:2]) # start_index:end_index:interval, get every other element
print(my_list[::2]) # automatically use first and last index
print(my_list[1::2]) # start on index 1, get every other element
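
The `start:stop:step` pattern used above can be summarized in a small sketch. The list `letters` is a made-up example, not part of the lesson data. Note that the stop index is excluded and that a negative step walks the list backwards.

```python
# hypothetical example list, used only to illustrate slicing rules
letters = ['a', 'b', 'c', 'd', 'e', 'f']

first_three = letters[0:3]     # indices 0, 1, 2 (stop index 3 is excluded)
last_two = letters[-2:]        # negative indices count from the end
every_other = letters[::2]     # step of 2 over the whole list
reversed_copy = letters[::-1]  # negative step returns a reversed copy

print(first_three)
print(last_two)
print(every_other)
print(reversed_copy)
```

Slicing always returns a new list, so `letters` itself is left unchanged.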

Lists are _mutable_: we can reassign elements of a list, add elements to it, or remove elements from it.

> __*mutable*__

> An object whose value can change.  
Contrast with _immutable_ objects, which cannot change, e.g. numbers, strings, tuples (covered shortly)

In [None]:
# strings are immutable: item assignment raises a TypeError
tmp = 'adsad'
try:
    tmp[0] = 'd'
except TypeError as err:
    print('TypeError:', err)

# but lists are mutable!
tmp = [0, 1, 2, 3]
tmp[0] = 1
print(tmp)

Variable assignment _binds_ an object to a variable name. Assigning one variable to another does not copy the object: both names then refer to the same object, so a change made through one name is visible through the other. Be careful when working with mutable objects such as lists.

While this is a common theme across Python data structures (we will see this phenomenon a few more times in later lessons), it does not usually cause problems in neuroscience computing. It is still good to keep in the back of your mind.

In [None]:
var2 = [0, 1, 1, 2, 3, 5, 8]
var3 = var2

In [None]:
var2

In [None]:
var3

In [None]:
var3[3] = 42
print(var2)
print(var3)

In [None]:
# Here's the fix: list() creates a new list object, so var5 is a copy rather than a second name for var4
var4 = [0, 1, 1, 2, 3, 5, 8]
var5 = list(var4)
var4[3] = 42
print(var4)
print(var5)
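
One caveat to the fix above: `list()` makes a shallow copy, so any lists nested inside the copy are still shared with the original. The sketch below uses a made-up nested list and `copy.deepcopy` from the standard library to get a fully independent copy.

```python
import copy

# hypothetical nested list, for illustration only
nested = [[0, 1], [2, 3]]

shallow = list(nested)        # new outer list, but the inner lists are shared
deep = copy.deepcopy(nested)  # new outer list and new inner lists

nested[0][0] = 42             # modify an inner list of the original

print(nested)   # [[42, 1], [2, 3]]
print(shallow)  # [[42, 1], [2, 3]]  (inner list is shared)
print(deep)     # [[0, 1], [2, 3]]   (fully independent)
```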

### List "functions"

In [None]:
# use the append method to add an item to a list.
my_list.append('g')
print(my_list)

In [None]:
my_list = [2, 'one',  'three', 5, 'ten', 31, 'one hundred', 101]

my_list[0:1] = [200, 201] # slice assignment: replace the first element with two new elements
print(my_list)

In [None]:
popped_item = my_list.pop(1) # remove and return the item at index 1 (the 2nd item)
print(my_list)
print("Popped item:", popped_item)

In [None]:
my_list.insert(1,201)
my_list

In [None]:
my_number_list = [2, 10, 30, 1]
my_number_list.sort()
print(my_number_list)

In [None]:
len(my_number_list)

In [None]:
max(my_number_list)

##### At this point, you might be wondering why in Python we have:
1. Names followed by a variable enclosed in parentheses, like `len(my_number_list)` or `max(my_number_list)`.
2. What seem to be functions as well, but written after a period following the target variable, like `my_number_list.sort()`.
 
These are called functions and methods, respectively. Let me break them down.

___
Let's focus on functions first. 

Functions are blocks of code that can be called (i.e. executed) and rerun with a single statement. It's good practice to make sure each function serves a single purpose. `print()` is one such function: it takes its arguments in parentheses, typically a string (a series of characters enclosed in quotes), and displays them as output.

There are two components to functions. In my own words, these are:
1. An inward-facing "function definition"
2. An outward-facing "function call" line

Let's first take a look at the function definition:

`def` _function_name_():  
&nbsp;&nbsp;&nbsp;&nbsp; _some computations_  
&nbsp;&nbsp;&nbsp;&nbsp; `return` _some_variable_

where _function_name_ is an arbitrary name that describes what the function will do, _some computations_ are a few lines of code to perform some analyses, and _some_variable_ is a variable calculated within the body of the function that you want to return to the main code as an output.

__Importantly__, the syntax of writing a function must have `def` before the function name, and a colon at the end of the line. All following indented lines are a part of the function.


Let's say we wanted to multiply each element of a list by 2 and then print the output like so:

In [None]:
my_number_list = [1,2,3]

In [None]:
input_var = my_number_list[0]
mult_var = input_var * 2
print(mult_var)

input_var = my_number_list[1]
mult_var = input_var * 2
print(mult_var)

input_var = my_number_list[2]
mult_var = input_var * 2
print(mult_var)

If we wanted to perform that calculation just a few times, then the code above more or less suffices (though it is still ugly), but what if we needed to do it 1000 times? The code would become bloated and cumbersome because you would have 1000 duplicates of the lines above. Functions help here: they reduce redundant code by packaging stereotyped lines of code so that they can be reused as a single line with dynamic input variables.

Here is the code above, but turned into a function:

Notice there is no output when you execute a function definition. That is because you are simply defining the existence of the function. It is not until you call the function (see cells further below) that any code within the function will execute.

In [None]:
def multiply_by_two(x):
    temp_var = x * 2
    print(temp_var)
    return temp_var

Most functions require some sort of input that the user supplies. Inputs are specified within the parentheses. Each input is called an argument, and individual arguments are separated by commas. The number of arguments required depends on the specific function and how its authors structured it.

Functions can be arbitrarily named, can have any number of inputs, and any number of outputs.
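
As a minimal sketch of a function with more than one input and more than one output, consider the hypothetical `scale_and_sum` below (the name and its arguments are made up for illustration). Returning several values separated by a comma packs them into a tuple, which can be unpacked into separate variables on the calling line.

```python
def scale_and_sum(values, factor):
    """Hypothetical helper: scale each element and also return the total."""
    scaled = []
    for value in values:
        scaled.append(value * factor)
    total = sum(scaled)
    return scaled, total  # two outputs, returned together as a tuple

# two inputs go in, two outputs come back
scaled_list, scaled_total = scale_and_sum([1, 2, 3], 10)
print(scaled_list)
print(scaled_total)
```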

The "outward-facing" component of functions is structured as follows:

_function_name_(`argument 1`, `argument 2`, ...)

Again, the number of arguments supplied (if any) depends on the function. Note that many functions also have optional arguments that the user can supply. If the function returns an output using a `return` statement (as the above function does), the output can be assigned to a variable like below:

In [None]:
function_output = multiply_by_two(20)
print('function_output contains:', function_output)
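
Optional arguments, mentioned above, are declared by giving an argument a default value in the function definition. The sketch below is a hypothetical variant of the earlier function (`multiply_by` and `factor` are names invented here), not a replacement for it.

```python
def multiply_by(x, factor=2):
    # `factor` is optional: it falls back to 2 when the caller omits it
    return x * factor

default_result = multiply_by(20)           # uses the default factor of 2
custom_result = multiply_by(20, factor=5)  # overrides the default

print(default_result)
print(custom_result)
```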

Let's look at the strength of reusability in functions:

In [None]:
multiply_by_two(my_number_list[0])
multiply_by_two(my_number_list[1])
multiply_by_two(my_number_list[2]);

__Important__: Every time a function is called, the variables that are passed to the function and computed within it are self-contained. This means that any variable created within the function is not accessible to code outside of it. This is a useful feature, especially if you need to call the function many times and don't want the variables from one call to conflict with those from another.

In [None]:
# note the x in the function definition is not accessible outside of the function (expect a NameError)
x
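
To see this scoping behavior without stopping the notebook on an error, the lookup can be wrapped in `try`/`except`. This is a minimal sketch; `double_it` and `inner_value` are hypothetical names chosen so that they do not collide with variables defined earlier.

```python
def double_it(number):
    inner_value = number * 2  # exists only while the function runs
    return inner_value

result = double_it(4)

try:
    inner_value               # not defined outside the function
    lookup_failed = False
except NameError as err:
    lookup_failed = True
    print('NameError:', err)

print(result)
```

The returned value survives because it was assigned to `result`; the name `inner_value` itself does not.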

### Python objects and their methods

<img src="pics\method_air.jpg" width="300" />

An object is Python's elegant solution for defining reusable blueprints for a group of specialized concepts and functions. Abstracting to a broader context, objects are to Python as software programs are to operating systems.

In Windows, we have Microsoft Excel, which is a program that enables us to create organized tabular representations of data. The elegance comes from Excel's ability to create new sheets and files: users can supply each new sheet or file with completely new data without having to recode the foundations of Excel.

In Python, the equivalent __object__ example is the pandas dataframe. Its overall structure, and the actions that users and coders can perform on it, follow a stereotyped blueprint (i.e. tabular visualization and the ability to add entries into table cells).

Methods are attributes of Python objects that are functions that act on the particular object instance itself.  
They are referenced (as other attributes of an object) by object name, period, method name: `object.method`

____

To recap, functions and methods are indeed separate entities, but serve very similar purposes:
1. Starting with methods: put simply, methods are functions that are associated with specific Python objects.
    * For example, the `.sort()` method above is specifically associated with list objects. If `.sort()` is called on a string object, an `AttributeError` is raised.
2. Back to functions: Because functions can be called independently from objects, there is a little more flexibility in what input arguments can be passed to functions.
    * Incompatibilities between the code inside the function and the input argument can still occur, though (e.g. passing `None` as `x` in the above example function raises a `TypeError`). Note that passing a string as `x` does not raise an error: `'ab' * 2` simply repeats the string, giving `'abab'`.

Practically, functions and methods can do the exact same thing; it is up to the original authors of the codebase whether to make their functions more streamlined and specific by associating them with an object, thereby making them methods.
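
A small sketch of this distinction, using only built-ins: the `sorted()` function accepts both a list and a string, while the `.sort()` method exists only on lists. Note also that `sorted()` returns a new list, whereas `.sort()` reorders the list in place and returns `None`. The variable names are made up for illustration.

```python
numbers = [2, 10, 30, 1]
word = 'python'

# function: works on any iterable and returns a new list
sorted_numbers = sorted(numbers)
sorted_letters = sorted(word)

# method: tied to list objects; sorts in place and returns None
return_value = numbers.sort()

# strings have no .sort() method, so this raises an AttributeError
try:
    word.sort()
    string_has_sort = True
except AttributeError as err:
    string_has_sort = False
    print('AttributeError:', err)

print(sorted_numbers, sorted_letters, return_value, numbers)
```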

____

### Brief on tuples (you probably won't actively use these)

Recall that lists are mutable (entries can be altered). Tuples act similarly to lists; however, their entries are immutable. They are written with parentheses around their entries. For example:

`(0, 'hi')`

In [None]:
tmp_tuple = (0, 'hi')

tmp_tuple[1]

In [None]:
tmp_tuple[1] = 'let me change hi to this string' # raises a TypeError: tuples do not support item assignment
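
Even if you rarely create tuples yourself, they show up whenever a function returns several values or when values are unpacked into several variables at once. Here is a minimal sketch with a made-up tuple `point`, which also catches the error raised by attempting to modify it.

```python
point = (3, 'left')   # hypothetical tuple for illustration

number, side = point  # unpacking: one variable per entry

try:
    point[0] = 10     # tuples do not support item assignment
    tuple_changed = True
except TypeError as err:
    tuple_changed = False
    print('TypeError:', err)

as_list = list(point) # convert to a list when you need to edit the entries
as_list[0] = 10
print(number, side, as_list)
```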

## PART 2: Lists with loops

Recall the following code block from the intro lesson:

```
for i in range(3):
    print(i)
```

which outputs:

```
0
1
2
```

We can use for loops to iterate through lists (which are a type of iterable Python object).

In [None]:
my_list = [21, 'four', 10 ]

print(len(my_list))
print(range(len(my_list)))

In [None]:
for index in range(len(my_list)): # `index` is usually abbreviated as `i`
    print(index)
    print(my_list[index])

In [None]:
# we can do the above in a simpler manner by looping over the items directly (the idiomatic Python way):
for item in my_list:
    print(item)

In [None]:
for index, item in enumerate(my_list):
    print(index)
    print(item)
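
Under the hood, `enumerate()` yields `(index, item)` tuples, which the `for` line unpacks into two variables. Here is a short sketch with a made-up list; the optional `start` argument of `enumerate()` changes the first index.

```python
animals = ['mouse', 'rat', 'fly']  # hypothetical example list

# each element produced by enumerate is an (index, item) tuple
pairs = list(enumerate(animals))
print(pairs)

# start=1 makes the counter begin at 1 instead of 0
numbered = []
for count, animal in enumerate(animals, start=1):
    numbered.append(str(count) + ': ' + animal)
print(numbered)
```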

Let's preview dataframes and the pandas package, and combine them with lists and for loops.

In [None]:
import pandas as pd

In [None]:
path_to_2p_events = r'sample_data\2022_06_10_abb12_events.csv'
data_2p_events = pd.read_csv(path_to_2p_events)
data_2p_events

In [None]:
for row, row_item in data_2p_events.iterrows():
    print(row_item)

In [None]:
event_list = list(data_2p_events['event_id_char'])
print(event_list)

This isn't terribly informative since we have so many events that we can't get a good idea of how many unique event types there are.

In that case, we use the `set()` function (I basically googled "unique values in list") to obtain unique values in the list.

In [None]:
behav_conds = list(set(event_list)) # note, we use the list function again 
behav_conds
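
One thing to keep in mind: a set has no guaranteed order, so `list(set(...))` may return the unique values in a different order from one session to the next. If a stable order matters, two common options are sketched below. The event names are invented for illustration and are not taken from the CSV file.

```python
# hypothetical event labels, standing in for the real event_list
example_events = ['cue', 'reward', 'cue', 'lick', 'reward', 'cue']

# option 1: sort the unique values alphabetically
unique_sorted = sorted(set(example_events))

# option 2: keep the order of first appearance
# (dictionaries preserve insertion order in Python 3.7+)
unique_in_order = list(dict.fromkeys(example_events))

print(unique_sorted)
print(unique_in_order)
```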

<a id='listcomp'></a>
## List comprehension
- Used to define lists by iteratively defining elements, much like a for loop
- Often more efficient than for loops
- Can be more readable (plus it's one line!)
- Can utilize if statements within (although this can get a bit unwieldy)

Skeleton of list comprehension
```
[<expression> for <variable> in <values for variable>]
[<expression> for <variable> in <values for variable> if <condition often based on variable>]
[<expression 1> if <condition> else <expression 2> for <variable> in <values for variable>]
```
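
Each of the three skeletons can be filled in as follows. This is a minimal sketch with a made-up list of numbers.

```python
values = [1, 12, 33, 14]  # hypothetical example list

# 1. expression only: apply it to every element
doubled = [v * 2 for v in values]

# 2. trailing if: keep only the elements that pass the condition
evens = [v for v in values if v % 2 == 0]

# 3. if/else before the for: choose an expression for every element
labels = ['even' if v % 2 == 0 else 'odd' for v in values]

print(doubled)
print(evens)
print(labels)
```

The second form can shorten the list, while the first and third always produce one output element per input element.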

In [None]:
my_list = [1, 12, 33, 14]

In [None]:
my_list_squared = []
for item in my_list:
    my_list_squared.append(item ** 2)

print(my_list_squared)

In [None]:
my_list_squared_compreh = [item ** 2 for item in my_list]

print(my_list_squared_compreh)

In [None]:
import timeit

statement = '''
my_list = [1, 12, 33, 14]
my_list_squared = []
for item in my_list:
    my_list_squared.append(item ** 2)
'''

timeit.timeit(stmt=statement, number = 10000)

In [None]:
statement = '''
my_list = [1, 12, 33, 14]
my_list_squared2 = [item ** 2 for item in my_list]
'''

timeit.timeit(stmt=statement, number = 10000)

Let's combine list comprehension with if statements.

Here, we'll use our behavioral condition list from our loaded pandas data. To add the if statement, just tack it on at the end of the list comprehension.
Below I demonstrate how to keep only the list entries that contain a substring ('period'). Note we could use the `.remove()` method on the original list itself; however, we would be limited because it only removes the first instance and only removes items that exactly match the supplied argument.

In [None]:
filtered_conds = []

for entry in behav_conds:
    if 'period' in entry:
        filtered_conds.append(entry)

filtered_conds

In [None]:
filtered_conds_compreh = [entry for entry in behav_conds if 'period' in entry]
filtered_conds_compreh
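
For comparison, here is a sketch of the `.remove()` limitation mentioned above, using made-up condition names rather than the real `behav_conds`.

```python
# hypothetical condition names for illustration
conds = ['cs_plus', 'baseline', 'cs_minus', 'baseline']

conds_removed = list(conds)       # copy so the original stays intact
conds_removed.remove('baseline')  # removes only the first exact match

# a comprehension drops every entry that fails the condition
conds_filtered = [entry for entry in conds if 'cs' in entry]

print(conds_removed)
print(conds_filtered)
```

After `.remove()`, one `'baseline'` entry is still present, whereas the comprehension filters on a substring and handles every entry in one pass.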