
## At the end of this notebook, you'll be able to:
* Compare & contrast the types of structures that Python uses to store data points
* Recognize & create lists, tuples, and dictionaries in Python
* Index, slice, cast, and mutate lists
* Understand the implications of mutability and object-oriented programming

<hr>

# Data Structures

In this notebook, we'll explore different types of data structures that Python can use to store information, namely **lists, tuples, and dictionaries.**

## Lists
A _list_ is a mutable collection of ordered items that can be of mixed type.

**Mutable** means that individual items in the object can be changed. Lists are mutable. Tuples and strings are not -- they're **immutable**.

Lists are created using square brackets `[ ]`, and individual elements are separated by commas.
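
As a quick illustration of the bracket syntax, here is a minimal sketch. The names `brain_regions` and `mixed` and their contents are made up for this example and are not part of the course data.

```python
# Hypothetical example data (not from the course materials)
brain_regions = ['cortex', 'hippocampus', 'cerebellum']

# A single list can hold items of different types
mixed = ['neuron', 3, 0.5, True]

print(brain_regions)
print(type(mixed))
```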

In [5]:
# Create a list of the parts of a neuron 


### Useful list methods
- Check the length of your list by using `len(my_list)`
- Use `my_list.append(item)` to add an element to the end of a list
- Remove elements by index using `del my_list[index]`
- Remove the first element that matches a value by using `my_list.remove('value')`
- Sort the list in place by using `my_list.sort()`
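
The operations above can be sketched on a small list. The list `regions` is a hypothetical example, so the exercise cell below is still yours to fill in.

```python
regions = ['cortex', 'thalamus', 'amygdala']  # hypothetical example list

print(len(regions))           # number of items in the list

regions.append('cerebellum')  # add one item to the end
del regions[0]                # remove the item at index 0 ('cortex')
regions.remove('thalamus')    # remove the first item equal to 'thalamus'
regions.sort()                # sort in place; .sort() itself returns None

print(regions)                # ['amygdala', 'cerebellum']
```

Note that `append`, `remove`, and `sort` change the list itself rather than returning a new list.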

In [6]:
# Try different list methods here


### List indexing & slicing
**Indexing** refers to selecting an item from within a collection (e.g., lists, tuples, and strings). Indexing is done by placing the **index number** in square brackets, directly after the list variable.

For example, if `my_list = [1,3,5]`, we can get the second value using `my_list[1]`. (Remember that Python starts indexing at zero!)

### Reminders
- Python is zero-based (the first index is `0`)
- Negative indices index backwards through a collection
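
A short sketch of both reminders, using the `my_list` from the text above. The string `word` is an extra made-up example showing that strings are indexed the same way.

```python
my_list = [1, 3, 5]

first = my_list[0]    # zero-based: index 0 is the first item
second = my_list[1]   # index 1 is the second item
last = my_list[-1]    # negative indices count from the end

print(first, second, last)   # 1 3 5

word = 'axon'                # hypothetical example string
print(word[0], word[-1])     # a n
```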

In [7]:
# Try indexing our list of neuron parts here


### If we want multiple items, we can **slice** the list.

There are a few ways to slice:

1. We can **slice** a part of a list using the syntax `[start:stop]`, which extracts the elements from index `start` up to, but not including, index `stop`.

**Notes**
- `start` is __included__, then every element up __until__ `stop` is included; the element at `stop` itself is __excluded__.
- Negative values count backwards through the list.

2. If we omit either (or both) of `start` or `stop` from `[start:stop]`, the defaults are the beginning and the end of the list, respectively, e.g. `[:3]`.
3. We can also define the step size (instead of default 1) using the syntax `[start:stop:step]`
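
The three slicing forms can be sketched on a throwaway list. `numbers` is a hypothetical example, chosen so the task below still needs to be solved on the neuron list.

```python
numbers = [10, 20, 30, 40, 50, 60]  # hypothetical example list

print(numbers[1:4])    # [20, 30, 40]  (index 4 is not included)
print(numbers[:3])     # [10, 20, 30]  (start omitted: begin at index 0)
print(numbers[3:])     # [40, 50, 60]  (stop omitted: run to the end)
print(numbers[-2:])    # [50, 60]      (negative start counts from the end)
print(numbers[::2])    # [10, 30, 50]  (every second element)
print(numbers[1:5:2])  # [20, 40]      (start, stop, and step together)
```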

<div class="alert alert-success"><b>Task:</b> For our list of neuron parts, create three different slices, and save them as different variables:
    
1. A slice of the first two parts.
2. A slice of the middle three parts.
3. A slice of the last part.
    
</div>

In [8]:
# Your code here!


### Checking length
We can use the function `len( )` to check the length of lists.

**Note**: We can also use this to get the number of characters in a string!
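
For instance, a minimal sketch with a made-up list and string:

```python
spikes = [0, 1, 1, 0]       # hypothetical example list

n_items = len(spikes)       # number of items in the list
n_chars = len('dendrite')   # number of characters in the string

print(n_items, n_chars)     # 4 8
```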

### Checking membership
We can use `in` to see if an item exists in a list. The `in` operator checks whether an element is present in a collection, and can be negated with `not`. _(More on operators in the next lecture)_
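
A small sketch of `in` and `not in`, again on a hypothetical list called `regions`:

```python
regions = ['cortex', 'thalamus']  # hypothetical example list

has_cortex = 'cortex' in regions          # True: the item is in the list
no_amygdala = 'amygdala' not in regions   # True: the item is absent

print(has_cortex, no_amygdala)            # True True

# For strings, `in` checks for a substring
print('ax' in 'axon')                     # True
```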

### Mutating lists
After definition, we can update members of our list _because lists are mutable!_ This also impacts aliases of our lists.
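
To make the aliasing point concrete, here is a minimal sketch with made-up names. `alias` is a second name for the same list object, while `separate` is an independent copy made with the list's `.copy()` method.

```python
original = [1, 2, 3]        # hypothetical example list
alias = original            # alias: both names refer to the same list
separate = original.copy()  # independent (shallow) copy

original[0] = 99            # mutate the list through the first name

print(alias)                # [99, 2, 3]  the alias sees the change
print(separate)             # [1, 2, 3]   the copy does not
print(alias is original)    # True: same object
```

If you want to change a list without affecting other variables that refer to it, work on a copy rather than an alias.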

In [9]:
# Create alias of our list

# Update the original list

# Check both lists


### Creating lists of lists
Sometimes, it's useful to create lists of lists. Often, if we import big datasets as lists, this is how they will be organized.

![](https://swcarpentry.github.io/python-novice-inflammation/fig/indexing_lists_python.png)
<div align="center"><a href="https://swcarpentry.github.io/python-novice-inflammation/04-lists/index.html">Image source</a></div>

In [None]:
trial_1 = ['trial 1',0,1,1,1,0,0,1,1]
trial_2 = ['trial 2',1,1,1,1,0,0,0,0]
trial_3 = ['trial 3',1,0,1,0,1,0,1,0]
all_trials = [trial_1, trial_2, trial_3]

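# Index once to get a whole trial (one of the inner lists)
print(all_trials[0])

# Index twice to get a specific value: first the trial, then the position within it
print(all_trials[0][1])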

**The lists above are actually a classic way to represent neuronal data.** In this toy example, each list represents a recording from a neuron, where each entry in the list represents a time point. `all_trials` represents the whole recording of many trials.

Neurons either spike or they don't, so an easy way to keep track of neuron behavior is using 1's for when they spike and 0's for when they don't. 

Thus, during trial 1, the neuron didn't spike during timepoint one, spiked three times, then didn't spike at all for the next two points, and so forth. We will learn about this type of neural data - called spike trains - in more detail in the future. 
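
As a sketch of how nested indexing works on this toy data, the block below repeats the trial lists so that it runs on its own. Counting spikes with `sum` is just one simple way to summarize a list of 0s and 1s; it is an illustration, not the analysis method used later in the course.

```python
trial_1 = ['trial 1', 0, 1, 1, 1, 0, 0, 1, 1]
trial_2 = ['trial 2', 1, 1, 1, 1, 0, 0, 0, 0]
trial_3 = ['trial 3', 1, 0, 1, 0, 1, 0, 1, 0]
all_trials = [trial_1, trial_2, trial_3]

label = all_trials[0][0]            # first trial, first entry: 'trial 1'
first_timepoint = all_trials[0][1]  # first trial, first time point: 0

# Slice off the label to keep only the 0/1 entries of the second trial
spike_train = all_trials[1][1:]
n_spikes = sum(spike_train)         # adding the 1s counts the spikes

print(label, first_timepoint)       # trial 1 0
print(spike_train, n_spikes)        # [1, 1, 1, 1, 0, 0, 0, 0] 4
```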

## Tuples
A _tuple_ is an **immutable** collection of ordered items that can be of mixed type.

* Tuples are created using parentheses.
* Indexing works the same way as it does for lists.
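
A minimal sketch of both points, using a made-up tuple called `point`. Trying to change an entry raises a `TypeError`, which is what immutability means in practice.

```python
point = ('x', 1.5, 3)   # hypothetical example tuple with mixed types

print(point[0])         # x   (indexing works as for lists)
print(point[-1])        # 3

# Tuples are immutable: item assignment is not allowed
try:
    point[0] = 'y'
except TypeError as err:
    print('Cannot modify a tuple:', err)
```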

In [11]:
# Define a tuple that contains a string denoting a trial number as the first entry and a spike train as the second entry


<div class="alert alert-success"><b>Question</b>: Before running the cell below, try to predict: What will be printed out from running this code?</div>

In [None]:
lst = ['a', 'b', 'c']
tup = ('b', 'c', 'd')

if lst[-1] == tup[-1]:
    print('EndMatch')
elif tup[1] in lst:
    print('Overlap')
elif len(lst) == tup:
    print('Length')
else:
    print('None')

### Casting between variable types
We can use `list( )` or `tuple( )` to convert variables into different types. This is called **casting**.

This is particularly useful when we use a function like `range( )`, which generates a range of numbers, but as a lazy `range` object (an **iterable**) rather than as a list.

**Note**: `range`, like slicing, is defined with `start`, `stop`, and `step`, but with commas between them instead of colons. Remember that you can always use `?range` or `help(range)` to get details on how a function works.
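
Here is a short sketch of casting, with variable names chosen only for this example. Printing a `range` directly shows the range object, not its numbers, which is why converting it with `list( )` is handy.

```python
numbers = range(0, 10, 2)   # start=0, stop=10 (excluded), step=2
print(numbers)              # range(0, 10, 2)

as_list = list(numbers)     # cast the range to a list
as_tuple = tuple(as_list)   # cast the list to a tuple
letters = list('soma')      # casting a string gives a list of characters

print(as_list)              # [0, 2, 4, 6, 8]
print(as_tuple)             # (0, 2, 4, 6, 8)
print(letters)              # ['s', 'o', 'm', 'a']
```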

In [None]:
help(range)

In [None]:
?range

In [15]:
# Test range here


## Dictionaries
Dictionaries are also like lists, except that each element is a key-value pair. The syntax for dictionaries is `{key1: value1, ...}`.

### When dictionaries are useful
1. Flexible & efficient way to associate labels with heterogeneous data
2. Use where data items have, or can be given, labels
3. Appropriate for collecting data of different kinds (e.g., name, addresses, ages)

> In the cell below, create a dictionary for three neuron parts and their functions using the syntax `{part:function,...}`. Remember that strings still need quotation marks!

**Note**: You can also create an empty dictionary using `{}` and fill it using `dictionary['key'] = 'value'`.
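
Both ways of building a dictionary can be sketched as follows. The dictionary names, keys, and values are invented for illustration and do not come from the course data.

```python
# Hypothetical metadata for a recording: labels paired with values of mixed type
recording = {'subject': 'S01', 'sampling_rate_hz': 1000, 'n_trials': 3}

print(recording['subject'])   # look up a value by its key: S01

# Start from an empty dictionary and fill it one key at a time
spike_counts = {}
spike_counts['trial 1'] = 5
spike_counts['trial 2'] = 4

print(spike_counts)           # {'trial 1': 5, 'trial 2': 4}
```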

In [18]:
anatomy = ...

<div class="alert alert-success"><b>Question:</b> Before running the cell below, predict: What would the following code produce?</div>

In [None]:
anatomy.update({'Myelin':'Sheath of fatty tissue that insulates the axon'})
anatomy

<div class="alert alert-success"><b>Task</b>: What happens if we look for a key that doesn't exist? Try this above.</div>


### Additional dictionary functionality
- Use `anatomy.update({key: value})` to add another dictionary entry
- Use `del anatomy['soma']` to delete entries
- Loop over keys, values, or both (using `.keys()`, `.values()`, or `.items()`)
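
These operations can be sketched on a small made-up dictionary. `recording` and its contents are hypothetical. The `.get()` method at the end is a standard dictionary method that returns a default instead of raising an error when a key is missing.

```python
recording = {'subject': 'S01', 'n_trials': 3}   # hypothetical example

recording.update({'sampling_rate_hz': 1000})    # add a new key-value pair
del recording['n_trials']                       # delete an entry by key

# Loop over keys only
for key in recording:
    print(key)

# Loop over keys and values together
for key, value in recording.items():
    print(key, '->', value)

# Safe lookup: returns the default instead of raising a KeyError
missing = recording.get('electrode', 'unknown')
print(missing)                                  # unknown
```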

### Example of dictionaries using some NEUSCI301 data

Soon we will learn a lot about neuscitk, a package developed by the previous instructor of this class that contains functionality for analyzing LabChart data. Below, there are some cells that import the package and load the data. You do not need to understand the functionality of this code yet (except for the structure of the dictionary that we will produce).

Go to Canvas and download the **demo_data.mat** file in the Week two module. Then, put it in the same folder as the notebook you are currently working on, or write its location in the code where indicated.

In [20]:
# Don't worry if you don't understand this cell yet! Currently, its only purpose is to load our data 

import neuscitk as ntk # imports neuscitk
# Loads in our data. If the current call isn't working, put your path between the single quotes,
# for example r'Users\pascha\Downloads\demo_data.mat' (with whatever backslash convention works for your laptop)
dataset = ntk.LabChartDataset(r'neusci302_demo.mat')
block1_data = dataset.get_block([1]) # retrieves the piece of data we want (more on this syntax later!)

In [None]:
# check the type of block1_data

# print block1_data

# Optionally, practice looking at different blocks by changing the number in dataset.get_block([X]) and printing the keys and values of the resulting dictionary

Each of the 'blocks' we accessed above is actually one of the pages from your LabChart recording! We will practice in detail exactly how to access and subsequently analyze those pages in the future.

<hr>

## Additional resources
<a href="https://swcarpentry.github.io/python-novice-gapminder/11-lists/index.html">Software Carpentry: Lists</a>

<a href="https://python101.pythonlibrary.org/chapter3_lists_dicts.html">Python 101: Lists, Tuples, and Dictionaries</a>

<a href="https://github.com/jakevdp/WhirlwindTourOfPython/blob/6f1daf714fe52a8dde6a288674ba46a7feed8816/06-Built-in-Data-Structures.ipynb">Whirlwind Tour of Python: Built-In Data Structures</a>


## About this notebook
This notebook is largely derived from UCSD COGS18 Materials, created by Tom Donoghue & Shannon Ellis, as well as the <a href="https://github.com/jrjohansson/scientific-python-lectures/blob/master/Lecture-1-Introduction-to-Python-Programming.ipynb">Scientific Python Lecture</a> by J.R. Johansson.