## Contents

* Data Structures
	* Tuples
	* Dictionaries
	* Lists
		* Slicing
	* NumPy Arrays
* Loops
    * `while` Loops
    * `for` Loops
    * Writing Tips
    * Debugging Tips
    * List Comprehensions

## Data Structures

Different ways to store data or group related data together.

### Tuples

In [None]:
some_book = ("The Watcher", 582, 4.5)

# Defined using ()
# Elements/items can have different types
# Items can be duplicates of each other
# Elements can be accessed by index because their order matters
print(some_book[0])
print(some_book[1])
print(some_book[2])

In [None]:
# Tuples themselves CANNOT be altered in any way

# These should cause errors (Python will always stop at the first)
some_book[0] = "Watchmen"   # Cannot reassign items to tuples
some_book[3] = "new item"   # Cannot add new items to or delete items from tuples

Ideal for:
* Unchanging data &mdash; tuples themselves can't ever change
* Small groups of related data that's *only* used in nearby code
* Packaging 2 values together in places where only 1 is typically expected **(more on this later)**
  * e.g. returning multiple values at once from functions
  * e.g. dictionary keys based on multiple related values at once

Drawbacks:
* Accessing items *only* by index easily gets confusing
  * Especially if there are lots of items
  * Especially if the tuple was defined far from where it's used
  * Not very self-documenting: the coder has to remember what each item means on their own
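One common way to soften the index-only drawback is tuple unpacking, which gives each item a name right where the tuple is used. The sketch below reuses `some_book` from above; the `book_at` dictionary and its shelf/slot numbers are made up purely for illustration.

```python
some_book = ("The Watcher", 582, 4.5)

# Unpacking: name every item once instead of remembering indexes
title, pages, rating = some_book
print(title, pages, rating)

# A tuple can also serve as a dictionary key built from several related values
# (hypothetical sample data: (shelf, slot) -> title)
book_at = {(1, 3): "The Watcher"}
print(book_at[(1, 3)])
```

Unpacking only works when the number of names matches the number of items, so it fits best with small tuples.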

### Dictionaries

Strap in because these things are powerful but the needed baseline knowledge is a doozy.

In [None]:
some_book = {
  "title": "The Watcher",
  "pages": 582,
  "rating": 4.5,
  42: 42,
  69.0: 69,
  (): "my key is a tuple"
}

# Defined using {}
# Contains "key-value pairs"
# Can be defined entirely on 1 line like the tuple above, but spreading across multiple lines helps with readability

# Keys are how you access values INSTEAD of indexes
# Keys MUST be unique
# Keys must be hashable: strings, numbers, or tuples /containing/ strings and numbers all work (lists and dictionaries do not)

# Values are,,, the values/items
# Values can be literally anything you want, and can be duplicates of existing values
# Even other entire tuples, lists, dictionaries, files, class objects, etc

In [None]:
# Access items with [] like usual, but use the desired key instead of an index

print(some_book['title'])
print(some_book[42.0])  # No conversion happens here: 42 == 42.0 and equal numbers hash the same, so both find the same key
print(some_book[()])

In [None]:
# New items can be added just by using a new key
print('before', some_book, '\n')

some_book['new_key'] = 'I didnt exist before'

print('after', some_book)

In [None]:
# Existing items can be changed
# The expression for accessing a value can also be used like a variable name to change the same value
print('before:', some_book[42])

some_book[42] = 420

print('after:', some_book[42])

Dictionaries have lots of methods (associated functions) that perform more useful and complex operations. A list of them can be found [here](https://www.w3schools.com/python/python_dictionaries_methods.asp).

Weirdly enough, I couldn't find a clean list like this in [Python's official documentation](https://docs.python.org); the closest thing there is the `dict` entry on its "Built-in Types" page.
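A few of the most commonly used dictionary methods in action. This is only a small sample, and the `"author"` key and `"unknown"` fallback are made up for the example.

```python
some_book = {"title": "The Watcher", "pages": 582, "rating": 4.5}

# keys(), values(), and items() give views of the dictionary's contents
print(list(some_book.keys()))
print(list(some_book.values()))
print(list(some_book.items()))   # each item is a (key, value) tuple

# get() returns a fallback value instead of raising an error for a missing key
author = some_book.get("author", "unknown")
print(author)

# update() adds or overwrites several key-value pairs at once
some_book.update({"rating": 4.8, "author": "made-up name"})
print(some_book)
```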

Benefits:
* Easily represent things whose list of properties will grow and shrink a lot
  * Easy to add and remove key-value pairs
  * This is probably how the classes we've made so far should be done. Much easier than constantly creating new ```self.X``` values inside class functions.
* Can still loop through all the key-value pairs like tuple/list elements **(more on this later)**

Drawbacks:
* Basically none in most cases. There are jokes about how, if you don't know the answer to a technical interview question, just throw a dictionary/hash map/map at it and you'll get a good enough answer lmao

### Lists

In [None]:
titles = ["The Lightning Thief", "The Sea of Monsters", "The Titan's Curse"]
random_list = ["The Watcher", 420, ()]

# Defined using []
# Elements/items can have different types
# Items can be duplicates of each other
# Basically a tuple whose elements you can change

In [None]:
# Access items using an index
print(titles[1])

# Change items in a similar way to dictionaries
titles[1] = "The Son of Neptune"
print(titles[1])

You may have noticed that I left something pretty important out of the dictionary and list sections so far. What if you need to delete elements from dictionaries or lists?

I don't think removing things from lists/dictionaries is something you will be doing often, at least not until you find yourself *desperately* needing to optimize your scripts.

You can always clone a data structure and edit that new copy. It will require more memory, but this way you'll always have access to old versions of that structure. It can be helpful if you need to retrace your steps while writing or if you just want to see data both before and after some change.

Changing data structures, especially in the middle of loops that use/reference them, can also easily cause bugs. Algorithms/loops you create may not work as expected because the data you're working on changes in the middle of execution.

We've run into this before on your projects, but here's another (contrived) example for reference:

In [None]:
alphabet = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z']

# You might initially expect this loop to print the entire alphabet, but...
for letter in alphabet:
  if letter == 'k':
    alphabet.remove(letter) # changing the list as we go through it...

  print(letter) # may give unexpected results.
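One way to sidestep the problem above is to loop over a copy while editing the original. This is a minimal sketch with a shortened, made-up list; the `seen` list only exists to record what the loop visited.

```python
letters = ['i', 'j', 'k', 'l', 'm']   # shortened sample list
seen = []                             # records every letter the loop visits

# The loop walks the copy, so removing from "letters" cannot shift
# the items the loop is still going to visit
for letter in letters.copy():
  if letter == 'k':
    letters.remove(letter)
  seen.append(letter)
  print(letter)

print(letters)  # the original no longer contains 'k'
```

The copy costs extra memory, but every letter (including `'l'`) is visited exactly once.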

#### Slicing

You can easily isolate subsections of strings or lists using **slice notation**. This involves multiple numbers/expressions between the `[]` separated by colons.

In [None]:
alphabet = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z']

# [ start : end : step ]
# "end" is EXCLUSIVE
print(alphabet[0:3:1])
print(alphabet[0:len(alphabet):2])
print(alphabet[-3:len(alphabet):1])
print(alphabet[11:16:1])

Slice notation has lots of nice shortcuts and defaults that can make it really easy to use.

In [None]:
# These all print the same value
# 0 is the default "start"
# 1 is the default "step"
print(alphabet[0:3:1])
print(alphabet[:3:1])
print(alphabet[:3:])
print(alphabet[:3])

In [None]:
# These all also print the same value
# len(str) or len(list) is the default "end"
print(alphabet[11:len(alphabet):1])
print(alphabet[11:len(alphabet):])
print(alphabet[11::])  # leaving the "step" at the default

# These all also also print the same value
print(alphabet[0:len(alphabet):2])
print(alphabet[0::2])
print(alphabet[::2])

In [None]:
# Negative "start" and "end" numbers indicate counting from the end

print(alphabet[-3:len(alphabet):1]) # sublist consisting of elements spanning:
print(alphabet[-3:])                # "3rd from the end" to "the end of the list"

print(alphabet[-3:-1])  # "3rd from the end" up to "final element" (exclusive)
print(alphabet[-4:-2])  # "4th from the end" to "2nd from the end"

print(alphabet[:-21])  # everything BUT the last 21 items

In [None]:
# Negative "step" numbers indicate reversing the order
# If you use these, "start" and "end" should also be in the opposite order
print(alphabet[::-1])               # omitting "start" and "end" reverses the ENTIRE list
print(alphabet[len(alphabet):0:-1]) # "end" is exclusive, so an explicit end of 0
                                    # stops at index 1 and leaves out 'a'
print(alphabet[3:0:-1])
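The same notation works on strings, and a full slice is a quick way to copy a list. A short sketch with made-up sample values:

```python
word = "alphabet"

first_five = word[:5]     # characters at indexes 0 through 4
last_three = word[-3:]    # "3rd from the end" to the end
every_other = word[::2]   # step of 2
backwards = word[::-1]    # reversed copy
print(first_five, last_three, every_other, backwards)

# Slicing with no numbers at all makes a (shallow) copy of a list
original = ['a', 'b', 'c']
clone = original[:]
clone[0] = 'z'            # only the copy changes
print(original, clone)
```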

### NumPy Arrays

In [None]:
import numpy as np
print(np.array([1,2,3]))
print(np.array([
  [1,2],
  [3,4]
]))

# Basically Python lists but with some extra restrictions that allow them to be MUCH more efficient
# All items MUST be the same type (NumPy will automatically convert as needed)

Benefits:
* MUCH faster and more space-efficient inherently
  * This may not be noticeable until you get to massive million-cell arrays or larger, but the bonuses are always there
* Lots of free NumPy functions for faster calculations
* Accurate to what arrays are like in basically all other languages

Drawbacks:
* Less flexible
  * ALL items must be the same type
  * Fixed size defined at creation, cannot add/remove items
    * But you can create large, empty arrays and loop to populate them
  * For a list of lists, all sublists MUST be the same size
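A small sketch of the points above: automatic type conversion, filling a pre-sized array in a loop, and doing the same calculation with a built-in NumPy operation. The sample values are made up, and `np.zeros` is just one of several ways to create a placeholder array.

```python
import numpy as np

# Mixed ints and floats get converted to a single type
mixed = np.array([1, 2.5, 3])
print(mixed, mixed.dtype)

# Create a fixed-size array up front, then loop to populate it
squares = np.zeros(5)
for i in range(len(squares)):
  squares[i] = i ** 2
print(squares)

# The same result without a Python loop: the operation applies to every item
fast_squares = np.arange(5) ** 2
print(fast_squares)
```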

## Loops

Allows easy repetition of specific blocks/chunks of code.

ALL loops have 4 parts to outline or consider:
1. the setup
  * creating empty lists, initializing counter variables, etc
2. the condition
  * what must be **TRUE** to keep the loop going (i.e. this condition will be *false* when the loop finishes)
3. the body
  * the code that will be executed inside the loop
  * this can include more loops (nesting)
4. the state update
  * how the code moves on to the next step/item/iteration/etc



### `while` Loops

In [None]:
# Best when you don't know exactly how many iterations you want
# or when the loop condition isn't based on a numeric value
rand_letters = ['a', 'j', 'x', 'w', 'p', 'q', 'i', 'o', 'b', 'v']

i = 0                         # setup
while rand_letters[i] != 'q': # condition
  print(rand_letters[i])      # body
  i += 1                      # update

Ideal in *any* of these cases:
* There's no (easy) expression to calculate or figure out how many iterations you need to do
* The loop condition is not based on a simple incrementing/decrementing value
  * e.g. when the looping variable is set to *specific* numbers like in a [binary search](https://www.geeksforgeeks.org/binary-search/) or is based on whether some variable is `True`/`False`

Other benefits:
* More flexible than `for` loops (more on that next section)
* More explicit loop parts

Drawbacks:
* Manual setup and updating
  * Since you explicitly write out each part of the loop, you must know all the details of what you want it to do
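To make the binary search case concrete, here is a rough sketch (one of many ways to write it) where the looping variables jump to specific positions instead of counting up by 1. The list, the target, and all variable names are made up for the example, and the list must already be sorted.

```python
sorted_nums = [1, 3, 4, 7, 9, 12, 15]   # sample data, must be sorted
target = 9

low = 0                                  # setup
high = len(sorted_nums) - 1
found_at = None
while low <= high and found_at is None:  # condition
  mid = (low + high) // 2                # body: check the middle item
  if sorted_nums[mid] == target:
    found_at = mid
  elif sorted_nums[mid] < target:
    low = mid + 1                        # update: discard the lower half
  else:
    high = mid - 1                       # update: discard the upper half

print(found_at)  # index of the target, or None if it isn't in the list
```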

### `for` Loops

One particular loop use case is *so* common that a special type of loop was created to handle it. `for` loops make looping through iterables much easier. All the above data structures are iterable (though dictionaries, with keys that can be of different types, are a little more difficult).

There are LOTS of variations, but they're all meant to assist in different sub-cases of that core problem of going through all elements in a collection one by one.

In [None]:
rand_letters = ['a', 'j', 'x', 'w', 'p']

# Simplest and most common version of Python's "for" loop
# Each "letter" is an alias for the item itself from the "rand_letters" iterable

for letter in rand_letters: # setup, condition, update all in 1 line
  print(letter)             # body
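The same syntax works on dictionaries, with one catch: a plain loop hands you the keys, not the values. A minimal sketch using a trimmed-down version of the earlier `some_book` dictionary:

```python
some_book = {"title": "The Watcher", "pages": 582, "rating": 4.5}

# Looping over the dictionary itself gives each key
for key in some_book:
  print(key, some_book[key])

# items() gives (key, value) tuples, which can be split into two names
pairs = []
for key, value in some_book.items():
  pairs.append(f"{key}: {value}")
print(pairs)
```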

In [None]:
rand_letters = ['a', 'j', 'x', 'w', 'p']

# range() returns an iterable which lets you easily loop through most series of numbers

# range's arguments are just like that of slices: "start", "stop" (exclusive), "step"
print(range(5), "->", list(range(5)))
print(range(1, 5), "->", list(range(1, 5)))
print(range(2, 10, 2), "->", list(range(2, 10, 2)))

# With "len()", it lets you easily get the indices of any tuple/list/array/etc
# This is closer to how it's done in other languages,
# but not very common in Python due to better options.

for i in range(len(rand_letters)):  # setup, condition, update all in 1 line
  print(rand_letters[i])            # body

In [None]:
rand_letters  = ('a', 'j', 'x', 'w', 'p')

# enumerate() returns an iterable which lets you easily
# access both the items and their indices of another iterable
print(enumerate(rand_letters))
print(list(enumerate(rand_letters)))

# Indices and items are given in a tuple, which you can
# deconstruct like this to easily access both elements
for i, letter in enumerate(rand_letters): # setup, condition, update all in 1 line
  print(f"{i}: {letter}")                 # body

This is the more Python-y way to use the indices of an iterable.
You rarely need to loop through a list *without* accessing its items, and `enumerate()` gives easy access to both.

In [None]:
rand_letters  = ['a', 'j', 'x', 'w', 'p']
rand_caps     = ['I', 'O', 'B', 'V', 'Z']

# zip() returns an iterable which lets you easily loop
# through multiple other iterables at the same time
print(zip(rand_letters, rand_caps))
print(list(zip(rand_letters, rand_caps)))

# Items are given in a tuple, which you can
# deconstruct like this to easily access both elements
for letter, cap in zip(rand_letters, rand_caps):  # setup, condition, update all in 1 line
  print(letter, cap)                              # body

The argument order matters for `zip()`. The items in each tuple appear in the same order that their original lists were passed to `zip()`.
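A quick sketch of both points, plus one more behavior worth knowing: `zip()` stops as soon as its shortest argument runs out. The lists here are shortened sample data.

```python
rand_letters = ['a', 'j', 'x']             # 3 items
rand_caps    = ['I', 'O', 'B', 'V', 'Z']   # 5 items

letters_first = list(zip(rand_letters, rand_caps))
caps_first    = list(zip(rand_caps, rand_letters))

print(letters_first)  # lowercase letter comes first in each tuple
print(caps_first)     # capital letter comes first in each tuple
# Both results only have 3 tuples because rand_letters only has 3 items
```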

### Tips for Writing Loops

* (again) If you're looping through a data structure, avoid changing its contents in the middle of the loop
  * Can cause unexpected behaviors and mess up your intended algorithms
  * Safer and more consistent (but more memory-intensive) to create copies

* If the loop is too big to fit on one screen, that's probably a good sign to simplify it
  * Break its body into smaller functions (more on these later)
  * Break it into smaller loops with clearer jobs
  * Up to personal preference/judgment though (but it is **much** easier to understand and debug shorter loops)

* Nested loops are most commonly used when:
  * Comparing each item in a data structure with everything else in that structure
  * Looping through matrices (2-or-higher dimensional arrays/lists)
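Both nested-loop cases can be sketched like this. The sample data and variable names are made up for illustration.

```python
# Case 1: comparing each item with every later item in the same list
nums = [4, 7, 4, 1]
duplicate_pairs = []
for i in range(len(nums)):
  for j in range(i + 1, len(nums)):   # start after i to avoid repeat comparisons
    if nums[i] == nums[j]:
      duplicate_pairs.append((i, j))
print(duplicate_pairs)

# Case 2: looping through a matrix (a list of lists)
matrix = [
  [1, 2, 3],
  [4, 5, 6],
]
total = 0
for row in matrix:      # outer loop: one row at a time
  for value in row:     # inner loop: each item in that row
    total += value
print(total)
```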

### Tips for Debugging Loops

* Most loop issues stem from:
  * incorrect loop structure (setup, condition, update)
  * confusion about which variables are a single item vs. a collection of items
    * and by extension which one you need to work with for this loop

* For most debugging, just double-checking the "setup", "condition", and data structure accesses should be enough.
  * Printing values can be helpful here, and debuggers are a more automated/advanced version of this.
  
* When intensively debugging, imagine yourself as the computer going through the steps (with simple sample data if needed)
  * Start at element 1
  * Write down all your variables, and manually track how they change
    * Debuggers will help with this (more on them later)
  * Think as literally as possible
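As a small illustration of tracking variables by hand, a temporary `print` inside the loop shows the state at every step. This sketch reuses the shape of the earlier `while` example with a shortened, made-up list.

```python
rand_letters = ['a', 'j', 'q', 'w']   # shortened sample list

i = 0
while rand_letters[i] != 'q':
  # Temporary debug print: shows the state at the start of each iteration
  print(f"i={i}, item={rand_letters[i]!r}")
  i += 1

print(f"stopped at i={i}")   # confirms where and why the loop ended
```

Remove the debug prints once the loop behaves as expected.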

### List Comprehensions

These build `lists` using a technique from functional programming. Comprehensions can be thought of as list-building loops but all on 1 line.

Just like `for` loops, the "setup", "condition", and "update" are handled in the same `for`-`in` syntax. However, the body is split into 2 new sections:
1. the output
  * the item to be appended to the new list
2. the filter
  * the `if` condition deciding which items in the old list get used

In [None]:
rand = [4, 7, 1, 9, 3, 5, 2, 0, 8, 6, 1, 8, 2, 8, 3, 8, 4, 6, 2, 6]

under3_indexes = [
  i                         # output
  for i in range(len(rand)) # setup, condition, update (normal "for" loop stuff)
  if rand[i] < 3            # filter
]
print(under3_indexes)

Complex comprehensions can be split across multiple lines for readability.

In [None]:
rand = [4, 7, 1, 9, 3, 5, 2, 0, 8, 6, 1, 8, 2, 8, 3, 8, 4, 6, 2, 6]

# the output can include entire expressions
under3_squared = [num**2 for num in rand if num < 3]
print(under3_squared)

# or even a conditional expression ("X if condition else Y") if the output depends on some condition
peepoo = [
  "pee" if num < 5 else "poo" # choosing the output with a conditional expression
  for num in rand
]
print(peepoo)

Due to their similarities, the process for writing list comprehensions and (`for`) loops is very similar. If you're struggling with how to write a comprehension, you can write it as a loop first, then translate it. However, this also means that comprehensions are totally optional.

In [None]:
rand = [4, 7, 1, 9, 3, 5, 2, 0, 8, 6, 1, 8, 2, 8, 3, 8, 4, 6, 2, 6]

under3_squared_loop = []
for num in rand:
  if num < 3:
    under3_squared_loop.append(num**2)
print(under3_squared_loop)

under3_squared_comp = [num**2 for num in rand if num < 3]
print(under3_squared_comp)