# 01_02: Lists, tuples, and the slicing syntax

This is a review of lists and tuples, which are perhaps the quintessential Python data structures. While extremely useful on their own, they are also foundational for data science because they set the standard interface for accessing individual elements and ranges of elements by index.

Selecting only a specific portion of a list, string, or tuple is referred to as slicing in Python. You may remember it from Matlab: slicing is when we do something like this to a sequence or matrix:
sequence[start:stop:step]
NumPy, the most important library for manipulating large amounts of numerical data, uses the same interface.

Lists in Python provide a way to store an arbitrary number of Python objects, such as strings, floating point numbers, other lists, or any other object, all accessible using a numerical index. 

In [40]:
import math
import collections
import dataclasses
import datetime

import numpy as np
import pandas as pd
import matplotlib.pyplot as pp  

In [41]:
# In Python, lists are denoted by square brackets, and their elements are separated by commas
nephews = ['Huey', 'Dewey', 'Louie']

In [42]:
nephews

['Huey', 'Dewey', 'Louie']

In [43]:
# len provides us with the length of the list
len(nephews)

3

In [44]:
# The empty list, [], has a length of zero, obviously
len([])

0

LIST INDEXING AND SLICING

Individual list elements can be accessed by index, starting with zero for the first element and ending at the length of the list minus one.


In [45]:
#first element
nephews[0]

'Huey'

In [46]:
#third element
nephews[2]

'Louie'

In [47]:
# This would be the fourth element, but the list has only three; since indexing starts at 0, the last valid index is 2, so this raises an error
nephews[3]

IndexError: list index out of range
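
The valid index range can be guarded explicitly. The sketch below uses a hypothetical helper, `safe_get` (not part of Python or of this notebook), that returns a default value instead of raising `IndexError`.

```python
def safe_get(items, index, default=None):
    """Return items[index], or default if the index is out of range.

    Hypothetical helper, written only for illustration.
    """
    try:
        return items[index]
    except IndexError:
        return default

names = ['Huey', 'Dewey', 'Louie']
print(safe_get(names, 2))             # Louie (last valid index is len(names) - 1)
print(safe_get(names, 3))             # None (out of range, default returned)
print(safe_get(names, 3, 'nobody'))   # nobody
```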

In [None]:
# Negative indices count from the end, so -1 is the last element
nephews[-1]

'Louie'

In [None]:
#second-to-last element
nephews[-2]

'Dewey'

In [None]:
# Bracket indexing can also be used to reassign elements
for i in range(len(nephews)):
    nephews[i] = nephews[i] + ' Duck'

In [None]:
nephews  # ' Duck' has been appended to each name in the list

['Huey Duck', 'Dewey Duck', 'Louie Duck']

In [None]:
# Lists do not need to have homogeneous content; a list does not need to be all strings or all numbers
mix_it_up = [1, [1, 2], 'alpha']

In [None]:
mix_it_up

[1, [1, 2], 'alpha']

In [None]:
# The in operator lets us check whether an item exists in a list
# This returns False: we changed the first entry from 'Huey' to 'Huey Duck', so 'Huey' is no longer in nephews
'Huey' in nephews

False

In [None]:
# This returns True because 'Huey Duck' is an element of our list
'Huey Duck' in nephews

True

In [None]:
#append allows us to add a single item to the end of a list
nephews.append('April Duck')

As you can see here, we're using Python in an object-oriented way by calling a method, specifically append, of the list object. It sounds sophisticated, but it's actually very natural.

In [None]:
nephews

['Huey Duck', 'Dewey Duck', 'Louie Duck', 'April Duck']

In [None]:
# To add multiple elements in one go, we can use extend
nephews.extend(['May Duck', 'June Duck'])

In [None]:
nephews

['Huey Duck',
 'Dewey Duck',
 'Louie Duck',
 'April Duck',
 'May Duck',
 'June Duck']
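
The difference between append and extend shows up most clearly when the argument is itself a list. A minimal sketch, using throwaway lists rather than the notebook's variables:

```python
# append adds its argument as ONE new element, even if it is a list
appended = ['a', 'b']
appended.append(['c', 'd'])
print(appended)   # ['a', 'b', ['c', 'd']]

# extend adds each item of its argument separately
extended = ['a', 'b']
extended.extend(['c', 'd'])
print(extended)   # ['a', 'b', 'c', 'd']
```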

In [None]:
# In order to concatenate two lists we can just use a plus sign
ducks = nephews + ['Donald Duck', 'Daisy Duck']

In [None]:
ducks

['Huey Duck',
 'Dewey Duck',
 'Louie Duck',
 'April Duck',
 'May Duck',
 'June Duck',
 'Donald Duck',
 'Daisy Duck']
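
Note that the plus sign builds a new list and leaves its operands untouched, whereas extend modifies the existing list in place. The following sketch, with illustrative variable names, makes the distinction visible through a second name bound to the same list:

```python
original = [1, 2]
combined = original + [3, 4]   # a new list; original is unchanged
print(original)                # [1, 2]
print(combined)                # [1, 2, 3, 4]

alias = original               # a second name for the SAME list object
alias.extend([3, 4])           # in-place change, visible through both names
print(original)                # [1, 2, 3, 4]
print(alias is original)       # True
```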

In [None]:
#Last, we can insert elements at any position in the list using the insert method
ducks.insert(0, 'Scrooge McDuck')

In [None]:
ducks

['Scrooge McDuck',
 'Huey Duck',
 'Dewey Duck',
 'Louie Duck',
 'April Duck',
 'May Duck',
 'June Duck',
 'Donald Duck',
 'Daisy Duck']

We've seen how to build up lists; now let's break them down.

In [None]:
# We can delete elements either by their index, with del...
del ducks[0]

In [None]:
ducks

['Huey Duck',
 'Dewey Duck',
 'Louie Duck',
 'April Duck',
 'May Duck',
 'June Duck',
 'Donald Duck',
 'Daisy Duck']

In [None]:
# ...or by their value, with remove (only the first matching element is removed)
ducks.remove('Donald Duck')

In [None]:
#This removed Uncle Donald
ducks

['Huey Duck',
 'Dewey Duck',
 'Louie Duck',
 'April Duck',
 'May Duck',
 'June Duck',
 'Daisy Duck']

In [None]:
# To sort our list, we can do so in place, modifying the existing list, with sort...
ducks.sort()

In [None]:
ducks

['April Duck',
 'Daisy Duck',
 'Dewey Duck',
 'Huey Duck',
 'June Duck',
 'Louie Duck',
 'May Duck']

In [None]:
# ...or we can make a new sorted list out of an existing one with sorted.
reverse_ducks = sorted(ducks, reverse=True)

In [None]:
reverse_ducks

['May Duck',
 'Louie Duck',
 'June Duck',
 'Huey Duck',
 'Dewey Duck',
 'Daisy Duck',
 'April Duck']
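
A common pitfall is that sort returns None, because it works in place, while sorted returns the new list. Both also accept a key function. A short sketch with illustrative data:

```python
words = ['pear', 'fig', 'banana']

result = words.sort()       # sorts in place...
print(result)               # None (...and returns nothing)
print(words)                # ['banana', 'fig', 'pear']

# sorted leaves its input alone; key=len orders by string length
by_length = sorted(words, key=len)
print(by_length)            # ['fig', 'pear', 'banana']
print(words)                # ['banana', 'fig', 'pear']
```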

In [None]:
# It's also very easy to loop over a list
for duck in ducks: #indices aren't even necessary
    print(duck, "quacks!")

April Duck quacks!
Daisy Duck quacks!
Dewey Duck quacks!
Huey Duck quacks!
June Duck quacks!
Louie Duck quacks!
May Duck quacks!


Moving on to slices

In [None]:
#Beyond working with individual list elements, we can manipulate them in contiguous groups.
#We'll take a numerical example, the first few squares of the natural numbers.
squares = [1, 4, 9, 16, 25, 36, 49]

In [None]:
# The convention for selecting a slice in Python is the same as for range in loops:
# the first index is included, the last index is not.
squares[0:2]
# It pays to imagine that the indices sit on the edges of the elements. So if we want the first two squares, we write a slice that goes from zero to two, as above

[1, 4]
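
Two consequences of the "first included, last excluded" convention are worth checking: the length of a slice is stop minus start, and splitting a list at any index gives two pieces that join back into the original. A small sketch, using a separate list named `seq` for illustration:

```python
seq = [1, 4, 9, 16, 25, 36, 49]

start, stop = 2, 5
piece = seq[start:stop]
print(piece)                          # [9, 16, 25]
print(len(piece) == stop - start)     # True

# Splitting at index 3 loses nothing and duplicates nothing
print(seq[:3] + seq[3:] == seq)       # True
```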

There are a few more tricks and shortcuts that we can use in slicing.

In [None]:
# If we omit the first index, we get everything up to (but not including) the second index
squares[:4]

[1, 4, 9, 16]

In [49]:
# If we omit the second index, we get everything from the first index onward
squares[3:]

[16, 25, 36, 49]

In [50]:
#When slicing, if we omit both indices, then we get the whole list
squares[:]

[1, 4, 9, 16, 25, 36, 49]
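
Slicing with both indices omitted returns a new list, which makes it a quick way to copy. The copy is shallow, though: nested lists are shared between the original and the copy. A sketch with illustrative names:

```python
original = [1, 4, 9]
duplicate = original[:]        # a new list with the same elements
duplicate[0] = 'changed'
print(original)                # [1, 4, 9]

# The copy is shallow: inner lists are the same objects in both
nested = [[1, 2], [3, 4]]
shallow = nested[:]
shallow[0].append(99)
print(nested)                  # [[1, 2, 99], [3, 4]]
```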

In [53]:
# A third number sets the step, which lets us move through the indices in strides
# This gives us every other entry from the first through the seventh
squares[0:7:2]

[1, 9, 25, 49]

In [52]:
# Negative indices allow us to count backwards from the end
squares[-3:-1]
# This gives us the third-to-last and second-to-last elements; the last element (index -1) is excluded

[25, 36]

In [54]:
# Reverse the list!
# This works because the slice syntax is [start:stop:step]
# The negative step moves us through the list backwards
squares[::-1]

[49, 36, 25, 16, 9, 4, 1]
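
The start:stop:step notation is shorthand for Python's built-in slice object, which can be stored in a variable and reused. The sketch below also shows a negative step with explicit bounds, and that slice bounds past the end of the list are tolerated rather than raising an error. The list `seq` is a stand-in for illustration.

```python
seq = [1, 4, 9, 16, 25, 36, 49]

every_other = slice(0, 7, 2)       # same as writing 0:7:2 in brackets
print(seq[every_other])            # [1, 9, 25, 49]

# With a negative step, start must come after stop; stop is still excluded
print(seq[5:1:-2])                 # [36, 16]

# Unlike single-element indexing, out-of-range slice bounds are clipped
print(seq[5:100])                  # [36, 49]
```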

In [55]:
# Slices also allow us to reassign a subset of the larger list in place
squares[2:4] = ['four', 'nine']

In [56]:
squares

[1, 4, 'four', 'nine', 25, 36, 49]

In [57]:
# We can also use slices to delete
del squares[4:6]

In [58]:
squares

[1, 4, 'four', 'nine', 49]
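
Slice assignment does not require the replacement to have the same length as the slice, so it can also grow or shrink a list. A minimal sketch on a throwaway list:

```python
seq = [0, 1, 2, 3, 4]

seq[1:3] = ['a', 'b', 'c']    # replace two elements with three
print(seq)                    # [0, 'a', 'b', 'c', 3, 4]

seq[1:1] = ['x']              # an empty slice: pure insertion at index 1
print(seq)                    # [0, 'x', 'a', 'b', 'c', 3, 4]

seq[2:5] = []                 # assigning an empty list deletes the slice
print(seq)                    # [0, 'x', 3, 4]
```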

Now for tuples. They look like lists but with parentheses instead of brackets. Tuples are sometimes described as immutable versions of lists: once a tuple is defined, we cannot modify its elements or add new ones.

This is a feature, not a bug. It ensures the integrity of the data and makes it possible to use tuples as keys in dictionaries or as indices.

Nevertheless, we can perform the same indexing and slicing tricks as for lists, just not assignment.

One context where we often see tuples in Python is tuple unpacking, where an assignment is automatically applied in parallel across the elements of a tuple.

In [59]:
integers = ('one', 'two', 'three', 'four')

In [60]:
integers

('one', 'two', 'three', 'four')

In [61]:
integers[-1], integers[1:3]

('four', ('two', 'three'))

In [63]:
# We cannot reassign elements after definition
integers[0] = 1

TypeError: 'tuple' object does not support item assignment
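
Immutability is what allows tuples to serve as dictionary keys, while lists cannot. The sketch below uses a made-up dictionary, `grid`, keyed by coordinate pairs:

```python
# A hypothetical lookup table keyed by (row, column) tuples
grid = {(0, 0): 'origin', (1, 2): 'marker'}
grid[(3, 4)] = 'new entry'
print(grid[(1, 2)])            # marker

# A list is mutable and therefore unhashable, so it is rejected as a key
try:
    grid[[5, 6]] = 'fails'
except TypeError as err:
    print('lists cannot be keys:', err)
```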

In [None]:
# Tuple unpacking
# This is allowed since both sides of the assignment are tuples of the same length
(a, b) = (1, 2)

In [69]:
# The parentheses can be omitted, so this has the same effect
c, d = 3, 4

In [68]:
print(a)
print(b)
print(c)
print(d)

1
2
3
4
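
Two common uses of tuple unpacking are swapping variables without a temporary and collecting "the rest" of a sequence with a starred name. A short sketch with illustrative names:

```python
x, y = 1, 2
x, y = y, x                        # the right-hand tuple is built first, then unpacked
print(x, y)                        # 2 1

# A starred name collects the leftover items into a list
first, *rest = (10, 20, 30, 40)
print(first)                       # 10
print(rest)                        # [20, 30, 40]
```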


In [None]:
# Tuples also appear when we iterate over multiple variables at once
# enumerate yields (index, item) tuples for the items in a list
# This allows us to iterate over indices and items together
for i, duck in enumerate(ducks):
    print(i, duck)

0 April Duck
1 Daisy Duck
2 Dewey Duck
3 Huey Duck
4 June Duck
5 Louie Duck
6 May Duck
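
To make the tuples visible, we can collect what enumerate yields into a list. The built-in zip does something similar for two sequences, pairing their items. A sketch with a small stand-in list:

```python
names = ['Huey', 'Dewey', 'Louie']

# enumerate yields (index, item) tuples; start=1 shifts the counter
numbered = list(enumerate(names, start=1))
print(numbered)    # [(1, 'Huey'), (2, 'Dewey'), (3, 'Louie')]

# zip pairs items from two sequences, again as tuples
lengths = [len(name) for name in names]
paired = list(zip(names, lengths))
print(paired)      # [('Huey', 4), ('Dewey', 5), ('Louie', 5)]
```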


In [71]:
#One final useful trick is unpacking a tuple to pass it to a function that requires multiple arguments.
def print_three_args(a, b, c):
    print(a, b, c)

In [None]:
# We save the arguments to a tuple
my_args = (1, 2, 3)

In [None]:
#We unpack those arguments into the function using an asterisk
print_three_args(*my_args)

1 2 3


In [74]:
# The star syntax can also be used to define functions that accept a variable number of arguments; inside the function, args is a tuple
def any_args(*args):
    print(args)

In [76]:
# Same thing, but now we could just as well call any_args(1, 2) or any_args(1, 2, 3, 4, 5)
any_args(1, 2, 3)

(1, 2, 3)
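
The star can also be combined with ordinary parameters, and unpacking at the call site works with lists as well as tuples. The function `describe` below is a hypothetical example, not part of the notebook:

```python
def describe(first, *rest):
    """Return the first argument and a tuple of any remaining ones."""
    return first, rest

print(describe(1))            # (1, ())
print(describe(1, 2, 3))      # (1, (2, 3))

values = [4, 5, 6]            # a list can be unpacked with * too
print(describe(*values))      # (4, (5, 6))
```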
