# Python Review: Data Structures and Functions

Reading:  *Python for Data Analysis*, Sections 3.1 and 3.2 from Chapter 3

# Tuples

A tuple is an immutable sequence of Python objects.  Tuples can be created with a comma-separated sequence of values.  Parentheses are not required, but can be used for readability.  The built-in ``tuple`` function converts another sequence or iterable to a tuple.


In [6]:
tup = 1, 2, 3
print(tup)

(1, 2, 3)


In [7]:
another_tup = (4,5,6,7)
print(another_tup)

(4, 5, 6, 7)


In [8]:
tuple_from_list = tuple([1,2,3])

In [9]:
# new tuples can be created from existing tuples
tup + another_tup

(1, 2, 3, 4, 5, 6, 7)

In [10]:
tup * 3

(1, 2, 3, 1, 2, 3, 1, 2, 3)

In [11]:
# Tuples are immutable.  Individual elements can't be reassigned
tup[0] = 1

TypeError: 'tuple' object does not support item assignment
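
Immutability applies to the tuple's own slots, not to the objects stored in them. The following is a minimal sketch (the name `nested` and its contents are illustrative, not from the reading) showing that a list stored inside a tuple can still be changed in place:

```python
# A tuple that happens to hold a mutable list (illustrative data)
nested = ("a", [1, 2], True)

# Rebinding one of the tuple's slots raises TypeError...
try:
    nested[1] = [1, 2, 3]
    reassigned = True
except TypeError:
    reassigned = False

# ...but the list stored in slot 1 can still be modified in place
nested[1].append(3)

print(reassigned)  # False
print(nested)      # ('a', [1, 2, 3], True)
```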

## Methods for tuples

Since tuples are immutable, there aren't as many methods for tuples as there are for, say, lists.

``tup.count(x)`` will count the number of occurrences of ``x``.  
``tup.index(x)`` will return the first index of ``x`` and will raise a ``ValueError`` if ``x`` is not present.

In [None]:
tup = (1,1,3,2)
tup.count(1)

In [None]:
tup.index(1)

In [None]:
tup.index(4)

## Tuple unpacking

Tuple unpacking in Python refers to the process of extracting individual elements from a tuple and assigning them to separate variables. This allows you to quickly and easily access the values stored in a tuple without having to access them one by one using indexing. Tuple unpacking is also known as tuple destructuring.

Here's a simple example of tuple unpacking:

```python
my_tuple = (1, 2, 3)
a, b, c = my_tuple

print(a)  # Output: 1
print(b)  # Output: 2
print(c)  # Output: 3
```

In this example, the values 1, 2, and 3 from the `my_tuple` are unpacked and assigned to the variables `a`, `b`, and `c` respectively.

Tuple unpacking is often used in scenarios where functions return multiple values as a tuple, and you want to assign those values to meaningful variable names for easier comprehension and use in your code.
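
As a small sketch of that use case, the function below returns two values as a tuple and the caller unpacks them into named variables. The helper `min_and_max` is hypothetical and exists only for illustration. The same mechanism also lets you swap two variables without a temporary.

```python
def min_and_max(values):
    """Return the smallest and largest items as a 2-tuple (hypothetical helper)."""
    return min(values), max(values)

# Unpack the returned tuple into two meaningful names
lowest, highest = min_and_max([12, 7, 19])
print(lowest, highest)  # 7 19

# Unpacking also gives a one-line swap
x, y = 1, 2
x, y = y, x
print(x, y)  # 2 1
```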

In [2]:
tup = (12, 7, 19, (1, 3, 5))
a, b, c, d = tup

In [3]:
print(a)

12


In [4]:
print(d)

(1, 3, 5)


In [5]:
a, b, *rest = tup
print(b)

7


# Lists

A list is a mutable sequence of Python objects.  Lists are defined with square brackets ``[`` ``]``.  The built-in ``list`` function converts another iterable (such as a tuple or a generator) to a list.

In [12]:
my_list = [1, 2, 3]

In [13]:
another_list = list(tup)  # tup was (1, 2, 3) when this cell was run

In [14]:
print(another_list)

[1, 2, 3]


In [15]:
list(range(0,31,2))

[0, 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30]

In [16]:
#  a list can also be a list of lists
list_of_lists = [[1,2,3], [4,5,6], [7,8,9]]
print(list_of_lists)

[[1, 2, 3], [4, 5, 6], [7, 8, 9]]


In [17]:
# lists are mutable and individual elements can be reassigned
list_of_lists[0] = 'Hello'
list_of_lists

['Hello', [4, 5, 6], [7, 8, 9]]

## Common Methods for Lists

Lists are mutable, meaning they can be changed after they are created.  Here are a few of the common methods for lists. (Note that most of these methods will change the list.)  All the methods can be shown by typing ``dir(list_object)`` 

The ``.index`` and ``.count`` methods work the same way as the tuple methods.  Other common methods include:

* ``my_list.append(x)``: *append an object to the end of a list*
* ``my_list.insert(index, x)``: *insert an object at a specific location (index will be the index of the inserted item)*
* ``my_list.extend(iterable)``: *append each item of another list (or any iterable) to a list*
* ``my_list.remove(value)``: *remove the first occurrence of value*
* ``my_list.pop(index)``: *remove and return the item at index (default is index = -1, the last element)*
* ``my_list.reverse()``: *reverse the list in place*
* ``my_list.sort()``: *sort the list in place*

In [23]:
my_list = [1, 3, 5, 7, 9, 12]

In [24]:
# add the string "fun" to the end of my_list
my_list.append('fun')
print(my_list)

[1, 3, 5, 7, 9, 12, 'fun']


In [25]:
# add the string "data science" between the numbers 3 and 5
my_list.insert(2, "data science")
print(my_list)

[1, 3, 'data science', 5, 7, 9, 12, 'fun']


In [26]:
# append the strings "i", " love", and "fun" to the end of my_list in one line of code.
my_list.extend(["i", " love", "fun"])
print(my_list)

[1, 3, 'data science', 5, 7, 9, 12, 'fun', 'i', ' love', 'fun']


In [27]:
# Notice the difference between "extend" and "append".
# If we had used append:
my_list.append(["i", " love", "fun"])
print(my_list)

# append takes the entire object and adds it to the list as a single element
# extend adds each element of its argument to the list

[1, 3, 'data science', 5, 7, 9, 12, 'fun', 'i', ' love', 'fun', ['i', ' love', 'fun']]


In [28]:
my_list.pop()
print(my_list)

[1, 3, 'data science', 5, 7, 9, 12, 'fun', 'i', ' love', 'fun']


In [29]:
# remove the first "fun"
my_list.remove('fun')
print(my_list)

[1, 3, 'data science', 5, 7, 9, 12, 'i', ' love', 'fun']
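
The cells above don't exercise ``count``, ``index``, ``reverse``, or ``sort``. Here is a short sketch using made-up data (the lists `numbers` and `words` are illustrative). It also shows that the in-place methods return ``None``, and that the built-in ``sorted`` returns a new list instead.

```python
numbers = [5, 3, 9, 3, 1]  # illustrative data

print(numbers.count(3))  # 2
print(numbers.index(3))  # 1 (index of the first 3)

numbers.reverse()        # reverses in place
print(numbers)           # [1, 3, 9, 3, 5]

returned = numbers.sort()  # sorts in place and returns None
print(numbers)             # [1, 3, 3, 5, 9]
print(returned)            # None

# sorted() leaves the original alone and returns a new list
words = ["pear", "fig", "banana"]
by_length = sorted(words, key=len)
print(by_length)  # ['fig', 'pear', 'banana']
print(words)      # ['pear', 'fig', 'banana']
```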


## Indexing and Slicing

Before moving on to dictionaries, let's talk about indexing and slicing.  Python sequences are indexed starting at 0.  Strings, lists, and tuples are all indexed with bracket notation.  Python also supports negative indexing, so index -1 is the last element, -2 is the second to last, etc.

In [30]:
my_string = "James Hathaway"
my_list = [0, 1, 1, 2, 3, 5, 8, 21, 34]
my_tuple = (4, 5, 6)

In [31]:
my_string[0]

'J'

In [32]:
my_list[-2]

21

In [33]:
my_tuple[3]

IndexError: tuple index out of range

Slicing means selecting sections of objects.  The basic form of slice notation is ``start:stop``.  Note that the ``start`` index is included, but the ``stop`` index is excluded.  The ``start`` and ``stop`` indexes can be omitted, in which case they default to the start and end of the sequence, respectively.

A third argument will add a step size, ``start:stop:step``.   

In [34]:
# Get the word "James" from my_string
my_string[0:5]

'James'

In [36]:
# From my_list, get the numbers 5, 8, 21, and 34
my_list[5:]

[5, 8, 21, 34]

In [38]:
# From my_list, get the numbers 1, 1, 2, 3
my_list[1:5]

[1, 1, 2, 3]

In [41]:
# From my_string, get every other character
my_string[::2]

'JmsHtaa'

In [42]:
# reverse my_list (without changing it)
my_list[::-1]

[34, 21, 8, 5, 3, 2, 1, 1, 0]

# Dictionaries

Dictionaries, or ``dict`` objects, are mappings: collections of key-value pairs in which each value is stored and looked up by its key.  (Since Python 3.7, dictionaries preserve insertion order.)  They are similar to hash tables or hash maps in other languages.

## Constructing a Dictionary

Dictionaries can be constructed with curly braces ``{key:value,...}``.    The *values* of a ``dict`` object can be any Python object, while the *keys* must be hashable and are generally scalars, like integers or strings.  
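
Keys have to be hashable, which in practice means immutable built-in types. The sketch below (the `grid` dictionary is illustrative) shows a tuple working as a key and a list being rejected:

```python
# Tuples of immutable values are hashable, so they can serve as keys
grid = {(0, 0): "origin", (1, 0): "east"}
print(grid[(1, 0)])  # east

# Lists are mutable and therefore unhashable: using one as a key fails
try:
    bad = {[0, 0]: "origin"}
    list_key_allowed = True
except TypeError as err:
    list_key_allowed = False
    print("TypeError:", err)

print(list_key_allowed)  # False
```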

In [43]:
my_dict = {'key1': 12, 'key2': 13, 'key3': 14}

In [44]:
dict_two = {1:'Country',2:'County',3:'City', 4:'Zip', 5:'Street'}

In [45]:
another_dict = {'A': 456, 'B': [1, 2 ,3], 'C': (42, 'Answer')}

In [46]:
# look up a value in the dictionary by its key
dict_two[4]

'Zip'

In [47]:
another_dict['C']

(42, 'Answer')

In [48]:
another_dict['C'][1]

'Answer'

In [49]:
# Check if a dict contains a certain key
1 in dict_two

True

In [50]:
1 in another_dict

False

Dictionaries can also be constructed from a sequence of 2-tuples or built up from an empty dictionary.

In [51]:
tups = (('Dog','Bark'), ('Duck','Quack'), ('Cow','Moo'), ('Cat','Meow'))
tups

(('Dog', 'Bark'), ('Duck', 'Quack'), ('Cow', 'Moo'), ('Cat', 'Meow'))

In [52]:
dict(tups)

{'Dog': 'Bark', 'Duck': 'Quack', 'Cow': 'Moo', 'Cat': 'Meow'}

In [53]:
d = {}
d

{}

In [54]:
# Built from empty
d = {}
d['Dog'] = 'Bark'
d['Duck'] = 'Quack'
d

{'Dog': 'Bark', 'Duck': 'Quack'}

It is somewhat common to build up a dictionary in a loop. For example, let's build a dictionary that takes a list of words and categorizes them alphabetically.

In [55]:
list_of_words = ['blue', 'red', 'Read', 'apple', 'Baseball', 'bear', 'car']

In [56]:
d = {}
for word in list_of_words:
    first_letter = word[0].lower()
    if first_letter in d:
        d[first_letter].append(word)
    else:
        d[first_letter] = [word]

d

{'b': ['blue', 'Baseball', 'bear'],
 'r': ['red', 'Read'],
 'a': ['apple'],
 'c': ['car']}
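
The ``if``/``else`` in that loop is a common enough pattern that Python offers shortcuts for it. As a sketch of two alternatives (the names `by_letter` and `by_letter_dd` are illustrative), ``dict.setdefault`` and ``collections.defaultdict`` both produce the same grouping:

```python
from collections import defaultdict

list_of_words = ['blue', 'red', 'Read', 'apple', 'Baseball', 'bear', 'car']

# setdefault returns the existing list for the key, or inserts [] first
by_letter = {}
for word in list_of_words:
    by_letter.setdefault(word[0].lower(), []).append(word)

# defaultdict(list) creates the empty list automatically on first access
by_letter_dd = defaultdict(list)
for word in list_of_words:
    by_letter_dd[word[0].lower()].append(word)

print(by_letter)
print(dict(by_letter_dd) == by_letter)  # True
```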

## Some Methods for Dictionaries

In [None]:
# return a view of the keys (wrap in list() to get a list)
d.keys()

In [None]:
# return a view of the values
d.values()

In [None]:
# return a view of (key, value) tuples
d.items()
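
In Python 3 these methods return view objects, not lists. A brief sketch with a small illustrative dictionary (`sounds`) shows how they are typically used, along with ``get``, which looks up a key without raising an error when it is missing:

```python
sounds = {'Dog': 'Bark', 'Duck': 'Quack'}  # illustrative data

print(list(sounds.keys()))    # ['Dog', 'Duck']
print(list(sounds.values()))  # ['Bark', 'Quack']

# items() pairs naturally with tuple unpacking in a for loop
for animal, sound in sounds.items():
    print(animal, "says", sound)

# Views are live: they reflect later changes to the dictionary
keys_view = sounds.keys()
sounds['Cow'] = 'Moo'
print(len(keys_view))  # 3

# get() returns None (or a supplied default) for a missing key
print(sounds.get('Cat'))             # None
print(sounds.get('Cat', 'unknown'))  # unknown
```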

# Sets

A set is an unordered collection of unique elements.  Sets can be created with the ``set`` function or with curly braces.

In [None]:
A = set([1,2,3,3,4,5])
A

In [None]:
B = {2, 3, 2, 3, 2, 3, 4}
B

Sets support mathematical set operations like union, intersection, difference, etc.  
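
As a quick sketch of those operations, using the same values as the sets ``A`` and ``B`` above (results are wrapped in ``sorted`` only to make the printed order predictable):

```python
A = {1, 2, 3, 4, 5}
B = {2, 3, 4}

print(sorted(A | B))  # union: [1, 2, 3, 4, 5]
print(sorted(A & B))  # intersection: [2, 3, 4]
print(sorted(A - B))  # difference: [1, 5]
print(sorted(A ^ B))  # symmetric difference: [1, 5]
print(B <= A)         # subset test: True

# Each operator also has a method form
print(A.union(B) == A | B)  # True
```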

# List Comprehensions

List comprehensions are a way to create a list from another list (or collection) that might normally need to be created with a ``for`` loop. 

The general form of a list comprehension is 
~~~~
[expression for val in collection if condition]
~~~~
and is equivalent to 
~~~~
result = []
for val in collection:
    if condition:
        result.append(expression)
~~~~
(The filter condition can be left out if it is not required.)

For example, find a list of numbers from 1 to 100 that are divisible by 7.

In [1]:
result = []
for num in range(1,101):
    if num % 7 == 0:
        result.append(num)
result

[7, 14, 21, 28, 35, 42, 49, 56, 63, 70, 77, 84, 91, 98]

In [2]:
# as a list comprehension
# [expression for val in collection if condition]
result = [num for num in range(1,101) if num % 7 == 0]
result

[7, 14, 21, 28, 35, 42, 49, 56, 63, 70, 77, 84, 91, 98]

## Dictionary and Set Comprehensions

Dictionaries and sets can be created in a similar way.

~~~~
# dictionary comprehension
d = {key-expression : value-expression for val in collection if condition}

# set comprehension
s = {expression for val in collection if condition}
~~~~

As an example, let's create a dictionary where the key is a number between 1 and 20 that is divisible by 2 and the value is a list of the square and cube of the number.   

In [1]:
d = {}
for num in range(1,21):
    if num % 2 == 0:
        d[num] = [num**2, num**3]
d

{2: [4, 8],
 4: [16, 64],
 6: [36, 216],
 8: [64, 512],
 10: [100, 1000],
 12: [144, 1728],
 14: [196, 2744],
 16: [256, 4096],
 18: [324, 5832],
 20: [400, 8000]}

In [2]:
# as a dictionary comprehension
{num: [num**2, num**3] for num in range(1,21) if num % 2 == 0}

{2: [4, 8],
 4: [16, 64],
 6: [36, 216],
 8: [64, 512],
 10: [100, 1000],
 12: [144, 1728],
 14: [196, 2744],
 16: [256, 4096],
 18: [324, 5832],
 20: [400, 8000]}
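
A set comprehension follows the same pattern but keeps only unique results. Here is a minimal sketch with an illustrative word list (``sorted`` is used only to make the printed order predictable):

```python
words = ['blue', 'red', 'Read', 'apple', 'Baseball', 'bear', 'car']

# Unique first letters, ignoring case
first_letters = {word[0].lower() for word in words}
print(sorted(first_letters))  # ['a', 'b', 'c', 'r']

# Unique lengths of the words longer than 3 characters
lengths = {len(word) for word in words if len(word) > 3}
print(sorted(lengths))  # [4, 5, 8]
```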

# Functions

A function is a way to organize and reuse code.  Functions in Python are created with the ``def`` statement and take the following form:

~~~~
def name_of_function(arg1, arg2, more_args):
    '''
    Docstring that describes what the function does
    '''
    
    expression
    return
~~~~

In [None]:
def long_name(name):
    '''
    This function takes a name as input and returns True if the name is long
    (more than 7 characters) and False otherwise.
    '''
    val = 1 if len(name) > 7 else 0
    return bool(val)

    

In [None]:
long_name('Shannon')

In [None]:
long_name('Stephanie')

In [None]:
def long_name_n(name, n):
    '''
    This function returns True if name has greater than n letters and False otherwise
    '''
    val = 1 if len(name) > n else 0
    return bool(val)


In [None]:
long_name_n('Shannon', 3)

In [None]:
long_name_n('Frank', 5)

In [None]:
def long_name_n(name, n=7):
    '''
    This function returns True if name has greater than n letters and False otherwise
    '''
    val = 1 if len(name) > n else 0
    return bool(val)


In [None]:
long_name_n('Shannon')

In [None]:
long_name_n('Shannon', 3)
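
Arguments can be passed by position or by keyword. The sketch below redefines ``long_name_n`` in a condensed form so that it runs on its own. It behaves the same as the version above, and the expected results are shown in the comments.

```python
def long_name_n(name, n=7):
    """Return True if name has more than n characters and False otherwise."""
    return len(name) > n

print(long_name_n('Shannon'))               # False: 7 is not greater than the default n=7
print(long_name_n('Shannon', 3))            # True: n passed by position
print(long_name_n('Shannon', n=3))          # True: n passed by keyword
print(long_name_n(n=10, name='Stephanie'))  # False: keyword arguments can appear in any order
```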

# Lambda Expressions (Anonymous Functions)

Lambda expressions are functions consisting of a single expression.  They are called anonymous because they aren't given a name.  They are declared with the keyword ``lambda``.  Lambda expressions are of the form

~~~~
lambda argument(s) : expression
~~~~

There are some functions (two examples are the ``map`` and ``filter`` functions) that take as input another function.  A lambda expression is a way to define an input function without actually defining a formal function.  It is like a temporary function that will only be used once.

In [None]:
my_names = ['Lewis', 'Hathaway', 'Hobson', 'Innocent']

In [None]:
def reverse_name(name):
    return name[::-1]

list(map(reverse_name, my_names))

In [None]:
list(map(lambda y : y[::-1], my_names))
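
The same idea applies to ``filter`` and to the ``key`` argument of ``sorted``. The following sketch reuses ``my_names``. The cutoff of 6 characters and the variable names are arbitrary choices for illustration.

```python
my_names = ['Lewis', 'Hathaway', 'Hobson', 'Innocent']

# filter keeps the items for which the function returns True
long_names = list(filter(lambda name: len(name) > 6, my_names))
print(long_names)  # ['Hathaway', 'Innocent']

# The equivalent list comprehension is often easier to read
print([name for name in my_names if len(name) > 6])  # ['Hathaway', 'Innocent']

# A lambda is also handy as a sort key: here, sort by last letter
by_last_letter = sorted(my_names, key=lambda name: name[-1])
print(by_last_letter)  # ['Hobson', 'Lewis', 'Innocent', 'Hathaway']
```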

In [3]:
import this

The Zen of Python, by Tim Peters

Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!
