# Python Review: Data Structures and Functions

Reading:  *Python for Data Analysis*, Sections 3.1 and 3.2 from Chapter 3

# Tuples

A tuple is an immutable sequence of Python objects.  Tuples can be created with a comma-separated sequence of values.  Parentheses are not required, but can be used for readability.   The built-in ``tuple`` function is used to convert another object to a tuple.


In [None]:
tup = 1, 2, 3
print(tup)

In [None]:
another_tup = (4,5,6,7)
print(another_tup)

In [None]:
tuple_from_list = tuple([1,2,3])

In [None]:
# new tuples can be created from existing tuples
tup + another_tup

In [None]:
tup * 3

## Methods for tuples

Since tuples are immutable, there aren't as many methods for tuples as for, say, lists.

``tup.count(x)`` will count the number of occurrences of ``x``.  
``tup.index(x)`` will return the first index of ``x`` and will raise a ``ValueError`` if ``x`` is not present.

In [None]:
tup = (1,1,3,2)
tup.count(1)

In [None]:
tup.index(1)

In [None]:
tup.index(4)
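
As a minimal sketch of the two behaviors described above, the cell below tries to change a tuple and looks up a value that is not present.  The tuple ``point`` and the other variable names are made up for illustration.

```python
point = (1, 1, 3, 2)  # hypothetical example tuple

# Tuples are immutable, so item assignment raises a TypeError.
try:
    point[0] = 10
except TypeError as err:
    assignment_error = str(err)

# index raises a ValueError when the value is not in the tuple.
try:
    position = point.index(4)
except ValueError:
    position = None

print(assignment_error)
print(position)
```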

## Tuple unpacking

One of the great things about using tuples is their ability to be *unpacked*.  Tuple unpacking is when we extract values of the tuple into other variables.

In [None]:
tup = (12, 7, 19, (1, 3, 5))
a, b, c, d = tup

In [None]:
print(a)

In [None]:
print(d)

In [None]:
a, b, *rest = tup
print(b)
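
A short sketch of a few more unpacking patterns, using a tuple with the same values as ``tup`` above.  The name ``record`` and the target variable names are chosen only for this example.

```python
record = (12, 7, 19, (1, 3, 5))  # same values as tup above

# A starred name collects the leftover values into a list.
first, second, *rest = record
print(rest)  # [19, (1, 3, 5)]

# Nested tuples can be unpacked in the same statement.
a, b, c, (x, y, z) = record
print(z)  # 5

# Unpacking also gives a compact way to swap two variables.
first, second = second, first
print(first, second)  # 7 12
```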

# Lists

A list is a mutable sequence of Python objects.  Lists are defined with square brackets ``[`` ``]``.  The built-in ``list`` function is used to convert another object (such as a tuple or a generator) to a list.

In [None]:
my_list = [1, 2, 3]

In [None]:
another_list = list(tup)

In [None]:
print(another_list)

In [None]:
list(range(0,31,2))

In [None]:
#  a list can also be a list of lists
list_of_lists = [[1,2,3], [4,5,6], [7,8,9]]
print(list_of_lists)

In [None]:
list_of_lists[0] = 'Hello'
list_of_lists

## Common Methods for Lists

Lists are mutable, meaning they can be changed after they are created.  Here are a few of the common methods for lists. (Note that most of these methods change the list in place.)  All the methods can be shown by typing ``dir(list_object)`` or ``list_object.<tab>``.

The ``.index`` and ``.count`` methods work similarly to the tuple methods.  Other common methods include:

* ``my_list.append(x) #append an object to the end of a list``  
* ``my_list.insert(index, x) #insert an object at a specific location (index will be the index of the inserted item)``   
* ``my_list.extend(list) #append another list to a list``    
* ``my_list.remove(value) #remove the first occurrence of value``   
* ``my_list.pop(index) #remove and return the item at index (default is index = -1, the last element)``  
* ``my_list.reverse() #reverse the list``    
* ``my_list.sort() # sort the list ``
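
The methods listed above can be sketched on a small throwaway list.  The list ``letters`` is hypothetical and is separate from ``my_list`` used in the cells that follow; the comments show the list after each call.

```python
letters = ['b', 'a', 'd']    # hypothetical example list

letters.append('c')          # ['b', 'a', 'd', 'c']
letters.insert(1, 'z')       # ['b', 'z', 'a', 'd', 'c']
letters.extend(['a', 'e'])   # ['b', 'z', 'a', 'd', 'c', 'a', 'e']
letters.remove('a')          # ['b', 'z', 'd', 'c', 'a', 'e']
last = letters.pop()         # returns 'e'; list is ['b', 'z', 'd', 'c', 'a']
second = letters.pop(1)      # returns 'z'; list is ['b', 'd', 'c', 'a']
letters.reverse()            # ['a', 'c', 'd', 'b']

# sort changes the list in place and returns None.
returned = letters.sort()    # ['a', 'b', 'c', 'd']

print(letters, last, second, returned)
```

Apart from ``pop``, these methods return ``None``, so writing ``letters = letters.sort()`` would discard the list.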

In [None]:
my_list = [1, 3, 5, 7, 9, 12]

In [None]:
# add the string "fun" to the end of my_list
my_list.append('fun')

In [None]:
print(my_list)

In [None]:
# add the string "data science" between the numbers 3 and 5


In [None]:
# append the strings "i", "love", and "fun" to the end of my_list in one line of code.


In [None]:
# remove the first "fun"


In [None]:
# remove and return the last element of my_list


In [None]:
# remove and return the number 5 in my_list


In [None]:
print(my_list)

In [None]:
#  sort my_list.  Does it make sense to sort this list?   

In [None]:
my_numbers = [34, 0, 1, 2, 5, 3, 1, 13, 8, 21]
my_numbers.sort()
print(my_numbers)

In [None]:
# Lists of strings are sorted by character code, so words starting with capital letters come first, then lower case
my_words = ["I", "Love", "data", "science", "It", "is", "Fun"]
my_words.sort()
print(my_words)

Note that there is also a built-in function called ``sorted`` that returns a new sorted list without changing the original list.

In [None]:
my_words = ["I", "Love", "data", "science", "It", "is", "Fun"]
sorted(my_words)

In [None]:
print(my_words)
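
As a small sketch of how ``sorted`` can be adjusted, the cell below uses its ``key`` and ``reverse`` arguments.  Using ``str.lower`` as the key is one possible way to ignore case; the variable names are made up for this example.

```python
words = ["I", "Love", "data", "science", "It", "is", "Fun"]

# Default ordering: capitalized words come before lower-case words.
default_order = sorted(words)

# key=str.lower compares the lower-cased words, so case is ignored.
ignore_case = sorted(words, key=str.lower)

# reverse=True sorts from largest to smallest.
descending = sorted([3, 1, 2], reverse=True)

print(default_order)  # ['Fun', 'I', 'It', 'Love', 'data', 'is', 'science']
print(ignore_case)    # ['data', 'Fun', 'I', 'is', 'It', 'Love', 'science']
print(descending)     # [3, 2, 1]
```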

## Indexing and Slicing

Before moving on to dictionaries, let's talk about indexing and slicing.  As we've said, indexing in Python begins at 0.  Sequences such as strings, lists, and tuples are indexed with bracket notation.  Python also supports negative indexing, so index -1 is the last element, -2 is the second to last, etc.

In [None]:
my_string = "James Hathaway"
my_list = [0, 1, 1, 2, 3, 5, 8, 21, 34]
my_tuple = (4, 5, 6)

In [None]:
my_string[0]

In [None]:
my_list[-2]

In [None]:
# my_tuple only has indexes 0, 1, and 2, so this raises an IndexError
my_tuple[3]

Slicing means selecting sections of objects.  The basic form of slice notation is ``start:stop``.  Note that the ``start`` index is included, but the ``stop`` index is excluded.  The ``start`` and ``stop`` indexes can be omitted, in which case they default to the start and end of the sequence, respectively.

A third argument will add a step size, ``start:stop:step``.   
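
A brief sketch of the slice forms described above, using a hypothetical list ``digits`` and the string ``"Python"`` rather than the objects defined earlier.

```python
digits = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]  # hypothetical example list

middle = digits[2:5]        # start included, stop excluded
head = digits[:3]           # start omitted: begins at index 0
tail = digits[7:]           # stop omitted: runs to the end
last_three = digits[-3:]    # negative indexes count from the end
evens = digits[::2]         # step of 2
backwards = digits[::-1]    # step of -1 walks the list in reverse
letters = "Python"[1:4]     # slicing works on strings too

print(middle, head, tail, last_three, evens, backwards, letters)
```

Slicing returns a new object, so ``digits`` itself is unchanged by any of these lines.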

In [None]:
# Get the word "James" from my_string

In [None]:
# From my_list, get the numbers 5, 8, 21, and 34

In [None]:
# From my_list, get the numbers 1, 1, 2, 3

In [None]:
# From my_string, get every other character


In [None]:
# reverse my_list (without changing it)


# Dictionaries

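Dictionaries, or ``dict`` objects, are mappings: collections of values that are stored and looked up by key.  (Since Python 3.7, a ``dict`` preserves the order in which keys were inserted, but items are accessed by key, not by position.)  They are similar to hash tables or hash maps in other languages.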

## Constructing a Dictionary

Dictionaries can be constructed with curly braces ``{key:value,...}``.    The *values* of a ``dict`` object can be any Python object, but the *keys* are generally scalars, like integers or strings.  (Strictly, the keys need to be hashable objects, which is true of most immutable objects.)

In [None]:
my_dict = {'key1': 12, 'key2': 13, 'key3': 14}

In [None]:
dict_two = {1:'Country',2:'County',3:'City', 4:'Zip', 5:'Street'}

In [None]:
another_dict = {'A': 456, 'B': [1, 2 ,3], 'C': (42, 'Answer')}

In [None]:
# call items from the dictionary with the key
dict_two[4]

In [None]:
another_dict['C']

In [None]:
another_dict['C'][1]

In [None]:
# Check if a dict contains a certain key
1 in dict_two

In [None]:
1 in another_dict
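
The requirement that keys be hashable can be sketched as follows: a tuple of numbers works as a key, while a list does not.  The dictionary ``coordinates`` and the other names are made up for this example.

```python
# A tuple of immutable values is hashable, so it can be used as a key.
coordinates = {(0, 0): 'origin', (1, 0): 'east'}
print(coordinates[(1, 0)])

# A list is mutable and not hashable, so using one as a key raises a TypeError.
try:
    bad = {[0, 0]: 'origin'}
except TypeError as err:
    unhashable_message = str(err)

print(unhashable_message)
```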

Dictionaries can also be constructed from a sequence of 2-tuples or built up from an empty dictionary.

In [None]:
tups = (('Dog','Bark'), ('Duck','Quack'), ('Cow','Moo'), ('Cat','Meow'))
tups

In [None]:
dict(tups)

In [None]:
d = {}
d

In [None]:
# Built from empty
d = {}
d['Dog'] = 'Bark'
d['Duck'] = 'Quack'
d

It is somewhat common to build up a dictionary in a loop. For example, let's build a dictionary that takes a list of words and categorizes them alphabetically.

In [None]:
list_of_words = ['blue', 'red', 'Read', 'apple', 'Baseball', 'bear', 'car']

In [None]:
d = {}
for word in list_of_words:
    first_letter = word[0].lower()
    if first_letter in d:
        d[first_letter].append(word)
    else:
        d[first_letter] = [word]

d

## Some Methods for Dictionaries

In [None]:
# return a view of the keys (wrap in list() to get a list)
d.keys()

In [None]:
# return a view of the values (wrap in list() to get a list)
d.values()

In [None]:
# return a view of (key, value) tuples
d.items()
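
A short sketch of how these methods are commonly used, with a hypothetical dictionary ``animal_sounds``.  It also shows ``get``, which returns a default value instead of raising a ``KeyError`` when a key is missing.

```python
animal_sounds = {'Dog': 'Bark', 'Duck': 'Quack'}  # hypothetical example dict

# keys() returns a view, which reflects later changes to the dictionary.
keys = animal_sounds.keys()
animal_sounds['Cow'] = 'Moo'
print(list(keys))  # ['Dog', 'Duck', 'Cow']

# items() is convenient for looping over keys and values together.
sentences = []
for animal, sound in animal_sounds.items():
    sentences.append(f"{animal} says {sound}")
print(sentences)

# get returns the default when the key is missing.
cat_sound = animal_sounds.get('Cat', 'unknown')
print(cat_sound)  # unknown
```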

# Sets

A set is an unordered collection of unique elements.  Sets can be created with the ``set`` function or with curly braces.

In [None]:
A = set([1,2,3,3,4,5])
A

In [None]:
B = {2, 3, 2, 3, 2, 3, 4}
B

Sets support mathematical set operations like union, intersection, difference, etc.  
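
The set operations mentioned above can be sketched with the operators below (each also has a method form, such as ``A.union(B)``).  The sets use the same values as ``A`` and ``B`` above, and ``sorted`` is used only to print the elements in a predictable order.

```python
A = {1, 2, 3, 4, 5}
B = {2, 3, 4}

union = A | B                  # elements in either set
intersection = A & B           # elements in both sets
difference = A - B             # elements in A but not in B
symmetric_difference = A ^ B   # elements in exactly one of the sets
is_subset = B <= A             # True when every element of B is in A

print(sorted(union))                 # [1, 2, 3, 4, 5]
print(sorted(intersection))          # [2, 3, 4]
print(sorted(difference))            # [1, 5]
print(sorted(symmetric_difference))  # [1, 5]
print(is_subset)                     # True
```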

# List Comprehensions

List comprehensions are a way to create a list from another list (or collection) that might normally need to be created with a ``for`` loop. 

The general form of a list comprehension is 
~~~~
[expression for val in collection if condition]
~~~~
and is equivalent to 
~~~~
result = []
for val in collection:
    if condition:
        result.append(expression)
~~~~
(The filter condition can be left out if it is not required.)

For example, find a list of numbers from 1 to 100 that are divisible by 7.

In [None]:
result = []
for num in range(1,101):
    if num % 7 == 0:
        result.append(num)
result

In [None]:
# as a list comprehension
# [expression for val in collection if condition]
result = [num for num in range(1,101) if num % 7 == 0]
result
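
As a further sketch, the expression part of a comprehension can transform each value, and the filter condition can be dropped entirely.  The list ``words`` is a made-up example.

```python
words = ['blue', 'red', 'apple']  # hypothetical example list

# No filter condition: every word contributes one value.
lengths = [len(word) for word in words]

# Expression and filter together: upper-case only the short words.
short_upper = [word.upper() for word in words if len(word) <= 4]

print(lengths)      # [4, 3, 5]
print(short_upper)  # ['BLUE', 'RED']
```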

## Dictionary and Set Comprehensions

Dictionaries and sets can be created in a similar way.

~~~~
# dictionary comprehension
d = {key-expression : value-expression for val in collection if condition}

# set comprehension
s = {expression for val in collection if condition}
~~~~

As an example, let's create a dictionary where the key is a number between 1 and 20 that is divisible by 2 and the value is a list of the square and cube of the number.   

In [None]:
d = {}
for num in range(1,21):
    if num % 2 == 0:
        d[num] = [num**2, num**3]
d

In [None]:
# as a dictionary comprehension
{num: [num**2, num**3] for num in range(1,21) if num % 2 == 0}
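
A set comprehension follows the same pattern.  As a sketch, the cell below collects the distinct first letters from a list of words (the same words as ``list_of_words`` earlier) and builds a dictionary of word lengths; the variable names are chosen only for this example.

```python
words = ['blue', 'red', 'Read', 'apple', 'Baseball', 'bear', 'car']

# Set comprehension: duplicates collapse, so each letter appears once.
first_letters = {word[0].lower() for word in words}
print(sorted(first_letters))  # ['a', 'b', 'c', 'r']

# Dictionary comprehension with a filter: keep words longer than 3 letters.
word_lengths = {word: len(word) for word in words if len(word) > 3}
print(word_lengths)
```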

# Functions

A function is a way to organize and reuse code.  Functions in Python are created with the ``def`` statement and take the following form:

~~~~
def name_of_function(arg1, arg2, more_args):
    '''
    Docstring that describes what the function does
    '''
    
    expression
    return
~~~~

In [None]:
def long_name(name):
    '''
    This function takes a name as input and returns True if the name is long
    and False if the name is short.
    '''
    # The comparison already evaluates to True or False.
    return len(name) > 7

    

In [None]:
long_name('Shannon')

In [None]:
long_name('Stephanie')

In [None]:
def long_name_n(name, n):
    '''
    This function returns True if name has more than n letters and False otherwise.
    '''
    # The comparison already evaluates to True or False.
    return len(name) > n


In [None]:
long_name_n('Shannon', 3)

In [None]:
long_name_n('Frank', 5)

In [None]:
def long_name_n(name, n=7):
    '''
    This function returns True if name has more than n letters and False otherwise.
    The default value of n is 7.
    '''
    # The comparison already evaluates to True or False.
    return len(name) > n


In [None]:
long_name_n('Shannon')

In [None]:
long_name_n('Shannon', 3)
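
Arguments can be passed by position or by keyword.  The sketch below redefines ``long_name_n`` in a compact form so that the cell stands alone, then calls it in several ways.

```python
def long_name_n(name, n=7):
    """Return True if name has more than n letters and False otherwise."""
    return len(name) > n

uses_default = long_name_n('Shannon')              # n falls back to 7
positional = long_name_n('Shannon', 3)             # n passed by position
keyword = long_name_n('Shannon', n=3)              # n passed by keyword
any_order = long_name_n(n=10, name='Stephanie')    # keywords may come in any order

print(uses_default, positional, keyword, any_order)  # False True True False
```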

# Lambda Expressions (Anonymous Functions)

Lambda expressions are functions whose body is a single expression; they are called anonymous because they aren't given a name.  They are declared with the keyword ``lambda``.  Lambda expressions are of the form

~~~~
lambda argument(s) : expression
~~~~

There are some functions (two examples are the ``map`` and ``filter`` functions) that take as input another function.  A lambda expression is a way to define an input function without actually defining a formal function.  It is like a temporary function that will only be used once.

In [None]:
my_names = ['Lewis', 'Hathaway', 'Hobson', 'Innocent']

In [None]:
def reverse_name(name):
    return name[::-1]

list(map(reverse_name, my_names))

In [None]:
list(map(lambda y : y[::-1], my_names))
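
The ``filter`` function mentioned above works in a similar way: it keeps the items for which the input function returns a true value.  As a sketch, the cell below uses lambda expressions with ``filter`` and with the ``key`` argument of ``sorted``; the seven-letter cutoff is chosen only for illustration.

```python
my_names = ['Lewis', 'Hathaway', 'Hobson', 'Innocent']

# Keep only the names with more than 7 letters.
long_names = list(filter(lambda name: len(name) > 7, my_names))
print(long_names)  # ['Hathaway', 'Innocent']

# Sort the names by their last letter.
by_last_letter = sorted(my_names, key=lambda name: name[-1])
print(by_last_letter)  # ['Hobson', 'Lewis', 'Innocent', 'Hathaway']
```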

In [None]:
# long_name requires a name argument, so calling it with no arguments raises a TypeError
long_name()