# Data Structures in Python
***
**Josh Wilkins<br>3/14/2018**

In [1]:
# CSS

from IPython.core.display import HTML
styles = open("custom.css", "r").read()
HTML(styles)

## Context
***
This is an introduction to Python data structures, including lists, strings, NumPy arrays, and more.<br>
Others are encouraged to improve and add to this document; add each new section in the block just before Revision History.<br>
Use a double-pound (##) header to automatically add it to the table of contents.

Current document location: \\bevfs01\Engineering_Projects\Training\Jupyter\Structures\

## Useful References
***
- [Introduction to Python - The Python Guru](http://thepythonguru.com/)
- [Python Tutorial at Tutorials Point](https://www.tutorialspoint.com/python/index.htm)
- [Exception Handling](https://stackoverflow.com/questions/16138232/is-it-a-good-practice-to-use-try-except-else-in-python)

## Numbers
***
### Numeric Types

- **int** (signed integers): Integers or ints are positive or negative whole numbers with no decimal point
- **long** (long integers): Also called longs, they are integers of unlimited size, written like integers and followed by an uppercase or lowercase L
- **float** (floating point): Floats are written with a decimal point. Floats may also be in scientific notation, with e or E indicating the power of 10
- **complex** (complex numbers): In the form a + bj, where a and b are floats and j (or J) is the imaginary number $\sqrt{-1}$.

In [None]:
# Assignment types are implicit; there is no need to declare x as an int
x = 3  # Ints are signed integers
print type(x)

In [None]:
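# Integers larger than sys.maxint are promoted to long automatically
# (this value is a long where ints are 32-bit; where ints are 64-bit it stays an int)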
x = 9876543210
print type(x)

In [None]:
# Anything with a decimal point is recognized as a float
x = 3.
print type(x)

In [None]:
# Complex numbers represented with a j
x = 1 + 2j
x = complex(1, 2)  # Can also be made by casting two numbers with complex()
print type(x)

# Can grab the real and imaginary parts of a complex number
print x.real
print x.imag

### Casting
Casting is the process of explicitly converting a value from one type to another.

In [None]:
# Changing data types is fairly straightforward
x = 2  # Is an integer
print type(x)

# Convert to a float
x = float(x)
print type(x)

# Convert to a long integer
x = long(x)
print type(x)

# Convert to a complex number
x = complex(x) # Same as complex(x, 0)
print x
print type(x)
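
Casting can silently lose information, which is easy to miss. The sketch below is an illustrative addition (the variable names are arbitrary) showing that int() truncates toward zero rather than rounding, and that numeric strings can be cast as well. It uses single-argument print() calls so that it runs under Python 2 or Python 3.

```python
# int() drops the fractional part (truncates toward zero); it does not round
truncated_pos = int(2.9)
truncated_neg = int(-2.9)

# Numeric strings can be cast to numbers
from_string = float("3.5")
whole = int("42")

# Casting a non-numeric string raises a ValueError
try:
    int("abc")
    cast_failed = False
except ValueError:
    cast_failed = True

print(truncated_pos)  # 2
print(truncated_neg)  # -2
print(from_string)    # 3.5
print(whole)          # 42
print(cast_failed)    # True
```

If rounding is what you want, call round() before casting to int.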

### Numeric Functions
A list of the most useful mathematical functions in Python (not exhaustive). abs and round are built-ins; the others are available from either the numpy package or the math package.<br>
[Here is a more extensive list](https://www.tutorialspoint.com/python/python_numbers.htm)
- **abs(x)**: Absolute value
- **exp(x)**: $e^x$
- **log(x)**: Log base e of x
- **log10(x)**: Log base 10 of x
- **sqrt(x)**: $\sqrt{x}$
- **round(x, n)**: Rounds a number to n decimal places

In [None]:
import math as m
import numpy as np

x = -5.555
x = abs(x)
print x  # Absolute value of x
print m.exp(x)    # e^x
print np.log(x)   # Natural log (base e) of x
print m.log10(x)  # log10(x)
print np.sqrt(x)  # Square-root of x
print round(x, 1) # Rounding x to tenths place

### Additional Information

Difference between math and numpy packages:

The **math** package is part of the standard Python library. It provides functions for basic mathematical operations as well as some commonly used constants. Use the math package if you are doing simple computations with scalars only (no lists or arrays). [Math functions and methods](https://docs.python.org/2/library/math.html)

The **numpy** package is a third-party package geared towards scientific computing. It is the de facto package for numerical and vector operations in Python. It provides routines optimized for vector and array computations and, as a result, is much faster for such operations than plain Python lists. Use numpy if you are doing scientific computations with matrices, arrays, or large datasets. [Numpy functions & methods](https://docs.scipy.org/doc/numpy-1.13.0/reference/routines.math.html)

In [None]:
# Integer division

# Careful: if both numbers are integers, the result will be an integer
print 1 / 2    # Results in 0 because both numbers are integers
print 1.0 / 2  # But if either number is a float
print 1 / 2.0  # The result will be a float

# Rounding
print round(3.0 / 2.0)  # Rounds 1.5 to the nearest whole number, returned as a float
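
To make the intent of a division explicit, the floor-division operator and an explicit float cast can be used. The following is a small illustrative sketch (the variable names are placeholders, not from the original notebook); it behaves the same under Python 2 and Python 3.

```python
a, b = 7, 2

# // always performs floor division, regardless of Python version
floor_quotient = a // b

# Casting one operand to float forces true division
true_quotient = float(a) / b

# divmod returns the floor quotient and the remainder together
quotient, remainder = divmod(a, b)

# Floor division rounds toward negative infinity, not toward zero
negative_floor = -7 // 2

print(floor_quotient)  # 3
print(true_quotient)   # 3.5
print(quotient)        # 3
print(remainder)       # 1
print(negative_floor)  # -4
```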

## Lists
***
### Basics
List creation, indexing and slicing

In [None]:
# List creation, indexing, and slicing

# List creation; note the data types do not have to be the same
# But it's common for them to all be of one data type
l = [1, '2', 3.0]
print l

# Can also create a repeated list like a list of ones
l = [1] * 4
print l

# Can also cast things to list (not really useful here, but sometimes comes in handy)
l = list([1, 2, 3, 4, 5])
print l

# Indexing
print l[0]  # Python is zero-indexed (first element is l[0])
print l[-1] # Can index from end of list using negative numbers

# Slicing   # Use : to 'slice' a list in from:to format (the 'to' index is excluded)
print l[1:3]

# Last n elements
print l[-2:]  # Last two elements

# Up to last element
print l[:-1]

### List Functions
A list of the most useful list functions in Python (not exhaustive).<br>
[Here is a more extensive list](https://www.tutorialspoint.com/python/python_lists.htm)

- **min(list)**: Returns the lowest value in a list
- **max(list)**: Returns the highest value in a list
- **len(list)**: Returns the number of elements in a list
- **sum(list)**: Returns the sum of the elements in a list

In [None]:
# List functions

# Create list to modify
l = [3, 2, 1]
print l

x = min(l)  # Minimum value from list
print "min: " + str(x)  # Can print by concatenating strings (covered later)

x = max(l)  # Maximum value from list
print "max: %d" % x  # Or by using this format for printing

x = sum(l)  # Sum of all values in list
print "sum: %s" % x  # The format specifier converts the value (common ones: %d, %f, %s)

x = len(l)  # Length of list (cardinality or number of elements)
print "length: %0.1f elements" % x  # Can specify decimal places on floats with %X.Yf

### List Methods
A list of the most useful list methods in Python (not exhaustive).<br>
[Here is a more extensive list](https://www.tutorialspoint.com/python/python_lists.htm)

- **append(item)**: Appends an item to a list
- **extend(list)**: Appends a list to a list
- **sort()**: Sorts a list in place, from lowest to highest value
- **reverse()**: Reverses the order of a list in place

In [None]:
# List methods

# Create list to modify
l = [2, 1]
print l

# Adding elements to a list
l.append(7)
print l

# Adding two lists
l += [6, 5]  # Can add two lists (similar to l = l + [6, 5])
print l

l.extend([4, 3])  # Can also extend the list with another
print l

# Sorting the list
l.sort()  # Sorts in place; sorted(l) returns a new sorted list instead
print l

# Reversing a list
l.reverse()
print l

### Aliasing

Assigning a list to another variable does not copy it; the new name simply refers to the same object in memory. Two names that refer to the same object are said to be aliased. Aliasing keeps Python memory- and speed-efficient, because an assignment binds a name to an existing object instead of creating a new one. Every assignment works this way, but aliasing only causes surprises with mutable objects, since an immutable object cannot be changed through either name.<br>
See [here](http://henry.precheur.org/python/copy_list.html) for additional information

In [None]:
# Aliasing

# Create list and 'copy' it
l1 = [1, 2, 3]
l2 = l1

# The memory addresses of the two lists are the same, meaning it is the same list
print id(l1)
print id(l2)

# Note l1 and l2 point to the same memory location
# So changing one will alter the other
l2[0] = 100
print l1

# One way to make a real copy is to create an empty list, then extend it
l2 = []
l2.extend(l1)

# Show that the memory addresses are different
print id(l1)!=id(l2)

# Another way is to just cast list one into another list
l3 = list(l1)
print id(l1)!=id(l3)
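
The copies made with extend() or list() are shallow: the outer list is new, but any nested lists inside it are still shared. The sketch below is an illustrative addition using the standard library copy module; the variable names are arbitrary.

```python
import copy

nested = [[1, 2], [3, 4]]

shallow = list(nested)        # New outer list, same inner lists
deep = copy.deepcopy(nested)  # New outer list and new inner lists

# Modify an inner list through the original name
nested[0][0] = 100

print(shallow[0][0])  # 100, the inner list is still aliased
print(deep[0][0])     # 1, the deep copy is fully independent
```

Use copy.deepcopy when a list contains other mutable objects and the copy must not be affected by later changes to the original.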

### Misc List Info
Nested lists and deleting elements

In [None]:
# Nested lists

l = [[1], [2, 3], [4]]
print l

# Indexing nested list
print l[1]  # Will return list
print l[1][0] # Will return the first element of that list

# Removing an element (here the second inner list) from a list
del l[1]  # Use del to remove the item at that index
print l

## Sets
***
A set is an unordered collection of elements that is iterable, mutable, and has no duplicate elements.<br>
Sets seem to be far less common than lists in Python.

In [None]:
# Create a set
s = {1, 2, 3, 3, "3"}
print type(s)

# Duplicates are removed when the set is created (3 and "3" differ in type, so both stay)
print s

# Sets are mutable
s.add(9)
print s

# Unless they are frozen
s = frozenset(['a', 'b', 'c'])

# Which means they can no longer be changed
# s.add('d') # Would cause an error

# Also sets are not indexable
# print s[0] # Causes error, sets are not indexable

# A useful trick for removing the duplicates from a list:
# Just cast it to a set, then back to a list (the original order is not guaranteed)
l = [1, 2, 3, 3, 3]
l = list(set(l))
print l

### Set Methods

- **union(set)**: Returns the union of two sets. Same as Set1 | Set2
- **intersection(set)**: Returns the intersection of two sets. Same as Set1 & Set2
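
A minimal sketch of the set methods listed above. The example sets are made up for illustration, and difference() is an extra method included here for comparison. Results are sorted before printing because sets have no defined order.

```python
a = {1, 2, 3, 4}
b = {3, 4, 5}

union = a.union(b)                # Elements in either set
intersection = a.intersection(b)  # Elements in both sets
difference = a.difference(b)      # Elements in a but not in b

print(sorted(union))         # [1, 2, 3, 4, 5]
print(sorted(intersection))  # [3, 4]
print(sorted(difference))    # [1, 2]

# The operator forms give the same results
print(union == (a | b))         # True
print(intersection == (a & b))  # True
```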

## Strings
***
Strings in Python are essentially immutable sequences of characters.<br>
This means that most of the rules for reading lists (indexing, slicing, len) also apply to strings.<br>
Because strings are immutable, they cannot be changed in place; to alter a string, a new string must be created.

In [None]:
# String creation, indexing and slicing

# Create string
string = "ABCDEF"

# Essentially treat as a list of characters
# So indexing applies
print string[0]

# Slicing also applies
print string[:3]

# And min/max/len functions work
print len(string)

# But strings are immutable, they cannot be changed
# string[0] = 'Q'  # Would cause an error

### String Methods

A list of the most useful string methods in Python (not exhaustive).<br>
[Here is a more extensive list](https://www.tutorialspoint.com/python/python_strings.htm)<br>

- **upper()**: Makes entire string uppercase
- **lower()**: Makes entire string lowercase
- **title()**: Capitalizes first letter of every word 
- **strip([chars])**: Removes leading and trailing characters (whitespace by default)
- **replace(old, new)**: Replaces all instances of the specified substring with another substring
- **split([sep])**: Returns a list of strings, split on the specified separator

In [None]:
# String methods
string = "My, StRiNg"

# Uppercase
string = string.upper()
print string

# Lowercase
string = string.lower()
print string

# Title format
string = string.title()
print string

# Stripping strings
string = string.strip()  # Removes all leading and trailing whitespace in string

# Replacing characters
string = string.replace(' ', '')  # Removing all whitespace
print string

# Splitting strings
print string.split(',')

# You can also combine methods
string = "My, StRiNg"
print string.title().replace(' ', '').split(',')
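
The join() method is the natural counterpart to split(), and strip() accepts an optional set of characters to remove. The sketch below is an illustrative addition; the sample strings are made up.

```python
# split() breaks a string into a list; join() glues a list back together
words = "My, StRiNg".split(", ")
joined = "-".join(words)

# strip() removes whitespace by default, including newlines
trimmed = "  padded text \n".strip()

# Passing characters to strip() removes those characters from both ends
unwrapped = "xxhixx".strip("x")

print(words)      # ['My', 'StRiNg']
print(joined)     # My-StRiNg
print(trimmed)    # padded text
print(unwrapped)  # hi
```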

### Misc String Info

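Identical string literals often share the same memory address, but this is an implementation detail and should not be relied on; compare strings with ==, not by id.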

In [None]:
s1 = "string"
s2 = "string"
print id(s1)
print id(s2)

## Numpy Arrays
***
[Matlab Equivalencies](https://docs.scipy.org/doc/numpy-dev/user/numpy-for-matlab-users.html)<br>
[In-depth Tutorials](https://www.tutorialspoint.com/numpy/index.htm)

In [None]:
# Numpy

import numpy as np

arr1 = np.array([1, 2, 3])
print type(arr1)
print arr1

# Operations are performed element-wise
arr2 = np.array([1, 2, 3]).T  # .T (transpose) has no effect on a 1-D array
print arr1*arr2
print arr1 + arr2

# The dot product of two arrays. For 1-D arrays it is the inner product;
# for 2-D arrays it is equivalent to matrix multiplication.
print arr1.dot(arr2)

# Matrix product of two arrays (for two 1-D arrays this is also the inner product)
print np.matmul(arr2, arr1)

# Prefer numpy arrays over lists for mathematical operations:
# adding two lists concatenates them instead of adding element-wise
list1 = [1, 2, 3]
list2 = [1, 2, 3]
print list1 + list2
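
For comparison, here is roughly what the element-wise operations look like with plain Python lists and no numpy. This is an illustrative sketch only (the names are arbitrary); it shows why arrays are more convenient for math.

```python
list1 = [1, 2, 3]
list2 = [4, 5, 6]

# + on lists concatenates
concatenated = list1 + list2

# Element-wise addition needs an explicit loop or comprehension
elementwise_sum = [a + b for a, b in zip(list1, list2)]

# So does the dot (inner) product
dot_product = sum(a * b for a, b in zip(list1, list2))

print(concatenated)     # [1, 2, 3, 4, 5, 6]
print(elementwise_sum)  # [5, 7, 9]
print(dot_product)      # 32
```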

## Tuples
***
### Basics
A tuple is an immutable sequence of Python objects.<br>
The general len/min/max functions work with tuples, as do indexing and slicing.<br>
[More on tuples here](https://www.tutorialspoint.com/python/python_tuples.htm)

In [None]:
# Creating a tuple
tup1 = ('physics', 'chemistry', 1997, 2000)
tup2 = (1, 2, 3, 4, 5)
tup3 = "a", "b", "c", "d"  # Parentheses are optional
print type(tup1)
print tup1
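
A short illustrative sketch of two tuple behaviors worth knowing: unpacking and immutability. The names are placeholders chosen for this example.

```python
point = (3, 4)

# Tuples can be unpacked into separate variables
x, y = point

# Tuples are immutable, so item assignment raises a TypeError
try:
    point[0] = 10
    modified = True
except TypeError:
    modified = False

# A one-element tuple needs a trailing comma
single = (5,)
not_a_tuple = (5)  # This is just the integer 5 in parentheses

print(x)                  # 3
print(y)                  # 4
print(modified)           # False
print(type(single))
print(type(not_a_tuple))
```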

## Time and DateTime
***
The **time** module handles the collection of time data from the device or platform (See [here](https://docs.python.org/3/library/time.html))<br>
Not to be confused with the **datetime** module that handles time information (See [here](https://docs.python.org/3/library/datetime.html))<br>
For additional date time object help and conversion, try the astropy package [here](http://docs.astropy.org/en/stable/time/index.html)

### Time Module
This module provides various time-related functions.<br>
Although this module is always available, not all functions are available on all platforms.

In [None]:
# Timing a function

import time

start = time.time()
time.sleep(0.25)  # Do something here (example wait in seconds)
end = time.time()

elapsed = end - start
print(elapsed)

### Datetime Module: Time
A time instance only holds values of time, and not a date associated with the time<br>
See [here](https://pymotw.com/2/datetime/) for more information on the datetime module

In [None]:
# The time class in the datetime package

from datetime import time

t = time(1, 2, 3, 4)
print t
print type(t)
print 'hour  :', t.hour
print 'minute:', t.minute
print 'second:', t.second

# Highest resolution of a time object is integer microseconds
print 'microsecond:', t.microsecond

### Datetime Module: Date
A date instance only holds date values and not a time associated with the date.

In [None]:
# The date class in the datetime package

from datetime import date

today = date.today()
print type(today)
print today
print 'Year  :', today.year
print 'Month :', today.month

# The resolution for date objects is integer days
print 'Day   :', today.day

### Datetime Module: Datetime
A datetime instance holds both the date and the time.

In [None]:
# The datetime class in the datetime package

from datetime import datetime

t = datetime.now()
print type(t)
print t
print 'Year  :', t.year
print 'Month :', t.month
print 'Day   :', t.day
print 'hour  :', t.hour
print 'minute:', t.minute
print 'second:', t.second
print 'micro :', t.microsecond

### Datetime Module: Timedelta
Class in datetime module that handles time deltas. Holds only days, seconds, and microseconds data

In [None]:
from datetime import datetime as dt
import time

t1 = dt.now()
time.sleep(0.25)
t2 = dt.now()

elapsed = t2-t1
print elapsed
print type(elapsed)
print elapsed.total_seconds()
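
Timedelta objects can also be created directly and added to or subtracted from datetimes. The sketch below is an illustrative addition with an arbitrary fixed start date, so the output is the same every time it runs.

```python
from datetime import datetime, timedelta

start = datetime(2018, 3, 14, 9, 0, 0)

# Adding a timedelta to a datetime gives a new datetime
later = start + timedelta(days=30, hours=2)

# Subtracting two datetimes gives a timedelta
diff = later - start

print(later)                 # 2018-04-13 11:00:00
print(diff.days)             # 30
print(diff.seconds)          # 7200 (the 2 hours; days are stored separately)
print(diff.total_seconds())  # 2599200.0
```

Note that the seconds attribute holds only the part of the delta that is less than a day; use total_seconds() for the full duration.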

## Dictionaries
***
Dictionaries map keys to values in key:value pairs.<br>
Conceptually, a dictionary behaves like a collection of (key, value) tuples in which every key is unique and has a corresponding value.<br>
[More on dictionaries here](https://www.tutorialspoint.com/python/python_dictionary.htm)

### Basics
Dictionary creation, keys and values.

In [None]:
# Dictionary creation, keys, and values

# Create Dictionary
d = {'name' : 'Jack', 'age' : 26, 'city' : 'Boston', 'foods' : ['Pizza', 'Spaghetti', 'Onions']}
print type(d)
print d
print d.keys()   # List of the keys
print d.values() # List of the values

### Converting Lists
One useful approach to creating a dictionary is to create two separate lists of equal length, convert them into paired tuples, and then convert them into a dictionary. This is typically done with the zip() function.

In [None]:
# Dictionary from two lists

# Create the two lists
keys = ['name', 'age', 'city']
values = ['Jack', 26, 'Boston']

# Create a list of tuples
tup = zip(keys, values)
print type(tup)
print type(tup[0])
print tup

# Then cast it to a dictionary
d = dict(tup)
print d
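
Once a dictionary exists, values are read and written by key. This illustrative sketch reuses the same sample data; the 'country' key is a made-up example of a key that is missing at first.

```python
keys = ['name', 'age', 'city']
values = ['Jack', 26, 'Boston']
d = dict(zip(keys, values))

# Look up a value by key
age = d['age']

# get() returns a default instead of raising KeyError for a missing key
country = d.get('country', 'unknown')

# Membership tests check the keys
has_city = 'city' in d

# Assigning to a key updates it, or adds it if it does not exist
d['age'] = 27
d['country'] = 'USA'

print(age)           # 26
print(country)       # unknown
print(has_city)      # True
print(d['age'])      # 27
print(d['country'])  # USA
```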

### Ordered Dictionaries

Plain dictionaries in Python 2 do not preserve the order in which items were inserted.<br>
If you need to maintain order, use an OrderedDict.

In [None]:
from collections import OrderedDict
d = OrderedDict(sorted(tup))  # Sorts on key
print d
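
An OrderedDict remembers the order in which keys were inserted. The sketch below is an illustrative addition showing both insertion order and the sorted-by-key construction used above; the sample data is the same made-up record.

```python
from collections import OrderedDict

# Keys come back in the order they were inserted
od = OrderedDict()
od['name'] = 'Jack'
od['city'] = 'Boston'
od['age'] = 26
insertion_order = list(od.keys())

# Building from sorted (key, value) pairs gives keys in sorted order
pairs = [('name', 'Jack'), ('city', 'Boston'), ('age', 26)]
sorted_order = list(OrderedDict(sorted(pairs)).keys())

print(insertion_order)  # ['name', 'city', 'age']
print(sorted_order)     # ['age', 'city', 'name']
```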

## Flow Control
***

### Basics
- **break**: Ends the loop
- **continue**: Jump to the next iteration in the loop
- **pass**: Do nothing (no-op)

In [None]:
# Flow Control

# The for loop
for i in range(10):
    if i == 4:
        break  # End the for loop at i == 4
    elif i == 1:
        continue  # Go to next i (in this case 2)
    elif i == 0:
        pass  # Do nothing
    else:
        print i

# A while loop
i = 0
while i < 3:
    if i == 2:
        print "While Loop Success"
    else:
        pass
    i += 1  # Same as i = i + 1

## Iterators
***
### Dictionary Iterations
You can loop through the dictionary keys, the values, or both at the same time.

In [None]:
# Looping through dictionaries

d = {'name' : 'Jack', 'age' : 26, 'city' : 'Boston', 'foods' : ['Pizza', 'Spaghetti', 'Onions']}

# Can loop through just the keys or values
# print "Keys:"
# for key in d.keys():
#     print key

# print "\nValues:"
# for val in d.values():
#     print val

# Or over both the keys and the values
for key, val in d.iteritems():
    print "%s : %s" % (key, val)
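
The iteritems() method exists only in Python 2. A version of the same loop that works in both Python 2 and Python 3 can use items() instead; the sketch below (an illustrative addition) also sorts by key so the output order is predictable.

```python
d = {'name': 'Jack', 'age': 26, 'city': 'Boston'}

lines = []
# items() yields (key, value) pairs; sorted() orders them by key
for key, val in sorted(d.items()):
    lines.append("%s : %s" % (key, val))

for line in lines:
    print(line)
# age : 26
# city : Boston
# name : Jack
```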

### List Comprehension
A shorthand for building a list by iterating over another sequence, with optional conditionals.

In [None]:
# List Comprehension

# Create the values to iterate over
import numpy as np
numbers = np.arange(10)  # Integer array from 0-9

# Basic list comprehension, grab first three elements of numbers
new_numbers = [ x for x in numbers[:3] ]
print new_numbers

# Which is equivalent to
new_numbers = []
for x in numbers[:3]:
    new_numbers.append(x)
print new_numbers

# Can add conditional statements to list comprehension
evens = [x for x in numbers if x % 2 == 0]  # Grab just the evens into a list
print evens

# Can also modify the element before it goes into the new list
even_strings = [str(x) for x in numbers if x % 2 == 0]  # Grab just the evens, converted to strings
print even_strings

# Just so it's clear, this is equivalent to
even_strings = []
for x in numbers:
    if x % 2 == 0:
        even_strings.append(str(x))
print even_strings
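
The same comprehension syntax extends to dictionaries and to nested loops. This is an illustrative sketch, using a plain list rather than a numpy array; the variable names are arbitrary.

```python
numbers = list(range(10))

# Dictionary comprehension: map each even number to its square
even_squares = {x: x * x for x in numbers if x % 2 == 0}

# Nested loops in one comprehension; the leftmost 'for' is the outer loop
pairs = [(x, y) for x in range(2) for y in range(3)]

print(even_squares[4])  # 16
print(pairs)            # [(0, 0), (0, 1), (0, 2), (1, 0), (1, 1), (1, 2)]
```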

### Misc

The enumerate and range functions

In [None]:
# Enumerate

# Create the list
l = ["a", "b", "c"]

for item in enumerate(l):
    print item  # enumerate creates a tuple

# So you can split the tuple in the for loop
for count, item in enumerate(l):
    print str(count) + " : " + item
    
for i in range(2):
    print i

### Lambda Functions
Creating anonymous functions on the fly<br>
Generally used in conjunction with map(), filter(), and reduce();<br>
because these functions require a function as an argument:
- **map(function, list)**: Applies the function to all the items in the list
- **filter(function, list)**: Creates a list of elements for which a function returns true
- **reduce(function, list)**: Applies a rolling computation to sequential pairs of values in a list

In [None]:
arr = np.arange(6)

# Grab even numbers from array
print filter(lambda x: x % 2 == 0, arr)

# Double each element of arr
print map(lambda x: x * 2, arr)

# Returns sum of array
print reduce(lambda x, y: x + y, arr)
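
In Python 3, reduce() is no longer a built-in, and map() and filter() return iterators rather than lists. The following sketch (an illustrative addition, using a plain list instead of a numpy array) is written so it behaves the same in Python 2 and Python 3.

```python
from functools import reduce  # Available in Python 2.6+ and Python 3

values = [0, 1, 2, 3, 4, 5]

# Wrapping in list() gives a list in both Python versions
evens = list(filter(lambda x: x % 2 == 0, values))
doubled = list(map(lambda x: x * 2, values))

# Rolling sum over the list
total = reduce(lambda x, y: x + y, values)

print(evens)    # [0, 2, 4]
print(doubled)  # [0, 2, 4, 6, 8, 10]
print(total)    # 15
```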

In [None]:
# Applying multiple functions to the same value

def multiply(x):
    return (x * x)  # x multiplied by itself

def add(x):
    return (x + x)  # x added to itself

funcs = [multiply, add]
for i in range(4):
    value = list(map(lambda x: x(i), funcs))
    print(value)

In [None]:
# Making anonymous functions on the fly

def make_incrementor(n):
    return lambda x: x + n

f = make_incrementor(1)
g = make_incrementor(2)

print f(1), g(2)

### Generators
Generators are iterators that you can only iterate over once.<br>
This is because they do not store all their values in memory; they generate each value on the fly.<br>
Generators are usually written as functions that yield values one at a time instead of returning a single value.<br>
**Watch this: [Loop Like a Native](https://nedbatchelder.com/text/iter.html)**

In [None]:
# Generators

def is_prime(number):
    for divisor in range(2, int(number ** 0.5) + 1):
        if number % divisor == 0:
            return False
    return True

def get_primes(max_number):
    number = 1
    while number < max_number:
        number += 1
        if is_prime(number):
            yield number

x = get_primes(8)
print type(x)

for item in x:
    print item
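
The single-pass behavior is easy to demonstrate. In this illustrative sketch, countdown is a hypothetical generator function written for this example; the last part shows a generator expression, which uses comprehension syntax with parentheses.

```python
def countdown(start):
    """Yield start, start - 1, ..., 1."""
    while start > 0:
        yield start
        start -= 1

gen = countdown(3)
first_pass = list(gen)   # Consumes every value
second_pass = list(gen)  # The generator is now exhausted

# Generator expression: values are produced lazily as sum() asks for them
total = sum(x * x for x in range(4))

print(first_pass)   # [3, 2, 1]
print(second_pass)  # []
print(total)        # 14
```

To iterate again, call the generator function again to get a fresh generator.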

In [None]:
# The list object supports the iterator protocol
l = [1, 2, 3]
i = iter(l)
print i.next()
print i.next()

## Exception Handling
***
Exception handling is probably overlooked more often than it should be; always consider using it.

### Basics
- **raise**: Raise (throw) an exception
- **try**: Do something that might throw an error
- **except**: Handle the exception here
- **else**: Do this if no errors thrown
- **finally**: Always do this, even if an exception is thrown

In [None]:
# Exception Handling

try:
    # Do something that might raise exceptions
    raise NameError('This raises a NameError')  # Raises an error for the sake of this example
except NameError as NE:
    print "Handle the exception here"
else:
    print "Do this only if no exceptions"
finally:
    print "Always do this, even after exceptions"


# A practical example
def divide(x, y):
    try:
        z = float(x)/y
    except ZeroDivisionError as e:
        z = e  # Can capture the exception in a variable
    finally:
        print z  # Prints the quotient, or the captured exception

divide(1, 2)  # 1/2
divide(1, 0)  # 1/0; the error is caught, so nothing propagates to the caller
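
To see exactly which clauses run in each case, the sketch below records the path taken through try/except/else/finally. The safe_divide function and its log list are hypothetical names introduced only for this illustration.

```python
def safe_divide(x, y):
    """Return (result, log), where log lists the clauses that ran."""
    log = []
    try:
        result = float(x) / y
    except ZeroDivisionError:
        result = None          # Signal failure without raising
        log.append("except")
    else:
        log.append("else")     # Runs only when no exception occurred
    finally:
        log.append("finally")  # Runs in every case
    return result, log

print(safe_divide(1, 2))  # (0.5, ['else', 'finally'])
print(safe_divide(1, 0))  # (None, ['except', 'finally'])
```

Catching only the specific exception you expect (here ZeroDivisionError) lets unrelated errors surface instead of being silently hidden.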

## Add Notebook Enhancements Here

## Revision History

This file is in a mercurial repository and the last committed revision is listed below.<br>
Cell only works if file is in a non-networked location

In [None]:
output = !hg summary
output = str(output)

# Get the version from mercurial and display the last saved revision as "Rev" + number
start = output.find("parent:") + len("parent:")
output = output[start:]
print "Rev" + str(output[:output.find(':')])