# Python basics

In this brief tutorial, we'll explore some Python basics - math, strings, lists, dictionaries, tuples, sets, branching, loops, comments, logical values, functions and modules. We'll also introduce the concept of Pythonic code.  

If you're new to Python, don't try to absorb everything at once. It's an expansive language with a rich syntax, lots of built-in functions, and modules for everything from machine learning to symbolic computing.

### Library Dependencies

This tutorial needs the copy, sys and numpy modules. copy and sys come with the Python Standard Library. Use pip to install numpy: ```pip install numpy```.

## Basic math

Floating point math operations work the way you would expect, following C/C++ conventions and operation precedence. As a convenience, an exponentiation operator (\*\*) is provided

In [None]:
((1.0 + 3.0 - 1.0) * 3.0)**2 / 5

Most integer operations work as expected too

In [None]:
((1 + 3 - 1) * 3)**2 % 7

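Just be careful with integer division. Unlike C/C++ and many other languages, the "/" operator always performs true division and returns a floating point value, even when both operands are integers.  
If you want integer division, use "//". This is floor division, which gives the same result as C/C++ integer division for positive operands.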

In [None]:
5/4

In [None]:
5//4
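
One caveat is worth a quick check: "//" rounds toward negative infinity, whereas C/C++ integer division truncates toward zero, so the two differ for negative operands. The sketch below illustrates the difference; the variable names are ours and used only for illustration.

```python
# Floor division rounds toward negative infinity
floor_result = -5 // 4                # -2

# Truncation toward zero (the C/C++ behaviour) can be reproduced with int()
trunc_result = int(-5 / 4)            # -1

# divmod returns the floor quotient together with the matching remainder
quotient, remainder = divmod(-5, 4)   # (-2, 3)

print(floor_result, trunc_result, quotient, remainder)
```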

## Booleans and logical operations

Python's Boolean values are True and False (always capitalized). Many other things are treated as false - None, empty strings, zero (floating point, complex and integer), empty sets and lists. By default, everything else is treated as true - non-empty strings, numeric values that do not equate to zero, non-empty sets and lists.

In my opinion, Python feels like it was designed for humans. Contrast with C and Perl, which feel like they were designed for machines or robots. Syntax tends to be more straightforward and more readable. For example, Python uses "and, or, not" rather than "&&, ||, !".

In [None]:
True and False

In [None]:
True or False

In [None]:
not True

In [None]:
3 > 5

In [None]:
3 < 5

In [None]:
3 > 2 and 5 < 7
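
The truthiness rules described at the start of this section can be checked directly with the built-in bool() function. The following is a small sketch; the list names are ours and chosen only for illustration.

```python
# bool() shows how Python treats a value in a logical context
falsy_values = ['', 0, 0.0, 0j, [], set(), None]
truthy_values = ['abc', 42, -1.5, [0], {'a'}]

print([bool(v) for v in falsy_values])    # every entry is False
print([bool(v) for v in truthy_values])   # every entry is True
```

Note that [0] counts as true: the list is non-empty, even though its only element is zero.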

## Comments

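Single line comments start with "#" and extend to the end of the line. Python has no dedicated multi-line comment syntax, but a string enclosed in three single quotes (''') or three double quotes (""") that is not assigned to anything is commonly used for the same purpose.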

In [None]:
'''
Line 1 of a multi-line comment
Line 2 of a multi-line comment
Line 3 of a multi-line comment
'''
# Comment on its own line

1 + 2*3 # Comment at the end of a line

## Strings

Python strings act a lot like lists, which we'll discuss later. Characters within the string are accessed by index starting with 0, like C/C++. Ranges of characters can be accessed using slices [start:end], but note that the character at the end index is not included.

In [None]:
mystr = 'abcdef'
print(mystr[2])
print(mystr[0:3])

If you were paying attention, you'll have noticed that we snuck in a new Python feature, the "print" function. In a notebook, if an expression is evaluated and not assigned to a variable, its result is displayed as the output of the cell. If an IPython notebook cell contains more than one expression, only the last one is displayed. The following cell illustrates this.  

If you have been programming in Python 2, note that print has changed. In Python 2, print was a statement and was called without parentheses. In Python 3, it's a function and the parentheses are required.

In [None]:
mystr[2]
mystr[0:3]

The length of a string can be obtained with the len() function. This can be used to access the last element of the string [len-1].

In [None]:
print("String length: ", len(mystr))
print("Last element: ", mystr[len(mystr)-1])

The **Pythonic** way to access the last element is with index '-1'. Pythonic means code that doesn't just get the syntax right but that follows the conventions of the Python community and uses the language in the way it is intended to be used. Non-Pythonic code is not necessarily wrong, but tends to be uglier, awkward looking, harder to read and possibly less efficient.

In [None]:
mystr[-1]
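
Negative indices also work inside slices, and either end of a slice can be omitted. The sketch below re-creates the string from earlier so that it runs on its own; the expected results are noted in the comments.

```python
mystr = 'abcdef'

last_char = mystr[-1]      # 'f'
last_three = mystr[-3:]    # 'def'  (from third-to-last to the end)
all_but_two = mystr[:-2]   # 'abcd' (everything except the last two)
every_other = mystr[::2]   # 'ace'  (a third slice value sets the stride)

print(last_char, last_three, all_but_two, every_other)
```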

One important way that strings differ from lists is that strings are immutable. We can assign a new value to the string, but we can't update individual elements.

In [None]:
mystr[3] = 'X' # This will produce an error

Python has a lot of built-in string methods. A few are illustrated below; see the documentation for a more complete list: https://docs.python.org/3/library/stdtypes.html#text-sequence-type-str

In [None]:
mystr = "abCdefgHij"
print("Original string:", mystr)
print("Convert to upper case:", mystr.upper())
print("Convert to lower case:", mystr.lower())
print("Capitalize the string:", mystr.capitalize())
print("Is the string all alphabetical characters:", mystr.isalpha())
print("Is the string all lower case:", mystr.islower())
print("Is the string all digits:", mystr.isdigit())

Python has a method for finding the first occurrence of a substring within a string, but if you just want to know whether a string contains a substring (and don't care about its location), the Pythonic way is to use the "in" operator

In [None]:
print(mystr.find('def')) # Returns index (-1 if not found)
print('def' in mystr)    # Returns True/False
print(mystr.find('xyz')) # Returns index (-1 if not found)
print('xyz' in mystr)    # Returns True/False

Strings can be concatenated using the "+" operator

In [None]:
str1 = 'abc'
str2 = 'def'
str3 = 'ghi'
str1 + str2 + str3
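
When the pieces are already in a list, the str.join() method is a common alternative to chaining "+". A minimal sketch, with variable names chosen only for illustration:

```python
parts = ['abc', 'def', 'ghi']

# The string that join() is called on is placed between the pieces
joined = ''.join(parts)    # 'abcdefghi'
dashed = '-'.join(parts)   # 'abc-def-ghi'

print(joined)
print(dashed)
```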

## Lists

Lists are arbitrary collections of elements. The elements can be strings, integers, Booleans, floating point numbers and even other lists, sets, tuples or dictionaries (we'll talk about sets etc. later).  

Unlike strings, lists are mutable, which means that we can change the values of elements. The same element can appear multiple times in a list and elements have defined positions. These properties distinguish lists from Python's other built-in data structures. Lists are enclosed by square brackets.

In [None]:
# list containing integers
ilist = [2, 3, 5, 7, 11]
print(ilist)

# list containing strings
slist = ['date', 'apple', 'banana', 'cantaloupe', 'date', 'apple']
print(slist)

# list containing lists
clist = [[1,2,3], [4,5], [6,7,8,9]]
print(clist)

# An empty list
elist = []
print(elist)

# A neat way to initialize a list with multiple copies of an element
rlist = ['x']*5 + ['y']*5
print(rlist)

List elements are accessed by index and the length of a list is obtained using the len() function, just like for strings. We demonstrate this below along with the mutability of lists.

In [None]:
print(slist)
print(slist[0])
print(slist[-1])
print(slist[2:4])
slist[0] = 'avocado'
print("Number of elements in list is", len(slist))

Python contains built-in functions and methods for interrogating and modifying lists. A few examples are shown below, but see https://docs.python.org/3/tutorial/datastructures.html#more-on-lists for additional examples

In [None]:
# Reinitialize the list so that we can run this cell multiple times without growing list
slist = ['date', 'apple', 'banana', 'cantaloupe', 'date', 'apple']

print("Appending to end of a list")
print("Original list", slist)
slist.append('eggplant')
print("Modified list", slist)
print()

print("Deleting a list element")
print("Original list", slist)
del slist[2] # Using the del statement
print("Modified list", slist)
print()

print("Reversing a list")
print("Original list", slist)
slist.reverse() # Note that reverse is done in place
print("Modified list", slist)
print()

print("Sorting a list")
print("Original list", slist)
slist.sort() # Note that sort is done in place
print("Modified list", slist)
print()

print("Counting the number of times an item appears in a list")
print("apple appears", slist.count('apple'), "times")
print("banana appears", slist.count('banana'), "times")
print()
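
Because sort() and reverse() modify the list in place, it is sometimes handy to use the built-in sorted() and reversed() functions instead, which leave the original list alone. The following is a small sketch using a shortened, hypothetical fruit list:

```python
fruits = ['date', 'apple', 'banana', 'cantaloupe']

# sorted() returns a new list and leaves the original untouched
ordered = sorted(fruits)
print(ordered)
print(fruits)   # original order is unchanged

# reversed() returns an iterator, so wrap it in list() to see the result
backwards = list(reversed(fruits))
print(backwards)
```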

## Dictionaries

Dictionaries are collections of key-value pairs. Since Python 3.7 they preserve the order in which keys were inserted, but values are looked up by key rather than by position. The equivalent structure in other languages is sometimes called a hash or an associative array. In my opinion, choosing the name "dictionary" over the alternatives is yet another example of how Python was designed to be more intuitive and human friendly.  

Like lists, dictionaries are mutable. While the values can be arbitrary, the key must be a simple "hashable" data type (e.g. lists cannot serve as keys). A key can only appear once and assigning a new value to a key will overwrite the original value. Values can appear multiple times. Dictionaries are enclosed by curly braces and values of the dictionary are accessed using the key name in square brackets.

In [None]:
#Initialize the dictionary with a few key-value pairs
fruit_colors = {'apple':'red', 'banana':'yellow'}

# Add a few more elements
fruit_colors['lime'] = 'green'
fruit_colors['lemon'] = 'yellow'

# And print out the current contents of the dictionary
print("Original dictionary")
print(fruit_colors)
print()

# Now delete an entry and print the new contents
del fruit_colors['lime']
print("Dictionary after deleting entry")
print(fruit_colors)
print()

print("List the keys and values")
print("keys:", fruit_colors.keys())
print("values:", fruit_colors.values())
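
Looking up a key that is not in the dictionary with square brackets raises a KeyError. Two common ways to avoid this are the "in" operator and the get() method, sketched below with a small dictionary of our own:

```python
colors = {'apple': 'red', 'banana': 'yellow'}

# Membership tests on a dictionary check its keys
has_apple = 'apple' in colors   # True
has_lime = 'lime' in colors     # False

# get() returns a default value instead of raising KeyError
lime_color = colors.get('lime', 'unknown')

# Assigning to an existing key overwrites the old value
colors['apple'] = 'green'

print(has_apple, has_lime, lime_color, colors['apple'])
```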

## Sets

A set is an unordered collection of items without repeats. Python provides all the set operations you would expect

In [None]:
set1 = {'apple', 'banana', 'orange', 'blueberry'}
set2 = {'strawberry', 'persimmon', 'banana', 'clementine', 'grape'}

In [None]:
# Union - could also use the | operator
set1.union(set2)

In [None]:
# Intersection - could also use the & operator
set1.intersection(set2)

In [None]:
# Set difference - items in set1, but not set2
set1 - set2 

An empty set must be created using the set() function, since {} creates an empty dictionary.

In [None]:
set3 = set()
set3.add('apple')
set3.add('orange')
set3

Members can be added with the add() method and removed with discard()

In [None]:
set3.add('kiwi')
set3.add('apple') # Adding an existing member does not change the set
print(set3)

set3.discard('orange')
print(set3)

Using the properties of sets, we can easily find the unique elements of a list. In the example below, we find the unique elements with the set function and then convert the result back to a list. Note that the order of the elements in the result is not guaranteed.

In [None]:
mylist = ['a', 'a', 'b', 'c', 'a', 'b', 'b', 'a']
list(set(mylist))
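
Since sets are unordered, the result of list(set(mylist)) may not follow the order of the original list. If the order of first appearance matters, one common idiom relies on dictionaries, which keep insertion order in Python 3.7 and later. A minimal sketch:

```python
mylist = ['a', 'a', 'b', 'c', 'a', 'b', 'b', 'a']

# dict.fromkeys keeps one key per distinct element, in order of first appearance
unique_in_order = list(dict.fromkeys(mylist))
print(unique_in_order)

# Alternatively, sort the unique elements to get a predictable order
unique_sorted = sorted(set(mylist))
print(unique_sorted)
```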

## Tuples (and a first look at iterators)

A tuple is a Python data structure that is similar to a list except that tuples are immutable. A tuple is enclosed by parentheses and tuple elements are accessed by index in the same way as list elements.

In [None]:
t = ('a', 'b', 'c')
print(t)
for x in t:
    print(x)
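
To see the difference from lists, the sketch below indexes and slices a tuple in the usual way and then tries to assign to one of its elements. We catch the resulting TypeError so that the cell runs to completion.

```python
t = ('a', 'b', 'c')

# Indexing and slicing work just like they do for lists
print(t[0], t[-1], t[0:2])

# Assigning to an element fails because tuples are immutable
try:
    t[0] = 'X'
except TypeError as err:
    error_message = str(err)
    print("Cannot modify a tuple:", error_message)
```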

Sometimes you need a tuple that contains a single element. For example, a function may require a tuple as an argument, regardless of the number of elements in the tuple. In this case, follow the element of the tuple with a comma. Otherwise, the parentheses are interpreted in the mathematical sense, as simple grouping. We introduce the type() function below and will explain it shortly.

In [None]:
y = (3)
type(y)

In [None]:
y = (3,)
type(y)

The zip() built-in function takes one or more lists and generates an iterator of tuples built from the corresponding elements of the lists. Convert the result to a list if you need to use it in a list context.

In [None]:
list1 = ['a', 'b', 'c', 'd']
list2 = ['A', 'B', 'C', 'D']
list3 = [1, 2, 3, 4]
t = zip(list1, list2, list3)
print(list(t))

Normally you would use the zip() function with lists of the same lengths. If lists of different lengths are used, the number of tuples will be determined by the length of the shortest list.

In [None]:
list1 = ['a', 'b', 'c', 'd', 'e', 'f']
list2 = ['A', 'B', 'C', 'D', 'E']
list3 = [1, 2, 3, 4, 5, 6, 7, 8, 9]
t = zip(list1, list2, list3)
print(list(t))

### Iterators

An iterator returns the next element in the sequence and is exhausted after the last element has been returned. You can think of an iterator as providing a stream of data. Iterators can improve performance and reduce memory usage since elements in the sequence are only generated as needed.

Iterators are a key concept in functional programming. We won't get into this topic, but if you're interested you can read more about it here https://docs.python.org/3/howto/functional.html

In [None]:
t = zip(list1, list2, list3)
print(list(t))
print(list(t))

Although you'll rarely need to do this, the next element in an iterator can be obtained using the next() function or the \_\_next\_\_() method.

In [None]:
t = zip(list1, list2, list3)
print(next(t))
print(next(t))
print(t.__next__())
print(t.__next__())
print(list(t))
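
Calling next() on an exhausted iterator raises StopIteration unless a default value is supplied as a second argument. The sketch below uses the built-in iter() function to get an iterator from a short list; the names are ours and used only for illustration.

```python
letters = iter(['a', 'b'])

first = next(letters)
second = next(letters)

# The iterator is now exhausted; a default value avoids the exception
third = next(letters, None)
print(first, second, third)

# Without a default, next() raises StopIteration
try:
    next(letters)
    exhausted = False
except StopIteration:
    exhausted = True
print("Exhausted:", exhausted)
```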

A zipped object can be unzipped using the zip function with the argument preceded by an asterisk

In [None]:
t = zip(list1, list2, list3)
x, y, z = zip(*t)
print(x)
print(y)
print(z)

## Getting information on objects 

We've seen a number of different Python objects: integers, strings, lists, dictionaries, tuples, sets, iterators. It's easy to lose track of the objects that we've declared or the return types of functions. Fortunately, Python provides a few useful functions.

If you just need information on a single object, use the type() function. If you want information on all objects, you can use dir(), globals() or locals(), but when working within a Jupyter notebook, we have access to the IPython "magic" commands %who and %whos that are much more user friendly. A full listing of IPython's magic commands can be found here http://ipython.readthedocs.io/en/stable/interactive/magics.html

In [None]:
x=1
type(x)

In [None]:
x = 1.2
type(x)

In [None]:
type(list1)

In [None]:
t = zip(list1, list2, list3)
type(t)

In [None]:
%who

In [None]:
%whos

## Assignments, copies, deep copies and shallow copies

One of the biggest stumbling blocks in Python is dealing with assignment and copies. Let's start with a simple example

In [None]:
a = ['apple', 'banana', 'orange']
print(a)
b = a
b[0] = 'lemon'
print(a)

The assignment "b = a" does not create a new list. Both names refer to the same list object, so b is simply an alias or another name for a. To get a copy of the list, we can use the copy() function from the copy module.

In [None]:
import copy
a = ['apple', 'banana', 'orange']
print(a)
b = copy.copy(a)
b[0] = 'lemon'
print(a)

For compound objects, the behavior is more complex and we need to use a deep copy if we want to recursively copy the entire object.

In [None]:
a = ['apple', 'banana', 'orange', ['lime', 'peach']]
print(a)
b = copy.copy(a)
b[0] = 'cherry'
b[3][0] = 'mango'
print(a)

In [None]:
a = ['apple', 'banana', 'orange', ['lime', 'peach']]
print(a)
b = copy.deepcopy(a)
b[0] = 'cherry'
b[3][0] = 'mango'
print(a)
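
The "is" operator, which tests whether two names refer to the same object, makes the difference between an alias, a shallow copy and a deep copy easy to see. A minimal sketch with a shortened list:

```python
import copy

a = ['apple', ['lime', 'peach']]
alias = a
shallow = copy.copy(a)
deep = copy.deepcopy(a)

print(alias is a)           # True: same object under another name
print(shallow is a)         # False: the outer list is new
print(shallow[1] is a[1])   # True: the inner list is still shared
print(deep[1] is a[1])      # False: the inner list was copied as well
```

This is why modifying b[3][0] after a shallow copy also changed a in the earlier example, while the deep copy left a untouched.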

## Branching and conditionals

Branching is done using if-elif-else constructs. A few examples are shown below.  

The indentation is required and defines the boundaries of code blocks. I see this as a decision on the part of the Python developers to make the language more readable. The use of indentation in this way is sometimes called the "off-side rule", a term that plays on the offside rule in association football (soccer).

In [None]:
x, y = 1, 2

# Note the colon (:) after if/elif/else and indentation of code block

print("If construct")
if x < y:
    print ("x is less than y\n")
    
print("If-else construct")
if x < y:
    print ("x is less than y\n")
else:
    print ("x is not less than y\n")
    
print("If-elif-else construct")
x, y = 2,2
if x < y:
    print ("x is less than y\n")
elif x > y:
    print ("x is greater than y\n")
else:
    print ("x is equal to y\n")



If you were still paying attention, you might have noticed that I slipped in one more Python feature to simultaneously assign values to x and y. We could have done the following, but it's not very Pythonic ;)

```python
# Something a C programmer would do
x = 1; y = 2
```

And in the spirit of being Pythonic, note that we didn't put parentheses around the logical tests. For example, we wrote "x < y" instead of "(x < y)"

### Nested conditionals

As you would expect, conditionals can be nested to an arbitrary level of depth. Just be sure to do your indentation correctly or you may get unexpected results. In the example below, the two tests in each pair differ only in the indentation of the final else clause.

In [None]:
x, y, z = 1, 2, 0
print ("x,y,z = 1,2,0 - Both tests produce same results")

if x < y:
    if x < z:
        print ("x is less than y and z")
    elif x > z:
        print ("x is less than y, and greater than z")
else: # This else is associated with the test "x < y"
    print ("x is not less than y")
        
if x < y:
    if x < z:
        print ("x is less than y and z")
    elif x > z:
        print ("x is less than y, and greater than z")
    else: # This else is associated with the test "x < z"
        print ("x is not less than y")

print()
        
x, y, z = 1, 2, 1
print ("x,y,z = 1,2,1 - Tests produce different results")

if x < y:
    if x < z:
        print ("x is less than y and z")
    elif x > z:
        print ("x is less than y, and greater than z")
else: # This else is associated with the test "x < y"
    print ("x is not less than y")
        
if x < y:
    if x < z:
        print ("x is less than y and z")
    elif x > z:
        print ("x is less than y, and greater than z")
    else: # This else is associated with the test "x < z"
        print ("x is not less than y")

## Loops

The Python for loop is used to iterate over the items of a sequence. Unlike C/C++, you don't always need to loop over a numerical index.

In [None]:
names = ['Adam', 'Billie', 'Carlos', 'David', 'Ernesto', 'Francis']
for n in names:
    print(n, ': length =', len(n))

In fact, Python makes it very easy to avoid looping over a numerical index. In the example below, we use the enumerate function to return the list item and its position index

In [None]:
for i, n in enumerate(names):
    print(i, n)

We can use the zip function to print the corresponding elements of multiple lists, thereby avoiding the use of a numerical index

In [None]:
lowers = ['a', 'b', 'c']
uppers = ['A', 'B', 'C']
for t in zip(lowers, uppers):
    print(t[0], t[1])
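
The tuples produced by zip can also be unpacked directly in the for statement, which avoids indexing into t altogether. A small sketch; the names lo, up and pairs are ours:

```python
lowers = ['a', 'b', 'c']
uppers = ['A', 'B', 'C']

pairs = []
for lo, up in zip(lowers, uppers):
    # Each tuple from zip is unpacked into lo and up
    print(lo, up)
    pairs.append(lo + up)
```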

No need to split a string before iterating over characters

In [None]:
newstr = 'abcdefg'
for c in newstr:
    print(c)

If you do need to iterate over a numerical index, use the range() function. Note that range() returns an iterable object instead of generating the complete list of integers. The next item is generated as needed. This saves memory and is more efficient. 

In [None]:
for i in range(6):
    print(i)

But avoid doing the following; it is not very Pythonic

In [None]:
# How you might do this coming from a C/C++ programming background (not Pythonic)
names = ['Adam', 'Billie', 'Carlos', 'David', 'Ernesto', 'Francis']
for i in range(len(names)):
    print(names[i], ': length =', len(names[i]))

When iterating over the key-value pairs in a dictionary, use the items() method

In [None]:
fruit_colors = {'apple':'red', 'banana':'yellow', 'lime':'green', 'strawberry':'red'}
for k,v in fruit_colors.items():
    print(k,v)

The break and continue keywords work just like in C/C++. Break exits the loop and continue steps to the next iteration

In [None]:
numbers = [2, 7, 8, 6, 15, 3, 6]

In [None]:
for n in numbers:
    if n%5 == 0:
        print("list element divisible by 5, break out of loop")
        break
    print(n)

In [None]:
for n in numbers:
    if n%2 == 0:
        continue
    print(n)

As you would expect, loops can be nested to an arbitrary depth. Remember that code blocks are defined by the level of indentation.

In [None]:
for i in range(4):
    for j in range(3):
        print('(i,j) = ({},{})'.format(i,j))

### List comprehension

Python's list comprehension functionality makes it very easy to create and populate a new list. We'll show an example first using the loop syntax and then demonstrate how much easier it is to do with list comprehension.

In [None]:
# Populate a list of squares using a loop
squares = []
for x in range(10):
    squares.append(x*x)    
squares

In [None]:
# Populate a list of squares using list comprehension (very Pythonic)
squares = [x*x for x in range(10)]
squares

The general form of list comprehension is **[transformation iterator filter]**. In the example below, we use a more complex function and limit the list to values that are divisible by 3

In [None]:
squares = [x*x + x - 2 for x in range(10) if (x*x + x - 2)%3 == 0]
squares
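
To make the three parts of a comprehension concrete, the sketch below writes the same computation as an explicit loop and checks that both versions agree. The names filtered, value and comprehension are ours and used only for illustration.

```python
# Explicit loop: iterate, transform, then filter
filtered = []
for x in range(10):
    value = x*x + x - 2
    if value % 3 == 0:
        filtered.append(value)

# Equivalent list comprehension
comprehension = [x*x + x - 2 for x in range(10) if (x*x + x - 2) % 3 == 0]

print(filtered)
print(filtered == comprehension)
```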

### One more thing about the range function

Like many Python functions, range can be called with different numbers of arguments.

+ range(n) --> integers 0 through n-1
+ range(m,n) --> integers m through n-1
+ range(m,n,p) --> integers m through n-1 with stride p

In [None]:
list(range(2,20,3))

## Function definitions

Python functions are defined using the def keyword. 

From the Python reference manual: "The function definition does not execute the function body; this gets executed only when the function is called". One of the implications of this is that methods from modules that have not yet been loaded can be used at the time the function is declared (more on that soon).

In [None]:
def square(x):
    return x*x

In [None]:
square(3.0)

Functions can take any number of arguments and optional arguments can be given default values.

In [None]:
def func1(x, y, z=1.0):
    return x*y + z

In [None]:
func1(2,3)

In [None]:
func1(2,3,4)

Functions can return multiple objects packed into a tuple. The example below returns a tuple containing a scalar (integer or float depending on argument types) and a three element list

In [None]:
def func2(x, y, z=1.0):
    return x*y + z, [x*y, x*z, y*z]

In [None]:
func2(2, 3, 4)
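
A returned tuple can be unpacked into separate variables in a single assignment, and optional arguments can be passed by name. The sketch below repeats the definition of func2 so that it runs on its own; the names total and products are ours.

```python
def func2(x, y, z=1.0):
    return x*y + z, [x*y, x*z, y*z]

# Unpack the two returned objects and pass z as a keyword argument
total, products = func2(2, 3, z=4)
print(total)
print(products)

# With z omitted, the default value of 1.0 is used
default_total, default_products = func2(2, 3)
print(default_total)
```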

We haven't gotten to modules yet, but we mentioned earlier that we can create a function definition that uses methods from modules that have not yet been loaded.

In [None]:
def uses_numpy(x):
    return np.sin(x) + np.cos(x) * np.sqrt(x)
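
Defining such a function is fine, but calling it before the module has been imported fails with a NameError. The sketch below shows this using the standard library math module as a stand-in, so that it runs without numpy; the function name hypotenuse is hypothetical.

```python
def hypotenuse(a, b):
    # The name "math" is looked up only when the function is called
    return math.sqrt(a*a + b*b)

# Calling before the import raises NameError (assuming math is not yet imported)
try:
    hypotenuse(3, 4)
except NameError as err:
    print("Calling before the import fails:", err)

# After the import, the same function works
import math
result = hypotenuse(3, 4)
print(result)
```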

## Map

The map built-in can be used to apply a function to all elements of an iterable. Note that map returns an iterator and will need to be converted to a list to use in a list context.

In [None]:
x = [0.0, -0.25, 0.5, -0.75, 1.0]

In [None]:
y = abs(x) # This does not do what you would hope and produces an error

In [None]:
y = map(abs, x) # Instead use map() which applies function (abs) to list (x)
print(list(y))

### Lambda functions

Python supports lambda functions. We won't get into the theory, but think of a lambda as a way to declare a nameless function that can be used in certain contexts to simplify your code

In [None]:
z = [2,3,5,7,11]
y = list(map(lambda a: a**2 + a**3, z))
y

Lambda functions can take multiple arguments

In [None]:
z = [2,3,5,7,11]
w = [1,2,4,6,10]
y = list(map(lambda a, b: a**2 + a**3 + b, z, w))
y
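
Many uses of map with a lambda can also be written as a list comprehension, which some readers find easier to follow. The sketch below computes the same values both ways, using zip to pair up the two lists; the names mapped and comprehended are ours.

```python
z = [2, 3, 5, 7, 11]
w = [1, 2, 4, 6, 10]

# map with a two-argument lambda
mapped = list(map(lambda a, b: a**2 + a**3 + b, z, w))

# Equivalent list comprehension over the zipped pairs
comprehended = [a**2 + a**3 + b for a, b in zip(z, w)]

print(mapped)
print(mapped == comprehended)
```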

## Modules

Python modules provide a mechanism for packaging function definitions and statements. You can develop your own modules or import widely used modules such as numpy (numerical computing), pandas (data analysis) and matplotlib (plotting).

The contents of a module are imported using the import statement

In [None]:
import numpy
numpy.sqrt(10.5)

Module names can be long and it's often convenient to abbreviate the name using the "as" clause. Although you can use any abbreviation you like, so long as it doesn't conflict with an existing name, many of the modules have standard abbreviations and we recommend using them.

In [None]:
import numpy as np
np.sqrt(10.5)

Earlier, we had defined a function, uses_numpy, that required numpy's sin, cos and sqrt functions. Now that we've imported numpy, we can execute the function.

In [None]:
uses_numpy(1.2)

You can import specific methods from modules using the following syntax. Note that the sqrt function no longer needs to be prefixed by "numpy" or "np".

In [None]:
from numpy import sqrt
sqrt(10.5)

It's possible to import all names from a module using "\*". For example,    

    from numpy import *  

This is not generally recommended. Since you don't generally know the full content of the module, you may introduce conflicts with existing function or variable names.

### Get version of a module and version of Python

+ To get the version of a module, use modulename.\_\_version\_\_ (most, though not all, modules define it)
+ To get the Python version, use sys.version

In [None]:
numpy.__version__

In [None]:
import sys
sys.version
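
sys.version is a human-readable string. If you need to compare versions in code, sys.version_info provides the same information as a tuple of numbers. A minimal sketch:

```python
import sys

# The first two fields are the major and minor version numbers
major, minor = sys.version_info[:2]
print(major, minor)

# Tuples compare element by element, so version checks are straightforward
is_python3 = sys.version_info >= (3, 0)
print(is_python3)
```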

## Phew, that was a lot to absorb! Fortunately, there's help

Unlike C, which is an extremely terse language, Python is very expansive. There are a lot of built-in functions, many of which can be called with variable numbers of arguments. Add in the commonly used modules, such as numpy, and we end up with a lot more information than most of us can remember.

Fortunately, Python provides excellent help capabilities. Pass the function/method name to the built-in help() function or, in IPython and Jupyter, follow the name with a question mark. The former displays the help information as the output of the cell, while the latter displays it in a new sub-window.

In [None]:
help(abs)

In [None]:
np.sqrt?