In [1]:
thing = "blah"

# Python Basics
## A quick primer on the language.

* Introduction to Python
* Control Structures
* Basic Data Types
* Advanced Data Types
* Dictionaries
* Functions
* Exception Handling
* Classes
* Files
* Larger Programs
* Pattern Matching

###  Introduction to Python
* General-purpose, high-level language.
* Emphasis placed on
    * Human Readability.
    * Self describing code.

The Zen of Python, by Tim Peters (PEP 20). Try `import this` in Python:
```
Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!
```

During this lesson, we'll be using this web page to execute our python code. 

In [20]:
print('Hello World!', thing)

Hello World! blah


If you have an IDE already set up and would prefer to use that, then just copy and paste the code segments into it. 

### Python Indentations (white space)
In most programming languages, indentation is for readability only. In Python, indentation is part of the language.

Python uses indentation to indicate a block of code.

In [21]:
if 10 > 8:
    print("Ten is greater than eight")

Ten is greater than eight


In [22]:
if 10 > 8:
print("Ten is greater than eight")

IndentationError: expected an indented block (<ipython-input-22-9a24f7160616>, line 2)

### Brackets

In a lot of languages, brackets are used to denote blocks of code. In Python this is not the case; they are used for:

* `[]`: Mutable data types - lists, list comprehensions and for indexing/lookup/slicing.
* `()`: Define tuples, order of operations, generator expressions, function calls and other syntax.
* `{}`: The two hash table types - dictionaries and sets.

If you don't understand what these are, don't worry. We'll get to them all eventually.

### Comments
Sometimes you want to write notes to yourself and others in your code. This is where comments come in.

Anything from a `#` to the end of the line is ignored by Python, but remains in your source code for others to read.

In [23]:
# This is a comment.
print("Hello, World!")

Hello, World!


### Docstrings
There is an extended documentation capability, called docstrings.

Docstrings can be one line or multiline. A docstring should describe what the function does, not how.

Strictly speaking, all functions should have a docstring. IDEs and other tools use them to provide help text.

They are created by using triple quotes:

In [24]:
"""This is a 
multiline docstring."""
print("Hello, World!")

Hello, World!
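
Docstrings are most useful when attached to a function. A minimal sketch, using a hypothetical `greet` function that is not part of the lesson: a string placed as the first statement of the function body becomes its docstring, which Python stores in the `__doc__` attribute and which `help()` displays.

```python
def greet(name):
    """Return a greeting for the given name."""
    return "Hello, {0}!".format(name)


# The docstring is stored on the function object itself.
print(greet.__doc__)
print(greet("World"))
```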


### Creating Variables
In some languages you need a command to declare a variable. In Python, a variable is created the moment you first assign a value to it.

In [25]:
age = 55
name = "John"
print(age)
print(name)

55
John


You don't need to tell Python what data type a variable is. In fact, you can change its type after you've created it.

In [30]:
myVar = 4 # myVar has a data type of int
myVar = "Two fried eggs and a bottle of Rum" # myVar now has a data type of str
print(myVar)

Two fried eggs and a bottle of Rum
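
If you want to check what type a name currently refers to, the built-in `type()` function will tell you. A small sketch (the variable name `value` is just for illustration):

```python
value = 4
print(type(value))    # <class 'int'>

value = "Two fried eggs"
print(type(value))    # <class 'str'>
```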


### TODO
* Syntax
* Colon
* Brackets
* White space
* A Beginner / int / med to invite
* First 60 = talk, last 30 = example (homework)

###  Basic Data Types
* Everything is an object, which means most Python data types have methods (functions) that can be called.
* The interpreter can be left to work out what type a variable is.
* Explicit conversions are available (`int()`, `str()`, etc.), but use them only where a different type is genuinely needed.

In [37]:
i = 4
f = 0.33
s = "This is a string "

print(s)
print(f * i)
print(s * i)

This is a string 
1.32
This is a string This is a string This is a string This is a string 
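
One case where an explicit conversion is genuinely needed is turning text into a number, for example after reading it from a file. A short sketch (the names `count_text` and `count` are made up for this example):

```python
count_text = "42"            # text, e.g. read from a file or typed by a user
count = int(count_text)      # str -> int

print(count + 1)             # 43
print(str(count) + "!")      # 42!
print(float("0.5") * 2)      # 1.0
```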


###  Basic Data Types

Since everything is an object, strings have builtin methods for modification.

To call an object's method (function) we use a `.`

In [40]:
# Since everything is an object, strings have methods for manipulation.
s = "wE haVE uR DAugHTeR, bRinG $1,000,000 iN UNmARkEd NotEs"
print(s)
# rstrip() strips trailing characters found in the given set; 's' is not in it, so nothing changes
print(s.rstrip("manipulation"))
print(s.upper())
print(s.lower())
print(s.swapcase())

wE haVE uR DAugHTeR, bRinG $1,000,000 iN UNmARkEd NotEs
wE haVE uR DAugHTeR, bRinG $1,000,000 iN UNmARkEd NotEs
WE HAVE UR DAUGHTER, BRING $1,000,000 IN UNMARKED NOTES
we have ur daughter, bring $1,000,000 in unmarked notes
We HAve Ur daUGhtEr, BrINg $1,000,000 In unMarKeD nOTeS


### String formatting

Python strings can be formatted in lots of ways. We're going to look at the `.format()` method here; if you're using Python 3.6+ you might want to look at f-strings.

You might see tutorials using `%`. Avoid it: it's the old way of doing things and is less readable and flexible than the other two.

You can use integers inside curly brackets as an index for the item.

In [46]:
print("{0} some very important log line number: {1}".format(['this', 'is', 'a', 'list'], 987))

['this', 'is', 'a', 'list'] some very important log line number: 987


In [48]:
err = "Meteor strike"
num = 666
print("{error} less important but equally concerning at line {line_num}".format(error=err, line_num=num))

Meteor strike less important but equally concerning at line 666


This method works especially well with unpacked dictionaries:

In [55]:
# This is an advanced use case
data = {"error": "Meteor strike",
        "line_num": 666}
print("{error} less important but equally concerning at line {line_num}".format(**data))

Meteor strike less important but equally concerning at line 666


They can also be used for padding and rounding

In [63]:
print('{0:<10} is a {1:>10}'.format('This', 'test'))
print('{:.2f}'.format(3.141592653589793))

This       is a       test
3.14
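
For comparison, here is a sketch of the same output produced with f-strings (Python 3.6+). The values are copied from the cells above; the name `pi` is introduced only for this example.

```python
err = "Meteor strike"
num = 666
pi = 3.141592653589793

# Names and expressions go directly inside the curly brackets.
print(f"{err} less important but equally concerning at line {num}")

# The same padding and rounding specifiers work after a colon.
print(f"{'This':<10} is a {'test':>10}")
print(f"{pi:.2f}")
```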


###  Sequence Data Types - *Lists*

* Lists are an ordered collection of objects.
* They behave much like arrays in other languages, with the benefit of being objects themselves!
* Lists are **mutable**. (They can be changed once they're created.)
* Single items are added with `.append()`; another list is concatenated with `.extend()`.

In [41]:
list_1        = [1, 2, 3, 4]
list_2        = ['a', 'b', 'c', 'd']
list_specials = [['a'], (1, 2, 3), False]

print('{0} is of length {1}'.format(list_1, len(list_1)))

list_1.append(5)
print('{0} is of length {1}'.format(list_1, len(list_1)))

list_1.extend(list_2)
print('{0} is of length {1}'.format(list_1, len(list_1)))

print('{0} is of length {1}'.format(list_specials, len(list_specials)))

[1, 2, 3, 4] is of length 4
[1, 2, 3, 4, 5] is of length 5
[1, 2, 3, 4, 5, 'a', 'b', 'c', 'd'] is of length 9
[['a'], (1, 2, 3), False] is of length 3
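
Square brackets are also used for indexing and slicing, as mentioned in the Brackets section. A brief sketch with a made-up list called `letters`:

```python
letters = ['a', 'b', 'c', 'd']

print(letters[0])      # first item: a
print(letters[-1])     # last item: d
print(letters[1:3])    # items at positions 1 and 2: ['b', 'c']

letters[0] = 'z'       # lists are mutable, so an item can be replaced in place
print(letters)         # ['z', 'b', 'c', 'd']
```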


###  Sequence Data Types - *Tuples*
* Similar to lists except that they are **immutable**!
* Cannot be modified once created.
* Useful when returning more than one thing from *functions* (see later).
* They are also typically a little cheaper to create and smaller in memory than lists, which can help if you're concerned about performance.

In [6]:
x1 = ()
x2 = (1,)
x3 = (1, 2, 3)
x4 = (1, "mixed", 2, "tuple")
x5 = ((1, 2), (3, 4))

print('{0} is of length {1}'.format(x1, len(x1)))
print('{0} is of length {1}'.format(x2, len(x2)))
print('{0} is of length {1}'.format(x3, len(x3)))
print('{0} is of length {1}'.format(x4, len(x4)))
print('{0} is of length {1}'.format(x5, len(x5)))

() is of length 0
(1,) is of length 1
(1, 2, 3) is of length 3
(1, 'mixed', 2, 'tuple') is of length 4
((1, 2), (3, 4)) is of length 2
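
Two tuple behaviours are worth seeing in action: unpacking and immutability. A small sketch (the name `point` is just an illustration):

```python
point = (3, 4)

# Unpacking assigns each item to its own name.
x, y = point
print(x, y)

# Tuples are immutable, so item assignment raises a TypeError.
try:
    point[0] = 10
except TypeError as error:
    print("Cannot modify a tuple:", error)
```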


###  Mapping Data Types - *Dictionaries*
* Used to store **key value pairs**.
* Insertion order is preserved from Python 3.7 onwards (and in CPython 3.6 as an implementation detail); earlier versions give no ordering guarantee.
* Values can be accessed/updated by referring to the dictionary's key.


In [33]:
d = {}
d['name'] = 'Matt'
d['job'] = 'something or other'

print(d)
print(d.keys())
print('My name is {0}'.format(d['name']))
print('My job is {0}'.format(d['job']))

{'name': 'Matt', 'job': 'something or other'}
dict_keys(['name', 'job'])
My name is Matt
My job is something or other
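
A few more everyday dictionary operations, sketched with the same data as above: looking up a key that may be missing, testing for a key, looping over pairs, and updating a value.

```python
d = {'name': 'Matt', 'job': 'something or other'}

# d['age'] would raise a KeyError; .get() returns a default instead.
print(d.get('age', 'unknown'))

# Membership tests check the keys.
print('name' in d)

# .items() yields (key, value) pairs.
for key, value in d.items():
    print('{0}: {1}'.format(key, value))

# Assigning to an existing key updates its value.
d['job'] = 'trainer'
print(d['job'])
```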


###  Control Structures
#### IF
Only execute statement(s) if the condition is met.

In [8]:
x = [1, 2, 3]
if len(x) > 0:
    print('IF:{0}'.format(x))

IF:[1, 2, 3]
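
An `if` can be extended with `elif` and `else` branches; only the first branch whose condition is true runs. A minimal sketch (the `size` labels are invented for the example):

```python
x = [1, 2, 3]

if len(x) == 0:
    size = 'empty'
elif len(x) < 3:
    size = 'short'
else:
    size = 'long'

print('The list is {0}'.format(size))
```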


#### FOR
Iterates through items in collection (set, list, tuple etc...)

In [9]:
for item in x:
    print('FOR:{0}'.format(item))

FOR:1
FOR:2
FOR:3
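
Two built-ins often used with `for` are `range()`, which produces a run of integers, and `enumerate()`, which pairs each item with its position. A short sketch:

```python
x = [1, 2, 3]

# range(3) yields 0, 1, 2.
for i in range(3):
    print('RANGE:{0}'.format(i))

# enumerate() yields (index, item) pairs.
for index, item in enumerate(x):
    print('ENUMERATE:{0} -> {1}'.format(index, item))
```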


#### WHILE
Keep executing statement(s) for as long as the condition is met.

In [10]:
y = []
i = 0
while len(y) < len(x):
    i += 1
    y.append(i)
print('WHILE:{0}'.format(y))

WHILE:[1, 2, 3]



###  Functions
* Used to logically separate off common code.
* Can return value(s) to calling function.

In [11]:
def add_up(num_list):
    x = 0
    for i in num_list:
        x += i
    return x

nums = [1, 2, 3, 4, 5, 6]
print('The sum of {0} is {1}'.format(nums, add_up(nums)))

The sum of [1, 2, 3, 4, 5, 6] is 21
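
A function can hand back more than one value by returning a tuple, which the caller can unpack. A sketch using a hypothetical `min_and_max` helper:

```python
def min_and_max(num_list):
    """Return the smallest and largest values of a non-empty list."""
    return min(num_list), max(num_list)    # packed into a tuple


nums = [1, 2, 3, 4, 5, 6]
lowest, highest = min_and_max(nums)        # unpacked by the caller
print('Lowest: {0}, highest: {1}'.format(lowest, highest))
```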


###  Exception Handling
* Gracefully handle problems occurring in execution.
* Exceptions are signals raised by the interpreter at the time of the problem.
* Unless handled, they will cause a program to crash, displaying the exception.

In [12]:
x = 3
print(x / 0)

ZeroDivisionError: division by zero

* Exceptions can be handled by wrapping code in try / except statements.

In [19]:
x = 3
try:
    print(x / 0)
except ZeroDivisionError:
    print("You're trying to divide by zero, this isn't allowed!")

You're trying to divide by zero, this isn't allowed!
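
A `try` statement can also capture the exception object and carry optional `else` and `finally` branches. The sketch below wraps this in a hypothetical `safe_divide` function to show when each branch runs.

```python
def safe_divide(a, b):
    """Return a / b, or None if b is zero."""
    try:
        result = a / b
    except ZeroDivisionError as error:
        # 'error' holds the exception object, including its message.
        print('Could not divide: {0}'.format(error))
        return None
    else:
        # Runs only when no exception was raised.
        return result
    finally:
        # Runs in every case, even when a branch above returns.
        print('Finished dividing {0} by {1}'.format(a, b))


print(safe_divide(6, 3))
print(safe_divide(3, 0))
```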


###  Classes
* Used to group together variables and functions logically (usually into a model of something).
* All the types we have been using are classes!


In [None]:
class TrainingGroup:
    # class members
    date = ''
    participants = []

    # class methods
    def __init__(self, date, participants):
        self.date = date
        self.participants = participants

    def __str__(self):
        return '{0} participants on {1}'.format(len(self.participants), self.date)

    def kick_member(self, idiot):
        new_members = []
        for member in self.participants:
            if idiot != member:
                new_members.append(member)
            else:
                print('kicking {0}'.format(idiot))
        self.participants = new_members


group = TrainingGroup('python', ['Tom', 'Dick', 'Harry'])
print(group)
print(group.participants)
group.kick_member('Dick')
print(group)
print(group.participants)
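
One thing to watch with class members like `participants = []`: a mutable value defined at class level is shared by every instance unless `__init__` assigns a fresh one, as `TrainingGroup` does. The sketch below uses two made-up classes, `SharedList` and `OwnList`, to show the difference.

```python
class SharedList:
    items = []                      # class attribute: one list shared by all instances

    def add(self, item):
        self.items.append(item)     # mutates the shared list


class OwnList:
    def __init__(self):
        self.items = []             # instance attribute: a fresh list per instance

    def add(self, item):
        self.items.append(item)


a, b = SharedList(), SharedList()
a.add('Tom')
print(b.items)    # ['Tom'] because b sees a's change

c, d = OwnList(), OwnList()
c.add('Tom')
print(d.items)    # []
```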

###  Files

In [None]:
# 'with' closes each file automatically, even if an error occurs.
with open('testfile.txt', 'r') as infile:              # opened for reading only
    with open('file.output.txt', 'w') as outfile:      # opened for writing
        for line in infile:
            outfile.write(line)

Files can be opened in different *modes*:

| Mode | Description |
|:-----|:------------|
| `r`  | Open text file for reading. The stream is positioned at the beginning of the file. |
| `r+` | Open for reading and writing. The stream is positioned at the beginning of the file. |
| `w`  | Truncate file to zero length or create text file for writing. The stream is positioned at the beginning of the file. |
| `w+` | Open for reading and writing. The file is created if it does not exist, otherwise it is truncated. The stream is positioned at the beginning of the file. |
| `a`  | Open for writing. The file is created if it does not exist. The stream is positioned at the end of the file. Subsequent writes to the file will always end up at the then-current end of file. |
| `a+` | Open for reading and writing. The file is created if it does not exist. The stream is positioned at the end of the file. Subsequent writes to the file will always end up at the then-current end of file. |
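
To see a few of these modes side by side, here is a sketch that writes, appends to, and then reads back a scratch file. The file name `example.txt` and the temporary directory are assumptions made so the example cleans up after itself.

```python
import os
import tempfile

# Hypothetical scratch file inside a temporary directory.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'example.txt')

    with open(path, 'w') as outfile:      # 'w' creates or truncates the file
        outfile.write('first line\n')

    with open(path, 'a') as outfile:      # 'a' adds to the end of the file
        outfile.write('second line\n')

    with open(path, 'r') as infile:       # 'r' reads from the beginning
        lines = infile.readlines()

print(lines)
```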

###  Larger Programs - *Using Modules*
* Large codebases can be broken up into separate files.
* Functions defined in another file must be **imported** before they can be called.
* Good practice for cutting down on *code duplication*.
* Importing a file causes **all top-level code in the file to be executed**.
    * Be wary of code that is not inside a function.
    * *Guard* such code by wrapping it in `if __name__ == '__main__':`

In [None]:
def sum_numeric_list(nums):
    """
    Accepts a list of numeric values.
    Returns the sum of the elements.
    """
    total = 0    # avoid naming this 'sum', which would shadow the built-in
    for item in nums:
        total += item
    return total


def prune_dict(source, keys_to_remove):
    """
    Accepts a dict to prune and a list of keys to remove.
    Matching keys are dropped and the remainder returned as a new dict.
    """
    pruned_dict = {}
    for key in source:
        if key not in keys_to_remove:
            pruned_dict[key] = source[key]

    return pruned_dict


if __name__ == '__main__':
    test_ints = [1, 2, 3, 4, 5]
    print('The sum of {0} is {1}'.format(test_ints, sum_numeric_list(test_ints)))

    test_dict = {'Tom': 'The First', 'Dick': 'The Second', 'Harry': 'The Third'}
    pruned_dict = prune_dict(test_dict, ['Tom'])
    print('Stripping \'Tom\' from {0} gives us {1}'.format(test_dict, pruned_dict))

```
> python
Python 2.7.9 (default, May  1 2015, 19:04:44)
[GCC 4.2.1 Compatible Apple LLVM 6.1.0 (clang-602.0.49)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import mymodule
>>> dir(mymodule)
['__builtins__', '__doc__', '__file__', '__name__', '__package__', 'prune_dict', 'sum_numeric_list']
```

---

###  Larger Programs - *Using Modules*
After importing the module, you can use its functions without the test code being called:

*import py:*

In [None]:
import mymodule

from mymodule import prune_dict

nums = [6, 7, 8, 9]
print('SUM:{0}'.format(mymodule.sum_numeric_list(nums)))

my_dict = {'a': 1, 'b': 2, 'c': 3}    # not named 'dict', which would shadow the built-in
new_dict = prune_dict(my_dict, ['b'])

print('my_dict:  {0}'.format(my_dict))
print('new_dict: {0}'.format(new_dict))

###  Pattern Matching
* Regex matching is done via the built-in `re` module.
* `re.search` - *look for the pattern anywhere within the string*
* `re.match`  - *look for the pattern at the start of the string only*
* Regexes can be pre-compiled with `re.compile` for reuse *(see example)*


*regex cheat sheet*:

| Pattern  | Meaning |
|---------:|:--------|
| `^`      | Start of string |
| `$`      | End of string |
| `[abc]`  | Any one character from list (a OR b OR c) |
| `[a-m]`  | Any one character from range (a OR b OR ... OR m) |
| `[^abc]` | Any one character not in list (NOT a NOR b NOR c) |
| `.`      | Any one character (except a newline, by default) |
| `\s`     | Any white space character |
| `\S`     | Any NON white space character |
| `\d`     | Any digit |
| `\D`     | Any NON digit |
| `\w`     | Any word character (alphanumeric or underscore) |
| `\W`     | Any NON word character |
| `()`     | Group together matches within parentheses (use with `re.search().group()`) |

###  Pattern Matching


In [None]:
import re
string = 'The quick brown fox jumps over the lazy dog.'

print('search:      {0}'.format(re.search('The quick brown fox', string)))
print('search_gr0:  {0}'.format(re.search('The quick brown fox', string).group(0)))

print('match:       {0}'.format(re.match('quick (brown) fox', string)))
print('match_gr0:   {0}'.format(re.match(r'\S* quick (brown) fox', string).group(0)))
print('match_gr1:   {0}'.format(re.match(r'\S* quick (brown) fox', string).group(1)))

regex = re.compile(r'^[\S\s]*(over) (the)[\s\S]*')    # raw strings keep the backslashes intact
print('compiled_match_gr0:  {0}'.format(regex.match(string).group(0)))
print('compiled_match_gr1:  {0}'.format(regex.match(string).group(1)))
print('compiled_match_gr2:  {0}'.format(regex.match(string).group(2)))
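
The difference between the matching functions is easiest to see on one short string. In this sketch, `re.search` scans the whole string, `re.match` only tries the start, and `re.fullmatch` requires the pattern to cover the entire string (the variable names `text` and `found` are just for illustration).

```python
import re

text = 'The quick brown fox'

# re.search scans the whole string for the first match.
print(re.search(r'quick', text) is not None)      # True

# re.match only succeeds if the pattern matches at the start.
print(re.match(r'quick', text) is not None)       # False
print(re.match(r'The', text) is not None)         # True

# re.fullmatch requires the pattern to cover the entire string.
print(re.fullmatch(r'The', text) is not None)     # False

# Parentheses capture groups, which .group(n) retrieves.
found = re.search(r'(\w+) fox', text)
print(found.group(0))    # brown fox
print(found.group(1))    # brown
```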

###  Debugging - PDB
* Drops the script into an interactive console when either:
    * The interpreter hits a `pdb.set_trace()` call (in Python 3.7+ you can use the built-in `breakpoint()` function instead).
    * The interpreter hits an *unhandled* exception (if the script was run with `python -m pdb <script.py>`).
* As in an interactive Python shell, any commands/functions you like can be called manually to test.

| Command  | Action |
|---------:|:-------|
| `l`      | Show code at current frame |
| `c`      | Continue code execution |
| `bob`    | Print the variable `bob` |
| `Ctrl-c` | Quit script as usual |