# Python Tutorial

*Heavily based on presentations by Guido van Rossum*

**Cai Hao**

*Wuhan University*

## Why Python?

<img src="https://upload.wikimedia.org/wikipedia/commons/6/66/Guido_van_Rossum_OSCON_2006.jpg" width=32% style="float:left;margin:10px"/> Guido van Rossum (Dutch pronunciation: [ˈɣido vɑn ˈrɔsʏm, -səm], born 31 January 1956) is a Dutch programmer who is best known as the author of the Python programming language. In the Python community, Van Rossum is known as a "Benevolent Dictator For Life" (BDFL), meaning that he continues to oversee the Python development process, making decisions where necessary. He was employed by Google from 2005 until 7 December 2012, where he spent half his time developing the Python language. In January 2013, Van Rossum started working for Dropbox.

- Open source general-purpose language.
- Object Oriented, Procedural, Functional
- Easy to interface with C/ObjC/Java/Fortran
- Easy-ish to interface with C++ (via SWIG)
- Great interactive environment

## 2.x / 3.x ???

## A Code Sample

In [2]:
x = 34 - 2298                 # A comment.
y = "Hello"                 # Another one.
z = 3.45
if z != 3.45 or y   != "Hello":
    x = x + 1
    y = y + " World"        # String concat.
print(x)
print(y)
print('ftyyuuii')

-2264
Hello
ftyyuuii


# The Basics

## Basic Datatypes

- Integers (default for numbers)

In [7]:
z = 5**10000
y = 17 % 5
print(z)
print(y)

5012372749206452009297555933742977749321567781338425839421429042279239530950784040189110696248422413361521849286205837208132744317689127301648216895479578118587420911116061228938487784021066813377355260810249578957216297123940899140227289847904219348790617130947725485922091082026037332572901667450308244739424469320405286659423089952215169224858097867951487315960824213807804541508387888670165230399223731310386419981753463384228772060548275645032353711693424365373253719432620188894349399903528047416971755052292544953318056897648743979899232752388415235736105766854907268028598192436537457324696920953561398194522336384570567700214002405948857185250269662785331014722297201440322907861768303031456236657554963604355994903236505725130255682348137049030831104449673903154785076431097092724702188118756848284287648468922653821950238866521237170923192257955945049604090449353681595213237784770177417776121421818210067714106625591849766609999257250590404933224647485161676975397309404488400824558647944

- Floats

In [3]:
x = 3.456

- Strings

In [9]:
# Can use "" or '' to specify
"abc" == 'abc'
a = "abc"
a[0]

'a'

In [10]:
# Unmatched can occur within the string
print('matt"s')

matt"s


In [11]:
# Use triple double-quotes for multi-line strings or strings
# that contain both ' and " inside of them
print("""a'b"c""")

a'b"c


## Whitespace

Whitespace is meaningful in Python: especially
indentation and placement of newlines. 

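- Use a newline to end a line of code.
  - Use a backslash `\` at the end of a line when a statement must continue on the next line.
  - Inside a string, `\n` inserts a line break into the text.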

In [7]:
print("test\ntest")

test
test


- No braces { } to mark blocks of code in Python…      
  Use consistent indentation instead.

In [15]:
for i in ['a','m','d','xyz']:
    print(i, end="+")
range(10)

a+m+d+xyz+

range(0, 10)

- Often a colon appears at the start of a new block.        
  (E.g. for function and class definitions.)

In [17]:
def f(x):
    return(x + 10)
print(f(12))
print(f(12.1))
print(f("abc"))

22
22.1


TypeError: must be str, not int

## Comments

- Start comments with # – the rest of line is ignored.
- Can include a "documentation string" as the first line of 
  any new function or class that you define
- The development environment, debugger, and other tools use
  it: it’s good style to include one.

In [8]:
def my_function(x, y):
    """This is the docstring. This
    function does blah blah blah."""
    # The code would go here...
print(my_function.__doc__)

This is the docstring. This
    function does blah blah blah.


## Assignment

- Binding a variable in Python means setting a name to hold a
  reference to some object.
  - Assignment creates references, not copies

- Names in Python do not have an intrinsic type. Objects have
  types.
  - Python determines the type of the reference automatically based on the
    data object assigned to it.

- You create a name the first time it appears on the left side of
  an assignment statement:

In [12]:
x = 3

- A reference is deleted via garbage collection after any names
  bound to it have passed out of scope.
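
The points above can be checked with the built-in `is` operator and `type()`. This is only a minimal sketch; the names `first` and `second` are made up for illustration.

```python
first = [1, 2, 3]      # the name first now refers to a list object
second = first         # no copy is made: second refers to the same object
same_object = first is second
print(same_object)     # True

first = "hello"        # rebinding: first now refers to a str object
print(type(first).__name__, type(second).__name__)   # str list
```

The name `first` has no type of its own; it simply refers to whatever object was last assigned to it.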

## Accessing Non-Existent Names

If you try to access a name before it’s been properly created
(by placing it on the left side of an assignment), you’ll get an
error. 

In [18]:
yy

NameError: name 'yy' is not defined

In [14]:
yy = 3
yy

3

## Multiple Assignment

You can also assign to multiple names at the same time.

In [15]:
x, y = 2, 3
x

2

In [16]:
y
def = 9

SyntaxError: invalid syntax (<ipython-input-16-3864645d37d4>, line 2)

## Naming Rules

- Names are case sensitive and cannot start with a number.
  They can contain letters, numbers, and underscores, e.g.
  `bob`, `Bob`, `_bob`, `_2_bob`, `bob_2`, `BoB`

- There are some reserved words:
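
The exact set of reserved words depends on the Python version, so one way to see it is to ask the interpreter itself. The sketch below uses the standard-library `keyword` module; the name `bob` is just the sample identifier from above.

```python
import keyword

print(keyword.kwlist)               # every reserved word in this Python version
print(keyword.iskeyword("def"))     # True
print(keyword.iskeyword("bob"))     # False
```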

# Understanding Reference Semantics in Python

## Understanding Reference Semantics

- Assignment manipulates references

In [17]:
x = y
# does not make a copy of the object y references
# makes x reference the object y reference

- Very useful; but beware!

In [18]:
a = [1, 2, 3] # a now references the list [1, 2, 3]
b = a # b now references what a references
a.append(4) # this changes the list a references
print(b) # if we print what b references,

[1, 2, 3, 4]


**Why???**

### There is a lot going on when we type:

In [19]:
x = 3

- First, an integer 3 is created and stored in memory
- A name x is created
- A reference to the memory location storing the 3 is then
  assigned to the name x
- So: When we say that the value of x is 3,     
  we mean that x now refers to the integer 3

In [19]:
%load_ext nbtutor

In [20]:
%%nbtutor -r -f
x = 3

- In Python, the datatypes integer, float, and string 
  (and tuple) are "immutable"

- If we increment x, then what's really happening is:

In [21]:
x = 3
x = x + 1
print(x)

4


1. The reference of name x is looked up.    
2. The value at that reference is retrieved.
3. The 3+1 calculation occurs, producing a new data element 4 which is assigned to a fresh memory location with a new reference.
4. The name x is changed to point to this new reference.
5. The old data 3 is garbage collected if no name still refers to it.
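
The rebinding described in these steps can be observed with the built-in `id()`, which returns an identity for the object a name currently refers to (in CPython this happens to be its memory address). A small sketch, with `before` and `after` as illustrative names:

```python
x = 3
before = id(x)          # identity of the object x refers to now
x = x + 1
after = id(x)           # identity of the object x refers to afterwards
print(x)                # 4
print(before == after)  # False: x was rebound to a different object
```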

In [23]:
%%nbtutor
x = 3
x = x + 1
print(x)

4


## Assignment 1

So, for simple built-in datatypes (integers, floats, strings),
assignment behaves as you would expect:

In [12]:
%%nbtutor -r -f
x = 3 # Creates 3, name x refers to 3
y = x # Creates name y, refers to 3.
y = 4 # Creates ref for 4. Changes y.
print(x) # No effect on x, still ref 3.

3


## Assignment 2

For other data types (lists, dictionaries, user-defined types), assignment
works differently. 

- These datatypes are “mutable.” 
- When we change these data, we do it in place. 
- We don’t copy them into a new memory address each time. 
- If we type y=x and then modify y, both x and y are changed.

**Why? Changing a Shared List**

In [13]:
%%nbtutor -r -f
a = [1, 2, 3] # a now references the list [1, 2, 3]
b = a # b now references what a references
a.append(4) # this changes the list a references
print(b) # if we print what b references

[1, 2, 3, 4]
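
When sharing is not what you want, one option is to bind the second name to a copy instead. The sketch below assumes a flat list of immutable items; the name `c` is added here for illustration.

```python
a = [1, 2, 3]
b = a            # b shares the list with a
c = list(a)      # c is a new list holding the same items (a[:] works too)
a.append(4)
print(b)         # [1, 2, 3, 4]
print(c)         # [1, 2, 3]
```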


# Sequence types:

Tuples, Lists, and Strings

1) Tuple
    - A simple immutable ordered sequence of items
    - Items can be of mixed types, including collection types

2) Strings
    - Immutable
    - Conceptually very much like a tuple

3) List
    - Mutable ordered sequence of items of mixed types

## Similar Syntax

- All three sequence types (tuples, strings, and lists)
  share much of the same syntax and functionality.

- Key difference:
  - Tuples and strings are immutable
  - Lists are mutable

- The operations shown in this section can be
  applied to all sequence types
  - most examples will just show the operation
    performed on one
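
As a quick illustration of the shared syntax, the loop below applies the same indexing, slicing, `len()` and `in` operations to one value of each type. The list `samples` is an illustrative name, not part of the original slides.

```python
samples = [(1, 2, 3), [1, 2, 3], "123"]
for seq in samples:
    # The same expressions work whether seq is a tuple, a list, or a string.
    print(type(seq).__name__, seq[0], seq[1:], len(seq), seq[-1] in seq)
```

Each slice has the same type as the sequence it was taken from.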

## Definition of sequence object

- Tuples are defined using parentheses (and commas).

In [24]:
tu = (23, 'abc', 4.56, (2,3), 'def')
tu[0] = 24

TypeError: 'tuple' object does not support item assignment

- Lists are defined using square brackets (and commas).

In [15]:
li = ["abc", 34, 4.34, 23]
li.append(8)
li[0] = 11111
print(li)

[11111, 34, 4.34, 23, 8]


- Strings are defined using quotes (', ", or """).

In [27]:
st = "Hello World"
st = 'Hello World'
st = """This is a multi-line
string that uses triple quotes."""
print(st)

This is a multi-line
string that uses triple quotes.


## How to access individual members of a sequence object?

- We can access individual members of a tuple, list, or string
  using square bracket “array” notation.

- Note that all are 0 based… 

In [30]:
tu = (23, 'abc', 4.56, (2,3), 'def')
tu[1] # Second item in the tuple.

'abc'

In [31]:
li = ["abc", 34, 4.34, 23]
li[1] # Second item in the list.

34

In [32]:
st = "Hello World"
st[1] # Second character in string.

'e'

## Positive and negative indices

In [19]:
tu = (23, 'abc', 4.56, (2,3), 'def')
tu[1] # Second item in the tuple.

'abc'

In [20]:
tu[-3]

4.56

## Slicing: Return Copy of a Subset

- Return a copy of the container with a subset of the original
  members. Start copying at the first index, and stop copying
  **before** the second index. 

In [35]:
tu = (23, 'abc', 4.56, (2,3), 'def')
tu[1:4]

('abc', 4.56, (2, 3))

- You can also use negative indices when slicing. 

In [36]:
tu[1:-1]

('abc', 4.56, (2, 3))

- Omit the first index to make a copy starting from the beginning
  of the container.

In [37]:
tu[:2]

(23, 'abc')

- Omit the second index to make a copy starting at the first index
  and going to the end of the container.

In [38]:
tu[2:]

(4.56, (2, 3), 'def')

## Copying the Whole Sequence

- To make a copy of an entire sequence, you can use [:]
- Note the difference between these two lines for mutable
  sequences:

In [22]:
%%nbtutor -r -f
li = [23, 'abc', 4.56, (2,3), 'def']
li2 = li     # Two names refer to one list;
             # changing the list through one name affects both
li3 = li[:]  # A new list: an independent (shallow) copy
li4 = li[1:4]
print(li4)

['abc', 4.56, (2, 3)]


## The ‘in’ Operator

- Boolean test whether a value is inside a container:

In [23]:
t = [1, 2, 4, 5]
3 in t
for i in t:
    print(i, end=',')

1,2,4,5,

In [41]:
4 in t 

True

In [42]:
4 not in t
not(4 in t)

False

- For strings, tests for substrings

```python
a = 'abcde'
'c' in a
```

```python
'cd' in a
```

```python
'ac' in a
```

- Be careful: the in keyword is also used in the syntax of
  for loops and list comprehensions.

```python
for i in [1, 2, 8, 12]:
    print(i, ',', end='')
```

## The + Operator

The + operator produces a new tuple, list, or string whose
value is the concatenation of its arguments.

```python
(1, 2, 3) + (4, 5, 6)
```

```python
[1, 2, 3] + [4, 5, 6]
```

```python
"Hello" + " " + "World"
```

## The * Operator

The * operator produces a new tuple, list, or string that
“repeats” the original content.

```python
(1, 2, 3) * 3
```

```python
[1, 2, 3] * 3
```

```python
"Hello" * 3
```

# Mutability:

Tuples vs. Lists

## Tuples: Immutable

```python
tu = (23, ['e','f','a'], 4.56, (2, 3), 'def')
tu[1].append(8)
tu[1]
```

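- You can’t change a tuple itself: you cannot add, remove, or replace its items. (A mutable item stored inside it, such as the list `tu[1]` above, can still be modified in place.)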
- You can make a fresh tuple and assign its reference to a previously used
  name.

```python
tu = (23, 'abc', 3.14, (2, 3), 'def')
```

## Lists: Mutable

```python
li = ["abc", 34, 4.34, 23]
li[1] = 45
li
```

- We can change lists in place.
- Name li still points to the same memory reference when we’re
  done.
- This flexibility has a cost: in CPython a list usually needs somewhat more memory than a tuple with the same items, and tuples can be slightly faster to create.

```python
li = [1, 11, 3, 4, 5]
li.append('a')
li
```

```python
li.insert(2, 'i')
li
```

## The extend method vs the + operator

- '+' creates a fresh list (with a new memory reference)
- extend operates on list li in place

```python
li = [1, 11, 3, 4, 5]
li.extend([9, 8, 7]) 
li
li + [8, 10]
```

*Confusing:*

- `extend` takes a list (or any other iterable) and adds each of its items.
- `append` takes a single object and adds it as one item, even if that object is itself a list.

```python
li.append([10, 11, 12])
li
```
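
A compact side-by-side view of the three operations, as a sketch; the values and the name `new` are arbitrary.

```python
li = [1, 2]
li.extend([3, 4])     # adds each element of the argument
print(li)             # [1, 2, 3, 4]
li.append([5, 6])     # adds the argument itself as one element
print(li)             # [1, 2, 3, 4, [5, 6]]
new = li + [7]        # builds a new list; li is unchanged
print(new is li)      # False
```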

## Operations on Lists Only 3

```python
li = ['a', 'b', 'c', 'b']
```

```python
li.index('b')   # index of the first occurrence
```

```python
li.count('b')   # number of occurrences
```

```python
li.remove('b')  # remove the first occurrence
```

```python
li
```

## Operations on Lists Only 4

```python
li = [5, 2, 6, 8]
```

```python
li.reverse()    # reverse the list *in place*
li
```

```python
li.sort()       # sort the list *in place*
li
```

## Tuples vs. Lists

Lists are slower but more powerful than tuples.

- Lists can be modified, and they have lots of handy operations we can
  perform on them.
- Tuples are immutable and have fewer features
- To convert between tuples and lists use the list() and tuple()
  functions:

```python
li = list(tu)
tu = tuple(li)
```

# Dictionaries

## Dictionaries: A Mapping type

- Dictionaries store a mapping between a set of keys
  and a set of values.
  - Keys can be any immutable type.
  - Values can be any type
  - A single dictionary can store values of different types
- You can define, modify, view, lookup, and delete
  the key-value pairs in the dictionary.
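
A small sketch of the key rules. The dictionary `ages` and the flag `list_key_allowed` are illustrative names; `get` and `in` are standard dictionary operations.

```python
ages = {"ann": 31, (2, 3): "tuple key"}   # str and tuple keys are both allowed
print("ann" in ages)        # True: in tests the keys
print(ages.get("bob", 0))   # 0: get() returns a default instead of raising KeyError

try:
    {[1, 2]: "list key"}    # a list is mutable, so it cannot be used as a key
    list_key_allowed = True
except TypeError:
    list_key_allowed = False
print(list_key_allowed)     # False
```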

## Using dictionaries

In [24]:
d = {'user':'bozo', 'pswd':1234}
print(d['user'])
print(d['pswd'])
print(d['bozo'])

bozo
1234


KeyError: 'bozo'

```python
d = {'user':'bozo', 'p':1234, 'i':34}
del d['user'] # Remove one.
print(d)
d.clear() # Remove all.
print(d)
```

```python
d = {'user':'bozo', 'pswd':1234}
d['user'] = 'clown'
print(d)
d['id'] = 45
print(d)
```

```python
d = {'user':'bozo', 'p':1234, 'i':34}
print(d.keys()) # View of the keys.
print(d.values()) # View of the values.
print(d.items()) # View of the (key, value) tuples.
```

# Functions

- *def* creates a function and assigns it a name
- *return* sends a result back to the caller 
- Arguments are passed by assignment
- Arguments and return types are not declared
- General form:

      def <name>(arg1, arg2, ..., argN):
          <statements>
          return <value>

```python
def times(x,y):
    return x*y
```

```python
times(10, 11)
```

## Passing Arguments to Functions

- Arguments are passed by assignment
- Passed arguments are assigned to local names
- Assignment to argument names doesn't affect the caller
- Changing a mutable argument may affect the caller



In [27]:
%%nbtutor -r -f
def changer(x,y):
    x = 2           # changes local value of x only 
    y[0] = 'hi'     # changes shared object
x = 33
y = [1, 2, 3]
changer(x, y)
print("x=", x)
print("y=", y)

x= 33
y= ['hi', 2, 3]


## Optional Arguments 

Can define defaults for arguments that need not be passed

```python
def func(a, b, c=10, d=100):
    print(a, b, c, d)
```

```python
func(a=1, b=2, d=3)
```

```python
func(1, 2, 3, 4)
```

## Gotchas

- All functions in Python have a return value
  - even if no return line inside the code.
- Functions without a return return the special value **None**. 
- There is no function overloading in Python.
  - Two different functions can’t have the same name, even if they have different arguments. 
- Functions can be used as any other data type. They can be: 
  - Arguments to function
  - Return values of functions
  - Assigned to variables
  - Parts of tuples, lists, etc
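
The sketch below illustrates two of these points: a function without `return` yields `None`, and functions can be passed around like any other value. All function names here (`greet`, `apply_twice`, `add_three`) are made up for the example.

```python
def greet(name):
    print("Hello,", name)     # no return statement

result = greet("world")
print(result)                 # None

def apply_twice(func, value):
    # func is an ordinary argument that happens to be a function
    return func(func(value))

def add_three(n):
    return n + 3

print(apply_twice(add_three, 10))    # 16
handlers = [add_three, len]          # functions stored in a list
print(handlers[1]("abc"))            # 3
```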

## Examples

In [28]:
x = 3
if x == 3:
    print("X equals 3.")
elif x == 2:
    print("X equals 2.")
else:
    print("X equals something else.")
print("This is outside the 'if'.")

X equals 3.
This is outside the 'if'.

```python
x = 3
while x < 10:
    if x > 7:
        x += 2
        continue
    x = x + 1
    print("Still in the loop.")
    if x == 8:       
        break 
print("Outside of the loop.")

```

```python
for x in range(10):   
    if x > 7:       
        x += 2       
        continue   
    x = x + 1   
    print("Still in the loop.")   
    if x == 8:       
        break 
print("Outside of the loop.")

```

```python
number_of_players = 3
assert number_of_players < 5
```

# Modules

## Why Use Modules? 

- Code reuse
  - Routines can be called multiple times within a program
  - Routines can be used from multiple programs 
- Namespace partitioning
  - Group data together with functions used for that data 
- Implementing shared services or data
  - Can provide global data structure that is accessed by multiple subprograms

## Modules 

- Modules are functions and variables defined in separate files
- Items are imported using `from` or `import`:

      from module import function
      function()

      import module
      module.function()

- Modules are namespaces
  - Can be used to organize variable names, i.e. 
    atom.position = atom.position - molecule.position
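
For a concrete case, here are both import styles applied to the standard-library `math` module, which is chosen here only as a convenient example.

```python
import math                  # bind the module name; reach members with a dot
print(math.sqrt(16))         # 4.0

from math import floor       # bind just one name from the module
print(floor(2.7))            # 2
```

The first form keeps `sqrt` inside the `math` namespace; the second places `floor` directly in the current namespace.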

# Classes and Objects

## What is an Object? 

Object Oriented Design focuses on

- Encapsulation:
  - dividing the code into a public interface, and a private implementation of that interface 
- Polymorphism:
  - the ability to overload standard operators so that they have appropriate behavior based on their context 
- Inheritance:
  - the ability to create subclasses that contain specializations of their parents

```python
class atom(object): 
    def __init__(self,atno,x,y,z): 
        self.atno = atno 
        self.position = (x,y,z) 
    def __repr__(self): # overloads printing 
        return '%d %10.4f %10.4f %10.4f'\
        %(self.atno, self.position[0], self.position[1], self.position[2])

```

```python
at = atom(6,0.0,1.0,2.0)
print(at)
```

## Atom Class 

- Defined our own initializer, `__init__`, in place of the default one
- Defined instance attributes (`atno`, `position`) that are persistent and local to each atom object
- Good way to manage shared memory:
  - instead of passing long lists of arguments, encapsulate some of this data into an object, and pass the object.
  - much cleaner programs result
- Defined `__repr__`, which `print` uses to display the object

- We now want to use the atom class to build molecules...

## Molecule Class

```python
class atom(object): 
    def __init__(self,atno,x,y,z): 
        self.atno = atno 
        self.position = (x,y,z) 
    def __repr__(self): # overloads printing 
        return '%d %10.4f %10.4f %10.4f'\
        %(self.atno, self.position[0], self.position[1], self.position[2])

class molecule: 
    def __init__(self,name='Generic'): 
        self.name = name
        self.atomlist = []
    def addatom(self,atom): 
        self.atomlist.append(atom)
    def __repr__(self): 
        _str = 'This is a molecule named %s\n' % self.name
        _str = _str + 'It has %d atoms\n' % len(self.atomlist)
        for atom in self.atomlist:
            _str = _str + "%r" % atom + '\n'
        return _str

```

```python
mol = molecule('Water')
at = atom(8,0.,0.,0.)
mol.addatom(at)
mol.addatom(atom(1,0.,0.,1.))
mol.addatom(atom(1,0.,1.,0.))
print(mol) 
```

- Note that printing the molecule calls each atom's `__repr__`: the `%r` format in `molecule.__repr__` asks every atom for its own representation
  - Code reuse: only have to type the code that prints an atom once; this means that if you change the atom specification, you only have one place to update

## Inheritance

```python
class qm_molecule(molecule):
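    def addbasis(self):
        # add_bf is not defined in this tutorial; it is assumed to return
        # the basis list extended with the basis functions for one atom.
        self.basis = []
        for atom in self.atomlist:
            self.basis = add_bf(atom,self.basis)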
```

- `__init__`, `__repr__`, and `addatom` are taken from the parent class (molecule) 
- Added a new function addbasis() to add a basis set 
- Another example of code reuse
  - Basic functions don't have to be retyped, just inherited
  - Less to rewrite when specifications change
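
Because `add_bf` is not defined in this tutorial, here is a self-contained sketch of the same idea that can be run as is. The subclass `charged_molecule` and its `setcharge` method are hypothetical, and plain strings stand in for atom objects.

```python
class molecule:
    def __init__(self, name='Generic'):
        self.name = name
        self.atomlist = []
    def addatom(self, atom):
        self.atomlist.append(atom)

class charged_molecule(molecule):      # hypothetical subclass, for illustration only
    def setcharge(self, charge):       # new method; everything else is inherited
        self.charge = charge

ion = charged_molecule('Hydroxide')    # inherited __init__
ion.addatom('O')                       # inherited addatom
ion.addatom('H')
ion.setcharge(-1)
print(ion.name, len(ion.atomlist), ion.charge)    # Hydroxide 2 -1
print(isinstance(ion, molecule))                  # True
```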

## Overloading

```python
class qm_molecule(molecule):
    def __repr__(self):
        _str = 'QM Rules!\n'      # _str avoids shadowing the built-in str
        for atom in self.atomlist:
            _str = _str + "%r" % atom + '\n'
        return _str
```

- Now we only inherit `__init__` and addatom from the parent 
- We define a new version of `__repr__` specially for QM

## Public and Private Data 

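- Inside a class, a name with two leading underscores (and no trailing ones) is treated as private: Python mangles it to `_ClassName__name`, so it is hard to reach from outside, e.g. `__a`, `__my_variable`
- Anything with one leading underscore is semiprivate by convention, and you should feel guilty accessing this data directly, e.g. `_b`
  *Sometimes useful as an intermediate step to making data private*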
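
A short sketch of how the two conventions behave in practice. The class `account` and its attributes are hypothetical names chosen for this example.

```python
class account:                      # hypothetical class, for illustration only
    def __init__(self):
        self._hint = 'semiprivate'  # convention only: still directly accessible
        self.__secret = 'private'   # stored under the mangled name _account__secret

acct = account()
print(acct._hint)                   # semiprivate
print(hasattr(acct, '__secret'))    # False
print(acct._account__secret)        # private
```

The last line shows that double-underscore names are hidden rather than truly protected: the mangled name is still reachable if you spell it out.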

# The Extra Stuff...

## File I/O, Strings, Exceptions...

```python
try: 
    1 / 0
except ZeroDivisionError:   # catch only the error we expect
    print('That was silly!')
finally: 
    print('This gets executed no matter what') 
```

```python
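fileptr = open('filename')
somestring = fileptr.read()    # read the whole file into one string
fileptr.seek(0)                # rewind: read() left the position at the end of the file
for line in fileptr:
    print(line, end='')        # each line already ends with a newline
fileptr.close()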
```
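
File handling and exceptions are often combined. The sketch below uses a `with` block, which closes the file automatically, and catches `OSError` in case the file cannot be opened. The scratch file `example.txt` and the list `lines` are assumptions made so that the example is self-contained.

```python
import os
import tempfile

# Create a small scratch file so the example can run anywhere.
path = os.path.join(tempfile.mkdtemp(), 'example.txt')
with open(path, 'w') as out:
    out.write('first line\nsecond line\n')

lines = []
try:
    with open(path) as fileptr:      # the file is closed when the block exits
        for line in fileptr:
            lines.append(line.rstrip('\n'))
except OSError as err:               # e.g. the file does not exist
    print('Could not read file:', err)

print(lines)                         # ['first line', 'second line']
```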