# Introduction to Python Programming

This tutorial provides an overview of Python specifically for the purposes of data analysis and machine learning.

## Table of Contents
  * [Modules/Names Imports](#Modules/Names-Imports)
  * [Variables Assignment](#Variables-Assignment)
  * [Base Types and Conversions](#Base-Types-and-Conversions)
  * [Mathematical Operations](#Mathematical-Operations)
  * [Boolean Logic](#Boolean-Logic)
  * [Container Types](#Container-Types)
    + [Lists](#Lists)
    + [Operations on Lists](#Operations-on-Lists)
    + [Dictionary](#Dictionary)
    + [Operations on Dictionaries](#Operations-on-Dictionaries)
    + [Generic Operations on Containers](#Generic-Operations-on-Containers)
  * [Strings](#Strings)
    + [Operations on Strings](#Operations-on-Strings)
  * [Conditional Statements](#Conditional-Statements)
    + [If](#If)
    + [If-else](#If-else)
    + [If-elif](#If-elif)
    + [Nested if](#Nested-if)
  * [Conditional Loop Statement](#Conditional-Loop-Statement)
    + [While](#While)
  * [Iterative Loop Statement](#Iterative-Loop-Statement)
    + [For](#For)
  * [Loop Control](#Loop-Control)
    + [Break](#Break)
    + [Continue](#Continue)
  * [Functions](#Functions)
    + [Object Introspection](#Object-Introspection)
  * [List Comprehension](#List-Comprehension)  
  * [Exercises](#Exercises)
    + [Possible solutions](#Possible-solutions)

In [1]:
# let's suppress warnings, as they can get annoying sometimes
import warnings
warnings.filterwarnings("ignore")

In [2]:
# if you want to execute a command but suppress its output,
# just end it with a semi-colon, just like below
2+2;

## Modules/Names Imports

A module is just a code library: a file containing a set of functions and variables you can include in your code. You can write your own modules, or just use other people's modules! The most commonly used modules in Python for data analytics are the NumPy and pandas modules.

In [3]:
import numpy as np
np.zeros((3,4))

array([[0., 0., 0., 0.],
       [0., 0., 0., 0.],
       [0., 0., 0., 0.]])

In [4]:
import pandas as pd
data_location = 'https://archive.ics.uci.edu/ml/machine-learning-databases/ionosphere/ionosphere.data'
df = pd.read_csv(data_location, header=None)
print(f'The ionosphere dataset has {df.shape[0]} rows and {df.shape[1]} columns.')

The ionosphere dataset has 351 rows and 35 columns.


## Variables Assignment

In Python, you can assign a value of one type to a variable and later reassign it to a value of a different type.

In [5]:
x = 3
print(x)
x = 'Hello'
print(x)

3
Hello


In [6]:
# Assignment to same value
y = z = 20
print(y)
print(z)

20
20


In [7]:
# Multiple assignments
a, b, c = 2, 5, 12
print(a, b, c)

2 5 12


In [8]:
# Values swap
a, b = b, a
print(a, b)

5 2


In [9]:
x = y = 5

# increment
x += 3  # x = x + 3

# decrement
y -= 2  # y = y - 2

print(x)
print(y)

8
3


In [10]:
# you can remove any variable from memory using the del statement
del x

## Base Types and Conversions

| Base Types | Description |
|----|---|
| int  | Integer |
| float  | Real |
| bool  | Boolean (True or False) |
| str  | String |

`int()` - constructs an integer from an integer literal, a float literal (by truncating toward zero, that is, dropping the fractional part), or a compatible string literal.

In [11]:
x = int("1")
print(x)
type(x)

1


int

In [12]:
y = int(15.99)
print(y)
# watch out: this will not work: y = int("15.99")

15
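For positive numbers, `int()` gives the same result as rounding down, but for negative numbers the two differ. The minimal sketch below compares `int()` with `math.floor()`; the variable names are made up for this illustration.

```python
import math

value = -15.99  # hypothetical example value

truncated = int(value)       # drops the fractional part (moves toward zero)
floored = math.floor(value)  # moves down to the next lower integer

print(truncated)  # -15
print(floored)    # -16
```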


In base Python, floats are 64-bit, but you can define floats and integers with other precision using NumPy. However, in base Python, integers have no limit! They can be arbitrarily large.

In [13]:
x = 123**45
print(x)

11110408185131956285910790587176451918559153212268021823629073199866111001242743283966127048043


`float()` - constructs a float number from an integer literal, a float literal, or a compatible string literal.

In [14]:
z = float("12.084")
print(z)
type(z)

12.084


float

`str()` - constructs a string from other compatible data types.

In [15]:
x = str("Hello")
print(x)
type(x)

Hello


str

In [16]:
y = str(3.45)
print(y)
type(y)

3.45


str

## Mathematical Operations

| Symbol | Task Performed |
|----|---|
| +  | Addition |
| -  | Subtraction |
| /  | Division |
| %  | Modulo (remainder) |
| *  | Multiplication |
| //  | Floor division |
| **  | Exponentiation (to the power of) |


In [17]:
# / always results in a float, even if the result is a whole number;
# // returns an int when both operands are ints, and a float otherwise
print(4/3)
x = 4/2
print(x)
type(x)

1.3333333333333333
2.0


float

In [18]:
13%10

3

In [19]:
3.9//2

1.0
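As a small sketch of how the result type of `//` depends on its operands, the example below uses made-up variable names and also shows that floor division rounds toward negative infinity.

```python
int_result = 7 // 2        # both operands are ints, so the result is an int
float_result = 7.0 // 2    # one float operand makes the result a float
negative_result = -7 // 2  # floor division rounds toward negative infinity

print(int_result, type(int_result))      # 3 <class 'int'>
print(float_result, type(float_result))  # 3.0 <class 'float'>
print(negative_result)                   # -4
```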

In [20]:
round(3.57, 1)

3.6

Expect some surprising behavior from `round()`. For instance, a value exactly halfway between two integers is rounded to the nearest even integer, so both `round(3.5, 0)` and `round(4.5, 0)` give `4.0`. Rounding is a tricky business, and round-off errors have even caused [fatalities](https://en.wikipedia.org/wiki/Round-off_error). For a detailed explanation of rounding in Python, please see [this](https://realpython.com/python-rounding/).
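The tie-breaking rule can be sketched as follows. The example values are chosen only for illustration, and the `decimal` module is shown as one possible way (not the only one) to get the "round half up" rule taught in school.

```python
from decimal import Decimal, ROUND_HALF_UP

# Values exactly halfway between two integers go to the nearest even integer.
tie_results = [round(x) for x in (0.5, 1.5, 2.5, 3.5)]
print(tie_results)  # [0, 2, 2, 4]

# Values that are not ties round to the nearest integer as usual.
print(round(2.4), round(2.6))  # 2 3

# The standard-library decimal module lets you pick the rounding rule explicitly.
half_up = Decimal("2.5").quantize(Decimal("1"), rounding=ROUND_HALF_UP)
print(half_up)  # 3
```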
    

In [1]:
round(3.5, 0)

4.0

In [21]:
round(4.5, 0)

4.0


In [22]:
abs(-5.6)

5.6

In [23]:
pow(5,3)

125

## Boolean Logic
| Symbol | Task Performed |
|----|---|
| == | True, if equal to |
| !=  | True, if not equal to |
| < | True, if less than |
| > | True, if greater than |
| <=  | True, if less than or equal to |
| >=  | True, if greater than or equal to |
| and | True, if both statements are true |
| or  | True, if at least one of the statements is true |
| not | Negates the result: False, if the statement is true |

In [6]:
a = 13
print(a)
print(a == 13)

13
True


In [7]:
print(a == 33)

False


Pay attention to the difference between equality (`==`, same value) and identity (`is`, the very same object).

In [8]:
x = y = [1, 2, 3]
z = [1, 2, 3]
print(x == y)
print(z == y)
print(x is y)  # this is True
print(z is y)  # this is False!

True
True
True
False
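One practical consequence of identity is aliasing: two names bound to the same list see each other's changes. The sketch below illustrates this under the assumption that a shallow copy made with `list()` is enough for a flat list.

```python
x = [1, 2, 3]
y = x        # y is another name for the same list object
z = list(x)  # z is a new list with the same contents (a shallow copy)

y.append(4)  # modifies the single object shared by x and y

print(x)               # [1, 2, 3, 4]
print(z)               # [1, 2, 3]
print(x is y, x is z)  # True False
```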


In [27]:
a != 33

True

In [28]:
a > 12

True

In [29]:
a <= 14

True

In [30]:
x = 6
print(x > 5 and x < 10)

True


In [31]:
y = 2
print(y > 8 or y < 4)

True


In [32]:
z = 4
print(not(z > 2 and z < 10))

False


## Container Types

### Lists

A list is an ordered sequence of elements enclosed in square brackets and separated by commas. Each element can be accessed by its index value.
You can put any combination of data types into a list.
An empty list is created by assigning `[]` or `list()` to a variable.

In [33]:
lst = [1, 1.23, True, "hello", None]
print(lst)

[1, 1.23, True, 'hello', None]


In [34]:
cars = ["Toyota", "Mercedes", "Ford"]

In [35]:
print(cars)

['Toyota', 'Mercedes', 'Ford']


In Python, indexing starts from 0. Thus, for instance, the list `cars`, which has three elements, has Toyota at index 0, Mercedes at index 1, and Ford at index 2.

In [36]:
cars[0]

'Toyota'

Indexing can also be done in reverse order, counting from the end of the list. Here, indexing starts from -1. Thus index -1 is Ford, index -2 is Mercedes, and index -3 is Toyota.

In [37]:
cars[-1]

'Ford'

Indexing is limited to accessing a single element. Slicing, on the other hand, accesses a sequence of elements inside the list. A slice `[start:stop]` includes the element at `start` but excludes the one at `stop`.

In [38]:
# pay attention to the range() function
# and how we use list() to convert the output to a proper list
num = list(range(0,10))
print(num)

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]


In [39]:
print(num[0:4])
print(num[4:])

[0, 1, 2, 3]
[4, 5, 6, 7, 8, 9]


In [40]:
print(num[-3:])  # get the 3 elements from the end

[7, 8, 9]


It is also possible to slice a parent list with a step length.

In [41]:
num[:9:3]

[0, 3, 6]

In [42]:
num[:9:5]

[0, 5]
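The general slice form is `[start:stop:step]`, and any of the three parts can be omitted. As a brief sketch (the variable names are illustrative), a negative step walks through the list backwards:

```python
num = list(range(10))

every_other = num[::2]    # from start to end, taking every second element
reversed_num = num[::-1]  # a negative step walks backwards
middle = num[2:8:2]       # from index 2 up to (not including) 8, step 2

print(every_other)   # [0, 2, 4, 6, 8]
print(reversed_num)  # [9, 8, 7, 6, 5, 4, 3, 2, 1, 0]
print(middle)        # [2, 4, 6]
```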

#### Operations on Lists

In [43]:
lst = [1,2,1,8,7]

`append()` is used to add an element to the end of the list.

In [44]:
lst.append(1)
print(lst)

[1, 2, 1, 8, 7, 1]


`extend()` is used to add another list at the end.

In [45]:
lst.extend([4,5])
print(lst)

[1, 2, 1, 8, 7, 1, 4, 5]
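A common source of confusion is passing a list to `append()` instead of `extend()`. The short sketch below, using two made-up lists, shows the difference:

```python
a = [1, 2]
b = [1, 2]

a.append([3, 4])  # adds the list itself as a single new element
b.extend([3, 4])  # adds each element of the given list

print(a)               # [1, 2, [3, 4]]
print(b)               # [1, 2, 3, 4]
print(len(a), len(b))  # 3 4
```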


Alternatively, we can use `+` to combine multiple lists (or multiple strings).

In [46]:
lst = lst + [1, 3]
print(lst)

[1, 2, 1, 8, 7, 1, 4, 5, 1, 3]


`insert(x, y)` inserts an element `y` at the specified index `x`, whereas `append()` can only add an element at the end.

In [47]:
lst.insert(5, 'hello')
print(lst)

[1, 2, 1, 8, 7, 'hello', 1, 4, 5, 1, 3]


`remove()` removes the first occurrence of an element, which is specified by its value rather than its index.

In [48]:
lst.remove('hello')
print(lst)

[1, 2, 1, 8, 7, 1, 4, 5, 1, 3]


The `sort()` method arranges the elements in ascending order **in place**. That is, the original list is updated with the new order. You can sort lists that are all numeric or all strings, but not a mix of the two.

In [49]:
lst_num = [3, 5, 1.23]
lst_num.sort()
print(lst_num)

[1.23, 3, 5]


In [50]:
lst_str = ['hello', 'world']
print(lst_str)
lst_str.sort(reverse=True)
print(lst_str)

['hello', 'world']
['world', 'hello']


In [51]:
lst_mix = [3, 5, 1.23, 'hello']
# this will not work: lst_mix.sort()

For reversing a list, use `reverse()`

In [52]:
lst_mix.reverse()
print(lst_mix)

['hello', 1.23, 5, 3]


If you do not want to modify the original list, use `sorted()` and `reversed()` and assign the result to a new variable.

In [53]:
lst = [3, 5, 1]
lst_new = sorted(lst)
print('original:', lst)
print('sorted:', lst_new)
lst_reversed = reversed(lst)  # this returns an iterator, not a list!
print('just reversed:', lst_reversed)
lst_reversed_list = list(lst_reversed)
print('reversed and re-listed:', lst_reversed_list)


original: [3, 5, 1]
sorted: [1, 3, 5]
just reversed: <list_reverseiterator object at 0x11aa96470>
reversed and re-listed: [1, 5, 3]


`count()` returns the number of times a particular element appears in the list. If the element is absent, it simply returns 0.

In [54]:
lst.count(1)

1

`index()` is used to find the index of a particular element. Note that if there are multiple elements of the same value, this returns the index of the first one. If there is none, it raises a `ValueError`.

In [55]:
lst

[3, 5, 1]

In [56]:
lst.index(1)

2
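Because `index()` raises an error for a missing element, it is often guarded. The sketch below shows one possible pattern; using `None` to signal "not found" is an assumption made for this example, not a Python convention you must follow.

```python
values = [3, 5, 1]  # hypothetical example list
target = 7

try:
    position = values.index(target)
except ValueError:
    position = None  # assumption: None signals "not found"

print(position)  # None

# Alternatively, check membership first with the in operator.
found = target in values
print(found)  # False
```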

For other methods that are available for a list (or any other data structure), you can use the **tab completion feature** of Jupyter Notebook. Just define a list, put a dot, and then hit the `tab` button.

In [57]:
lst.clear()

If you want your list to be immutable, that is, unchangeable, use the **tuple** container. You can define a tuple by `()` or `tuple()`.

In [58]:
tpl = (1,2,3)
# this will not work - you cannot change a tuple: tpl[0] = 3.45

If you want a set in a mathematical sense, use the **set** container. You can define a set by `set()`. Python has a rich collection of methods for sets such as union, intersection, set difference, etc.

In [59]:
st = set([1,1,1,2,2,2,2])
print(st)

{1, 2}
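The set operations mentioned above can be sketched with operators as follows. The two sets are made up for illustration, and `sorted()` is used only to make the printed order predictable.

```python
a = {1, 2, 3, 4}
b = {3, 4, 5}

union = a | b         # elements in either set
intersection = a & b  # elements in both sets
difference = a - b    # elements in a but not in b

print(sorted(union))         # [1, 2, 3, 4, 5]
print(sorted(intersection))  # [3, 4]
print(sorted(difference))    # [1, 2]
```

The same results are available through the methods `a.union(b)`, `a.intersection(b)`, and `a.difference(b)`.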


We will not use tuple or set very often in this course.

### Dictionary

Dictionaries are like a lookup table. To define a dictionary, equate a variable to `{}` or `dict()`. 

In [60]:
d0 = {}
d1 = dict()
print(type(d0), type(d1))

<class 'dict'> <class 'dict'>


In [61]:
d0 = {}
d0['One'] = 1
d0['Two'] = 2 
print(d0)

{'One': 1, 'Two': 2}


That is what a dictionary looks like. Now you can access the value `1` through its key `'One'`.

In [62]:
print(d0['One'])

1
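Accessing a key that does not exist with square brackets raises a `KeyError`. As a minimal sketch, the `get()` method offers a way to look up a key with a fallback value instead; the default of `0` is chosen arbitrarily for this example.

```python
d0 = {'One': 1, 'Two': 2}

existing = d0.get('One')          # key exists, so its value is returned
missing = d0.get('Four')          # key is absent, so None is returned
with_default = d0.get('Four', 0)  # key is absent, so the given default is returned

print(existing)      # 1
print(missing)       # None
print(with_default)  # 0
```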


#### Operations on Dictionaries

The `values()` method returns a view of all the values in the dictionary. Wrap it in `list()` if you need an actual list.

In [63]:
d0.values()

dict_values([1, 2])

`keys()` method returns all the keys in the dictionary.

In [64]:
d0.keys()

dict_keys(['One', 'Two'])

The `items()` method returns the key/value pairs. This method is especially useful inside a for loop.

In [65]:
d0.items()

dict_items([('One', 1), ('Two', 2)])

`update()` inserts the specified items into the dictionary. If a key already exists, its value is overwritten.

In [66]:
d1 = {"Three":3}
d0.update(d1)
d0

{'One': 1, 'Two': 2, 'Three': 3}

The `clear()` method removes all items from the dictionary.

In [67]:
d0.clear()
print(d0)

{}


### Generic Operations on Containers

In [68]:
num = list(range(10))
num.append(13.45)
print(num)

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 13.45]


To find the length of a list, that is, the number of elements in it, use the `len()` function.


In [69]:
len(num)

11

If the list consists of all numeric or all string elements, then `min()` and `max()` give the minimum and maximum values in the list.

In [70]:
min(num)

0

In [71]:
max(num)

13.45

In [72]:
num2 = num + ['hello']
# this won't work because not all elements are numeric: min(num2)
# min() and max() also work with strings:
st = ['one','two', 'three']
min(st)

'one'

How to check if a particular element is in a predefined list or dictionary:

In [73]:
# list
names = ['Earth','Air','Fire']

In [74]:
'Tree' in names

False

In [75]:
'Air' in names

True

For a dictionary, `in` checks the keys, not values.

In [76]:
d0 = {'One': 1, 'Two': 2, 'Three': 3}

In [77]:
"One" in d0

True

In [78]:
"Four" in d0

False
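If you need to test for a value rather than a key, one option is to apply `in` to the `values()` or `items()` view, as in this short sketch:

```python
d0 = {'One': 1, 'Two': 2, 'Three': 3}

key_check = 'One' in d0                 # looks at the keys
value_as_key = 1 in d0                  # 1 is a value, not a key
value_check = 1 in d0.values()          # looks at the values
pair_check = ('One', 1) in d0.items()   # looks at (key, value) pairs

print(key_check, value_as_key, value_check, pair_check)  # True False True True
```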

## Strings

Strings are immutable sequences of characters, defined by enclosing text in matching single, double, or triple quotes.

In [79]:
string0 = 'I love chocolate'
string1 = "I love 'coffee'"
string2 = '''I 
love 
bananas
'''

In [80]:
print(string0)
print(string1)
print(string2)

I love chocolate
I love 'coffee'
I 
love 
bananas



String indexing and slicing are similar to lists.

In [81]:
print(string0[2])
print(string0[7:])

l
chocolate


You cannot modify a string!

In [82]:
# This will not work: string0[0] = 'w'

Starting with Python 3.6, f-strings are the most convenient way of putting other variables inside strings.

In [83]:
name = 'pi'
val = 3.45
print(f'The value of {name} is {val}.')

The value of pi is 3.45.
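f-strings also accept a format specification after a colon, which is handy for controlling decimal places. The sketch below uses made-up variable names and values.

```python
name = 'ratio'
val = 2 / 3

two_decimals = f'{name} = {val:.2f}'  # fixed-point with 2 decimal places
as_percent = f'{val:.1%}'             # percentage with 1 decimal place
with_commas = f'{1234567:,}'          # thousands separator

print(two_decimals)  # ratio = 0.67
print(as_percent)    # 66.7%
print(with_commas)   # 1,234,567
```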


### Operations on Strings

The `find()` method returns the starting index of the first occurrence of a given sequence of characters in the string. If not found, it returns -1.

In [84]:
print(string0.find('I'))
print(string0.find('we'))

0
-1


`startswith()` method checks if a string starts with a particular sequence of characters.

In [85]:
print(string0.startswith('I love'))

True


`endswith()` method works similarly.

In [86]:
print(string0.endswith('a'))

False


The `count()` method counts the number of occurrences of a sequence of characters in the given string.

In [87]:
print(string1.count('e'))
print(string1.count('ee'))

3
1


`lower()` converts all uppercase characters to lowercase, and `upper()` does the reverse.

In [88]:
print(string0)
print(string0.lower())
print(string0.upper())

I love chocolate
i love chocolate
I LOVE CHOCOLATE


The `replace()` method returns a new string in which every occurrence of one substring is replaced with another.

In [89]:
string0_new = string0.replace('I','We all')
print(string0_new)

We all love chocolate


Try tab completion to see the full list of methods for strings.

In [90]:
st = 'aBc'
st.swapcase()

'AbC'

## Conditional Statements

### If

The statement block is executed only if the condition is true.

~~~~
    if logical condition:
        statements block
~~~~

Make sure you put ":" at the end!!!

In [91]:
a = 27
b = 300

if b > a:
  print("b is greater than a")

b is greater than a


### If-else

~~~~
    if logical condition:
        statements block    
    else:
        statements block
~~~~

In [92]:
a = 300
b = 27

if b > a:
  print("b is greater than a")
else:
  print("a is greater than b")

a is greater than b


### If-elif

~~~~
    if logical condition:
        statements block  
    elif logical condition:
        statements block 
    else:
        statements block
~~~~

In [93]:
a = 27
b = 27

if b > a:
  print("b is greater than a")
elif a == b:
  print("a and b are equal")
else:
  print("a is greater than b")

a and b are equal


### Nested if

An `if` statement placed inside another `if`, `if-elif`, or `if-else` statement is called a nested `if` statement.

In [94]:
a = 25
b = 30

if a > b:
    print("a > b")
elif a < b:
    print("a < b")
    if a == 25:
        print("a = 25")
    else:
        print("invalid")
else:
    print("a = b")

a < b
a = 25


## Conditional Loop Statement

### While

~~~~
  while some_condition:
    some code
~~~~

In [95]:
i = 1
while i < 5:
    print(i ** 2)
    i = i + 1
print('Bye')

1
4
9
16
Bye


## Iterative Loop Statement

### For

The statement block is executed for each item of a container or iterator.

~~~~
    for variable in something:
        statements block
~~~~

Note that the indentation is very important.

In [96]:
for i in range(5):
    print(i)

0
1
2
3
4


In [97]:
for i in [1,5,10,15]:
    print(i*5)

5
25
50
75


In [98]:
d0 = {'One': 1, 'Two': 2, 'Three': 3}
for key, value in d0.items():
    print(f'key is {key}, value is {value}.')

key is One, value is 1.
key is Two, value is 2.
key is Three, value is 3.
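Two built-in helpers are often used with `for` loops: `enumerate()` yields each item together with its index, and `zip()` walks through several containers in parallel. The sketch below assumes a made-up `years` list purely for illustration.

```python
cars = ["Toyota", "Mercedes", "Ford"]
years = [2018, 2020, 2015]  # hypothetical values for this example

# enumerate() pairs each item with its index
for index, car in enumerate(cars):
    print(index, car)

# zip() pairs up items from two lists position by position
pairs = []
for car, year in zip(cars, years):
    pairs.append(f'{car}: {year}')

print(pairs)  # ['Toyota: 2018', 'Mercedes: 2020', 'Ford: 2015']
```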


## Loop Control

### Break

As the name says, `break` exits a loop as soon as a condition becomes true.

In [99]:
for i in range(100):
    print(i)
    if i >= 5:
        break

0
1
2
3
4
5


### Continue

`continue` skips the rest of the current iteration and moves on to the next one. Use it when a condition should skip an item without terminating the loop.

In [100]:
for i in range(10):
    if i == 5:
        print("Skipping 5")
        continue
    else:
        print(i)

0
1
2
3
4
Skipping 5
6
7
8
9
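To contrast the two loop-control statements, here is a small sketch that uses both in one loop. The task (collecting squares of odd numbers up to 7) is invented for this example.

```python
odd_squares = []

for i in range(10):
    if i % 2 == 0:
        continue  # skip even numbers and go to the next iteration
    if i > 7:
        break     # leave the loop entirely
    odd_squares.append(i ** 2)

print(odd_squares)  # [1, 9, 25, 49]
```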


## Functions

A function in Python is defined using the keyword `def`, followed by a function name, a signature within parentheses `()`, and a colon `:`. Functions that return a value use the `return` keyword. If there is no return statement, the function implicitly returns `None`.

~~~~
    def function_name(identifier):
        """ documentation """
        statements block
        return
~~~~

In [101]:
def some_function():   
    print("test")

In [102]:
some_function()

test


In [103]:
def i_love(name):
  print("I love " + name)

In [104]:
i_love("hot chocolate.")

I love hot chocolate.


In [105]:
def five_times(x):
  return 5 * x

In [106]:
five_times(3)

15
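A function signature can also give a parameter a default value, which callers may override by position or by keyword. The `scale` function below is a hypothetical generalization of `five_times`, written only to sketch the idea.

```python
def scale(x, factor=5):
    """Return x multiplied by factor (5 unless specified otherwise)."""
    return factor * x

default_result = scale(3)            # uses the default factor of 5
keyword_result = scale(3, factor=2)  # overrides the default by keyword

print(default_result)  # 15
print(keyword_result)  # 6
```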

It is always a good habit to describe your functions. You can write the description right after declaring the function. For example, say we would like to create a `square` function to return the squared value given any number x.

In [107]:
def square(x):
    """
    Returns the square of the input.
    """
    return x ** 2

In [108]:
square(4)

16

To get the documentation of a function, use its `.__doc__` attribute as follows.

In [109]:
square.__doc__

'\n    Returns the square of the input.\n    '
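The raw docstring keeps its newlines and indentation. If you want a cleaned-up version, one option is `inspect.getdoc()` from the standard library, sketched below with the same `square` function redefined so the example is self-contained.

```python
import inspect

def square(x):
    """
    Returns the square of the input.
    """
    return x ** 2

raw_doc = square.__doc__            # includes leading newline and indentation
clean_doc = inspect.getdoc(square)  # indentation and blank edges removed

print(repr(raw_doc))
print(clean_doc)  # Returns the square of the input.
```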

### Object Introspection

For help with variables and functions, add `?` at the beginning.

In [110]:
# the output will appear in a box at the bottom of your browser.
?print

Docstring:
print(value, ..., sep=' ', end='\n', file=sys.stdout, flush=False)

Prints the values to a stream, or to sys.stdout by default.
Optional keyword arguments:
file:  a file-like object (stream); defaults to the current sys.stdout.
sep:   string inserted between values, default a space.
end:   string appended after the last value, default a newline.
flush: whether to forcibly flush the stream.
Type:      builtin_function_or_method


In [111]:
?string0

Type:        str
String form: I love chocolate
Length:      16
Docstring:
str(object='') -> str
str(bytes_or_buffer[, encoding[, errors]]) -> str

Create a new string object from the given object. If encoding or
errors is specified, then the object must expose a data buffer
that will be decoded using the given encoding and error handler.
Otherwise, returns the result of object.__str__() (if defined)
or repr(object).
encoding defaults to sys.getdefaultencoding().
errors defaults to 'strict'.


In [112]:
?square

Signature: square(x)
Docstring: Returns the square of the input.
File:      ~/Dropbox/ml_course_mats/Python_notebooks/<ipython-input-107-e78a16767571>
Type:      function


With functions, add `??` at the beginning to see the source code, if available.

In [113]:
??square

Signature: square(x)
Source:
def square(x):
    """
    Returns the square of the input.
    """
    return x ** 2
File:      ~/Dropbox/ml_course_mats/Python_notebooks/<ipython-input-107-e78a16767571>
Type:      function


## List Comprehension

We can create a list with a for-loop written inside square brackets, as below. This is called list comprehension and it is a very commonly used Python feature.

```python
[dosomethingwithx for x in sequence]
```

For example, say we would like to create a list of numbers ranging from 0 to 9.

In [114]:
[z for z in range(10)]

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

How to convert the above numbers to strings and then combine them:

In [115]:
st_lst = [str(z) for z in range(10)]
print(st_lst)
'-'.join(st_lst)

['0', '1', '2', '3', '4', '5', '6', '7', '8', '9']


'0-1-2-3-4-5-6-7-8-9'

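* List comprehension is very flexible as you can have conditional statements. A trailing `if` filters the items, while an `if`-`else` expression placed before `for` chooses between two results for every item.
  ```python
  [dosomethingwithx for x in sequence if somecondition]
  [dosomethingwithx if somecondition else dosomethingelsewithx for x in sequence]
  ```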

For example, how to return a list of even numbers ranging from 0 to 10:

In [116]:
[z for z in range(11) if z % 2 == 0]

[0, 2, 4, 6, 8, 10]
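To contrast filtering with the `if`-`else` form, here is a minimal sketch. The labels `'even'` and `'odd'` are arbitrary choices for this illustration.

```python
# if-else before "for": every item is kept, but mapped to one of two results
labels = ['even' if z % 2 == 0 else 'odd' for z in range(5)]
print(labels)  # ['even', 'odd', 'even', 'odd', 'even']

# if after "for": items failing the condition are dropped
odd_squares = [z ** 2 for z in range(10) if z % 2 == 1]
print(odd_squares)  # [1, 9, 25, 49, 81]
```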

## Exercises

### 1. List Comprehension
Return a list of numbers between 0 and 50 that are divisible by both 3 and 5.

### 2. Dictionaries

* Suppose we have the following dictionary.


In [117]:
course_names = {'MATH2319': 'Machine learning', 'MATH2350': 'Intro to Analytics'}

* Add a new course named "Categorical Data Analysis" with course code MATH1298.

### 3. Conditional statements and function

* Given a value `x`, write a function that checks whether `x` is a number and, if so, returns its squared value. If `x` is not a number, return `None`. Hint: `isinstance(2.4, (int, float))` returns `True`.

### Possible solutions

1.  List comprehension
```python 
[z for z in range(51) if z%3 == 0 and z%5 == 0] 
```
2. Dictionaries 
```python 
course_names['MATH1298'] = 'Categorical Data Analysis'
```
3. Conditional statements and function
```python
def square(x):
    """
    Return the square of x.
    """
    if isinstance(x, (int, float)):
        return x ** 2
    else:
        return None
```

## References

* [Python Lectures](https://github.com/rajathkmp/Python-Lectures)
* [Introduction-to-Python-Programming](https://nbviewer.jupyter.org/github/jrjohansson/scientific-python-lectures/blob/master/Lecture-1-Introduction-to-Python-Programming.ipynb)


***
MATH2319 - Machine Learning @ RMIT University