## Lecture 11a. Python Basics - Wrap-up 1: Python Basics

- Keywords, Statements and Comments
- Variables
- Datatypes
    - Numbers
    - Strings
    - Booleans
    - Lists
    - Sets
    - Tuples
    - Dictionaries
- Operators

### 0. What is Python? 
- Python is a cross-platform programming language, meaning it runs on multiple platforms such as Windows, macOS, Linux and other Unix systems. It is free and open source, and it is one of the most popular programming languages for data science.
- Recall the **print** function and the "Hello world!" program you've done:

In [12]:
print('hello world!')

hello world!


### 1. Keywords 
- Keywords are the reserved words in Python.

- We **cannot** use a keyword as a variable name, function name or any other identifier. Keywords are used to define the syntax and structure of the Python language.

- In Python, keywords are **case sensitive**.

- Python 3.6 and earlier have 33 keywords (listed below). Python 3.7 added **async** and **await**, for a total of 35. This number can change in future Python versions.

- All the keywords except True, False and None are in lowercase, and they must be written exactly as shown. The list of keywords is given below:

    - **False, class, finally, is, return**
    - **None, continue, for, lambda, try**
    - **True, def, from, nonlocal, while**
    - **and, del, global, not, with**
    - **as, elif, if, or, yield**
    - **assert, else, import, pass**
    - **break, except, in, raise**
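Because the keyword list depends on the Python version, it can help to ask the interpreter directly. The sketch below uses the standard-library `keyword` module; the variable names are only for illustration.

```python
import keyword

# keyword.kwlist holds every reserved word of the running interpreter
print(len(keyword.kwlist))

# keyword.iskeyword() tells you whether a name is reserved
is_for_reserved = keyword.iskeyword("for")      # 'for' is a keyword
is_print_reserved = keyword.iskeyword("print")  # 'print' is a built-in function, not a keyword
print(is_for_reserved, is_print_reserved)       # True False
```

The printed count is left without an expected value on purpose, since it differs between Python versions.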

### 2. Python Statements & Comments

#### 2.1 Statements

- Instructions that the Python interpreter can execute are called statements. For example, "a = 1" is an **assignment statement**. You've also used the **if statement**, the **for statement** etc., which will be reviewed later.
- In Python, the end of a statement is marked by a newline character. But we can make a statement extend over multiple lines with the line continuation character (\). For example:

In [3]:
x = 1 + 2 + 3 + \
    4 + 5 + 6 + \
    7 + 8 + 9
print(x)

45


This is explicit line continuation. In Python, line continuation is implied inside parentheses ( ), brackets [ ] and braces { }. For instance, we can implement the above multi-line statement as:

In [4]:
x = (1 + 2 + 3 + 
     4 + 5 + 6 + 
     7 + 8 + 9)
print(x)

45
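The same implicit continuation applies inside brackets and braces, which is how long lists and dictionaries are usually laid out. A minimal sketch (the names `colors` and `point` are made up for this example):

```python
# Implicit continuation inside brackets: a list spread over several lines
colors = [
    "red",
    "green",
    "blue",
]

# Implicit continuation inside braces: a dictionary spread over several lines
point = {
    "x": 1,
    "y": 2,
}

print(len(colors), point["x"] + point["y"])  # 3 3
```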


- You can also put **multiple statements on a single line** using semicolons, as follows:

In [6]:
a = 1; b = 2; c = 3

#### 2.2 Comments

- Comments are very important when writing a program. They describe what is going on inside the program so that a person reading the source code does not have a hard time figuring it out. You might forget the key details of a program you wrote just a month ago, so taking the time to explain these details in comments always pays off.

- In Python, we use the **hash (#) symbol** to start writing a comment.

- Comments help programmers understand a program. The Python interpreter ignores comments (they are not executed).

In [11]:
# This is a long comment
# and it extends
# to multiple lines, it's not executed

#### 2.3 Python Indentation (Code blocks)

Most programming languages, such as C, C++ and Java, use braces { } to define a block of code. Python uses indentation.

- A code block (body of a function, loop etc.) starts with **indentation** and ends with the first unindented line. The amount of indentation is up to you, but it must be consistent throughout that block.

- The enforcement of indentation in Python makes the code look neat and clean. This results in Python programs that look similar and consistent.

- Generally **four spaces** are used for indentation and are preferred over tabs (since tabs may be displayed differently on different platforms). Here is an example (we will review the for-loop later):

In [14]:
for i in range(1,11):
    print(i)
    if i == 5:
        break

1
2
3
4
5


### 3. Variables

In Python (as in most programming languages), a variable is a named location used to store data in memory. Each variable must have a **unique** name, called an identifier. It is helpful to think of variables as containers that hold data which can be changed later in the program.

Informally, you can think of a variable as a bag that stores books, where the books can be replaced at any time.

#### 3.1 Declaring Variables in Python

- In Python, variables do not need declaration to reserve memory space. The "variable declaration" or "variable initialization" happens automatically when we assign a value to a variable. 

#### 3.2 Assigning value to a Variable
- Use the assignment operator **=** to _assign_ a value to a variable: 

In [15]:
website = "Apple.com" # create a variable named website, then assign value 'Apple.com' to it
print(website)

Apple.com


More ways to assign values to variables:

In [21]:
a, b, c, d = 5, 3.2, "Hello", True
print(d,c,b,a)

True Hello 3.2 5


or

In [20]:
x = y = z = "same"
print(x,y,z)

same same same


#### 3.3 Name your variables -  Suggested Rules 

To name your variables properly, here are some popular choices:

- lowercase - use this for variables
- lower_case_with_underscores - or this
- mixedCase - or this
- UPPERCASE - use this for constants
- UPPER_CASE_WITH_UNDERSCORES - or this
- CapitalizedWords (or CapWords, or CamelCase -- so named because of the bumpy look of its letters). 

Also, something to avoid:

- Don't use characters 'l' (lowercase letter el), 'O' (uppercase letter oh), or 'I' (uppercase letter eye) as single character variable names. These are easily confused in some fonts with '1' (one), '0' (zero), for example. If you really want a letter 'el', use 'L'.
- Don't use non-standard symbols like  ‘¥’ or ‘。’
- Be careful with special names that use leading or trailing underscores. These are treated differently by Python, and you have to know what you are doing before you use them.

### 4. Python basic Datatypes

Every value in Python has a datatype. Since everything is an object in Python, data types are actually classes, and variables are instances (objects) of these classes. There are various data types in Python. Some of the important types are listed below.

#### 4.1 Numbers - int, float, complex

- Integers, floating-point numbers and complex numbers fall under the Python numbers category. They are defined as the **int**, **float** and **complex** types in Python.

In [2]:
a = 5
print(a, "is of type", type(a))

a = 2.0
print(a, "is of type", type(a))

a = 1+2j
print(a, "is complex number?", isinstance(a, complex))

5 is of type <class 'int'>
2.0 is of type <class 'float'>
(1+2j) is complex number? True


A couple of notes on numbers in Python: 

- The **isinstance(data, type)** function checks whether _data_ is of the given _type_. It returns True or False, as you've seen in the example above.

- **Integers** can be of any length; they are **limited only by the memory** available.

- A **floating**-point number is accurate to about 15 to 17 significant digits. An integer and a float are distinguished by the decimal point: 1 is an integer, 1.0 is a floating-point number.

- Complex numbers are written in the form **x + yj**, where **x** is the real part and **y** is the imaginary part.

- You can convert one type of number into another. Operations like **addition** (+) and **subtraction** (-) convert an integer to a float implicitly (automatically) if one of the operands is a float. For example:

In [4]:
1+3.1

4.1
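Conversions can also be requested explicitly with the built-in `int()`, `float()` and `complex()` functions. A short sketch (the variable names are only illustrative):

```python
whole = int(2.8)             # truncates toward zero: 2
negative = int(-2.8)         # also truncates toward zero: -2
as_float = float(5)          # 5.0
as_complex = complex(3, 4)   # (3+4j)
from_text = int("42")        # a string of digits can be converted too: 42

print(whole, negative, as_float, as_complex, from_text)
# 2 -2 5.0 (3+4j) 42
```

Note that `int()` drops the fractional part rather than rounding; use `round()` when rounding is what you want.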

**Be careful with decimal fractions in Python**
- Python's built-in float class produces some results that might surprise us. We all know that the sum of 1.1 and 2.2 equals 3.3, but Python seems to disagree:

In [6]:
print( (1.1 + 2.2) == 3.3 )

False


What is going on?

It turns out that floating-point numbers are implemented in computer hardware as binary fractions, because computers only understand binary (0 and 1). For this reason, most decimal fractions cannot be stored exactly in a computer.

Let's take an example. We cannot represent the fraction 1/3 exactly as a decimal number: it gives 0.33333333..., which is infinitely long, so we can only approximate it.

Similarly, the decimal fraction 0.1 becomes the infinitely long binary fraction 0.000110011001100110011..., and the computer stores only a finite number of its digits.

The stored value approximates 0.1 but is never exactly equal to it. This is a limitation of the hardware's number representation, not an error in Python.

In [8]:
print( 1.1 + 2.2 )

3.3000000000000003


To overcome this issue, we can use the **decimal** module that comes with Python. While floating-point numbers have a precision of about 15 to 17 significant digits, the decimal module has user-settable precision. Here's the full documentation of the decimal module: https://docs.python.org/3/library/decimal.html
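As a minimal sketch of that idea, the example below repeats the 1.1 + 2.2 comparison with `Decimal` values. It also shows `math.isclose`, a standard-library alternative for comparing ordinary floats with a tolerance; the variable names are only illustrative.

```python
from decimal import Decimal
import math

# Build Decimal values from strings so the exact decimal digits are kept
exact_sum = Decimal("1.1") + Decimal("2.2")
print(exact_sum)                     # 3.3
print(exact_sum == Decimal("3.3"))   # True

# For ordinary floats, compare with a tolerance instead of ==
floats_close = math.isclose(1.1 + 2.2, 3.3)
print(floats_close)                  # True
```

Creating a `Decimal` from a float, such as `Decimal(1.1)`, would carry the binary rounding error along, which is why the strings are used here.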

#### 4.2 Python Booleans

#### 4.2 Python Strings

- A string is a sequence of Unicode characters. We can use single quotes or double quotes to represent strings. Multi-line strings can be denoted using triple quotes, ''' or """.

- As with lists and tuples, the slicing operator [ ] can be used with strings. Strings are **immutable** (cannot be changed).

In [24]:
s = 'Hello world!'

# s[4] = 'o'
print("s[4] = ", s[4])

# s[6:11] = 'world'
print("s[6:11] = ", s[6:11])

# Generates error
# Strings are immutable in Python
s[5] ='d'

s[4] =  o
s[6:11] =  world


TypeError: 'str' object does not support item assignment
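Because a string cannot be changed in place, the usual workaround is to build a new string. The sketch below reuses `s` from the cell above; `t` and `u` are names introduced only for this example.

```python
s = 'Hello world!'

# Build a new string from slices of the old one
t = s[:5] + 'd' + s[6:]
print(t)   # Hellodworld!

# str.replace() also returns a new string; s itself is unchanged
u = s.replace('world', 'Python')
print(u)   # Hello Python!
print(s)   # Hello world!
```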

Note that an operator can give different results depending on the data type. For example, the "**+**" operator:

In [2]:
print(1+2+3) # add integers
print('1'+'2'+'3') # add strings

6
123


and the "**\***" operator (use this in the in-class practice problem):

In [3]:
print(1*5) # multiply integers
print('1'*5) # multiply strings

5
11111


#### 4.3 Python List

A Python list is an **ordered sequence of items**. It is one of the most used datatypes in Python and is very flexible. The items in a list do not all need to be of the same type.

- Create a list:

In [4]:
# empty list
my_list = []

# list of integers
my_list = [1, 2, 3]

# list with mixed datatypes
my_list = [1, "Hello", 3.4]

# nested list - lists inside a list
my_list = ["mouse", [8, 4, 6], ['a']]

- Access and index slicing of elements in a list: 

In [18]:
# Define a list using []:
a = [3.14,10,'15',20,25.0,'True',35,2+3j]

# a[1] = 10
print("a[1] = ", a[1])

# a[0:3] = [3.14, 10, '15']
print("a[0:3] = ", a[0:3])

# a[5:] = ['True', 35, (2+3j)], slicing all the way to the end
print("a[5:] = ", a[5:])

# a[-1] = (2+3j), negative index
print("a[-1] = ", a[-1])

a[1] =  10
a[0:3] =  [3.14, 10, '15']
a[5:] =  ['True', 35, (2+3j)]
a[-1] =  (2+3j)


- Change, add and delete elements in a list:

In [19]:
# a list with mistaken values (these numbers are even, not odd)
odd = [2, 4, 6, 8]

# change the 1st item    
odd[0] = 1            

# Output: [1, 4, 6, 8]
print(odd)

# change 2nd to 4th items
odd[1:4] = [3, 5, 7]  

# Output: [1, 3, 5, 7]
print(odd)   

[1, 4, 6, 8]
[1, 3, 5, 7]


We can add one item to a list using the append() method, or add several items using the extend() method:

In [20]:
odd = [1, 3, 5]

odd.append(7)

# Output: [1, 3, 5, 7]
print(odd)

odd.extend([9, 11, 13])

# Output: [1, 3, 5, 7, 9, 11, 13]
print(odd)

[1, 3, 5, 7]
[1, 3, 5, 7, 9, 11, 13]


We can also use + operator to combine two lists. This is also called concatenation.

The * operator repeats a list for the given number of times. For example:

In [21]:
odd = [1, 3, 5] # define a list

# Output: [1, 3, 5, 9, 7, 5]
print(odd + [9, 7, 5])

#Output: ["re", "re", "re"]
print(["re"] * 3)

[1, 3, 5, 9, 7, 5]
['re', 're', 're']


- Delete or remove elements from a list

In [22]:
my_list = ['p','r','o','b','l','e','m']

# delete one item
del my_list[2]

# Output: ['p', 'r', 'b', 'l', 'e', 'm']     
print(my_list)

# delete multiple items
del my_list[1:5]  

# Output: ['p', 'm']
print(my_list)

# delete entire list
del my_list       

# Error: List not defined
print(my_list)

['p', 'r', 'b', 'l', 'e', 'm']
['p', 'm']


NameError: name 'my_list' is not defined

A Summary of Python List **Methods**:  

append() - Adds an element to the end of the list  
extend() - Adds all elements of another list (or any iterable) to the end of the list  
insert() - Inserts an item at the given index  
remove() - Removes the first matching item from the list  
pop() - Removes and returns the element at the given index (the last element by default)  
clear() - Removes all items from the list  
index() - Returns the index of the first matched item  
count() - Returns the number of times the given item occurs in the list  
sort() - Sorts the items of the list in ascending order  
reverse() - Reverses the order of items in the list  
copy() - Returns a shallow copy of the list  
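The methods that were not used in the cells above can be tried on a small list. This is only a sketch; `nums`, `last` and `backup` are names made up for the example.

```python
nums = [3, 1, 2]

nums.insert(1, 10)     # [3, 10, 1, 2]
nums.remove(10)        # removes the first 10: [3, 1, 2]
last = nums.pop()      # removes and returns the last item (2): [3, 1]
nums.extend([2, 1])    # [3, 1, 2, 1]

print(nums.index(1))   # 1  (position of the first 1)
print(nums.count(1))   # 2  (the value 1 occurs twice)

nums.sort()            # [1, 1, 2, 3]
nums.reverse()         # [3, 2, 1, 1]

backup = nums.copy()   # an independent shallow copy
nums.clear()           # empties the original list only
print(nums, backup)    # [] [3, 2, 1, 1]
```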

Built-in **functions** that work with Python lists:

all() -	Return True if all elements of the list are true (or if the list is empty).  
any() -	Return True if any element of the list is true. If the list is empty, return False.  
enumerate() -	Return an enumerate object. It contains the index and value of all the items of list as a tuple.  
len() -	Return the length (the number of items) in the list.  
list() -	Convert an iterable (tuple, string, set, dictionary) to a list.  
max() -	Return the largest item in the list.  
min() -	Return the smallest item in the list  
sorted() -	Return a new sorted list (does not sort the list itself).  
sum() -	Return the sum of all elements in the list.  
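A quick sketch of these built-in functions on a single list (the list `scores` is invented for this example):

```python
scores = [70, 95, 0, 82]

print(len(scores))                             # 4
print(max(scores), min(scores), sum(scores))   # 95 0 247
print(all(scores))                             # False, because 0 counts as false
print(any(scores))                             # True

ordered = sorted(scores)         # a new sorted list
print(ordered)                   # [0, 70, 82, 95]
print(scores)                    # [70, 95, 0, 82], the original is unchanged

pairs = list(enumerate(scores))  # (index, value) tuples
print(pairs)                     # [(0, 70), (1, 95), (2, 0), (3, 82)]

letters = list("abc")            # convert a string to a list
print(letters)                   # ['a', 'b', 'c']
```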

#### 4.4 Python Tuple  

- A tuple is an **ordered sequence** of items, the same as a list. The only difference is that tuples are immutable: once created, a tuple **cannot be modified**.

- Tuples are used to write-protect data and are usually faster than lists because they cannot change dynamically.

- A tuple is defined within parentheses () where items are separated by commas.

- You can use the slicing operator [] to extract items, but you cannot change their values.

In [23]:
t = (5,'program', 1+3j)

# t[1] = 'program'
print("t[1] = ", t[1])

# t[0:3] = (5, 'program', (1+3j))
print("t[0:3] = ", t[0:3])

# Generates error
# Tuples are immutable
t[0] = 10

t[1] =  program
t[0:3] =  (5, 'program', (1+3j))


TypeError: 'tuple' object does not support item assignment
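Two things often come up when working with tuples: unpacking a tuple into variables, and building a new tuple when a "changed" version is needed. A minimal sketch, reusing `t` from the cell above (the other names are only illustrative):

```python
t = (5, 'program', 1+3j)

# Unpacking: assign each item of the tuple to its own variable
number, word, z = t
print(number, word, z)   # 5 program (1+3j)

# A one-item tuple needs a trailing comma; parentheses alone are not enough
single = (5,)
not_a_tuple = (5)
print(type(single), type(not_a_tuple))   # <class 'tuple'> <class 'int'>

# To "change" a tuple, build a new one from the pieces of the old one
changed = (10,) + t[1:]
print(changed)   # (10, 'program', (1+3j))
print(t)         # (5, 'program', (1+3j)), the original is untouched
```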

#### 4.5 Python set

A set is an **unordered** collection of **unique** items. A set is defined by values separated by commas inside braces { }.

- Items in a set are not ordered.

- Sets have unique values. They eliminate duplicates.

- We can perform set operations like union and intersection on two sets.

- Since a set is an unordered collection, indexing has no meaning, so the indexing and slicing operator [] does not work. Let's see some examples:

In [27]:
a = {5,2,3,1,4}

# printing set variable
print("a = ", a)

# data type of variable a
print(type(a))

b = {1,2,2,3,3,3} 

# printing set variable
print('b = ', b)

# Generates error
# set cannot be indexed:
a[1]

a =  {1, 2, 3, 4, 5}
<class 'set'>
b =  {1, 2, 3}


TypeError: 'set' object does not support indexing

Sets are very useful in data analysis. For example, recall the GDP per capita data you have worked with several times. If we want to know how many continents are included in the data set, we can simply use the set() function:

In [14]:
import pandas as pd
import numpy as np

file = "datasets/gdp_life.csv" # GDP data file
gdp = pd.read_csv(file) # load as a Pandas data frame
gdp = gdp.dropna()

print(gdp['continent'].values) # get all the continents

# Now build a Python set with the set() function to see which continents are in the data.
# set() eliminates all the duplicates when it converts the continent
# values into a Python set:
conti = set( gdp['continent'].values )

print(conti)

['Europe' 'Asia' 'Asia' 'North America' 'Europe' 'Asia' 'Europe' 'Europe'
 'North America' 'Europe' 'Europe' 'black' 'Europe' 'Europe' 'Europe'
 'Europe' 'Europe' 'Asia' 'Europe' 'Asia' 'Europe' 'Asia' 'Europe'
 'Europe' 'Europe' 'Asia' 'black' 'Asia' 'Europe' 'Asia' 'Asia' 'Europe'
 'North America' 'Europe' 'Europe' 'North America' 'Europe' 'Europe'
 'Africa' 'North America' 'North America' 'Africa' 'Asia' 'Africa'
 'Africa' 'North America' 'Asia' 'North America' 'Africa' 'Europe'
 'Europe' 'North America' 'Asia' 'North America' 'Europe' 'North America'
 'Africa' 'Europe' 'North America' 'North America' 'Europe' 'Africa'
 'Asia' 'Europe' 'North America' 'North America' 'Africa' 'North America'
 'North America' 'blue' 'North America' 'Europe' 'North America' 'Africa'
 'North America' 'Asia' 'Africa' 'Asia' 'Africa' 'Asia' 'Asia'
 'North America' 'Asia' 'North America' 'Africa' 'Africa' 'North America'
 'Asia' 'Asia' 'Asia' 'Asia' 'North America' 'Asia' 'Africa' 'Asia' 'Asia'
 'Asia' 'A

Now think about the following basics:  

How to change a set in Python?   
How to remove elements from a set?   
Python Set Operations  
- Set Intersection
- Set Difference 

Different Python Set Methods  
- Set Membership Test
- Iterating through a Set
- Built-in Functions with Set
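As a starting point for these questions, here is a hedged sketch that touches each of them with two small sets. The sets `a` and `b` and the result names are invented for the example, and `sorted()` is used when printing because a set has no guaranteed order.

```python
a = {1, 2, 3}
b = {3, 4, 5}

# Changing a set: add one item or several
a.add(6)
a.update([7, 8])

# Removing items: discard() is silent if the item is missing,
# remove() raises KeyError instead
a.discard(8)
a.discard(99)
a.remove(7)

# Set operations
union = a | b     # items in either set
common = a & b    # intersection: items in both sets
only_a = a - b    # difference: items in a but not in b
print(sorted(union))    # [1, 2, 3, 4, 5, 6]
print(sorted(common))   # [3]
print(sorted(only_a))   # [1, 2, 6]

# Membership test and iteration
print(2 in a)           # True
for item in sorted(a):
    print(item)

# Built-in functions also work on sets
print(len(a), max(a), sum(a))   # 4 6 12
```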

#### 4.6 Python Dictionary

- A dictionary is a collection of key-value pairs. Items are accessed by key rather than by position (since Python 3.7 a dictionary does preserve insertion order).

- It is generally used when we have a huge amount of data. Dictionaries are optimized for retrieving data. We must know the key to retrieve the value.

- In Python, dictionaries are defined within braces {} with each item being a pair in the form key:value. Values can be of any type; keys must be of an immutable (hashable) type such as a number, string or tuple.

- Use a key to retrieve its value, but not the other way around.

In [18]:
d = {1:'value','key':2}
print(type(d))

print(d.keys())

print(d.values())

print("d[1] = ", d[1])

print("d['key'] = ", d['key'])

# Generates error
print("d[2] = ", d[2])

<class 'dict'>
dict_keys([1, 'key'])
dict_values(['value', 2])
d[1] =  value
d['key'] =  2


KeyError: 2
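When a key might be missing, the `get()` method avoids the KeyError. The sketch below reuses `d` from the cell above and also shows adding, updating, deleting and looping over items; the extra key `'new'` is made up for the example.

```python
d = {1: 'value', 'key': 2}

# get() returns None (or a default you choose) instead of raising KeyError
missing = d.get(2)
fallback = d.get(2, 'missing')
print(missing, fallback)   # None missing

# Add a new pair, update an existing one, delete one
d['new'] = 3.5
d['key'] = 20
del d[1]

# Loop over key-value pairs
for key, value in d.items():
    print(key, value)
# key 20
# new 3.5
```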

### 5. Python Operators

Operators are special symbols in Python that carry out arithmetic or logical computation. The value that the operator operates on is called the operand.

#### 5.1 Arithmetic operators


|Operator | Meaning| Example |
|--- |-------------------------------|-------------|
| +  | Add two operands or unary plus| x+y+2       |
| -  | Subtract right operand from the left or unary minus | x-y      |
| *  | Multiply two operands | x\*y      |
| /  | Divide left operand by the right one (always results in a float) | x/y     |
| %  | Modulus - remainder of the division of left operand by the right | x%y     |
| ** | Exponent - left operand raised to the power of right | x**y     |

Examples:



In [29]:
x = 15
y = 4

# Output: x + y = 19
print('x + y =',x+y)

# Output: x - y = 11
print('x - y =',x-y)

# Output: x * y = 60
print('x * y =',x*y)

# Output: x / y = 3.75
print('x / y =',x/y)

# Output: x % y = 3
print('x % y =',x%y)

# Output: x ** y = 50625
print('x ** y =',x**y)

x + y = 19
x - y = 11
x * y = 60
x / y = 3.75
x % y = 3
x ** y = 50625


#### 5.2 Relational operators


|Operator | Meaning| Example |
|--- |-------------------------------|-------------|
| >  | Greater than - True if left operand is greater than the right| x > y     |
| <  | Less than - True if left operand is less than the right | x < y      |
| == | Equal to - True if both operands are equal | x == y      |
| != | Not equal to - True if operands are not equal | x != y     |
| >= | Greater than or equal to - True if left operand is greater than or equal to the right | x >= y     |
| <= | Less than or equal to - True if left operand is less than or equal to the right | x <= y     |

Examples:

In [30]:
x = 10
y = 12

# Output: x > y is False
print('x > y  is',x>y)

# Output: x < y is True
print('x < y  is',x<y)

# Output: x == y is False
print('x == y is',x==y)

# Output: x != y is True
print('x != y is',x!=y)

# Output: x >= y is False
print('x >= y is',x>=y)

# Output: x <= y is True
print('x <= y is',x<=y)

x > y  is False
x < y  is True
x == y is False
x != y is True
x >= y is False
x <= y is True


#### 5.3 Logical operators


|Operator | Meaning| Example |
|--- |-------------------------------|-------------|
| and  | True if both the operands are true | x and y     |
| or  | True if either of the operands is true | x or y      |
| not | True if operand is false (complements the operand) | not x      |

Examples:

In [31]:
x = True
y = False

# Output: x and y is False
print('x and y is',x and y)

# Output: x or y is True
print('x or y is',x or y)

# Output: not x is False
print('not x is',not x)

x and y is False
x or y is True
not x is False


#### 5.4 Membership operators

in and not in are the membership operators in Python. They are used to test whether a value or variable is found in a sequence or collection (string, list, tuple, set and dictionary).

In a dictionary we can only test for presence of key, not the value.

|Operator | Meaning| Example |
|--- |-------------------------------|-------------|
| in  |True if value/variable is found in the sequence | x in y     |
| not in  | True if value/variable is not found in the sequence | x not in y      |

Examples:

In [34]:
x = 'Hello world'
y = [3.14, 'Hi', 1, 2+3j, False]
z = {1:'a',2:'b'}

# Output: True
print('H' in x)

# Output: True
print('hello' not in x)

# Output: True
print(2+3j in y)

# Output: True
print(1 in z)

# Output: False
print('a' in z)

True
True
True
True
False
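Since `in` checks only the keys of a dictionary, a value has to be looked up through the dictionary's `values()` view. A small sketch using the same `z` as above (the result names are only illustrative):

```python
z = {1: 'a', 2: 'b'}

key_test = 'a' in z                  # looks at keys only
value_test = 'a' in z.values()       # looks at values
pair_test = (1, 'a') in z.items()    # looks at (key, value) pairs

print(key_test, value_test, pair_test)   # False True True
```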
