# <center>Introduction to Python (I)</center>

References: 
- https://www.tutorialspoint.com/python

## 1. Introduction

 - Python is a general-purpose *interpreted*, interactive, object-oriented, and high-level programming language
   - Questions: 
     - What is the difference between *interpreted* and *compiled* languages? 
     - Do you know any compiled languages? 
     - Check https://www.programiz.com/article/difference-compiler-interpreter for answers
    
 - Dynamically typed: variables do not have a predefined type
 - “Batteries included”: large collection of proven modules included in the standard Python distribution
 - Thousands of third party packages (collections of modules) available at https://pypi.python.org/pypi
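
As a small illustration of dynamic typing, the sketch below rebinds one name to objects of two different types. The variable names are chosen only for this example.

```python
# Dynamic typing: a name can be rebound to objects of different types.
value = 42                            # value refers to an int
type_before = type(value).__name__
print(type_before)                    # int

value = "forty-two"                   # the same name now refers to a str
type_after = type(value).__name__
print(type_after)                     # str
```

The type belongs to the object, not to the name, which is why no declaration is needed.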



## 2. Python Environment
- To start Python:
  - type "python" in a terminal window (or command window) to start the Python interactive interpreter, or
  - use GUI environments such as Jupyter Notebook or Spyder
- Check your current working folder and Python system path

In [None]:
# Exercise 2.1: Check your python environment

import os, sys

# Get your current working folder
print(os.getcwd())

# Change your current working folder to another folder
# for mac: e.g. /Users/xyz/temp
# for windows users: e.g. "C:/xyz". Don't use backslash "\"
os.chdir("/your/folder/here")  

# print something to a file without specifying its path
# where can you find the file?
with open("helloworld.txt", 'w') as f:   # the with block closes the file when done
    print("hello world", file=f)

# get the system path of your python. These are the paths Python will search for modules
print(sys.path)

## 3. Basic Syntax

### 3.1 **Python identifier**: name to identify a variable, function, class, module, or object ###
 * Starts with a letter or underscore (_) followed by letters, underscores, and digits
 * No punctuation characters such as @, $, or %
 * Case sensitive: “Web” ≠ “web”
 * Reserved words: <font color=blue> **and**, **break**, **def**, **if**, **else** </font>, …
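
The rules above can be checked programmatically. The following is a minimal sketch that uses `str.isidentifier()` and the standard `keyword` module; the helper `is_usable_name` and the candidate names are hypothetical and exist only for illustration.

```python
import keyword

# Hypothetical candidate names, chosen only for illustration
candidates = ["web", "Web", "_count", "2fast", "my-var", "if"]

def is_usable_name(name):
    """Return True if name is a valid identifier and not a reserved word."""
    return name.isidentifier() and not keyword.iskeyword(name)

results = {name: is_usable_name(name) for name in candidates}
for name, ok in results.items():
    print(name, "->", ok)
```

Here "2fast" fails because it starts with a digit, "my-var" because it contains punctuation, and "if" because it is a reserved word.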

### 3.2. **Lines and indentation** ###
 * Blocks of code are denoted by <font color=blue>**line indentation**</font> (**not braces!**)
 * The number of spaces in the indentation can vary, <font color=blue>**but all statements within the block must be indented the same amount**</font>
 *  All contiguous lines indented with the same number of spaces form a block


In [None]:
# Exercise 3.2.1: Indentation - Correct Example

x=True

if x:
    print("True")
    print('correct indentation')
else:
    print("False")

In [None]:
# Exercise 3.2.2: Indentation - Incorrect Example
#                 Try fix it

x=False
if x:
    print ("True")
else:
    print ("False")
  print ("Incorrect indentation")

### 3.3.  Quotation ###
 * **single (<font color=blue>'</font>)**, **double (<font color=blue>"</font>)** and **triple (<font color=blue>'''</font> or <font color=blue>"""</font>)** quotes are acceptable to denote string literals
 * The triple quotes are used to span the string across multiple lines.

In [None]:
# Exercise 3.3.1: Quotes

word = 'word'
print(word)

sentence = "This is a sentence."
print(sentence)

paragraph = """This is a paragraph with multiple lines. 
line 1
line 2"""
print (paragraph)

### 3.4. **Comments**###
* **Hash sign (<font color=blue>#</font>)** begins a comment
* All characters after the # and up to the end of the physical line are part of the comment and the Python interpreter ignores them.
* Add appropriate comments to increase your code readability

In [None]:
# Exercise 3.4.1: Comments

# First comment
print ("Hello, Python!") # second comment

### 3.5. **Variable assignment** ###
 * **The equal sign (<font color=blue>=</font>)** is used to assign values to variables
 * No need to declare the variable type
 * You can assign a single value to multiple variables simultaneously
 * Or assign multiple objects to multiple variables respectively

In [None]:
# Exercise 3.5.1: Assignment

# assign variables without declaring their types
counter = 100          # An integer assignment
miles   = 1000.0       # A floating point
name    = "John"       # A string

# assign a single value to multiple variables
a = b =  '1'
print("a =", a)
print("b =", b)

# assign multiple objects to multiple variables respectively
a,b = '1',"john"
print("a =", a)
print("b =", b)

## 4. Standard Data Types: Number, String, List, Tuple, Dictionary, Iterator

### 4.1. **Numbers**###
 * 3 numerical types in Python 3: int, float, complex
 * int: 10, 100, -123
 * float (floating point real values): 0.0, 15.20
 * complex (complex numbers): 3.14**<font color=blue>j</font>**
 
<div class="alert alert-block alert-info">In Python 2, there is another data type for numbers: long. This data type was removed in Python 3; "int" in Python 3 is equivalent to "long" in Python 2. </div>
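
A short sketch of the numeric types in Python 3 follows; the variable names are illustrative only.

```python
n = 10                 # int
big = 2 ** 100         # Python 3 ints have arbitrary precision
x = 15.20              # float
z = 3.14j              # complex number with zero real part

print(type(n).__name__, type(big).__name__)   # int int
print(type(x).__name__)                       # float
print(type(z).__name__, z.real, z.imag)       # complex 0.0 3.14
print(big)                                    # 1267650600228229401496703205376
```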

### 4.2. **Strings** ###
 * Strings are contiguous sequences of characters enclosed in quotation marks
 * Substrings can be taken using the slice operator **<font color=blue>[ ]</font>**, with indexes starting at **<font color=blue>0</font>** at the beginning of the string, or counting backward from **<font color=blue>-1</font>** at the end.
 * Strings can be concatenated using **<font color=blue>+</font>**
 * Useful string functions: **<font color=blue>strip, lstrip, rstrip, split, lower, upper, find, rfind, replace</font>**

In [None]:
# Exercise 4.2.1: Strings

s=''               # an empty string
s = 'Hello World!'

print(s)          # Prints complete string
print(s[0])       # Prints first character of the string
print(s[2:5])     # Prints characters starting from 3rd to 5th
print(s[2:] )     # Prints string starting from 3rd character
print(s[-1])      # Prints the last character 
# Question: how to get the last three characters?

print(s + " TEST") # Prints concatenated string
print(s[0:10:2])   # step-wise slicing; extracting characters with even indexes from the first 10 characters

In [None]:
# Exercise 4.2.2: Useful string functions

s='   Welcome to Web Mining Class   '
print (s.strip() )            # remove both leading and trailing spaces
print (s.lstrip() )           # remove leading spaces
print (s.rstrip()  )          # remove trailing spaces
print (s.split(" "))          # split s by delimiter " " into a list
print (s.lower())             # convert to lowercase
print (s.upper())             # convert to uppercase
print (s.find("W"))           # get the index of the first "W" in s starting from the left; return -1 if "W" is not in s
print (s.rfind("W"))          # get the index of the last "W" in s (searching from the right); return -1 if "W" is not in s
print (s.replace("W", "**"))  # replace all occurrences of "W" by "**"


In [None]:
# Exercise 4.2.3: 

path="http://localhost:8888/notebooks/Python_I.ipynb"

# 1. Retrieve the last five characters of the path, i.e. get 'ipynb'

# 2. retrieve the file name in the path, i.e. the part after the last "/"

# 3. reverse the number sequence, i.e. for string '123456789', print '987654321'
s='123456789'



### 4.3. **Lists** ###
* A list contains items separated by **<font color=blue>commas (,) </font>** and enclosed within **<font color=blue> square brackets ([])</font>**, e.g. [ 'abcd', 786 , 2.23, 'john', 70.2 ]
* Items in a list can be of **different data types** (unlike arrays in C)
* Values in a list can be accessed using the slice operator **<font color=blue>[ ]</font>**, with indexes starting at **<font color=blue>0</font>** for the first element, or counting backward from **<font color=blue>-1</font>** at the end.
* Lists can be concatenated using **<font color=blue>+</font>**
* A string behaves like a sequence of characters: it supports the same indexing and slicing, but unlike a list it cannot be modified in place
* Items in a list can be of any Python data type, e.g. numbers, strings, lists, tuples, dictionaries
* List functions: **<font color=blue>append, remove</font>**
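
The sketch below illustrates that a list can hold items of different types and can be changed in place. The sample values are assumptions chosen only for illustration.

```python
mixed = ['abcd', 786, 2.23, ['nested', 'list']]   # items of different types

mixed[1] = 787            # replace an item in place
mixed.append('john')      # the list grows by one item
mixed.remove('abcd')      # remove the first matching item

print(mixed)              # [787, 2.23, ['nested', 'list'], 'john']
print(mixed[2][0])        # nested
print(len(mixed))         # 4
```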

In [None]:
# Exercise 4.3.1: list and list functions

list1 = [ 'welcome', "to" , "my", 'class' ]
list2 = ['BIA-660', 'Web', 'Analytics']

print (list1)          # Prints complete list
print (list1[0])       # Prints first element of the list
print (list1[1:3])     # Prints elements starting from 2nd till 3rd 
print (list1[2:])      # Prints elements starting from 3rd element
print (list1[-1])      # Prints the last element

# concatenate elements in a list
print (" ".join(list1))# join elements into a single string with " " as the separator

# concatenate lists
print (list1 + list2)  # Prints concatenated lists

In [None]:
# Exercise 4.3.2: Useful functions of lists

list3=range(10)       # generate the numbers from 0 to 9; returns a range object (a lazy sequence), not a list
print (list3)
print (list(list3))

list4=range(1,10)     # generate the numbers from 1 to 9
print (list(list4))

list5=['a','b']*3     # repeat list ['a','b'] three times
print (list5)

list5.append("c")     # append "c" to the end of the list
print (list5)

list5.remove("a")     # remove the first "a" found in the list from the beginning
print (list5)           

# question: how to remove all 'a'?

# get index of the first occurrence of "a"  
print ("index of first 'a':",list5.index("a") )         

# get count of "a" in the list
print ("count of 'a':", list5.count("a") )   

# sort list
list6=[1,3,5,4,0]
print(sorted(list6))
print(sorted(list6, reverse=True)) 
# any other way to sort in the decreasing order?

In [None]:
# Exercise 4.3.3: List operations

list2 = ['BIA-660', 'Web', 'Analytics']

# 1:  How to extract "660" from list2

# 2: Join the last two elements in list2 
#    into a string with "-" as the separator, 
#    i.e. 'Web-Analytics'


### 4.4. **Tuples** ###
*  A tuple consists of a number of values separated by commas enclosed in **<font color=blue>( )</font>** , e.g. ('welcome', "to" , "BA", 660)
*  A tuple is similar to a list in many aspects. However, they differ in the following:
    * Tuples are enclosed within parentheses **<font color=blue>( )</font>**, while lists are enclosed in **<font color=blue>[ ]</font>**.
    * A tuple, once declared, **<font color=blue>cannot be updated (read-only)</font>**, while the elements and size of a list can be changed

In [None]:
# Exercise 4.4.1: tuples

tuple1 = ( 'welcome', "to" , "my", 'class' )
tuple2 = ('BIA (Web Mining)', 660)

print (tuple1)          # Prints complete tuple
print (tuple1[0])       # Prints first element of the tuple
print (tuple1[1:3])     # Prints elements starting from 2nd till 3rd 
print (tuple1[2:])      # Prints elements starting from 3rd element
print (tuple1 + tuple2)  # Prints concatenated tuples

In [None]:
# Exercise 4.4.2: Comparison between lists and tuples

# list can be updated, while tuple is read only
list1 = [ 'welcome', "to" , "my", 'class' ]
tuple1 = ( 'welcome', "to" , "my", 'class' )
list1[0]='WELCOME'
tuple1[0]='WELCOME'        # incorrect: raises TypeError because a tuple cannot be updated

In [None]:
# Exercise 4.4.3: Make a list of tuples out of two lists, 
# e.g. [(1,'Mary'),(2,'Tom'),(3, 'Joe')]

ids=[1,2,3]
names=['Mary','Tom','Joe']

x=[(ids[0], names[0]),
   (ids[1], names[1]),
   (ids[2], names[2]) ]
print(x)

# A more efficient way: zip function - combine lists element-wise
# zip function in python 3 returns an iterator
# use list function to convert an iterator to list
zipped=list(zip(ids, names))   
print(zipped)

# how to convert zipped back to unzipped?


### 4.5. **Dictionary** ###
*  A dictionary is similar to a lookup table with key-value pairs, e.g. {1:'Mary Joe', 2:'David Johnson'} 
*  Keys are **unique**
*  A dictionary is enclosed by **<font color=blue>curly braces { } </font>** 
*  Values can be assigned and accessed using **<font color=blue>square brackets [ ]</font>**
*  Keys must be hashable (immutable) objects, usually **<font color=blue>numbers or strings</font>**, but values can be of any Python data type, e.g. numbers, strings, lists, tuples, dictionaries
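
A minimal sketch of these dictionary properties follows. The entry with key 3 and the changed name are hypothetical values added only for illustration.

```python
students = {1: 'Mary Joe', 2: 'David Johnson'}   # sample data from the text above

students[3] = 'Tom Lee'          # hypothetical new entry: add a key-value pair
students[1] = 'Mary Ann Joe'     # assigning to an existing key overwrites its value

print(len(students))             # 3, because keys are unique
print(students.get(4))           # None: get() avoids a KeyError for a missing key
print(students.get(4, 'n/a'))    # n/a

for student_id, name in students.items():
    print(student_id, name)
```

Using `get()` instead of square brackets is one way to look up a key that may be absent without raising an error.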

In [None]:
# Exercise 4.5.1: dictionary

dict1 = {}                     # define an empty dictionary
dict1['one'] = "This is one"   # add a key-value pair to the dictionary
dict1[2]     = "This is two"

dict2 = {1:'Mary Joe', 2:'David Johnson'}  # a more compact way to define a dictionary

print (dict1['one'])       # Prints value for 'one' key
print (dict1[2])           # Prints value for 2 key
print (dict2)              # Prints complete dictionary
print (dict2.keys())       # Prints all the keys
print (dict2.values())     # Prints all the values
print (dict2.items() )     # Prints key-value pairs as a view of (key, value) tuples


### 4.6. **Iterator** ###
*  An iterator is simply an object that can be iterated upon. An iterator will return data, one element at a time.
*  An iterator must implement two special methods, <font color=blue>\_\_iter\_\_()</font> and <font color=blue>\_\_next\_\_()</font>.
* An object is called **iterable** if we can get an iterator from it. Iterable objects include lists, tuples, strings, dictionaries etc. The built-in <font color=blue>iter()</font> function (which calls the object's <font color=blue>\_\_iter\_\_()</font> method) returns an iterator from an iterable object.
* An iterator object can be used **only once**. When the end is reached and there is no more data to return, calling <font color=blue>next()</font> raises the *StopIteration* exception.

*Question*: when is an iterator useful?
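
One possible answer is that an iterator produces values one at a time, so the full sequence never has to be held in memory. The sketch below shows a hand-written iterator; the class `Countdown` is hypothetical and exists only to illustrate the two special methods.

```python
class Countdown:
    """Hypothetical iterator that yields start, start-1, ..., 1."""

    def __init__(self, start):
        self.current = start

    def __iter__(self):
        # An iterator returns itself from __iter__()
        return self

    def __next__(self):
        if self.current <= 0:
            raise StopIteration       # signals that no data is left
        value = self.current
        self.current -= 1
        return value

counter = Countdown(3)
first = next(counter)            # 3
rest = list(counter)             # [2, 1]: consumes the remaining items
again = list(counter)            # []: the iterator is exhausted
print(first, rest, again)
```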

In [None]:
# Exercise 4.6.1: iterables

# define a list
my_list = [4, 7, 0, 3]

# get an iterator using iter()
my_iter = iter(my_list)

# similarly, you can convert the iterator object back to a list
print(list(my_iter))

## iterate through an iterable using next() 
my_iter = iter(my_list)

#prints 4
print(next(my_iter))

#prints 7
print(next(my_iter))

# loop through all remaining elements
for item in my_iter:
    print (item)
    
# calling next() again raises StopIteration, since the for loop has already exhausted the iterator
print(next(my_iter))


### 4.7. Data Type Conversion ###

| Function      | Description                                                                  |
| :------------- |:------------------------------------------------------------------------------|
| int(x)        | Converts x (a string or number) to an integer                                |
| float(x)      | Converts x (a string or number) to a floating-point number                   |
| str(x)        | Converts object x to a string                                                |
| list(x)       | Converts iterable x (e.g. tuple, string, set, or dictionary) to a list       |
| tuple(x)      | Converts iterable x (e.g. list, string, set, or dictionary) to a tuple       |
| dict(x)       | Creates a dictionary from x, which must be a sequence of (key, value) pairs  |
| set(x)        | Converts iterable x to a set, an unordered collection of unique elements     |
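
Conversions can fail when the input does not have the expected form. The sketch below shows one way to handle this; the helper `to_int` is hypothetical and is not part of Python.

```python
def to_int(text, default=None):
    """Hypothetical helper: return int(text), or default if text is not a valid integer."""
    try:
        return int(text)
    except ValueError:
        return default

print(to_int('123'))        # 123
print(to_int('12.5'))       # None: int() does not parse a decimal string
print(to_int('abc', 0))     # 0
print(int(float('12.5')))   # 12: convert to float first, then truncate
```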




In [None]:
# Exercise 4.7.1: data conversion

x='123'
y='246'

z2=int(x)+int(y)                       # convert x and y to integers and add them up
print ("z2=int(x)+int(y) : ",z2)

z4=float(x)+float(y)                   # convert x and y to floats and add them up
print ("z4=float(x)+float(y) : " ,z4)

z5=str(z4)                             # convert z4 to string
print ("z5=string of z4 : ",z5)

z1=x+y                                 # concatenate the two strings: '123246'
z7=list(z1)                            # convert z1, a string, to a list
print ("z7=list(z1) : " ,z7)

z8=tuple(z1)                           # convert z1, a string to a tuple
print ("z8=tuple(z1) : " ,z8)

xy=[('x',x),('y',y)]

z9=dict(xy)                            # convert a list of tuples to a dictionary
print ("z9=dict(xy) : " ,z9)

z10=set(z7)                            # convert a list to a set (unique and unordered elements)
print ("z10=set(z7) : " ,z10)

## 5. Python Operators

### 5.1.  Basic Operators : <font color=blue>+, -, \*, /, %, //, \*\*, +=, -=</font> ###

In [None]:
# Exercise 5.1.1.: basic operators

a, b=2,7
print ("a+b = ", a+b)             # addition/concatenation
print ("a-b = ", a-b )            # subtraction
print ("a*b = ", a*b )            # multiplication

# division. Notice that a,b are integers 
# in python2, b/a is also an integer
# in python3, b/a is a float
print ("b/a = ", b/a )    

# modulus, returns remainder
print ("b%a = ", b%a)    
# floor division, return quotient which is floored
print ("b//a = ", b//a )     
# exponent
print ("b**a = ", b**a )          

# concise assignment 
a+=b
# a+=b is equivalent to a = a + b.
print ("after a+=b, a = ", a)       

a*=b
# a*=b is equivalent to a= a*b. 
print ("after a*=b, a = ",a)     
# This concise assignment operator can be applied to 
# other basic operators

### 5.2. Comparison Operators: <font color=blue>==, !=, >, <, >=, <=, is, is not, in, not in </font> ###

In [None]:
# Exercise 5.2.1.: comparison operators  -- value comparison

a, b = 2,7

print ("a==b ?", a==b)          # equal
print ("a!=b ?", a!=b )         # not Equal
print ("a>b ?",a>b)             # greater than
print ("a<b ?",a<b )            # less than
print ("a>=b ?",a>=b)           # greater than or equal
print ("a<=b ?",a<=b)           # less than or equal

In [None]:
# Exercise 5.2.2.: comparison operators -- identity operator

a, b =2,7

# test if both sides of the operator point to the **same object**
# (Python 3.8+ prints a SyntaxWarning for "is" with a literal; use "==" to compare values)
print ("a is 2 ?", a is 2)     
# test if both sides of the operator are not the same object
print ("a is not b ?",a is not b)    

#  Question: identity operator "is" is equivalent to "=="?
#  Try the following 
x=[1,2,3]
y=[1,2,3]
print("x==y? ", x==y)
print("x is y? ", x is y)

# Conclusion: 
# "is": object identity, "==": value equality
# for small integers, "is" may give the same result as "==" (CPython caches small integers), but do not rely on it
# for composite data types, they are different
# for details, check discussion: https://stackoverflow.com/questions/132988/is-there-a-difference-between-and-is-in-python

In [None]:
# Exercise 5.2.3.: comparison operators -- membership operator

a, b, c=2,7, [1,2,3]

# test if a is in list c
print ("a in c ? ",a in c) 
# test if b is not in list c
print ("b not in c ? ",b not in c)     

# test if a key is in dictionary dict1
dict1={"105":"Mary", "010":"Joe","030":"Tom"}

print("020 is a key of dict1 ", "020" in dict1)
print("020 is not a key of dict1 ", "020" not in dict1)

# Note that you can directly use **dict1**. 
# You can also use **dict1.keys()** 

In [None]:
# Exercise 5.2.4.: comparison operators

dict1={"105":"Mary", "010":"Joe","030":"Tom"}
list1=[("105","Mary"), ("010","Joe"),("030","Tom")]

# Task1: Test if ("040","John") is in list1

# Task2: Test if "040" is a key of dict1

# Task3: Test if "Joe" is a value of dict1


### 5.3. Logical Operators :<font color=blue> and, or, not </font> ###
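
As a quick reference, the sketch below prints the truth table for the three operators and shows that `and` stops evaluating at the first false operand. The variable names are illustrative only.

```python
rows = []
for p in (True, False):
    for q in (True, False):
        rows.append((p, q, p and q, p or q, not p))
        print(p, q, "| and:", p and q, "| or:", p or q, "| not p:", not p)

# "and" stops at the first false operand, so c[0] is never evaluated here
c = []
safe = len(c) > 0 and c[0] > 1
print(safe)     # False
```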

In [None]:
# Exercise 5.3.1.: logical operators

a, b, c=2,7, [1,2,3]
# "and": True only if both conditions are true
print ("(a>b) and (a in c) ?",(a>b) and (a in c))     
# "or": True if at least one condition is true
print ("(a>b) or (a in c) ?",(a>b) or (a in c))
# "not": reverse the condition
print ("not (a>b) ?",not (a>b) )                    

