![image.png](attachment:image.png)

***

## Section 1: Data Type

<strong> A. Overview: </strong>
- Number
- Boolean
- List
- String
- <font style="color:red;">Data Type Conversion</font>
- Dictionary
- DataFrame

For more details, please refer to the official Python documentation: https://docs.python.org/3.7/library/datatypes.html

<strong> B. Number: </strong>
- Integer: abbr. `int`, signed whole number, e.g. _1 or 0_
- Float: signed decimal number, e.g. _1.0 or 0.0_

<div class="alert alert-block alert-success">
    **<b>Extra Knowledge</b>** We can use the function <font style='color:red;font-weight:bold;'>type(variable name)</font> to find out the data type of a variable.</div> 

Try running cells below:

In [None]:
a=0
print(type(a)) #This is a nested function call: print() takes the return value of type() as its input.

In [None]:
b=0.0
print(type(b))

<div class="alert alert-block alert-info">
**<b>Tip</b>** Python sets a variable's type based on its value. If you assign it a value of another data type, the variable's type changes accordingly.</div>

In [None]:
a=0
print(type(a))
a=0.0
print(type(a))

<div class="alert alert-block alert-info">
**<b>Tip</b>** You can use `round(number,decimals)` to round numbers.</div>

In [None]:
a=1.24343
print(round(a))
print(round(a,2))

#### <font style="color: blue">Practice:</font>
---
<font style="color: blue"> <p>Suppose a=1.0, b=1, c=2</p>
<p>1. What is the data type of (a+b)?</p>
<p>2. What is the data type of (a/b)?</p>
<p>3. What is the data type of (a&ast;b)?</p>
<p>4. What is the data type of (b+c)?</p>
<p>5. What is the data type of (b/c)?</p>
</font>

In [None]:
#Write down your code here
#---------------------------------------------------------
#HINT:
#Step 1: assign values to a, b and c.
#Step 2: print out data types of required questions.
#---------------------------------------------------------






<strong> C. Boolean: </strong>
- abbr. `bool`
- Special type of Integer which is **Dichotomous** with only two potential values _True_ and _False_

In [None]:
a=True
print(type(a))

In [None]:
#Boolean and Integer are often used interchangeably. True = 1 and False = 0.
#Try applying mathematical calculation onto variable "a".




<strong> D. List: </strong>
- A list contains a series of values. Each value is an "element" or "item" of the list.
- List elements can be of heterogeneous data types.
- How to create a list?
    - Use square brackets, separating elements with commas: `[a,b,c]`
- How to refer to an element?
    - Format: `list_name[index]`
    - <b>0-Based Index</b>: indexes start from 0, i.e. the first element has index 0.
    
|Element|H|e|l|l|o|!|
|--|--|--|--|--|--|--|
|Index|0|1|2|3|4|5|
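
As a small illustration of zero-based indexing, the sketch below stores the characters from the table in a list and looks up a few positions. The variable name `letters` is made up for this example, and negative indexes (which count from the end) are standard Python but not covered in the bullets above.

```python
# Hypothetical example list holding the characters from the table
letters = ['H', 'e', 'l', 'l', 'o', '!']

first = letters[0]    # index 0 is the first element
fifth = letters[4]    # index 4 is the fifth element
last = letters[-1]    # negative indexes count from the end

print(first, fifth, last)

# An element can also be replaced through its index
letters[5] = '?'
print(letters)
```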

In [None]:
a=[1,2,3] #list of whole numbers
print(type(a))

b=[1,2,True] #list of whole numbers and a boolean value
print(type(b))

c=[1,2,'hello','world'] #list of whole numbers and strings
print(type(c))

d=[1,2,[3,4]] #list of list
print(type(d))

<div class="alert alert-block alert-success">
**<b>Extra Knowledge</b>**
    <br>1. We can use the function <font style='color:red;font-weight:bold;'>len(list)</font> to check the length of a list.
    <br>2. We can use the function <font style='color:red;font-weight:bold;'>sum(list)</font> to get the total of a list of numbers.
    <br>3. We can use the function <font style='color:red;font-weight:bold;'>max(list)</font> to get the max value of a list of numbers.
    <br>4. We can use the function <font style='color:red;font-weight:bold;'>min(list)</font> to get the min value of a list of numbers.
    <br>5. We can use the method <font style='color:red;font-weight:bold;'>.count(value)</font> to count the frequency of a certain value.
    <br>6. We can use the methods <font style='color:red;font-weight:bold;'>.extend(list)</font> and <font style='color:red;font-weight:bold;'>.append(value)</font> to add new elements to the list: <code>.extend()</code> adds every element of another list, while <code>.append()</code> adds a single element.</div>
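
The difference between `.append()` and `.extend()` is easy to miss, so here is a minimal sketch with made-up values (`scores` and `nested` are illustrative names). `.append()` adds its argument as one single element, while `.extend()` adds each element of the given list.

```python
scores = [3, 1, 2]          # hypothetical data for illustration

print(len(scores))          # number of elements
print(sum(scores), max(scores), min(scores))
print(scores.count(3))      # how many times the value 3 appears

scores.append(4)            # adds one element at the end
scores.extend([5, 6])       # adds each element of the given list
print(scores)

nested = [1, 2]
nested.append([3, 4])       # the whole list [3, 4] becomes ONE element
print(nested, len(nested))
```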

In [None]:
#Try len(), sum(), max(), min(), .count(), .extend() and .append()





<strong>E. String</strong>
- A string behaves much like a list whose elements are all characters: it has a length and its characters can be looked up by index. Unlike a list, a string cannot be modified in place.
- Use quotes to denote a string
    - No difference between single quotes and double quotes.
    - Quotes must be used in matching pairs, namely a string beginning with a double quote must end with a double quote.
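
Because a string is a sequence of characters, the indexing rules for lists carry over. A short sketch with illustrative variable names: it also shows that one kind of quote can appear inside a string delimited by the other kind, and that assigning to a position in a string raises an error.

```python
greeting = "hello world!"

print(greeting[0])        # first character
print(len(greeting))      # number of characters, the space included

# One kind of quote can appear inside a string delimited by the other kind
quoted = "It's fine"
print(quoted)

# Strings cannot be changed in place: assigning to an index raises a TypeError
try:
    greeting[0] = 'H'
except TypeError as error:
    print('Cannot modify a string in place:', error)
```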

In [None]:
a='hello world!'
print(type(a))
a="hello world!"
print(type(a))

<div class="alert alert-block alert-success">
**<b>Extra Knowledge</b>** We can use method <font style='color:red;font-weight:bold;'>.lower()</font> and <font style='color:red;font-weight:bold;'>.upper()</font> to quickly change the case of a string.</div>

In [None]:
#Try lower() and upper() methods.


<div class="alert alert-block alert-success">
**<b>Extra Knowledge</b>** We can use method <font style='color:red;font-weight:bold;'>.split(char)</font> to split the string by a given character and get a <font style='color:red;font-weight:bold;'>list</font> of sub-strings.</div>

In [None]:
a='This is an apple.'
#How many words are there in sentence a?



<strong>F. Data Type Conversion</strong>
- Forced conversion by function: `int()`, `float()`, `bool()`, `list()`, `str()`
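
Not every conversion succeeds, and a failed conversion stops the cell with an error. A minimal sketch of what success and failure look like, using made-up values; the `try`/`except` construct is not covered in this section and is used here only so the failing line does not halt the example.

```python
as_float = float(3)        # integer -> float
as_int = int('42')         # a string made of digits -> integer
print(as_float, as_int)

# A string that does not look like a number cannot become an integer
try:
    int('forty')
    worked = True
except ValueError as error:
    worked = False
    print('Conversion failed:', error)
```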

#### <font style="color: blue">Practice:</font>
---
<font style="color: blue">
<p>Write some commands to find out which of the following conversions work.</p></font>

In [1]:
from IPython.display import HTML
from IPython.display import display
import ipywidgets as widgets

tag = HTML('''<script>
code_show=true; 
function code_toggle() {
    if (code_show){
        $('div.cell.code_cell.rendered.selected div.input').hide();
    } else {
        $('div.cell.code_cell.rendered.selected div.input').show();
    }
    code_show = !code_show
} 
$( document ).ready(code_toggle);
</script>
<button><a href="javascript:code_toggle()">Hide/Show</a></button>''')
display(tag)
options=['integer -> float', 'integer -> boolean','integer -> list', 'integer -> string',
        'float -> integer', 'float -> boolean', 'float -> list', 'float -> string',
        'boolean -> integer','boolean -> float', 'boolean -> list', 'boolean -> string',
        'list -> integer', 'list -> float','list -> boolean','list -> string',
        'string -> integer','string -> float', 'string -> boolean', 'string -> list']
items = [widgets.Checkbox(value=False,description=options[i],disabled=False) for i in range(len(options))]
left_box=widgets.VBox(items[:8])
middle_box=widgets.VBox(items[8:16])
right_box=widgets.VBox(items[16:])
widgets.HBox([left_box,middle_box,right_box])


<font style="color: blue">
<p>Hint: to solve the first question, you can code as follows: </p></font>

>```python
a = 0
print(type(a))
a=float(a)
print(type(a))
```

In [None]:
#Write down your code here
#------------------------------------------------






---
# Break
---

<strong>G. Dictionary</strong>
- A dictionary is a collection of `key → value` pairs. `Keys` are **unique** identifiers of elements. `Values` can be heterogeneous.
- The main usage of a dictionary is to look up a `value` based on its `key`.
- How to create a dictionary? 
    - Use curly brackets, separating elements with commas. 
    - Each element contains a unique key and a value, separated by a colon.
    - Values can be of any data type. Keys must be of an immutable type, such as a number or a string; a list cannot be used as a key.
    - Format: `{key1:value1, key2:value2}`
- How to look up a value for a given key?
    - Format: `dictionary_name[key]`
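
The steps above can be sketched with a small made-up dictionary. The names `prices` and `basket` and all the values are assumptions for illustration only.

```python
# Hypothetical data: fruit names (keys) mapped to prices (values)
prices = {'apple': 3.5, 'banana': 1.2, 'cherry': 8.0}

print(prices['banana'])        # look up the value stored under the key 'banana'

prices['durian'] = 20.0        # assigning to a new key adds a pair
prices['apple'] = 4.0          # assigning to an existing key replaces its value
print(prices)

# A value can itself be a dictionary, which allows several fields per key
basket = {'apple': {'price': 4.0, 'colour': 'red'}}
print(basket['apple']['colour'])

# Looking up a missing key raises KeyError; `in` checks whether a key exists
has_mango = 'mango' in prices
print(has_mango)
```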

In [None]:
# Suppose we have three participants in our class: king-wa (id=1001), junior (id=1002), benjamin (id=1003).
# Try generating a dictionary of participants' ids and their names.




In [None]:
# What are the name, gender and affiliation of participant whose id is 1003?



<div class="alert alert-block alert-info">
**<b>Tip</b>** Dictionary has a famous counterpart in JavaScript named JSON. Compared with previous five data types, dictionary operates at a higher level as it can represent not only the values but also the <b>relationship</b> between values. Moreover, it is highly <b>human readable</b>.</div>

<div class="alert alert-block alert-success">
**<b>Extra Knowledge</b>** We can use method `.keys()` to extract all keys and `.values()` to extract all values.</div>

In [None]:
# Try .keys() and .values() methods.



<strong>H. DataFrame</strong>
- A special data type provided by the Pandas library.
- A DataFrame is a two-dimensional table whose rows are labelled by an `index` and whose columns are labelled by column names.
- How to create a DataFrame? 
    - Use Pandas' function `pd.DataFrame(data=...[,index=...,columns=...])`. Arguments in square brackets are optional.
    - Import from another data source, like csv, excel or json. For example, `pd.read_csv(file path [, index_col=..., header=...])`.
- How to look up a row for a given index label?
    - Format: `dataframe.loc[index]`

In [None]:
import pandas as pd
#We have three ways to create a DataFrame, i.e. Progressive way, Radical way and Easy way.
#1: Progressive way
a=pd.DataFrame([[1001,'king-wa','M','JMSC'],[1002,'junior','F','JMSC'],[1003,'benjamin','M','SHKS']])

In [None]:
a.columns=['id','name','gender','affiliation']

In [None]:
a=a.set_index('id')

In [None]:
a

In [None]:
#2: Radical way
#The ids go into index, so columns lists only the three remaining fields.
a=pd.DataFrame([['king-wa','M','JMSC'],['junior','F','JMSC'],['benjamin','M','SHKS']],columns=['name','gender','affiliation'],index=[1001,1002,1003])

In [None]:
#Retrieve values


In [None]:
#3: Easy way
#format: pd.read_csv(file path [, header=..., index_col=...])
b=pd.read_csv('COMM_journals.csv',header=0,index_col=0)

In [None]:
b

## Quiz

<h3>Q1. Which of the following will create a list?</h3>

>```python
(A) a = "1,2,3"
(B) a = [1,2,3]
(C) a = (1,2,3)
(D) a = {1,2,3}
```

<h3>Q2. Which of the following will create a dictionary?</h3>

>```python
(A) a = {1:2}
(B) a = {1,2}
(C) a = {1;2}
(D) a = (1,2)
```

<h4><i> Suppose I have a dictionary named "dic"</i></h4>

>```python
dic = {'a':5,'b':4,'c':3,'d':2,'e':1}
```
<h3>Q3. Which of the following retrieves the value stored under the key "a" in dic?</h3>

>```python
(A) dic{a}
(B) dic[a]
(C) dic['a']
(D) dic{'a'}
```

## Section 2: Some useful built-in functions

<strong> A. File I/O: </strong>
- Use the `open(path[,mode='r'])` function.
    - Modes: read ('r'), write ('w'), read and write ('r+') or append ('a')
- Input: the `.readlines()` method reads the whole file into a list of strings, one string per line. In a plain-text file where each paragraph sits on its own line, that means one paragraph, one string.
- Output: the `.write(string)` method writes the given content to the file. Mode '`w`' erases any existing content and writes from the beginning of the file; mode '`a`' keeps the existing content and writes at the end.
- Save and Close File: `.close()` method
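
A minimal sketch of the write, append and read cycle described above. The file name `example_notes.txt` and the temporary folder are assumptions, chosen so the example does not touch your own files.

```python
import os
import tempfile

# Hypothetical file inside a temporary folder
folder = tempfile.mkdtemp()
path = os.path.join(folder, 'example_notes.txt')

f = open(path, 'w')            # 'w' creates the file (or empties an existing one)
f.write('first line\n')        # '\n' ends the line; write() does not add it for you
f.write('second line\n')
f.close()                      # closing makes sure the content is saved

f = open(path, 'a')            # 'a' keeps the content and writes at the end
f.write('third line\n')
f.close()

f = open(path, 'r')
lines = f.readlines()          # list of strings, one per line, each ending in '\n'
f.close()

print(lines)
```

Python also offers the form `with open(path) as f:`, which closes the file automatically when the indented block ends.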

In [None]:
# Create a new file, add new lines to it and close it.




In [None]:
# Open file you just created, read the existing lines and print them one by one.




<strong> B. For Loop </strong>
- A `for` loop is used to iterate through every element in a list and repeatedly execute the commands after the colon.
    - Format: `for a in list_name: ...`
    - Usually coupled with the `range([start=0,] stop[, step=1])` function, which produces a sequence of whole numbers beginning at the start number, increasing by step, and stopping before (not including) the stop number.
    - Syntax: In the above format, `...` is a block of commands subordinate to the for loop. They are executed only inside the for loop. Python requires indentation to group commands into a block. For example:
>```python
for a in range(5):
    print(a)       #Use indentation to denote a Block subordinate to above statement
    print(a+1)
    print(a+2)
```

In [None]:
a=[0,1,2,3]
for i in a:
    print(i)

In [None]:
for i in range(4):
    print(i)

In [None]:
for i in range(0,4,2):
    print(i)

<div class="alert alert-block alert-success">
**<b>Extra Knowledge</b>**
    <br>We can write a for loop inside the square brackets of a list to create a new list based on an old one.</div>
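
A minimal sketch of this idea, using made-up data: it builds a list of squares first with an ordinary loop and then with the loop written inside the brackets, a form usually called a list comprehension.

```python
numbers = [1, 2, 3, 4]      # hypothetical data for illustration

# Ordinary for loop: start with an empty list and append to it
squares_loop = []
for n in numbers:
    squares_loop.append(n * n)

# List comprehension: the same loop written inside square brackets
squares_comp = [n * n for n in numbers]

print(squares_loop)
print(squares_comp)
```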

In [7]:
a=[0,1,2,3]
#two ways to increase every element in a by one unit
#---------------------------------
#1



#---------------------------------
#2


<div class="alert alert-block alert-success">
**<b>Extra Knowledge</b>**
    <br>1. We can use the <font style='color:red;font-weight:bold;'>continue</font> statement to skip the remaining commands of the current iteration and jump directly to the next iteration.
    <br>2. We can use the <font style='color:red;font-weight:bold;'>break</font> statement to quit the loop.</div>
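
A small sketch of both statements in one loop. It relies on an `if` test, which is introduced in the If/Else part below; the list name `visited` is made up for this example.

```python
visited = []
for i in range(10):
    if i == 2:
        continue        # skip the rest of this iteration and move on to the next i
    if i == 5:
        break           # leave the loop entirely; later values are never reached
    visited.append(i)
    print(i)

print(visited)
```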

In [None]:
#Try continue and break




In [None]:
for i in 'Hello!':
    print(i)

In [None]:
#Print the characters at even indexes in 'Hello!'
a='Hello!'



In [None]:
#Repeat every line in the file created above three times, i.e. write each line to the file three times. Save the output.





In [None]:
#Split the lines in the file created above into words. One word, one line. Save the output.






<strong> C. If/Else Statement </strong>
- An If/Else statement tests whether a condition is True. If yes, do something. If not, do something else. The else part is optional.
- Format: `if logical_condition: ... [else: ...]`
- Example:
>```python
if a==1:
    print('yes') #Block A
else:
    print('no') #Block B
```

In [None]:
#Please use for loop and If/Else statement to select all even numbers from 0 to 19 and print them out one by one.





<div class="alert alert-block alert-success">
    **<b>Extra Knowledge</b>** An <font style='color:red;font-weight:bold;'>If/Else</font> statement can be extended into an <font style='color:red;font-weight:bold;'>If/Elif/Else</font> statement to test several conditions in turn.</div>

In [None]:
a=1
if a<0:
    print('negative')
elif a==0:
    print('neutral')
else:
    print('positive')

#### Practice
<img src='img/week2-decision-tree.jpg'>

In [None]:
#Use If/Elif/Else statement to allocate a patient with records as below:
a={'new patient':False,'unpaid bill':False}





## Section 3: Build our own function

- A function is a block of reusable code. Notation: y=f(x), where x is a list of input variables and y is a list of output variables.
    - Terminology: input variables = <b>parameters</b>, the actual values passed in for them = <b>arguments</b>, output variables = <b>returned values</b>
    - <b>Global vs Local</b>: a function can create local variables that exist only inside the function. Local variables can use the same names as global variables without overriding their values.
    - Format:
>```python
def function_name(input1[,input2,input3...]):
    command line
    return output1[,output2...]
```

- The job of a function is to transform x into y, like a magic trick turning a girl into a tiger.
<img src='img/week2-function.png' width='200px'>
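
To make the terminology concrete, here is a minimal sketch. The function `add_tax`, the variable `rate` and the numbers are all made up for illustration.

```python
rate = 0.1                          # global variable

def add_tax(price):                 # price is a parameter
    rate = 0.5                      # local variable that shares its name with the global one
    total = price * (1 + rate)      # total exists only inside the function
    return total                    # the value handed back to the caller

result = add_tax(100)               # 100 is the argument passed in for price
print(result)                       # computed with the local rate
print(rate)                         # the global rate is unchanged
```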

In [None]:
#Wrap our previous If/Elif/Else statements into a custom function, which takes a patient record dictionary as input and returns the allocation.
a={'new patient':False,'unpaid bill':False}






<img src='img/week2-presidents.png'>
Presidential inauguration speeches capture the sentiment of the time.

## Practice: Inauguration Speech
Download the dataset from: https://juniorworld.github.io/python-workshop-2018/doc/presidents.rar

<p>Expected Objectives:</p>

1. Total number of sentences in the speech
2. Total number of words in the speech
3. Average length of sentences
4. Coleman–Liau index of Readability

#### Coleman–Liau index:
><b>CLI = 0.0588 &ast; L - 0.296 &ast; S - 15.8</b>
<br>L is the average number of letters per 100 words and S is the average number of sentences per 100 words.
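
As a worked example of the formula, the sketch below computes the index from raw counts. The helper name `coleman_liau` and the counts are made up for illustration; they do not come from any of the speeches.

```python
def coleman_liau(letter_count, word_count, sentence_count):
    """Coleman-Liau index from raw counts (hypothetical helper for illustration)."""
    L = letter_count / word_count * 100      # letters per 100 words
    S = sentence_count / word_count * 100    # sentences per 100 words
    return 0.0588 * L - 0.296 * S - 15.8

# Made-up counts: 200 words, 900 letters, 10 sentences
# L = 450 and S = 5, so CLI = 26.46 - 1.48 - 15.8
cli = coleman_liau(letter_count=900, word_count=200, sentence_count=10)
print(round(cli, 2))
```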

In [None]:
presidents=['Washington','Jefferson','Lincoln','Roosevelt','Kennedy','Nixon','Reagan','Bush','Clinton','W Bush','Obama','Trump']

In [None]:
for president in presidents:
    file=open('doc\\'+president+'.txt','r')
    paragraphs=file.readlines()
    file.close()
    paragraph_count=           #Write your command here
    sentence_count,word_count,letter_count=readablity_test(paragraphs)
    CLI=0.0588*(letter_count/word_count*100)-0.296*(sentence_count/word_count*100) - 15.8
    if CLI <= 6:
        grade_level='primary'
    elif CLI<=12:
        grade_level='secondary'
    elif CLI<=16:
        grade_level='undergrad'
    else:
        grade_level='postgrad'
    print(president,':',sentence_count,'sentences,',word_count,'words,',round(word_count/sentence_count),'words/sentence, CLI at',round(CLI),',',grade_level,' level')

In [None]:
def readablity_test(paragraphs):
#Define a custom function readablity_test() to output sentence_count, word_count and letter_count







    return sentence_count,word_count,letter_count

In [None]:
#Save results to a new file





<img src='img/week2-flow.png'>