# Data Programming in Python | MSCI:6040
# Part 1. Python Basics

Instructor: Kang-Pyo Lee 

Topics to be covered:
- Numbers
- Strings (+ exercises)
- Lists (+ exercises)
- Tuples (+ exercises)
- Dictionaries (+ exercises)
- Sets (+ exercises)
- Built-in Functions (+ exercises)
- Operators
- del

References: 
- Data Wrangling with Python by Katharine Jarmul, Jacqueline Kazil (http://shop.oreilly.com/product/0636920032861.do)
- Python Programming by en.wikibooks.org (https://en.wikibooks.org/wiki/Python_Programming)
- Python Standard Library by Python Software Foundation (https://docs.python.org/3/library/)
- Python Tutorial by Python Software Foundation (https://docs.python.org/3/tutorial/)
- Python Tutorial by w3schools.com (https://www.w3schools.com/python/default.asp)

## ▪ Numbers

Python has three built-in numeric types:
- Integers
- Floating point numbers
- Complex numbers

In [1]:
x = 5
print(x, type(x))

5 <class 'int'>


x is called a variable. 

In [2]:
y = 3.141592
print(y, type(y))

3.141592 <class 'float'>


In [3]:
z = 5 + 2j
print(z, type(z))

(5+2j) <class 'complex'>


You do not have to declare the type of a variable. Python is dynamically typed: the type is determined at run time by the value currently assigned to the variable.
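
As a small illustration of dynamic typing, the sketch below re-assigns one name to values of three different types. The variable names are made up for this example.

```python
# A name is not tied to one type; it simply refers to whatever value
# was assigned most recently.
value = 5
first_type = type(value)     # int

value = value / 2            # true division produces a float
second_type = type(value)    # float

value = "five"
third_type = type(value)     # str

print(first_type, second_type, third_type)
```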

## ▪ Strings

In [4]:
s = "How are you?"
print(s, type(s))

How are you? <class 'str'>


String literals can be enclosed in matching single quotes (') or double quotes ("); either is fine. 

In [5]:
s = "She said, "How are you?""
print(s)

SyntaxError: invalid syntax (<ipython-input-5-6ae9475ee435>, line 1)

In [6]:
s = "She said, \"How are you?\""
print(s)

She said, "How are you?"


If you need double quotes in a string already enclosed in double quotes, you can put a backslash escape character (\\) before each double quote inside the string.

In [7]:
s = 'She said, "How are you?"'
print(s)

She said, "How are you?"


If the string contains double quotes, you can use single quotes around the string without using a backslash, and vice versa.

In [8]:
s = "I'm a boy."
print(s)

I'm a boy.


In [9]:
s = "1"
print(s, type(s))

1 <class 'str'>


In [10]:
print(1, "1")

1 1


Integer 1 and string "1" look the same when printed.

In [11]:
print(type(1), type("1"))

<class 'int'> <class 'str'>


In [12]:
s = "How are you?"
len(s)

12

The <b>len</b> function is a built-in Python function that returns the length (number of items) of a sequence or collection, such as a string or a list.

### String Additions and Multiplications

In [13]:
s1 = "hello"
s2 = "world"
s1 + s2

'helloworld'

The easiest way to combine two strings is to use the + operator. 

In [14]:
s1 * 3

'hellohellohello'

In [15]:
s1 * s2

TypeError: can't multiply sequence by non-int of type 'str'

### Containment

In [16]:
s1 = "hello"
s2 = "hell"
s2 in s1

True

In [17]:
s1 in s2

False

The <b>in</b> operator returns True if the first operand is contained in the second.

### Indexing and Slicing (Important)

A Python string is a sequence, which means that it can be indexed and sliced.

In [37]:
s = "Data_Science_Institute!"
s

'Data_Science_Institute!'

In [19]:
index = 0
for character in s:
    print(index, "\t", character)
    index += 1

0 	 D
1 	 a
2 	 t
3 	 a
4 	 _
5 	 S
6 	 c
7 	 i
8 	 e
9 	 n
10 	 c
11 	 e
12 	 _
13 	 I
14 	 n
15 	 s
16 	 t
17 	 i
18 	 t
19 	 u
20 	 t
21 	 e
22 	 !


A Python index starts at 0, increments by 1, and ends at the length minus 1 (here, 22).

In [20]:
s[0]

'D'

You can access a character in a string by referring to the index number inside square brackets.

In [21]:
s[22]

'!'

In [22]:
s[23]

IndexError: string index out of range

In [23]:
s[0:4]

'Data'

You can use a colon to select a range of indices (a slice). Note that s[i:j] returns the substring starting at s[i] and ending at s[j-1], not s[j].

In [24]:
s[:4]

'Data'

You can omit the starting index if the slice starts at 0. s[:n] is the easiest way to get the first n characters in a string.

In [27]:
s[4:23]

'_Science_Institute!'

In [28]:
s[4:]

'_Science_Institute!'

You can omit the ending index if the slice runs to the end of the string.

In [29]:
s[:]

'Data_Science_Institute!'

You can omit both the starting and ending indices if the slice starts at 0 and runs to the end; s[:] returns the whole string.
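
The slicing rules above can be checked with a short sketch. It reuses the example string from this section; the other variable names are chosen only for illustration.

```python
s = "Data_Science_Institute!"

i, j = 5, 12
middle = s[i:j]              # characters at indices 5 through 11
print(middle)                # Science

# A slice s[i:j] with 0 <= i <= j <= len(s) has exactly j - i characters.
slice_length = len(middle)

# Splitting at any position n and gluing the halves back together
# reproduces the original string.
n = 4
rejoined = s[:n] + s[n:]

# Unlike indexing, slicing does not raise an error for out-of-range bounds.
clipped = s[20:100]
print(clipped)               # te!
```

Note the contrast with s[23], which raises an IndexError: a slice is simply cut off at the end of the string.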

In [30]:
s[-1]

'!'

Python also supports indexing from the end of a sequence using negative numbers: -1 refers to the last element.

In [31]:
index = 0
for character in s:
    print(index, "\t", index-len(s), "\t", character)
    index += 1

0 	 -23 	 D
1 	 -22 	 a
2 	 -21 	 t
3 	 -20 	 a
4 	 -19 	 _
5 	 -18 	 S
6 	 -17 	 c
7 	 -16 	 i
8 	 -15 	 e
9 	 -14 	 n
10 	 -13 	 c
11 	 -12 	 e
12 	 -11 	 _
13 	 -10 	 I
14 	 -9 	 n
15 	 -8 	 s
16 	 -7 	 t
17 	 -6 	 i
18 	 -5 	 t
19 	 -4 	 u
20 	 -3 	 t
21 	 -2 	 e
22 	 -1 	 !


In [32]:
s[-10:-1]

'Institute'

In [33]:
s[-10:]

'Institute!'

s[-n:] is the easiest way to get the last n characters in a string.
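
As a quick check of how negative and non-negative indices relate, the sketch below compares the two for the same example string. For a string of length n, s[-k] refers to the same character as s[n - k].

```python
s = "Data_Science_Institute!"
n = len(s)                   # 23

# The same character, addressed from the end and from the start.
last_from_end = s[-1]
last_from_start = s[n - 1]

tenth_from_end = s[-10]
tenth_from_start = s[n - 10]

first_four = s[:4]           # first 4 characters
last_ten = s[-10:]           # last 10 characters
without_last = s[:-1]        # everything except the final character

print(first_four, last_ten, without_last)
```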

In [42]:
s

'Data_Science_Institute!'

Note that the original string <i>s</i> has not changed at all. Indexing and slicing return a new string and leave the original string unchanged.

In [39]:
s = s[:4]
s

'Data'

If you want to change the original string, make sure to re-assign the new copy to the original variable.
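
The reason re-assignment is needed is that strings cannot be modified in place. The following sketch, using names made up for this example, shows what happens when you try, and how to build a new string instead.

```python
s = "Data_Science_Institute!"

# Strings cannot be modified in place; item assignment raises TypeError.
try:
    s[0] = "d"
    assignment_worked = True
except TypeError:
    assignment_worked = False

# Build a new string instead and re-assign it to the same name.
s = "d" + s[1:]
print(s)                     # data_Science_Institute!
```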

### String Methods

In [40]:
s = "DaTa ScIeNcE InStItUtE!"
s

'DaTa ScIeNcE InStItUtE!'

In [41]:
s.upper()

'DATA SCIENCE INSTITUTE!'

The <b>upper</b> method returns a string where all characters are in upper case. Symbols and numbers are ignored.

In [43]:
s.lower()

'data science institute!'

The <b>lower</b> method returns a string where all characters are in lower case. Symbols and numbers are ignored.

In [44]:
s.count("S")

2

The <b>count</b> method returns the number of times a specified value appears in the string.

String comparisons and string methods in Python are case-sensitive.

In [48]:
s = "\tData Science Institute!\nWelcome!"     # \t: tab, \n: new line
print(s)

	Data Science Institute!
Welcome!


In [49]:
s = "\tData Science Institute    \n"
print(s.strip())

Data Science Institute


The <b>strip</b> method removes leading and trailing characters, i.e., characters at the beginning and at the end of the string. By default it removes whitespace, which includes spaces, tabs (\t), and newlines (\n).

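In [46]:
s = "\tData Science Institute!\nWelcome!"
print(s.lstrip())

Data Science Institute!
Welcome!


The <b>lstrip</b> method removes leading characters only (whitespace by default).

In [47]:
print(s.rstrip())

	Data Science Institute!
Welcome!


The <b>rstrip</b> method removes trailing characters only (whitespace by default). Here <i>s</i> has no trailing whitespace, so the output is unchanged.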

In [50]:
s = "Data Science Institute;"
print(s.rstrip(";"))

Data Science Institute


You can specify the character, or set of characters, to be removed.
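
The sketch below illustrates two details of the strip family that are easy to miss: the default removes all kinds of whitespace, and an argument is treated as a set of characters rather than as a fixed prefix or suffix. The sample strings are invented for this example.

```python
raw = "  \t Data Science Institute \n"

# With no argument, strip() removes spaces, tabs, and newlines alike.
cleaned = raw.strip()

# With an argument, every character in the argument is stripped from
# both ends, in any order and any number of times.
trimmed = "xxhixyx".strip("xy")

# Characters in the middle of the string are never touched.
inner_kept = "--a-b--".strip("-")

print(repr(cleaned), repr(trimmed), repr(inner_kept))
```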

In [51]:
s = "Data Science Institute"
s.startswith("Data")

True

The <b>startswith</b> method returns True if the string starts with the specified value, otherwise False.

In [52]:
s.endswith("e")

True

The <b>endswith</b> method returns True if the string ends with the specified value, otherwise False.

In [94]:
l = ["a", "b", "c"]
"+".join(l)

'a+b+c'

The <b>join</b> method takes all items in an iterable and joins them into one string. A string must be specified as the separator.

In [54]:
s = "Data Science Institute"
s.find("Science")

5

The <b>find</b> method finds the index of the first occurrence of the specified value.

In [55]:
s = "Data Science Institute"
s.find("z")

-1

It returns -1 if the value is not found.

In [56]:
s = "Data Science Institute"
s.index("Science")

5

The <b>index</b> method is almost the same as the <b>find</b> method. The only difference is what happens when the value is not found: <b>find</b> returns -1, whereas <b>index</b> raises a ValueError.

In [57]:
s = "Data Science Institute"
s.index("z")

ValueError: substring not found
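
Because the two methods signal a missing substring differently, the code that handles the "not found" case also differs. A minimal sketch, with variable names chosen for illustration:

```python
s = "Data Science Institute"

# find: test the return value against -1.
position = s.find("z")
found_with_find = position != -1

# index: handle the ValueError that signals a missing substring.
try:
    s.index("z")
    found_with_index = True
except ValueError:
    found_with_index = False

print(found_with_find, found_with_index)   # False False

# Pitfall: a match at the very beginning returns 0, which counts as
# false in a condition, so always compare the result of find with -1.
start = s.find("Data")
print(start, start != -1)                  # 0 True
```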

In [59]:
s = "Data Science Institute"
s.replace(" ", "_")

'Data_Science_Institute'

The <b>replace</b> method replaces every occurrence of a specified value with another specified value.

In [60]:
s = "Data Science Institute"
s.split()

['Data', 'Science', 'Institute']

The <b>split</b> method splits a string into a list. The default separator is any whitespace, but you can specify a different separator.

In [61]:
s = "Data_Science_Institute"
s.split("_")

['Data', 'Science', 'Institute']
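
Calling split with no argument is not quite the same as calling it with a single space as the separator. The sketch below, using an invented string with repeated spaces, shows the difference.

```python
s = "Data   Science  Institute"   # note the repeated spaces

# No argument: runs of whitespace count as a single separator.
words = s.split()

# Explicit separator: every single space is a boundary, so empty
# strings appear between consecutive spaces.
pieces = s.split(" ")

print(words)
print(pieces)
```

For text with irregular spacing, the no-argument form is usually the one you want.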

In [64]:
name = "Alice"
age = 30
sex = "F"
s = "Name: {}, Age: {}, Sex: {}".format(name, age, sex)
s

'Name: Alice, Age: 30, Sex: F'

The <b>format</b> method formats specified values in a string. 
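
For comparison, here is a sketch of three ways to produce the same string. The field names n, a, and x are arbitrary choices for this example, and the f-string form assumes Python 3.6 or later.

```python
name = "Alice"
age = 30
sex = "F"

# Positional fields are filled in order.
s1 = "Name: {}, Age: {}, Sex: {}".format(name, age, sex)

# Named fields make longer templates easier to read.
s2 = "Name: {n}, Age: {a}, Sex: {x}".format(n=name, a=age, x=sex)

# An f-string (Python 3.6 or later) embeds the expressions directly.
s3 = f"Name: {name}, Age: {age}, Sex: {sex}"

print(s1 == s2 == s3)        # True
```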

Note that string methods return a new string and never change the original string.

### Dot Notation

In [65]:
s = "Data Science Institute"

Suppose you want to perform a series of string methods, e.g., convert <i>s</i> to lowercase and then split it into a list of words. 

In [66]:
s1 = s.lower()
s1

'data science institute'

In [67]:
s1.split()

['data', 'science', 'institute']

In [68]:
s.lower().split()

['data', 'science', 'institute']

Dot notation lets you call a method directly on the result of the previous method. That way, you do not have to store the intermediate result in a variable before applying the next operation.

In [69]:
s.split().lower()

AttributeError: 'list' object has no attribute 'lower'

s.split() returns a list, not a string. There is no <b>lower</b> method in lists. 
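
The order of the chained methods matters because each one is applied to whatever the previous one returned. A short sketch with an invented input string:

```python
raw = "  Data Science Institute \n"

# Each method is applied to the string returned by the previous one.
tokens = raw.strip().lower().split()
print(tokens)                # ['data', 'science', 'institute']

# To lowercase after splitting, apply lower() to each word separately.
tokens_after_split = []
for word in raw.split():
    tokens_after_split.append(word.lower())

print(tokens_after_split)    # ['data', 'science', 'institute']
```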

## Exercises - Strings

In [73]:
s = "I'm learning Python data analytics."

Get the type of <i>s</i>.

In [74]:
# Your answer here
type(s)

str

Get the first character in <i>s</i>.

In [75]:
# Your answer here
s[0]

'I'

Get the last character in <i>s</i>.

In [76]:
# Your answer here
s[-1]

'.'

Get the first 3 characters in <i>s</i>. 

In [77]:
# Your answer here
s[:3]

"I'm"

Get the last 10 characters in <i>s</i>.

In [80]:
# Your answer here
s[-10:]

'analytics.'

Check if the substring <i>Python</i> is in <i>s</i>.

In [82]:
# Your answer here (using the in operator)
"Python" in s

True

In [83]:
# Your answer here (using the find method)
s.find("Python")

13

In [84]:
# Your answer here (using the index method)
s.index("Python")

13

Get only the <i>Python</i> part in <i>s</i>. (Use a range filter.)

In [87]:
# Your answer here
s[13:19]

'Python'

Get the lower-case version of <i>s</i>. 

In [88]:
# Your answer here
s.lower()

"i'm learning python data analytics."

Count the number of <i>yt</i>'s in <i>s</i>. 

In [89]:
# Your answer here
s.count("yt")

2

Remove all spaces in <i>s</i>. (Use <b>replace</b>.)

In [90]:
# Your answer here
s.replace(" ","")

"I'mlearningPythondataanalytics."

Remove the trailing period in <i>s</i> and then split it into a list of words, or tokens. (Use dot notation.)

In [99]:
# Your answer here
s.rstrip(".").split()

["I'm", 'learning', 'Python', 'data', 'analytics']

Join all names in the list <i>names</i> using a comma as separator.   

In [95]:
names = ["Alice", "Bob", "Tom"]

In [96]:
# Your answer here
",".join(names)

'Alice,Bob,Tom'

Print "I'm learning Python data analytics." by embedding the value of <i>s</i> using the <b>format</b> method.

In [100]:
s = "Python"

In [101]:
# Your answer here
s = "I'm learning {} data analytics.".format(s)
s

"I'm learning Python data analytics."

## Python Collections

There are four collection data types in Python:

- List is a collection that is ordered and changeable (i.e., mutable). Allows duplicate members.
- Tuple is a collection that is ordered and unchangeable. Allows duplicate members.
- Dictionary is a collection of key-value mappings that is changeable and, as of Python 3.7, preserves insertion order. No duplicate keys.
- Set is a collection that is unordered and unindexed. No duplicate members.

It is important to choose the right type that fits your needs.
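
To make the differences concrete, the sketch below stores the same made-up fruit data in each of the four collection types. The names and prices are purely illustrative.

```python
fruits_list = ["apple", "banana", "apple"]        # ordered, duplicates kept
fruits_tuple = ("apple", "banana", "apple")       # ordered, cannot be changed
fruits_set = {"apple", "banana", "apple"}         # duplicates collapse
prices = {"apple": 0.5, "banana": 0.25}           # key -> value lookup

print(len(fruits_list), len(fruits_tuple), len(fruits_set), len(prices))
# 3 3 2 2
```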

## ▪ Lists

A list is a collection that is ordered, changeable, and written with square brackets.

In [102]:
l = []
print(l, type(l))

[] <class 'list'>


In [103]:
l = list()
print(l, type(l))

[] <class 'list'>


In [104]:
l = [1, 2, 3]
l

[1, 2, 3]

In [105]:
l = ["a", "b", "c"]
l

['a', 'b', 'c']

In [106]:
l = [1, 2, 3, "a", "b", "c"]
l

[1, 2, 3, 'a', 'b', 'c']

List elements do not have to be of the same type.

In [107]:
len(l)

6

The length of a list is the number of items in the list.

In [113]:
l[:3]

[1, 2, 3]

Indexing and slicing of lists work the same way as for strings.

In [109]:
3 in l

True

In [110]:
"z" in l

False

### List Methods

In [111]:
l1 = [1, 2, 3]
l2 = ["a", "b", "c"]
l1 + l2

[1, 2, 3, 'a', 'b', 'c']

The easiest way to combine two lists is to use the + operator, just as with strings.

In [114]:
l1.extend(l2)
l1

[1, 2, 3, 'a', 'b', 'c']

The <b>extend</b> method adds the specified list elements (or any iterable) to the end of the current list.

Note that l1 + l2 returns a new list and leaves l1 unchanged, while l1.extend(l2) modifies l1 itself by adding the elements of l2.

In [115]:
l = [1, 2, 3]
l.append(4)
l

[1, 2, 3, 4]

The <b>append</b> method appends an element to the end of the list.

In [116]:
l.append([5, 6])
l

[1, 2, 3, 4, [5, 6]]

The list [5, 6] has been added as a single element of l, not merged into the list. The <b>append</b> method always adds exactly one element to the end of a list.
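
The three ways of adding to a list can be compared side by side. This is a minimal sketch with invented variable names:

```python
base = [1, 2, 3]

combined = base + [4, 5]     # new list; base is untouched
print(base)                  # [1, 2, 3]

extended = [1, 2, 3]
extended.extend([4, 5])      # adds each item: length grows by 2

appended = [1, 2, 3]
appended.append([4, 5])      # adds the list itself: length grows by 1

print(combined, extended, appended)
```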

In [117]:
fruits = ['apple', 'banana', 'cherry']
fruits.insert(0, "orange")
fruits

['orange', 'apple', 'banana', 'cherry']

The <b>insert</b>(pos, element) method inserts the specified value `element` at the specified position `pos`.

In [118]:
l = [1, 2, 3, 4, 5]
l.pop()
l

[1, 2, 3, 4]

The <b>pop</b>(pos) method removes the element at the specified position and returns it. The default value of `pos` is -1, which removes the last item.

In [119]:
l.pop(0)
l

[2, 3, 4]

pop(0) will remove the first item.

In [120]:
l.remove(3)
l

[2, 4]

The <b>remove</b> method removes the first occurrence of the element with the specified value.

In [121]:
l.clear()
l

[]

The <b>clear</b> method removes all the elements from a list.

In [122]:
l = []

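l.clear() has the same visible effect as l = [] here, although the two are not identical: clear empties the existing list in place, whereas l = [] makes l refer to a new empty list.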
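
The difference between the two only shows up when a second name refers to the same list. The following sketch uses hypothetical names (alias, other) to make that visible.

```python
a = [1, 2, 3]
alias = a                    # alias and a refer to the same list
a.clear()                    # empties that one list in place
print(alias)                 # []

b = [1, 2, 3]
other = b                    # other and b refer to the same list
b = []                       # b now refers to a brand-new empty list
print(other)                 # [1, 2, 3]
```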

In [124]:
l = [1, 6, 3, 4, 2, 5]
l.sort()
l

[1, 2, 3, 4, 5, 6]

The <b>sort</b> method sorts the list in ascending order by default.

In [1]:
l = [1, 6, 3, 4, 2, 5]
l.sort(reverse=True)
l

[6, 5, 4, 3, 2, 1]

Setting the `reverse` parameter to True will sort the list in descending order.

In [157]:
l = ["a", "c", "b", 3, 1, 2]
l.sort()
l

TypeError: '<' not supported between instances of 'int' and 'str'

You cannot sort a list whose elements cannot be compared with each other, such as a mix of integers and strings.

In [128]:
l = [1, 6, 3, 4, 2, 5]
sorted(l)

[1, 2, 3, 4, 5, 6]

Python also has a built-in function <b>sorted</b>, which works like the <b>sort</b> method except that it returns a new sorted list and leaves the original list unchanged.

In [129]:
l

[1, 6, 3, 4, 2, 5]

In [130]:
sorted(l, reverse=True)

[6, 5, 4, 3, 2, 1]
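
A common mistake is to write l = l.sort(), which replaces the list with None because the <b>sort</b> method returns nothing. The sketch below shows what each call returns.

```python
l = [1, 6, 3, 4, 2, 5]

new_list = sorted(l)         # a new list; l keeps its original order
returned = l.sort()          # sorts l in place and returns None

print(new_list)              # [1, 2, 3, 4, 5, 6]
print(returned)              # None
print(l)                     # [1, 2, 3, 4, 5, 6]
```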

In [131]:
l = [1, 6, 3, 4, 2, 5]
l.reverse()
l

[5, 2, 4, 3, 6, 1]

The <b>reverse</b> method reverses the order of the elements in place. Do not confuse reversing a list with sorting a list.

In [133]:
l = [1, 6, 3, 4, 2, 5]
reversed(l)

<list_reverseiterator at 0x23ab8abb710>

Python also has a built-in function <b>reversed</b>, which leaves the original list unchanged and returns a reverse iterator rather than a list. Pass the iterator to <b>list</b> to get a new reversed list.

In [134]:
list(reversed(l))

[5, 2, 4, 3, 6, 1]

In [135]:
l = [1, 2, 3, 4, 5, 4, 3, 2, 1]
l.count(1)

2

The <b>count</b> method returns the number of elements with the specified value.

## Exercises - Lists

In [164]:
animals = ["dog", "cat", "bird", "tiger", "lion", "fox"]

Get the type of <i>animals</i>.

In [137]:
# Your answer here
type(animals)


list

Check if <i>elephant</i> is in <i>animals</i>. The output must be either True or False. 

In [139]:
# Your answer here
"elephant" in animals

False

Append a new element <i>elephant</i> to <i>animals</i>. Be careful not to append the element multiple times.

In [165]:
# Your answer here
animals.append("elephant")

Get the number of elements in <i>animals</i>. 

In [166]:
# Your answer here
len(animals)

7

Get the first 5 items in <i>animals</i>. 

In [143]:
# Your answer here
animals[:5]

['dog', 'cat', 'bird', 'tiger', 'lion']

Get a new reversed list of <i>animals</i>. 

In [167]:
# Your answer here
list(reversed(animals))

['elephant', 'fox', 'lion', 'tiger', 'bird', 'cat', 'dog']

Get a new sorted list of <i>animals</i> in descending alphabetical order. 

In [168]:
# Your answer here
sorted(animals, reverse=True)

['tiger', 'lion', 'fox', 'elephant', 'dog', 'cat', 'bird']

## ▪ Tuples

A tuple is a collection that is ordered, unchangeable, and written with round brackets.

In [153]:
t = ()
print(t, type(t))

() <class 'tuple'>


In [154]:
t = tuple()
print(t, type(t))

() <class 'tuple'>


In [155]:
t = ("a", "b", "c")
print(t, type(t))

('a', 'b', 'c') <class 'tuple'>


In [158]:
t[0]

'a'

Indexing and slicing of tuples work the same way as for strings and lists.

In [159]:
t[0] = "A"

TypeError: 'tuple' object does not support item assignment

Once a tuple is created, you cannot change its values as tuples are unchangeable. 

The major difference between tuples and lists is that a list is changeable, whereas a tuple is not. In general, we use a list to store similar, or homogeneous, items, whereas we use a tuple to store heterogeneous items describing an entity.

In [160]:
employees = [("Alice", 30, "female"), ("Bob", 25, "male"), ("Tom", 34, "male")]

In this example, <i>employees</i> is a list of tuples.

In [161]:
employees[0]

('Alice', 30, 'female')

In [162]:
employees[0][0]

'Alice'

In [163]:
employees[0][1]

30
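
Besides indexing, the fields of a tuple can be assigned to separate names in one step, which is often easier to read than employees[0][1]. The sketch below reuses the <i>employees</i> list; the variable names on the left-hand side are chosen for illustration.

```python
employees = [("Alice", 30, "female"), ("Bob", 25, "male"), ("Tom", 34, "male")]

# Unpack one tuple into three names.
name, age, gender = employees[0]
print(name, age, gender)     # Alice 30 female

# Unpack every tuple while looping over the list.
names = []
total_age = 0
for name, age, gender in employees:
    names.append(name)
    total_age += age

print(names, total_age)      # ['Alice', 'Bob', 'Tom'] 89
```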

## Exercises - Tuples

Manually create a list <i>family</i> of tuples, each of which contains the name of your family member and their age.

In [177]:
# Your answer here
family = [("Dave", 51), ("Misa", 48), ("Alex", 19)]

Get the first tuple, or member, in <i>family</i>.

In [178]:
# Your answer here
family[0]

('Dave', 51)

Get the name of the first member in <i>family</i>.

In [179]:
# Your answer here
family[0][0]

'Dave'

Get the age of the last member in <i>family</i>.

In [181]:
# Your answer here
family[-1][1]

19

## ▪ Dictionaries

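A dictionary is a collection of key-value mappings that is changeable and written with curly brackets. As of Python 3.7, a dictionary preserves the order in which its keys were inserted, but its items are accessed by key, not by position. If you look up a key in the dictionary, it returns its value, but not vice versa.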

In [176]:
d = {}
print(d, type(d))

{} <class 'dict'>


In [182]:
d = dict()
print(d, type(d))

{} <class 'dict'>


In [183]:
buildings = {"UCC": "University Capitol Center", "CPHB": "College of Public Health Building"}
print(buildings, type(buildings))

{'UCC': 'University Capitol Center', 'CPHB': 'College of Public Health Building'} <class 'dict'>


<i>buildings</i> is a dictionary for UI building name abbreviations. If you look up an abbreviation, it returns its full name.

In [184]:
buildings["UCC"]

'University Capitol Center'

If you look up a key in a dictionary, it returns the value mapped to that key, provided the key exists.

In [185]:
buildings["CPHB"]

'College of Public Health Building'

In [186]:
buildings["IMU"]

KeyError: 'IMU'

If the key is not in the dictionary, Python raises a KeyError.
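
When a missing key is an expected situation rather than a bug, there are two common ways to avoid the KeyError. The sketch below uses the dictionary <b>get</b> method and the <b>in</b> operator; the fallback text "unknown building" is an arbitrary choice for this example.

```python
buildings = {"UCC": "University Capitol Center",
             "CPHB": "College of Public Health Building"}

# get returns None for a missing key instead of raising KeyError.
missing = buildings.get("IMU")

# A second argument supplies a fallback value of your choice.
fallback = buildings.get("IMU", "unknown building")

# Alternatively, test for the key before looking it up.
if "UCC" in buildings:
    full_name = buildings["UCC"]
else:
    full_name = "unknown building"

print(missing, fallback, full_name)
```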

In [187]:
buildings.keys()

dict_keys(['UCC', 'CPHB'])

In [188]:
buildings.values()

dict_values(['University Capitol Center', 'College of Public Health Building'])

When designing a dictionary, think about what should be the key and what should be the value. It depends on the purpose of the dictionary.

In [189]:
buildings["IMU"] = "Iowa Memorial Union"
buildings

{'UCC': 'University Capitol Center',
 'CPHB': 'College of Public Health Building',
 'IMU': 'Iowa Memorial Union'}

In [190]:
"IMU" in buildings

True

You can check if a key is in a dictionary using the <b>in</b> operator.

In [191]:
"PBB" in buildings

False

In [192]:
len(buildings)

3

The length of a dictionary is the number of key-value pairs in the dictionary. 
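
To visit every key-value pair, the dictionary <b>items</b> method can be combined with a for loop. This is a brief sketch reusing the <i>buildings</i> dictionary; the list order shown relies on dictionaries preserving insertion order (Python 3.7 or later).

```python
buildings = {"UCC": "University Capitol Center",
             "CPHB": "College of Public Health Building",
             "IMU": "Iowa Memorial Union"}

# items() yields one (key, value) pair per entry.
for abbreviation, full_name in buildings.items():
    print(abbreviation, "\t", full_name)

# keys() and values() can be turned into ordinary lists when needed.
key_list = list(buildings.keys())
value_list = list(buildings.values())
```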

## Exercises - Dictionaries

Manually create a dictionary named <i>ages</i> with the names of your family members being the keys and their ages being the values.

In [195]:
# Your answer here
ages = {"Dave":51,"Misa":48,"Alex":19}


Check if <i>Spiderman</i> is in <i>ages</i>. The output must be either True or False. 

In [196]:
# Your answer here
"Spiderman" in ages


False

Add a new name-age pair ("Spiderman", 20) to <i>ages</i>.

In [197]:
# Your answer here
ages["Spiderman"]=20

Get the age of <i>Spiderman</i> using <i>ages</i>.

In [198]:
# Your answer here
ages["Spiderman"]

20

## ▪ Sets

A set is a collection that is unordered, unindexed, and written with curly brackets. It allows no duplicates. Dictionaries and sets are both written with curly brackets, but a set holds only keys, with no values attached to them. Because {} creates an empty dictionary, use set() to create an empty set.

In [199]:
s = set()
print(s, type(s))

set() <class 'set'>


In [200]:
s = {"cat", "dog", "bird"}
print(s, type(s))

{'cat', 'bird', 'dog'} <class 'set'>


In [201]:
s[0]

TypeError: 'set' object is not subscriptable

Sets are not indexed, which means you cannot access the elements using their index numbers.

In [202]:
"dog" in s

True

In [203]:
"cow" in s

False

### Set Methods

In [204]:
s.add("fish")
s

{'bird', 'cat', 'dog', 'fish'}

The <b>add</b> method adds an element to the set. There is no <b>append</b> method in sets.

In [205]:
s.add("fish")
s

{'bird', 'cat', 'dog', 'fish'}

Sets do not allow duplicate values. If the element to be added already exists, it does not add the element.

In [206]:
s.update({"elephant", "horse", "whale"})
s

{'bird', 'cat', 'dog', 'elephant', 'fish', 'horse', 'whale'}

The <b>update</b> method updates the current set by adding the items from another set (or any other iterable).

Note that the <b>add</b> method adds a single element to a set, while the <b>update</b> method adds a group of elements. 

In [207]:
s.remove("cat")
s

{'bird', 'dog', 'elephant', 'fish', 'horse', 'whale'}

The <b>remove</b> method removes the specified element from the set.
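
If the element might not be in the set, <b>remove</b> raises a KeyError. The set <b>discard</b> method does the same job but stays silent when the element is absent. A small sketch with an invented set:

```python
s = {"bird", "dog", "fish"}

# remove raises KeyError when the element is absent.
try:
    s.remove("cat")
    removed = True
except KeyError:
    removed = False

# discard does nothing when the element is absent.
s.discard("cat")
s.discard("fish")
print(sorted(s))             # ['bird', 'dog']
```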

In [208]:
s1 = {1, 2, 3, 4, 5}
s2 = {1, 3, 5, 7, 9}
s1.union(s2)

{1, 2, 3, 4, 5, 7, 9}

The <b>union</b> method returns a set that contains all items from both sets.

In [209]:
s1 = {1, 2, 3, 4, 5}
s2 = {1, 3, 5, 7, 9}
s1 | s2                         # vertical bar as a union operator

{1, 2, 3, 4, 5, 7, 9}

s1 | s2 is equivalent to s1.union(s2).

In [210]:
s1 = {1, 2, 3, 4, 5}
s2 = {1, 3, 5, 7, 9}
s1.intersection(s2)

{1, 3, 5}

The <b>intersection</b> method returns a set that contains only items that exist in both sets.

In [211]:
s1 = {1, 2, 3, 4, 5}
s2 = {1, 3, 5, 7, 9}
s1 & s2                         # ampersand as an intersection operator

{1, 3, 5}

s1 & s2 is equivalent to s1.intersection(s2).

In [214]:
s1 = {1, 2, 3, 4, 5}
s2 = {1, 3, 5, 7, 9}
s1.difference(s2)

{2, 4}

The <b>difference</b> method returns a set that contains the items that exist in the first set but not in the second.

In [215]:
s1 = {1, 2, 3, 4, 5}
s2 = {1, 3, 5, 7, 9}
s1 - s2                          # minus as a difference operator

{2, 4}

s1 - s2 is equivalent to s1.difference(s2).

In [216]:
s1.symmetric_difference(s2)

{2, 4, 7, 9}

The <b>symmetric_difference</b> method returns a set that contains the items that are in exactly one of the two sets, i.e., all items from both sets except those that are present in both.

In [217]:
s1 = {1, 2, 3, 4, 5}
s2 = {1, 3, 5, 7, 9}
s1 ^ s2                          # caret as a symmetric difference operator

{2, 4, 7, 9}

s1 ^ s2 is equivalent to s1.symmetric_difference(s2).
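
The four operators are related to one another. As a sketch, the symmetric difference can be rebuilt from the other three, which is a handy way to check your understanding of each:

```python
s1 = {1, 2, 3, 4, 5}
s2 = {1, 3, 5, 7, 9}

# Symmetric difference: in the union but not in the intersection.
via_union = (s1 | s2) - (s1 & s2)

# Equivalently: the two one-sided differences combined.
via_differences = (s1 - s2) | (s2 - s1)

print(via_union == via_differences == s1 ^ s2)    # True

# Difference depends on operand order; the other three operators do not.
left_only = s1 - s2
right_only = s2 - s1
```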

In [219]:
s1 = {1, 2, 3, 4, 5}
s2 = {1, 3, 5, 7, 9}
s3 = {2, 4, 6, 8, 10}

In [220]:
set.union(s1, s2, s3)

{1, 2, 3, 4, 5, 6, 7, 8, 9, 10}

You can perform set operations on more than two sets.

In [221]:
s1 | s2 | s3

{1, 2, 3, 4, 5, 6, 7, 8, 9, 10}

## Exercises - Sets

Manually create a set named <i>s1</i> of even numbers between 1 and 20.

In [222]:
# Your answer here
s1 = {2,4,6,8,10,12,14,16,18,20}

Manually create a set named <i>s2</i> of multiples of 3 between 1 and 20.

In [223]:
# Your answer here
s2 = {3,6,9,12,15,18}

Get the union of <i>s1</i> and <i>s2</i>.

In [224]:
# Your answer here
s1 | s2

{2, 3, 4, 6, 8, 9, 10, 12, 14, 15, 16, 18, 20}

Get the intersection of <i>s1</i> and <i>s2</i>.

In [225]:
# Your answer here
s1 & s2

{6, 12, 18}

## ▪ Built-in Functions

In [226]:
abs(-1)

1

The <b>abs</b> function returns the absolute value of a number.

In [229]:
l = [True, True, True]
all(l)

True

The <b>all</b> function returns True if all items in an iterable object are true. 

In [230]:
l = [True, True, False]
all(l)

False

In [231]:
l = [False, False, True]
any(l)

True

The <b>any</b> function returns True if any item in an iterable object is true.

In [232]:
l = [False, False, False]
any(l)

False
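
In practice, <b>all</b> and <b>any</b> are usually applied to the results of a comparison rather than to literal True/False values. The sketch below uses hypothetical exam scores; the second call passes the comparisons directly inside the parentheses, which is a compact alternative to building the list first.

```python
scores = [72, 88, 95, 61]    # hypothetical exam scores

# Build a list of True/False values, then summarize it.
passed = []
for score in scores:
    passed.append(score >= 60)

everyone_passed = all(passed)
anyone_above_90 = any(score > 90 for score in scores)

print(everyone_passed, anyone_above_90)   # True True

# Edge case: with no items at all, all() is True and any() is False.
print(all([]), any([]))      # True False
```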

In [233]:
s = "I'm learning Python data analytics"
dir(s)

['__add__',
 '__class__',
 '__contains__',
 '__delattr__',
 '__dir__',
 '__doc__',
 '__eq__',
 '__format__',
 '__ge__',
 '__getattribute__',
 '__getitem__',
 '__getnewargs__',
 '__gt__',
 '__hash__',
 '__init__',
 '__init_subclass__',
 '__iter__',
 '__le__',
 '__len__',
 '__lt__',
 '__mod__',
 '__mul__',
 '__ne__',
 '__new__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__rmod__',
 '__rmul__',
 '__setattr__',
 '__sizeof__',
 '__str__',
 '__subclasshook__',
 'capitalize',
 'casefold',
 'center',
 'count',
 'encode',
 'endswith',
 'expandtabs',
 'find',
 'format',
 'format_map',
 'index',
 'isalnum',
 'isalpha',
 'isascii',
 'isdecimal',
 'isdigit',
 'isidentifier',
 'islower',
 'isnumeric',
 'isprintable',
 'isspace',
 'istitle',
 'isupper',
 'join',
 'ljust',
 'lower',
 'lstrip',
 'maketrans',
 'partition',
 'replace',
 'rfind',
 'rindex',
 'rjust',
 'rpartition',
 'rsplit',
 'rstrip',
 'split',
 'splitlines',
 'startswith',
 'strip',
 'swapcase',
 'title',
 'translate',
 'upper',
 'zfill']


The <b>dir</b> function returns a list of the names of the specified object's attributes and methods.

In [234]:
help(print)

Help on built-in function print in module builtins:

print(...)
    print(value, ..., sep=' ', end='\n', file=sys.stdout, flush=False)
    
    Prints the values to a stream, or to sys.stdout by default.
    Optional keyword arguments:
    file:  a file-like object (stream); defaults to the current sys.stdout.
    sep:   string inserted between values, default a space.
    end:   string appended after the last value, default a newline.
    flush: whether to forcibly flush the stream.



The <b>help</b> function invokes the built-in help system and displays the documentation of the specified object.

In [235]:
len(s)

34

The <b>len</b> function returns the length of an object.

In [236]:
l = [1, 2, 3, 4, 5]
max(l)

5

The <b>max</b> function returns the largest item in an iterable.

In [237]:
min(l)

1

The <b>min</b> function returns the smallest item in an iterable.

In [243]:
pow(10, 2)

100

The <b>pow</b>(x, y) function returns the value of `x` to the power of `y`.

In [239]:
s = "I'm learning Python data analytics."
print(s)

I'm learning Python data analytics.


The <b>print</b> function prints the specified message to the screen or other standard output device.

In [240]:
range(0, 10)

range(0, 10)

The <b>range</b>(start, stop) function returns a sequence of numbers that starts at `start`, increments by 1, and ends at `stop` - 1. Strictly speaking, range is not a function but an unchangeable (immutable) sequence type.

In [241]:
list(range(0, 10))

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

In [244]:
list(range(10))

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

You can skip `start` if it is 0.

In [245]:
list(range(0, 10, 2))

[0, 2, 4, 6, 8]

The <b>range</b>(start, stop, step) function increments by `step`. 
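
A few more properties of range follow from it being a sequence type. The sketch below shows a negative step, plus len, indexing, and the in operator applied to a range directly, without converting it to a list.

```python
countdown = list(range(10, 0, -2))     # a negative step counts down
print(countdown)                       # [10, 8, 6, 4, 2]

r = range(0, 10, 2)
print(len(r), r[1], 6 in r, 7 in r)    # 5 2 True False

# An empty range results when start is not below stop (for a positive step).
empty = list(range(5, 5))
print(empty)                           # []
```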

In [246]:
l = [1, 5, 3, 2, 4]
reversed(l)

<list_reverseiterator at 0x23ab8a9a080>

The <b>reversed</b> function returns a reversed iterator.

In [247]:
list(reversed(l))

[4, 2, 3, 5, 1]

In [248]:
round(3.141592, 2)            # Round 3.141592 to 2 decimal places

3.14

The <b>round</b>(number, ndigits) function returns `number` rounded to `ndigits` digits after the decimal point.

In [249]:
round(3.141592)

3

If `ndigits` is omitted, it returns the nearest integer to its input. 
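
One behavior of round that surprises many people: a value exactly halfway between two integers is rounded to the nearest even integer, not always upward. The sketch below shows this along with two related cases.

```python
# Ties go to the nearest even integer.
print(round(0.5), round(1.5), round(2.5), round(3.5))   # 0 2 2 4

# Negative ndigits rounds to tens, hundreds, and so on.
print(round(1234, -2))       # 1200

# Most decimal fractions are not stored exactly as floats, so a value
# that looks like a tie may round in an unexpected direction.
print(round(2.675, 2))       # 2.67
```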

In [250]:
l = [1, 5, 3, 2, 4]
sorted(l)

[1, 2, 3, 4, 5]

The <b>sorted</b> function returns a sorted list.

In [251]:
sorted(l, reverse=True)

[5, 4, 3, 2, 1]

In [252]:
sum(l)

15

The <b>sum</b> function sums the items of an iterable.

In [253]:
type(l)

list

The <b>type</b> function returns the type of an object.

In [254]:
l1 = ["Alice", "Bob", "Tom"]
l2 = [30, 25, 34]
zip(l1, l2)

<zip at 0x23ab8aaa5c8>

The <b>zip</b> function returns an iterator that aggregates elements from each of the iterables. 

In [255]:
list(zip(l1, l2))

[('Alice', 30), ('Bob', 25), ('Tom', 34)]
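
Two details of zip are worth seeing in code: its pairs can be passed straight to <b>dict</b> to build a dictionary, and the zip object itself can be consumed only once. A minimal sketch reusing the two lists above:

```python
l1 = ["Alice", "Bob", "Tom"]
l2 = [30, 25, 34]

# Pairs from zip can be fed straight into dict().
ages = dict(zip(l1, l2))
print(ages)                  # {'Alice': 30, 'Bob': 25, 'Tom': 34}

# zip stops at the end of the shortest input.
short = list(zip(l1, [30, 25]))
print(short)                 # [('Alice', 30), ('Bob', 25)]

# A zip object is an iterator: it can be consumed only once.
z = zip(l1, l2)
first_pass = list(z)
second_pass = list(z)        # already exhausted, so this is empty
```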

### Difference between Functions and Methods

- A function looks like this: <b>function_name(something)</b>, e.g., sorted(l1)
- A method looks like this: <b>something.method_name()</b>, e.g., l1.sort()

A method always belongs to an object (e.g. string methods only work for string objects, list methods only work for list objects, etc.), while a function doesn’t necessarily (e.g., you can use the <b>len()</b> function for nearly any data type).

## Exercises - Built-in Functions

Check if all items in <i>l</i> are True. The output must be either True or False. 

In [256]:
l = [True, True, True, True, True, True, True, True, True, True, False, True, True, True, True]

In [257]:
# Your answer here
all(l)

False

Check if any item in <i>l</i> is True. The output must be either True or False. 

In [258]:
l = [False, False, False, False, False, False, False, False, False, False, False, False, False, False, False]

In [259]:
# Your answer here
any(l)

False

Get the min value in <i>l</i>.

In [261]:
l = [5, 13, 76, 9, 93, 28, 49, 76, 9, 88]

In [262]:
# Your answer here
min(l)

5

Get the max value in <i>l</i>.

In [263]:
# Your answer here
max(l)

93

Get the sum of all values in <i>l</i>.

In [264]:
# Your answer here
sum(l)

446

Generate a list of all two-digit integers.

In [272]:
# Your answer here
list(range(10, 100))

[10,
 11,
 12,
 13,
 14,
 15,
 16,
 17,
 18,
 19,
 20,
 21,
 22,
 23,
 24,
 25,
 26,
 27,
 28,
 29,
 30,
 31,
 32,
 33,
 34,
 35,
 36,
 37,
 38,
 39,
 40,
 41,
 42,
 43,
 44,
 45,
 46,
 47,
 48,
 49,
 50,
 51,
 52,
 53,
 54,
 55,
 56,
 57,
 58,
 59,
 60,
 61,
 62,
 63,
 64,
 65,
 66,
 67,
 68,
 69,
 70,
 71,
 72,
 73,
 74,
 75,
 76,
 77,
 78,
 79,
 80,
 81,
 82,
 83,
 84,
 85,
 86,
 87,
 88,
 89,
 90,
 91,
 92,
 93,
 94,
 95,
 96,
 97,
 98,
 99]

Generate a list of all two-digit even integers.

In [270]:
# Your answer here
list(range(10, 100,2))

[10,
 12,
 14,
 16,
 18,
 20,
 22,
 24,
 26,
 28,
 30,
 32,
 34,
 36,
 38,
 40,
 42,
 44,
 46,
 48,
 50,
 52,
 54,
 56,
 58,
 60,
 62,
 64,
 66,
 68,
 70,
 72,
 74,
 76,
 78,
 80,
 82,
 84,
 86,
 88,
 90,
 92,
 94,
 96,
 98]

Get the value of 2 to the power of 10.

In [265]:
# Your answer here
pow(2,10)

1024

Round 0.333333333333 to 2 decimal places.

In [266]:
# Your answer here
round(0.333333333333,2)

0.33

### Built-in Functions for Type Conversion

In [273]:
a = 1.0
print(a, type(a))

1.0 <class 'float'>


In [274]:
b = int(a)
print(b, type(b))

1 <class 'int'>


The <b>int</b> function converts the specified value into an integer number.

In [275]:
c = str(b)
print(c, type(c))

1 <class 'str'>


The <b>str</b> function converts the specified value into a string.

In [276]:
d = float(c)
print(d, type(d))

1.0 <class 'float'>


The <b>float</b> function converts the specified value into a floating point number.
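
Conversions do not always succeed, and some lose information. The sketch below collects a few cases worth knowing; the variable names are made up for this example.

```python
print(int(3.99), int(-3.99))     # 3 -3  (truncates toward zero)
print(int("42") + 1)             # 43
print(float("3.14"))             # 3.14

# A string holding a decimal number cannot go straight to int.
try:
    int("3.14")
    converted = True
except ValueError:
    converted = False

via_float = int(float("3.14"))   # convert in two steps instead

# str is needed before joining a number to text with +.
label = "Age: " + str(30)
print(converted, via_float, label)
```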

In [277]:
l = [1, 2, 3, 2, 1]
print(l, type(l))

[1, 2, 3, 2, 1] <class 'list'>


In [278]:
s = set(l)
print(s, type(s))

{1, 2, 3} <class 'set'>


The <b>set</b> function creates a set object.

There are cases where you need to convert a list to a set so that you can use some characteristics of sets. 
The easiest way to find the unique items in a list is to convert it to a set with set().

In [279]:
l2 = list(s)
print(l2, type(l2))

[1, 2, 3] <class 'list'>


The <b>list</b> function creates a list object.
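
Since a set is unordered, converting a list to a set and back does not promise any particular order. Here is a sketch of two ways to get the unique items in a predictable order; the second relies on dictionaries keeping insertion order (Python 3.7 or later).

```python
l = [3, 1, 3, 2, 1]

unique_sorted = sorted(set(l))            # order comes from sorting
print(unique_sorted)                      # [1, 2, 3]

# Dictionary keys are unique and keep insertion order (Python 3.7+),
# so this keeps each item at its first position.
unique_in_order = list(dict.fromkeys(l))
print(unique_in_order)                    # [3, 1, 2]

# Comparing lengths tells you how many duplicates there were.
number_of_duplicates = len(l) - len(set(l))
```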

## ▪ Operators

In [280]:
(1 + 2) * 3

9

Python supports the standard arithmetic operators, and parentheses control the order of evaluation.

### Powers

In [281]:
2 ** 10      # 2 to the power of 10

1024

2 \*\* 10 is equivalent to pow(2, 10).

### Division

In [282]:
5 / 2    # a single slash for true division

2.5

In [283]:
5 // 2   # double slashes for floor division

2

In [284]:
5 % 2    # a percent sign for the remainder (modulo)

1
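
Floor division and the remainder always fit together: multiplying the quotient by the divisor and adding the remainder gives back the original number. The sketch below checks this and shows how negative numbers behave.

```python
a, b = 17, 5
quotient = a // b            # 3
remainder = a % b            # 2
print(quotient * b + remainder == a)     # True

# divmod returns both results at once.
print(divmod(a, b))          # (3, 2)

# Floor division rounds down (toward negative infinity), not toward zero.
print(-5 // 2, -5 % 2)       # -3 1

# A common use of %: testing whether a number is even.
is_even = 10 % 2 == 0
```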

### Negation

In [285]:
x = 1
-x

-1

To negate a variable, just put - before the variable.

### Comparisons

In [286]:
a = 1
b = 3

In [287]:
a == b

False

Do not confuse the <b>==</b> (equality) operator with the <b>=</b> (assignment) operator. 

In [288]:
a != b

True

In [289]:
a > b

False

### Augmented Assignment

In [290]:
x = 2
x += 1
x

3

x += 1 is equivalent to x = x + 1. This is the easiest way to add something to a variable.

In [291]:
x = 2
x -= 1
x

1

x -= 1 is equivalent to x = x - 1. This is the easiest way to subtract something from a variable.

In [292]:
x = 3
x *= 2
x

6

x *= 2 is equivalent to x = x * 2. This is the easiest way to multiply a variable by something.

In [293]:
x = 4
x /= 2
x

2.0

x /= 2 is equivalent to x = x / 2. This is the easiest way to divide a variable by something. Note that the result of true division is a float.

In [294]:
x = 4
x **= 2
x

16

x \*\*= 2 is equivalent to x = x \*\* 2.

### Boolean Operations

In [296]:
p = True
q = False

There are two Boolean values in Python: True and False. 

In [297]:
p and q

False

In [298]:
p or q

True

In [299]:
not p

False
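
As a quick reference, the sketch below prints the full truth table for the three operators and then combines two comparisons, which is how Boolean operators are most often used. The variable x and its range are arbitrary.

```python
# Columns: p, q, p and q, p or q, not p
for p in (True, False):
    for q in (True, False):
        print(p, q, p and q, p or q, not p)

# Comparisons produce Boolean values, so they combine naturally.
x = 7
in_range = x > 0 and x < 10
out_of_range = not in_range
print(in_range, out_of_range)    # True False
```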

## ▪ Del

The <b>del</b> keyword deletes a name (such as a variable) or an item of a changeable collection. In Python, everything is an object, so any name can be deleted this way.

In [300]:
s = "hello"
s

'hello'

In [301]:
del s

In [302]:
s

NameError: name 's' is not defined
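
Strictly speaking, del removes the name, not necessarily the underlying object. If another name still refers to the same object, that object remains available. A small sketch with invented names:

```python
a = [1, 2, 3]
b = a                        # a second name for the same list
del a                        # removes the name a only

print(b)                     # [1, 2, 3]

# The name a no longer exists, so using it raises NameError.
try:
    a
    a_exists = True
except NameError:
    a_exists = False

print(a_exists)              # False
```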

In [304]:
l = [1, 2, 3, 4, 5]
del l[0]

You can also delete an element in a collection that is changeable. 

In [305]:
l

[2, 3, 4, 5]

In [306]:
buildings = {"UCC": "University Capitol Center",
             "CPHB": "College of Public Health Building",
             "IMU": "Iowa Memorial Union"}
buildings

{'UCC': 'University Capitol Center',
 'CPHB': 'College of Public Health Building',
 'IMU': 'Iowa Memorial Union'}

In [307]:
del buildings["CPHB"], buildings["IMU"]

In [308]:
buildings

{'UCC': 'University Capitol Center'}