<center><font size = 5><b>Module 04: Python Data Types</b></font></center>

This module is primarily based on the web site: https://www.programiz.com/python-programming/operators

This module introduces Python data types in depth. The following tree diagram lists the data types in Python.

<img src="https://github.com/pengdsci/PythonCrashCourse/raw/main/image/Python-data-types.png" width="400"  height="300" alt="Python data types" />


## 0. Python Modules


We will use some pre-written Python functions that reside in different code **libraries**, customarily called **modules**. We now briefly describe Python modules.

A Python module is a code library: a file containing a set of functions (and other definitions) that we want to include in our Python code.

**Example**: We define the following Python function and save it as 'mymodule.py' in a folder. 

```
def greeting(name):
  print("Hello, " + name)
```

`mymodule.py` is called a `Python module`. We can call this function by importing the module `mymodule` (the file name without the `.py` extension) and referring to the specific function in it, as shown in the following code:

```
import mymodule

mymodule.greeting("Jonathan")
```
The output of the above code is:

`Hello, Jonathan`

We will use some functions in a few numerical and mathematical modules in the subsequent sections. The details of these modules and functions can be found at: https://docs.python.org/3/library/numeric.html
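Besides the plain `import` statement shown above, Python offers two other common import forms. The following is a minimal sketch using the standard `math` module; the alias `m` and the variable names are arbitrary choices made for illustration.

```python
# Three common ways to import from a module (here the standard math module).
import math                 # access names as math.<name>
import math as m            # same module under a shorter alias (alias chosen for illustration)
from math import sqrt, pi   # bring selected names directly into scope

root_full = math.sqrt(16)
root_alias = m.sqrt(16)
root_direct = sqrt(16)

print(root_full, root_alias, root_direct)
print(pi == math.pi)
```

All three forms refer to the same underlying function, so the choice is mostly a matter of readability.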

## 1. Numbers

Python supports `integers`, `floating-point numbers` and `complex numbers`. They are defined as `int`, `float`, and `complex` classes in Python.

Integers and floating points are separated by the presence or absence of a decimal point. For example, `5` is an integer whereas `5.0` is a floating-point number. Complex numbers are written in the form, `x + yj`, where `x` is the real part and `y` is the imaginary part.

We can use the `type() function` to know which class a variable or a value belongs to and `isinstance()` function to check if it belongs to a particular class.

In [1]:
a = 5

print(type(a))

print(type(5.0))

c = 5 + 3j
print(c + 3)

print(isinstance(c, complex))

<class 'int'>
<class 'float'>
(8+3j)
True


### 1.1. Handling Numbers with Python Functions

There are many functions available in the various numerical and mathematical modules mentioned earlier. Here we use only a few examples to illustrate how to call functions in different modules.

**Example 1**: module `math`

In [2]:
import math

print(math.pi)
print(math.cos(math.pi))
print(math.exp(10))
print(math.log10(1000))
print(math.sinh(1))
print(math.factorial(6))

3.141592653589793
-1.0
22026.465794806718
3.0
1.1752011936438014
720


**Example 2**: module `random`

In [3]:
import random

print(random.randrange(10, 20))

x = ['a', 'b', 'c', 'd', 'e']
# Get random choice
print(random.choice(x))
# Shuffle x
random.shuffle(x)
# Print the shuffled x
print(x)
# Print random element
print(random.random())

19
a
['e', 'a', 'd', 'b', 'c']
0.3730447502719496
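The values printed above change from run to run. When reproducible results are needed, the generator can be seeded. The sketch below assumes an arbitrary seed of 42 and hypothetical variable names; it shows that two generators with the same seed produce the same sequence.

```python
import random

# Two independent generators created with the same seed (42 is arbitrary)
rng_a = random.Random(42)
rng_b = random.Random(42)

draws_a = [rng_a.randrange(10, 20) for _ in range(5)]
draws_b = [rng_b.randrange(10, 20) for _ in range(5)]

print(draws_a == draws_b)   # the two sequences match

# random.seed() does the same for the module-level functions
random.seed(42)
first = random.random()
random.seed(42)
second = random.random()
print(first == second)
```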


**Example 3**: module `fractions`

In [5]:
import fractions as F

print(F.Fraction(1.5))
print(F.Fraction(5))
print(F.Fraction(1,3))

3/2
5
1/3


**Example 4**: module `decimal`

In [6]:
import decimal

print(0.1)
print(decimal.Decimal(0.1))

0.1
0.1000000000000000055511151231257827021181583404541015625
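The long output above appears because the float `0.1` is stored in binary and is only an approximation of one tenth. As a small sketch of how this can be avoided, the example below builds `Decimal` values from strings and compares the result with ordinary float arithmetic and with `Fraction`; the variable names are illustrative.

```python
from decimal import Decimal
from fractions import Fraction

# Binary floats cannot represent 0.1 exactly, so the sum drifts slightly
float_sum = 0.1 + 0.2
print(float_sum == 0.3)                       # False

# Decimal built from strings keeps the decimal digits exactly
decimal_sum = Decimal("0.1") + Decimal("0.2")
print(decimal_sum == Decimal("0.3"))          # True

# Fraction keeps exact rational values
fraction_sum = Fraction(1, 10) + Fraction(2, 10)
print(fraction_sum)                           # 3/10
```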


## 2. Python List

 A `list` is created by placing all the items (elements) inside square brackets `[]`, separated by commas.

### 2.1. Definition of List

**Example**

In [7]:
# empty list
my_list = []

# list of integers
my_list = [1, 2, 3]

# list with mixed data types
my_list = [1, "Hello", 3.4]

# nested list
my_list = ["mouse", [8, 4, 6], ['a']]

### 2.2. Accessing A List

We can use the index operator [] to access an item in a list. Note that

1. indices start at 0. So, a list having 5 elements will have an index from 0 to 4.

2. The index must be an integer. We can't use float or other types.

3. Nested lists are accessed using nested indexing.

**Example**: The following example illustrates the above points.

In [8]:
# List indexing

my_list = ['p', 'r', 'o', 'b', 'e']

# Output: p
print(my_list[0])

# Output: o
print(my_list[2])

# Output: e
print(my_list[4])

# Nested List
n_list = ["Happy", [2, 0, 1, 5]]

# Nested indexing
print(n_list[0][1])

print(n_list[1][3])

# Error! Only integer can be used for indexing
print(my_list[4.0])

p
o
e
a
5


TypeError: list indices must be integers or slices, not float

**Negative Index**

Python allows negative indexing for its sequences. The index of -1 refers to the last item, -2 to the second last item and so on.

**Example**

In [10]:
# Negative indexing in lists
my_list = ['p','r','o','b','e']

print(my_list[-1])

print(my_list[-5])

e
p


### 2.3. Slice Lists in Python

We can access a range of items in a list by using the slicing operator :(colon).

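In [11]:
# List slicing in Python
my_list = ['p','r','o','g','r','a','m','i','z']

# elements 3rd to 5th
print(my_list[2:5])

# elements beginning to 4th
print(my_list[:-5])

# elements 6th to end
print(my_list[5:])

# elements beginning to end
print(my_list[:])

['o', 'g', 'r']
['p', 'r', 'o', 'g']
['a', 'm', 'i', 'z']
['p', 'r', 'o', 'g', 'r', 'a', 'm', 'i', 'z']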
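A slice has the general form `start:stop:step`, where the item at `stop` is excluded and any of the three parts may be omitted. The following sketch, using a hypothetical list named `letters`, shows a few common patterns.

```python
letters = ['p', 'r', 'o', 'g', 'r', 'a', 'm']

first_three = letters[:3]      # indices 0, 1, 2
every_other = letters[::2]     # step of 2 starting at index 0
reversed_copy = letters[::-1]  # negative step walks backwards
shallow_copy = letters[:]      # a new list with the same items

print(first_three)
print(every_other)
print(reversed_copy)
print(shallow_copy is letters)  # False: slicing creates a new list
```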


### 2.4. Modify Lists

**Example 1**: use the assignment operator = to change an item or a range of items.

In [12]:
# Correcting mistake values in a list
odd = [2, 4, 6, 8]

# change the 1st item    
odd[0] = 1            

print(odd)

# change 2nd to 4th items
odd[1:4] = [3, 5, 7]  

print(odd) 

[1, 4, 6, 8]
[1, 3, 5, 7]


**Example 2**: Add one item to a list using the `append()` method or add several items using `extend()` method.

In [13]:
# Appending and Extending lists in Python
odd = [1, 3, 5]

odd.append(7)

print(odd)

odd.extend([9, 11, 13])

print(odd)

[1, 3, 5, 7]
[1, 3, 5, 7, 9, 11, 13]


**Example 3**: Concatenation using `+ operator`, replication using `* operator`, and insertion using the `insert()` method.

In [15]:
# Concatenating and repeating lists
odd = [1, 3, 5]

print(odd + [9, 7, 5])
print(["re"] * 3)


# Demonstration of list insert() method
even = [4, 8]
even.insert(1,3)

print(even)
even[2:2] = [5, 7]

print(even)



[1, 3, 5, 9, 7, 5]
['re', 're', 're']
[4, 3, 8]
[4, 3, 5, 7, 8]


### 2.5. Delete/Remove List Elements

We can delete one or more items from a list using the keyword `del`. It can even delete the list entirely.

**Example 1**: `Keyword del`

In [16]:
# Deleting list items
my_list = ['p', 'r', 'o', 'b', 'l', 'e', 'm']

# delete one item
del my_list[2]

print(my_list)

# delete multiple items
del my_list[1:5]

print(my_list)

# delete entire list
del my_list

# Error: List not defined
print(my_list)

['p', 'r', 'b', 'l', 'e', 'm']
['p', 'm']


NameError: name 'my_list' is not defined

We can use `remove()` method to remove the given item or `pop()` method to remove an item at the given index. 

The `pop()` method removes and returns the last item if the index is not provided. This helps us implement lists as stacks (first in, last out data structure).

We can also use the clear() method to empty a list.

**Example 2**: Methods `remove()`, `pop()`, and `clear()` 

In [17]:
my_list = ['p','r','o','b','l','e','m']
my_list.remove('p')

# Output: ['r', 'o', 'b', 'l', 'e', 'm']
print(my_list)

# Output: 'o'
print(my_list.pop(1))

# Output: ['r', 'b', 'l', 'e', 'm']
print(my_list)

# Output: 'm'
print(my_list.pop())

# Output: ['r', 'b', 'l', 'e']
print(my_list)

my_list.clear()

# Output: []
print(my_list)

['r', 'o', 'b', 'l', 'e', 'm']
o
['r', 'b', 'l', 'e', 'm']
m
['r', 'b', 'l', 'e']
[]
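The stack behaviour mentioned above follows directly from pairing `append()` with `pop()`. Here is a minimal sketch with a hypothetical list named `stack`:

```python
stack = []

# push items onto the top of the stack
stack.append('a')
stack.append('b')
stack.append('c')

# pop items off the top: the last item pushed comes out first
popped = [stack.pop(), stack.pop(), stack.pop()]

print(popped)   # ['c', 'b', 'a']
print(stack)    # []
```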


### 2.6. Summary of List Methods

Here is a summary of methods for manipulating lists:

* `append() ` - Add an element to the end of the list
* `extend() ` - Add all elements of a list to another list
* `insert() ` - Insert an item at the defined index
* `remove() ` - Removes an item from the list
* `pop()    ` - Removes and returns an element at the given index
* `clear()  ` - Removes all items from the list
* `index()  ` - Returns the index of the first matched item
* `count()  ` - Returns the count of the number of items passed as an argument
* `sort()   ` - Sort items in a list in ascending order
* `reverse()` - Reverse the order of items in the list
* `copy()   ` - Returns a shallow copy of the list




**Example**

In [18]:
# Python list methods
my_list = [3, 8, 1, 6, 0, 8, 4]

# Output: 1
print(my_list.index(8))

# Output: 2
print(my_list.count(8))

my_list.sort()

# Output: [0, 1, 3, 4, 6, 8, 8]
print(my_list)

my_list.reverse()

# Output: [8, 8, 6, 4, 3, 1, 0]
print(my_list)

1
2
[0, 1, 3, 4, 6, 8, 8]
[8, 8, 6, 4, 3, 1, 0]
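Two points from the summary are easy to misread: `copy()` creates a new list rather than a second name for the same list, and `sort()` changes the list in place. The sketch below illustrates both, with the built-in `sorted()` shown for contrast; the variable names are illustrative.

```python
original = [3, 1, 2]

alias = original             # second name for the same list object
duplicate = original.copy()  # new list with the same items

original.append(0)
print(alias)       # [3, 1, 2, 0]  the alias sees the change
print(duplicate)   # [3, 1, 2]     the copy does not

# sort() modifies the list in place; sorted() returns a new list
descending = sorted(original, reverse=True)
print(descending)  # [3, 2, 1, 0]
print(original)    # [3, 1, 2, 0]  unchanged by sorted()
```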


### 2.7. Create Patterned List

List comprehension is an elegant and concise way to create a new list from an existing list or other iterable in Python. A list comprehension consists of an expression followed by a `for` clause inside square brackets, optionally followed by further `for` or `if` clauses.

**Example**: make a list with each item being increasing power of 2.

In [19]:
pow2 = [2 ** x for x in range(10)]
print(pow2)

[1, 2, 4, 8, 16, 32, 64, 128, 256, 512]


**More Examples**

In [20]:
odd = [x for x in range(20) if x % 2 == 1]
print(odd)

[x+y for x in ['Python ','C '] for y in ['Language','Programming']]


[1, 3, 5, 7, 9, 11, 13, 15, 17, 19]


['Python Language', 'Python Programming', 'C Language', 'C Programming']
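A list comprehension can always be read as a compact form of an ordinary loop. The following sketch, with illustrative variable names, builds the same list both ways so the correspondence is visible.

```python
# List comprehension with a condition
squares_of_even = [x ** 2 for x in range(10) if x % 2 == 0]

# The same result built with an explicit for loop
squares_loop = []
for x in range(10):
    if x % 2 == 0:
        squares_loop.append(x ** 2)

print(squares_of_even)                  # [0, 4, 16, 36, 64]
print(squares_of_even == squares_loop)  # True
```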

## 3. Tuples

A tuple in Python is similar to a list. The difference between the two is that we cannot change the elements of a tuple once it is assigned whereas we can change the elements of a list.

A tuple is created by placing all the items (elements) inside parentheses (), separated by commas. The parentheses are optional; however, it is good practice to use them.

**Example**: Definition of tuple.

In [1]:
# Different types of tuples

# Empty tuple
my_tuple = ()
print(my_tuple)

# Tuple having integers
my_tuple = (1, 2, 3)
print(my_tuple)

# tuple with mixed datatypes
my_tuple = (1, "Hello", 3.4)
print(my_tuple)

# nested tuple
my_tuple = ("mouse", [8, 4, 6], (1, 2, 3))
print(my_tuple)

()
(1, 2, 3)
(1, 'Hello', 3.4)
('mouse', [8, 4, 6], (1, 2, 3))
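Since the commas, not the parentheses, are what define a tuple, a tuple with a single item needs a trailing comma. The sketch below shows this together with tuple packing and unpacking; the variable names are illustrative.

```python
# Parentheses alone do not make a tuple; the trailing comma does
not_a_tuple = ("hello")
one_item_tuple = ("hello",)
print(type(not_a_tuple))      # <class 'str'>
print(type(one_item_tuple))   # <class 'tuple'>

# Tuple packing: parentheses are optional
packed = 3, 4.6, "dog"

# Tuple unpacking: one name per element
a, b, c = packed
print(a, b, c)
```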


### 3.1. Methods for Accessing Tuple Elements

**Indexing**

We can use the `index operator []` to access an item in a tuple, where `the index starts from 0`.

**Example 1**: access with index.

In [2]:
# Accessing tuple elements using indexing
my_tuple = ('p','e','r','m','i','t')

print(my_tuple[0])   # 'p' 
print(my_tuple[5])   # 't'

# IndexError: tuple index out of range
# print(my_tuple[6])

# Index must be an integer
# TypeError: tuple indices must be integers or slices, not float
# my_tuple[2.0]

# nested tuple
n_tuple = ("mouse", [8, 4, 6], (1, 2, 3))

# nested index
print(n_tuple[0][3])       # 's'
print(n_tuple[1][1])       # 4

p
t
s
4


**Negative Indexing**:


Python allows negative indexing for its sequences.  The index of -1 refers to the last item, -2 to the second last item and so on.

**Example 2**: negative index

In [3]:
# Negative indexing for accessing tuple elements
my_tuple = ('p', 'e', 'r', 'm', 'i', 't')

# Output: 't'
print(my_tuple[-1])

# Output: 'p'
print(my_tuple[-6])

t
p


### 3.2. Manipulating Tuples

**Slicing** 

We can access a range of items in a tuple by using the slicing operator colon :
    
**Example 1**: slicing a tuple.

In [4]:
# Accessing tuple elements using slicing
my_tuple = ('p','r','o','g','r','a','m','i','z')

# elements 2nd to 4th
# Output: ('r', 'o', 'g')
print(my_tuple[1:4])

# elements beginning to 2nd
# Output: ('p', 'r')
print(my_tuple[:-7])

# elements 8th to end
# Output: ('i', 'z')
print(my_tuple[7:])

# elements beginning to end
# Output: ('p', 'r', 'o', 'g', 'r', 'a', 'm', 'i', 'z')
print(my_tuple[:])

('r', 'o', 'g')
('p', 'r')
('i', 'z')
('p', 'r', 'o', 'g', 'r', 'a', 'm', 'i', 'z')


**Change Tuple**

Unlike lists, tuples are immutable. This means that elements of a tuple cannot be changed once they have been assigned. However, if the element is itself a mutable data type like a list, <font color = "red">its nested items can be changed </font>. We can also assign a tuple to different values (reassignment).

In [5]:
# Changing tuple values
my_tuple = (4, 2, 3, [6, 5])     # having a nested list
my_tuple01 = (4, 2, 3, (6, 5))   # having a nested tuple

# TypeError: 'tuple' object does not support item assignment
# my_tuple[1] = 9

# However, item of mutable element can be changed
my_tuple[3][0] = 9    # Output: (4, 2, 3, [9, 5])
print(my_tuple)

my_tuple01[3][0] = 10    # will generate an error since the nested item is a tuple!
print(my_tuple01)


# Tuples can be reassigned
my_tuple = ('p', 'r', 'o', 'g', 'r', 'a', 'm', 'i', 'z')

# Output: ('p', 'r', 'o', 'g', 'r', 'a', 'm', 'i', 'z')
print(my_tuple)

(4, 2, 3, [9, 5])


TypeError: 'tuple' object does not support item assignment

**Derive a New Tuple from Existing Ones**

We can use the `+ operator` to combine two tuples. This is called **concatenation**. We can also repeat the elements in a tuple a given number of times using the `* operator`. Both operations produce a new tuple. Although individual items cannot be deleted, a tuple can be deleted entirely using the `keyword del`.

In [6]:
# Concatenation
# Output: (1, 2, 3, 4, 5, 6)
print((1, 2, 3) + (4, 5, 6))

# Repeat
# Output: ('Repeat', 'Repeat', 'Repeat')
print(("Repeat",) * 3)

# Deleting tuples
my_tuple = ('p', 'r', 'o', 'g', 'r', 'a', 'm', 'i', 'z')

# can't delete items
# TypeError: 'tuple' object doesn't support item deletion
# del my_tuple[3]

# Can delete an entire tuple
del my_tuple

# NameError: name 'my_tuple' is not defined
print(my_tuple)

(1, 2, 3, 4, 5, 6)
('Repeat', 'Repeat', 'Repeat')


NameError: name 'my_tuple' is not defined

### 3.3. Other Tuple Operations

**Tuple Membership Test** using `keyword in`

In [7]:
# Membership test in tuple
my_tuple = ('a', 'p', 'p', 'l', 'e',)

# In operation
print('a' in my_tuple)
print('b' in my_tuple)

# Not in operation
print('g' not in my_tuple)

True
False
True


**Iterating Through a Tuple**

In [8]:
# Using a for loop to iterate through a tuple
for name in ('John', 'Kate'):
    print("Hello", name)

Hello John
Hello Kate


### 3.4. Advantages of Tuple over List

Since tuples are quite similar to lists, both of them are used in similar situations. However, there are certain advantages of implementing a tuple over a list. Below listed are some of the main advantages:

* We generally use tuples for heterogeneous (different) data types and lists for homogeneous (similar) data types.
* Since tuples are immutable, iterating through a tuple can be slightly faster than iterating through a list, although the performance boost is usually small.
* Tuples that contain immutable elements can be used as a key for a dictionary. With lists, this is not possible.
* If we have data that doesn't change, implementing it as tuple will guarantee that it remains write-protected.
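The dictionary-key advantage listed above can be sketched as follows. The dictionary `grid` and its contents are hypothetical and serve only to show that a tuple is accepted as a key while a list is rejected.

```python
# Tuples of immutable values are hashable, so they can serve as dictionary keys
grid = {
    (0, 0): "origin",
    (1, 2): "treasure",
}
print(grid[(1, 2)])     # treasure

# A list is mutable and therefore unhashable, so it is rejected as a key
try:
    grid[[3, 4]] = "invalid"
    list_key_allowed = True
except TypeError as err:
    list_key_allowed = False
    print("TypeError:", err)
```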

## 4. Strings

A string is a sequence of characters. A character is simply a symbol. For example, the English alphabet has 26 letters.



### 4.1. Definition of A String

Strings can be created by enclosing characters `inside single quotes or double quotes`. `Triple quotes` can also be used, but they are generally reserved for `multiline strings` and `docstrings`.

In [9]:
# defining strings in Python
# all of the following are equivalent
my_string = 'Hello'
print(my_string)

my_string = "Hello"
print(my_string)

my_string = '''Hello'''
print(my_string)

# triple quotes string can extend multiple lines
my_string = """Hello, welcome to
           the world of Python"""
print(my_string)

Hello
Hello
Hello
Hello, welcome to
           the world of Python


### 4.2. Access Characters in A String

We can access individual characters using indexing and a range of characters using slicing. The index starts from 0. Trying to access a character outside the index range raises an `IndexError`. The index must be an integer; using a float or any other type results in a `TypeError`.

Python allows negative indexing for its sequences.  The index of -1 refers to the last item, -2 to the second last item and so on. We can access a range of items in a string by using the slicing operator :(colon).

In [14]:
#Accessing string characters in Python
str = 'programiz'
print('str = ', str)

#first character
print('str[0] = ', str[0])

#last character
print('str[-1] = ', str[-1])

#slicing 2nd to 5th character
print('str[1:5] = ', str[1:5])

#slicing 6th to 2nd last character
print('str[5:-2] = ', str[5:-2])


# index must be in range
my_string = 'Hello'
print(my_string[15])   # error

str =  programiz
str[0] =  p
str[-1] =  z
str[1:5] =  rogr
str[5:-2] =  am


IndexError: string index out of range

In [15]:
my_string = 'Hello'
print("decimal index", my_string[1.5])

TypeError: string indices must be integers

### 4.4. Manipulating A String

* Strings are immutable. This means that elements of a string cannot be changed once they have been assigned. 

In [22]:
my_string = 'programiz'
my_string[5] = 'a'

TypeError: 'str' object does not support item assignment

* We can simply reassign different strings to the same name.

In [21]:
my_string = 'Python'
my_string

'Python'

* We cannot delete or remove characters from a string. But deleting the string entirely is possible using the del keyword.

In [24]:
del my_string[1]    # generates an error since we cannot delete characters in a string

del my_string
my_string

TypeError: 'str' object doesn't support item deletion

In [25]:
del my_string              # delete the entire string.
my_string                  # NameError: name 'my_string' is not defined

NameError: name 'my_string' is not defined

### 4.5. String Operations

There are many operations that can be performed with strings which makes it one of the most used data types in Python.

* Concatenation of Two or More Strings

Joining of two or more strings into a single one is called `concatenation`. The `+ operator` does this in Python. Simply writing two string literals together also concatenates them. The `* operator` can be used to repeat the string for a given number of times.

In [26]:
# Python String Operations
str1 = 'Hello'
str2 ='World!'

# using +
print('str1 + str2 = ', str1 + str2)

# using *
print('str1 * 3 =', str1 * 3)

str1 + str2 =  HelloWorld!
str1 * 3 = HelloHelloHello


If we want to concatenate strings in different lines, we can use parentheses.

In [27]:
# two string literals together
'Hello ''World!'


# using parentheses
s = ('Hello '           # not a tuple since values are not separated by a comma!
     'World')
s

'Hello World'

* Iterating Through a string

We can iterate through a string using a for loop. Here is an example to count the number of 'l's in a string.

In [28]:
# Iterating through a string
count = 0
for letter in 'Hello World':
    if(letter == 'l'):
        count += 1
print(count,'letters found')

3 letters found


* String Membership Test

We can test if a substring exists within a string or not, using the keyword `in`.

In [30]:
'a' in 'program'

True

In [31]:
'at' not in 'battle'

False

* Built-in Functions to Work with Strings

Various built-in functions that work with sequences work with strings as well.

Some of the commonly used ones are `enumerate()` and `len()`. The `enumerate()` function returns an enumerate object. It contains the index and value of all the items in the string as pairs. This can be useful for iteration.

Similarly, `len()` returns the length (number of characters) of the string.


In [32]:
str = 'cold'

# enumerate()
list_enumerate = list(enumerate(str))
print('list(enumerate(str)) = ', list_enumerate)

#character count
print('len(str) = ', len(str))

list(enumerate(str)) =  [(0, 'c'), (1, 'o'), (2, 'l'), (3, 'd')]
len(str) =  4
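In practice, `enumerate()` is most often used directly in a `for` loop rather than converted to a list. The following sketch uses a hypothetical variable `word` (chosen instead of `str` so that the built-in `str` type is not shadowed) and also shows the optional `start` argument.

```python
word = 'cold'

# enumerate() yields (index, character) pairs, which can be unpacked in the loop
positions = {}
for index, letter in enumerate(word):
    positions[letter] = index
    print(index, letter)

# the optional start argument changes the first index
numbered = list(enumerate(word, start=1))
print(numbered)   # [(1, 'c'), (2, 'o'), (3, 'l'), (4, 'd')]
```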


### 4.6. Python String Formatting

* **Escape Sequence** If we want to print a text like He said, `"What's there?"`, we can neither use single quotes nor double quotes. This will result in a `SyntaxError` as the text itself contains both single and double quotes.

In [34]:
print("He said, "What's there?"")

SyntaxError: invalid syntax (<ipython-input-34-5b2db8c64782>, line 1)

In [35]:
print('He said, "What's there?"')

SyntaxError: invalid syntax (<ipython-input-35-5c6702031631>, line 1)

One way to get around this problem is to use triple quotes. Alternatively, we can use escape sequences.

An escape sequence starts with a backslash and is interpreted differently. If we use a single quote to represent a string, all the single quotes inside the string must be escaped. Similar is the case with double quotes. Here is how it can be done to represent the above text.

In [36]:
# using triple quotes
print('''He said, "What's there?"''')

# escaping single quotes
print('He said, "What\'s there?"')

# escaping double quotes
print("He said, \"What's there?\"")

He said, "What's there?"
He said, "What's there?"
He said, "What's there?"


* The `format() Method` for Formatting Strings

The `format() method` that is available with the string object is very versatile and powerful in formatting strings. Format strings contain curly `braces {}` as placeholders or replacement fields which get replaced.

We can use positional arguments or keyword arguments to specify the order.

In [37]:
# Python string format() method

# default(implicit) order
default_order = "{}, {} and {}".format('John','Bill','Sean')
print('\n--- Default Order ---')
print(default_order)

# order using positional argument
positional_order = "{1}, {0} and {2}".format('John','Bill','Sean')
print('\n--- Positional Order ---')
print(positional_order)

# order using keyword argument
keyword_order = "{s}, {b} and {j}".format(j='John',b='Bill',s='Sean')
print('\n--- Keyword Order ---')
print(keyword_order)


--- Default Order ---
John, Bill and Sean

--- Positional Order ---
Bill, John and Sean

--- Keyword Order ---
Sean, Bill and John


* **optional format specifications**

The format() method can have optional format specifications. They are separated from the field name using `colon`. For example, we can `left-justify <`, `right-justify >` or `center ^` a string in the given space.

We can also format integers as binary, hexadecimal, etc., and floats can be rounded or displayed in exponent format. Many other formatting options are available. For more details, see [this article](https://www.programiz.com/python-programming/methods/string/format).

In [38]:
# formatting integers
"Binary representation of {0} is {0:b}".format(12)


'Binary representation of 12 is 1100'

In [39]:
# formatting floats
"Exponent representation: {0:e}".format(1566.345)

'Exponent representation: 1.566345e+03'

In [40]:
# round off
"One third is: {0:.3f}".format(1/3)

'One third is: 0.333'

In [42]:
# string alignment
"|{:<10}|{:^20}|{:>10}|".format('butter','bread','ham')

'|butter    |       bread        |       ham|'
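The same format specifications also work inside formatted string literals (f-strings), available since Python 3.6, where the value is written directly inside the braces. This is offered as an alternative sketch rather than a replacement for `format()`; the variable names are illustrative.

```python
name = 'butter'
value = 1 / 3
number = 12

# The part after the colon is the same format specification used by format()
aligned = f"|{name:<10}|{name:^10}|{name:>10}|"
rounded = f"One third is: {value:.3f}"
binary = f"Binary representation of {number} is {number:b}"

print(aligned)
print(rounded)
print(binary)
```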

In [43]:
"PrOgRaMiZ".lower()

'programiz'

In [44]:
"PrOgRaMiZ".upper()

'PROGRAMIZ'

In [45]:
"This will split all words into a list".split()

['This', 'will', 'split', 'all', 'words', 'into', 'a', 'list']

In [46]:
' '.join(['This', 'will', 'join', 'all', 'words', 'into', 'a', 'string'])

'This will join all words into a string'

In [47]:
'Happy New Year'.find('ew')

7

In [48]:
'Happy New Year'.replace('Happy','Brilliant')

'Brilliant New Year'

## 5. Sets

A `set` is an unordered collection of items. <font color = "red">Every set element is unique (no duplicates) and must be immutable (cannot be changed)</font>. <font color = "blue">However, a set itself is mutable.</font> We can add or remove items from it.

Sets can also be used to perform mathematical set operations like union, intersection, symmetric difference, etc.

### 5.1. Definition

A set is defined by placing all the items (elements) inside `curly braces {}`, `separated by comma`, `or by using the built-in set() function`.

It can have any number of items and they may be of different types (integer, float, tuple, string etc.). But `a set cannot have mutable elements` like lists, sets or dictionaries as its elements.

In [49]:
# Different types of sets in Python
# set of integers
my_set = {1, 2, 3}
print(my_set)

# set of mixed datatypes
my_set = {1.0, "Hello", (1, 2, 3)}
print(my_set)

{1, 2, 3}
{1.0, (1, 2, 3), 'Hello'}


In [51]:
# set cannot have duplicates
# Output: {1, 2, 3, 4}
my_set = {1, 2, 3, 4, 3, 2}
print(my_set)

# we can make set from a list
# Output: {1, 2, 3}
my_set = set([1, 2, 3, 2])    # we can consider this a conversion from a list to a set using set()
print(my_set)

# set cannot have mutable items
# here [3, 4] is a mutable list
# this will cause an error.

my_set = {1, 2, [3, 4]}          # a list cannot be a component/an element of a set!

{1, 2, 3, 4}
{1, 2, 3}


TypeError: unhashable type: 'list'
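Because a set keeps only one copy of each value, converting a list with `set()` is a common way to remove duplicates. The sketch below, using a hypothetical list `items`, also shows one way to keep the original order, which a set does not promise.

```python
items = [3, 1, 3, 2, 1]

# A set keeps one copy of each value but does not promise any order
unique_values = set(items)
print(len(unique_values))            # 3

# sorted() turns the set back into an ordered list
print(sorted(unique_values))         # [1, 2, 3]

# dict.fromkeys() keeps the first occurrence of each value in its original order
unique_in_order = list(dict.fromkeys(items))
print(unique_in_order)               # [3, 1, 2]
```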

* Creating an empty set.

Empty curly braces {} will make an empty dictionary in Python. To make a set without any elements, we use the set() function without any argument.

In [52]:
# Distinguish set and dictionary while creating empty set

# initialize a with {}
a = {}

# check data type of a
print(type(a))

# initialize a with set()
a = set()

# check data type of a
print(type(a))

<class 'dict'>
<class 'set'>


### 5.2. Modifying A Set

Sets are mutable. However, since they are unordered, `indexing has no meaning`.


* **Adding Element(s) to A Set**

We can add a single element using the `add()` method, and multiple elements using the `update()` method. The update() method can take tuples, lists, strings or other sets as its argument. <font color = "red"><b>In all cases, duplicates are avoided</b></font>.

In [54]:
# initialize my_set
my_set = {1, 3}
print(my_set)

# my_set[0]
# if you uncomment the above line
# you will get an error
# TypeError: 'set' object is not subscriptable

# add an element
# Output: {1, 2, 3}
my_set.add(2)
print(my_set)

# add multiple elements
# Output: {1, 2, 3, 4}
my_set.update([2, 3, 4])
print(my_set)

# add list and set
# Output: {1, 2, 3, 4, 5, 6, 8}
my_set.update([4, 5], {1, 6, 8})
print(my_set)

{1, 3}
{1, 2, 3}
{1, 2, 3, 4}
{1, 2, 3, 4, 5, 6, 8}


* **Removing Elements from A Set**

A particular item can be removed from a set using the methods discard() and remove().

The only difference between the two is that the discard() method leaves a set unchanged if the element is not present in the set. On the other hand, the remove() method raises an error (a `KeyError`) in that case.

In [55]:
# Difference between discard() and remove()

# initialize my_set
my_set = {1, 3, 4, 5, 6}
print(my_set)

# discard an element. Output: {1, 3, 5, 6}
my_set.discard(4)
print(my_set)

# remove an element. Output: {1, 3, 5}
my_set.remove(6)
print(my_set)

# discard an element not present in my_set. Output: {1, 3, 5}
my_set.discard(2)
print(my_set)

# remove an element not present in my_set you will get an error. Output: KeyError
my_set.remove(2)

{1, 3, 4, 5, 6}
{1, 3, 5, 6}
{1, 3, 5}
{1, 3, 5}


KeyError: 2

Similarly, we can remove and return an item using the `pop() method`.

Since set is an unordered data type, there is no way of determining which item will be popped. It is completely arbitrary.

We can also remove all the items from a set using the `clear() method`.

In [56]:
# initialize my_set. Output: set of unique elements
my_set = set("HelloWorld")
print(my_set)

# pop an element. Output: random element
print(my_set.pop())

# pop another element
my_set.pop()
print(my_set)

# clear my_set. Output: set()
my_set.clear()
print(my_set)

print(my_set)

{'o', 'r', 'l', 'W', 'e', 'd', 'H'}
o
{'l', 'W', 'e', 'd', 'H'}
set()
set()


### 5.3. Set Operations

* **Union of A and B is a set of all elements from both sets**.

Union is performed using `|` operator. Same can be accomplished using the `union()` method.

In [57]:
# Set union method
# initialize A and B
A = {1, 2, 3, 4, 5}
B = {4, 5, 6, 7, 8}

# use | operator
# Output: {1, 2, 3, 4, 5, 6, 7, 8}
print(A | B)


{1, 2, 3, 4, 5, 6, 7, 8}


In [58]:
A = {1, 2, 3, 4, 5}
B = {4, 5, 6, 7, 8}

# use union function
AunionB =  A.union(B)
print(AunionB)

# use union function on B
BunionA = B.union(A)
print(BunionA)

{1, 2, 3, 4, 5, 6, 7, 8}
{1, 2, 3, 4, 5, 6, 7, 8}


* **Intersection of A and B is a set of elements that are common in both the sets.**

Intersection is performed using `& operator`. Same can be accomplished using the `intersection() method`.

In [59]:
# Intersection of sets initialize A and B
A = {1, 2, 3, 4, 5}
B = {4, 5, 6, 7, 8}

# use & operator. Output: {4, 5}
print(A & B)

# use intersection function on A
AintersectB = A.intersection(B)
print(AintersectB)

# use intersection function on B
BintersectA =  B.intersection(A)
print(BintersectA)

{4, 5}
{4, 5}
{4, 5}


* **Difference of Two Sets**

Difference of the set B from set A, `denoted by (A - B)`, is a set of elements that are only in A but not in B. Similarly, `B - A` is a set of elements in B but not in A.

Difference is performed using `- operator`. Same can be accomplished using the `difference() method`.

In [61]:
# Difference of two sets initialize A and B
A = {1, 2, 3, 4, 5}
B = {4, 5, 6, 7, 8}

# use - operator on A. Output: {1, 2, 3}
print(A - B)

# use difference function on A
AdiffB = A.difference(B)
print(AdiffB)

# use - operator on B
BminusA =  B - A
print(BminusA)

# use difference function on B
BdiffA = B.difference(A)
print(BdiffA)

{1, 2, 3}
{1, 2, 3}
{8, 6, 7}
{8, 6, 7}


* **Symmetric Difference of Two Sets**

**Symmetric Difference of A and B** is a set of elements in A and B but not in both (excluding the intersection).

Symmetric difference is performed using `^ operator`. Same can be accomplished using the method `symmetric_difference()`.

In [62]:
# Symmetric difference of two sets initialize A and B
A = {1, 2, 3, 4, 5}
B = {4, 5, 6, 7, 8}

# use ^ operator. Output: {1, 2, 3, 6, 7, 8}
print(A ^ B)

{1, 2, 3, 6, 7, 8}
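The method form and two equivalent ways of expressing the symmetric difference can be checked with a short sketch, reusing the sets `A` and `B` from the example above:

```python
A = {1, 2, 3, 4, 5}
B = {4, 5, 6, 7, 8}

# Method form of the ^ operator
sym_diff = A.symmetric_difference(B)
print(sym_diff == A ^ B)                 # True

# Symmetric difference equals the union minus the intersection
print(sym_diff == (A | B) - (A & B))     # True

# It is also the union of the two one-sided differences
print(sym_diff == (A - B) | (B - A))     # True
```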


### 5.4. Other Python Set Methods

There are many set methods, some of which we have already used above. Here is a list of all the methods that are available with the set objects.

|Method	|Description|
|:-----:|:----------|
|add()	|Adds an element to the set|
|clear()	|Removes all elements from the set|
|copy()	|Returns a copy of the set|
|difference()	|Returns the difference of two or more sets as a new set|
|difference_update()	|Removes all elements of another set from this set|
|discard()	|Removes an element from the set if it is a member. (Do nothing if the element is not in set)|
|intersection()	|Returns the intersection of two sets as a new set|
|intersection_update()	|Updates the set with the intersection of itself and another|
|isdisjoint()	|Returns True if two sets have a null intersection|
|issubset()	|Returns True if another set contains this set|
|issuperset()	|Returns True if this set contains another set|
|pop()	|Removes and returns an arbitrary set element. Raises KeyError if the set is empty|
|remove()	|Removes an element from the set. If the element is not a member, raises a KeyError|
|symmetric_difference()	|Returns the symmetric difference of two sets as a new set|
|symmetric_difference_update()	|Updates a set with the symmetric difference of itself and another|
|union()	|Returns the union of sets in a new set|
|update()	|Updates the set with the union of itself and others|
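A few of the methods in the table that have not been demonstrated yet are sketched below. The sets `small`, `other`, and `working` are hypothetical names introduced for illustration.

```python
A = {1, 2, 3, 4, 5}
small = {1, 2}
other = {7, 8}

print(small.issubset(A))      # True
print(A.issuperset(small))    # True
print(A.isdisjoint(other))    # True

# The *_update methods change the set in place and return None
working = A.copy()
working.intersection_update({4, 5, 6})
print(working)                # {4, 5}

working.difference_update({5})
print(working)                # {4}
```

Working on a copy keeps the original set `A` unchanged, which is convenient when the same set is reused in later examples.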



### 5.4. Built-in Functions with Set

Built-in functions like all(), any(), enumerate(), len(), max(), min(), sorted(), sum() etc. are commonly used with sets to perform different tasks.

|Function	|Description|
|:---------:|:----------|
|all()	|Returns True if all elements of the set are true (or if the set is empty).|
|any()	|Returns True if any element of the set is true. If the set is empty, returns False.|
|enumerate()	|Returns an enumerate object. It contains the index and value for all the items of the set as a pair.|
|len()	|Returns the length (the number of items) in the set.|
|max()	|Returns the largest item in the set.|
|min()	|Returns the smallest item in the set.|
|sorted()	|Returns a new sorted list from elements in the set(does not sort the set itself).|
|sum()	|Returns the sum of all elements in the set.|

In [78]:
# initialize two sets A and B
A = {1, 2, 3, 0, 5}
B = {4, 5, 6, 0, 8}

print("length of A =", len(A))
print("sum of B =", sum(B))
print("maximum of B =", max(B))
print(all(A))        # False, because 0 is falsy
print(any(A))        # True, because at least one element is truthy
print(sorted(B))     # returns a list

length of A = 5
sum of B = 23
maximum of B = 8
False
True
[0, 4, 5, 6, 8]
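
The empty-set behavior of `all()` and `any()` described in the table, and the use of `enumerate()`, can be checked with a short sketch. Because a set has no defined order, the example enumerates `sorted(A)` so that the result is reproducible; the names `empty` and `pairs` are introduced here for illustration.

```python
# Edge cases of all() and any() on an empty set
empty = set()
print(all(empty))   # True  (no element is false)
print(any(empty))   # False (no element is true)

A = {1, 2, 3, 0, 5}

# enumerate() pairs each item with a running index;
# sorting first gives a reproducible order
pairs = list(enumerate(sorted(A)))
print(pairs)        # [(0, 0), (1, 1), (2, 2), (3, 3), (4, 5)]

print(min(A), max(A), sum(A))   # 0 5 11
```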


### 5.5. Frozenset

A `frozenset` is an unordered, unindexed, and immutable collection of elements. It provides the same non-mutating functionality as a set; the only difference is that a `frozenset` is immutable, i.e. it can't be changed after it is created. In simple words, frozensets are immutable sets. Frozensets can be created using the `frozenset()` function.

In Python, an object is hashable if it has a hash value that never changes during its lifetime. Most immutable built-in types (numbers, strings, tuples of hashable elements) are hashable, whereas mutable containers such as lists, dictionaries, and sets are unhashable. A `frozenset`, being immutable, is hashable.

Sets being mutable are unhashable, so they can't be used as dictionary keys. On the other hand, frozensets are hashable and can be used as keys to a dictionary.
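
The difference in hashability can be seen in a small sketch. The dictionary `labels`, the set `nested` and the flag `set_key_failed` are hypothetical names used only for this illustration.

```python
A = frozenset([1, 2, 3])
B = frozenset([3, 4])

# a frozenset is hashable, so it can serve as a dictionary key
labels = {A: "first group", B: "second group"}
print(labels[frozenset([3, 2, 1])])   # first group (element order does not matter)

# for the same reason, a frozenset can be an element of another set
nested = {A, B}
print(len(nested))                    # 2

# an ordinary set is unhashable, so using it as a key raises TypeError
set_key_failed = False
try:
    bad = {{1, 2}: "not allowed"}
except TypeError:
    set_key_failed = True
    print("a set cannot be used as a dictionary key")
```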

This data type supports methods like copy(), difference(), intersection(), isdisjoint(), issubset(), issuperset(), symmetric_difference() and union(). Being immutable, it does not have methods that add or remove elements.

In [65]:
# Frozensets
# initialize A and B
A = frozenset([1, 2, 3, 4])
B = frozenset([3, 4, 5, 6])

testdisjoint =  A.isdisjoint(B)
print(testdisjoint)

AdiffB =  A.difference(B)
print(AdiffB)

AbarB = A | B
print(AbarB)

Aadd3 = A.add(3)   # AttributeError: 'frozenset' object has no attribute 'add'
print(Aadd3)

False
frozenset({1, 2})
frozenset({1, 2, 3, 4, 5, 6})


AttributeError: 'frozenset' object has no attribute 'add'

## 6. Dictionary

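A Python dictionary is a collection of items, each of which is a key/value pair. As of Python 3.7, dictionaries preserve the order in which items were inserted (in earlier versions the order was not guaranteed). Dictionaries are optimized to retrieve values when the key is known.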

### 6.1. Definition

Creating a dictionary is as simple as placing items inside curly braces {} separated by commas.

An item has a key and a corresponding value that is expressed as a pair (key: value).

While the values can be of any data type and can repeat, keys must be of immutable type (string, number or tuple with immutable elements) and must be unique.

In [82]:
# empty dictionary
my_dict0 = {}
print("my_dict0 =",my_dict0)

# dictionary with integer keys
my_dict1 = {1: 'apple', 2: 'ball'}
print("my_dict1 =",my_dict1)

# dictionary with mixed keys
my_dict2 = {'name': 'John', 1: [2, 4, 3]}
print("my_dict2 =",my_dict2)

# using dict()
my_dict3 = dict({1:'apple', 2:'ball'})
print("my_dict3 =",my_dict3)

# from sequence having each item as a pair
my_dict4 = dict([(1,'apple'), (2,'ball')])
print("my_dict4 =",my_dict4)

my_dict0 = {}
my_dict1 = {1: 'apple', 2: 'ball'}
my_dict2 = {'name': 'John', 1: [2, 4, 3]}
my_dict3 = {1: 'apple', 2: 'ball'}
my_dict4 = {1: 'apple', 2: 'ball'}
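
The rules on keys stated above (unique and of an immutable type) can be illustrated with a short sketch. All names below (`d`, `grid`, `list_key_failed`, `zipped`) are hypothetical and introduced only for this example.

```python
# keys are unique: if a key is repeated, the last value wins
d = {'a': 1, 'b': 2, 'a': 3}
print(d)               # {'a': 3, 'b': 2}

# a tuple with immutable elements is a valid key
grid = {(0, 0): 'origin', (1, 2): 'point'}
print(grid[(1, 2)])    # point

# a list is mutable, so using it as a key raises TypeError
list_key_failed = False
try:
    invalid = {[1, 2]: 'list key'}
except TypeError:
    list_key_failed = True
    print("a list cannot be used as a dictionary key")

# building a dictionary from two parallel sequences with zip()
keys = ['x', 'y', 'z']
values = [10, 20, 30]
zipped = dict(zip(keys, values))
print(zipped)          # {'x': 10, 'y': 20, 'z': 30}
```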


### 6.2. Accessing Elements from Dictionary

We use keys to access values of a dictionary. Keys can be used either inside square brackets `[]` or with the `get() method`.

<font color = "red">If we use the square brackets `[]`, <b>KeyError</b> is raised in case a key is not found in the dictionary. On the other hand, the `get() method` returns <b>None</b> if the key is not found.</font>

In [83]:
# get vs [] for retrieving elements
my_dict = {'name': 'Jack', 'age': 26}

# Output: Jack
print(my_dict['name'])

# Output: 26
print(my_dict.get('age'))

# get() returns None when the key does not exist
# Output: None
print(my_dict.get('address'))

# square brackets raise KeyError when the key does not exist
print(my_dict['address'])

Jack
26
None


KeyError: 'address'
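
When a missing key is a normal situation rather than an error, a few common patterns avoid the `KeyError`. The sketch below reuses `my_dict` from the cell above; the fallback string `'unknown'` is an arbitrary value chosen for illustration.

```python
my_dict = {'name': 'Jack', 'age': 26}

# get() accepts a second argument that is returned when the key is missing
print(my_dict.get('address', 'unknown'))   # unknown

# the in operator tests whether a key exists
print('address' in my_dict)                # False
if 'name' in my_dict:
    print(my_dict['name'])                 # Jack

# the KeyError can also be caught explicitly
try:
    my_dict['address']
except KeyError as err:
    print("missing key:", err)             # missing key: 'address'
```

Note that `get()` never adds the missing key to the dictionary; it only returns the fallback value.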

### 6.3. Modifying Dictionaries

* **Changing and Adding Dictionary Elements**

Dictionaries are mutable. We can add new items or change the value of existing items using an assignment operator.

If the key is already present, then the existing value gets updated. In case the key is not present, a new (key: value) pair is added to the dictionary.

In [84]:
# Changing and adding Dictionary Elements
my_dict = {'name': 'Jack', 'age': 26}

# update value
my_dict['age'] = 27

# Output: {'name': 'Jack', 'age': 27}
print(my_dict)

# add item
my_dict['address'] = 'Downtown'

# Output: {'name': 'Jack', 'age': 27, 'address': 'Downtown'}
print(my_dict)

{'name': 'Jack', 'age': 27}
{'name': 'Jack', 'age': 27, 'address': 'Downtown'}


* **Removing elements from Dictionary**

We can remove a particular item in a dictionary by using the `pop() method`. This method removes an item with the provided key and returns the value.

The `popitem() method` removes and returns a (key, value) pair from the dictionary. As of Python 3.7, this is the most recently inserted pair; in earlier versions, an arbitrary pair was removed. All the items can be removed at once using the `clear() method`.

We can also use the `del keyword` to remove individual items or the entire dictionary itself.

In [86]:
# create a dictionary
squares = {1: 1, 2: 4, 3: 9, 4: 16, 5: 25}

# remove a particular item, returns its value
# Output: 16
print(squares.pop(4))

# Output: {1: 1, 2: 4, 3: 9, 5: 25}
print(squares)

# remove the last inserted item (Python 3.7+), return (key, value)
# Output: (5, 25)
print(squares.popitem())

# Output: {1: 1, 2: 4, 3: 9}
print(squares)

# remove all items
squares.clear()

# Output: {}
print(squares)

# delete the dictionary itself
del squares

# Throws Error
print(squares)

16
{1: 1, 2: 4, 3: 9, 5: 25}
(5, 25)
{1: 1, 2: 4, 3: 9}
{}


NameError: name 'squares' is not defined
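
The removal methods also differ in how they treat a missing key. The following sketch, which assumes Python 3.7 or later for the `popitem()` behavior, shows `pop()` with a default value and `del` applied to a single key.

```python
squares = {1: 1, 2: 4, 3: 9}

# pop() with a default value does not raise KeyError for a missing key
print(squares.pop(10, None))   # None
print(squares.pop(2, None))    # 4

# del removes a single item when given a key
del squares[1]
print(squares)                 # {3: 9}

# popitem() removes the most recently inserted pair (Python 3.7+)
squares[4] = 16
print(squares.popitem())       # (4, 16)
print(squares)                 # {3: 9}
```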

### 6.4. Dictionary Methods

Methods that are available with a dictionary are tabulated below. Some of them have already been used in the above examples.

|Method	|Description|
|:-----:|:----------|
|clear()	|Removes all items from the dictionary.|
|copy()     |Returns a shallow copy of the dictionary.|
|fromkeys(seq[, v])	|Returns a new dictionary with keys from seq and value equal to v (defaults to None).|
|get(key[,d])	|Returns the value of the key. If the key does not exist, returns d (defaults to None).|
|items()	|Returns a view object of the dictionary's items as (key, value) pairs.|
|keys()	|Returns a view object of the dictionary's keys.|
|pop(key[,d])	|Removes the item with the key and returns its value or d if key is not found. If d is not provided and the key is not found, it raises KeyError.|
|popitem()	|Removes and returns an item (key, value); as of Python 3.7, the most recently inserted one. Raises KeyError if the dictionary is empty.|
|setdefault(key[,d])	|Returns the corresponding value if the key is in the dictionary. If not, inserts the key with a value of d and returns d (defaults to None).|
|update([other])	|Updates the dictionary with the key/value pairs from other, overwriting existing keys.|
|values()	|Returns a view object of the dictionary's values.|

In [87]:
# Dictionary Methods
marks = dict.fromkeys(['Math', 'English', 'Science'], 0)

# Output: {'Math': 0, 'English': 0, 'Science': 0}
print(marks)

# iterate over (key, value) pairs
for item in marks.items():
    print(item)

# Output: ['English', 'Math', 'Science']
print(sorted(marks.keys()))

{'Math': 0, 'English': 0, 'Science': 0}
('Math', 0)
('English', 0)
('Science', 0)
['English', 'Math', 'Science']
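
A few of the tabulated methods that were not used above, namely `setdefault()`, `update()` and `copy()`, can be sketched as follows. The subject names and marks are made-up values, and `keys_view` and `backup` are hypothetical names used for illustration.

```python
marks = dict.fromkeys(['Math', 'English'], 0)

# setdefault() returns the existing value, or inserts the default if the key is missing
print(marks.setdefault('Math', 50))      # 0  (key exists, value unchanged)
print(marks.setdefault('Science', 70))   # 70 (key inserted)

# update() overwrites existing keys and adds new ones
marks.update({'English': 85, 'Art': 90})
print(marks)   # {'Math': 0, 'English': 85, 'Science': 70, 'Art': 90}

# the object returned by keys() reflects later changes to the dictionary
keys_view = marks.keys()
marks['Music'] = 60
print('Music' in keys_view)   # True

# copy() returns a new dictionary; assigning to a key in the copy
# does not affect the original
backup = marks.copy()
backup['Math'] = 100
print(marks['Math'])          # 0
```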


### 6.5. Dictionary Built-in Functions

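Built-in functions like all(), any(), len(), sorted(), etc. are commonly used with dictionaries to perform different tasks. When applied to a dictionary directly, they operate on its keys. The cmp() function existed only in Python 2.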

|Function	|Description|
|:---------:|:----------|
|all()	|Return True if all keys of the dictionary are true (or if the dictionary is empty).|
|any()	|Return True if any key of the dictionary is true. If the dictionary is empty, return False.|
|len()	|Return the length (the number of items) in the dictionary.|
|cmp()	|Compares items of two dictionaries. (Not available in Python 3)|
|sorted()	|Return a new sorted list of keys in the dictionary.|

In [88]:
# Dictionary Built-in Functions
squares = {0: 0, 1: 1, 3: 9, 5: 25, 7: 49, 9: 81}

# Output: False
print(all(squares))

# Output: True
print(any(squares))

# Output: 6
print(len(squares))

# Output: [0, 1, 3, 5, 7, 9]
print(sorted(squares))

False
True
6
[0, 1, 3, 5, 7, 9]
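
Since these functions look at the keys when given a dictionary directly, the values have to be passed explicitly through `values()` or `items()`. The sketch below illustrates this; the `marks` dictionary and its numbers are made up for the example, and `best` and `by_mark` are hypothetical names.

```python
squares = {0: 0, 1: 1, 3: 9, 5: 25, 7: 49, 9: 81}

# apply the built-in functions to the values instead of the keys
print(sum(squares.values()))   # 165
print(max(squares.values()))   # 81
print(all(squares.values()))   # False, because the value 0 is falsy

marks = {'Math': 72, 'English': 88, 'Science': 65}

# key with the largest value: max() iterates over keys and compares marks[key]
best = max(marks, key=marks.get)
print(best)                    # English

# (key, value) pairs sorted by value
by_mark = sorted(marks.items(), key=lambda item: item[1])
print(by_mark)                 # [('Science', 65), ('Math', 72), ('English', 88)]
```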
