# String manipulation and data structures in Python

## What you stand to gain in this unit

Upon completion of this study unit, you should be able to:

- Manipulate strings in Python

- Create different data structures such as lists, tuples, sets, and dictionaries

- Use their methods to perform various tasks


### Python Variable Data Types 


<img src ="Images/Dtypes.png" alt="Drawing" width="800" />

Source: *[FireBlaze](https://www.fireblazeaischool.in/blogs/data-types-in-python/)*. 

This unit covers the following Python data types: strings, lists, tuples, sets, and dictionaries. Each is explained below.


# String Manipulation in Python

# Python Strings

**A string is an ordered sequence of characters**. Two key words here, **ordered** and **characters.** Ordered means that we will be able to use *indexing* and *slicing* to grab elements from the string.

## Creating strings.

Single or double quotes are okay.

In [1]:
"Hi, welcome to string manipulation in Python"

'Hi, welcome to string manipulation in Python'

If the string itself contains a single quote (an apostrophe), enclose it in double quotes so that the apostrophe does not end the string. For example:

In [2]:
"I'm a beginner in python programming!"

"I'm a beginner in python programming!"

In [3]:
"I'm feeling curious"

"I'm feeling curious"

# The len function

We can get the length of a string by using the `len()` function. Every character is counted, including spaces.

## Examples 

In [4]:
len("Python")

6

In [5]:
len("Python is simple")

16

In [6]:
len("I'm feeling curious")

19

# String operations

We can perform some operations such as string concatenation and replication in Python.

## String Concatenation

The addition operator (`+`) enables us to join two strings together. This is known as string concatenation. For example, `"Python" + "string"` becomes `'Pythonstring'`. In general, if `s1` and `s2` are strings, then `s1 + s2` is also a string.

# Tip

We can use `''` or `""` (a pair of quotes with nothing between them) to create an empty string. Note that `' '` is not empty: it contains one space.


## Example 1

In [7]:
a = "Python is"

b = "a programming language"

a + b

'Python isa programming language'

# Example 2

In [8]:
a + " " + b

'Python is a programming language'

# Example 3

In [9]:
"My name is Jamal" + " and I am from Somalia"

'My name is Jamal and I am from Somalia'

# Example 4

In [10]:
A = "Hello"

B = " "

C = "world"

print(A + B + C)

Hello world


# Example 5

In [11]:
A = "Hello " # Note the extra space after Hello

B = "world"

print(A + B)

Hello world


# String replication

With the multiplication operator (`*`), we can repeat a string a given number of times.

# Example 1

Hello will be repeated three times

In [12]:
"Hello" * 3

'HelloHelloHello'

# Example 2

This will print Sudan six times

In [13]:
country = "Sudan"

print(country*6)

SudanSudanSudanSudanSudanSudan


# Example 3

In [14]:
"Python is simple " * 3

'Python is simple Python is simple Python is simple '

# Example 4

String replication and string concatenation

In [15]:
"Python is simple, " * 3 + "and Python is simple"

'Python is simple, Python is simple, Python is simple, and Python is simple'

# Tip

To concatenate two strings we use the + operator

The * operator can be used to repeat the string for a given number of times.
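The `+` operator only joins strings with other strings. As a minimal sketch (the variable `visits` is a hypothetical value used only for illustration), a number has to be converted with `str()` before it can be concatenated:

```python
country = "Sudan"
visits = 3  # hypothetical value, not from the examples above

# country + visits would raise a TypeError, so convert the number first.
message = country + " visited " + str(visits) + " times"
print(message)  # Sudan visited 3 times

# Replication is evaluated before concatenation, so "-" * 10 is built first.
underline = "-" * 10 + ">"
print(underline)  # ---------->
```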

# Indexing and Slicing

Since strings are *ordered sequences* of characters, it means we can `select` single characters (indexing) or grab sub-sections of the string (slicing).

### Indexing

Indexing starts at `0`, consider the string Somalia:

    character:    S    o    m   a   l   i   a
    
    index:        0    1    2   3   4   5   6
    
You can access a particular character by putting its position index in square brackets.

In [16]:
country = "Somalia"

print(country)

Somalia


In [17]:
country[0]

'S'

In [18]:
country[3]

'a'

In [19]:
country[5]

'i'

Python also supports reverse (negative) indexing. Consider the string Somalia again:

    character:        S     o     m    a    l    i    a
    index:            0     1     2    3    4    5    6
    reverse index:   -7    -6    -5   -4   -3   -2   -1
    
Reverse indexing is used commonly to grab the last "chunk" of a sequence.

In [20]:
country[-1]

'a'

In [21]:
country[-2]

'i'

In [22]:
country[-7]

'S'
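The two index rows in the table above are related: for a string of length `n`, the negative index `-k` refers to the same character as the index `n - k`. A short sketch of that relationship, and of what happens when an index falls outside the string:

```python
country = "Somalia"
n = len(country)  # 7 characters

# -1 and n - 1 both point at the last character.
last = country[-1]
same_as_last = country[n - 1]
print(last, same_as_last)  # a a

# An index beyond either end raises an IndexError.
try:
    country[7]
except IndexError as error:
    print("IndexError:", error)
```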

## Slicing

We can grab entire subsections of a string with *slice* notation.

This is the notation:

    [start:stop:step]

Key things to note:

1. The starting index directly corresponds to where your slice will start
2. The stop index corresponds to where your slice will go up to. **It does not include this index character!**
3. The step size is how many characters you skip as you go grab the next one.


For example, the index `[0: 4]` will be 0, 1, 2, 3. The last index will not be included. The index `[0 : 2]` pulls the first two values out of the string. The index `[1 : 3]` pulls the second and third values out of the string.

Let's see some examples

In [23]:
country = "Ethiopia"

In [24]:
country[0 : 3]

'Eth'

In [25]:
country[0 : 4]

'Ethi'

In [26]:
country[2 : 4]

'hi'

On either side of the colon, a blank stands for “default”.

- `[:2]` corresponds to `[start=default : stop=2]`. The default start is 0.

- `[1:]` corresponds to `[start=1 : stop=default]`. The default stop is the length of the string, so the slice runs through the last character.

Therefore, the slicing operation `[:2]` pulls out the first and second characters of a string. The slicing operation `[1:]` pulls out the second through the last characters. The examples below illustrate that the default stop reaches the end of the string.


In [27]:
print(country)

Ethiopia


In [28]:
country[:2]

'Et'

In [29]:
country[1:]

'thiopia'

In [30]:
country[: 3]

'Eth'
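The examples above leave the `step` part of `[start:stop:step]` at its default of 1. The following sketch shows a few slices that set it explicitly, including a negative step, which walks through the string backwards:

```python
country = "Ethiopia"

# Every second character, starting at index 0 (indices 0, 2, 4, 6).
every_second = country[::2]
print(every_second)  # Ehoi

# From index 1 up to (not including) index 6, in steps of 2 (indices 1, 3, 5).
middle_step = country[1:6:2]
print(middle_step)  # tip

# A step of -1 walks backwards, which reverses the string.
reversed_country = country[::-1]
print(reversed_country)  # aipoihtE

# Negative indices work in slices too: the last three characters.
last_three = country[-3:]
print(last_three)  # pia
```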

## Basic String Methods

Methods are actions you can call on an object, usually in the form `.method_name()`; notice the parentheses at the end. Strings have many methods which you can use. In fact, you can get a list of them by typing a dot (.) after an already assigned string and pressing the Tab key on your keyboard. Let's see some of them with examples.

![](Images/string_method.png)

In [31]:
basic = "hello world, I am still a beginner pythonista"

# `.upper()`

`.upper()` will convert the string to upper case.

In [32]:
basic.upper()

'HELLO WORLD, I AM STILL A BEGINNER PYTHONISTA'

# `.lower()`

`.lower()` will convert the string to lower case.

In [33]:
basic.lower()

'hello world, i am still a beginner pythonista'

# `.capitalize()`

`.capitalize()` makes the first character upper case and the rest lower case.

In [34]:
basic.capitalize()

'Hello world, i am still a beginner pythonista'

# `.title()`

`.title()` will capitalize each word in a string.

In [35]:
basic.title()

'Hello World, I Am Still A Beginner Pythonista'

# `.split()`

`.split()` splits the string into a list of words, using whitespace as the separator by default.

In [36]:
basic.split()

['hello', 'world,', 'I', 'am', 'still', 'a', 'beginner', 'pythonista']
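`.split()` also accepts a separator, and the string method `.join()` does the opposite job of gluing a list of strings back together. The sketch below reuses the `basic` string; `.join()` is not introduced elsewhere in this unit and is shown here only as the natural counterpart of `.split()`:

```python
basic = "hello world, I am still a beginner pythonista"

# Default: split on whitespace.
words = basic.split()

# Split on an explicit separator instead (here a comma followed by a space).
parts = basic.split(", ")
print(parts)  # ['hello world', 'I am still a beginner pythonista']

# " ".join(...) puts the words back together with single spaces.
rejoined = " ".join(words)
print(rejoined == basic)  # True

# To get the individual characters, use list() rather than .split().
letters = list("Python")
print(letters)  # ['P', 'y', 't', 'h', 'o', 'n']
```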

# String formatting

Sometimes we would like to print a string in a specific format. The escape sequence `\n` inside a string starts a new line when the string is printed with `print()`:

In [37]:
print('this is a new line \nnotice how this is on another new line')

this is a new line 
notice how this is on another new line


Similarly, the escape sequence `\t` inserts a tab (like pressing `Tab` on the keyboard) when the string is printed:

In [38]:
print('this is a tab\t notice how this prints with space between')

this is a tab	 notice how this prints with space between


## String interpolation

String interpolation is the act of substituting values of variables into placeholders in a string. For example, as your teacher, I can greet every student taking this course like this: "Hello `{Name of person}`, thank you for taking this course!". I would like to replace the placeholder `{Name of person}` with an actual name. This process is called string interpolation.

You can use the `.format()` method in a string, to perform string interpolation, essentially inserting `variables` when printing a string. For example, consider:

In [39]:
Name_of_student = "Ruth"

In [40]:
print("Welcome {}, thank you for taking this course!".format(Name_of_student))

Welcome Ruth, thank you for taking this course!


With the introduction of `f`-strings, we can actually do it in this way: 

In [41]:
print(f"Hello {Name_of_student}, thank you for taking this course!")

Hello Ruth, thank you for taking this course!


In the above example, the prefix `f` tells Python to substitute the value of the variable `Name_of_student` for the name inside the curly brackets `{}`, so that printing gives the above output. This way of formatting strings is powerful and easy to use, and we shall be using `f`-strings henceforth.

# Example 2

In [42]:
username = "Jamal_cs21"

password = 8022021

In [43]:
print(f"Welcome {username} and your password is {password}")

Welcome Jamal_cs21 and your password is 8022021


# Example 3

In [44]:
name = "Jamal"

action = 'learn'

In [45]:
print(f"The {name} needs to {action} python")

The Jamal needs to learn python


# Tip

If we have just one variable to interpolate, we can either pass it to `print()` as a separate argument (Example 4) or use an f-string (Example 5).

# Example 4

In [46]:
name = "Addisu"
print("My name is", name)

My name is Addisu


# Example 5

In [47]:
name = "Addisu"
print(f"My name is {name}")

My name is Addisu


### Formatting Numbers

You can also control the number of decimal places shown for a floating-point number.

In [48]:
num = 245.908343

print(f"The number is {num}")

The number is 245.908343


In [49]:
# For one decimal place

print(f"The code is {num:.1f}")

The code is 245.9


In [50]:
# For two decimal places

print(f"The code is {num:.2f}")

The code is 245.91


In [51]:
# For three decimal places

print(f"The code is {num:.3f}")

The code is 245.908


In [52]:
# For four decimal places

print(f"The code is {num:.4f}")

The code is 245.9083
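The part after the colon in `{num:.2f}` is a format specification, and it can do more than set decimal places. The sketch below shows a few other specifications; the values `population` and `share` are hypothetical and chosen only for illustration:

```python
num = 245.908343
population = 1234567.891  # hypothetical value
share = 0.756             # hypothetical value

# A comma adds thousands separators.
with_commas = f"{population:,.2f}"
print(with_commas)  # 1,234,567.89

# The % type multiplies by 100 and appends a percent sign.
as_percent = f"{share:.1%}"
print(as_percent)  # 75.6%

# A width before the dot pads the result (right-aligned for numbers).
padded = f"{num:10.2f}"
print(repr(padded))  # '    245.91'
```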


# Input and output function

`input()` and `print()` functions are widely used for standard input and output operations respectively in Python. 

There are many cases where we want to take input from the user. Python's `input()` function allows this: it reads what the user types and returns it, so you can save it in a variable and use it later in your program.

# Example 1

This program will ask for your name. Type your name, then press the Enter key on your keyboard.

In [53]:
name = input("What is your name?")

print(name)

What is your name? Jamal


Jamal


Any value returned by `input()` is always a string (`str`), even if the user types a number.

# Example 2

In [54]:
name = input("What is your name?")

print(name)

type(name)

What is your name? jamal


jamal


str
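Because `input()` always returns a string, a typed number has to be converted before doing arithmetic with it. In this sketch the variable `age_text` stands in for the result of `input("How old are you?")` so that the example runs without user interaction; the value `"20"` is a hypothetical reply:

```python
# Stand-in for: age_text = input("How old are you?")
age_text = "20"  # hypothetical user reply

# int() converts the string to an integer so arithmetic works.
age = int(age_text)
next_year = age + 1
print(f"Next year I will be {next_year} years old")  # Next year I will be 21 years old

# Without the conversion, + concatenates the two strings instead of adding.
joined = age_text + "1"
print(joined)  # 201
```

For decimal values, `float()` can be used in the same way as `int()`.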

# Example 3

The code below, when run, will ask for your name; please supply it:

In [55]:
name = input("What is your name?")

print(f"My name is {name}")

print(f"{name} is of type {type(name)}")

What is your name? Jamal


My name is Jamal
Jamal is of type <class 'str'>


# Example 4

This program will ask for your name, your age, and your country. It will then print, for example, `My name is Jamal, I am 20 years old, and I am from Somalia`.

In [56]:
name = input("What is your name?")

age = input("How old are you?")

country = input("What is the name of your country?")

print(f"My name is {name}, I am {age} years old, and I am from {country}")

What is your name? Jamal
How old are you? 20
What is the name of your country? Somalia


My name is Jamal, I am 20 years old, and I am from Somalia


# Example 5

In [57]:
name = "Addisu"

print("My name is", name)

My name is Addisu


# Example 6

In [58]:
name = "Addisu"

print(f"My name is {name}")

My name is Addisu


# Lists

We've learned that strings are sequences of characters. Similarly, lists are sequences of objects: they can hold a variety of data types in order, and they follow the same indexing and slicing bracket rules that strings do. We can create a list by putting all the items or elements in square brackets `[]`, with each element separated by a comma. The elements in a list can be of any data type, i.e. floats, integers, strings, booleans, or a combination of them.

Let's explore some useful examples:

In [59]:
# An empty list

alist = []

type(alist)

list

In [60]:
my_list = [1, 2, 3]

In [61]:
my_list

[1, 2, 3]

In [62]:
a = 100

b = 200

c = 300

my_list3 = [a, b, c]

# List of integers 

In [63]:
ages = [21, 23, 16, 6, 76, 7]

In [64]:
ages

[21, 23, 16, 6, 76, 7]

In [65]:
type(ages)

list

# List of float

In [66]:
time = [2.23, 1.59, 4.18, 3.51]

In [67]:
time

[2.23, 1.59, 4.18, 3.51]

In [68]:
type(time)

list

# List of string

In [69]:
countries = ["Sudan", "South Sudan", "Kenya", "Ethiopia", "Somalia", "Uganda" ]

In [70]:
countries

['Sudan', 'South Sudan', 'Kenya', 'Ethiopia', 'Somalia', 'Uganda']

In [71]:
type(countries)

list

# Mixed lists 

In [72]:
Mixed = [90, 2.5, "Jamal", 123, 0.75, True, False]

In [73]:
Mixed

[90, 2.5, 'Jamal', 123, 0.75, True, False]

In [74]:
type(Mixed)

list

## Nested Lists

Lists can hold other lists! This is called a nested list. 

## Examples

Let's see some examples:

In [75]:
new_list = [1, 2, 3, "Ruth",  ["a", "b", "c"]]

In [76]:
new_list

[1, 2, 3, 'Ruth', ['a', 'b', 'c']]

In [77]:
type(new_list)

list

In [78]:
eastafrica_northafrica = [["Ethiopia", "Kenya", "Rwanda", "Somalia", "South Sudan", "Uganda", "Burundi"],
                           ["Algeria", "Egypt", "Libya", "Morocco", "Sudan", "Tunisia"]]

In [79]:
eastafrica_northafrica 

[['Ethiopia',
  'Kenya',
  'Rwanda',
  'Somalia',
  'South Sudan',
  'Uganda',
  'Burundi'],
 ['Algeria', 'Egypt', 'Libya', 'Morocco', 'Sudan', 'Tunisia']]

In [80]:
type(eastafrica_northafrica)

list

# The range() function

We can also generate a sequence of numbers by using the `range()` function. For example, `range(12)` will generate the numbers from $0$ to $11$ ($12$ numbers).

The range function also takes a `start`, a `stop`, and a `step size`, i.e. `range(start, stop, step_size)`. The `step_size` is $1$ if not provided, and `stop` itself is not included. The range object is "lazy": it does not store all the values in memory, only the `start`, `stop`, and `step_size`.

# Examples

In [81]:
range(10)

range(0, 10)

In [82]:
range(2, 8)

range(2, 8)

In [83]:
range(2, 20, 3)

range(2, 20, 3)

To see all the items, we can convert the range object with the function `list()`.

# Examples

In [84]:
list(range(10))

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

As expected, the data type of the result is a list:

In [85]:
type(list(range(10)))

list

In [86]:
list(range(2, 8))

[2, 3, 4, 5, 6, 7]

In [87]:
list(range(2, 20, 3))

[2, 5, 8, 11, 14, 17]
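Even though a range object is lazy, it still supports `len()`, indexing, and membership tests without being converted to a list first. The sketch below also shows a negative step size, which counts downwards:

```python
numbers = range(2, 20, 3)  # represents 2, 5, 8, 11, 14, 17

print(len(numbers))    # 6
print(numbers[1])      # 5
print(11 in numbers)   # True
print(12 in numbers)   # False

# A negative step counts down; stop (0 here) is still excluded.
countdown = list(range(10, 0, -2))
print(countdown)  # [10, 8, 6, 4, 2]
```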

# The len function

We can get the number of elements in a list by using `len()` function.

## Examples 

In [88]:
Mixed = [90, 2.5, 'Ezekiel', 123, 0.75, True, False]

len(Mixed)

7

In [89]:
east_africa = ["Ethiopia", "Kenya", "Rwanda", "Somalia", "South Sudan", "Uganda", "Burundi"]

len(east_africa)

7

In [90]:
eastafrica_northafrica = [["Ethiopia", "Kenya", "Rwanda", "Somalia", "South Sudan", "Uganda", "Burundi"],
                           ["Algeria", "Egypt", "Libya", "Morocco", "Sudan", "Tunisia"]]

len(eastafrica_northafrica)

2

In [91]:
new_list = [1, 2, 3, ['a', 'b', 'c']]

len(new_list)

4

# Indexing and Slicing

This works the same as the indexing and slicing of a string.

# Example 1

In [92]:
mylist = [90, 2.5, "Joan", 123, 0.75, True, False, "Ruth", "Jamal"]

In [93]:
mylist[2]

'Joan'

In [94]:
mylist[0:3]

[90, 2.5, 'Joan']

# Example 2

In [95]:
new_list = [1, 2, 3, ['a', 'b', 'c']]

In [96]:
new_list[0]

1

In [97]:
new_list[3]

['a', 'b', 'c']

In [98]:
new_list[3][0]

'a'

# Example 3

In [99]:
password_list = [2, 3, "four", [20, 30, 40, ["one", "two", "three"]]]

In [100]:
password_list[3]

[20, 30, 40, ['one', 'two', 'three']]

In [101]:
password_list[3][3]

['one', 'two', 'three']

In [102]:
password_list[3][3][1:]

['two', 'three']

## Mutability

A list is mutable, that is, you can change an element of a list to another value.

# Example 1

In [103]:
mylist = [1, 2, 3, 4, 5]

In [104]:
mylist[0] 

1

In [105]:
mylist[0] = 9

In [106]:
mylist

[9, 2, 3, 4, 5]

As you can see, we have changed the first element in a list from $1$ to $9$.

# Example 2

In [107]:
another_list = [90, 2.5, 'Aaden', 123, 0.75, True, False, 'Adhra', True, "No"]

In [108]:
another_list[2]

'Aaden'

In [109]:
another_list[2] = "Jamal"

In [110]:
another_list

[90, 2.5, 'Jamal', 123, 0.75, True, False, 'Adhra', True, 'No']

# List Methods

Methods are actions you can call on an object. Their typical format is:

    mylist = [elements in a list]
    
    mylist.method()
    
You must include the parentheses to execute the method! Let's go through a few methods that pertain to lists.

# .append( ) method

`.append()` adds an object to the end of the list. It adds one element at a time.

# Examples

In [111]:
mylist = [90, 2.5, 'Aaden', 123, 0.75, True, False, 'Adhra', True, "No"]

In [112]:
mylist

[90, 2.5, 'Aaden', 123, 0.75, True, False, 'Adhra', True, 'No']

In [113]:
mylist.append(6)

In [114]:
mylist

[90, 2.5, 'Aaden', 123, 0.75, True, False, 'Adhra', True, 'No', 6]

In [115]:
mylist.append("Jamal")

In [116]:
mylist

[90, 2.5, 'Aaden', 123, 0.75, True, False, 'Adhra', True, 'No', 6, 'Jamal']

# .extend()

An alternative to `.append()` is `.extend()`, which extends a list by appending every element of another collection (for example, another list).

In [117]:
mylist.extend(["Addisu", "Aklilu", "Amondi", "Bakari"])

In [118]:
mylist

[90,
 2.5,
 'Aaden',
 123,
 0.75,
 True,
 False,
 'Adhra',
 True,
 'No',
 6,
 'Jamal',
 'Addisu',
 'Aklilu',
 'Amondi',
 'Bakari']
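The difference between `.append()` and `.extend()` is easiest to see when both are given the same list as an argument. In this sketch, `names_a` and `names_b` are hypothetical lists that start out identical:

```python
names_a = ["Addisu", "Aklilu"]
names_b = ["Addisu", "Aklilu"]

# append adds its argument as ONE element, so the list becomes nested.
names_a.append(["Amondi", "Bakari"])
print(names_a)       # ['Addisu', 'Aklilu', ['Amondi', 'Bakari']]
print(len(names_a))  # 3

# extend adds each element of its argument separately.
names_b.extend(["Amondi", "Bakari"])
print(names_b)       # ['Addisu', 'Aklilu', 'Amondi', 'Bakari']
print(len(names_b))  # 4
```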

# .insert( ) method

The `.insert()` method inserts an object before a given index position.

# Examples

In [119]:
mylist = [90, 2.5, 'Aaden', 123, 0.75, True, False, 'Adhra', True, "No"]

mylist.insert(3, "Bakaffa")

In [120]:
mylist

[90, 2.5, 'Aaden', 'Bakaffa', 123, 0.75, True, False, 'Adhra', True, 'No']

In [121]:
visited_countries = ["America", "Rwanda", "Singapore", "Italy", "Canada", "Mauritius"]

In [122]:
visited_countries.insert(0, "South Africa")

In [123]:
visited_countries

['South Africa',
 'America',
 'Rwanda',
 'Singapore',
 'Italy',
 'Canada',
 'Mauritius']

# .pop() method

The `.pop()` method removes and returns the item at the given index (by default, the last item).

In [124]:
mylist = [90, 2.5, 'Aaden', 123, 0.75, True, False, 'Adhra', True, "No"]

mylist.pop()

'No'

In [125]:
mylist

[90, 2.5, 'Aaden', 123, 0.75, True, False, 'Adhra', True]

In [126]:
mylist.pop(0)

90

In [127]:
mylist

[2.5, 'Aaden', 123, 0.75, True, False, 'Adhra', True]

# .reverse() method

The `.reverse()` method reverses the order of the list in place.

In [128]:
your_list = ["Sylvera", 90, 2.5, 'Aaden', 123, 0.75, True, False, 'Adhra', True, "No"]

In [129]:
your_list.reverse()

In [130]:
your_list

['No', True, 'Adhra', False, True, 0.75, 123, 'Aaden', 2.5, 90, 'Sylvera']

In [131]:
visited_countries = ["America", "Rwanda", "Singapore", "Italy", "Canada", "Mauritius"]

visited_countries.reverse()

visited_countries

['Mauritius', 'Canada', 'Italy', 'Singapore', 'Rwanda', 'America']

# .sort() method

The `.sort()` method sorts the list in ascending order in place and returns `None`.

# Example 1

In [132]:
# Example 1

egg_weight = [59, 56, 61, 68, 52, 53, 69, 54, 57, 51]


egg_weight.sort()

In [133]:
egg_weight 

[51, 52, 53, 54, 56, 57, 59, 61, 68, 69]

# Example 2

Data relating to the marks of 13 students in the Introduction to Python quiz are given below:

10, 15, 10, 9, 18, 16, 14, 12, 16, 13, 15, 20, 17.

Sort the marks in descending order.

In [134]:
marks = [10, 15, 10, 9, 18, 16, 14, 12, 16, 13, 15, 20, 17]

In [135]:
marks.sort(reverse = True)

In [136]:
marks

[20, 18, 17, 16, 16, 15, 15, 14, 13, 12, 10, 10, 9]
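Because `.sort()` changes the list itself and returns `None`, assigning its result to a variable is a common mistake. The built-in function `sorted()`, which is not covered above, is an alternative that returns a new sorted list and leaves the original untouched. A short sketch using a shortened version of the marks:

```python
marks = [10, 15, 10, 9, 18]

# sorted() returns a new list; marks itself is unchanged.
ordered = sorted(marks)
print(ordered)  # [9, 10, 10, 15, 18]
print(marks)    # [10, 15, 10, 9, 18]

# .sort() sorts in place and returns None.
result = marks.sort()
print(result)   # None
print(marks)    # [9, 10, 10, 15, 18]

# Both accept reverse=True for descending order.
descending = sorted(marks, reverse=True)
print(descending)  # [18, 15, 10, 10, 9]
```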

# Class activity

1. Create a list that contains the names of your best friends.

1. Use Python to access the third element in the name list.

1. How many friends are in your list?

# Tuples 

Tuples are ordered sequences just like lists, but with one major difference: they are **immutable**. That is, you cannot *change* them. In practice, this means you cannot reassign an item once it's in the tuple, unlike a list, where you can do a reassignment.

Just as the elements of a list are put in square brackets `[ ]` separated by commas, the elements of a tuple are enclosed in parentheses `( )` separated by commas `,`.

# Tip

You use parentheses and commas for a tuple.

A list is mutable while a tuple is immutable.


# Examples

In [137]:
# An empty tuple

atuple = ()

type(atuple)

tuple

In [138]:
t = (3, 2)

t

(3, 2)

# Tuple of integers 

In [139]:
ages = (23, 2, 45, 6, 76, 7)

In [140]:
type(ages)

tuple

# Tuple of float

In [141]:
marks = (23.1, 20.8, 25.1, 17.9)

In [142]:
type(marks)

tuple

# Tuple of string

In [143]:
teachers_name = ("Diric", "Bilen", "Baruk", "Jamal", "Gelila")

In [144]:
type(teachers_name)

tuple

# Mixed Tuple

In [145]:
mixed_tuple = (21, 12.3, 33.6, 9, "Gelila", True, False)

In [146]:
type(mixed_tuple)

tuple

# Nested tuple

This is also known as a tuple of tuples.

In [147]:
nexted = ("Ethiopia", (1, 2.5, 6), ('Kenya'), (1, 22, 14, 15))

In [148]:
type(nexted)

tuple

# The len function

We can get the number of elements in a tuple by using `len()` function.

## Example 1

In [149]:
mixed_tuple = (21, 12.3, 33.6, 23.8, 9, "Gelila", True, False)

In [150]:
len(mixed_tuple)

8

## Example 2

In [151]:
nexted = ("Ethiopia", (1, 2.5, 6), ('Kenya'), (1, 22, 14, 15))

len(nexted)

4
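One detail worth noting about parentheses: on their own they do not create a tuple. An expression such as `('Kenya')` is just the string `'Kenya'`; a one-element tuple needs a trailing comma. A minimal sketch:

```python
not_a_tuple = ("Kenya")   # parentheses only: this is a plain string
one_tuple = ("Kenya",)    # the trailing comma makes it a tuple

print(type(not_a_tuple))  # <class 'str'>
print(type(one_tuple))    # <class 'tuple'>

# len() counts characters for the string, elements for the tuple.
print(len(not_a_tuple))   # 5
print(len(one_tuple))     # 1
```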

# Indexing and Slicing

This works the same as the indexing and slicing of a list.

# Example 1

In [152]:
mytuple = (90, 2.5, 'Aaden', 123, 0.75, True, False, 'Adhra', True, "No")

In [153]:
mytuple[3]

123

In [154]:
mytuple[0:3]

(90, 2.5, 'Aaden')

# Example 2

In [155]:
new_tuple = (1, 2, 3, ('a', 'b', 'c'))

In [156]:
new_tuple[0]

1

In [157]:
new_tuple[3]

('a', 'b', 'c')

In [158]:
new_tuple[3][0]

'a'

## Immutability

While a list is mutable, a tuple is not. That is, you can't replace an element of a tuple.

# Examples

In [159]:
mytuple = (1, 2, 3)

In [160]:
mytuple[0] = 9

TypeError: 'tuple' object does not support item assignment

As you can see, you can't change an element of a tuple.

Most list methods, in particular those that modify the list such as `.append()`, are not available on tuples. For example:

In [161]:
mytuple.append('NOPE!')

AttributeError: 'tuple' object has no attribute 'append'

## Tuple methods

Tuples have only two methods available: `.index()` and `.count()`.

`.index()` returns the index of the first occurrence of a value.

`.count()` returns the number of occurrences of a value.

In [162]:
t = ("a", "b", "a",  "c", "a")

In [163]:

t.index("b")

1

In [164]:
t.count("a")

3

## Why use tuples?

Lists and tuples are very similar, so you may find yourself exchanging use cases for either one. However, you should use a tuple for collections or sequences that shouldn't be changed, such as the dates of the year, or user information such as an address, street, city , etc.

# Class activity

1. Create a tuple that includes the names of your best friends

1. Use Python to access the third element in the tuple.

1. What is the length of the tuple?

# Sets

Another fundamental data structure is the set. A set is an unordered and unindexed collection of unique elements. We can construct one by using curly brackets `{ }`, with the elements separated by commas (,).

Let's go ahead and make a set to see how it works:

# Example 1

In [165]:
call_received = {0, 1, 4, 2, 3, 5}

call_received

{0, 1, 2, 3, 4, 5}

In [166]:
type(call_received)

set

In this output the elements appear in ascending order, but this is not guaranteed: a set is unordered, so do not rely on the order in which its elements are displayed.

# Example 2

In [167]:
# Here are the set of fruits in my Fridge

myfruit = {"Apple", "Banana", "Cherry", "Orange", "Pineapples", "Grape", "Pawpaw"}

In [168]:
myfruit

{'Apple', 'Banana', 'Cherry', 'Grape', 'Orange', 'Pawpaw', 'Pineapples'}

In [169]:
type(myfruit)

set

# A note

One distinctive feature of a set is that it does not keep duplicates of an element.

# Example 1

In [170]:
student_age = {19, 20, 15, 19, 16, 21, 17}

student_age

{15, 16, 17, 19, 20, 21}

In [171]:
type(student_age)

set

Our initial elements in `student_age` were 19, 20, 15, 19, 16, 21, and 17, but the set has removed the duplicate 19 and we are now left with the elements 15, 16, 17, 19, 20, and 21.
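This behaviour gives a simple way to remove duplicates from a list: pass the list to `set()`. The sketch below uses a hypothetical list `ages_list` with the same values, and `sorted()` to get the unique values back as an ordered list:

```python
ages_list = [19, 20, 15, 19, 16, 21, 17]  # hypothetical list with one duplicate

# set() keeps only one copy of each value.
unique_ages = set(ages_list)
print(len(ages_list), len(unique_ages))  # 7 6

# A set has no order; sorted() returns the unique values as an ordered list.
sorted_unique = sorted(unique_ages)
print(sorted_unique)  # [15, 16, 17, 19, 20, 21]
```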

# The len function

We can get the number of elements in a set by using `len()` function.

## Example 

In [172]:
len(student_age)

6

In [173]:
# Set also support mixed data type

myset= {90, 2.5, 'Aaden', 123, 0.75, True, False, 'Adhra', True, "No"}

In [174]:
len(myset)

9

# Indexing and Slicing

Since sets are unordered collections of unique elements, indexing has no meaning. Hence, the indexing and slicing operator `[]` will not work.

In [175]:
students_age = {19, 20, 15, 16, 21, 17}

students_age[1]

TypeError: 'set' object is not subscriptable

As you can see, that throws an error. Since indexing does not work, an element of a set cannot be replaced by position.

# Adding element to a set

Since a set is mutable, it is possible to add an element to an existing set by using the `.add()` method. If the element is already present in the set, the set is left unchanged.

# Example 1

In [176]:
student_age = {19, 20, 15, 16, 21, 17}

In [177]:
student_age.add(18)

In [178]:
student_age

{15, 16, 17, 18, 19, 20, 21}

# Example 2

In [179]:
myset= {"Gurey", "Guuleed", "Jamal", "Ruth", "Habsade", "Habtom"}

myset

{'Gurey', 'Guuleed', 'Habsade', 'Habtom', 'Jamal', 'Ruth'}

In [180]:
len(myset)

6

In [181]:
myset.add("Juma")

In [182]:
myset

{'Gurey', 'Guuleed', 'Habsade', 'Habtom', 'Jamal', 'Juma', 'Ruth'}

In [183]:
len(myset)

7
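The examples above add new elements only. The sketch below shows the other case, adding an element that is already present, and uses the `in` operator to test whether a value belongs to a set. The set here is a shortened, hypothetical version of `myset`:

```python
myset = {"Gurey", "Jamal", "Ruth"}

# "Jamal" is already in the set, so add() leaves it unchanged.
myset.add("Jamal")
print(len(myset))  # 3

# The in operator tests membership.
has_ruth = "Ruth" in myset
has_juma = "Juma" in myset
print(has_ruth, has_juma)  # True False
```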

# Basic Set Operations 

We can perform set operations such as the union, intersection, and difference (complement) of two sets. As you know, sets have unique values; they eliminate duplicates. We can represent the relationship between sets in a diagram known as a Venn diagram.

![](Images/Venn.png)

## Union of sets

The union of set `A` and `B`, denoted by $A \cup B$,  is the collection of all elements in both sets without any duplication of elements.


![](Images/union.jpg)


To get the union of two sets, we put `|` between the two sets.

# Example 1

In [184]:
A = {7, 2, 5}

B = {2, 5, 1, 8}

What is the union of `A` and `B`?

In [185]:
AunionB = A | B

In [186]:
print(AunionB)

{1, 2, 5, 7, 8}


We can also use the `.union()` method to get the union of two sets.

In [187]:
AunionB = A.union(B)

In [188]:
print(AunionB)

{1, 2, 5, 7, 8}


# Example 2

In [189]:
odd_number = {1, 3, 5, 7, 9}

even_number = {2, 4, 6, 8, 10}

What is the union of set `odd_number` and `even_number`?

In [190]:
all_numbers = odd_number | even_number

In [191]:
print(all_numbers)

{1, 2, 3, 4, 5, 6, 7, 8, 9, 10}


# Example 3

In [193]:
myfruits = {"Apple", "Banana", "Cherry", "Orange", "Water melon" }

friend_fruit = {"Pineapples", "Grape", "Pawpaw", "Banana", "Mango"}

In [194]:
all_fruits = myfruits | friend_fruit

all_fruits

{'Apple',
 'Banana',
 'Cherry',
 'Grape',
 'Mango',
 'Orange',
 'Pawpaw',
 'Pineapples',
 'Water melon'}

In [195]:
len(all_fruits)

9

My friend and I have 9 fruits altogether.

# Class activity

![](Images/Union_1.jpg)

Consider the venn diagram above:

1. represent both sets X and Y in Python

2. what is the union of set X and Y?

# Intersection

Intersection of two sets `A` and `B`, denoted by $A \cap B$, is the set containing all elements of A that also belong to B (or equivalently, all elements of B that also belong to A).


![](Images/intersection.jpg)

To compute the intersection of two sets in Python, we put `&` between the two sets.

# Example 1

In [196]:
A = {2, 4, 5, 1, 3}

B = {1, 3, 9, 12}

In [197]:
intersection = A & B

In [198]:
print(intersection)

{1, 3}


We can also use the `.intersection()` method.

In [199]:
intersection = A.intersection(B)

In [200]:
print(intersection)

{1, 3}


# Example 2

In [201]:
myfruits = {"Apple", "Banana", "Cherry", "Orange", "Water melon" }

friend_fruits = {"Pineapples", "Grape", "Pawpaw", "Banana", "Mango"}

In [202]:
common_fruit = myfruits & friend_fruits

common_fruit

{'Banana'}

As you can see, we have only Banana in common.

# Class activity

![](Images/Intersection_1.png)

Consider the venn diagram above:

1. represent both sets P and Q in Python

2. what is the intersection of set P and Q?

# Complement of a set (or set difference)


The complement or set difference of sets A and B, denoted by `A - B`, is the set of all elements in A that are not in B.

![](Images/difference.jpg)


To compute the difference of two sets in Python, we put `-` between the two sets.

# Example 1

In [203]:
A = {2, 4, 6, 3, 5, 7}

B = {3, 5, 7, 9, 11, 13}

difference = A-B

In [204]:
print(difference)

{2, 4, 6}


We can also use the `.difference()` method.

In [205]:
difference = A.difference(B)

In [206]:
print (difference)

{2, 4, 6}


# Example 2

In [207]:
myfruits = {"Apple", "Banana", "Cherry", "Orange", "Water melon" }

friend_fruits = {"Pineapples", "Grape", "Pawpaw", "Banana", "Mango"}

In [208]:
# Fruits that I have that my friend did not have

difference_fruit = myfruits - friend_fruits

difference_fruit

{'Apple', 'Cherry', 'Orange', 'Water melon'}

The fruits that I have that my friend did not have are Apple, Cherry, Orange, and Water melon.
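Unlike union and intersection, set difference depends on the order of the operands: `A - B` and `B - A` are generally different sets. A short sketch using the sets from Example 1, with `sorted()` used only to print the elements in a predictable order:

```python
A = {2, 4, 6, 3, 5, 7}
B = {3, 5, 7, 9, 11, 13}

only_in_A = A - B  # elements of A that are not in B
only_in_B = B - A  # elements of B that are not in A

print(sorted(only_in_A))  # [2, 4, 6]
print(sorted(only_in_B))  # [9, 11, 13]

# Union and intersection give the same result in either order.
print((A | B) == (B | A))  # True
print((A & B) == (B & A))  # True
```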

# Class activity

![](Images/difference_1.jpg)

Consider the venn diagram above:

1. represent both sets A and C in Python

2. what is the difference of set C and A?

# Dictionary

A dictionary is a collection of `key-value` pairs that preserves insertion order (in Python 3.7 and later). That is, each item consists of two elements, namely, a key and a value. A dictionary is useful when we want to look up data by name rather than by position. We must know the `key` before we can retrieve the `value`.

We can create a dictionary by defining key and value elements inside curly brackets `{}`.

# Example 1

In [209]:
my_dictionary = {"Key 1": "Value 1", "Key 2": "Value 2",  "Key 3": "Value 3"}

In [210]:
my_dictionary

{'Key 1': 'Value 1', 'Key 2': 'Value 2', 'Key 3': 'Value 3'}

In [211]:
type(my_dictionary)

dict

# Example 2

In [212]:
# Life expectancy is the average number of years a person is expected to live, based on their year of birth

# life expectancy in the year 2020

life_expectancy = {"Nigeria": 60, "Kenya": 69, "Uganda": 68, "Ethiopia": 68, "Sudan": 67, "Rwanda": 65, "Tanzania": 64, "Somalia": 54}

In [213]:
life_expectancy

{'Nigeria': 60,
 'Kenya': 69,
 'Uganda': 68,
 'Ethiopia': 68,
 'Sudan': 67,
 'Rwanda': 65,
 'Tanzania': 64,
 'Somalia': 54}

In [214]:
type(life_expectancy)

dict

# Example 3

In [215]:
# Some countries in the given Africa regions

africa_regions = {"East Africa": ["Ethiopia", "Kenya", "Rwanda", "Somalia", "South Sudan", "Uganda", "Burundi"],
                  "North Africa": ["Algeria", "Egypt", "Libya", "Morocco", "Sudan", "Tunisia"],
                  "West Africa": ["Nigeria", "Ghana", "Senegal", "Benin", "Liberia", "Mali", "Niger"],
                  "Central Africa": ["Angola", "Cameroon", "Chad", "Congo", "DRC", "Gabon"],
                  "Southern Africa": ["Botswana", "Eswatini", "Lesotho", "Namibia", "Réunion", "South Africa"]}

In [216]:
print(africa_regions)

{'East Africa': ['Ethiopia', 'Kenya', 'Rwanda', 'Somalia', 'South Sudan', 'Uganda', 'Burundi'], 'North Africa': ['Algeria', 'Egypt', 'Libya', 'Morocco', 'Sudan', 'Tunisia'], 'West Africa': ['Nigeria', 'Ghana', 'Senegal', 'Benin', 'Liberia', 'Mali', 'Niger'], 'Central Africa': ['Angola', 'Cameroon', 'Chad', 'Congo', 'DRC', 'Gabon'], 'Southern Africa': ['Botswana', 'Eswatini', 'Lesotho', 'Namibia', 'Réunion', 'South Africa']}


In [217]:
type(africa_regions)

dict

## Dictionary Length

To determine how many items a dictionary has, use the `len()` function.

## Example 1

In [218]:
life_expectancy = {"Nigeria": 60, "Kenya": 69, "Uganda": 68, "Ethiopia": 68, "Sudan": 67, "Rwanda": 65, 
                   "Tanzania": 64, "Somalia": 54}

len(life_expectancy)

8

## Example 2

In [219]:
africa_regions = {"East Africa": ["Ethiopia", "Kenya", "Rwanda", "Somalia", "South Sudan", "Uganda", "Burundi"],
                  "North Africa": ["Algeria", "Egypt", "Libya", "Morocco", "Sudan", "Tunisia"],
                  "West Africa": ["Nigeria", "Ghana", "Senegal", "Benin", "Liberia", "Mali", "Niger"],
                  "Central Africa": ["Angola", "Cameroon", "Chad", "Congo", "DRC", "Gabon"],
                  "Southern Africa": ["Botswana", "Eswatini", "Lesotho", "Namibia", "Réunion", "South Africa"]}

len(africa_regions)

5
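
Note that `len()` counts only the top-level keys, not the elements stored inside each value. The following is a minimal sketch, using a shortened, illustrative dictionary named `regions` (not the full data above), of how the two counts differ:

```python
# Shortened, illustrative version of the regions dictionary (not the full data)
regions = {
    "East Africa": ["Ethiopia", "Kenya", "Rwanda"],
    "North Africa": ["Algeria", "Egypt"],
}

# len() on the dictionary counts its keys
number_of_regions = len(regions)

# len() on one value counts the elements of that list
countries_in_east_africa = len(regions["East Africa"])

print(number_of_regions)         # 2
print(countries_in_east_africa)  # 3
```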

# Accessing dictionary items

Dictionary items are stored as key: value pairs, and each value is looked up by its key. Dictionaries in Python are mutable, so items can be added, changed, and removed after the dictionary has been created.

To access a value, write its key inside square brackets after the dictionary name. For example, consider the life expectancy dictionary:

In [220]:
life_expectancy = {"Nigeria": 60, "Kenya": 69, "Uganda": 68, "Ethiopia": 68, "Sudan": 67, "Rwanda": 65, "Tanzania": 64, "Somalia": 54}

To access the value stored under the key `"Nigeria"`, we use:

In [221]:
life_expectancy["Nigeria"]

60

Also, for Ethiopia, we use:

In [222]:
life_expectancy["Ethiopia"]

68

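You cannot access dictionary items by position. A number inside the square brackets is treated as a key, so the cell below raises a `KeyError` because `1` is not a key in this dictionary: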

In [223]:
life_expectancy[1]

KeyError: 1
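
When it is not certain that a key exists, the `in` operator and the `.get()` method offer a way to look up a value without triggering a `KeyError`. The sketch below uses a shortened copy of the life expectancy dictionary; the default text `"no data"` is only an illustrative choice.

```python
life_expectancy = {"Nigeria": 60, "Kenya": 69, "Uganda": 68}

# The in operator checks whether a key exists
has_kenya = "Kenya" in life_expectancy   # True
has_ghana = "Ghana" in life_expectancy   # False

# .get() returns the value if the key exists
kenya_value = life_expectancy.get("Kenya")               # 69

# For a missing key, .get() returns None or the default we supply
ghana_value = life_expectancy.get("Ghana")               # None
ghana_default = life_expectancy.get("Ghana", "no data")  # 'no data'

print(has_kenya, has_ghana, kenya_value, ghana_value, ghana_default)
```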

# Dictionary Methods

Some commonly used dictionary methods are:


- `.keys()`

- `.values()`

- `.items()`

- `.update()`

## .keys() method

The `.keys()` method returns a view object (`dict_keys`) containing all the keys in the dictionary. Pass it to `list()` if you need an actual list.

## Example 1

In [None]:
life_expectancy = {"Nigeria": 60, "Kenya": 69, "Uganda": 68, "Ethiopia": 68, "Sudan": 67, "Rwanda": 65, 
                   "Tanzania": 64, "Somalia": 54}

In [None]:
life_expectancy.keys()

## Example 2

In [None]:
africa_regions = {"East Africa": ["Ethiopia", "Kenya", "Rwanda", "Somalia", "South Sudan", "Uganda", "Burundi"],
                  "North Africa": ["Algeria", "Egypt", "Libya", "Morocco", "Sudan", "Tunisia"],
                  "West Africa": ["Nigeria", "Ghana", "Senegal", "Benin", "Liberia", "Mali", "Niger"],
                  "Central Africa": ["Angola", "Cameroon", "Chad", "Congo", "DRC", "Gabon"],
                  "Southern Africa": ["Botswana", "Eswatini", "Lesotho", "Namibia", "Réunion", "South Africa"]}

In [None]:
africa_regions.keys()
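
The object returned by `.keys()` is a view rather than a list, so it cannot be indexed by position, but it does stay in sync with the dictionary. A small sketch of this behaviour, using a shortened copy of the life expectancy data:

```python
life_expectancy = {"Nigeria": 60, "Kenya": 69}

keys_view = life_expectancy.keys()
print(type(keys_view))   # <class 'dict_keys'>

# Convert the view to a list when indexing by position is needed
key_list = list(keys_view)
first_key = key_list[0]
print(first_key)         # Nigeria

# The view reflects later changes to the dictionary; the list copy does not
life_expectancy["Uganda"] = 68
print(len(keys_view))    # 3
print(len(key_list))     # 2
```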

## .values() method

The `.values()` method returns a view object (`dict_values`) containing all the values in the dictionary.

## Example 1

In [None]:
life_expectancy = {"Nigeria": 60, "Kenya": 69, "Uganda": 68, "Ethiopia": 68, "Sudan": 67, "Rwanda": 65, 
                   "Tanzania": 64, "Somalia": 54}

In [None]:
life_expectancy.values()

## Example 2

In [None]:
africa_regions = {"East Africa": ["Ethiopia", "Kenya", "Rwanda", "Somalia", "South Sudan", "Uganda", "Burundi"],
                  "North Africa": ["Algeria", "Egypt", "Libya", "Morocco", "Sudan", "Tunisia"],
                  "West Africa": ["Nigeria", "Ghana", "Senegal", "Benin", "Liberia", "Mali", "Niger"],
                  "Central Africa": ["Angola", "Cameroon", "Chad", "Congo", "DRC", "Gabon"],
                  "Southern Africa": ["Botswana", "Eswatini", "Lesotho", "Namibia", "Réunion", "South Africa"]}

In [None]:
africa_regions.values()
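
Because the values of `life_expectancy` are numbers, the result of `.values()` can be passed directly to built-in functions such as `max()`, `min()`, `sum()`, and `len()`. A minimal sketch (the variable names `highest`, `lowest`, and `average` are introduced here for illustration):

```python
life_expectancy = {"Nigeria": 60, "Kenya": 69, "Uganda": 68, "Ethiopia": 68,
                   "Sudan": 67, "Rwanda": 65, "Tanzania": 64, "Somalia": 54}

values = life_expectancy.values()

highest = max(values)                # largest value in the dictionary
lowest = min(values)                 # smallest value in the dictionary
average = sum(values) / len(values)  # mean of all values

print(highest)   # 69
print(lowest)    # 54
print(average)   # 64.375
```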

## .items() method

The `.items()` method returns a view object (`dict_items`) in which each item of the dictionary appears as a `(key, value)` tuple.

## Example 1

In [None]:
life_expectancy = {"Nigeria": 60, "Kenya": 69, "Uganda": 68, "Ethiopia": 68, "Sudan": 67, "Rwanda": 65, 
                   "Tanzania": 64, "Somalia": 54}

In [None]:
life_expectancy.items()

## Example 2

In [None]:
africa_regions = {"East Africa": ["Ethiopia", "Kenya", "Rwanda", "Somalia", "South Sudan", "Uganda", "Burundi"],
                  "North Africa": ["Algeria", "Egypt", "Libya", "Morocco", "Sudan", "Tunisia"],
                  "West Africa": ["Nigeria", "Ghana", "Senegal", "Benin", "Liberia", "Mali", "Niger"],
                  "Central Africa": ["Angola", "Cameroon", "Chad", "Congo", "DRC", "Gabon"],
                  "Southern Africa": ["Botswana", "Eswatini", "Lesotho", "Namibia", "Réunion", "South Africa"]}

In [None]:
africa_regions.items()
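
A common use of `.items()` is to go through keys and values together in a `for` loop. The sketch below uses a shortened copy of the life expectancy data; the `threshold` of 65 is a hypothetical cut-off chosen only for the example.

```python
life_expectancy = {"Nigeria": 60, "Kenya": 69, "Uganda": 68, "Somalia": 54}

# Each item is a (key, value) tuple, so it can be unpacked in a for loop
for country, years in life_expectancy.items():
    print(country, "->", years)

# Collect the countries whose value is above a chosen threshold
threshold = 65  # hypothetical cut-off, for illustration only
above_threshold = [country for country, years in life_expectancy.items()
                   if years > threshold]

print(above_threshold)  # ['Kenya', 'Uganda']
```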

# Update Dictionary

The `.update()` method adds the items from the given argument to the dictionary. If a key already exists, its value is replaced. The argument is usually another dictionary of key: value pairs.

In [227]:
life_expectancy = {"Nigeria": 60, "Kenya": 69, "Uganda": 68, "Ethiopia": 68, "Sudan": 67, "Rwanda": 65, 
                   "Tanzania": 64, "Somalia": 54}

Suppose you are given the life expectancy of Burundi as $67$. You can add it to the existing dictionary by using:

In [228]:
life_expectancy.update({"Burundi": 67})

In [229]:
life_expectancy

{'Nigeria': 60,
 'Kenya': 69,
 'Uganda': 68,
 'Ethiopia': 68,
 'Sudan': 67,
 'Rwanda': 65,
 'Tanzania': 64,
 'Somalia': 54,
 'Burundi': 67}

We can also add a new item by assigning a value to a new key:

In [230]:
life_expectancy["Eritrea"] = 66

In [231]:
life_expectancy

{'Nigeria': 60,
 'Kenya': 69,
 'Uganda': 68,
 'Ethiopia': 68,
 'Sudan': 67,
 'Rwanda': 65,
 'Tanzania': 64,
 'Somalia': 54,
 'Burundi': 67,
 'Eritrea': 66}
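
`.update()` can also add several items in one call, and it overwrites the value of any key that already exists. A minimal sketch on a shortened copy of the data; the replacement value `61` for Nigeria is made up purely to show the overwrite and is not real data.

```python
life_expectancy = {"Nigeria": 60, "Kenya": 69}

# Several key: value pairs can be added in one call
life_expectancy.update({"Uganda": 68, "Ethiopia": 68})

# If the key already exists, its value is replaced.
# 61 is an illustrative value only, not real data.
life_expectancy.update({"Nigeria": 61})

print(life_expectancy)
# {'Nigeria': 61, 'Kenya': 69, 'Uganda': 68, 'Ethiopia': 68}
```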

# Removing Items in a dictionary

The `del` keyword removes the item with the specified key name.

# Example 1

To delete the entry for Eritrea from the life expectancy dictionary, we use:

In [232]:
del life_expectancy["Eritrea"]

In [233]:
life_expectancy

{'Nigeria': 60,
 'Kenya': 69,
 'Uganda': 68,
 'Ethiopia': 68,
 'Sudan': 67,
 'Rwanda': 65,
 'Tanzania': 64,
 'Somalia': 54,
 'Burundi': 67}

# Example 2

To delete the entry for Burundi, we use:

In [234]:
del life_expectancy["Burundi"]

In [235]:
life_expectancy

{'Nigeria': 60,
 'Kenya': 69,
 'Uganda': 68,
 'Ethiopia': 68,
 'Sudan': 67,
 'Rwanda': 65,
 'Tanzania': 64,
 'Somalia': 54}
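
Besides `del`, the `.pop()` method removes an item and also returns its value. Unlike `del`, it can take a default that is returned when the key is missing. The sketch below, on a shortened copy of the data, compares the two approaches:

```python
life_expectancy = {"Nigeria": 60, "Kenya": 69, "Uganda": 68}

# .pop() removes the key and returns its value
removed_value = life_expectancy.pop("Uganda")
print(removed_value)     # 68

# del on a key that does not exist raises KeyError
try:
    del life_expectancy["Ghana"]
except KeyError:
    print("Ghana is not in the dictionary")

# .pop() with a default avoids the error for a missing key
missing = life_expectancy.pop("Ghana", None)
print(missing)           # None

print(life_expectancy)   # {'Nigeria': 60, 'Kenya': 69}
```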

# Class activity

Just like the life expectancy data, the number of confirmed COVID-19 cases for several countries is given in the table below:

| Country     | Confirmed COVID-19 cases       |
|-------------|--------------------------------|
| Burundi     | 694                            |
| Ethiopia    | 113295                         |
| Ghana       | 52198                          |
| Kenya       | 88380                          |
| Nigeria     | 68937                          |
| Rwanda      | 6129                           |
| Somalia     | 4525                           |
| South Sudan | 3166                           |
| Sudan       | 19196                          |
| Uganda      | 22499                          |

1. Create a dictionary for the given data and name it `COVID_19`

2. How many keys are in the data?

3. Remove Nigeria from the data

4. Add Tanzania with the value $509$

# Class activity

Consider the dictionary:

In [None]:
d = {"fruits": ["apples", "oranges", "pears", "mangoes"],
     "vegetables": ["tomatoes", "lettuce", "spinach", "green peppers"],
     "meat": ["chicken", "fish", "beef", "ostrich"],
     "dairy": ["yogurt", "milk", "cheese", "ice-cream"]}

a. How many keys does d have?

b. List the values of d

c. How do you access `"spinach"` using the dictionary d?

d. How do you add a new fruit?