# Python Fundamentals - VI: Sequences

## Introduction

In this lesson, we will learn about data types and data structures that are sequences of items, along with dictionaries, which store key-value pairs rather than positioned items. These data structures are very useful in data analysis applications. We will discuss
 * Ranges
 * Lists
 * Tuples
 * Strings
 * Dictionaries

Note: 
1. Use the TOC to navigate between sections.
2. **This lesson does not cover all operations and functions related to these data structures. Please read this [section of the library reference](https://docs.python.org/3/library/stdtypes.html#sequence-types-list-tuple-range) for a more thorough coverage of useful concepts.**


## Sequence
A sequence is an ordered collection of items (i.e., items have specified positions). Sequences can be mutable (can be modified after they are created) or immutable (cannot be modified after they are created).

## Ranges

A range is an immutable sequence of integers. It is created using the range() function.

The range() function returns a sequence of integers with the specified (or default) start value, end value, and increment. The start value defaults to 0 and the increment defaults to 1; the end value itself is never included. Execute the examples below and add comments explaining what the range function returns.

In [None]:
# returns a sequence from __ to __ in increments of __
for val in range(0,5,1):
    print(val)

In [None]:
# returns a sequence from __ to __ in increments of __
for val in range(0,11,2):
    print(val)

In [None]:
# returns a sequence from __ to __ in increments of __
for val in range(10,0,-1):
    print(val)

In [None]:
# returns a sequence from __ to __ in increments of __
for val in range(1,11):
    print(val)

In [1]:
# returns a sequence from __ to __ in increments of __
for val in range(5):
    print(val)

0
1
2
3
4
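
As a quick check of the start, end, and increment rules, a range can be converted to a list so that all of its values show at once. The following is a minimal sketch; the variable names are illustrative and not part of the lesson.

```python
# list() materializes a range so its values can be inspected together.
evens = list(range(0, 11, 2))       # start 0, stop before 11, step 2
countdown = list(range(10, 0, -1))  # a negative step counts down and stops before 0
first_five = list(range(5))         # start defaults to 0, step defaults to 1

print(evens)       # [0, 2, 4, 6, 8, 10]
print(countdown)   # [10, 9, 8, 7, 6, 5, 4, 3, 2, 1]
print(first_five)  # [0, 1, 2, 3, 4]
```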


## Lists

A list is a mutable sequence of items or elements. Lists can contain elements of any data type.

### Creating a list 
A list is created by enclosing comma separated elements in square brackets. Let's look at a few examples below.

In [None]:
# empty list
blank_list = []
print(blank_list)

In [None]:
# list of names (strings)
names = ["Ajay", "Bella", "Yue"]
print(names)

In [None]:
# list of ages
ages = [30, 28, 31]
print(ages)

In [None]:
# list of mixed types - name (string), age (integer), exam scores (list)
student_info = ["Ajay", 30, [90, 95]]
print(student_info)

In [None]:
# list of mixed types - name (string), age (integer), exam scores (list) from previously created objects
name = "Bella"
age = 28
exam_scores = [90,95]
stud_info = [name, age, exam_scores]
print(stud_info)

One use case is to create a list from a range.

In [None]:
# alternate way (using typecasting) to create a list from a range 1-3
roll_call_numbers = list(range(1, 4))
print(roll_call_numbers)

Sometimes you want the same value repeated multiple times.

In [None]:
# create a list with 2 repeated 6 times
repeated_twos = [2] * 6
print(repeated_twos)


### Accessing list elements

Since the elements of a list are ordered, we can access them by specifying the position or index of an element. Keep in mind that the indices start from zero. If you have a list with 5 elements then the indices are 0-4. 

In [2]:
countries = ['Australia','Bangladesh','Croatia','Denmark','Ethiopia']

In [3]:
print(countries) # print the entire list (as a list object)

['Australia', 'Bangladesh', 'Croatia', 'Denmark', 'Ethiopia']


In [4]:
 # first element 
countries[0]

'Australia'

In [5]:
 # second element 
countries[1]


'Bangladesh'

In [8]:
 # fifth or last element using index
countries[4]

'Ethiopia'

In [10]:
 # last element using -index
countries[-1]

'Ethiopia'

In [13]:
print(len(countries))
# last element using len() and relative index
countries[len(countries)-1]

5


'Ethiopia'

In [None]:
 # second last element using -index
countries[-2]

In [16]:
print("--- printing all countries in the list individually using indices---")
for i in range(len(countries)):
    print(countries[i])


--- printing all countries in the list individually using indices---
Australia
Bangladesh
Croatia
Denmark
Ethiopia


In [3]:
print("--- printing all countries in the list individually using an enumerator and in uppercase ---")
for i in range(len(countries)):
    print(countries[i].upper())

--- printing all countries in the list individually using an enumerator and in uppercase ---
AUSTRALIA
BANGLADESH
CROATIA
DENMARK
ETHIOPIA


In [4]:
print("--- printing list ---")
print(countries)

--- printing list ---
['Australia', 'Bangladesh', 'Croatia', 'Denmark', 'Ethiopia']


In [5]:
print("--- printing all countries in the list without square brackets ---")
print(*countries)

--- printing all countries in the list without square brackets ---
Australia Bangladesh Croatia Denmark Ethiopia


In [6]:
print("--- printing all countries in the list individually without iteration ---")
print(*countries, sep = "\n")

--- printing all countries in the list individually without iteration ---
Australia
Bangladesh
Croatia
Denmark
Ethiopia


### Slicing lists

Slicing refers to retrieving specific parts of a list. The general syntax for slicing is `list[[start]:[end][:[step]]]`. With a positive step, this returns a new list containing the elements from the index *start* (0, if unspecified) up to the index *(end - 1)* (end of list, if unspecified), taking every *step*-th element (every element, if unspecified).

**Note:** The convention for writing syntax is to place optional arguments inside square brackets. 

In [11]:
# Return the entire list using the slicing syntax
country_slice = countries[:]
print(country_slice)

['Australia', 'Bangladesh', 'Croatia', 'Denmark', 'Ethiopia']


In [19]:
# Return first 3 countries
country_slice = countries[0:3]
print(country_slice)

['Australia', 'Bangladesh', 'Croatia']


In [21]:
# Return last 3 countries
country_slice = countries[-3:]
print(country_slice)

['Croatia', 'Denmark', 'Ethiopia']


In [20]:
# Print each country (in a new row) in the slice that contains the 2nd, 3rd and 4th entries.
country_slice = countries[1:4]
print(*country_slice, sep = "\n")

Bangladesh
Croatia
Denmark


In [23]:
# Print alternate country names from the 2nd, 3rd and 4th entries.
country_slice = countries[1:4:2]
print(*country_slice, sep = "\n")

Bangladesh
Denmark


In [24]:
# note the difference between the two statements below. The first is a string, the second is a list
print(countries[0])
print(countries[0:1])

print(type(countries[0]))
print(type(countries[0:1]))

Australia
['Australia']
<class 'str'>
<class 'list'>
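
The optional parts of the slicing syntax can also be combined with negative values. The sketch below uses a hypothetical list named `letters` (not part of the lesson) to show an omitted start and end, a negative step, a negative start, and an end that lies past the end of the list.

```python
letters = ['a', 'b', 'c', 'd', 'e']  # hypothetical sample list

every_other = letters[::2]     # start and end omitted, step of 2
reversed_copy = letters[::-1]  # a negative step walks the list backwards
last_two = letters[-2:]        # a negative start counts from the end
beyond_end = letters[3:100]    # an end past the list length is clipped, not an error

print(every_other)    # ['a', 'c', 'e']
print(reversed_copy)  # ['e', 'd', 'c', 'b', 'a']
print(last_two)       # ['d', 'e']
print(beyond_end)     # ['d', 'e']
print(letters)        # ['a', 'b', 'c', 'd', 'e'] (slicing leaves the original unchanged)
```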


### Modifying lists (aka mutable sequence operations)

Lists are mutable; that is, they can be modified. You can modify a list by assigning new values to some of its elements, appending new elements, inserting a new element at a given position, or removing elements. You can also permanently change the sorting order of a list. The relevant syntax is included below. 

- Assign a new value to a list element <br/>
`list[start[:[endindex]][:[step]]] = value`
- Append an item to the list (add at the end) <br/>
`list.append(item)`
- Append all items of another list to the list (add at the end) <br/>
`list.extend(newlist)` <br/>
- Insert an item at a specified position (shifts items that come after and increases the length of the list by 1)<br/>
`list.insert(insert_index, item_to_insert)` 
- Remove the item at the specified index (last item, if index is unspecified)<br/>
`list.pop([index_to_remove])` 
- Remove a specified item from a list<br/>
`list.remove(item_to_remove)` 
- Remove a specified slice from a list<br/>
`del list[start[:end][:step]]` 
- Remove all elements from a list<br/>
`list.clear()` 
- Reverse a list <br/>
`list.reverse()`
- Sort a list <br/>
`list.sort([reverse = False])`


**Note:** In all cases above, the original list is modified as a result of the operation. Some list operations don't modify the original list. These include slicing and other operations explored in subsequent subsections.
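
The operations listed above can be sketched on a small example. The list `cities` and the other variable names below are hypothetical and chosen only for illustration; the comments trace the state of the list after each step.

```python
cities = ['Lima', 'Cairo', 'Oslo']  # hypothetical sample data

cities.append('Quito')             # ['Lima', 'Cairo', 'Oslo', 'Quito']
cities.extend(['Dhaka', 'Accra'])  # adds each item: [..., 'Quito', 'Dhaka', 'Accra']
removed_last = cities.pop()        # removes and returns the last item, 'Accra'
removed_first = cities.pop(0)      # removes and returns the item at index 0, 'Lima'
cities.remove('Oslo')              # removes the first occurrence of 'Oslo'
del cities[0:2]                    # deletes the slice holding 'Cairo' and 'Quito'
after_del = list(cities)           # snapshot: ['Dhaka']

cities.extend(['Bern', 'Apia'])    # ['Dhaka', 'Bern', 'Apia']
cities.reverse()                   # reverses in place: ['Apia', 'Bern', 'Dhaka']
after_reverse = list(cities)       # snapshot of the reversed list

cities.clear()                     # removes every element
print(removed_last, removed_first, after_del, after_reverse, cities)
```

Each method changes `cities` itself, which is why snapshots are taken with `list(...)` before the next change.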

In [25]:
# Create four lists
countries_north_america = ['Canada', 'United States']
countries_south_america = ['mexico', 'Peru', 'Brazil', 'Chile']
countries_asia = ['China', 'Bangladesh', 'India']
countries_africa = ['Nigeria', 'South Africa', 'Tanzania', 'Egypt', 'Ethiopia']

In [27]:
# Change the first element of countries_south_america to Mexico
countries_south_america[0]="Mexico"
print(countries_south_america)

['Mexico', 'Peru', 'Brazil', 'Chile']


In [29]:
# insert "Indonesia" as the third country in countries_asia
countries_asia.insert(2,"Indonesia")
print(countries_asia)

['China', 'Bangladesh', 'Indonesia', 'India']


In [30]:
# sort countries_africa in descending order
countries_africa.sort(reverse=True)
print(countries_africa)

['Tanzania', 'South Africa', 'Nigeria', 'Ethiopia', 'Egypt']


### Aliasing vs creating a copy of the list

Sometimes you will want to assign a different name to the same list (create an alias). When you do so, changes made through one name are visible through every other name, because all of the names refer to the same list object. Let's look at an example.  

In [2]:
countries = ['Australia','Bangladesh','Croatia','Denmark','Ethiopia']
# assign countries to countrylist (creates an alias, not a copy)
countrylist = countries

In [3]:
print("Printing countries")
print(countries)
print("Printing countrylist")
print(countrylist)

Printing countries
['Australia', 'Bangladesh', 'Croatia', 'Denmark', 'Ethiopia']
Printing countrylist
['Australia', 'Bangladesh', 'Croatia', 'Denmark', 'Ethiopia']


In [4]:
# Append France to countrylist
countrylist.append("France")
print("Append France to countrylist")


Append France to countrylist


In [5]:
print("Printing countries")
print(countries)
print("Printing countrylist")
print(countrylist)

Printing countries
['Australia', 'Bangladesh', 'Croatia', 'Denmark', 'Ethiopia', 'France']
Printing countrylist
['Australia', 'Bangladesh', 'Croatia', 'Denmark', 'Ethiopia', 'France']


You may encounter a situation where you want to copy a list and modify the copy only (not the original). You can do so by assigning a 'full' slice of the list to another list variable or by using the list's copy() method as shown below.

In [44]:
countries = ['Australia','Bangladesh','Croatia','Denmark','Ethiopia']

# assign countries to countrylist using the slicing notation 
countrylist = countries[:]

print("Printing countrylist")
print(countrylist)

Printing countrylist
['Australia', 'Bangladesh', 'Croatia', 'Denmark', 'Ethiopia']


In [45]:
# Append France to countrylist
print("Append France to countrylist")
countrylist.append('France')

print("Printing countries")
print(countries)
print("Printing countrylist")
print(countrylist)

Append France to countrylist
Printing countries
['Australia', 'Bangladesh', 'Croatia', 'Denmark', 'Ethiopia']
Printing countrylist
['Australia', 'Bangladesh', 'Croatia', 'Denmark', 'Ethiopia', 'France']


In [46]:
# using the copy() method
countries = ['Australia','Bangladesh','Croatia','Denmark','Ethiopia']

# assign countries to countrylist using the copy() method
countrylist = countries.copy()

print("Printing countries")
print(countries)
print("Printing countrylist")
print(countrylist)

Printing countries
['Australia', 'Bangladesh', 'Croatia', 'Denmark', 'Ethiopia']
Printing countrylist
['Australia', 'Bangladesh', 'Croatia', 'Denmark', 'Ethiopia']


In [47]:
print("Append France to countrylist")
countrylist.append('France')

print("Printing countries")
print(countries)
print("Printing countrylist")
print(countrylist)

Append France to countrylist
Printing countries
['Australia', 'Bangladesh', 'Croatia', 'Denmark', 'Ethiopia']
Printing countrylist
['Australia', 'Bangladesh', 'Croatia', 'Denmark', 'Ethiopia', 'France']
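
The difference between an alias and a copy can also be checked directly with the `is` operator, which tests whether two names refer to the same object. This is a minimal sketch; the names `original`, `alias`, `slice_copy`, and `method_copy` are illustrative and not part of the lesson.

```python
original = ['Australia', 'Bangladesh']
alias = original               # a second name for the same list object
slice_copy = original[:]       # a new list with the same elements
method_copy = original.copy()  # also a new list with the same elements

alias.append('Croatia')        # the change is visible through original as well

print(alias is original)       # True: same object
print(slice_copy is original)  # False: separate object
print(original)                # ['Australia', 'Bangladesh', 'Croatia']
print(slice_copy)              # ['Australia', 'Bangladesh']
print(method_copy)             # ['Australia', 'Bangladesh']
```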


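List methods such as sort() modify the list in place, whereas built-in functions such as sorted() return a new list and leave the original unchanged. The same distinction applies when combining lists: append() and extend() change the original list, while the + operator creates a new one.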

In [48]:
# create list
countries = ['Australia','Croatia','Bangladesh','Denmark','Ethiopia']
print(countries)

['Australia', 'Croatia', 'Bangladesh', 'Denmark', 'Ethiopia']


In [49]:
# permanently sort the list in reverse order
countries.sort(reverse=True)
print(countries)

['Ethiopia', 'Denmark', 'Croatia', 'Bangladesh', 'Australia']


In [50]:
countries = ['Australia','Croatia','Bangladesh','Denmark','Ethiopia']
print(countries)

['Australia', 'Croatia', 'Bangladesh', 'Denmark', 'Ethiopia']


In [51]:
# sort without modifying the original: sorted() returns a new list, stored in countries_rev
countries_rev=sorted(countries, reverse = True)
print(countries)
print(countries_rev)

['Australia', 'Croatia', 'Bangladesh', 'Denmark', 'Ethiopia']
['Ethiopia', 'Denmark', 'Croatia', 'Bangladesh', 'Australia']


In [55]:
# create two lists
countries = ['Australia','Bangladesh','Croatia']
more_countries = ['Denmark','Ethiopia']

print(*countries)
print(*more_countries)

Australia Bangladesh Croatia
Denmark Ethiopia


In [56]:
# append() adds more_countries as a single nested element instead of combining the lists
countries.append(more_countries)
print(countries)
print(more_countries)

['Australia', 'Bangladesh', 'Croatia', ['Denmark', 'Ethiopia']]
['Denmark', 'Ethiopia']


In [57]:
# create two lists
countries = ['Australia','Bangladesh','Croatia']
more_countries = ['Denmark','Ethiopia']
print(countries)
print(more_countries)

['Australia', 'Bangladesh', 'Croatia']
['Denmark', 'Ethiopia']


In [None]:
# what if we want to append the countries from one list to the other list? Use extend()
countries.extend(more_countries)
print(countries)
print(more_countries)

In [58]:
# create two lists
countries = ['Australia','Bangladesh','Croatia']
more_countries = ['Denmark','Ethiopia']
print(countries)
print(more_countries)

['Australia', 'Bangladesh', 'Croatia']
['Denmark', 'Ethiopia']


In [59]:
#combine the lists without changing either and store result in combined_countries
combined_countries=countries+more_countries
print(countries)
print(more_countries)
print(combined_countries)

# you could use countries = countries + more_countries to change the original list countries

['Australia', 'Bangladesh', 'Croatia']
['Denmark', 'Ethiopia']
['Australia', 'Bangladesh', 'Croatia', 'Denmark', 'Ethiopia']


### Other useful list operations

Find if an element exists (or does not) in a list <br/>
`element in list`
`element not in list`

Find the smallest item in a list <br/>
`min(list)`

Find the largest item in a list <br/>
`max(list)`

Add the values in a list <br/>
`sum(list)`

Count the number of occurrences of an item in a list <br/>
`list.count(item)`

Index of an item in a list at or after the start index and before the end index <br/>
`list.index(item[, start[, end]])`

Compare two lists <br/>
`list1 == list2`
`list1 < list2` and so on. <br/>
List comparison can appear quite complicated.
For two lists to be equal they must have the same number of elements and the corresponding elements must compare equal (for example, the integer 1 and the string "1" are not equal). If lists are being compared for anything other than equality, they can have different lengths.
Lists (and other sequences) are compared by doing a pairwise comparison starting from index 0. If the items are equal, the next pair is compared and so on until the end of a list or until non-equal items are encountered. The result of the pairwise comparison of the first pair of non-equal items determines the result. If one list runs out of items before a difference is found, the shorter list is considered smaller. Ordering comparisons such as `<` between items that cannot be ordered (for example, an integer and a string) raise a TypeError.

**Note**: Some list functions (e.g., sum) only work on numbers. Others work on various data types. <br/>

In [62]:
num_list = [1,10,3,8,27,10,54,10,2,8]

In [63]:
# print the min value
print(min(num_list))

1


In [64]:
# is the number 10 in the list?
num_to_search = 10
print(num_to_search in num_list)

True


In [65]:
# number of times 10 appears in the list
print(num_list.count(num_to_search))

3


In [66]:
# where does 3 appear in the list?
num_list.index(3)

2
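
The optional start argument of index() is useful when an item appears more than once. The sketch below reuses the values of `num_list` from above; the other variable names are illustrative. Because index() raises a ValueError when the item is absent, the last line checks membership with `in` first.

```python
num_list = [1, 10, 3, 8, 27, 10, 54, 10, 2, 8]

first_ten = num_list.index(10)                  # first occurrence of 10
second_ten = num_list.index(10, first_ten + 1)  # search resumes just after the first match
total = sum(num_list)                           # adds all values in the list
largest = max(num_list)                         # largest value in the list

# Guard against ValueError by testing membership before calling index()
missing = 99
position = num_list.index(missing) if missing in num_list else None

print(first_ten, second_ten, total, largest, position)  # 1 5 133 54 None
```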

In [67]:
list1 = [1,2,3]
list2 = [1,2,3]
list3 = [1,3,4]
list4 = [1,2,3,4]
list5 = ["1","2","3"]
list6 = ['a','b','c']
list7 = [1,'a',2]
list8 = [1,'b',2]
list9 = [1,2,'a']
list10 = [2,3,1]

print(list1 == list2)
print(list1 == list3)
print(list1 == list4)
print(list1 == list5)
print(list1 == list6)
print(list5 == list6)
print(list7 == list8)
print(list7 == list9)
print(list1 < list4)
print(list1 < list3)
print(list1 < list10)

True
False
False
False
False
False
False
False
True
True
True


## Tuples

A tuple is an immutable sequence of objects, which may be of different types.

### Creating a tuple

A tuple is created by enclosing comma separated elements in round brackets. Let's look at a few examples below.

In [69]:
# empty tuple
blank_tuple = ()
print(blank_tuple)

()


In [70]:
# tuple of ages
ages = (30, 28, 31)  # 30, 28, 31
print(ages)

(30, 28, 31)


In [71]:
ages2= (30,28,31,30,28,31) # 30,28,31,30,28,31
print(ages2)

(30, 28, 31, 30, 28, 31)


In [75]:
# tuple of mixed types - name (string), age (integer), exam scores (list)
student_info = ("Ajay", 30, [90, 95])  # Ajay, 30, [90,95]
print(student_info)

('Ajay', 30, [90, 95])


In [76]:
rollcall = tuple(range(1,4)) # convert range to tuple
print(rollcall)


(1, 2, 3)


In [None]:
chars = tuple('Hello') # convert string to tuple
print(chars)
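
One detail worth keeping in mind when creating tuples: it is the comma, not the brackets, that makes a tuple. The following is a small sketch with illustrative variable names showing that a single value in round brackets is not a tuple unless a trailing comma is added.

```python
not_a_tuple = (5)     # just the integer 5 in brackets
one_item = (5,)       # the trailing comma makes a one-element tuple
also_tuple = 1, 2, 3  # the brackets are optional when commas are present

print(type(not_a_tuple).__name__)  # int
print(type(one_item).__name__)     # tuple
print(also_tuple)                  # (1, 2, 3)
```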

### Changing a tuple

Since tuples are immutable, you can't change the value of a tuple after it is created. You may still assign a new tuple to a variable.

In [78]:
num = (1,2,3)
num1 = (4,5)

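num = num1  # allowed: the name num now refers to a different tuple
print(num)
# num[0] = 7  # not allowed: raises a TypeError because tuples are immutable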

(4, 5)


### Accessing tuple elements

Tuple elements can be accessed using the same syntax as list elements.

### Slicing tuples

Tuples can be sliced the same way as lists.
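
As a brief illustration of indexing and slicing applied to a tuple, the sketch below uses a hypothetical tuple named `ages`. Note that slicing a tuple returns a tuple.

```python
ages = (30, 28, 31, 25)  # hypothetical sample tuple

first_age = ages[0]      # indexing starts at zero, as with lists
last_age = ages[-1]      # negative indices count from the end
middle = ages[1:3]       # elements at indices 1 and 2
every_other = ages[::2]  # every second element

print(first_age)    # 30
print(last_age)     # 25
print(middle)       # (28, 31)
print(every_other)  # (30, 31)
```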

### Tuple operations (immutable sequence operations)

Since tuples are immutable, they only support operations that do not modify the tuple. These include

Searching for an item in a tuple <br/>
`item in tuple`
`item not in tuple`

Combining two tuples <br/>
`tuple1 + tuple2`

Replicating a tuple n times <br/>
`tuple * n`

Slicing a tuple <br/>
`tuple[[start]:[end][:[step]]]`

Length of a tuple <br/>
`len(tuple)`

Smallest item in a tuple <br/>
`min(tuple)`

Largest item in a tuple <br/>
`max(tuple)`

Count the number of occurrences of an item in a tuple <br/>
`tuple.count(item)`

Find the index of the first occurrence of an item in a tuple at or after the start index and before the end index <br/>
`tuple.index(item[,start[,end]])`
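
The operations listed above can be tried on a small example. This is a minimal sketch using hypothetical tuples named `scores` and `bonus`; none of the operations changes either tuple, since each returns a new value.

```python
scores = (90, 95, 90)  # hypothetical sample tuple
bonus = (100,)         # one-element tuple

combined = scores + bonus  # a new tuple holding the items of both
repeated = bonus * 3       # a new tuple with the items repeated 3 times

print(95 in scores)                # True
print(combined)                    # (90, 95, 90, 100)
print(repeated)                    # (100, 100, 100)
print(len(combined))               # 4
print(min(scores), max(scores))    # 90 95
print(scores.count(90))            # 2
print(scores.index(90, 1))         # 2 (first 90 at or after index 1)
print(scores)                      # (90, 95, 90), unchanged
```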

## Strings as a sequence of characters

A string is an immutable sequence of characters. 

In [79]:
# a string is a sequence of characters
message = "Hello, World!"
print(message)

Hello, World!


In [80]:
for char in message:
    print(char.upper())

H
E
L
L
O
,
 
W
O
R
L
D
!


In [81]:
print(message[0])

H


In [82]:
# strings are immutable
message[0] = 'h'
print(message)

TypeError: 'str' object does not support item assignment

In [None]:
# immutable operations that work on sequences also work on strings
print(message.count('o'))
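
A few more of the sequence operations covered earlier, applied to the same message, are sketched below. The variable names other than `message` are illustrative. Since a string cannot be changed in place, the last line builds a new string instead of assigning to `message[0]`.

```python
message = "Hello, World!"

length = len(message)            # number of characters, including punctuation and the space
word = message[7:12]             # slicing returns a new string
backwards = message[::-1]        # a negative step reverses the characters
has_world = 'World' in message   # membership test works on substrings
first_o = message.index('o')     # index of the first 'o'
lowered = 'h' + message[1:]      # build a new string rather than modifying the old one

print(length)     # 13
print(word)       # World
print(backwards)  # !dlroW ,olleH
print(has_world)  # True
print(first_o)    # 4
print(lowered)    # hello, World!
```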

**To get a better understanding of working with strings, you must read Chapter 8 of the Think Python textbook. This is required reading.**

## Dictionaries

A dictionary stores key-value pairs such that each key maps to exactly one value. Keys must be unique, but different keys may share the same value. Values can be of any data type, and keys can be of any immutable (hashable) type such as strings or numbers, unlike a list where indices must be integers.

In [None]:
# create a blank dictionary

course_code_to_name_dict = {} # don't use dict as a variable name; it is the name of a built-in type

In [3]:
# create a dictionary with items

course_code_to_name_dict = {"SCM 421":"Supply chain analytics",
                            "BAN 831":"Programming skills for businsess",
                            "MIS 431":"Business data management"} # don't use dict as a variable name; it is the name of a built-in type

In [4]:
course_code_to_name_dict

{'SCM 421': 'Supply chain analytics',
 'BAN 831': 'Programming skills for businsess',
 'MIS 431': 'Business data management'}

In [5]:
# add new item
course_code_to_name_dict['BAN 830']="Descriptive Analysis"


In [6]:
course_code_to_name_dict

{'SCM 421': 'Supply chain analytics',
 'BAN 831': 'Programming skills for businsess',
 'MIS 431': 'Business data management',
 'BAN 830': 'Descriptive Analysis'}

In [7]:
# retrieve all keys
course_code_to_name_dict.keys()

dict_keys(['SCM 421', 'BAN 831', 'MIS 431', 'BAN 830'])

In [8]:
# retrieve all values
course_code_to_name_dict.values()

dict_values(['Supply chain analytics', 'Programming skills for businsess', 'Business data management', 'Descriptive Analysis'])

In [10]:
# retrieve value using key
course_code_to_name_dict['BAN 831']

'Programming skills for businsess'

In [12]:
# retrieve value using invalid key (keys are case-sensitive, so 'Ban 831' is not 'BAN 831')

course_code_to_name_dict['Ban 831']

KeyError: 'Ban 831'
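
A KeyError like the one above can be avoided by testing for the key with `in` or by using the get() method, which returns a default value when the key is missing. The sketch below uses a shortened, hypothetical dictionary named `course_names` and an illustrative default string; it also shows items() for looping over key-value pairs.

```python
course_names = {"SCM 421": "Supply chain analytics",
                "MIS 431": "Business data management"}  # hypothetical sample data

code = "Ban 831"  # not a key in the dictionary

found = code in course_names                          # membership test checks the keys
safe_name = course_names.get(code, "Unknown course")  # default instead of a KeyError
known_name = course_names.get("SCM 421")              # existing key returns its value

print(found)       # False
print(safe_name)   # Unknown course
print(known_name)  # Supply chain analytics

# items() yields (key, value) pairs in insertion order
for course_code, course_name in course_names.items():
    print(course_code, "->", course_name)
```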