# [CPSC 322]() Data Science Algorithms
[Gonzaga University](https://www.gonzaga.edu/) |
[Sophina Luitel](https://www.gonzaga.edu/school-of-engineering-applied-science/faculty/detail/sophina-luitel-phd-0dba6a9d)

---


# Tuples and Dictionaries

What are our learning objectives for this lesson?
* Work with commonly used built-in Python data structures
    * Tuples
    * Dictionaries

Content used in this lesson is based upon information in the following sources:
* None to report


## Tuples

So far, we have seen two types of sequential collections: strings, made of characters, and lists, made of elements of any type. Lists can be modified, but strings cannot, so strings are immutable and lists are mutable.

A tuple is another sequence that can hold items of any type. Like lists, tuples store multiple values, but they are immutable. Tuples are written as a comma-separated sequence of values, usually enclosed in parentheses.

In [36]:
my_tuple = "x", "y", "z"
print(my_tuple)
print(type(my_tuple))

# need a comma after a single element initialization
my_tuple2 = (1, )
print(my_tuple2)

# without the trailing comma, the parentheses only group the value: this is a str, not a tuple
not_a_tuple = ("a")
print(not_a_tuple)
print(type(not_a_tuple))

# creating an empty tuple
empty_tuple = tuple()
print(empty_tuple)
print(type(empty_tuple))

('x', 'y', 'z')
<class 'tuple'>
(1,)
a
<class 'str'>
()
<class 'tuple'>


Tuple indexing and slicing work the same way as for lists:

In [37]:
my_tuple = ("x", "y", "z")
print(my_tuple[1])
print(my_tuple[0:2])

y
('x', 'y')


However, tuples are immutable, so you cannot modify them. The following code demonstrates the immutability of tuples:

In [39]:
my_tuple = ("x", "y", "z")
# crashes! tuples are immutable, you cannot change them
my_tuple[2] = "a"

TypeError: 'tuple' object does not support item assignment

## Concatenation, Repetition, and Methods

### 1. Concatenation
- Combine two or more tuples to create a new tuple.

### 2. Repetition
- Repeat the elements of a tuple multiple times using `*`.



In [26]:
# concatenation
a = (1, 2)
b = (3, 4)
print(a + b)

# repetition
t = (1, 2)
print(t * 3)  # (1, 2, 1, 2, 1, 2)



(1, 2, 3, 4)
(1, 2, 1, 2, 1, 2)


### Methods

- Tuples are immutable, so they have very few methods.
- You cannot change, add, or remove elements in a tuple. But if a tuple contains a mutable object (like a list), that object can be modified.

| Method        | Description                                           | Example                       | Output |
|---------------|-------------------------------------------------------|-------------------------------|--------|
| `count(value)` | Counts how many times `value` appears in the tuple  | `(1, 2, 2, 3).count(2)`      | 2      |
| `index(value)` | Returns the index of the first occurrence of `value` | `(1, 2, 2, 3).index(2)`      | 1      |

**Note:**
- Since tuples are immutable, methods like `append()`, `remove()`, or `pop()` do not exist.
- Any operation that seems to “change” a tuple (like concatenation or repetition) creates a new tuple instead.
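The point about "changing" a tuple can be checked directly. The following is a minimal sketch (the names `original` and `alias` are made up for illustration) showing that concatenation builds a new tuple and leaves the old one untouched:

```python
# Hypothetical names, used only for illustration.
original = (1, 2)
alias = original              # a second name for the same tuple object

original = original + (3,)    # builds a NEW tuple and rebinds the name `original`

print(original)               # (1, 2, 3)
print(alias)                  # (1, 2)  (the old tuple was never modified)
print(original is alias)      # False   (two different objects)
```

With a list, `alias` would have seen an in-place change made through `append()`. With a tuple there is no in-place change to see.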


In [18]:
# if a tuple contains a mutable object (like a list), that object can still be modified
my_tuple = (23, [1, 2, 3], "Harvey")
my_tuple[1].append(5)
print(my_tuple)


(23, [1, 2, 3, 5], 'Harvey')


### Tuple Use Cases
- Fixed collections that should not change
- Dictionary keys
- Returning multiple values from functions
- Lightweight and memory-efficient compared to lists

In [25]:
# returning multiple values from a function
def squareValue(num):
    '''
    Return num and its square, packed together as a tuple.
    '''
    return num, num * num

val = squareValue(6)
print(val)
print(type(val))


(6, 36)
<class 'tuple'>
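Two of the use cases above can be sketched briefly. The function is repeated so the example is self-contained. The `grid` dictionary and its coordinates are hypothetical and only illustrate a tuple used as a dictionary key:

```python
def squareValue(num):
    """Return num and its square, packed together as a tuple."""
    return num, num * num

# Tuple unpacking: assign each element of the returned tuple to its own name.
number, squared = squareValue(6)
print(number, squared)    # 6 36

# Tuples of immutable values can serve as dictionary keys.
# (Hypothetical example: (row, column) coordinates on a grid.)
grid = {(0, 0): "start", (2, 3): "goal"}
print(grid[(2, 3)])       # goal
```

Unpacking requires the number of names on the left to match the number of elements in the tuple.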


## Key-Value Pairs

Some items act as unique identifiers. For example:  
* Student ID number  
* Library card number  
* Passport number  
* Vehicle registration number  

These items are all **keys** because they uniquely identify something. For instance, at a university, there may be several students named "John Smith." How does the university distinguish between them? Each student is assigned a unique student ID:

| Student ID | Last Name | First Name |
|------------|-----------|------------|
| 10234      | Smith     | Jane       |
| 10456      | Smith     | John       |
| 10789      | Smith     | John       |
| 10901      | Brown     | Alex       |

Here, the **key** is the student ID, and the **value** is the student’s record (name, courses, grades, etc.). Together, they form a **key-value pair**.  

- Keys must be unique.  
- Values can be repeated.  
- A structure that stores keys and their corresponding values is called a **dictionary**.
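As a preview of the dictionary syntax covered next, the student table could be stored as follows. This is only a sketch: storing each record as a `(last name, first name)` tuple is an assumption made for illustration, and the "Alexis" update is invented.

```python
# Student IDs (keys) mapped to (last name, first name) tuples (values).
students = {
    10234: ("Smith", "Jane"),
    10456: ("Smith", "John"),
    10789: ("Smith", "John"),
    10901: ("Brown", "Alex"),
}

# Values can repeat: two different IDs map to equal values.
print(students[10456] == students[10789])   # True

# Keys cannot repeat: assigning to an existing key replaces its value
# instead of creating a second entry. ("Alexis" is a made-up correction.)
students[10901] = ("Brown", "Alexis")
print(len(students))                        # 4
```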


## Dictionaries
A dictionary is like a list, but instead of numeric indices, it uses **keys**. Keys can be integers, strings, or other immutable types, but **not lists**.  

- Dictionaries are declared using curly braces `{ }`.  
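The restriction on keys can be seen by trying a list as a key. In this small sketch, `lookup` and `list_key_rejected` are hypothetical names, and the exact wording of the error message may vary between Python versions:

```python
lookup = {}
lookup["idaho"] = "boise"        # a string key works
lookup[(46, -116)] = "a point"   # a tuple of numbers works too

list_key_rejected = False
try:
    lookup[[46, -116]] = "a point"   # a list is mutable, so it cannot be a key
except TypeError as error:
    list_key_rejected = True
    print("TypeError:", error)

print(list_key_rejected)   # True
```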

## Dictionary Operations
**1. Creating**

In [68]:
# declares an empty dictionary
my_dict = {}
print(my_dict)
# can also use dict()
my_second_dict = dict()
print(my_second_dict)

{}
{}


We can initialize a dictionary with values using comma-separated `key:value` pairs:

In [33]:
state_capitals = {'washington': 'olympia', 'idaho': 'boise', 'oregon': 'salem'}
print(state_capitals)
print(state_capitals.keys())

{'washington': 'olympia', 'idaho': 'boise', 'oregon': 'salem'}
dict_keys(['washington', 'idaho', 'oregon'])


We can create a dictionary from a list of tuples, where each tuple in the list is a key-value pair:

In [38]:
# roman numerals
values = [("I",1), ("V", 5), ("X", 10), ("L", 50)]
roman_numerals = dict(values)
print(roman_numerals)

{'I': 1, 'V': 5, 'X': 10, 'L': 50}


We can also convert a dictionary back to a list of tuples with the dictionary method `items()` and the built-in function `list()`:

In [57]:
list_of_tuples = list(roman_numerals.items())
print(list_of_tuples)

[('I', 1), ('V', 5), ('X', 10), ('L', 50)]
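When the keys and values start out in two separate lists, the built-in `zip()` can pair them up before calling `dict()`. The sketch below also shows a dictionary comprehension as another way to build a dictionary. The names `numerals`, `numbers`, and `squares` are chosen only for illustration.

```python
numerals = ["I", "V", "X", "L"]
numbers = [1, 5, 10, 50]

# zip() pairs the i-th key with the i-th value; dict() consumes the pairs.
roman_numerals = dict(zip(numerals, numbers))
print(roman_numerals)   # {'I': 1, 'V': 5, 'X': 10, 'L': 50}

# A dictionary comprehension computes each key-value pair from a loop.
squares = {n: n * n for n in range(1, 4)}
print(squares)          # {1: 1, 2: 4, 3: 9}
```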


**2. Accessing**
- We can access an item via a key using hard brackets `[ ]` (similar to indexing into a list)
- If the key does not exist, an error occurs unless a safe access method is used.  

In [77]:
state_capitals = {'washington': 'olympia', 'idaho': 'boise', 'oregon': 'salem'}
print(f"The capital of idaho is {state_capitals['idaho']}")


The capital of idaho is boise


In [85]:

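# using get() to safely access a key that may be missing
print(state_capitals.get('california', 'Not found'))  # get() returns None, or the default you specify, if the key is missing

print(state_capitals['california'])  # raises KeyError: square brackets require an existing key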

Not found


KeyError: 'california'
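Besides `get()`, there are two other common ways to handle a key that may be missing: check with `in` first, or catch the `KeyError`. This is a minimal sketch. The variable names `message` and `capital` are chosen for illustration.

```python
state_capitals = {'washington': 'olympia', 'idaho': 'boise'}

# get() without a default returns None for a missing key.
print(state_capitals.get('california'))   # None

# Option 1: test for the key before indexing.
if 'california' in state_capitals:
    message = state_capitals['california']
else:
    message = 'Not found'
print(message)                            # Not found

# Option 2: index directly and handle the KeyError.
try:
    capital = state_capitals['california']
except KeyError:
    capital = 'Not found'
print(capital)                            # Not found
```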

**3. Adding & Updating**

Since dictionaries are *mutable*, we can add key-value pairs to the dictionary using hard brackets `[ ]`:

In [97]:
state_capitals = {'washington': 'olympia', 'idaho': 'boise', 'oregon': 'salem'}
print(state_capitals)
# adds a new key-value pair if the key doesn't exist
state_capitals['montana'] = 'helena'
print(state_capitals)

# updates the value if the key already exists
state_capitals['montana'] = 'HELENA'
print(state_capitals)

# update() can add or update multiple key-value pairs at once
state_capitals.update({'montana': 'helena', 'california': 'sacramento'})
print(state_capitals)

# update() can also merge another dictionary into the existing one
capital = {'nevada': 'carson city', 'arizona': 'phoenix'}
state_capitals.update(capital)
print(state_capitals)


{'washington': 'olympia', 'idaho': 'boise', 'oregon': 'salem'}
{'washington': 'olympia', 'idaho': 'boise', 'oregon': 'salem', 'montana': 'helena'}
{'washington': 'olympia', 'idaho': 'boise', 'oregon': 'salem', 'montana': 'HELENA'}
{'washington': 'olympia', 'idaho': 'boise', 'oregon': 'salem', 'montana': 'helena', 'california': 'sacramento'}
{'washington': 'olympia', 'idaho': 'boise', 'oregon': 'salem', 'montana': 'helena', 'california': 'sacramento', 'nevada': 'carson city', 'arizona': 'phoenix'}


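Note: keys in a dictionary are not sorted. Since Python 3.7, a dictionary keeps its keys in the order they were inserted, which is why the new entries above appear at the end.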
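If sorted order is needed, the built-in `sorted()` can be applied to the keys when iterating. This short sketch assumes Python 3.7 or later for the insertion-order behavior.

```python
state_capitals = {'washington': 'olympia', 'idaho': 'boise', 'montana': 'helena'}

# Iterating over (or listing) a dictionary yields keys in insertion order.
print(list(state_capitals))      # ['washington', 'idaho', 'montana']

# sorted() returns a new, alphabetically sorted list of the keys.
for state in sorted(state_capitals):
    print(state, "->", state_capitals[state])
```

The dictionary itself is not reordered. `sorted()` only affects the order in which the loop visits the keys.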

**4. Deleting**

- Remove a key-value pair by its key.  
- Remove and return a value using a method.  
- Clear all entries from the dictionary.  


In [67]:

students = {10234: ['Jane', 'Smith','CS','Junior'], 10456: ['John','Smith','CS'],10789: ['Alice','Johnson','Math'],11234: ['Jane','Smith','Math','Senior']}

# Delete a key-value pair by key
del students[10789]
print(students)

# Remove and return value with pop()
removed = students.pop(10456)
print("Removed:", removed)
print(students)

# Remove the last inserted key-value pair with popitem()
last = students.popitem()
print("Last removed:", last)
print(students)


# Clear all items
students.clear()
print(students) 



{10234: ['Jane', 'Smith', 'CS', 'Junior'], 10456: ['John', 'Smith', 'CS'], 11234: ['Jane', 'Smith', 'Math', 'Senior']}
Removed: ['John', 'Smith', 'CS']
{10234: ['Jane', 'Smith', 'CS', 'Junior'], 11234: ['Jane', 'Smith', 'Math', 'Senior']}
Last removed: (11234, ['Jane', 'Smith', 'Math', 'Senior'])
{10234: ['Jane', 'Smith', 'CS', 'Junior']}
{}


**5. Membership**
-  We can test if a key is a valid key in the dictionary with the `in` keyword:

In [98]:
state_capitals = {'washington': 'olympia', 'idaho': 'boise', 'oregon': 'salem'}

print('california' in state_capitals)
print('idaho' in state_capitals)
print('olympia' in state_capitals)  # False: `in` checks keys, not values

False
True
False
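The last result is `False` because `in` on a dictionary looks only at keys. To test for a value or a whole key-value pair, the check can be run against `values()` or `items()` instead, as in this small sketch:

```python
state_capitals = {'washington': 'olympia', 'idaho': 'boise'}

print('olympia' in state_capitals)                    # False (not a key)
print('olympia' in state_capitals.values())           # True  (it is a value)
print(('idaho', 'boise') in state_capitals.items())   # True  (it is a key-value pair)
```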


**6. Iterating**
- We can traverse a dictionary using a for loop. Depending on what we need, we can loop over **keys**, **values**, or **key-value pairs**.

In [56]:

students={10234: ['Jane', 'Smith','CS','junior'], 10456: ['John','Smith','CS'], 10789: ['John','Smith']}


for student in students:    # we can use keys to access values
    print(f"{student}:{students[student]}")

for key in students.keys(): # Keys
    print(key)

for value in students.values():   # Values
    print(value)

for key, value in students.items(): # Key-Value pairs
    print(key, ":", value)


10234:['Jane', 'Smith', 'CS', 'junior']
10456:['John', 'Smith', 'CS']
10789:['John', 'Smith']
10234
10456
10789
['Jane', 'Smith', 'CS', 'junior']
['John', 'Smith', 'CS']
['John', 'Smith']
10234 : ['Jane', 'Smith', 'CS', 'junior']
10456 : ['John', 'Smith', 'CS']
10789 : ['John', 'Smith']


## Useful Operations
- `len()` → returns the number of key-value pairs.  
- `.keys()` → returns all keys.  
- `.values()` → returns all values.  
- `.items()` → returns all key-value pairs.  

In [111]:
students={10234: ['Jane', 'Smith','CS','junior'], 10456: ['John','Smith','CS'], 10789: ['John','Smith']}

# returns number of key-value pairs
print(len(students))

#returns all keys
print(students.keys())


#returns all values
print(students.values())

#returns all key-value pairs
print(students.items())

3
dict_keys([10234, 10456, 10789])
dict_values([['Jane', 'Smith', 'CS', 'junior'], ['John', 'Smith', 'CS'], ['John', 'Smith']])
dict_items([(10234, ['Jane', 'Smith', 'CS', 'junior']), (10456, ['John', 'Smith', 'CS']), (10789, ['John', 'Smith'])])
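The `dict_keys`, `dict_values`, and `dict_items` objects printed above are views of the dictionary, not lists. The sketch below (with hypothetical names `keys_view` and `id_list`) shows that a view reflects later changes to the dictionary and that `list()` gives a regular list when indexing is needed.

```python
students = {10234: ['Jane', 'Smith'], 10456: ['John', 'Smith']}

keys_view = students.keys()
print(len(keys_view))        # 2

# Adding an entry afterwards is visible through the existing view.
students[10789] = ['John', 'Smith']
print(len(keys_view))        # 3

# Views do not support indexing; convert to a list first.
id_list = list(keys_view)
print(id_list[0])            # 10234
```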


## Example Problem: Letter Frequencies
Suppose we want to keep track of the frequency of letters in a word. For example, the word "hello" has 4 distinct letters with the following frequencies:
* h: 1
* e: 1
* l: 2
* o: 1

Let's write a program to prompt the user to enter a word. Our program will tell the user the frequency of each letter in the word.

In [146]:
def compute_letter(word):
    '''
    Return a dictionary mapping each letter in word to the number of times it appears.
    '''
    histogram = {}
    for letter in word:
        if letter in histogram:
            histogram[letter] += 1
        else:
            histogram[letter] = 1
    return histogram

userInput = input("Enter a word: ")
countletter = compute_letter(userInput)
print(countletter)

Enter a word:  hello


{'h': 1, 'e': 1, 'l': 2, 'o': 1}
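The same histogram can be written more compactly. The sketch below shows two alternatives, not replacements for the version above: one uses `get()` with a default of 0, and one uses `collections.Counter` from the standard library. It also shows one way to order the letters by frequency. The function name `compute_letter_with_get` is made up for illustration.

```python
from collections import Counter

def compute_letter_with_get(word):
    """Count letters using get() so no if/else is needed."""
    histogram = {}
    for letter in word:
        # get() returns 0 the first time a letter is seen
        histogram[letter] = histogram.get(letter, 0) + 1
    return histogram

histogram = compute_letter_with_get("hello")
print(histogram)                    # {'h': 1, 'e': 1, 'l': 2, 'o': 1}

# Counter builds the same mapping in one call.
print(Counter("hello") == histogram)   # True

# Sort the (letter, count) pairs by count, largest first.
by_frequency = sorted(histogram.items(), key=lambda pair: pair[1], reverse=True)
print(by_frequency)                 # [('l', 2), ('h', 1), ('e', 1), ('o', 1)]
```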


# Practice Question.

Write a function that performs the following steps:

- Reads a text file (`pythonIntro.txt`) and converts all words to lowercase.
- Computes the frequency of each word using a dictionary (Bag of Words).
- Removes the following stop words:
    * `stopword = ['is', 'to', 'this', 'of', 'a', 'the', 'and', 'are', 'so']`
- Sorts the remaining words by frequency in descending order.
- Prints all words along with their frequencies after sorting.

Note: We have now seen lists of tuples, lists of lists, dictionaries of lists, etc. In general, we can have sequences of sequences. The types of sequences that can be nested and the number of nesting levels are up to you, the programmer!
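Reaching into a nested structure means chaining one index or key per level. This is a minimal sketch using data shaped like the examples in this lesson:

```python
# Dictionary of lists: first the key, then the list index.
students = {
    10234: ['Jane', 'Smith', 'CS', 'Junior'],
    10456: ['John', 'Smith', 'CS'],
}
print(students[10234][2])    # CS

# List of tuples: first the list index, then the tuple index.
pairs = [("I", 1), ("V", 5), ("X", 10)]
print(pairs[1][0])           # V

# The inner list is still mutable, even though it lives inside a dictionary.
students[10456].append('Senior')
print(students[10456])       # ['John', 'Smith', 'CS', 'Senior']
```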