# String

Manipulating strings is one of the most common tasks in data science and machine learning. In fact, there's a whole field of machine learning called **Natural Language Processing** that uses strings as input.

Throughout the course, you have seen plenty of `string` variables. In Python, strings are surrounded by quotes, which can be either single quotes (`'`) or double quotes (`"`).

For example, the strings `'Hello World'` and `"Hello World"` are the same.

In [8]:
x = 'Hello World'
y = "Hello World"
print(x == y)

True


Personally, I prefer double quotes (`"`) because they're easier to read, and strings often contain a single quote (`'`), such as an apostrophe. If you wrap such a string in single quotes instead, you need to put a backslash (`\`) before each single quote inside it.

Let's look at an example. The two strings below are the same.

```python
first_string = 'Welcome to \'Python for Data Science course\'!'
second_string = "Welcome to 'Python for Data Science course'!"
print(first_string == second_string)
```

In [9]:
first_string = 'Welcome to \'Python for Data Science course\'!'
second_string = "Welcome to 'Python for Data Science course'!"
print(first_string == second_string)

True


**Exercise**

What would happen if you forgot the backslash (`\`) before each single quote (`'`) in the first string?

```python
print('Welcome to 'Python for Data Science course'!')
```

In [5]:
# [TODO]
print('Welcome to 'Python for Data Science course'!')

SyntaxError: invalid syntax (3157157420.py, line 2)

**Exercise**

If we define a `third_string` as follows, is it the same as the `first_string` and `second_string`?

```python
third_string = 'Welcome to \"Python for Data Science course\"!'
```

In [7]:
third_string = 'Welcome to \"Python for Data Science course\"!'

# [TODO]
print(third_string == second_string)

False


**Exercise**

Fill in the summary table below.

|`example_str`|Output of `print(f"{example_str}")`|
|:-:|:-:|
|`"This course is awesome!"`|**WRITE YOUR ANSWER HERE!**|
|`"How\'re you?"`|**WRITE YOUR ANSWER HERE!**|
|`"Does this /\\ look like a mountain to you?"`|**WRITE YOUR ANSWER HERE!**|

<font size="5">[TODO] 📖</font>

|`example_str`|Output of `print(f"{example_str}")`|
|:-:|:-:|
|`"This course is awesome!"`|`This course is awesome!`|
|`"How\'re you?"`|`How're you?`|
|`"Does this /\\ look like a mountain to you?"`|`Does this /\ look like a mountain to you?`|

`\n` is a newline character. It's used to create a new line.

The following example will demonstrate the difference between using and not using `\n`:

```python
print("Welcome to Python for Data Science course!")
print("I am your instructor")

print("------------------------------------------")

print("Welcome to Python for Data Science course!\n")
print("I am your instructor")
```

In [10]:
print("Welcome to Python for Data Science course!")
print("I am your instructor")

print("------------------------------------------")

print("Welcome to Python for Data Science course!\n")
print("I am your instructor")

Welcome to Python for Data Science course!
I am your instructor
------------------------------------------
Welcome to Python for Data Science course!

I am your instructor
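
The escape sequences seen so far (`\'`, `\\` and `\n`) are each typed as two characters but stored as a single character. The sketch below checks this with `len()`; the variable names are illustrative and not part of the course material.

```python
# Each escape sequence is typed as two characters but stored as one.
apostrophe_str = "How\'re you?"
mountain_str = "/\\"
two_lines = "first\nsecond"

print(apostrophe_str == "How're you?")   # True: \' is just a single quote
print(len(mountain_str))                 # 2: a slash and one backslash
print(len(two_lines))                    # 12: 5 + 1 newline + 6
```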


Python also allows **multiline strings**. You can assign a multiline string to a variable by using triple quotes (either `"""` or `'''`).

For example, below is a multiline string.

```python
multiline_str = """
I am
learning
Python for Data Science
with Leo
a Data Scientist
"""

print(multiline_str)
```

In [17]:
multiline_str = """
I am
learning
Python for Data Science
with Leo
a Data Scientist
"""

print(multiline_str)


I am
learning
Python for Data Science
with Leo
a Data Scientist



Let's define another variable called `another_str`.

```python
another_str = "I am\nlearning\nPython for Data Science\nwith Leo\na Data Scientist"

print(another_str)
```

In [19]:
another_str = "I am\nlearning\nPython for Data Science\nwith Leo\na Data Scientist"

print(another_str)

I am
learning
Python for Data Science
with Leo
a Data Scientist


**Exercise**

Is `another_str` the same as `multiline_str`? In other words, what would be the result of the following code?

```python
another_str == multiline_str
```

In [22]:
# [TODO]
print(another_str == multiline_str)
print(f"multiline_str contains {len(multiline_str)} characters.")
print(f"another_str contains {len(another_str)} characters.")

False
multiline_str contains 65 characters.
another_str contains 63 characters.
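
The two extra characters are the newline right after the opening `"""` and the one right before the closing `"""`. As a minimal sketch, `repr()` makes them visible, and `.strip()` (a string method not covered in this lesson, which returns a copy without leading and trailing whitespace) removes them:

```python
multiline_str = """
I am
learning
Python for Data Science
with Leo
a Data Scientist
"""
another_str = "I am\nlearning\nPython for Data Science\nwith Leo\na Data Scientist"

# repr() shows escape sequences instead of rendering them.
print(repr(multiline_str[:5]))    # starts with the leading newline
print(repr(multiline_str[-3:]))   # ends with the trailing newline

# After stripping the surrounding newlines, the two strings match.
print(multiline_str.strip() == another_str)   # True
```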


A **string** can be thought of as a sequence of characters, much like a list. Let's see whether what we know about **lists** also applies to **strings**.

The following exercises will all use this `example_str` variable.

```python
example_str = "I am old enough to drive a car!" 
```

🏎️🏎️🏎️🏎️

In [25]:
example_str = "I am old enough to drive a car!"

**Exercise**

What's the length of `example_str`?

In [26]:
# [TODO]
print(len(example_str))

31


**Exercise**

What is the index of the first `o` character in `example_str`?

In [28]:
# [TODO]
print(example_str.index("o"))

5


**Exercise**

- Can you print the characters from index `2` to index `10`?
- **BONUS**: Can you print the last 4 characters of `example_str`?

In [34]:
# [TODO]
print(f"Characters from index 2 to 10: '{example_str[2:11]}'")
print(f"Last 4 characters: '{example_str[-4:]}'")

Characters from index 2 to 10: 'am old en'
Last 4 characters: 'car!'
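
Slicing a string follows the same `[start:end:step]` rules as slicing a list. The sketch below shows a few more variations on the same `example_str`; it is only an illustration, not part of the exercise.

```python
example_str = "I am old enough to drive a car!"

print(example_str[5:8])    # old  (indices 5, 6 and 7; the end index is excluded)
print(example_str[-1])     # !    (negative indices count from the end)
print(example_str[:4])     # I am (an omitted start defaults to 0)
print(example_str[::-1])   # a step of -1 walks the string backwards
```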


**Exercise**

Try looping through the first 5 characters of `example_str` and print each character.

In [36]:
# [TODO]
for char in example_str[:5]:
    print(char)

I
 
a
m
 


**Exercise**

Try modifying the 1st character of `example_str` to `"X"`.

In [37]:
# [TODO]
example_str[0] = "X"

TypeError: 'str' object does not support item assignment

**Exercise**

Can you `append()` a character to `example_str`? **YES** or **NO**?

In [38]:
# [TODO]
example_str.append("ABC")

AttributeError: 'str' object has no attribute 'append'

From the two exercises above, you can see that we **CANNOT** modify a string in place. In other words, **strings are immutable**.

What if we still want to modify the value of our string variable `example_str`? 🤔

In this case, we can simply reassign our string variable `example_str` to the desired value.

```python
example_str = "I am old enough to drive a car!"
print(example_str)

example_str = "We are old enough to drive a car!"
print(example_str)
```

In [1]:
example_str = "I am old enough to drive a car!"
print(example_str)

example_str = "We are old enough to drive a car!"
print(example_str)

I am old enough to drive a car!
We are old enough to drive a car!
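
Reassignment does not have to start from scratch. As a minimal sketch (the name `modified_str` is made up for this example), a new string can be built from pieces of the old one using slicing and `+`, which leaves the original untouched:

```python
example_str = "I am old enough to drive a car!"

# Build a NEW string from a replacement character plus a slice of the old one.
modified_str = "X" + example_str[1:]

print(modified_str)   # X am old enough to drive a car!
print(example_str)    # I am old enough to drive a car!
```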


**String methods**

In `lesson 2`, you learned to:
- concatenate multiple strings together using the `+` operator.
- duplicate a string using the `*` operator.

In this lesson, you'll learn about **other string operations**, most of which are string methods.
- Capitalise the first letter of a string using `.capitalize()`.
- Check if a string starts with a given prefix using `.startswith()`.
- Check if a string ends with a given suffix using `.endswith()`.
- Check if a string contains a given substring using the `in` operator (note that `in` is an operator, not a method).
- Check if a string consists only of numeric characters using `.isnumeric()` (a rough test for whether it can be converted to an `int`).
- Convert a string to **UPPERCASE** using `.upper()`.
- Convert a string to **LOWERCASE** using `.lower()`.
- Replace every occurrence of a substring with another string using `.replace()`.

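Let's perform all of the above-mentioned transformations on our string variable `example_str`. Note that the executed cell below first reassigns `example_str` to an all-lowercase version, so that the effect of methods such as `.capitalize()` is visible.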

```python
print(f"Original:                   {example_str}")
print("-------------------------------------------------------------")
print(f"Capitalise 1st letter:      {example_str.capitalize()}")
print("-------------------------------------------------------------")
print(f"Check if starts with 'I':   {example_str.startswith('I')}")
print("-------------------------------------------------------------")
print(f"Check if ends with '!':     {example_str.endswith('!')}")
print("-------------------------------------------------------------")
print(f"Check if contains 'old':    {'old' in example_str}") 
print("-------------------------------------------------------------")
print(f"Check if is numeric:        {example_str.isnumeric()}")
print("-------------------------------------------------------------")
print(f"Convert to UPPERCASE:       {example_str.upper()}")
print("-------------------------------------------------------------")
print(f"Convert to LOWERCASE:       {example_str.lower()}")
print("-------------------------------------------------------------")
print(f"Replace 'o' with '0':       {example_str.replace('o', '0')}")
```

In [16]:
example_str = "i am old enough to drive a car!"

print(f"Original:                   {example_str}")
print("-------------------------------------------------------------")
print(f"Capitalise 1st letter:      {example_str.capitalize()}")
print("-------------------------------------------------------------")
print(f"Check if starts with 'I':   {example_str.startswith('I')}")
print("-------------------------------------------------------------")
print(f"Check if ends with '!':     {example_str.endswith('!')}")
print("-------------------------------------------------------------")
print(f"Check if contains 'old':    {'old' in example_str}") 
print("-------------------------------------------------------------")
print(f"Check if is numeric:        {example_str.isnumeric()}")
print("-------------------------------------------------------------")
print(f"Convert to UPPERCASE:       {example_str.upper()}")
print("-------------------------------------------------------------")
print(f"Convert to LOWERCASE:       {example_str.lower()}")
print("-------------------------------------------------------------")
print(f"Replace 'o' with '0':       {example_str.replace('o', '0')}")

Original:                   i am old enough to drive a car!
-------------------------------------------------------------
Capitalise 1st letter:      I am old enough to drive a car!
-------------------------------------------------------------
Check if starts with 'I':   False
-------------------------------------------------------------
Check if ends with '!':     True
-------------------------------------------------------------
Check if contains 'old':    True
-------------------------------------------------------------
Check if is numeric:        False
-------------------------------------------------------------
Convert to UPPERCASE:       I AM OLD ENOUGH TO DRIVE A CAR!
-------------------------------------------------------------
Convert to LOWERCASE:       i am old enough to drive a car!
-------------------------------------------------------------
Replace 'o' with '0':       i am 0ld en0ugh t0 drive a car!
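
`.isnumeric()` only checks that every character is a numeric character, so it rejects a minus sign even though `int()` accepts it. The sketch below compares the two; `can_be_int` is a hypothetical helper written for this example, and it relies on `try`/`except`, which this lesson does not cover.

```python
def can_be_int(text: str) -> bool:
    """Hypothetical helper: True if int(text) would succeed."""
    try:
        int(text)
        return True
    except ValueError:
        return False

for candidate in ["21", "-21", "2.5", "twenty"]:
    print(f"{candidate!r}: isnumeric={candidate.isnumeric()}, can_be_int={can_be_int(candidate)}")
```

Only `"-21"` gets different answers from the two checks: `.isnumeric()` says `False`, while `int("-21")` succeeds.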


**Exercise**

What's the output of the following code?

```python
print(f"Check if contains 'Old': {'Old' in example_str}")
```

**Hint**: Python is **CASE-SENSITIVE**.

In [11]:
# [TODO]
print(f"Check if contains 'Old': {'Old' in example_str}")

Check if contains 'Old': False


Thus, to match every capitalisation of the word `Old`, it is common practice to convert both strings to the same case with the `upper()` or `lower()` method before comparing them.

Our code should be:
```python
print(f"Check if contains 'Old': {'Old'.lower() in example_str.lower()}")
```

In [12]:
print(f"Check if contains 'Old': {'Old'.lower() in example_str.lower()}")

Check if contains 'Old': True


**Exercise**

**HARD** 🤯

Write a function that takes in 2 strings and returns the number of times the second string appears in the first string.
- The function should be **Case-Insensitive**.
- If the second string is not in the first string, return 0.
- If the second string is empty, return the length of the first string.

In [1]:
# [TODO]
def count_occurences(fir_str: str, sec_str: str) -> int:
    sec_str = sec_str.lower()
    fir_str = fir_str.lower()
    if (fir_str.find(sec_str) == -1):
        return 0
    elif (len(sec_str) == 0):
        return len(fir_str)
    else:
        return len(fir_str.split(sec_str)) - 1

fir_str = "I am not old enough to watch restricted content this year 😿!"
sec_str = "i"

print(f"Count occurences of '{sec_str}': {count_occurences(fir_str, sec_str)}")

Count occurences of 'i': 3


In [61]:
# [TODO]
def count_occurences(fir_str, sec_str):
    fir_str = fir_str.lower()
    sec_str = sec_str.lower()

    if sec_str not in fir_str:
        return 0
    elif len(sec_str) == 0:
        return len(fir_str)
    else:
        # split the first string using the second string
        # this will remove all occurences of the second string 
        # from the first string
        split_str = fir_str.split(sec_str)

        # rejoin the split strings
        join_str = "".join(split_str)

        # the difference in length of the first string and the joined string
        # divided by the length of the second string
        # will be what we need
        return (len(fir_str) - len(join_str)) // len(sec_str)

fir_str = "I am not old enough to watch restricted content this year 😿!"
sec_str = "i"

print(f"Count occurences of '{sec_str}': {count_occurences(fir_str, sec_str)}")

Count occurences of 'i': 3


In [62]:
# [TODO]
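def count_occurences(fir_str, sec_str):
    # str.count("") returns len(fir_str) + 1, so handle the empty case explicitly
    if len(sec_str) == 0:
        return len(fir_str)
    return fir_str.lower().count(sec_str.lower())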

fir_str = "I am not old enough to watch restricted content this year 😿!"
sec_str = "i"

print(f"Count occurences of '{sec_str}': {count_occurences(fir_str, sec_str)}")

Count occurences of 'i': 3
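
Two details of counting are easy to miss, and the short sketch below illustrates them with made-up inputs. Occurrences are counted without overlap, and an empty search string is a special case for `count()`, so a solution built on `count()` needs to handle it separately to satisfy the exercise's rule for an empty second string.

```python
# Occurrences are counted without overlap: "aa" fits twice in "aaaa", not three times.
print("aaaa".count("aa"))            # 2
print(len("aaaa".split("aa")) - 1)   # 2

# An empty search string matches at every position, including the end.
print("abc".count(""))               # 4, i.e. len("abc") + 1
```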


It's very common in **Natural Language Processing** to break a large sentence into component words. 

We'll do that using the `split()` method.

Let's first use the `help()` function to view the documentation for the `split()` method.

```python
help(str.split)
```

In [21]:
help(str.split)

Help on method_descriptor:

split(self, /, sep=None, maxsplit=-1)
    Return a list of the words in the string, using sep as the delimiter string.
    
    sep
      The delimiter according which to split the string.
      None (the default value) means split according to any whitespace,
      and discard empty strings from the result.
    maxsplit
      Maximum number of splits to do.
      -1 (the default value) means no limit.



From the above output, we can see that `split()` takes two optional arguments: `sep`, the character or string to split on, and `maxsplit`, the maximum number of splits to perform. The result of running the method is a list of smaller strings.

We know that our `example_str` is a sentence having words separated by a space ` ` character. Let's split `example_str` based on the space character.

```python
example_str.split(" ")
```

In [23]:
example_str = "I am old enough to drive a car!"

example_str.split(" ")

['I', 'am', 'old', 'enough', 'to', 'drive', 'a', 'car!']
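
The documentation above mentions that leaving `sep` as `None` splits on any whitespace and discards empty strings. The sketch below, using made-up inputs, shows how that differs from `split(" ")` and what `maxsplit` does.

```python
spaced_str = "I  am   old"   # hypothetical input with repeated spaces

print(spaced_str.split(" "))   # ['I', '', 'am', '', '', 'old']
print(spaced_str.split())      # ['I', 'am', 'old']

# maxsplit limits the number of splits, counted from the left.
print("a,b,c".split(",", maxsplit=1))   # ['a', 'b,c']
```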

**Exercise**

Without running the code below, can you tell me what the output of it is?

```python
print(len("2022-07-25".split("-")))
```

In [24]:
# [TODO]
print(len("2022-07-25".split("-")))

3


The opposite of `split()` is the `join()` method, which can be used to concatenate multiple strings together using a certain character or string as a separator.

Let's look at the documentation for the `join()` method.

```python
help(str.join)
```

In [27]:
help(str.join)

Help on method_descriptor:

join(self, iterable, /)
    Concatenate any number of strings.
    
    The string whose method is called is inserted in between each given string.
    The result is returned as a new string.
    
    Example: '.'.join(['ab', 'pq', 'rs']) -> 'ab.pq.rs'



In the example above, `'.'.join(["ab", "pq", "rs"])` returns `"ab.pq.rs"`. The 3 strings are joined together using the `'.'` character as the separator.
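
As a minimal sketch of how `split()` and `join()` mirror each other, the example below splits a date string and puts it back together. It also shows that `join()` only accepts strings, so other types need converting first; the `numbers` list is made up for illustration.

```python
date_parts = "2022-07-25".split("-")
print(date_parts)              # ['2022', '07', '25']
print("-".join(date_parts))    # 2022-07-25 (split followed by join restores the string)

# join() only accepts strings, so convert other types first.
numbers = [1, 2, 3]
print(", ".join(str(n) for n in numbers))   # 1, 2, 3
```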

**Exercise**

Let's reassemble the following words into a sentence!

```python
words = ["I", "love", "Python.", "It's my", "favourite", "language."]
```

In [26]:
# [TODO]
words = ["I", "love", "Python.", "It's my", "favourite", "language."]
print(" ".join(words))

I love Python. It's my favourite language.


**Exercise**

You know that strings can be concatenated using the `+` operator. What if we want to concatenate a `str` and an `int`?

Will the following code run without error?

```python
age = 21
print("I am " + age)
```

In [28]:
# [TODO]
age = 21
print("I am " + age)

TypeError: can only concatenate str (not "int") to str

There are multiple ways to solve this problem. 
- We can convert the `int` to a `str` using the `str()` function.
- We can also use the `format()` method.
- OR we can use the `f-string` syntax, as you have seen on countless occasions.

I prefer using `f-string` syntax since it is **more readable** and **easy to understand**.

In [31]:
age = 21
print("Method 1: We can convert the `int` to a `str` using the `str()` function.") 
print("I am " + str(age))
print("\n")

print("Method 2: We can also use the `format()` method.")
print("I am {}".format(age))
print("\n")

print("Method 3: We can also use the `f-string` syntax.")
print(f"I am {age}")
print("\n")

Method 1: We can convert the `int` to a `str` using the `str()` function.
I am 21


Method 2: We can also use the `format()` method.
I am 21


Method 3: We can also use the `f-string` syntax.
I am 21




Nonetheless, it's worth noting that the `format()` method is still a useful tool for formatting strings.
The `format()` method accepts any number of arguments, and you can use **index numbers** inside the curly braces to specify which argument goes where.

```python
age = 21
characteristic = "rich"
assets = "BTCs and ETHs"

print("I am {1} years old. I am super {0} and I have lots of {2}".format(characteristic, age, assets))
```

In [32]:
age = 21
characteristic = "rich"
assets = "BTCs and ETHs"

print("I am {1} years old. I am super {0} and I have lots of {2}".format(characteristic, age, assets))

I am 21 years old. I am super rich and I have lots of BTCs and ETHs
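
Besides index numbers, `format()` also accepts keyword arguments, and both `format()` and f-strings support a format specification after a colon. The sketch below is only an illustration; `price` and the placeholder name `trait` are made up for this example.

```python
age = 21
characteristic = "rich"

# Keyword arguments let you refer to values by name instead of by position.
print("I am {age} years old and super {trait}.".format(age=age, trait=characteristic))

# A format specification after the colon controls how a value is rendered.
price = 1234.5678   # hypothetical value for illustration
print("{:.2f}".format(price))   # 1234.57
print(f"{price:,.2f}")          # 1,234.57
```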


# Dictionary

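A **Python dictionary** is a collection of items, where each item is a `key-value` pair. Items are accessed by key rather than by position (since Python 3.7, dictionaries also remember the order in which items were inserted).
- A dictionary is denoted by the curly braces `{}`. 
- Items inside a dictionary are separated by a comma `,`.
- Keys and values are separated by a colon `:`.
- While values can be of any data type, **keys must be of an immutable** data type (strictly speaking, a hashable one, such as `str` or `int`) and must be **unique**.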

Let's look at a few examples of dictionaries.
1. Empty dictionary:

    ```python
    empty_dictionary = {}
    ```

1. Dictionary with string keys and mixed type values:

    ```python
    employee_details = {
        "name": "Leo",
        "age": 21,
        "is_data_scientist": True,
        "is_programmer": False,
    }
    ```
    
1. Dictionary with mixed type keys:

    ```python
    random_dictionary = {
        1: "a",
        "b": 2,
    }
    ```
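
The rule that keys must be immutable and unique can be checked directly. The sketch below uses made-up dictionaries and relies on `try`/`except`, which is outside the scope of this lesson, to catch the error raised when a list is used as a key.

```python
# Immutable values such as strings and numbers work as keys.
valid_dictionary = {"name": "Leo", 21: "age"}
print(valid_dictionary[21])   # age

# A list is mutable, so Python refuses to use it as a key.
try:
    invalid_dictionary = {["a", "b"]: 1}
    list_key_allowed = True
except TypeError as error:
    list_key_allowed = False
    print(f"TypeError: {error}")

# Repeating a key keeps only the last value assigned to it.
duplicate_demo = {"a": 1, "a": 2}
print(duplicate_demo)   # {'a': 2}
```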

**Exercise**

Define an empty dictionary and print it to the console! What is the output of your `print()`?

```python
empty_dictionary = {}
print(empty_dictionary)
```

In [6]:
# [TODO]
empty_dictionary = {}
print(empty_dictionary)

{}


Let's define a dictionary named `movie_info` and see how we can access its keys and values.

```python
movie_info = {
    "title": "Em Va Trinh",
    "year": 2021,
    "cast": [
        "Avin Lu",
        "Hong Ha",
        "Lan Thy",
    ],
    "ost": "Ballad to the dead",
}
```

In [58]:
movie_info = {
    "title": "Em Va Trinh",
    "year": 2021,
    "cast": [
        "Avin Lu",
        "Hong Ha",
        "Lan Thy",
    ],
    "ost": "Ballad to the dead",
}

We can access **the value** of the dictionary by wrapping a pair of square brackets `[]` around the **corresponding key**.

For example, if we want to retrieve the value of the `"title"` key, we can use the following code:

```python
print(movie_info["title"])
```

In [59]:
print(movie_info["title"])

Em Va Trinh


**Exercise**

Who is in the cast of the movie `Em Va Trinh`? Can you print each cast member on a separate line in the console?

**Hint**: What's the data type of `movie_info["cast"]`?

In [60]:
# [TODO]
print(type(movie_info["cast"]))

# since movie_info["cast"] is a list, we can easily loop through it
for actor in movie_info["cast"]:
    print(actor)

<class 'list'>
Avin Lu
Hong Ha
Lan Thy


**Exercise**

We know that `Tran Luc` plays the older version of `Trinh Cong Son`. How do we add `Tran Luc` to the `cast` list?

**Hint**: How do you add an element to a list?

In [61]:
# [TODO]
print(f"Before: {movie_info['cast']}")
movie_info["cast"].append("Tran Luc")
print(f"After: {movie_info['cast']}")

Before: ['Avin Lu', 'Hong Ha', 'Lan Thy']
After: ['Avin Lu', 'Hong Ha', 'Lan Thy', 'Tran Luc']


What if we want to add a new `key-value` pair to our existing dictionary?

Let's see how to add `director: "Phan Gia Nhat Linh"` to the `movie_info` dictionary.

```python
movie_info["director"] = "Phan Gia Nhat Linh"
```

In [62]:
print("Before: ")
print(movie_info)

movie_info["director"] = "Phan Gia Nhat Linh"

print("-------------------------------------------------------------")

print("After: ")
print(movie_info)

Before: 
{'title': 'Em Va Trinh', 'year': 2021, 'cast': ['Avin Lu', 'Hong Ha', 'Lan Thy', 'Tran Luc'], 'ost': 'Ballad to the dead'}
-------------------------------------------------------------
After: 
{'title': 'Em Va Trinh', 'year': 2021, 'cast': ['Avin Lu', 'Hong Ha', 'Lan Thy', 'Tran Luc'], 'ost': 'Ballad to the dead', 'director': 'Phan Gia Nhat Linh'}


Another way to add a new `key-value` pair to an existing dictionary is to use the `update()` method.

```python
movie_info.update({
    "director": "Phan Gia Nhat Linh"
})
```

**Exercise**

We know that the movie `Em Va Trinh` was released in `2022`. Nonetheless, the value of key `year` is `2021`. How do we update the value of key `year` to the correct value of `2022`?
- Access the value of the key `year` and print it to the screen.
- Assign the value of `2022` to the key `year`.
- Print the new value of the key `year` to the screen.

In [63]:
# [TODO]
print(f"Before: {movie_info['year']}")
movie_info["year"] = 2022
print(f"After: {movie_info['year']}")

Before: 2021
After: 2022


The `.update()` method can also be used to update the value of a key.

```python
movie_info.update({"year": 2022})
```

If we want to remove a `key-value` pair from a dictionary, we can use the following methods:
- `pop()` method: provide a `key`, and that `key-value` pair will be removed from the dictionary; `pop()` also returns the removed `value`.
- `del` keyword: `del movie_info["year"]` will remove the `year` `key-value` pair from the dictionary.

**Exercise**

Remove the `ost` `key-value` pair from the `movie_info` dictionary.

In [64]:
# [TODO]
print("Before: ")
print(movie_info)

movie_info.pop("ost")
print("-------------------------------------------------------------")

print("After: ")
print(movie_info)

Before: 
{'title': 'Em Va Trinh', 'year': 2022, 'cast': ['Avin Lu', 'Hong Ha', 'Lan Thy', 'Tran Luc'], 'ost': 'Ballad to the dead', 'director': 'Phan Gia Nhat Linh'}
-------------------------------------------------------------
After: 
{'title': 'Em Va Trinh', 'year': 2022, 'cast': ['Avin Lu', 'Hong Ha', 'Lan Thy', 'Tran Luc'], 'director': 'Phan Gia Nhat Linh'}
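
The two removal options behave slightly differently. As a minimal sketch, the example below uses a small stand-in dictionary (`removal_demo`, a name made up here so that `movie_info` is left untouched) to show that `pop()` returns the removed value, that `del` returns nothing, and that `pop()` accepts an optional default for missing keys.

```python
removal_demo = {"title": "Em Va Trinh", "year": 2022, "ost": "Ballad to the dead"}

# pop() removes the pair AND returns the value that was stored.
removed_value = removal_demo.pop("ost")
print(removed_value)   # Ballad to the dead

# del removes the pair without returning anything.
del removal_demo["year"]
print(removal_demo)    # {'title': 'Em Va Trinh'}

# pop() accepts an optional default, returned when the key is missing.
fallback = removal_demo.pop("producer", "no such key")
print(fallback)        # no such key
```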


What if we want to access a non-existent key in the dictionary?

```python
movie_info["producer"]
```

In [29]:
movie_info["producer"]

KeyError: 'producer'

In order to access all the keys in the dictionary, we can use the `keys()` method.

```python
movie_info.keys()
```

In [32]:
movie_info.keys()

dict_keys(['title', 'year', 'cast', 'director'])

To avoid a `KeyError`, it's recommended to check whether the `key` exists before accessing it. 

Similar to checking if an element exists in a list, you can do this by using the `in` operator. (Writing `"producer" in movie_info` without `.keys()` gives the same result.)

```python
print("producer" in movie_info.keys())
```

In [50]:
print(f"Is 'producer' a key of movie_info? {'producer' in movie_info.keys()}")
print("-------------------------------------------------------------")
print(f"Is 'director' a key of movie_info? {'director' in movie_info.keys()}")

Is 'producer' a key of movie_info? False
-------------------------------------------------------------
Is 'director' a key of movie_info? True


Another way to check if a `key` exists in a dictionary is to use the `get()` method.

```python
movie_info.get("producer")
```

If the `key` doesn't exist, the `get()` method returns `None` instead of raising a `KeyError`. Otherwise, it returns the corresponding `value` of the `key`.

In [54]:
print(movie_info.get("director"))
print(movie_info.get("producer"))

Phan Gia Nhat Linh
None
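
`get()` also takes an optional second argument, which is returned in place of `None` when the key is missing. The sketch below uses a shortened stand-in dictionary (`lookup_demo`, a name made up for this example) rather than the full `movie_info`.

```python
lookup_demo = {"title": "Em Va Trinh", "director": "Phan Gia Nhat Linh"}

# `in` can be applied to the dictionary directly; it checks the keys.
print("producer" in lookup_demo)   # False

# The second argument of get() is returned when the key is missing.
print(lookup_demo.get("producer", "unknown"))   # unknown
print(lookup_demo.get("director", "unknown"))   # Phan Gia Nhat Linh
```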


In order to access all the values in the dictionary, we can use the `values()` method.

```python
movie_info.values()
```

In [56]:
movie_info.values()

dict_values(['Em Va Trinh', 2022, ['Avin Lu', 'Hong Ha', 'Lan Thy', 'Tran Luc'], 'Phan Gia Nhat Linh'])

**Exercise**

Given the following dictionary, find the key with the largest value. If there are multiple keys with the same largest value, print all of them.

```python
final_scores = {
    "Alice": 90,
    "Bob": 80,
    "Homelander": 50,
    "Jayden": 20,
    "Leo": 10,
    "Starlight": 50,
    "Tony": 90,
}
```

**Hint**: `max()` function can be used to find the largest value in a list.

In [65]:
# [TODO]
final_scores = {
    "Alice": 90,
    "Bob": 80,
    "Homelander": 50,
    "Jayden": 20,
    "Leo": 10,
    "Starlight": 50,
    "Tony": 90,
}

max_score = max(final_scores.values())
for name in final_scores.keys():
    if final_scores[name] == max_score:
        print(name)

Alice
Tony


**Exercise**

Can you solve the above exercise using list comprehension?

In [67]:
# [TODO]
# reuse max_score from the previous cell so max() isn't recomputed for every name
[name for name in final_scores.keys() if final_scores[name] == max_score]

['Alice', 'Tony']
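
For comparison, `max()` can also be applied to the dictionary itself with its `key` parameter. As the sketch below shows, this returns only one key even when several share the largest value, so it does not replace the loop or the list comprehension when ties matter.

```python
final_scores = {
    "Alice": 90, "Bob": 80, "Homelander": 50, "Jayden": 20,
    "Leo": 10, "Starlight": 50, "Tony": 90,
}

# key=final_scores.get tells max() to compare the keys by their values.
top_name = max(final_scores, key=final_scores.get)
print(top_name)   # Alice (only the first of the tied keys is returned)
```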

In order to access all the items in the dictionary, we can use the `items()` method.

```python
movie_info.items()
```

In [57]:
movie_info.items()

dict_items([('title', 'Em Va Trinh'), ('year', 2022), ('cast', ['Avin Lu', 'Hong Ha', 'Lan Thy', 'Tran Luc']), ('director', 'Phan Gia Nhat Linh')])

As you can see, the `items()` method returns a `dict_items` object that behaves like a list of tuples. Each tuple contains a key-value pair.

We won't be learning about the `tuple` type in this course, but you can just remember that tuples are **immutable** and are wrapped in parentheses `()`.
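
A brief sketch of what those tuples look like in practice, using a shortened stand-in dictionary (`pair_demo`, a name made up for this example): a tuple can be unpacked into separate variables, and trying to change one of its items fails just as it does for a string.

```python
pair_demo = {"title": "Em Va Trinh", "year": 2022}

pairs = list(pair_demo.items())
print(pairs)         # [('title', 'Em Va Trinh'), ('year', 2022)]

# A tuple can be unpacked into separate variables in one assignment.
key, value = pairs[0]
print(key, value)    # title Em Va Trinh

# Tuples are immutable, so item assignment raises a TypeError.
try:
    pairs[0][0] = "name"
    tuple_changed = True
except TypeError as error:
    tuple_changed = False
    print(f"TypeError: {error}")
```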

**Exercise**

Write a function that takes in a dictionary and returns a list of all the keys in the dictionary.

In [35]:
# [TODO]
def get_keys_from_dict(dictionary):
    # the function returns the keys rather than printing them
    return list(dictionary.keys())

get_keys_from_dict(movie_info)

['title', 'year', 'cast', 'director']

Let's loop over the `items` in the dictionary and print each `key-value` pair on a separate line to the screen.

```python
for k, v in movie_info.items():
    print(f"{k}: {v}")
```

In [37]:
for k, v in movie_info.items():
    print(f"{k}: {v}")

title: Em Va Trinh
year: 2022
cast: ['Avin Lu', 'Hong Ha', 'Lan Thy', 'Tran Luc']
director: Phan Gia Nhat Linh


In addition to **list comprehension**, Python also has **dictionary comprehension**.

Let's use **dictionary comprehension** to create a dictionary named `initials` having:
- keys as the cast members of the movie `Em Va Trinh`
- values as the first letter of each cast member's name.

```python
initials = {
    k: k[0] for k in movie_info["cast"] if len(k) > 0
}

print(initials)
```

In [44]:
initials = {
    k: k[0] for k in movie_info["cast"]
}

print(initials)

{'Avin Lu': 'A', 'Hong Ha': 'H', 'Lan Thy': 'L', 'Tran Luc': 'T'}
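
A dictionary comprehension is a compact way of writing a loop that fills a dictionary. The sketch below builds the same result both ways; the names `cast` and `initials_loop` are introduced here only for the comparison.

```python
cast = ["Avin Lu", "Hong Ha", "Lan Thy", "Tran Luc"]

# Dictionary comprehension.
initials = {name: name[0] for name in cast}

# The same result built step by step with a for loop.
initials_loop = {}
for name in cast:
    initials_loop[name] = name[0]

print(initials == initials_loop)   # True
```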


**Exercise**

Use **dictionary comprehension** to create the following dictionary:

```python
cubes = {1: 1, 2: 8, 3: 27, 4: 64, 5: 125}
```

**Hint**: `value = key ** 3`

In [40]:
# [TODO]
cubes = {k: k**3 for k in range(1, 6)}
print(cubes)

{1: 1, 2: 8, 3: 27, 4: 64, 5: 125}


**Exercise**

Given the list `numbers` below:

```python
numbers = list(range(1, 10))
```

Create a dictionary named `even_numbers_doubled` that has:
- keys as the even numbers in `numbers`
- values as the doubled value of the even numbers.

```python
even_numbers_doubled == {2: 4, 4: 8, 6: 12, 8: 16}
```

In [46]:
# [TODO]
numbers = list(range(1, 10))
even_numbers_doubled = {
    k: k*2 for k in numbers if k % 2 == 0
}
print(even_numbers_doubled)

{2: 4, 4: 8, 6: 12, 8: 16}
