# Debugging Python code


Go through the exercises below.

- Each exercise contains some code with **one or more mistakes**.
- A mistake may raise an error, or it may silently produce the wrong result.
- There might be multiple ways to fix the mistakes.
- Improving code readability is also encouraged.

In [None]:
# data creation
beatles = ["John Lennon", "Paul McCartney", "George Harrison", "Ringo Starr"]

numbers = [1, 2, 3, 4, 5]

capitals = {"Germany": "Berlin",
            "Russia": "Moscow",
            "France": "Paris",
            "China": "Beijing",
            "Egypt": "Cairo",
            "Brazil": "Sao Paulo"
            }

top_profitable_films = {
    "Film": ["Avengers: Endgame", "Avatar", "Titanic", "Star Wars: The Force Awakens", "Jurassic World",
             "The Lion King", "The Avengers", "Frozen II", "Frozen", "Beauty and the Beast"],
    "Year": ["2019", "2007", "1997", "2015", "2015", "2019", "2012", "2019", "2013", "2017"],
    "Worldwide Gross (in billions)": ["2.798", "2.789", "2.194", "2.073", "1.673", "1.656", "1.519",
                                      "1.450", "1.276", "1.263"]
    }

**Logical error**: the capital of Brazil is actually Brasilia, not Sao Paulo!

In [None]:
capitals["Brazil"] = "Brasilia"

## Exercise 1:

In [None]:
for c in Capitals.keys():
  print(f"{c} is the capital of {Capitals[c]}.")

**Solution:**
1. The `NameError: name 'Capitals' is not defined` occurs because Python names are case-sensitive: the variable we defined is `capitals`, in lowercase. `Capitals` does not exist.

In [None]:
for c in capitals.keys():
  print(f"{c} is the capital of {capitals[c]}.")

2. We have to switch the country and the capital for the sentences to make sense.

In [None]:
for c in capitals.keys():
  print(f"{capitals[c]} is the capital of {c}.")

3. Optionally, we can give the loop variable a more meaningful name so that the code is easier to understand.

In [None]:
for country in capitals.keys():
  print(f"{capitals[country]} is the capital of {country}.")
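
As a further optional variation, `dict.items()` yields each key together with its value, which avoids the second lookup inside the loop. This is a minimal sketch that uses a shortened copy of the `capitals` dictionary so that it runs on its own; the `sentences` list is only there for illustration:

```python
# Shortened copy of the dictionary from the data creation cell.
capitals = {"Germany": "Berlin", "France": "Paris", "Brazil": "Brasilia"}

sentences = []
for country, capital in capitals.items():
    # Each iteration unpacks one (key, value) pair.
    sentences.append(f"{capital} is the capital of {country}.")

print("\n".join(sentences))
```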

## Exercise 2:
Let's imagine we want to show our love for Ringo Starr by printing a love statement for him once for each number in the `numbers` list. For every Beatle who is not Ringo, we want to print a hate statement the same number of times. The output should look like this:

```
I hate John Lennon!
I hate Paul McCartney!
I hate George Harrison!
I love Ringo Starr!


I hate John Lennon!
I hate Paul McCartney!
I hate George Harrison!
I love Ringo Starr!


I hate John Lennon!
I hate Paul McCartney!
I hate George Harrison!
I love Ringo Starr!


I hate John Lennon!
I hate Paul McCartney!
I hate George Harrison!
I love Ringo Starr!


I hate John Lennon!
I hate Paul McCartney!
I hate George Harrison!
I love Ringo Starr!
```



In [None]:
for beatle in beatles:
  if beatle = "Ringo Starr":
    for n in numbers:
      print(f"I love {beatle}!")
  if beatle != "Ringo Starr":
    print(f"I hate {beatle}!")
      print("\n")

**Solution:**

1. A single `=` sign is the assignment operator. To test for equality we need the comparison operator `==`:

In [None]:
for beatle in beatles:
  if beatle == "Ringo Starr":
    for n in numbers:
      print(f"I love {beatle}!")
  if beatle != "Ringo Starr":
    print(f"I hate {beatle}!")
      print("\n")

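2. The `print("\n")` statement needs to be properly indented. We also add the missing `for n in numbers:` loop to the hate branch so that both statements are repeated: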

In [None]:
for beatle in beatles:
  if beatle == "Ringo Starr":
    for n in numbers:
      print(f"I love {beatle}!")
  if beatle != "Ringo Starr":
    for n in numbers:
      print(f"I hate {beatle}!")
  print("\n")

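3. The statements are now grouped by Beatle rather than by repetition. To match the expected output, we iterate through `numbers` in the outer loop and through `beatles` in the inner loop.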

In [None]:
for n in numbers:
  for beatle in beatles:
    if beatle == "Ringo Starr":
      print(f"I love {beatle}!")
    if beatle != "Ringo Starr":
      print(f"I hate {beatle}!")
  print("\n")

4. Optionally, we can replace `if beatle != "Ringo Starr":` with `else`. It will make our code simpler and more elegant:

In [None]:
for n in numbers:
  for beatle in beatles:
    if beatle == "Ringo Starr":
      print(f"I love {beatle}!")
    else:
      print(f"I hate {beatle}!")
  print("\n")
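
One way to confirm that a fix produces the expected output is to collect the lines in a list instead of printing them directly, and then inspect the list. The sketch below does that; the `lines` list and the conditional expression are illustrative choices, not part of the original exercise:

```python
beatles = ["John Lennon", "Paul McCartney", "George Harrison", "Ringo Starr"]
numbers = [1, 2, 3, 4, 5]

lines = []
for _ in numbers:  # the value itself is unused, we only need the repetition
    for beatle in beatles:
        feeling = "love" if beatle == "Ringo Starr" else "hate"
        lines.append(f"I {feeling} {beatle}!")

print(len(lines))                          # 20 statements: 4 Beatles x 5 numbers
print(lines.count("I love Ringo Starr!"))  # 5
```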

## Exercise 3:

In [None]:
top_profitable_films = pd.DataFrame(top_profitable_films)
top_profitable_films.head

**Solution:**

1. We have not imported pandas yet, hence the `NameError` for `pd`.



In [None]:
import pandas as pd
top_profitable_films = pd.DataFrame(top_profitable_films)
top_profitable_films.head

2. It is not good practice to overwrite the variable that holds the original data when creating a new version of it. We want to preserve the dictionary `top_profitable_films`, so we store the DataFrame under a new name.

In [None]:
import pandas as pd
top_films_df = pd.DataFrame(top_profitable_films)
top_films_df.head

3. `head()` is a method, so it needs parentheses to be called. Without them, the expression returns the method object itself instead of the first rows.


In [None]:
import pandas as pd
top_films_df = pd.DataFrame(top_profitable_films)
top_films_df.head()
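
The difference between referring to a method and calling it is not specific to pandas. A small sketch with a built-in string method (chosen here only for illustration) shows the same behaviour:

```python
name = "ringo"

method_itself = name.upper  # no parentheses: a reference to the method
result = name.upper()       # parentheses: the method is actually called

print(callable(method_itself))  # True
print(result)                   # RINGO
```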

## Exercise 4:

In [None]:
top_films_df[top_films_df["Film"]=="Avatar"]["Year"] = "2009"

**Solution:**

In general, when selecting data from a DataFrame, and ALWAYS when overwriting data in it, use `.loc[]` instead of chained `[]` indexing.

When fixing the infamous `A value is trying to be set on a copy of a slice from a DataFrame.` warning (`SettingWithCopyWarning`), it's better to have a fresh start, so create the DataFrame again.

In [None]:
top_films_df = pd.DataFrame(top_profitable_films)

top_films_df.loc[top_films_df["Film"]=="Avatar", "Year"] = "2009"

In [None]:
top_films_df.head(2)
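
To see why the chained version has no effect, it can help to compare both approaches on a tiny table. The following is a sketch under assumptions: the two-row `films` DataFrame is a hypothetical stand-in for the full data, and the warning pandas emits for the chained assignment is deliberately silenced so that only the resulting values are shown:

```python
import warnings

import pandas as pd

films = pd.DataFrame({"Film": ["Avatar", "Titanic"], "Year": ["2007", "1997"]})

# Chained indexing: the boolean filter returns a new object, so the
# assignment lands on that temporary object and `films` is unchanged.
with warnings.catch_warnings():
    warnings.simplefilter("ignore")  # hide the warning pandas emits here
    films[films["Film"] == "Avatar"]["Year"] = "2009"
year_after_chained = films.loc[films["Film"] == "Avatar", "Year"].iloc[0]

# A single .loc call selects rows and column at once and writes in place.
films.loc[films["Film"] == "Avatar", "Year"] = "2009"
year_after_loc = films.loc[films["Film"] == "Avatar", "Year"].iloc[0]

print(year_after_chained, year_after_loc)
```

The first value is still the original year, while the second reflects the update made through `.loc[]`.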

## Exercise 5:

We want to get the average worldwide gross of all films:

In [None]:
top_films_df["Worldwide Gross (in billions)"].avg()

**Solution:**

1. The `AttributeError: 'Series' object has no attribute 'avg'` tells us that the method we used does not exist for a pandas column (which is a Series). A quick search shows us that the method we need is `mean()`:

In [None]:
top_films_df["Worldwide Gross (in billions)"].mean()

2. From the `TypeError` and its message `Could not convert 2.7982.7892.1942.0731.6731.6561.5191.4501.2761.263 to numeric`, we understand that these numbers are stored as strings rather than as a numeric data type, which prevents pandas from computing their mean. Let's convert them:

In [None]:
# Replace the whole column so that it takes on the new numeric dtype.
top_films_df["Worldwide Gross (in billions)"] = pd.to_numeric(top_films_df["Worldwide Gross (in billions)"])
top_films_df["Worldwide Gross (in billions)"].mean()
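
Before computing statistics, it can help to check what type a column holds. The sketch below uses a hypothetical three-value sample rather than the full table, and `pd.api.types.is_numeric_dtype` to test the Series before and after conversion:

```python
import pandas as pd

# Hypothetical three-value sample with the gross stored as strings.
gross = pd.Series(["2.798", "2.789", "2.194"])

print(pd.api.types.is_numeric_dtype(gross))          # False: strings

gross_numeric = pd.to_numeric(gross)
print(pd.api.types.is_numeric_dtype(gross_numeric))  # True: floats
print(gross_numeric.mean())
```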