# Debugging Python code

Go through the exercises below.

- Each exercise contains some code with **one or more mistakes**.
- **Important**: mistakes do not always raise an error. Some are "logical errors": the code runs, but the output is not the desired one, and you will have to work out why.
- There might be multiple ways to fix the mistakes.
- Improving the code readability is also encouraged.

In [98]:
# data creation
beatles = ["John Lennon", "Paul McCartney", "George Harrison", "Ringo Starr"]

numbers = [1, 2, 3, 4, 5]

capitals = {"Germany": "Berlin",
            "Russia": "Moscow",
            "France": "Paris",
            "China": "Beijing",
            "Egypt": "Cairo",
            "Brazil": "Sao Paulo"
            }

top_profitable_films = {
    "Film": ["Avengers: Endgame", "Avatar", "Titanic", "Star Wars: The Force Awakens", "Jurassic World",
             "The Lion King", "The Avengers", "Frozen II", "Frozen", "Beauty and the Beast"],
    "Year": ["2019", "2007", "1997", "2015", "2015", "2019", "2012", "2019", "2013", "2017"],
    "Worldwide Gross (in billions)": ["2.798", "2.789", "2.194", "2.073", "1.673", "1.656", "1.519",
                                      "1.450", "1.276", "1.263"]
    }

## Exercise 1:

In [None]:
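for c in Capitals.keys():  # error 1: names are case sensitive, and the dictionary is called "capitals", not "Capitals"
  print(f"{c} is the capital of {Capitals[c]}.")  # error 2 (logical): the country and the capital are swapped in the sentence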

NameError: name 'Capitals' is not defined

In [68]:
# My Code

for c in capitals.keys():  # use "capitals" instead of "Capitals"
  print(f"{capitals[c]} is the capital of {c}.")  # capital first, then country

Berlin is the capital of Germany.
Moscow is the capital of Russia.
Paris is the capital of France.
Beijing is the capital of China.
Cairo is the capital of Egypt.
Sao Paulo is the capital of Brazil.
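
As an alternative to looking up each value by its key, the loop can iterate over key/value pairs with `dict.items()`. The following is a minimal sketch that uses a shortened copy of the dictionary so it runs on its own; the names `country`, `capital`, and `sentences` are chosen here for readability and are not part of the exercise.

```python
# Shortened copy of the exercise data, so the block runs on its own
capitals = {"Germany": "Berlin", "France": "Paris", "Egypt": "Cairo"}

# items() yields (key, value) pairs, so no lookup by key is needed
sentences = [
    f"{capital} is the capital of {country}."
    for country, capital in capitals.items()
]

for sentence in sentences:
    print(sentence)
```

Naming both loop variables makes it harder to swap the country and the capital by accident.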


## Exercise 2:
Let's imagine we want to show our love for Ringo Starr by printing a love statement for him once for every number in the `numbers` list. For every Beatle who is not Ringo, we want to print a hate statement the same number of times. The output should look like this:

```
I hate John Lennon!
I hate Paul McCartney!
I hate George Harrison!
I love Ringo Starr!


I hate John Lennon!
I hate Paul McCartney!
I hate George Harrison!
I love Ringo Starr!


I hate John Lennon!
I hate Paul McCartney!
I hate George Harrison!
I love Ringo Starr!


I hate John Lennon!
I hate Paul McCartney!
I hate George Harrison!
I love Ringo Starr!


I hate John Lennon!
I hate Paul McCartney!
I hate George Harrison!
I love Ringo Starr!
```



In [7]:
for beatle in beatles:  # errors: "=" assigns, a comparison needs "=="; print("\n") is indented too far; the loop over numbers must be the outer loop; the second check for Ringo Starr is redundant
  if beatle = "Ringo Starr":
    for n in numbers:
      print(f"I love {beatle}!")
  if beatle != "Ringo Starr":
    print(f"I hate {beatle}!")
      print("\n")

SyntaxError: invalid syntax. Maybe you meant '==' or ':=' instead of '='? (<ipython-input-7-5581e587c023>, line 2)

In [77]:
# My Code

for n in numbers:  # outer loop: runs 5 times, once for each item in "numbers"
  for beatle in beatles:  # inner loop: goes through all the Beatles
    if beatle == "Ringo Starr":  # check for Ringo Starr
      print(f"I love {beatle}!")
    else:  # no second check is needed here, "else:" is enough
      print(f"I hate {beatle}!")
  print("\n")

I hate John Lennon!
I hate Paul McCartney!
I hate George Harrison!
I love Ringo Starr!


I hate John Lennon!
I hate Paul McCartney!
I hate George Harrison!
I love Ringo Starr!


I hate John Lennon!
I hate Paul McCartney!
I hate George Harrison!
I love Ringo Starr!


I hate John Lennon!
I hate Paul McCartney!
I hate George Harrison!
I love Ringo Starr!


I hate John Lennon!
I hate Paul McCartney!
I hate George Harrison!
I love Ringo Starr!
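
Logical errors like the wrong loop order are easier to catch when the text is built first and printed afterwards, because the result can then be compared with the expected lines. The following is a rough sketch of that idea; the helper `statements` and its parameters are hypothetical names introduced here, not part of the exercise.

```python
beatles = ["John Lennon", "Paul McCartney", "George Harrison", "Ringo Starr"]
numbers = [1, 2, 3, 4, 5]


def statements(names, favourite):
    """Return one love/hate line per name (hypothetical helper)."""
    return [
        f"I love {name}!" if name == favourite else f"I hate {name}!"
        for name in names
    ]


# One block of statements per item in numbers, as in the exercise
for _ in numbers:
    print("\n".join(statements(beatles, "Ringo Starr")))
    print("\n")
```

Because `statements` returns a list, a single block can be checked with a plain comparison before it is printed five times.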




## Exercise 3:

In [28]:
top_profitable_films = pd.DataFrame(top_profitable_films)  # error: pandas has not been imported, so "pd" is not defined
top_profitable_films.head  # error: without parentheses the method is not called

In [78]:
# My Code

import pandas as pd # import the library

top_profitable_films = pd.DataFrame(top_profitable_films)
top_profitable_films.head()  # head is a method, so it needs parentheses to be called

| | Film | Year | Worldwide Gross (in billions) |
|---|---|---|---|
| 0 | Avengers: Endgame | 2019 | 2.798 |
| 1 | Avatar | 2009 | 2.789 |
| 2 | Titanic | 1997 | 2.194 |
| 3 | Star Wars: The Force Awakens | 2015 | 2.073 |
| 4 | Jurassic World | 2015 | 1.673 |
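
The difference between `head` and `head()` is the general difference between referring to a method and calling it. The following small sketch shows this with a built-in string method rather than pandas, so it runs without any imports; the variable names are illustrative only.

```python
name = "Ringo Starr"

method = name.upper    # no parentheses: a reference to the method itself
result = name.upper()  # parentheses: the method is called and returns a value

print(callable(method))  # True
print(result)            # RINGO STARR
```

In the same way, `top_profitable_films.head` without parentheses evaluates to the method object, not to the first rows of the DataFrame.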


## Exercise 4:

We realised that, in `top_films_df`, the year of the movie Avatar is wrong. We want to replace it with the correct one, 2009:

In [37]:
top_films_df[top_films_df["Film"]=="Avatar"]["Year"] = "2009" # errors: top_films_df has not been defined yet, and chained indexing assigns to a temporary copy

A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  top_films_df[top_films_df["Film"]=="Avatar"]["Year"] = "2009"


In [39]:
# My Code

top_films_df = pd.DataFrame(top_profitable_films)  # define top_films_df first

# Use .loc to select the row and the column in a single step.
# The chained indexing above first builds a filtered copy and then assigns to that copy,
# so the original DataFrame is not changed.
top_films_df.loc[top_films_df["Film"] == "Avatar", "Year"] = "2009"
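
To confirm that the `.loc` assignment really changed the DataFrame, the affected cell can be read back afterwards. The following is a minimal sketch on a two-row sample; the names `films` and `is_avatar` are introduced here for illustration, and it assumes pandas is installed.

```python
import pandas as pd

# Two-row sample with the same column names as the exercise data
films = pd.DataFrame({"Film": ["Avatar", "Titanic"], "Year": ["2007", "1997"]})

# Boolean mask that is True only for the Avatar row
is_avatar = films["Film"] == "Avatar"

# Row selector and column label in one .loc call: assigns into films itself
films.loc[is_avatar, "Year"] = "2009"

print(films.loc[is_avatar, "Year"].tolist())  # ['2009']
```

Storing the mask in a variable keeps the assignment line short and lets the same mask be reused for the check.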

## Exercise 5:

We want to get the average worldwide gross of all films:

In [61]:
top_films_df["Worldwide Gross (in billions)"].avg() # errors: pandas has no .avg() method (use .mean()), and the column holds strings, not numbers

TypeError: Could not convert string '2.7982.7892.1942.0731.6731.6561.5191.4501.2761.263' to numeric

In [102]:
# My Code

# There are two errors here. First, pandas has no .avg() method, so we use .mean() instead.
# Second, the column holds strings, so we convert it to float before averaging,
# either with .astype(float) or with pd.to_numeric().
top_films_df["Worldwide Gross (in billions)"].astype(float).mean()

1.8691000000000002
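
The `pd.to_numeric()` route mentioned above can be sketched as follows. With `errors="coerce"`, values that cannot be parsed become `NaN` instead of raising an error, and `mean()` skips them by default. The sample Series, including the unparseable `"n/a"` entry, is made up for illustration and is not part of the exercise data.

```python
import pandas as pd

# Hypothetical sample: two numeric strings and one value that cannot be parsed
gross = pd.Series(["2.798", "2.789", "n/a"])

# errors="coerce" turns unparseable entries into NaN instead of raising
gross_numeric = pd.to_numeric(gross, errors="coerce")

print(gross_numeric.isna().sum())      # how many values could not be converted
print(round(gross_numeric.mean(), 4))  # mean of the remaining numeric values
```

By contrast, `.astype(float)` raises a `ValueError` on such an entry, which is often preferable when bad data should not pass silently.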

In [104]:
top_films_df = pd.DataFrame(top_profitable_films)

result = round(top_films_df["Worldwide Gross (in billions)"].astype(float).mean(), 2)
f_result = f"{result} billion"
print(f_result)

1.87 billion
