# The World's Worst Variable

For this challenge, we will be dealing with the world's worst variable. 

Being able to navigate through complicated data structures and access the information you need is an important skill for a data scientist to hone! The variable we've created below has a lot of different data types nested inside it and we're going to ask you to access data, update data, delete data, and perform operations on data.

Make liberal use of Python's built-in functions (`len()`, `type()`) and dictionary methods (`dict.items()`, `dict.keys()`, `dict.values()`) to find out what kind of data you are working with at each level.

We're going to need to obtain data deeper inside the variable for most of these tasks and that's going to prove tricky!
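One way to get your bearings is to chain `type()` and `len()` one level at a time. The sketch below wraps that idea in a small helper. The name `describe` and the `toy` data are made up for illustration and are not part of the challenge; the helper only follows the first element at each level, so treat it as a rough map rather than a full inventory.

```python
# Hypothetical toy data, much smaller than the real variable
toy = [{("Some Movie", 2000): {"Actors": ["A", "B"], "Yields": [{"US": 1}]}}]

def describe(obj, depth=0):
    """Return one line per nesting level, following the first element only."""
    pad = "  " * depth
    if isinstance(obj, dict) and obj:
        first_value = next(iter(obj.values()))
        return [f"{pad}dict (len {len(obj)})"] + describe(first_value, depth + 1)
    if isinstance(obj, list) and obj:
        return [f"{pad}list (len {len(obj)})"] + describe(obj[0], depth + 1)
    # Anything else (str, int, empty containers, ...) ends the walk
    return [f"{pad}{type(obj).__name__}"]

print("\n".join(describe(toy)))
```

For `toy` this prints a list, then two nested dicts, then a list, then a `str`, which is the same kind of layering we are about to meet.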

Let's first load up the variable as `worlds_worst_variable` and then print it out in a useful way.

In [1]:
worlds_worst_variable = [{("Iron Man", 2008): {"Plot": "Rich man builds metal suit and blasts things",
                                               "Actors": ["RDJ", "Gwyneth Paltrow", "Jon Favreau"],
                                               "Box Office Yields": [{"US": 4000000}, {"England": 4500000}, {"Asia": 50000}]},
                        ("Thor", 2011): {"Plot": "Blonde man bashes things with hammer",
                                    "Actors": ["Chris Hemsworth", "Natalie Portman", "Jon Favreau"],
                                    "Box Office Yields": [{"US": 300000}, {"Europe": 320000}, {"Asia": 70000}]},
                        ("Doctor Strange", 2016): {"Plot": "Boy with spider powers saves New York",
                                                   "Actors": ["Benedict Cumberbatch", "Benedict Wong", "Rachel McAdams"],
                                                   "Box Office Yields": [{"US": 353300}, {"Europe": 320500}, {"Asia": 74000}]},
                        ("Captain America: Civil War", 2016): {"Plot": "Metal boy and America man fight",
                                                               "Actors": ["Chris Evans", "RDJ", "Scarlett Johanson"],
                                                               "Box Office Yields": [{"US": 3200300}, {"Europe": 3750500}, {"Asia": 7344000}]},
                        ("Deadpool", 2016): {"Plot": "Man in red and black suit can't die",
                                             "Actors": ["Ryan Reynolds", "Morena Baccarrin", "TJ Miller"],
                                             "Box Office Yields": [{"US": 200300}, {"Europe": 30500}, {"Asia": 7344000}]},
                        ("Morbius", 2022): {"Plot": "Terrible movie where man becomes a vampire",
                                            "Actors": ["Jared Leto", "Matt Smith", "Adria Arjona"],
                                            "Box Office Yields": [{"US": 9900300}, {"Europe": 440500}, {"Asia": 74000}]}}]

OK, let's print this nasty variable out to get a better look at how it's laid out! We're going to use the standard library's `pprint` ("pretty print") module to make this data structure a little easier to look at.

In [2]:
import pprint
pp = pprint.PrettyPrinter(indent=4)
pp.pprint(worlds_worst_variable)

[   {   ('Captain America: Civil War', 2016): {   'Actors': [   'Chris Evans',
                                                                'RDJ',
                                                                'Scarlett '
                                                                'Johanson'],
                                                  'Box Office Yields': [   {   'US': 3200300},
                                                                           {   'Europe': 3750500},
                                                                           {   'Asia': 7344000}],
                                                  'Plot': 'Metal boy and '
                                                          'America man fight'},
        ('Deadpool', 2016): {   'Actors': [   'Ryan Reynolds',
                                              'Morena Baccarrin',
                                              'TJ Miller'],
                                'Box Office Yields': [   {'US

!["Shocked Deadpool"](https://media.giphy.com/media/v1.Y2lkPTc5MGI3NjExY3ViamRhcm41cm4wYm1sbHdvaHltdHJ5ZW5mZXRpc3drdTR5bXB2OCZlcD12MV9pbnRlcm5hbF9naWZfYnlfaWQmY3Q9Zw/57ZvMMkuBIVMlU88Yh/giphy.gif)

Oh the horrors!

## Basic Investigation

First things first, what type is `worlds_worst_variable`?

In [3]:
type(worlds_worst_variable)

list

How many elements does this variable contain? _**Hint:** Here we mean: what is its length?_

In [4]:
len(worlds_worst_variable)

1

What type is the first element inside this variable?

In [5]:
type(worlds_worst_variable[0])

dict

Hmmm... that type could have lots of elements inside it. Let's get a list of every key inside.

In [6]:
worlds_worst_variable[0].keys()

dict_keys([('Iron Man', 2008), ('Thor', 2011), ('Doctor Strange', 2016), ('Captain America: Civil War', 2016), ('Deadpool', 2016), ('Morbius', 2022)])

## Task 1 - I am Iron Man!
Let's get down to business! Get all the information associated with the Iron Man movie and store it in a variable `iron_man`.

In [7]:
iron_man = worlds_worst_variable[0][("Iron Man", 2008)]
iron_man

{'Plot': 'Rich man builds metal suit and blasts things',
 'Actors': ['RDJ', 'Gwyneth Paltrow', 'Jon Favreau'],
 'Box Office Yields': [{'US': 4000000}, {'England': 4500000}, {'Asia': 50000}]}

Hmmm, looks like there is a mistake in the Iron Man details... It's unlikely that England alone had a higher box office yield than the entirety of the United States! A quick check would likely tell us that the key should be "Europe". So we should change the "England" key to "Europe" in the Iron Man entry stored under `Box Office Yields`.

Sounds easy! Unfortunately, a dictionary key can't be renamed in place: keys must be immutable (hashable) objects, and dictionaries offer no "rename" operation. The best approach is to add a new entry with the correct key and remove the old entry with the incorrect key.


So first let's look at the value associated with the key "Box Office Yields" in our `iron_man` details. What type is it?

In [8]:
# Find the dictionary that we need to correct
iron_man['Box Office Yields']

[{'US': 4000000}, {'England': 4500000}, {'Asia': 50000}]

In [9]:
type(iron_man["Box Office Yields"])

list

Hmmmm... that's an unusual way to store this information, definitely not best practice, but we're not here to fix the fundamental structural issues in this variable!
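For the curious only, here is a sketch of what a tidier layout could look like: one dictionary keyed by region instead of a list of single-entry dictionaries. The names `messy` and `tidy` are made up for this example, and we won't use this restructured form anywhere else in the challenge.

```python
# The Iron Man yields exactly as they are stored right now
messy = [{"US": 4000000}, {"England": 4500000}, {"Asia": 50000}]

# Merge every single-entry dictionary into one dictionary keyed by region
tidy = {region: amount for entry in messy for region, amount in entry.items()}

print(tidy)
print(tidy["Asia"])  # direct lookup by region, no list index needed
```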

Let's just make a little change. We will correct the faulty dictionary in this list so that it holds the right information: `{'Europe': 4500000}`

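Make sure you make the change through the full path into the original variable `worlds_worst_variable` rather than through our `iron_man` variable, so it's clear exactly which nested object is being modified!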
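A side note on why this distinction deserves care: assigning a nested dictionary to a new name does not copy it. The sketch below uses a made-up miniature (`movies` and `alias` are illustrative names, not part of the challenge) to show that mutating through the alias changes the original, while rebinding the alias does not.

```python
# Hypothetical miniature of the nested structure
movies = [{("Iron Man", 2008): {"Plot": "placeholder"}}]

# This assignment stores a reference to the same dictionary, not a copy
alias = movies[0][("Iron Man", 2008)]
print(alias is movies[0][("Iron Man", 2008)])   # True

# A change made through the alias is visible through the original path
alias["Plot"] = "changed via alias"
print(movies[0][("Iron Man", 2008)]["Plot"])    # changed via alias

# Rebinding the name, however, leaves the original untouched
alias = {"Plot": "a brand new dict"}
print(movies[0][("Iron Man", 2008)]["Plot"])    # changed via alias
```

Going through the full path every time avoids having to remember which of these two situations you are in.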

In [10]:
worlds_worst_variable[0][("Iron Man", 2008)]['Box Office Yields'][1]['Europe'] = 4500000

Let's take a look at that list of "Box Office Yields" now!

In [11]:
worlds_worst_variable[0][("Iron Man", 2008)]['Box Office Yields']

[{'US': 4000000}, {'England': 4500000, 'Europe': 4500000}, {'Asia': 50000}]

Now we need to take out the incorrect key-value pair

In [12]:
worlds_worst_variable[0][("Iron Man", 2008)]['Box Office Yields'][1].pop('England')

4500000

Fantastic! Now the original variable has been corrected. We could have achieved this in one line, because the dictionary method `.pop()` *returns* the value it removes from the dictionary.

`.pop()` removes an entry from a list or dictionary and returns it; for a dictionary, that's the value stored under the key you pass in. Refer to the [Python docs](https://docs.python.org/3/library/stdtypes.html?#dict.pop) or give it a google for more information.
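Here is a minimal sketch of `.pop()` on both types, using small stand-in variables (`yields` and `cast` are made up for this example) rather than the real data:

```python
yields = {"England": 4500000}        # stand-in dictionary
cast = ["RDJ", "Gwyneth Paltrow"]    # stand-in list

# dict.pop(key) removes the key and returns its value
removed_value = yields.pop("England")
print(removed_value, yields)         # 4500000 {}

# list.pop() with no argument removes and returns the last item
removed_item = cast.pop()
print(removed_item, cast)            # Gwyneth Paltrow ['RDJ']

# dict.pop(key, default) returns the default instead of raising KeyError
missing = yields.pop("France", None)
print(missing)                       # None
```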

That would have looked like this:

`worlds_worst_variable[0][("Iron Man", 2008)]['Box Office Yields'][1]['Europe'] = worlds_worst_variable[0][("Iron Man", 2008)]['Box Office Yields'][1].pop('England')`

However, it's not necessary to do everything in one go like this when we're still learning. Let's learn to walk before we run!

Let's save the "Europe" key-value pair dictionary we created to a variable called `updated_box_office`

In [13]:
# Confirming change
updated_box_office = worlds_worst_variable[0][("Iron Man", 2008)]['Box Office Yields'][1]
updated_box_office

{'Europe': 4500000}

##  Task 2 - Missing Michael in Morbius

We're missing a surprising cast member from the biggest box-office-smash blockbuster hit of 2022. That's right, we'll need to add Michael Keaton to the cast of Morbius.

First, let's see which actors did make it into the cast list of our terrible variable!

In [14]:
# Look at the current cast list before deciding how to change it
worlds_worst_variable[0][("Morbius", 2022)]['Actors']

['Jared Leto', 'Matt Smith', 'Adria Arjona']

What type is that cast list?

In [15]:
type(worlds_worst_variable[0][("Morbius", 2022)]['Actors'])

list

Great, it should be pretty easy to add some information to that data type. Go ahead and add `'Michael Keaton'` to the Morbius cast list inside `worlds_worst_variable`.

In [16]:
# It's a list, .append() is the go to method
worlds_worst_variable[0][("Morbius", 2022)]['Actors'].append('Michael Keaton')

Let's take a look at that cast list now and make sure it worked! Save that updated cast list to a variable `morbius_cast`

In [17]:
# Confirming change and saving the updated cast list
morbius_cast = worlds_worst_variable[0][("Morbius", 2022)]['Actors']
morbius_cast

['Jared Leto', 'Matt Smith', 'Adria Arjona', 'Michael Keaton']

## Task 3 - Doctor Spider?

Let's look at the details for `'Doctor Strange'` in `worlds_worst_variable`

In [18]:
# have a look at the existing data
worlds_worst_variable[0][('Doctor Strange', 2016)]

{'Plot': 'Boy with spider powers saves New York',
 'Actors': ['Benedict Cumberbatch', 'Benedict Wong', 'Rachel McAdams'],
 'Box Office Yields': [{'US': 353300}, {'Europe': 320500}, {'Asia': 74000}]}

Hmmmm... I'm not sure that plot is correct. Looks like we're going to have to update that value in the Doctor Strange dictionary.

A more fitting plot description might be `"Man with goatee learns to make sparks."`

Before we can make this change, make sure we know exactly what kind of data we're working with.

In [19]:
# Determine data type
type(worlds_worst_variable[0][("Doctor Strange", 2016)])

dict

Knowing that it is a dictionary, reassign the value in the relevant key-value pair so that we've replaced that spider plot with our goatee plot!

In [20]:
worlds_worst_variable[0][("Doctor Strange", 2016)]['Plot'] = "Man with goatee learns to make sparks."

Now check that the `worlds_worst_variable` has been updated with the correct plot

In [21]:
# Confirm changes
worlds_worst_variable[0][("Doctor Strange", 2016)]

{'Plot': 'Man with goatee learns to make sparks.',
 'Actors': ['Benedict Cumberbatch', 'Benedict Wong', 'Rachel McAdams'],
 'Box Office Yields': [{'US': 353300}, {'Europe': 320500}, {'Asia': 74000}]}

## Task 4 -  Total European Box Office in 2016

There are quite a few movies from 2016 in our variable. Let's add up all of the European box office yields for the movies released in 2016.

**Hint:** First you might want to figure out how they are all stored, then you'll probably want to write some kind of for loop

In [22]:
# find where European yields are in the variable
worlds_worst_variable[0][('Iron Man', 2008)]['Box Office Yields'][1]

{'Europe': 4500000}

In [23]:
total_yield = 0

for movie_name, movie_info in worlds_worst_variable[0].items():
    # When building loops, work iteratively, look at intermediate steps, for example:
    # print(movie_name, movie_info)

    # if statement to check year is 2016
    if movie_name[1] == 2016:
        # add yield to total_yield
        total_yield += movie_info['Box Office Yields'][1]['Europe']

print(f'Total yield in Europe is {total_yield} American dollary doos.')

Total yield in Europe is 4101500 American dollary doos.
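The loop above relies on "Europe" always sitting at index `1` of each yields list. If that ordering ever differed, the lookup would raise a `KeyError`. Below is a hedged sketch of a lookup by region name instead; `regional_yield` and the `sample` data are made up for illustration and are not part of the challenge.

```python
def regional_yield(movie_info, region):
    """Return the yield for `region`, or 0 if the movie has no entry for it."""
    for entry in movie_info["Box Office Yields"]:
        if region in entry:
            return entry[region]
    return 0

# Hypothetical data where "Europe" is not always at the same index
sample = {
    ("Movie A", 2016): {"Box Office Yields": [{"US": 10}, {"Europe": 20}]},
    ("Movie B", 2016): {"Box Office Yields": [{"Europe": 5}, {"US": 7}]},
    ("Movie C", 2015): {"Box Office Yields": [{"Europe": 100}]},
}

total = 0
for (title, year), info in sample.items():
    if year == 2016:
        total += regional_yield(info, "Europe")

print(total)  # 25
```

Unpacking the key as `(title, year)` also avoids the slightly cryptic `movie_name[1]`.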


### This for loop could have been a list comprehension... Or could it?

Sometimes we're so busy thinking about whether we _CAN_ write a list comprehension that we forget to stop and think about whether we _SHOULD_.

In [24]:
# as comprehension
total_yield = sum([movie_info['Box Office Yields'][1]['Europe'] for movie_name, movie_info in worlds_worst_variable[0].items() if movie_name[1] == 2016 ])
print(f'Total yield in Europe is {total_yield} American dollars.')

Total yield in Europe is 4101500 American dollars.


Readable code is almost always preferable to single-line code, and this line definitely isn't easy to read...

This is why you **_should not_** use list comprehensions on variables with complex structures!

# Excelsior! You did it! Time to push to kitt!