<img src="http://imgur.com/1ZcRyrc.png" style="float: left; margin: 20px; height: 55px">

## Python Review with Movie Data


---

In this lab you'll use the IMDb `movies` list provided below as your dataset.

This lab is designed to practice iteration and functions in particular. The normal questions are gentler, and the challenge questions are suitable for students with advanced Python or general programming experience.

Every question requires you to write a function that uses iteration. Print a test call for each function you write.


### 1. Load the provided list of movies dictionaries.

In [2]:
# List of movies dictionaries:
movies = [
{
"name": "Usual Suspects", 
"imdb": 7.0,
"category": "Thriller"
},
{
"name": "Hitman",
"imdb": 6.3,
"category": "Action"
},
{
"name": "Dark Knight",
"imdb": 9.0,
"category": "Adventure"
},
{
"name": "The Help",
"imdb": 8.0,
"category": "Drama"
},
{
"name": "The Choice",
"imdb": 6.2,
"category": "Romance"
},
{
"name": "Colonia",
"imdb": 7.4,
"category": "Romance"
},
{
"name": "Love",
"imdb": 6.0,
"category": "Romance"
},
{
"name": "Bride Wars",
"imdb": 5.4,
"category": "Romance"
},
{
"name": "AlphaJet",
"imdb": 3.2,
"category": "War"
},
{
"name": "Ringing Crime",
"imdb": 4.0,
"category": "Crime"
},
{
"name": "Joking muck",
"imdb": 7.2,
"category": "Comedy"
},
{
"name": "What is the name",
"imdb": 9.2,
"category": "Suspense"
},
{
"name": "Detective",
"imdb": 7.0,
"category": "Suspense"
},
{
"name": "Exam",
"imdb": 4.2,
"category": "Thriller"
},
{
"name": "We Two",
"imdb": 7.2,
"category": "Romance"
}
]

---

### 2. Filtering data by IMDB score

#### 2.1 

Write a function that:

1. Accepts a single movie dictionary from the `movies` list as an argument.
2. Returns True if the IMDB score is above 5.5.

#### 2.2 [Challenge] 

Write a function that:

1. Accepts the movies list and a specified category.
2. Returns True if the average score of the category is higher than the average score of all movies.

In [3]:
def filter21(dictionary):
    return dictionary['imdb'] > 5.5 

filter21(movies[2])

True
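The call above checks a single movie. As a minimal sketch of how the same check could be applied to a whole list, the block below uses a hypothetical helper `is_above_threshold` and a small `sample_movies` subset of the lab data so that it runs on its own; neither name is part of the lab.

```python
# Hypothetical subset of the lab's `movies` list, repeated so the sketch is self-contained.
sample_movies = [
    {"name": "Hitman", "imdb": 6.3, "category": "Action"},
    {"name": "Bride Wars", "imdb": 5.4, "category": "Romance"},
    {"name": "AlphaJet", "imdb": 3.2, "category": "War"},
    {"name": "Dark Knight", "imdb": 9.0, "category": "Adventure"},
]


def is_above_threshold(movie, threshold=5.5):
    """Return True if the movie's IMDb score is strictly above `threshold`."""
    return movie["imdb"] > threshold


# Iterate over the list and keep the names of the movies that pass the check.
passing_names = [movie["name"] for movie in sample_movies if is_above_threshold(movie)]
print(passing_names)  # ['Hitman', 'Dark Knight']
```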

In [40]:
# solution 1 (for loop syntax)
def filter22_for_loop(cat, movies = movies):
    movies_sum = []
    movies_sum_cat = []
    
    n_movies = len(movies)
    n_movies_cat = 0
    
    for movie in movies:        
        movies_sum.append(movie['imdb'])
        if movie['category'] == cat:
            n_movies_cat = n_movies_cat + 1
            movies_sum_cat.append(movie['imdb'])

    avg = sum(movies_sum) / n_movies
    avg_cat = sum(movies_sum_cat) / n_movies_cat
    
    return avg_cat > avg

filter22_for_loop('Suspense')

True

In [39]:
# solution 2 (list comprehension)
def filter22(cat, movies = movies):
    avg = sum([movie['imdb'] for movie in movies])/len(movies)
    cat_movies = [movie for movie in movies if movie['category'] == cat]
    return sum([movie['imdb'] for movie in cat_movies])/len(cat_movies) > avg

filter22('Suspense')

True

Suspense movies have a higher average score than the full list of movies.
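Both solutions above divide by the number of movies in the category, so they raise `ZeroDivisionError` when the category does not appear in the data. One possible guard is sketched below. The function name `category_beats_overall`, the choice to return `None` for a missing category, and the `sample_movies` subset are assumptions made for illustration.

```python
from statistics import mean

# Hypothetical subset of the lab's `movies` list, repeated so the sketch is self-contained.
sample_movies = [
    {"name": "Dark Knight", "imdb": 9.0, "category": "Adventure"},
    {"name": "What is the name", "imdb": 9.2, "category": "Suspense"},
    {"name": "Detective", "imdb": 7.0, "category": "Suspense"},
    {"name": "Exam", "imdb": 4.2, "category": "Thriller"},
]


def category_beats_overall(movies, cat):
    """Return True/False, or None when no movie belongs to `cat`."""
    cat_scores = [movie["imdb"] for movie in movies if movie["category"] == cat]
    if not cat_scores:
        # Assumption: signal a missing category with None instead of dividing by zero.
        return None
    overall_scores = [movie["imdb"] for movie in movies]
    return mean(cat_scores) > mean(overall_scores)


print(category_beats_overall(sample_movies, "Suspense"))  # True
print(category_beats_overall(sample_movies, "Thriller"))  # False
print(category_beats_overall(sample_movies, "Western"))   # None
```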

---

### 3. Creating subsets by numeric condition

#### 3.1

Write a function that:

1. Accepts the list of movies and a specified imdb score.
2. Returns the sublist of movies that have a score greater than the specified score.

#### 3.2 [Expert] 

Write a function that:

1. Accepts the movies list as an argument.
2. Returns the movies list sorted first by each category's average score, and then, within each category, by each movie's individual score.

In [25]:
# solution 1 (for loop syntax; not wrapped in a function, threshold fixed at 7)
good_movies = []
for movie in movies:
    if movie['imdb'] > 7:
        good_movies.append(movie['name'])
        
print(good_movies)

['Dark Knight', 'The Help', 'Colonia', 'Joking muck', 'What is the name', 'We Two']


In [27]:
# solution 2 (list comprehension syntax)
def filter31(score, movies=movies):
    return [movie['name'] for movie in movies if movie['imdb'] > score]

filter31(7)

['Dark Knight',
 'The Help',
 'Colonia',
 'Joking muck',
 'What is the name',
 'We Two']
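No solution is shown for question 3.2, so here is one possible sketch. It assumes descending order (highest-scoring category first, highest-scoring movie first within each category), which the question does not specify. The function name `sort_by_category_then_score` and the `sample_movies` subset are hypothetical.

```python
from collections import defaultdict

# Hypothetical subset of the lab's `movies` list, repeated so the sketch is self-contained.
sample_movies = [
    {"name": "Hitman", "imdb": 6.3, "category": "Action"},
    {"name": "The Choice", "imdb": 6.2, "category": "Romance"},
    {"name": "Colonia", "imdb": 7.4, "category": "Romance"},
    {"name": "What is the name", "imdb": 9.2, "category": "Suspense"},
    {"name": "Detective", "imdb": 7.0, "category": "Suspense"},
]


def sort_by_category_then_score(movies):
    """Sort by category average (descending), then by movie score (descending)."""
    # Collect every score under its category.
    scores_by_cat = defaultdict(list)
    for movie in movies:
        scores_by_cat[movie["category"]].append(movie["imdb"])

    # Average score per category.
    cat_avg = {cat: sum(scores) / len(scores) for cat, scores in scores_by_cat.items()}

    # The category name sits in the key so that two categories with the same
    # average still stay grouped together.
    return sorted(
        movies,
        key=lambda m: (cat_avg[m["category"]], m["category"], m["imdb"]),
        reverse=True,
    )


sorted_names = [m["name"] for m in sort_by_category_then_score(sample_movies)]
print(sorted_names)
# ['What is the name', 'Detective', 'Colonia', 'The Choice', 'Hitman']
```

`sorted` returns a new list, so the original `movies` list is left unchanged.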

---

### 4. Creating subsets by string condition

#### 4.1

Write a function that:

1. Accepts the movies list and a category name.
2. Returns the names of the movies in that category, matching the category name case-insensitively.
3. If the category is not in the data, print a message that it does not exist and return None.

Recall that to convert a string to lowercase, you can use:

```python
mystring = 'Dumb and Dumber'
lowercase_mystring = mystring.lower()
print(lowercase_mystring)
# dumb and dumber
```

#### 4.2 [Challenge]

Write a function that:

1. Accepts the movies list and a "search string".
2. Returns a dictionary with keys `'category'` and `'title'`. The value for `'category'` is the list of categories that contain the search string, and the value for `'title'` is the list of titles that contain it. Both matches are case-insensitive.

In [41]:
def filter41(cat, movies = movies):
    movie_list = [movie['name'] for movie in movies if movie['category'].lower() == cat.lower()]
    if len(movie_list) >= 1:
        return movie_list
    else:
        print(f"{cat} is not a valid category")
        return None

filter41('romance')

['The Choice', 'Colonia', 'Love', 'Bride Wars', 'We Two']

In [42]:
filter41('Romance')

['The Choice', 'Colonia', 'Love', 'Bride Wars', 'We Two']

In [53]:
def filter42(search, movies=movies):
    dct = {} 
    dct['title'] = [movie['name'] for movie in movies if search.lower() in movie['name'].lower()]
    # build a set first to remove duplicate categories,
    # then convert it to a list so both values have the same type
    dct['category'] = list({movie['category'] for movie in movies
                            if search.lower() in movie['category'].lower()})
    return dct

# return every title and every category that contains 'r' (case-insensitive)
filter42('r')

{'title': ['Dark Knight', 'Bride Wars', 'Ringing Crime'],
 'category': ['War', 'Thriller', 'Adventure', 'Romance', 'Crime', 'Drama']}
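Because the categories pass through a set, their order in the output above is not guaranteed and can differ between runs. If a stable order matters, one option is to sort the set, as in the sketch below. The function name `search_movies` and the `sample_movies` subset are hypothetical.

```python
# Hypothetical subset of the lab's `movies` list, repeated so the sketch is self-contained.
sample_movies = [
    {"name": "Dark Knight", "imdb": 9.0, "category": "Adventure"},
    {"name": "Bride Wars", "imdb": 5.4, "category": "Romance"},
    {"name": "AlphaJet", "imdb": 3.2, "category": "War"},
    {"name": "Love", "imdb": 6.0, "category": "Romance"},
]


def search_movies(search, movies):
    """Case-insensitive substring search over titles and categories."""
    needle = search.lower()
    titles = [m["name"] for m in movies if needle in m["name"].lower()]
    # sorted() accepts the set directly and returns an alphabetically ordered list.
    categories = sorted({m["category"] for m in movies if needle in m["category"].lower()})
    return {"title": titles, "category": categories}


result = search_movies("R", sample_movies)
print(result)
# {'title': ['Dark Knight', 'Bride Wars'], 'category': ['Adventure', 'Romance', 'War']}
```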