<img src="http://imgur.com/1ZcRyrc.png" style="float: left; margin: 20px; height: 55px">

## Python Review With Movie Data

_Authors: Kiefer Katovich and Dave Yerrington (San Francisco)_

---

In this lab, you'll be using the [IMDb](http://www.imdb.com/) `movies` list below as your data set. 

This lab is designed to help you practice iteration and functions in particular. The normal questions are gentler, while the challenge questions suit advanced Python students or those with prior programming experience.

Every question requires you to write a function and use iteration. Print a test call for each function you write.


### 1) Load the provided list of `movies` dictionaries.

In [72]:
import pandas as pd

In [73]:
# List of movies dictionaries:

movies = [
{
"name": "Usual Suspects", 
"imdb": 7.0,
"category": "Thriller"
},
{
"name": "Hitman",
"imdb": 6.3,
"category": "Action"
},
{
"name": "Dark Knight",
"imdb": 9.0,
"category": "Adventure"
},
{
"name": "The Help",
"imdb": 8.0,
"category": "Drama"
},
{
"name": "The Choice",
"imdb": 6.2,
"category": "Romance"
},
{
"name": "Colonia",
"imdb": 7.4,
"category": "Romance"
},
{
"name": "Love",
"imdb": 6.0,
"category": "Romance"
},
{
"name": "Bride Wars",
"imdb": 5.4,
"category": "Romance"
},
{
"name": "AlphaJet",
"imdb": 3.2,
"category": "War"
},
{
"name": "Ringing Crime",
"imdb": 4.0,
"category": "Crime"
},
{
"name": "Joking muck",
"imdb": 7.2,
"category": "Comedy"
},
{
"name": "What is the name",
"imdb": 9.2,
"category": "Suspense"
},
{
"name": "Detective",
"imdb": 7.0,
"category": "Suspense"
},
{
"name": "Exam",
"imdb": 4.2,
"category": "Thriller"
},
{
"name": "We Two",
"imdb": 7.2,
"category": "Romance"
}
]

In [74]:
movies

[{'category': 'Thriller', 'imdb': 7.0, 'name': 'Usual Suspects'},
 {'category': 'Action', 'imdb': 6.3, 'name': 'Hitman'},
 {'category': 'Adventure', 'imdb': 9.0, 'name': 'Dark Knight'},
 {'category': 'Drama', 'imdb': 8.0, 'name': 'The Help'},
 {'category': 'Romance', 'imdb': 6.2, 'name': 'The Choice'},
 {'category': 'Romance', 'imdb': 7.4, 'name': 'Colonia'},
 {'category': 'Romance', 'imdb': 6.0, 'name': 'Love'},
 {'category': 'Romance', 'imdb': 5.4, 'name': 'Bride Wars'},
 {'category': 'War', 'imdb': 3.2, 'name': 'AlphaJet'},
 {'category': 'Crime', 'imdb': 4.0, 'name': 'Ringing Crime'},
 {'category': 'Comedy', 'imdb': 7.2, 'name': 'Joking muck'},
 {'category': 'Suspense', 'imdb': 9.2, 'name': 'What is the name'},
 {'category': 'Suspense', 'imdb': 7.0, 'name': 'Detective'},
 {'category': 'Thriller', 'imdb': 4.2, 'name': 'Exam'},
 {'category': 'Romance', 'imdb': 7.2, 'name': 'We Two'}]

In [75]:
movies[:2]

[{'category': 'Thriller', 'imdb': 7.0, 'name': 'Usual Suspects'},
 {'category': 'Action', 'imdb': 6.3, 'name': 'Hitman'}]

In [76]:
movies[2]['name']

'Dark Knight'

In [77]:
pd.DataFrame(movies)

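| | category | imdb | name |
|---|---|---|---|
| 0 | Thriller | 7.0 | Usual Suspects |
| 1 | Action | 6.3 | Hitman |
| 2 | Adventure | 9.0 | Dark Knight |
| 3 | Drama | 8.0 | The Help |
| 4 | Romance | 6.2 | The Choice |
| 5 | Romance | 7.4 | Colonia |
| 6 | Romance | 6.0 | Love |
| 7 | Romance | 5.4 | Bride Wars |
| 8 | War | 3.2 | AlphaJet |
| 9 | Crime | 4.0 | Ringing Crime |
| 10 | Comedy | 7.2 | Joking muck |
| 11 | Suspense | 9.2 | What is the name |
| 12 | Suspense | 7.0 | Detective |
| 13 | Thriller | 4.2 | Exam |
| 14 | Romance | 7.2 | We Two |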


---

### 2) Filtering data by IMDb score.

#### 2.1)

Write a function that:

1) Accepts a single movie dictionary from the `movies` list as an argument.
2) Returns `True` if the IMDb score is greater than 5.5.

#### 2.2 [Challenge])

Write a function that:

1) Accepts the `movies` list and a specified category.
2) Returns `True` if the average score of the category is higher than the average score of all movies.

In [78]:
# 2.1
def better_than_average_movie(movie_dict):
    # True when the movie's IMDb score is strictly greater than 5.5.
    return movie_dict['imdb'] > 5.5

In [79]:
better_than_average_movie(movies[0])

True

In [80]:
better_than_average_movie(movies[8])

False

In [81]:
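# 2.2
def better_than_average_category(category, movie_list=movies):
    # Compare the category's mean IMDb score with the mean across all movies.
    movie_df = pd.DataFrame(movie_list)
    avg_score = movie_df['imdb'].mean()
    category_avg = movie_df[movie_df['category'] == category]['imdb'].mean()
    return category_avg > avg_score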

In [82]:
better_than_average_category('Suspense')

True

In [83]:
better_than_average_category('Thriller')

False
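Since this lab is about iteration, the same check can also be written without pandas. The following is a minimal sketch, not the reference solution: the names `mean_score` and `category_beats_overall` are illustrative, the `sample` list is a small subset of the lab data, and treating an unknown category as "not better" is an assumption.

```python
def mean_score(movie_list):
    """Average IMDb score of a list of movie dictionaries."""
    total = 0.0
    for movie in movie_list:
        total += movie["imdb"]
    return total / len(movie_list)


def category_beats_overall(movie_list, category):
    """True if the category's mean score exceeds the mean of all movies."""
    in_category = [m for m in movie_list if m["category"] == category]
    if not in_category:
        # Assumption: a category with no movies is treated as not better.
        return False
    return mean_score(in_category) > mean_score(movie_list)


# Small subset of the lab data, used only for demonstration.
sample = [
    {"name": "Usual Suspects", "imdb": 7.0, "category": "Thriller"},
    {"name": "What is the name", "imdb": 9.2, "category": "Suspense"},
    {"name": "Detective", "imdb": 7.0, "category": "Suspense"},
    {"name": "Exam", "imdb": 4.2, "category": "Thriller"},
]

print(category_beats_overall(sample, "Suspense"))
print(category_beats_overall(sample, "Thriller"))
```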

---

### 3) Creating subsets by numeric condition.

#### 3.1)

Write a function that:

1) Accepts the list of movies and a specified IMDb score.
2) Returns the sublist of movies that have scores greater than the one specified.

#### 3.2 [Expert])

Write a function that:

1) Accepts the `movies` list as an argument.
2) Returns the `movies` list sorted first by category (ordered by each category's average score) and then, within each category, by individual IMDb score.

In [91]:
# 3.1
def movies_rated_above(score, movie_list=movies):
    # Note: this returns the names of the matching movies, not the full dictionaries.
    movie_df = pd.DataFrame(movie_list)
    return list(movie_df[movie_df['imdb'] > score]['name'])

In [92]:
movies_rated_above(7.5)

['Dark Knight', 'The Help', 'What is the name']

In [93]:
movies_rated_above(5.5)

['Usual Suspects',
 'Hitman',
 'Dark Knight',
 'The Help',
 'The Choice',
 'Colonia',
 'Love',
 'Joking muck',
 'What is the name',
 'Detective',
 'We Two']
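Question 3.1 asks for the sublist of movies, so a version that keeps the full dictionaries may be closer to the prompt. Here is a minimal sketch using a plain loop; the function name `movies_scoring_above` and the `sample` list (a subset of the lab data) are illustrative only.

```python
def movies_scoring_above(movie_list, score):
    """Return the movie dictionaries whose IMDb score is strictly above `score`."""
    result = []
    for movie in movie_list:
        if movie["imdb"] > score:
            result.append(movie)
    return result


# Small subset of the lab data, used only for demonstration.
sample = [
    {"name": "Hitman", "imdb": 6.3, "category": "Action"},
    {"name": "Dark Knight", "imdb": 9.0, "category": "Adventure"},
    {"name": "The Help", "imdb": 8.0, "category": "Drama"},
    {"name": "AlphaJet", "imdb": 3.2, "category": "War"},
]

for movie in movies_scoring_above(sample, 7.5):
    print(movie["name"], movie["imdb"])
```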

In [None]:
# 3.2 Sorted first by category and then by movie according to category average score and individual IMDb score
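One possible approach to 3.2 is sketched below. The prompt does not say whether the order should be ascending or descending, so this sketch assumes descending (best category first, best movie first within a category). The function name `sort_by_category_then_score` and the `sample` list are illustrative, and the category name is included in the sort key only so that categories with equal averages stay grouped together.

```python
from collections import defaultdict


def sort_by_category_then_score(movie_list):
    """Sort by category average score, then by individual score (both descending)."""
    # Collect the scores for each category in one pass.
    scores_by_category = defaultdict(list)
    for movie in movie_list:
        scores_by_category[movie["category"]].append(movie["imdb"])

    # Average score per category.
    category_avg = {
        category: sum(scores) / len(scores)
        for category, scores in scores_by_category.items()
    }

    # Sort key: (category average, category name as tie-breaker, movie score).
    return sorted(
        movie_list,
        key=lambda m: (category_avg[m["category"]], m["category"], m["imdb"]),
        reverse=True,
    )


# Small subset of the lab data, used only for demonstration.
sample = [
    {"name": "Usual Suspects", "imdb": 7.0, "category": "Thriller"},
    {"name": "The Help", "imdb": 8.0, "category": "Drama"},
    {"name": "What is the name", "imdb": 9.2, "category": "Suspense"},
    {"name": "Detective", "imdb": 7.0, "category": "Suspense"},
    {"name": "Exam", "imdb": 4.2, "category": "Thriller"},
]

for movie in sort_by_category_then_score(sample):
    print(movie["category"], movie["imdb"], movie["name"])
```

`sorted` returns a new list, so the original `movies` list is left unchanged.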



---

### 4) Creating subsets by string condition.

#### 4.1)

Write a function that:

1) Accepts the `movies` list and a category name.
2) Returns the movie names within that category (case-insensitive!).
3) If the category is not in the data, prints a message that says it does not exist and returns `None`.

Recall that, to convert a string to lowercase, you can use:

```python
mystring = 'Dumb and Dumber'
lowercase_mystring = mystring.lower()
print(lowercase_mystring)
# dumb and dumber
```
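Applying `.lower()` to both sides of the comparison gives a case-insensitive category match. The following is a minimal sketch of 4.1 under that approach; the function name `names_in_category`, the message wording, and the `sample` list (a subset of the lab data) are illustrative.

```python
def names_in_category(movie_list, category):
    """Return the movie names in `category` (case-insensitive), or None if absent."""
    wanted = category.lower()
    names = []
    for movie in movie_list:
        if movie["category"].lower() == wanted:
            names.append(movie["name"])
    if not names:
        print(f"Category '{category}' does not exist in the data.")
        return None
    return names


# Small subset of the lab data, used only for demonstration.
sample = [
    {"name": "The Choice", "imdb": 6.2, "category": "Romance"},
    {"name": "Colonia", "imdb": 7.4, "category": "Romance"},
    {"name": "AlphaJet", "imdb": 3.2, "category": "War"},
]

print(names_in_category(sample, "ROMANCE"))
print(names_in_category(sample, "Horror"))
```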

#### 4.2 [Challenge])

Write a function that:

1) Accepts the `movies` list and a "search string."
2) Returns a dictionary with the keys `'category'` and `'title'`. The value for `'category'` is the list of categories that contain the search string, and the value for `'title'` is the list of titles that contain it (case-insensitive!).

In [None]:
# Your code here.
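As a starting point for 4.2, here is a minimal sketch that uses substring matching on lowercased text. The function name `search_movies` and the `sample` list are illustrative, and listing each matching category only once is an assumption, since the prompt does not say how to handle repeats.

```python
def search_movies(movie_list, search_string):
    """Find categories and titles that contain `search_string` (case-insensitive)."""
    needle = search_string.lower()
    matches = {"category": [], "title": []}
    for movie in movie_list:
        category = movie["category"]
        # Assumption: each matching category is reported once, not once per movie.
        if needle in category.lower() and category not in matches["category"]:
            matches["category"].append(category)
        if needle in movie["name"].lower():
            matches["title"].append(movie["name"])
    return matches


# Small subset of the lab data, used only for demonstration.
sample = [
    {"name": "The Help", "imdb": 8.0, "category": "Drama"},
    {"name": "Ringing Crime", "imdb": 4.0, "category": "Crime"},
    {"name": "Love", "imdb": 6.0, "category": "Romance"},
    {"name": "We Two", "imdb": 7.2, "category": "Romance"},
]

print(search_movies(sample, "CRIME"))
print(search_movies(sample, "rom"))
```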

---

### 5) Multiple conditions.

#### 5.1)

Write a function that:

1) Accepts the `movies` list and a "search criteria" variable.
2) If the criteria variable is numeric, return a list of movie titles with a score greater than or equal to the criteria.
3) If the criteria variable is a string, return a list of movie titles that match that category (case-insensitive!). If there is no match, return an empty list and print an informative message.
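The branching on the criteria type can be sketched with `isinstance`. This is only one way to do it: the function name `find_movies`, the message wording, and the `sample` list are illustrative, and raising `TypeError` for any other input type is an assumption that the prompt does not require.

```python
def find_movies(movie_list, criteria):
    """Return titles by minimum score (numeric criteria) or by category (string)."""
    if isinstance(criteria, (int, float)):
        # Numeric criteria: keep scores greater than or equal to the value.
        return [m["name"] for m in movie_list if m["imdb"] >= criteria]
    if isinstance(criteria, str):
        wanted = criteria.lower()
        names = [m["name"] for m in movie_list if m["category"].lower() == wanted]
        if not names:
            print(f"No movies found in category '{criteria}'.")
        return names
    # Assumption: any other type is rejected rather than silently ignored.
    raise TypeError("criteria must be a number or a string")


# Small subset of the lab data, used only for demonstration.
sample = [
    {"name": "Dark Knight", "imdb": 9.0, "category": "Adventure"},
    {"name": "Colonia", "imdb": 7.4, "category": "Romance"},
    {"name": "Bride Wars", "imdb": 5.4, "category": "Romance"},
]

print(find_movies(sample, 7.4))
print(find_movies(sample, "romance"))
print(find_movies(sample, "Western"))
```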

#### 5.2 [Expert])

Write a function that:

1) Accepts the `movies` list and a string search criteria variable.
2) The search criteria variable can contain within it:
  - Boolean operations: `'AND'`, `'OR'`, and `'NOT'` (these may also be lowercase; they are capitalized here only for clarity).
  - Search criteria specified with the syntax `score=...`, `category=...`, and/or `title=...`, where the `...` indicates what to look for.
    - If `score` is present, it indicates scores greater than or equal to the value.
    - For `category` and `title`, the string indicates that the category or title must _contain_ the search string (case-insensitive).
3) Return the matches for the search criteria specified.

In [None]:
# Your code here.
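A full solution to 5.2 needs a small query parser. The sketch below is a simplified starting point under several assumptions that the prompt does not state: there are no parentheses, `NOT` applies to a single term, `AND` binds tighter than `OR`, search values do not themselves contain the words "and" or "or" surrounded by spaces, and the matches are returned as movie dictionaries. The names `term_matches` and `search` and the `sample` list are illustrative.

```python
import re


def term_matches(movie, term):
    """Evaluate one `field=value` term, optionally prefixed with NOT."""
    term = term.strip()
    negate = False
    if term.lower().startswith("not "):
        negate = True
        term = term[4:].strip()
    field, _, value = term.partition("=")
    field = field.strip().lower()
    value = value.strip()
    if field == "score":
        result = movie["imdb"] >= float(value)
    elif field == "category":
        result = value.lower() in movie["category"].lower()
    elif field == "title":
        result = value.lower() in movie["name"].lower()
    else:
        raise ValueError(f"Unknown search field: {field!r}")
    return (not result) if negate else result


def search(movie_list, criteria):
    """Return the movies that satisfy the boolean search string."""
    # OR has the lowest precedence, so split on it first, then on AND.
    or_groups = re.split(r"\s+or\s+", criteria, flags=re.IGNORECASE)
    matches = []
    for movie in movie_list:
        for group in or_groups:
            terms = re.split(r"\s+and\s+", group, flags=re.IGNORECASE)
            if all(term_matches(movie, term) for term in terms):
                matches.append(movie)
                break  # one satisfied OR group is enough
    return matches


# Small subset of the lab data, used only for demonstration.
sample = [
    {"name": "Usual Suspects", "imdb": 7.0, "category": "Thriller"},
    {"name": "Colonia", "imdb": 7.4, "category": "Romance"},
    {"name": "Love", "imdb": 6.0, "category": "Romance"},
    {"name": "AlphaJet", "imdb": 3.2, "category": "War"},
    {"name": "Detective", "imdb": 7.0, "category": "Suspense"},
]

for query in ("category=romance AND score=7",
              "category=war or title=love",
              "NOT category=romance and score=7"):
    print(query, "->", [m["name"] for m in search(sample, query)])
```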