# Has Many Movie Lab

### Introduction
In this lab we will continue to look at "has-many" relationships in our data. The database used in this lab contains information about a selection of movies and related entities such as actors, directors, and writers. A movie has relationships with actors, directors, and writers. Actors, directors, and writers are also related to one another through the movies they share (e.g., a director will have worked with many actors). In the problems below, we will use our knowledge of these relationships to build SQL queries.

Let's begin by connecting to the database and reviewing the schema of the tables.

In [2]:
import sqlite3
conn = sqlite3.connect('movie_films_actors.db')
cursor = conn.cursor()

In [1]:
import pandas as pd
root_url = "https://raw.githubusercontent.com/data-eng-10-21/has-many-movies-lab/main/"
names = ['actors',
         'directors',
         'writers',
         'movies',
         'movie_actors',
         'movie_directors',
         'movie_writers']

# Read each CSV from the repository into its own DataFrame.
loaded_dfs = [pd.read_csv(f'{root_url}{name}.csv') for name in names]


In [3]:
for name, df in zip(names, loaded_dfs):
    df.to_sql(name, conn, index=False, if_exists='replace')

In [4]:
cursor.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
cursor.fetchall()

[('actors',),
 ('directors',),
 ('writers',),
 ('movies',),
 ('movie_actors',),
 ('movie_directors',),
 ('movie_writers',)]

In [5]:
cursor.execute('PRAGMA table_info(movies)')
cursor.fetchall()

[(0, 'id', 'INTEGER', 0, None, 0),
 (1, 'title', 'TEXT', 0, None, 0),
 (2, 'studio', 'TEXT', 0, None, 0),
 (3, 'runtime', 'REAL', 0, None, 0),
 (4, 'description', 'TEXT', 0, None, 0),
 (5, 'release_date', 'TEXT', 0, None, 0),
 (6, 'year', 'INTEGER', 0, None, 0)]

In [6]:
cursor.execute('PRAGMA table_info(actors)')
cursor.fetchall()

[(0, 'id', 'INTEGER', 0, None, 0), (1, 'name', 'TEXT', 0, None, 0)]

In [7]:
cursor.execute('PRAGMA table_info(directors)')
cursor.fetchall()

[(0, 'id', 'INTEGER', 0, None, 0), (1, 'name', 'TEXT', 0, None, 0)]

In [8]:
cursor.execute('PRAGMA table_info(writers)')
cursor.fetchall()

[(0, 'id', 'INTEGER', 0, None, 0), (1, 'name', 'TEXT', 0, None, 0)]

In [9]:
cursor.execute('PRAGMA table_info(movie_actors)')
cursor.fetchall()

[(0, 'id', 'INTEGER', 0, None, 0),
 (1, 'movie_id', 'INTEGER', 0, None, 0),
 (2, 'actor_id', 'INTEGER', 0, None, 0)]

In [10]:
cursor.execute('PRAGMA table_info(movie_directors)')
cursor.fetchall()

[(0, 'id', 'INTEGER', 0, None, 0),
 (1, 'movie_id', 'INTEGER', 0, None, 0),
 (2, 'director_id', 'INTEGER', 0, None, 0)]

In [44]:
cursor.execute('PRAGMA table_info(movie_writers)')
cursor.fetchall()

[(0, 'index', 'INTEGER', 0, None, 0),
 (1, 'movie_id', 'INTEGER', 0, None, 0),
 (2, 'writer_id', 'INTEGER', 0, None, 0)]
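Each row returned by `PRAGMA table_info` has the form `(cid, name, type, notnull, dflt_value, pk)`, so the column name is the second element. As a minimal sketch, the per-table inspection above could be collapsed into one loop. The helper `table_columns` and the in-memory tables below are hypothetical stand-ins, not part of the lab's database:

```python
import sqlite3

def table_columns(connection, table_name):
    """Return the column names of a table as reported by PRAGMA table_info."""
    # Each PRAGMA row is (cid, name, type, notnull, dflt_value, pk).
    rows = connection.execute(f'PRAGMA table_info({table_name})').fetchall()
    return [row[1] for row in rows]

# Toy in-memory database (an assumption) so the sketch runs without the lab's CSV files.
demo_conn = sqlite3.connect(':memory:')
demo_conn.execute('CREATE TABLE movies (id INTEGER, title TEXT)')
demo_conn.execute(
    'CREATE TABLE movie_actors (id INTEGER, movie_id INTEGER, actor_id INTEGER)')

# List every table, then map each table name to its column names.
table_names = [row[0] for row in demo_conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'")]
schema = {name: table_columns(demo_conn, name) for name in table_names}
print(schema)
```

Run against the lab's `conn`, the same loop should make it easy to spot that the join tables (`movie_actors`, `movie_directors`, `movie_writers`) each hold a `movie_id` plus one other foreign key.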

Let's start off with some basic single-table queries:

* What is the title, length, and id of the movie with the longest runtime?

In [None]:

# [('Never Sleep Again: The Elm Street Legacy', 480.0, 11415)]

* Using your answer from the previous question, how many actors were credited for the movie with the longest runtime? Hint: Use the COUNT function with the movie ID.

In [None]:

# [(6,)]

* What was the shortest movie released in 2006?

In [None]:


# [('The Guardian',)]

### Has Many
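The questions in this section follow a common pattern: start from one table, pass through a join table, and land on the related table. The sketch below shows that pattern, plus a `GROUP BY ... HAVING` aggregate, on a toy in-memory database. The table contents (`Movie A`, `Actor X`, and so on) are invented for illustration and are not taken from the lab's data:

```python
import sqlite3

# Toy data (an assumption) that mirrors the lab's table layout.
demo = sqlite3.connect(':memory:')
demo.executescript("""
    CREATE TABLE movies (id INTEGER, title TEXT);
    CREATE TABLE actors (id INTEGER, name TEXT);
    CREATE TABLE movie_actors (id INTEGER, movie_id INTEGER, actor_id INTEGER);
    INSERT INTO movies VALUES (1, 'Movie A'), (2, 'Movie B');
    INSERT INTO actors VALUES (10, 'Actor X'), (11, 'Actor Y'), (12, 'Actor Z');
    INSERT INTO movie_actors VALUES
        (1, 1, 10), (2, 1, 11), (3, 2, 10), (4, 2, 11), (5, 2, 12);
""")

# Has many: a movie has many actors, reached through the movie_actors join table.
cast = demo.execute("""
    SELECT actors.name
    FROM movies
    JOIN movie_actors ON movie_actors.movie_id = movies.id
    JOIN actors ON actors.id = movie_actors.actor_id
    WHERE movies.title = ?
    ORDER BY actors.name
""", ('Movie A',)).fetchall()

# Aggregate: keep only the movies with more than two credited actors.
big_casts = demo.execute("""
    SELECT movies.title, COUNT(*) AS actor_count
    FROM movies
    JOIN movie_actors ON movie_actors.movie_id = movies.id
    GROUP BY movies.id, movies.title
    HAVING COUNT(*) > 2
""").fetchall()

print(cast)
print(big_casts)
```

Swapping `actors` and `movie_actors` for the director or writer tables gives the same shape of query for those relationships.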

* What are the names of the actors in Toy Story?

In [None]:


# [('Tom Hanks',),
#  ('Jim Varney',),
#  ('Wallace Shawn',),
#  ('Don Rickles',),
#  ('John Ratzenberger',),
#  ('Tim Allen',)]

* What is the name of the director of Toy Story?

In [None]:


# [('John Lasseter',)]

* What are the names of the writers of Toy Story?

In [None]:

# [('Joss Whedon',), ('Joel Cohen',), ('Andrew Stanton',), ('Alec Sokolow',)]

* What are the name, actor id, and number of credits of the actor with the most credits in the database?

In [None]:


# [('Robert De Niro', 429, 78)]

* What are the titles of the movies the actor from the previous question has been in, after the year 2005?

In [None]:

# [("New Year's Eve",),
#  ('Mr. Warmth: The Don Rickles Project',),
#  ('Hands of Stone',),
#  ('Last Vegas',),
#  ('I Knew It Was You: Rediscovering John Cazale',),
#  ('Stardust',),
#  ('Killer Elite',),
#  ("Everybody's Fine",),
#  ('Stone',),
#  ('Machete',),
#  ('Red Lights',),
#  ('Righteous Kill',),
#  ('The Good Shepherd',),
#  ('The Bag Man',),
#  ('Being Flynn',),
#  ('Joy',),
#  ('The Wizard of Lies',),
#  ('Limitless',),
#  ('Killing Season',),
#  ('The Family',),
#  ('Heist',),
#  ('Great Expectations',),
#  ('Little Fockers',),
#  ('What Just Happened?',),
#  ('The Comedian',),
#  ('The Big Wedding',),
#  ('Dirty Grandpa',),
#  ('Grudge Match',)]

* What are the titles of movies with more than two directors?

In [None]:


# [('The Land Before Time III: The Time of Great Giving',),
#  ('101 Dalmatians',),
#  ('The Trip',),
#  ("Planet Terror (Grindhouse Presents: Robert Rodriguez's Planet Terror)",),
#  ('The Mummy',),
#  ('The Snowman',),
#  ('Zootopia',),
# ...

### Has Many Through
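In a "has many through" query, two entities are linked only by way of a third. For example, an actor has many directors through the movies they share, which means chaining two join tables on `movie_id`. The sketch below illustrates this on invented in-memory data (the names and ids are hypothetical, not the lab's):

```python
import sqlite3

# Toy data (an assumption): one actor credited on three movies by two directors.
demo = sqlite3.connect(':memory:')
demo.executescript("""
    CREATE TABLE directors (id INTEGER, name TEXT);
    CREATE TABLE movie_actors (id INTEGER, movie_id INTEGER, actor_id INTEGER);
    CREATE TABLE movie_directors (id INTEGER, movie_id INTEGER, director_id INTEGER);
    INSERT INTO directors VALUES (20, 'Director P'), (21, 'Director Q');
    INSERT INTO movie_actors VALUES (1, 1, 10), (2, 2, 10), (3, 3, 10);
    INSERT INTO movie_directors VALUES (1, 1, 20), (2, 2, 20), (3, 3, 21);
""")

actor_id = 10

# Every director the actor has worked with, linked through shared movie ids.
worked_with = demo.execute("""
    SELECT DISTINCT directors.name
    FROM movie_actors
    JOIN movie_directors ON movie_directors.movie_id = movie_actors.movie_id
    JOIN directors ON directors.id = movie_directors.director_id
    WHERE movie_actors.actor_id = ?
    ORDER BY directors.name
""", (actor_id,)).fetchall()

# The director the actor has worked with most often.
most_frequent = demo.execute("""
    SELECT directors.name, COUNT(*) AS shared_movies
    FROM movie_actors
    JOIN movie_directors ON movie_directors.movie_id = movie_actors.movie_id
    JOIN directors ON directors.id = movie_directors.director_id
    WHERE movie_actors.actor_id = ?
    GROUP BY directors.id, directors.name
    ORDER BY shared_movies DESC
    LIMIT 1
""", (actor_id,)).fetchall()

print(worked_with)
print(most_frequent)
```

Joining the `movies` table as well makes columns such as `year` or `studio` available for filtering and grouping.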

* What is the name of the writer in the database that has been credited the most times during the year 2018?

In [None]:


# [('Ryan Engle', 3)]

* What is the name of the actor or actress in the database that has been credited the most between 2010 and 2015 (inclusive)?

In [None]:


# [('James Franco', 22)]

* What are the names of all actors who performed in more than 3 movies in 2010?

In [None]:


# [('Aaron Taylor-Johnson',),
#  ('Adam Scott',),
#  ('Barry Pepper',),
#  ('Ben Stiller',),
#  ('Danny Huston',),
#  ('Gemma Arterton',),
#  ('Helen Mirren',),
#  ('Jay Baruchel',),
#  ('Jessica Alba',),
#  ('Jonah Hill',),
#  ('Josh Brolin',),
#  ('Josh Duhamel',),
#  ('Keith David',),
#  ('Liam Neeson',),
#  ('Matt Damon',),
#  ('Melissa Leo',),
#  ('Patricia Clarkson',),
#  ('Pierce Brosnan',),
#  ('Ralph Fiennes',),
#  ('Susan Sarandon',),
#  ('Zach Galifianakis',)]

* What studio has Steven Spielberg worked with the most?

In [None]:


# [('Universal Pictures', 7)]

* What years did Steven Spielberg direct 2 movies?

In [None]:


# [(1989, 2), (1993, 2), (1997, 2), (2002, 2), (2005, 2), (2011, 2), (2018, 2)]

* How many movies has each of the actors from Toy Story been in? (movie ID is 3648)

In [None]:


# [('Tom Hanks', 46),
#  ('Jim Varney', 8),
#  ('Wallace Shawn', 27),
#  ('Don Rickles', 10),
#  ('John Ratzenberger', 7),
#  ('Tim Allen', 20)]

* What are the names of other movies the director of Toy Story directed?

In [None]:


# [('Cars 2',), ('Cars',), ("A Bug's Life",), ('Toy Story 2',), ('Toy Story',)]

* What are the names of all the directors Tom Hanks has worked with? (Actor ID 189)

In [None]:


# [('Robert Zemeckis',),
#  ('Tom Hanks',),
#  ('Penny Marshall',),
#  ('Chris Paine',),
#  ('Doug Nichol',),
#  ('Steven Spielberg',),
#  ('Tom Tykwer',),
#  ('Sam Mendes',),
#  ('Steve Purcell (II)',),
#  ('Nora Ephron',),
#  ('Paul Greengrass',),
#  ('Ron Howard',),
#  ('Stephen Daldry',),
#  ('James Ponsoldt',),
#  ('Frank Darabont',),
#  ('David Seltzer',),
#  ('Meg Ryan',),
#  ('Lana Wachowski',),
#  ('Lilly Wachowski',),
#  ('Dario Argento',),
#  ('Angus MacLane',),
#  ('Clint Eastwood',),
#  ('Lee Unkrich',),
#  ('John Lasseter',),
#  ('Joel Coen',),
#  ('Alexander Mackendrick',),
#  ('Ethan Coen',),
#  ('Tom Mankiewicz',),
#  ('Stan Dragoti',),
#  ('Mike Nichols',),
#  ('Kevin Pollak',),
#  ('Roger Spottiswoode',),
#  ('Joe Dante',),
#  ('John Patrick Shanley',),
#  ('Garry Marshall',),
#  ('John Lee Hancock',),
#  ('Brian DePalma',)]

* What is the name of the director Tom Hanks has worked with the most?

In [None]:


# [('Steven Spielberg', 5)]

* What are the names of all the writers Tom Hanks has worked with?

> Last question.

In [None]:


# [('Eric Roth',),
#  ('Nia Vardalos',),
#  ('Tom Hanks',),
#  ('Gary Ross',),
#  ('Anne Spielberg',),
#  ('Chris Paine',),
#  ('Scott Frank',),
#  ('Robert Rodat',),
#  ('Frank Darabont',),
#  ('Tom Tykwer',),
#  ('Max Allan Collins',),
#  ('David Self',),
#  ('Richard Piers Rayner',),
#  ('Steve Purcell (II)',),
#  ('Jeff Nathanson',),
#  ('Sacha Gervasi',),
#  ('Nora Ephron',),
#  ('Delia Ephron',),
#  ('Mikls Lszl',),
#  ('Billy Ray',),
#  ('David Koepp',),
#  ('Akiva Goldsman',),
#  ('Dave Eggers',),
#  ('James Ponsoldt',),
#  ('Joel Coen',),
#  ('Ethan Coen',),
#  ('Matt Charman',),
#  ('Matthew Charman',),
#  ('David Seltzer',),
#  ('Erik Jendresen',),
#  ('Josh Singer',),
#  ('Liz Hannah',),
#  ('Lilly Wachowski',),
#  ('Lana Wachowski',),
#  ('Lowell Ganz',),
#  ('Bruce Jay Friedman',),
#  ('Babaloo Mandel',),
#  ('Brian Grazer',),
#  ('William Broyles',),
#  ('Dario Argento',),
#  ('Andrew Stanton',),
#  ('Todd Komarnicki',),
#  ('John Lasseter',),
#  ('Lee Unkrich',),
#  ('Michael Arndt',),
#  ('Robert Zemeckis',),
#  ('Joss Whedon',),
#  ('Joel Cohen',),
#  ('Alec Sokolow',),
#  ('William Rose',),
#  ('Al Reinert',),
#  ('Tom Mankiewicz',),
#  ('Dan Aykroyd',),
#  ('Alan Zweibel',),
#  ('Robert Klane',),
#  ('Norman Steinberg',),
#  ('Aaron Sorkin',),
#  ('Kim Wilson',),
#  ('Kevin Pollak',),
#  ('John Varhous',),
#  ('John Vorhaus',),
#  ('Michael Blodgett',),
#  ('Dennis Shryack',),
#  ('Daniel Petrie Jr.',),
#  ('Timothy Harris',),
#  ('Dana Olsen',),
#  ('Robert Collector',),
#  ('Mitch Markowitz',),
#  ('David S. Ward',),
#  ('Jeff Arch',),
#  ('Larry Atlas',),
#  ('John Patrick Shanley',),
#  ('Michael Preminger',),
#  ('Larry Grusin',),
#  ('Rick Podell',),
#  ('Kelly Marcel',),
#  ('Sue Smith',),
#  ('Michael Cristofer',)]

### Conclusion
The movie database we queried during this lab had a multitude of relationships between its tables. We saw how we could use JOIN to connect the tables in order to query information about entities stored in different tables.