# Self join
Merging a table to itself is useful when you want to compare values in a column with other values in the same column. In this exercise, you will practice this by building a table that lists, for each movie, the director and one other crew member on each row.

You have been given a table called **crews**, which has columns **id**, **department**, **job**, and **name**.
- First, merge the table to itself on the movie ID. The result is a larger table in which, for each movie, every job is paired with every job in that movie, including itself.
- Then keep only the rows with a director in the left table, and exclude rows where the job in the right table is also a director. This filtering removes job combinations that do not involve the director.
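The two steps above can be sketched on a small stand-in table before running them on the real data. This is a minimal illustration, not the exercise solution: `toy_crews` and the names in it are made up, and only the `id`, `job`, and `name` columns are included.

```python
import pandas as pd

# Toy stand-in for the crews table (values are invented for illustration)
toy_crews = pd.DataFrame({
    "id": [1, 1, 1, 2, 2],
    "job": ["Director", "Editor", "Writer", "Director", "Producer"],
    "name": ["Ann", "Bob", "Cara", "Dev", "Eve"],
})

# Self merge on the movie id: a movie with n rows yields n * n pairs
toy_pairs = toy_crews.merge(
    toy_crews, on="id", how="inner", suffixes=("_dir", "_crew")
)

# Keep pairs with a director on the left and a non-director on the right
toy_direct = toy_pairs[
    (toy_pairs["job_dir"] == "Director") & (toy_pairs["job_crew"] != "Director")
]

print(len(toy_pairs))  # 3*3 + 2*2 = 13 pairs before filtering
print(toy_direct[["id", "name_dir", "job_crew", "name_crew"]])
```

Movie 1 has three crew rows and movie 2 has two, so the merge produces 9 + 4 = 13 rows. The filter keeps three of them: one per non-director crew member, each paired with that movie's director.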

In [3]:
import pandas as pd

crews = pd.read_pickle(r'datasets/crews.p')
crews

| | id | department | job | name |
|---|---|---|---|---|
| 0 | 19995 | Editing | Editor | Stephen E. Rivkin |
| 2 | 19995 | Sound | Sound Designer | Christopher Boyes |
| 4 | 19995 | Production | Casting | Mali Finn |
| 6 | 19995 | Directing | Director | James Cameron |
| 7 | 19995 | Writing | Writer | James Cameron |
| ... | ... | ... | ... | ... |
| 129574 | 126186 | Directing | Director | Daniel Hsia |
| 129576 | 25975 | Production | Executive Producer | Clark Peterson |
| 129578 | 25975 | Directing | Director | Brian Herzlinger |
| 129579 | 25975 | Directing | Director | Jon Gunn |


In [29]:
# Merge the crews table to itself
crews_self_merged = pd.merge(
    left=crews,
    right=crews,
    on='id',
    how='inner',
    suffixes=['_dir', '_crew']
)



# Create a Boolean index to select the appropriate rows:
# a director on the left, and any job except director on the right
boolean_filter = (
    (crews_self_merged['job_dir'] == 'Director')
    & (crews_self_merged['job_crew'] != 'Director')
)
direct_crews = crews_self_merged[boolean_filter]

# Print the first few rows of direct_crews
direct_crews.head()

| | id | department_dir | job_dir | name_dir | department_crew | job_crew | name_crew |
|---|---|---|---|---|---|---|---|
| 156 | 19995 | Directing | Director | James Cameron | Editing | Editor | Stephen E. Rivkin |
| 157 | 19995 | Directing | Director | James Cameron | Sound | Sound Designer | Christopher Boyes |
| 158 | 19995 | Directing | Director | James Cameron | Production | Casting | Mali Finn |
| 160 | 19995 | Directing | Director | James Cameron | Writing | Writer | James Cameron |
| 161 | 19995 | Directing | Director | James Cameron | Art | Set Designer | Richard F. Mays |


Great job! By merging the table to itself, you compared the __Director__ value in the job column with the other values in the same column. The output lets you quickly see each movie's director alongside the people they worked with on that movie.
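One consequence of the filter is worth checking: because it excludes every right-hand row whose job is `Director`, a movie with two directors (such as movie 25975 in the data above) will not pair the co-directors with each other. The sketch below shows this on a made-up table. The names and the `alt` filter are hypothetical illustrations, not part of the exercise.

```python
import pandas as pd

# Hypothetical movie with two directors and one producer
co_directed = pd.DataFrame({
    "id": [7, 7, 7],
    "job": ["Director", "Director", "Producer"],
    "name": ["Dana", "Eli", "Fay"],
})

pairs = co_directed.merge(co_directed, on="id", suffixes=("_dir", "_crew"))

# Same filter as in the exercise: drops every director-director pair
kept = pairs[(pairs["job_dir"] == "Director") & (pairs["job_crew"] != "Director")]

# Assumed alternative: drop only rows pairing a director with their own
# Director entry, so co-directors still appear next to each other
is_self_pair = (pairs["job_crew"] == "Director") & (
    pairs["name_crew"] == pairs["name_dir"]
)
alt = pairs[(pairs["job_dir"] == "Director") & ~is_self_pair]

print(kept[["name_dir", "job_crew", "name_crew"]])
print(alt[["name_dir", "job_crew", "name_crew"]])
```

With the exercise filter, each director is paired only with the producer. The alternative also keeps the Dana/Eli and Eli/Dana rows. Which behaviour is appropriate depends on whether co-directors should count as people the director worked with.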