# Enriching a dataset
Setting `how='left'` in the `.merge()` method is a useful technique for enriching a dataset with additional information from another table. In this exercise, you will start with a sample of movie data from the Toy Story series. Your goal is to enrich this data by adding the marketing tagline for each movie. You will then compare the results of a left join and an inner join.
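
As a minimal sketch of the difference, the two join types can be compared on tiny stand-in tables. The names `films` and `extras` and their contents are hypothetical and are not part of the exercise data.

```python
import pandas as pd

# Hypothetical stand-in tables (not the course data)
films = pd.DataFrame({"id": [1, 2, 3], "title": ["A", "B", "C"]})
extras = pd.DataFrame({"id": [1, 2], "tagline": ["first", "second"]})

# A left join keeps every row of the left table; unmatched rows get NaN
left_joined = films.merge(extras, on="id", how="left")

# An inner join keeps only the rows whose key appears in both tables
inner_joined = films.merge(extras, on="id", how="inner")

print(left_joined.shape)   # (3, 3)
print(inner_joined.shape)  # (2, 3)
```

The left join returns all three films, with a missing tagline for `id` 3, while the inner join drops that film entirely.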

The *toy_story* DataFrame contains the *Toy Story* movies. The *toy_story* and *taglines* DataFrames have been loaded for you.

In [3]:
import pandas as pd

path=r'/media/documentos/Cursos/Data Science/Python/Data_Science_Python/data_sets/'

# Load the movies table and keep only the Toy Story titles
movies=pd.read_pickle(path+'movies.p')
toy_story=movies[movies['title'].str.contains('Toy Story', na=False)]
print('toy_story \n',toy_story.head(),'\n')

# Load the taglines table
taglines=pd.read_pickle(path+'taglines.p')
print('taglines \n',taglines.head(),'\n')

toy_story 
          id        title  popularity release_date
103   10193  Toy Story 3   59.995418   2010-06-16
2637    863  Toy Story 2   73.575118   1999-10-30
3716    862    Toy Story   73.640445   1995-10-30 

taglines 
        id                                         tagline
0   19995                     Enter the World of Pandora.
1     285  At the end of the world, the adventure begins.
2  206647                           A Plan No One Escapes
3   49026                                 The Legend Ends
4   49529            Lost in our world, found in another. 



- Merge *toy_story* and *taglines* on the *id* column with a **left join**, and save the result as *toystory_tag*.

In [4]:
# Merge the toy_story and taglines tables with a left join
toystory_tag = toy_story.merge(taglines,on='id',how='left')

# Print the rows and shape of toystory_tag
print(toystory_tag)
print(toystory_tag.shape)

      id        title  popularity release_date                   tagline
0  10193  Toy Story 3   59.995418   2010-06-16  No toy gets left behind.
1    863  Toy Story 2   73.575118   1999-10-30        The toys are back!
2    862    Toy Story   73.640445   1995-10-30                       NaN
(3, 5)
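
The `NaN` in the *tagline* column marks a row from the left table that had no match in *taglines*. One way to flag such rows explicitly is the `indicator=True` option of `.merge()`, which adds a `_merge` column. The sketch below uses hand-built copies of the rows printed above (`toy_story_demo` and `taglines_demo` are assumed names, and only two columns are reproduced), not the pickled data.

```python
import pandas as pd

# Hand-built copies of the rows shown in the output above (assumed stand-ins)
toy_story_demo = pd.DataFrame({
    "id": [10193, 863, 862],
    "title": ["Toy Story 3", "Toy Story 2", "Toy Story"],
})
taglines_demo = pd.DataFrame({
    "id": [10193, 863],
    "tagline": ["No toy gets left behind.", "The toys are back!"],
})

# indicator=True adds a _merge column: 'both', 'left_only', or 'right_only'
flagged = toy_story_demo.merge(taglines_demo, on="id", how="left", indicator=True)

# Rows that exist only in the left table have no tagline
missing = flagged[flagged["_merge"] == "left_only"]
print(missing[["id", "title"]])
```

Here the only row flagged as `left_only` is the first *Toy Story* movie (`id` 862), which matches the `NaN` in the left-join result.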


- With *toy_story* as the left table, merge *taglines* to it on the *id* column with an **inner join**, and save the result as *toystory_tag*.

In [5]:
# Merge the toy_story and taglines tables with an inner join (the default for how)
toystory_tag = toy_story.merge(taglines,on='id',how='inner')

# Print the rows and shape of toystory_tag
print(toystory_tag)
print(toystory_tag.shape)

      id        title  popularity release_date                   tagline
0  10193  Toy Story 3   59.995418   2010-06-16  No toy gets left behind.
1    863  Toy Story 2   73.575118   1999-10-30        The toys are back!
(2, 5)
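
The inner join returns two rows where the left join returned three, because the movie without a tagline is dropped. As a rough check of that relationship, the sketch below rebuilds the two tables by hand from the output shown above (`toy_story_demo` and `taglines_demo` are assumed names) and confirms that dropping the unmatched rows from the left join leaves the same movies as the inner join. This assumes that *id* is unique in both tables and that no matched tagline is itself missing.

```python
import pandas as pd

# Hand-built copies of the rows shown in the output above (assumed stand-ins)
toy_story_demo = pd.DataFrame({
    "id": [10193, 863, 862],
    "title": ["Toy Story 3", "Toy Story 2", "Toy Story"],
})
taglines_demo = pd.DataFrame({
    "id": [10193, 863],
    "tagline": ["No toy gets left behind.", "The toys are back!"],
})

left_joined = toy_story_demo.merge(taglines_demo, on="id", how="left")
inner_joined = toy_story_demo.merge(taglines_demo, on="id", how="inner")

# Drop the left-join rows that found no tagline
left_matched = left_joined.dropna(subset=["tagline"]).reset_index(drop=True)

# Compare the surviving ids with those of the inner join
same_ids = left_matched["id"].tolist() == inner_joined["id"].tolist()
print(len(left_joined), len(inner_joined), same_ids)
```

Use the left join when every movie must stay in the result, and the inner join when only movies with a tagline are of interest.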
