# Introduction

Data often comes to us with column names, index names, or other naming conventions that we are not satisfied with. Here you'll learn how to use pandas functions to change the names of the offending entries to something better.

You'll also explore how to combine data from multiple DataFrames and/or Series.


# Renaming

The first function we'll introduce here is `rename()`, which lets you change index names and/or column names. For example, to change the `points` column in our dataset to `score`, we would do:

In [1]:

import pandas as pd
# pd.set_option('display.max_rows', 5)
reviews = pd.read_csv("winemag-data-130k-v2.csv", index_col=0)

In [3]:
reviews.head()

id,country,description,designation,points,price,province,region_1,region_2,taster_name,taster_twitter_handle,title,variety,winery,dupe?
94355,Austria,"""Chremisa,"" the ancient name of Krems, is comm...",Edition Chremisa Sandgrube 13,85,24.0,Niederösterreich,,,Roger Voss,@vossroger,Winzer Krems 2011 Edition Chremisa Sandgrube 1...,Grüner Veltliner,Winzer Krems,
126883,US,$10 for this very drinkable Cab? That's crazy....,,87,10.0,California,North Coast,North Coast,Virginie Boone,@vboone,Line 39 2009 Cabernet Sauvignon (North Coast),Cabernet Sauvignon,Line 39,
119493,US,$14 is a pretty good price for a Chardonnay th...,Whiplash,86,14.0,California,California,California Other,,,Jamieson Ranch 2011 Whiplash Chardonnay (Calif...,Chardonnay,Jamieson Ranch,
126909,Spain,"). Earth, cola and leather aromas are good, ho...",Finca Resalso,86,15.0,Northern Spain,Ribera del Duero,,Michael Schachner,@wineschach,Emilio Moro 2009 Finca Resalso (Ribera del Du...,Tinto Fino,Emilio Moro,
119752,Spain,). Light and lemony on the nose. The palate ha...,,87,17.0,Galicia,Rías Baixas,,Michael Schachner,@wineschach,La Caña 2010 Albariño (Rías Baixas),Albariño,La Caña,


In [5]:
reviews.rename(columns={'points': 'score'})

id,country,description,designation,score,price,province,region_1,region_2,taster_name,taster_twitter_handle,title,variety,winery,dupe?
94355,Austria,"""Chremisa,"" the ancient name of Krems, is comm...",Edition Chremisa Sandgrube 13,85,24.0,Niederösterreich,,,Roger Voss,@vossroger,Winzer Krems 2011 Edition Chremisa Sandgrube 1...,Grüner Veltliner,Winzer Krems,
126883,US,$10 for this very drinkable Cab? That's crazy....,,87,10.0,California,North Coast,North Coast,Virginie Boone,@vboone,Line 39 2009 Cabernet Sauvignon (North Coast),Cabernet Sauvignon,Line 39,
119493,US,$14 is a pretty good price for a Chardonnay th...,Whiplash,86,14.0,California,California,California Other,,,Jamieson Ranch 2011 Whiplash Chardonnay (Calif...,Chardonnay,Jamieson Ranch,
126909,Spain,"). Earth, cola and leather aromas are good, ho...",Finca Resalso,86,15.0,Northern Spain,Ribera del Duero,,Michael Schachner,@wineschach,Emilio Moro 2009 Finca Resalso (Ribera del Du...,Tinto Fino,Emilio Moro,
119752,Spain,). Light and lemony on the nose. The palate ha...,,87,17.0,Galicia,Rías Baixas,,Michael Schachner,@wineschach,La Caña 2010 Albariño (Rías Baixas),Albariño,La Caña,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
80210,Italy,Zonchera is Ceretto's more affordable base Bar...,Zonchera,90,48.0,Piedmont,Barolo,,,,Ceretto 2004 Zonchera (Barolo),Nebbiolo,Ceretto,
76487,Italy,Zonin's 2006 Amarone opens with very ripe arom...,,88,70.0,Veneto,Amarone della Valpolicella,,,,Zonin 2006 Amarone della Valpolicella,"Corvina, Rondinella, Molinara",Zonin,
86953,Italy,Zorzettig's precious Picolit dessert wine deli...,,90,,Northeastern Italy,Colli Orientali del Friuli,,,,Zorzettig 2006 Picolit (Colli Orientali del Fr...,Picolit,Zorzettig,
18824,US,Zucca has made a fragrant and floral Sangioves...,Sangiovese Rosato,87,18.0,California,Amador County,Sierra Foothills,Virginie Boone,@vboone,Zucca 2010 Sangiovese Rosato Rosé (Amador County),Rosé,Zucca,


`rename()` lets you rename index _or_ column values by specifying an `index` or `columns` keyword parameter, respectively. It supports a variety of input formats, but usually a Python dictionary is the most convenient. Here is an example using it to rename some elements of the index.

In [7]:
reviews.rename(index={0: 'firstEntry', 1: 'secondEntry'})

id,country,description,designation,points,price,province,region_1,region_2,taster_name,taster_twitter_handle,title,variety,winery,dupe?
94355,Austria,"""Chremisa,"" the ancient name of Krems, is comm...",Edition Chremisa Sandgrube 13,85,24.0,Niederösterreich,,,Roger Voss,@vossroger,Winzer Krems 2011 Edition Chremisa Sandgrube 1...,Grüner Veltliner,Winzer Krems,
126883,US,$10 for this very drinkable Cab? That's crazy....,,87,10.0,California,North Coast,North Coast,Virginie Boone,@vboone,Line 39 2009 Cabernet Sauvignon (North Coast),Cabernet Sauvignon,Line 39,
119493,US,$14 is a pretty good price for a Chardonnay th...,Whiplash,86,14.0,California,California,California Other,,,Jamieson Ranch 2011 Whiplash Chardonnay (Calif...,Chardonnay,Jamieson Ranch,
126909,Spain,"). Earth, cola and leather aromas are good, ho...",Finca Resalso,86,15.0,Northern Spain,Ribera del Duero,,Michael Schachner,@wineschach,Emilio Moro 2009 Finca Resalso (Ribera del Du...,Tinto Fino,Emilio Moro,
119752,Spain,). Light and lemony on the nose. The palate ha...,,87,17.0,Galicia,Rías Baixas,,Michael Schachner,@wineschach,La Caña 2010 Albariño (Rías Baixas),Albariño,La Caña,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
80210,Italy,Zonchera is Ceretto's more affordable base Bar...,Zonchera,90,48.0,Piedmont,Barolo,,,,Ceretto 2004 Zonchera (Barolo),Nebbiolo,Ceretto,
76487,Italy,Zonin's 2006 Amarone opens with very ripe arom...,,88,70.0,Veneto,Amarone della Valpolicella,,,,Zonin 2006 Amarone della Valpolicella,"Corvina, Rondinella, Molinara",Zonin,
86953,Italy,Zorzettig's precious Picolit dessert wine deli...,,90,,Northeastern Italy,Colli Orientali del Friuli,,,,Zorzettig 2006 Picolit (Colli Orientali del Fr...,Picolit,Zorzettig,
18824,US,Zucca has made a fragrant and floral Sangioves...,Sangiovese Rosato,87,18.0,California,Amador County,Sierra Foothills,Virginie Boone,@vboone,Zucca 2010 Sangiovese Rosato Rosé (Amador County),Rosé,Zucca,
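
To make the effect of both keywords easier to see, here is a minimal sketch on a tiny made-up DataFrame. The `toy` table and its values are assumptions for illustration only and are not part of the wine reviews dataset. It also shows that `rename()` returns a new DataFrame and leaves the original untouched, and that labels missing from the dictionary are kept as they are.

```python
import pandas as pd

# Hypothetical three-row table standing in for the wine reviews data.
toy = pd.DataFrame(
    {"country": ["Italy", "US", "Spain"], "points": [87, 90, 86]},
    index=[0, 1, 2],
)

# Rename one column and two index labels in a single call.
renamed = toy.rename(
    columns={"points": "score"},
    index={0: "firstEntry", 1: "secondEntry"},
)

print(list(renamed.columns))  # ['country', 'score']
print(list(renamed.index))    # ['firstEntry', 'secondEntry', 2]
print(list(toy.columns))      # ['country', 'points'] (original is unchanged)
```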


You'll probably rename columns very often, but rename index values very rarely. For the latter, `set_index()` is usually more convenient.
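
As a rough illustration of why `set_index()` tends to be handier, the sketch below replaces the whole index with the values of an existing column instead of renaming labels one by one. The `toy` table and the wine names in it are made up for this example.

```python
import pandas as pd

# Hypothetical table; the titles are invented for illustration.
toy = pd.DataFrame({"title": ["Wine A", "Wine B"], "points": [87, 90]})

# Use the 'title' column as the row index instead of 0, 1, ...
by_title = toy.set_index("title")

print(by_title.index.name)               # title
print(by_title.loc["Wine B", "points"])  # 90
```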

Both the row index and the column index can have their own `name` attribute. The complementary `rename_axis()` method may be used to change these names. For example:

In [9]:
reviews.rename_axis("wines", axis='rows').rename_axis("fields", axis='columns')

fields,country,description,designation,points,price,province,region_1,region_2,taster_name,taster_twitter_handle,title,variety,winery,dupe?
wines,,,,,,,,,,,,,,
94355,Austria,"""Chremisa,"" the ancient name of Krems, is comm...",Edition Chremisa Sandgrube 13,85,24.0,Niederösterreich,,,Roger Voss,@vossroger,Winzer Krems 2011 Edition Chremisa Sandgrube 1...,Grüner Veltliner,Winzer Krems,
126883,US,$10 for this very drinkable Cab? That's crazy....,,87,10.0,California,North Coast,North Coast,Virginie Boone,@vboone,Line 39 2009 Cabernet Sauvignon (North Coast),Cabernet Sauvignon,Line 39,
119493,US,$14 is a pretty good price for a Chardonnay th...,Whiplash,86,14.0,California,California,California Other,,,Jamieson Ranch 2011 Whiplash Chardonnay (Calif...,Chardonnay,Jamieson Ranch,
126909,Spain,"). Earth, cola and leather aromas are good, ho...",Finca Resalso,86,15.0,Northern Spain,Ribera del Duero,,Michael Schachner,@wineschach,Emilio Moro 2009 Finca Resalso (Ribera del Du...,Tinto Fino,Emilio Moro,
119752,Spain,). Light and lemony on the nose. The palate ha...,,87,17.0,Galicia,Rías Baixas,,Michael Schachner,@wineschach,La Caña 2010 Albariño (Rías Baixas),Albariño,La Caña,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
80210,Italy,Zonchera is Ceretto's more affordable base Bar...,Zonchera,90,48.0,Piedmont,Barolo,,,,Ceretto 2004 Zonchera (Barolo),Nebbiolo,Ceretto,
76487,Italy,Zonin's 2006 Amarone opens with very ripe arom...,,88,70.0,Veneto,Amarone della Valpolicella,,,,Zonin 2006 Amarone della Valpolicella,"Corvina, Rondinella, Molinara",Zonin,
86953,Italy,Zorzettig's precious Picolit dessert wine deli...,,90,,Northeastern Italy,Colli Orientali del Friuli,,,,Zorzettig 2006 Picolit (Colli Orientali del Fr...,Picolit,Zorzettig,
18824,US,Zucca has made a fragrant and floral Sangioves...,Sangiovese Rosato,87,18.0,California,Amador County,Sierra Foothills,Virginie Boone,@vboone,Zucca 2010 Sangiovese Rosato Rosé (Amador County),Rosé,Zucca,
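
The same call can be checked on a small made-up table, where the axis names are easy to inspect directly. This is only a sketch; the `toy` DataFrame is an assumption and not part of the dataset above.

```python
import pandas as pd

# Hypothetical two-row table used only to inspect the axis names.
toy = pd.DataFrame({"country": ["Italy", "US"], "points": [87, 90]})

named = toy.rename_axis("wines", axis="rows").rename_axis("fields", axis="columns")

print(named.index.name)     # wines
print(named.columns.name)   # fields
print(list(named.columns))  # ['country', 'points'] (labels themselves are unchanged)
```

Note that `rename_axis()` changes only the name of each axis; the row and column labels stay the same, which is what distinguishes it from `rename()`.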


# Combining

When performing operations on a dataset, we will sometimes need to combine different DataFrames and/or Series in non-trivial ways. Pandas has three core methods for doing this. In order of increasing complexity, these are `concat()`, `join()`, and `merge()`. Most of what `merge()` can do can also be done more simply with `join()`, so we will omit it and focus on the first two functions here.

The simplest combining method is `concat()`. Given a list of elements, this function will smush those elements together along an axis.

This is useful when we have data in different DataFrame or Series objects that share the same fields (columns). One example: the [YouTube Videos dataset](https://www.kaggle.com/datasnaek/youtube-new), which splits the data up based on country of origin (e.g. Canada and the UK, in this example). If we want to study multiple countries simultaneously, we can use `concat()` to smush them together:

In [5]:
canadian_youtube = pd.read_csv("../input/youtube-new/CAvideos.csv")
british_youtube = pd.read_csv("../input/youtube-new/GBvideos.csv")

df = pd.concat([canadian_youtube, british_youtube])
df

,video_id,trending_date,title,channel_title,category_id,publish_time,tags,views,likes,dislikes,comment_count,thumbnail_link,comments_disabled,ratings_disabled,video_error_or_removed,description
0,n1WpP7iowLc,17.14.11,Eminem - Walk On Water (Audio) ft. Beyoncé,EminemVEVO,10,2017-11-10T17:00:03.000Z,"Eminem|""Walk""|""On""|""Water""|""Aftermath/Shady/In...",17158579,787425,43420,125882,https://i.ytimg.com/vi/n1WpP7iowLc/default.jpg,False,False,False,Eminem's new track Walk on Water ft. Beyoncé i...
1,0dBIkQ4Mz1M,17.14.11,PLUSH - Bad Unboxing Fan Mail,iDubbbzTV,23,2017-11-13T17:00:00.000Z,"plush|""bad unboxing""|""unboxing""|""fan mail""|""id...",1014651,127794,1688,13030,https://i.ytimg.com/vi/0dBIkQ4Mz1M/default.jpg,False,False,False,STill got a lot of packages. Probably will las...
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
38914,-DRsfNObKIQ,18.14.06,Eleni Foureira - Fuego - Cyprus - LIVE - First...,Eurovision Song Contest,24,2018-05-08T20:32:32.000Z,"Eurovision Song Contest|""2018""|""Lisbon""|""Cypru...",14317515,151870,45875,26766,https://i.ytimg.com/vi/-DRsfNObKIQ/default.jpg,False,False,False,Eleni Foureira represented Cyprus at the first...
38915,4YFo4bdMO8Q,18.14.06,KYLE - Ikuyo feat. 2 Chainz & Sophia Black [A...,SuperDuperKyle,10,2018-05-11T04:06:35.000Z,"Kyle|""SuperDuperKyle""|""Ikuyo""|""2 Chainz""|""Soph...",607552,18271,274,1423,https://i.ytimg.com/vi/4YFo4bdMO8Q/default.jpg,False,False,False,Debut album 'Light of Mine' out now: http://ky...
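
For readers without the YouTube files at hand, the behavior of `concat()` can be sketched on two tiny made-up tables. The `canada` and `britain` DataFrames and their values are assumptions for illustration. By default the rows are stacked and each table keeps its own index labels, so labels can repeat; passing `ignore_index=True` is one way to get a fresh 0..n-1 index instead.

```python
import pandas as pd

# Two hypothetical tables with the same columns.
canada = pd.DataFrame({"video_id": ["a1", "b2"], "views": [100, 250]})
britain = pd.DataFrame({"video_id": ["c3", "d4"], "views": [75, 300]})

# Default: rows are stacked and the original index labels are kept.
stacked = pd.concat([canada, britain])
print(list(stacked.index))     # [0, 1, 0, 1]

# ignore_index=True discards the old labels and renumbers the rows.
renumbered = pd.concat([canada, britain], ignore_index=True)
print(list(renumbered.index))  # [0, 1, 2, 3]
print(renumbered.shape)        # (4, 2)
```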
