# Renaming And Combining
* Data often comes to us with column names, index names, or other naming conventions that we are not satisfied with. In that case, we can use pandas functions to change the names of the offending entries to something better.



In [2]:
# Import pandas and load the data we are going to work on.
# A forward slash is used in the path so that "\w" is not read as an escape sequence.
import pandas as pd
reviews = pd.read_csv('winemag_data/winemag-data-130k-v2.csv')

## Renaming
* To rename columns or index labels in a DataFrame, we can use the `rename()` method. It takes a dictionary mapping old names to new names for `columns`, for `index`, or for both.

In [3]:
reviews.rename(columns={'points' : 'scores'}, index={0:'firstEntry'})

,Unnamed: 0,country,description,designation,scores,price,province,region_1,region_2,taster_name,taster_twitter_handle,title,variety,winery
firstEntry,0,Italy,"Aromas include tropical fruit, broom, brimston...",Vulkà Bianco,87,,Sicily & Sardinia,Etna,,Kerin O’Keefe,@kerinokeefe,Nicosia 2013 Vulkà Bianco (Etna),White Blend,Nicosia
1,1,Portugal,"This is ripe and fruity, a wine that is smooth...",Avidagos,87,15.0,Douro,,,Roger Voss,@vossroger,Quinta dos Avidagos 2011 Avidagos Red (Douro),Portuguese Red,Quinta dos Avidagos
2,2,US,"Tart and snappy, the flavors of lime flesh and...",,87,14.0,Oregon,Willamette Valley,Willamette Valley,Paul Gregutt,@paulgwine,Rainstorm 2013 Pinot Gris (Willamette Valley),Pinot Gris,Rainstorm
3,3,US,"Pineapple rind, lemon pith and orange blossom ...",Reserve Late Harvest,87,13.0,Michigan,Lake Michigan Shore,,Alexander Peartree,,St. Julian 2013 Reserve Late Harvest Riesling ...,Riesling,St. Julian
4,4,US,"Much like the regular bottling from 2012, this...",Vintner's Reserve Wild Child Block,87,65.0,Oregon,Willamette Valley,Willamette Valley,Paul Gregutt,@paulgwine,Sweet Cheeks 2012 Vintner's Reserve Wild Child...,Pinot Noir,Sweet Cheeks
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
129966,129966,Germany,Notes of honeysuckle and cantaloupe sweeten th...,Brauneberger Juffer-Sonnenuhr Spätlese,90,28.0,Mosel,,,Anna Lee C. Iijima,,Dr. H. Thanisch (Erben Müller-Burggraef) 2013 ...,Riesling,Dr. H. Thanisch (Erben Müller-Burggraef)
129967,129967,US,Citation is given as much as a decade of bottl...,,90,75.0,Oregon,Oregon,Oregon Other,Paul Gregutt,@paulgwine,Citation 2004 Pinot Noir (Oregon),Pinot Noir,Citation
129968,129968,France,Well-drained gravel soil gives this wine its c...,Kritt,90,30.0,Alsace,Alsace,,Roger Voss,@vossroger,Domaine Gresser 2013 Kritt Gewurztraminer (Als...,Gewürztraminer,Domaine Gresser
129969,129969,France,"A dry style of Pinot Gris, this is crisp with ...",,90,32.0,Alsace,Alsace,,Roger Voss,@vossroger,Domaine Marcel Deiss 2012 Pinot Gris (Alsace),Pinot Gris,Domaine Marcel Deiss
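
A minimal sketch of the same `rename()` call on a small stand-in table, so the effect is easy to inspect. The `toy` DataFrame and its values are hypothetical and are not taken from the wine dataset. It also shows that `rename()` returns a new DataFrame and leaves the original untouched unless the result is assigned.

```python
import pandas as pd

# Hypothetical miniature stand-in for the reviews table.
toy = pd.DataFrame({'country': ['Italy', 'Portugal'], 'points': [87, 87]})

# Rename one column and one index label; the result is a new DataFrame.
renamed = toy.rename(columns={'points': 'scores'}, index={0: 'firstEntry'})

# rename() also accepts a function that is applied to every label.
upper = toy.rename(columns=str.upper)

print(list(renamed.columns))  # column labels after the rename
print(list(renamed.index))    # index labels after the rename
print(list(toy.columns))      # the original is unchanged
print(list(upper.columns))
```

To keep the new names, assign the result back, for example `toy = toy.rename(...)`.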


* Sometimes we want to give a name to the column axis as a whole, or to the row axis as a whole, indicating what each one represents.
* e.g. fields for columns.
* e.g. records for rows.
* What we are actually doing is renaming the axis labels of the DataFrame.
* To change them we call `rename_axis()`, passing the name we want and choosing which axis to change with the **axis** parameter.


In [4]:
reviews.rename_axis("wines", axis='rows').rename_axis("fields", axis='columns')

fields,Unnamed: 0,country,description,designation,points,price,province,region_1,region_2,taster_name,taster_twitter_handle,title,variety,winery
wines,,,,,,,,,,,,,,
0,0,Italy,"Aromas include tropical fruit, broom, brimston...",Vulkà Bianco,87,,Sicily & Sardinia,Etna,,Kerin O’Keefe,@kerinokeefe,Nicosia 2013 Vulkà Bianco (Etna),White Blend,Nicosia
1,1,Portugal,"This is ripe and fruity, a wine that is smooth...",Avidagos,87,15.0,Douro,,,Roger Voss,@vossroger,Quinta dos Avidagos 2011 Avidagos Red (Douro),Portuguese Red,Quinta dos Avidagos
2,2,US,"Tart and snappy, the flavors of lime flesh and...",,87,14.0,Oregon,Willamette Valley,Willamette Valley,Paul Gregutt,@paulgwine,Rainstorm 2013 Pinot Gris (Willamette Valley),Pinot Gris,Rainstorm
3,3,US,"Pineapple rind, lemon pith and orange blossom ...",Reserve Late Harvest,87,13.0,Michigan,Lake Michigan Shore,,Alexander Peartree,,St. Julian 2013 Reserve Late Harvest Riesling ...,Riesling,St. Julian
4,4,US,"Much like the regular bottling from 2012, this...",Vintner's Reserve Wild Child Block,87,65.0,Oregon,Willamette Valley,Willamette Valley,Paul Gregutt,@paulgwine,Sweet Cheeks 2012 Vintner's Reserve Wild Child...,Pinot Noir,Sweet Cheeks
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
129966,129966,Germany,Notes of honeysuckle and cantaloupe sweeten th...,Brauneberger Juffer-Sonnenuhr Spätlese,90,28.0,Mosel,,,Anna Lee C. Iijima,,Dr. H. Thanisch (Erben Müller-Burggraef) 2013 ...,Riesling,Dr. H. Thanisch (Erben Müller-Burggraef)
129967,129967,US,Citation is given as much as a decade of bottl...,,90,75.0,Oregon,Oregon,Oregon Other,Paul Gregutt,@paulgwine,Citation 2004 Pinot Noir (Oregon),Pinot Noir,Citation
129968,129968,France,Well-drained gravel soil gives this wine its c...,Kritt,90,30.0,Alsace,Alsace,,Roger Voss,@vossroger,Domaine Gresser 2013 Kritt Gewurztraminer (Als...,Gewürztraminer,Domaine Gresser
129969,129969,France,"A dry style of Pinot Gris, this is crisp with ...",,90,32.0,Alsace,Alsace,,Roger Voss,@vossroger,Domaine Marcel Deiss 2012 Pinot Gris (Alsace),Pinot Gris,Domaine Marcel Deiss
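
The axis names set by `rename_axis()` can be checked directly, as in the following sketch. The `toy` DataFrame is a hypothetical stand-in, and `axis='index'` is used here as the documented spelling of the row axis. Like `rename()`, the call returns a new DataFrame, so the original keeps its unnamed axes.

```python
import pandas as pd

# Hypothetical miniature stand-in for the reviews table.
toy = pd.DataFrame({'country': ['Italy', 'US'], 'points': [87, 90]})

# Name the row axis "wines" and the column axis "fields".
labelled = toy.rename_axis('wines', axis='index').rename_axis('fields', axis='columns')

print(labelled.index.name)    # name of the row axis
print(labelled.columns.name)  # name of the column axis
print(toy.index.name)         # the original still has no axis name
```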


## Combining
* When performing operations on a dataset, we will sometimes need to combine different DataFrames and/or Series in non-trivial ways.
* Pandas has three core methods for doing this. In order of increasing complexity, these are **concat()**, **join()**, and **merge()**.
* Most of what merge() can do can also be done more simply with join(), so we will omit it and focus on the first two functions here.
* The simplest combining method is concat(). Given a list of elements, this function will smush those elements together along an axis.
* **This is useful when we have data in different DataFrame or Series objects that share the same fields (columns).**

In [5]:
firstData = pd.DataFrame({'first element' : [1,2,3], 'secondelement': [4,5,6]})
firstData

,first element,secondelement
0,1,4
1,2,5
2,3,6


In [9]:
secondData = pd.DataFrame({'first element' : [7,8,9], 'secondelement': [14,51,16]})
secondData

,first element,secondelement
0,7,14
1,8,51
2,9,16


In [10]:
pd.concat([firstData, secondData])

,first element,secondelement
0,1,4
1,2,5
2,3,6
0,7,14
1,8,51
2,9,16
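
Note that the concatenated result keeps each input's original index, so the labels 0, 1 and 2 each appear twice. A minimal sketch of one way to handle this, assuming a fresh running index is wanted, is to pass `ignore_index=True`. The variable names `first`, `second`, `stacked` and `renumbered` are illustrative only.

```python
import pandas as pd

first = pd.DataFrame({'first element': [1, 2, 3], 'secondelement': [4, 5, 6]})
second = pd.DataFrame({'first element': [7, 8, 9], 'secondelement': [14, 51, 16]})

# Default behaviour: rows are stacked and the original index labels are kept.
stacked = pd.concat([first, second])

# ignore_index=True discards the old labels and numbers the rows from 0.
renumbered = pd.concat([first, second], ignore_index=True)

print(list(stacked.index))
print(list(renumbered.index))
print(len(stacked.loc[0]))  # label 0 selects one row from each input
```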


### Join
* The middlemost combiner in terms of complexity is join(). join() lets you combine different DataFrame objects which have an index in common.

In [12]:
thirdData = pd.DataFrame({'first element' : [7,8,9], 'secondelement': [14,51,16]})
thirdData

,first element,secondelement
0,7,14
1,8,51
2,9,16


In [13]:
fourth = pd.DataFrame({'third element' : [7,8,9], 'secondelement': [14,51,16]})
fourth

,third element,secondelement
0,7,14
1,8,51
2,9,16


In [14]:
fourth.join(thirdData, lsuffix='Left', rsuffix='Right')

,third element,secondelementLeft,first element,secondelementRight
0,7,14,7,14
1,8,51,8,51
2,9,16,9,16
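
The `lsuffix` and `rsuffix` arguments are needed in the cell above because both DataFrames contain a column called `secondelement`. The sketch below, built on hypothetical frames with string index labels, illustrates two further points: `join()` matches rows by index label, and without suffixes an overlapping column name raises a `ValueError`.

```python
import pandas as pd

# Hypothetical frames whose indexes only partly overlap.
left = pd.DataFrame({'third element': [7, 8, 9]}, index=['a', 'b', 'c'])
right = pd.DataFrame({'first element': [1, 2]}, index=['b', 'c'])

# Default (left) join: every row of `left` is kept, missing matches become NaN.
left_join = left.join(right)

# Inner join: only index labels present in both frames are kept.
inner = left.join(right, how='inner')

print(left_join)
print(inner)

# Overlapping column names without suffixes are rejected.
clash = pd.DataFrame({'third element': [0, 0]}, index=['b', 'c'])
try:
    left.join(clash)
    overlap_error = False
except ValueError as err:
    overlap_error = True
    print('ValueError:', err)
```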


In [15]:
reviews.set_index('wines')

KeyError: "None of ['wines'] are in the columns"
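
A likely reading of this error: `set_index()` expects the label of an existing column, while `wines` was only ever an axis name, and the earlier `rename_axis()` result was never assigned back to `reviews`. The sketch below reproduces the situation on a hypothetical `toy` DataFrame and shows two calls that do work, under the assumption that the goal is either to index by a real column or to index by a column called `wines`.

```python
import pandas as pd

# Hypothetical miniature stand-in for the reviews table.
toy = pd.DataFrame({'title': ['Wine A', 'Wine B'], 'points': [87, 90]})

# The result is discarded here, so `toy` itself is unchanged.
toy.rename_axis('wines', axis='index')

# 'wines' is not a column, so set_index() raises KeyError.
try:
    toy.set_index('wines')
    raised = False
except KeyError as err:
    raised = True
    print('KeyError:', err)

# Using an existing column as the index works.
by_title = toy.set_index('title')

# Turning the named index into a real column first also works.
roundtrip = toy.rename_axis('wines', axis='index').reset_index().set_index('wines')

print(by_title.index.name)
print(roundtrip.index.name)
```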