# *Column* Operations  


- [Selecting Columns](#Selecting-Columns)
- [Creating new Dataframes based on Selected Columns from an Existing Dataframe](#Creating-new-Dataframes-based-on-Selected-Columns-from-an-Existing-Dataframe)
- [Dropping Columns](#Dropping-Columns)
- [Keeping Columns](#Keeping-Columns)
- [Renaming Columns](#Renaming-Columns)
- [Reordering Columns](#Reordering-Columns)

In [1]:
import pandas as pd

In [2]:
# Read the CSV file and display top rows of the dataframe
df = pd.read_csv('https://raw.githubusercontent.com/justmarkham/pandas-videos/master/data/drinks.csv')
df.head()

|   | country | beer_servings | spirit_servings | wine_servings | total_litres_of_pure_alcohol | continent |
|---|---|---|---|---|---|---|
| 0 | Afghanistan | 0 | 0 | 0 | 0.0 | Asia |
| 1 | Albania | 89 | 132 | 54 | 4.9 | Europe |
| 2 | Algeria | 25 | 0 | 14 | 0.7 | Africa |
| 3 | Andorra | 245 | 138 | 312 | 12.4 | Europe |
| 4 | Angola | 217 | 57 | 45 | 5.9 | Africa |


# Selecting Columns

### Selecting a Single Column

In [3]:
# Select the country column and display the first ten rows
df['country'].head(10)

0          Afghanistan
1              Albania
2              Algeria
3              Andorra
4               Angola
5    Antigua & Barbuda
6            Argentina
7              Armenia
8            Australia
9              Austria
Name: country, dtype: object
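
The output above is a pandas Series rather than a dataframe. The sketch below contrasts the two bracket forms; `sample` is a small stand-in dataframe made up for illustration (its values are copied from the first rows shown earlier), so the snippet runs without downloading the CSV.

```python
import pandas as pd

# Hypothetical stand-in for the drinks data, so the example is self-contained
sample = pd.DataFrame({
    'country': ['Afghanistan', 'Albania', 'Algeria'],
    'beer_servings': [0, 89, 25],
    'continent': ['Asia', 'Europe', 'Africa'],
})

# A single column name in brackets returns a Series
as_series = sample['country']

# A list containing one column name returns a one-column dataframe
as_frame = sample[['country']]

print(type(as_series).__name__)
print(type(as_frame).__name__)
print(as_frame.shape)
```

The list form is handy when later code expects a dataframe (for example, something that reads `.columns`).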

### Selecting Multiple Columns

In [4]:
# Create a List with the selected column names
selected_columns = ['country', 'continent']

# Display just those columns
df[selected_columns].head()

|   | country | continent |
|---|---|---|
| 0 | Afghanistan | Asia |
| 1 | Albania | Europe |
| 2 | Algeria | Africa |
| 3 | Andorra | Europe |
| 4 | Angola | Africa |


# Creating new Dataframes based on Selected Columns from an Existing Dataframe

In [5]:
# Create a new dataframe named df_continents that has the data from the country and continent columns 
selected_columns = ['country', 'continent']
df_continents = df[selected_columns]

# Display top rows
df_continents.head()

|   | country | continent |
|---|---|---|
| 0 | Afghanistan | Asia |
| 1 | Albania | Europe |
| 2 | Algeria | Africa |
| 3 | Andorra | Europe |
| 4 | Angola | Africa |
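
If the new dataframe will be modified later, it can help to make it an explicit copy. The following is a minimal sketch, not part of the original notebook: `sample` and `sample_continents` are made-up names, and the exact warnings pandas emits when a plain selection is modified vary by pandas version, so `.copy()` is used here simply to make the independence explicit.

```python
import pandas as pd

# Hypothetical stand-in for the drinks data
sample = pd.DataFrame({
    'country': ['Afghanistan', 'Albania', 'Algeria'],
    'beer_servings': [0, 89, 25],
    'continent': ['Asia', 'Europe', 'Africa'],
})

# .copy() gives the new dataframe its own data, independent of sample
sample_continents = sample[['country', 'continent']].copy()

# Changing the copy leaves the original dataframe untouched
sample_continents['continent'] = sample_continents['continent'].str.upper()

print(sample_continents['continent'].tolist())
print(sample['continent'].tolist())
```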


# Dropping Columns  
- If we only want to remove a few columns from the dataframe, dropping them by name is the most direct approach.

In [6]:
df.head(2)

|   | country | beer_servings | spirit_servings | wine_servings | total_litres_of_pure_alcohol | continent |
|---|---|---|---|---|---|---|
| 0 | Afghanistan | 0 | 0 | 0 | 0.0 | Asia |
| 1 | Albania | 89 | 132 | 54 | 4.9 | Europe |


#### Create new Dataframe by Dropping selected columns from original dataframe  

In [7]:
# Drop the beer_servings and wine_servings columns from df
# Note: inplace=False (the default) returns a new dataframe and leaves df unchanged
df_no_beer_wine = df.drop(['beer_servings', 'wine_servings'], axis='columns', inplace=False)

df_no_beer_wine.head()

|   | country | spirit_servings | total_litres_of_pure_alcohol | continent |
|---|---|---|---|---|
| 0 | Afghanistan | 0 | 0.0 | Asia |
| 1 | Albania | 132 | 4.9 | Europe |
| 2 | Algeria | 0 | 0.7 | Africa |
| 3 | Andorra | 138 | 12.4 | Europe |
| 4 | Angola | 57 | 5.9 | Africa |
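
As a side note, `drop` also accepts a `columns=` keyword, which avoids having to pass `axis`, and an `errors='ignore'` option that skips names not present in the dataframe. The sketch below uses a made-up `sample` dataframe and a deliberately nonexistent column name (`not_a_column`) for illustration.

```python
import pandas as pd

# Hypothetical stand-in for the drinks data
sample = pd.DataFrame({
    'country': ['Afghanistan', 'Albania'],
    'beer_servings': [0, 89],
    'spirit_servings': [0, 132],
    'wine_servings': [0, 54],
})

# These two calls are equivalent ways to drop columns by name
dropped_axis = sample.drop(['beer_servings', 'wine_servings'], axis='columns')
dropped_kw = sample.drop(columns=['beer_servings', 'wine_servings'])

# errors='ignore' skips labels that do not exist instead of raising KeyError
dropped_lenient = sample.drop(columns=['beer_servings', 'not_a_column'],
                              errors='ignore')

print(list(dropped_kw.columns))
print(list(dropped_lenient.columns))
```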


# Keeping Columns  
- When a dataframe has many columns, it may be easier to keep the ones we want than to drop the ones we don't.
- Note: listing the columns to keep also lets us reorder them in the same step.

In [8]:
df.head(2)

|   | country | beer_servings | spirit_servings | wine_servings | total_litres_of_pure_alcohol | continent |
|---|---|---|---|---|---|---|
| 0 | Afghanistan | 0 | 0 | 0 | 0.0 | Asia |
| 1 | Albania | 89 | 132 | 54 | 4.9 | Europe |


#### Create new Dataframe by Keeping selected columns from original dataframe  

In [9]:
# Create new df_beer: keep the 'continent', 'country' and 'beer_servings' columns
columns_to_keep = ['continent', 'country', 'beer_servings']
df_beer = df[columns_to_keep]

df_beer.head()

|   | continent | country | beer_servings |
|---|---|---|---|
| 0 | Asia | Afghanistan | 0 |
| 1 | Europe | Albania | 89 |
| 2 | Africa | Algeria | 25 |
| 3 | Europe | Andorra | 245 |
| 4 | Africa | Angola | 217 |
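
When the columns to keep share a naming pattern, `DataFrame.filter` may save typing out the full list. This is an optional alternative to the list-based selection used above, sketched on a made-up `sample` dataframe.

```python
import pandas as pd

# Hypothetical stand-in for the drinks data
sample = pd.DataFrame({
    'country': ['Afghanistan', 'Albania'],
    'beer_servings': [0, 89],
    'spirit_servings': [0, 132],
    'wine_servings': [0, 54],
    'continent': ['Asia', 'Europe'],
})

# Keep every column whose name contains the substring 'servings'
servings_only = sample.filter(like='servings')

# Keep every column whose name matches a regular expression
# (here: names that start with the letter 'c')
starts_with_c = sample.filter(regex='^c')

print(list(servings_only.columns))
print(list(starts_with_c.columns))
```

Unlike an explicit list, pattern-based filtering keeps the columns in their existing order, so it does not reorder them.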


# Renaming Columns  
- 'beer_servings' to 'Beer'
- 'spirit_servings' to 'Spirits'
- 'wine_servings' to 'Wine'

In [10]:
# Create the new_columns Dictionary with the renamings in it
new_columns = {
               'beer_servings':'Beer', 
               'spirit_servings' : 'Spirits',
               'wine_servings' : 'Wine'
              }

# Rename the columns in place: inplace=True modifies df itself and returns None
df.rename(columns = new_columns, inplace=True)
# Alternative: leave df unchanged and get a renamed copy instead
#df_renamed = df.rename(columns = new_columns, inplace=False)

# Display top rows
df.head()

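|   | country | Beer | Spirits | Wine | total_litres_of_pure_alcohol | continent |
|---|---|---|---|---|---|---|
| 0 | Afghanistan | 0 | 0 | 0 | 0.0 | Asia |
| 1 | Albania | 89 | 132 | 54 | 4.9 | Europe |
| 2 | Algeria | 25 | 0 | 14 | 0.7 | Africa |
| 3 | Andorra | 245 | 138 | 312 | 12.4 | Europe |
| 4 | Angola | 217 | 57 | 45 | 5.9 | Africa |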
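
One thing to watch with `rename`: by default, a dictionary key that matches no column is silently ignored, so a typo can go unnoticed. The sketch below shows the non-inplace form and the `errors='raise'` option on a made-up `sample` dataframe; the misspelled key `beer_serving` is deliberate.

```python
import pandas as pd

# Hypothetical stand-in for the drinks data
sample = pd.DataFrame({
    'country': ['Afghanistan', 'Albania'],
    'beer_servings': [0, 89],
    'wine_servings': [0, 54],
})

new_names = {'beer_servings': 'Beer', 'wine_servings': 'Wine'}

# Without inplace=True, rename returns a new dataframe; sample keeps its names
renamed = sample.rename(columns=new_names)

# errors='raise' turns an unmatched key into a KeyError, which helps catch typos
try:
    sample.rename(columns={'beer_serving': 'Beer'}, errors='raise')
    typo_detected = False
except KeyError:
    typo_detected = True

print(list(renamed.columns))
print(typo_detected)
```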


# Reordering Columns

In [11]:
df.columns

Index(['country', 'Beer', 'Spirits', 'Wine', 'total_litres_of_pure_alcohol',
       'continent'],
      dtype='object')

In [12]:
# Create a List containing the Reordered columns
# Note: list every column you want to keep, spelled exactly as in df.columns;
# a column left out of the list is dropped, and a misspelled name raises a KeyError
new_cols = ['country', 'continent', 'total_litres_of_pure_alcohol', 'Beer', 'Wine', 'Spirits']
new_cols

['country',
 'continent',
 'total_litres_of_pure_alcohol',
 'Beer',
 'Wine',
 'Spirits']

In [13]:
df.head(2)

|   | country | Beer | Spirits | Wine | total_litres_of_pure_alcohol | continent |
|---|---|---|---|---|---|---|
| 0 | Afghanistan | 0 | 0 | 0 | 0.0 | Asia |
| 1 | Albania | 89 | 132 | 54 | 4.9 | Europe |


In [14]:
# Create a new df_reordered dataframe based on the reordered columns of the df dataframe
df_reordered = df[new_cols]
df_reordered.head()

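|   | country | continent | total_litres_of_pure_alcohol | Beer | Wine | Spirits |
|---|---|---|---|---|---|---|
| 0 | Afghanistan | Asia | 0.0 | 0 | 0 | 0 |
| 1 | Albania | Europe | 4.9 | 89 | 54 | 132 |
| 2 | Algeria | Africa | 0.7 | 25 | 14 | 0 |
| 3 | Andorra | Europe | 12.4 | 245 | 312 | 138 |
| 4 | Angola | Africa | 5.9 | 217 | 45 | 57 |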
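
Typing out every column name gets tedious on wide dataframes. One possible shortcut, sketched here on a made-up `sample` dataframe, is to name only the columns that should come first and build the rest of the list from `columns`; the variable names `front` and `rest` are illustrative.

```python
import pandas as pd

# Hypothetical stand-in for the renamed drinks data
sample = pd.DataFrame({
    'country': ['Afghanistan', 'Albania'],
    'Beer': [0, 89],
    'Wine': [0, 54],
    'continent': ['Asia', 'Europe'],
})

# Columns to move to the front, in the order we want them
front = ['country', 'continent']

# Every remaining column, kept in its existing order
rest = [col for col in sample.columns if col not in front]

# Selecting with the combined list returns a reordered dataframe
reordered = sample[front + rest]

print(list(reordered.columns))
```

Because `rest` is computed from the dataframe itself, no column is dropped by accident.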
