In [1]:
import pandas as pd

## How do I use the "axis" parameter in pandas?

In [2]:
# read a dataset of alcohol consumption into a DataFrame

drinks = pd.read_csv('http://bit.ly/drinksbycountry')
drinks.head()

Unnamed: 0,country,beer_servings,spirit_servings,wine_servings,total_litres_of_pure_alcohol,continent
0,Afghanistan,0,0,0,0.0,Asia
1,Albania,89,132,54,4.9,Europe
2,Algeria,25,0,14,0.7,Africa
3,Andorra,245,138,312,12.4,Europe
4,Angola,217,57,45,5.9,Africa


In [3]:
# drop a column (temporarily)

drinks.drop('continent', axis=1).head()

Unnamed: 0,country,beer_servings,spirit_servings,wine_servings,total_litres_of_pure_alcohol
0,Afghanistan,0,0,0,0.0
1,Albania,89,132,54,4.9
2,Algeria,25,0,14,0.7
3,Andorra,245,138,312,12.4
4,Angola,217,57,45,5.9


In [4]:
# drop a row (temporarily)

drinks.drop(2, axis=0).head()

Unnamed: 0,country,beer_servings,spirit_servings,wine_servings,total_litres_of_pure_alcohol,continent
0,Afghanistan,0,0,0,0.0,Asia
1,Albania,89,132,54,4.9,Europe
3,Andorra,245,138,312,12.4,Europe
4,Angola,217,57,45,5.9,Africa
5,Antigua & Barbuda,102,128,45,4.9,North America


When **referring to rows or columns** with the axis parameter:

- **axis 0** refers to rows
- **axis 1** refers to columns
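
As a minimal sketch of the rule above, the following uses a small hypothetical frame (`df`, made up for illustration and not the real drinks data) so that it runs without a network connection. It also shows the `columns=` and `index=` keywords of `drop`, which should give the same result as passing `axis`.

```python
import pandas as pd

# Hypothetical three-row frame standing in for the drinks dataset
df = pd.DataFrame({
    'country': ['Albania', 'Algeria', 'Andorra'],
    'beer_servings': [89, 25, 245],
    'continent': ['Europe', 'Africa', 'Europe'],
})

# axis=1 refers to columns: drop the column labelled 'continent'
no_continent = df.drop('continent', axis=1)

# axis=0 refers to rows: drop the row whose index label is 1
no_algeria = df.drop(1, axis=0)

# The columns= and index= keywords say the same thing without axis
same_columns = df.drop(columns='continent')
same_rows = df.drop(index=1)

print(list(no_continent.columns))
print(list(no_algeria.index))

# drop returns a new DataFrame, so df itself is unchanged
print(df.shape)
```

Because `drop` returns a new object by default, the original frame keeps all of its rows and columns unless the result is assigned back.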

In [5]:
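# calculate the mean of each numeric column
# (numeric_only=True skips the text columns; pandas 2.0+ raises a TypeError without it)

drinks.mean(numeric_only=True)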

beer_servings                   106.160622
spirit_servings                  80.994819
wine_servings                    49.450777
total_litres_of_pure_alcohol      4.717098
dtype: float64

In [6]:
# or equivalently, specify the axis explicitly

drinks.mean(axis=0, numeric_only=True)

beer_servings                   106.160622
spirit_servings                  80.994819
wine_servings                    49.450777
total_litres_of_pure_alcohol      4.717098
dtype: float64

In [7]:
# calculate the mean of each row

drinks.mean(axis=1, numeric_only=True).head()

0      0.000
1     69.975
2      9.925
3    176.850
4     81.225
dtype: float64

When performing a **mathematical operation** with the axis parameter:

- **axis 0** means the operation should "move down" the row axis, producing one result per column
- **axis 1** means the operation should "move across" the column axis, producing one result per row
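
The direction of travel can be checked by hand on a tiny frame. This is only an illustrative sketch: `scores` and its values are invented for the example and are not part of the drinks dataset.

```python
import pandas as pd

# Hypothetical all-numeric frame: 3 rows, 2 columns
scores = pd.DataFrame({'a': [1, 2, 3], 'b': [10, 20, 30]})

# axis=0: move down the rows, so each column collapses to one value
col_means = scores.mean(axis=0)

# axis=1: move across the columns, so each row collapses to one value
row_means = scores.mean(axis=1)

print(col_means)  # indexed by column label: 'a', 'b'
print(row_means)  # indexed by row label: 0, 1, 2
```

Note that the result of `axis=0` is indexed by the column labels, while the result of `axis=1` is indexed by the row labels.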

In [8]:
# 'index' is an alias for axis 0

drinks.mean(axis='index', numeric_only=True)

beer_servings                   106.160622
spirit_servings                  80.994819
wine_servings                    49.450777
total_litres_of_pure_alcohol      4.717098
dtype: float64

In [9]:
# 'columns' is an alias for axis 1

drinks.mean(axis='columns', numeric_only=True).head()

0      0.000
1     69.975
2      9.925
3    176.850
4     81.225
dtype: float64
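
The string aliases should work with other reductions as well. The sketch below uses `sum` on a hypothetical two-row frame (`servings`, with values borrowed from the first rows shown earlier) and compares each alias against its integer form.

```python
import pandas as pd

# Hypothetical frame with country names as the row index
servings = pd.DataFrame(
    {'beer': [89, 25], 'wine': [54, 14]},
    index=['Albania', 'Algeria'],
)

# 'index' behaves like axis 0: one total per column
per_column = servings.sum(axis='index')

# 'columns' behaves like axis 1: one total per row
per_row = servings.sum(axis='columns')

# Each alias gives the same result as the integer it stands for
print(per_column.equals(servings.sum(axis=0)))
print(per_row.equals(servings.sum(axis=1)))
```

Whether to use the integers or the aliases is a matter of taste; the aliases can make it easier to see which axis is being collapsed.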