When you run operations on a pandas DataFrame, such as subtraction with sub() or descriptive-statistics aggregations like sum() and mean(), you use the "axis" argument to choose the direction in which the operation is applied.

"Axis" can be specified by name or integer:

Series: no axis argument needed.

DataFrame: “index” (axis=0, default), “columns” (axis=1)
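As a rough illustration of how "axis" steers sub(), the sketch below uses a small made-up DataFrame (the names df, col_means and row_means are hypothetical, not part of the drinks data). Note that for sub() the default is axis='columns', unlike the aggregation methods.

```python
import pandas as pd

# Hypothetical toy frame, not part of the drinks dataset.
df = pd.DataFrame({"a": [1, 2, 3], "b": [10, 20, 30]})

# One value per column: the Series index ("a", "b") matches df's column labels.
col_means = df.mean(axis=0)
centered_cols = df.sub(col_means, axis="columns")

# One value per row: the Series index (0, 1, 2) matches df's row labels.
row_means = df.mean(axis=1)
centered_rows = df.sub(row_means, axis="index")

print(centered_cols)
print(centered_rows)
```

In both calls the "axis" value names the DataFrame axis whose labels the Series is matched against.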

In [2]:
import pandas as pd

In [3]:
drinks = pd.read_csv('http://bit.ly/drinksbycountry')

In [5]:
drinks.head()  # show the first 5 rows

Unnamed: 0,country,beer_servings,spirit_servings,wine_servings,total_litres_of_pure_alcohol,continent
0,Afghanistan,0,0,0,0.0,Asia
1,Albania,89,132,54,4.9,Europe
2,Algeria,25,0,14,0.7,Africa
3,Andorra,245,138,312,12.4,Europe
4,Angola,217,57,45,5.9,Africa


Each row above represents a country and its alcohol consumption per adult.

Assume the 'continent' column is not used much, so we drop it from the 'drinks' DataFrame as below:

In [6]:
# drop(): remove rows or columns by specifying label names and the corresponding axis
# axis=1 refers to columns; head() shows only the first 5 rows of the resulting DataFrame
drinks.drop('continent', axis=1).head()

Unnamed: 0,country,beer_servings,spirit_servings,wine_servings,total_litres_of_pure_alcohol
0,Afghanistan,0,0,0,0.0
1,Albania,89,132,54,4.9
2,Algeria,25,0,14,0.7
3,Andorra,245,138,312,12.4
4,Angola,217,57,45,5.9


The output above shows the 'continent' column dropped from the 'drinks' DataFrame; we had to specify "axis=1" to say that we are dropping a column.
Since we did not pass "inplace=True" or assign the result back to 'drinks', the original DataFrame still contains the 'continent' column; drop() only returned a modified copy.
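A minimal sketch of that behaviour, using a hand-typed three-row stand-in for the data (drinks_sample is a hypothetical name, used here so the example runs without downloading the CSV):

```python
import pandas as pd

# Three rows copied by hand from the output shown earlier.
drinks_sample = pd.DataFrame({
    "country": ["Afghanistan", "Albania", "Algeria"],
    "beer_servings": [0, 89, 25],
    "continent": ["Asia", "Europe", "Africa"],
})

# drop() returns a new DataFrame; drinks_sample itself is unchanged.
without_continent = drinks_sample.drop("continent", axis=1)
original_still_has_it = "continent" in drinks_sample.columns

print("continent" in without_continent.columns)  # False
print(original_still_has_it)                     # True

# To keep the change, assign the result back to the name.
drinks_sample = drinks_sample.drop("continent", axis=1)
print("continent" in drinks_sample.columns)      # False
```

Assigning the result back is one common alternative to inplace=True.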

Next, assume I want to drop a row, the one with index label 2, as below:

drinks.drop(2, axis=0).head() # axis=0(Rows)

The result above shows that the 'continent' column is back, but the row with index 2 is dropped.

So, instead of creating two separate methods for dropping columns and dropping rows, pandas provides a single drop() method with an 'axis' parameter, where 'axis=0' refers to rows and 'axis=1' refers to columns.
This is the general pattern: whenever a pandas method takes an 'axis' argument, axis=0 refers to rows and axis=1 refers to columns.
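The two drop() calls can be compared side by side in a small sketch. The frame df below is a hand-typed, hypothetical stand-in for 'drinks'; the index= and columns= keywords are an alternative spelling that drop() also accepts.

```python
import pandas as pd

# Hand-typed stand-in for the drinks data (three rows, three columns).
df = pd.DataFrame({
    "country": ["Afghanistan", "Albania", "Algeria"],
    "beer_servings": [0, 89, 25],
    "continent": ["Asia", "Europe", "Africa"],
})

no_row_2 = df.drop(2, axis=0)                # remove the row labelled 2
no_continent = df.drop("continent", axis=1)  # remove the 'continent' column

print(no_row_2.shape)      # (2, 3)
print(no_continent.shape)  # (3, 2)

# Keyword spelling of the same two operations.
same_rows = no_row_2.equals(df.drop(index=2))
same_cols = no_continent.equals(df.drop(columns="continent"))
print(same_rows, same_cols)  # True True
```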

###### Trying 'axis' with other methods:

Mathematical operation 'mean'

In [9]:
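drinks.mean(numeric_only=True)  # numeric_only=True is needed on pandas 2.0+, which raises an error on text columns instead of skipping them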

beer_servings                   106.160622
spirit_servings                  80.994819
wine_servings                    49.450777
total_litres_of_pure_alcohol      4.717098
dtype: float64

The result above is a single Series holding the mean of each numeric column in the 'drinks' DataFrame (beer_servings, spirit_servings, wine_servings and total_litres_of_pure_alcohol); the column names become the index of that Series.

So why did it give one mean per column?
It computes column means because the default for mean() is axis=0: the operation runs down the row axis, within each column.

In [10]:
drinks.mean(axis=0, numeric_only=True)  # numeric_only=True is needed on pandas 2.0+

beer_servings                   106.160622
spirit_servings                  80.994819
wine_servings                    49.450777
total_litres_of_pure_alcohol      4.717098
dtype: float64

drinks.mean() and drinks.mean(axis=0) give the same result, which confirms that "axis=0" is the default in the mean() method.

In [11]:
drinks.mean(axis=1, numeric_only=True)  # numeric_only=True is needed on pandas 2.0+

0        0.000
1       69.975
2        9.925
3      176.850
4       81.225
        ...   
188    110.925
189     29.000
190      1.525
191     14.375
192     22.675
Length: 193, dtype: float64

e.g. row with index 1: (89 + 132 + 54 + 4.9) = 279.9, so mean = 279.9 / 4 = 69.975

drinks.mean(axis=0) means the mean operation moves down, along the row axis. Imagine the 4 numeric columns (beer_servings, spirit_servings, wine_servings and total_litres_of_pure_alcohol) being collapsed down into a single set of 4 numbers, each representing the mean of one column.

If you instead want the mean of each row, use drinks.mean(axis=1). You can think of the mean operation as moving sideways, along the column axis (axis=1) [e.g. row with index 1: (89 + 132 + 54 + 4.9) = 279.9, so mean = 279.9 / 4 = 69.975].
This is not a meaningful number here, because the columns have different units (servings versus litres).
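The arithmetic for the row with index 1 can be checked with a short sketch. The one-row frame albania is a hypothetical name; its values are copied by hand from the Albania row shown earlier.

```python
import pandas as pd

# The numeric values of the Albania row (index label 1), typed in by hand.
albania = pd.DataFrame(
    {
        "beer_servings": [89],
        "spirit_servings": [132],
        "wine_servings": [54],
        "total_litres_of_pure_alcohol": [4.9],
    },
    index=[1],
)

row_mean = albania.mean(axis=1)       # one value per row
by_hand = (89 + 132 + 54 + 4.9) / 4   # same calculation written out

print(round(row_mean.loc[1], 3))  # 69.975
print(round(by_hand, 3))          # 69.975
```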

This explanation shows how an axis value of 0 or 1 works in the mean() method.

To support the above: when you move along axis=1 (columns), you keep all the rows, which is 193 rows in the 'drinks' DataFrame, so the result has shape (193,) as below:

In [14]:
drinks.mean(axis=1, numeric_only=True).shape  # numeric_only=True is needed on pandas 2.0+

(193,)

With axis=0 (rows), we got a shape of (4,) because all the rows of each column were collapsed down into one number per column, as below:

In [15]:
drinks.mean(axis=0, numeric_only=True).shape  # numeric_only=True is needed on pandas 2.0+

(4,)
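The same shape rule can be seen on a smaller scale. The sketch below assumes a hand-typed frame (sample, a hypothetical name) holding only the numeric columns of the five rows shown earlier, so the row count is 5 rather than 193.

```python
import pandas as pd

# Numeric columns of the first five rows shown earlier, typed in by hand.
sample = pd.DataFrame({
    "beer_servings": [0, 89, 25, 245, 217],
    "spirit_servings": [0, 132, 0, 138, 57],
    "wine_servings": [0, 54, 14, 312, 45],
    "total_litres_of_pure_alcohol": [0.0, 4.9, 0.7, 12.4, 5.9],
})

col_means = sample.mean(axis=0)  # rows collapsed: one value per column
row_means = sample.mean(axis=1)  # columns collapsed: one value per row

print(sample.shape)     # (5, 4)
print(col_means.shape)  # (4,)
print(row_means.shape)  # (5,)
```

The axis you pass is the one that disappears; the result is labelled by the other axis.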

###### Useful tip: Aliases for Axis numbers

Instead of axis=0, you can pass the string axis='index'. Both give the same output, as below:

In [16]:
drinks.mean(axis='index', numeric_only=True).shape  # numeric_only=True is needed on pandas 2.0+

(4,)

Instead of axis=1, you can pass the string axis='columns'. Both give the same output, as below:

In [17]:
drinks.mean(axis='columns', numeric_only=True).shape  # numeric_only=True is needed on pandas 2.0+

(193,)

Choose whichever of the two you prefer; both (0, 1) and ('index', 'columns') are valid in pandas.
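To confirm that the integer and string forms are interchangeable, here is a short sketch on a made-up numeric frame (df with columns x and y is hypothetical, not the drinks data):

```python
import pandas as pd

# Hypothetical all-numeric toy frame.
df = pd.DataFrame({"x": [1.0, 2.0, 3.0], "y": [4.0, 5.0, 6.0]})

# Default, integer and string spellings of the row axis.
default_matches_0 = df.mean().equals(df.mean(axis=0))
index_matches_0 = df.mean(axis=0).equals(df.mean(axis="index"))

# Integer and string spellings of the column axis.
columns_matches_1 = df.mean(axis=1).equals(df.mean(axis="columns"))

print(default_matches_0, index_matches_0, columns_matches_1)  # True True True
```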