## How to understand axis in pandas and NumPy

In [1]:
import pandas as pd

In [2]:
drinks = pd.DataFrame([['Afghanistan', 0, 0, 0, 0.0, 'Asia'], 
                      ['Albania', 89, 132, 54,4.9, 'Eurpoe'],
                      ['Algeria', 25, 0, 14, 0.7, 'Africa'],
                      ['Andorra', 245, 138, 312, 12.4, 'Eurpoe'],
                      ['Angola', 217, 57, 45, 5.9, 'Africa']], 
                      columns=['country', 
                               'beer-servings',
                               'spirit_servings',
                               'wine_servings',
                               'total_liters_of_pure_alcohol',
                               'continent'])

In [3]:
drinks

|   | country | beer-servings | spirit_servings | wine_servings | total_liters_of_pure_alcohol | continent |
|---|---|---|---|---|---|---|
| 0 | Afghanistan | 0 | 0 | 0 | 0.0 | Asia |
| 1 | Albania | 89 | 132 | 54 | 4.9 | Eurpoe |
| 2 | Algeria | 25 | 0 | 14 | 0.7 | Africa |
| 3 | Andorra | 245 | 138 | 312 | 12.4 | Eurpoe |
| 4 | Angola | 217 | 57 | 45 | 5.9 | Africa |


In [4]:
drinks.drop('continent', axis=1)

|   | country | beer-servings | spirit_servings | wine_servings | total_liters_of_pure_alcohol |
|---|---|---|---|---|---|
| 0 | Afghanistan | 0 | 0 | 0 | 0.0 |
| 1 | Albania | 89 | 132 | 54 | 4.9 |
| 2 | Algeria | 25 | 0 | 14 | 0.7 |
| 3 | Andorra | 245 | 138 | 312 | 12.4 |
| 4 | Angola | 217 | 57 | 45 | 5.9 |


In [13]:
# Equivalent to the axis=1 call above
drinks.drop('continent', axis="columns")

|   | country | beer-servings | spirit_servings | wine_servings | total_liters_of_pure_alcohol |
|---|---|---|---|---|---|
| 0 | Afghanistan | 0 | 0 | 0 | 0.0 |
| 1 | Albania | 89 | 132 | 54 | 4.9 |
| 2 | Algeria | 25 | 0 | 14 | 0.7 |
| 3 | Andorra | 245 | 138 | 312 | 12.4 |
| 4 | Angola | 217 | 57 | 45 | 5.9 |


In [5]:
drinks.drop(1)

|   | country | beer-servings | spirit_servings | wine_servings | total_liters_of_pure_alcohol | continent |
|---|---|---|---|---|---|---|
| 0 | Afghanistan | 0 | 0 | 0 | 0.0 | Asia |
| 2 | Algeria | 25 | 0 | 14 | 0.7 | Africa |
| 3 | Andorra | 245 | 138 | 312 | 12.4 | Eurpoe |
| 4 | Angola | 217 | 57 | 45 | 5.9 | Africa |


##### 0 is the row axis and 1 is the column axis. Dropping with axis=1 removes a column, while `drinks.drop(1)` above uses the default axis=0 and removes the row labelled 1. Taking the mean with axis=1 makes the operation "move across" the column axis, which produces one mean per row.
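The rule above can be sketched on a toy frame. The frame `df` and its columns `a` and `b` are hypothetical and stand in for `drinks`; the `index=` and `columns=` keywords of `drop` are an alternative spelling that avoids the axis number altogether.

```python
import pandas as pd

# Hypothetical toy frame, used only to illustrate the axis rule.
df = pd.DataFrame({"a": [1, 2, 3], "b": [10, 20, 30]})

# axis=0 (or "index") targets row labels; axis=1 (or "columns") targets column labels.
without_row = df.drop(1, axis=0)    # removes the row labelled 1
without_col = df.drop("b", axis=1)  # removes the column "b"

# The index= / columns= keywords express the same thing without an axis number.
same_row = df.drop(index=1)
same_col = df.drop(columns="b")

# For reductions, the axis names the direction that gets collapsed.
col_means = df.mean(axis=0)  # one value per column
row_means = df.mean(axis=1)  # one value per row
```

Note the asymmetry this makes visible: `drop(..., axis=1)` acts on a column, but `mean(axis=1)` returns one value per row, because the reduction runs along the column axis.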

#### What about the mean operation?

In [6]:
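# axis=0 gives the mean of each column
# (numeric_only=True skips the string columns; recent pandas raises a TypeError without it)
drinks.mean(axis=0, numeric_only=True)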

beer-servings                   115.20
spirit_servings                  65.40
wine_servings                    85.00
total_liters_of_pure_alcohol      4.78
dtype: float64

In [7]:
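# axis=1 gives the mean of each row
# (numeric_only=True skips the string columns; recent pandas raises a TypeError without it)
drinks.mean(axis=1, numeric_only=True)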

0      0.000
1     69.975
2      9.925
3    176.850
4     81.225
dtype: float64

#### What about NumPy?

In [8]:
import numpy as np

In [10]:
b = np.arange(12).reshape(3,4)
b

array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])

In [11]:
# axis=1 gives the sum of each row
b.sum(axis=1)

array([ 6, 22, 38])

In [12]:
# axis=0 gives the sum of each column
b.sum(axis=0)

array([12, 15, 18, 21])

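###### So axis=1 means the operation runs horizontally (across the columns) and axis=0 means it runs vertically (down the rows). For example, sum with axis=1 gives the sum of each row.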
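Another way to read the same result is through shapes: the axis passed to a reduction is the one that disappears from the result. A minimal sketch, reusing the array `b` from above (the variable names `col_sums`, `row_sums` and `row_sums_2d` are just for illustration):

```python
import numpy as np

b = np.arange(12).reshape(3, 4)  # shape (3, 4): 3 rows, 4 columns

col_sums = b.sum(axis=0)  # axis 0 (length 3) is collapsed, leaving shape (4,)
row_sums = b.sum(axis=1)  # axis 1 (length 4) is collapsed, leaving shape (3,)

# keepdims=True keeps the collapsed axis in place with length 1.
row_sums_2d = b.sum(axis=1, keepdims=True)

print(b.shape, col_sums.shape, row_sums.shape, row_sums_2d.shape)
# prints: (3, 4) (4,) (3,) (3, 1)
```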

In [14]:
b[0,1]

1

In [15]:
b[1,0]

4
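The two lookups above show that in `b[i, j]` the first index runs along axis 0 and the second along axis 1. Under that reading, a reduction over an axis is a loop over the index in that position. The explicit loops below are only an illustrative sketch (the names `manual_axis0` and `manual_axis1` are made up here), not how NumPy computes the result internally.

```python
import numpy as np

b = np.arange(12).reshape(3, 4)
n_rows, n_cols = b.shape

# Summing over axis=0: for each fixed column j, add up b[i, j] over all rows i.
manual_axis0 = [int(sum(b[i, j] for i in range(n_rows))) for j in range(n_cols)]

# Summing over axis=1: for each fixed row i, add up b[i, j] over all columns j.
manual_axis1 = [int(sum(b[i, j] for j in range(n_cols))) for i in range(n_rows)]
```

Both lists match `b.sum(axis=0)` and `b.sum(axis=1)` respectively.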