## NumPy Arrays

### Getting started
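NumPy arrays are more convenient and more efficient than Python lists for numerical work, because operations are vectorized: one expression applies to every element.
The example below converts lists into NumPy arrays.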

In [11]:
height = [1.87,  1.87, 1.82, 1.91, 1.90, 1.85]
weight = [81.65, 97.52, 95.25, 92.98, 86.18, 88.45]

import numpy as np

np_height = np.array(height)
np_weight = np.array(weight)
np_weight, np_height

(array([81.65, 97.52, 95.25, 92.98, 86.18, 88.45]),
 array([1.87, 1.87, 1.82, 1.91, 1.9 , 1.85]))

The type of a NumPy array is `numpy.ndarray`:

In [3]:
type(np_height)

numpy.ndarray
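
To make the contrast with plain lists concrete, here is a minimal sketch. The names `heights_list` and `heights_arr` and the three values are illustrative only. It shows that arithmetic operators on a list either mean something else or fail, while on an array they apply to every element.

```python
import numpy as np

heights_list = [1.87, 1.82, 1.91]      # illustrative values
heights_arr = np.array(heights_list)

# On a list, * means repetition, not multiplication
repeated = heights_list * 2            # a list with 6 elements

# On an array, * multiplies every element
doubled = heights_arr * 2

# Exponentiation is not defined for lists at all
try:
    heights_list ** 2
    list_power_ok = True
except TypeError:
    list_power_ok = False

squared = heights_arr ** 2             # works element by element
print(len(repeated), doubled, list_power_ok, squared.shape)
```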

### Element-wise calculations
A single expression computes over the whole array, and it does so quickly.

In [5]:
# Compute BMI: weight (kg) divided by height (m) squared
bmi = np_weight / np_height ** 2
bmi

array([23.34925219, 27.88755755, 28.75558507, 25.48723993, 23.87257618,
       25.84368152])
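
As a rough illustration of what "element-wise" means, the sketch below compares a plain-Python list comprehension with the equivalent array expression, using the first three values from the example above. The names `bmi_loop` and `bmi_vec` are made up for this comparison.

```python
import numpy as np

weights = [81.65, 97.52, 95.25]   # kg, first three values from the example
heights = [1.87, 1.87, 1.82]      # m

# Plain-Python version: one division per pair of elements
bmi_loop = [w / h ** 2 for w, h in zip(weights, heights)]

# NumPy version: the same arithmetic written once for the whole array
bmi_vec = np.array(weights) / np.array(heights) ** 2

# Both approaches produce the same numbers
print(np.allclose(bmi_loop, bmi_vec))
```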

### Subsetting
Selecting the elements of the array whose BMI is greater than 25 takes a single line of code.

In [8]:
# Boolean array (mask): True where the condition holds
bmi > 25

array([False,  True,  True,  True, False,  True])

In [14]:
# Keep only the elements where the mask is True; equivalent to bmi[[False,  True,  True,  True, False,  True]]
bmi[bmi > 25]

array([27.88755755, 28.75558507, 25.48723993, 25.84368152])
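
Boolean masks can also be combined. The sketch below is not part of the original notebook; it uses rounded, illustrative BMI values and shows `&` for combining conditions and `np.where` for recovering positions.

```python
import numpy as np

bmi = np.array([23.3, 27.9, 28.8, 25.5, 23.9, 25.8])  # rounded, illustrative

# Combine conditions with & (and), | (or), ~ (not).
# The parentheses are required because & binds tighter than > and <.
in_range = (bmi > 25) & (bmi < 28)
selected = bmi[in_range]

# np.where returns the positions at which the condition holds
positions = np.where(bmi > 25)[0]

# True counts as 1, so summing a mask counts the matches
count_above = int((bmi > 25).sum())
print(selected, positions, count_above)
```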

## Pandas Basics

### Pandas DataFrames
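pandas is built on top of NumPy, and its core data structure is the DataFrame. A DataFrame makes it easy to store and manipulate tabular data (rows and columns).

There are several ways to create a DataFrame; one is to build it from a dictionary.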

In [15]:
# Named `data` rather than `dict` to avoid shadowing the built-in dict type
data = {"country": ["Brazil", "Russia", "India", "China", "South Africa"],
        "capital": ["Brasilia", "Moscow", "New Delhi", "Beijing", "Pretoria"],
        "area": [8.516, 17.10, 3.286, 9.597, 1.221],
        "population": [200.4, 143.5, 1252, 1357, 52.98]}

import pandas as pd
brics = pd.DataFrame(data)
brics

| | country | capital | area | population |
|---|---|---|---|---|
| 0 | Brazil | Brasilia | 8.516 | 200.4 |
| 1 | Russia | Moscow | 17.1 | 143.5 |
| 2 | India | New Delhi | 3.286 | 1252.0 |
| 3 | China | Beijing | 9.597 | 1357.0 |
| 4 | South Africa | Pretoria | 1.221 | 52.98 |

pandas assigns each row a default integer index (0 to 4). The index can be replaced with custom labels.

In [16]:
# Set custom index labels
brics.index = ["BR", "RU", "IN", "CH", "SA"]

brics

| | country | capital | area | population |
|---|---|---|---|---|
| BR | Brazil | Brasilia | 8.516 | 200.4 |
| RU | Russia | Moscow | 17.1 | 143.5 |
| IN | India | New Delhi | 3.286 | 1252.0 |
| CH | China | Beijing | 9.597 | 1357.0 |
| SA | South Africa | Pretoria | 1.221 | 52.98 |
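
Assigning to `.index` after construction is one option. As a sketch of two alternatives, the labels can be passed to the constructor through its `index` parameter, or an existing column can be promoted with `set_index`. The shortened table and the names `brics_small`, `labels`, and `by_country` are illustrative only.

```python
import pandas as pd

data = {
    "country": ["Brazil", "Russia", "India"],
    "area": [8.516, 17.10, 3.286],
}
labels = ["BR", "RU", "IN"]

# Pass the row labels at construction time
brics_small = pd.DataFrame(data, index=labels)
print(brics_small.index.tolist())
print(brics_small.shape)

# set_index returns a new DataFrame that uses a column as the index
by_country = brics_small.set_index("country")
print(by_country.columns.tolist())
```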


Another way to create a DataFrame is to import it from a CSV file.

In [17]:
import pandas as pd

cars = pd.read_csv('cars.csv')
cars

| Unnamed: 0 | cars_per_cap | country | drives_right |
|---|---|---|---|
| US | 809 | United States | True |
| AUS | 731 | Australia | False |
| JAP | 588 | Japan | False |
| IN | 18 | India | False |
| RU | 200 | Russia | True |
| MOR | 70 | Morocco | True |
| EG | 45 | Egypt | True |
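
The contents of `cars.csv` are not shown in this notebook, so the sketch below assumes a file whose first header cell is empty and reads it from an in-memory string instead of from disk. Under that assumption, `read_csv` names the first column `Unnamed: 0` by default, and `index_col=0` turns that column into the row index.

```python
import io
import pandas as pd

# Assumed layout of the file: the first header cell is empty
csv_text = """,cars_per_cap,country,drives_right
US,809,United States,True
AUS,731,Australia,False
JAP,588,Japan,False
"""

# Default: the unnamed first column becomes an ordinary column
default = pd.read_csv(io.StringIO(csv_text))
print(default.columns.tolist())

# index_col=0: the first column becomes the row labels
indexed = pd.read_csv(io.StringIO(csv_text), index_col=0)
print(indexed.index.tolist())
```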


### Indexing DataFrames
The simplest way to select data is with square brackets.

In [20]:
import pandas as pd
# index_col = 0 uses the first column (position 0) as the index (row labels)
cars = pd.read_csv('cars.csv', index_col = 0)

# Single brackets return a pandas Series
print(cars['cars_per_cap'])
print("========================")
# Double brackets return a pandas DataFrame
print(cars[['cars_per_cap']])
print("========================")
# Select several columns
print(cars[['cars_per_cap', 'country']])

US     809
AUS    731
JAP    588
IN      18
RU     200
MOR     70
EG      45
Name: cars_per_cap, dtype: int64
========================
     cars_per_cap
US            809
AUS           731
JAP           588
IN             18
RU            200
MOR            70
EG             45
========================
     cars_per_cap        country
US            809  United States
AUS           731      Australia
JAP           588          Japan
IN             18          India
RU            200         Russia
MOR            70        Morocco
EG             45          Egypt
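
The difference between single and double brackets can be checked directly. This is a small self-contained sketch that uses a hand-built three-row table (`cars_small`, an illustrative stand-in for `cars.csv`).

```python
import pandas as pd

cars_small = pd.DataFrame(
    {
        "cars_per_cap": [809, 731, 588],
        "country": ["United States", "Australia", "Japan"],
    },
    index=["US", "AUS", "JAP"],
)

as_series = cars_small["cars_per_cap"]     # one-dimensional
as_frame = cars_small[["cars_per_cap"]]    # two-dimensional, one column

print(type(as_series).__name__, as_series.shape)
print(type(as_frame).__name__, as_frame.shape)
```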


Square brackets can also select rows, using a slice.

In [22]:
import pandas as pd
cars = pd.read_csv('cars.csv', index_col = 0)

# First 4 rows: positions [0, 4)
print(cars[0:4])
print("========================================")
# Rows at positions 4 and 5: [4, 6)
print(cars[4:6])

     cars_per_cap        country  drives_right
US            809  United States          True
AUS           731      Australia         False
JAP           588          Japan         False
IN             18          India         False
========================================
     cars_per_cap  country  drives_right
RU            200   Russia          True
MOR            70  Morocco          True
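
Row selection with brackets only works with a slice. The sketch below uses an illustrative four-row table (`cars_small`) to show that an integer slice selects rows by position, while a single integer is looked up as a column name and fails.

```python
import pandas as pd

cars_small = pd.DataFrame(
    {"cars_per_cap": [809, 731, 588, 18]},
    index=["US", "AUS", "JAP", "IN"],
)

# An integer slice selects rows by position, like iloc
first_two = cars_small[0:2]
same_rows = first_two.equals(cars_small.iloc[0:2])

# A single integer is treated as a column label, so this raises KeyError
try:
    cars_small[0]
    single_int_ok = True
except KeyError:
    single_int_ok = False

print(first_two.index.tolist(), same_rows, single_int_ok)
```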


Use `loc` and `iloc` to select arbitrary rows or columns: `loc` is label-based, while `iloc` is based on integer position.

In [26]:
import pandas as pd
cars = pd.read_csv('cars.csv', index_col = 0)

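# Third row (returned as a pandas Series)
print(cars.iloc[2])
print()
# Australia and Egypt, selected by row label
print(cars.loc[['AUS', 'EG']])
print()
# First two columns of every row; the two selections above omitted the column part
print(cars.iloc[:, [0,1]])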

cars_per_cap      588
country         Japan
drives_right    False
Name: JAP, dtype: object

     cars_per_cap    country  drives_right
AUS           731  Australia         False
EG             45      Egypt          True

     cars_per_cap        country
US            809  United States
AUS           731      Australia
JAP           588          Japan
IN             18          India
RU            200         Russia
MOR            70        Morocco
EG             45          Egypt
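
Both accessors also take a row selector and a column selector together. The following sketch, again on an illustrative three-row table (`cars_small`), picks single cells and sub-tables. One detail to watch: label slices with `loc` include the end label, whereas position slices with `iloc` exclude the end position.

```python
import pandas as pd

cars_small = pd.DataFrame(
    {
        "cars_per_cap": [809, 731, 588],
        "country": ["United States", "Australia", "Japan"],
        "drives_right": [True, False, False],
    },
    index=["US", "AUS", "JAP"],
)

# Single cell: row and column given together
by_label = cars_small.loc["JAP", "country"]
by_position = cars_small.iloc[2, 1]

# Slices: loc includes the end label, iloc excludes the end position
label_slice = cars_small.loc["US":"AUS"]      # 2 rows
position_slice = cars_small.iloc[0:1]         # 1 row

# Sub-table: lists of row labels and column labels
sub = cars_small.loc[["US", "JAP"], ["country", "drives_right"]]
print(by_label, by_position, len(label_slice), len(position_slice), sub.shape)
```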
