## Pandas
----
- A package commonly used for data cleaning and analysis
- [User Guide](https://pandas.pydata.org/docs/user_guide/index.html)
- [API Reference](https://pandas.pydata.org/pandas-docs/stable/reference/index.html)
- [Reference material](https://github.com/victorgau/KHPY20180820)

In [None]:
import pandas as pd

In [None]:
%matplotlib inline

import numpy as np
from datetime import datetime

In [None]:
pd.__version__

- Series

In [None]:
s1 = pd.Series([1, 2, 3, 4, 5])
s1

In [None]:
s2 = pd.Series([1, 2, 3, 4, 5])
s2

In [None]:
s1+s2
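
Because `s1` and `s2` above share the same default index, `s1+s2` looks like plain element-wise addition. In general, pandas matches values by index label rather than by position. The sketch below uses two made-up Series (`a` and `b`, not part of the original notebook) with partially overlapping labels to show the difference.

```python
import pandas as pd

# Two illustrative Series with partially overlapping index labels
a = pd.Series([1, 2, 3], index=["x", "y", "z"])
b = pd.Series([10, 20, 30], index=["y", "z", "w"])

# The + operator aligns on labels; labels present in only one Series give NaN
total = a + b

# Series.add with fill_value treats a missing label as 0 instead
total_filled = a.add(b, fill_value=0)

print(total)
print(total_filled)
```

Only the labels `y` and `z` exist in both Series, so only those two entries of `total` hold a sum.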

- DataFrame

In [None]:
data = np.random.randn(10, 4)

In [None]:
df = pd.DataFrame(data)
df

In [None]:
df.columns = ['No1', 'No2', 'No3', 'No4']
df

In [None]:
df.index = pd.date_range('2016-01-01', periods=10)
df

In [None]:
df.loc['2016-01-06']

In [None]:
df['No1']

In [None]:
df.iloc[1]
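
The three cells above select by label (`.loc`), by column name (`[]`), and by position (`.iloc`). As a rough sketch of how these differ on slices, the example below builds a small deterministic frame (`frame` is an illustrative name, with integer data instead of the random data used above) and selects the same rows in two ways.

```python
import numpy as np
import pandas as pd

# Deterministic stand-in for the random DataFrame used in the notebook
frame = pd.DataFrame(
    np.arange(40).reshape(10, 4),
    columns=["No1", "No2", "No3", "No4"],
    index=pd.date_range("2016-01-01", periods=10),
)

# Label-based: both endpoints of a .loc slice are included
by_label = frame.loc["2016-01-03":"2016-01-05", ["No1", "No3"]]

# Position-based: the end of an .iloc slice is excluded
by_position = frame.iloc[2:5, [0, 2]]

# Boolean mask: keep rows where No1 is greater than 20
large = frame[frame["No1"] > 20]

print(by_label.equals(by_position))
```

Note that the label slice ends at `2016-01-05` inclusive, while the positional slice `2:5` stops before position 5, so both pick the same three rows.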

## Reading and Writing Data with Pandas

### Reading data with read_csv()
- [ref](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read_csv.html)

In [None]:
# This file is read as Big5; for a UTF-8 file, pass encoding="utf8" instead
df = pd.read_csv("2018_kh_data.csv", encoding="Big5", header=0)

In [None]:
df.head()

In [None]:
df.tail()

In [None]:
# Drop the last row in place
df.drop(df.index[-1], inplace=True)
df

## Data Visualization
- [ref](https://pandas.pydata.org/pandas-docs/stable/user_guide/visualization.html)

In [None]:
s = pd.Series(np.random.randn(10), index=np.arange(10))
s.plot()

In [None]:
s.plot(kind="bar")

In [None]:
df = pd.DataFrame(np.random.randn(10, 4), columns=list('ABCD'))
df.plot()

In [None]:
df.plot(kind='bar')

### Test: plotting stock data

In [None]:
df = pd.read_csv("stock_data.csv", encoding="Big5", header=0)

In [None]:
df

In [None]:
df.plot()

In [None]:
df['close'].plot()
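
With a default integer index, `df['close'].plot()` uses the row number on the x-axis. If the file has a date column, parsing it and using it as the index should give a date axis instead. The sketch below assumes a column named `date`, which is hypothetical. Only `close` appears in the notebook, and the rows are made up.

```python
import io
import pandas as pd

# Hypothetical rows; only the 'close' column name comes from the notebook
csv_text = (
    "date,open,close\n"
    "2018-08-01,10.0,10.5\n"
    "2018-08-02,10.5,10.2\n"
    "2018-08-03,10.2,10.8\n"
)

# Parse the assumed 'date' column and use it as the index
prices = pd.read_csv(
    io.StringIO(csv_text), parse_dates=["date"], index_col="date"
)

print(prices.index)
```

Calling `prices['close'].plot()` on a frame prepared this way plots the closing price against the dates in the index.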

### Grouping and Aggregating Data

In [None]:
data = np.random.randn(10, 4)

In [None]:
df = pd.DataFrame(data, columns=['No1', 'No2', 'No3', 'No4'])
df

In [None]:
df['Category'] = np.nan  # placeholder column; np.NaN was removed in NumPy 2.0
df

In [None]:
clist = ['C1'] * 3 + ['C2'] * 5 + ['C3'] * 2

In [None]:
clist

In [None]:
df['Category'] = np.random.permutation(clist)

In [None]:
df

In [None]:
groups = df.groupby('Category')

In [None]:
groups

In [None]:
groups.mean()
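
`groups.mean()` applies one function to every numeric column. As a small sketch of going further, the example below uses a hand-written frame (`frame` and its values are illustrative, not the random data above) so the results can be checked by hand, then counts rows per group and applies different aggregations per column.

```python
import pandas as pd

# Small deterministic frame so the aggregates are easy to verify
frame = pd.DataFrame({
    "Category": ["C1", "C1", "C2", "C2", "C2"],
    "No1": [1.0, 3.0, 2.0, 4.0, 6.0],
    "No2": [10.0, 20.0, 30.0, 40.0, 50.0],
})

groups = frame.groupby("Category")

# Number of rows in each group
sizes = groups.size()

# Different aggregations per column; the result has two-level column labels
summary = groups.agg({"No1": ["mean", "max"], "No2": "sum"})

print(sizes)
print(summary)
```

A single value can be read from the result with a tuple column key, for example `summary.loc["C2", ("No1", "mean")]`.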

### Further Pandas Topics
- [See also](https://github.com/victorgau/khpy_pandas_intro)

## Pandas: CSV Data Processing (Public Bicycle Example)

## Pandas: Databases
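
As one possible starting point for this section, the sketch below writes a DataFrame to an in-memory SQLite database and reads a filtered result back. The table name `stations`, its columns, and the values are all hypothetical. The choice of SQLite through the standard-library `sqlite3` module is an assumption, since the notebook does not say which database is intended.

```python
import sqlite3
import pandas as pd

# In-memory SQLite database, so no file is created
conn = sqlite3.connect(":memory:")

# Hypothetical bicycle-station data
frame = pd.DataFrame({"station": ["A", "B", "C"], "bikes": [5, 0, 12]})

# Write the frame to a table, leaving the index out
frame.to_sql("stations", conn, index=False)

# Read back only the stations that have bikes available
available = pd.read_sql_query(
    "SELECT station, bikes FROM stations WHERE bikes > 0 ORDER BY station",
    conn,
)
conn.close()

print(available)
```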