# Connecting External Applications to a Database

## 郭耀仁

## Concepts

- What are we doing here?
    - Connecting external applications to a database
- External applications used in this course:
    - MySQL Workbench
    - R
    - Python
- Database system and host used in this course:
    - MySQL@AWS RDS

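## Concepts (2)

- Different external applications call for different connection methods
- Different hosts call for different connection methods
- The connection methods we use:

|External application|Connection method|
|--------------------|-----------------|
|MySQL Workbench|Any IP address|
|R|Any IP address|
|Python|Any IP address|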

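## Concepts (3)

- The connection method we use here (any IP address) is **not recommended** for production environments
- Connection settings inside a company have to be handled case by case; no single approach works everywhere
    - For commercial database systems, contact the vendor for technical support
    - For open-source database systems, consult the documentation, or ask the IT department or an outside contractor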

## Prerequisites

- Adjust the Inbound Rule of the security group attached to the AWS RDS instance
- 0.0.0.0/0 means any IPv4 address

![](img/chapter0304.png)
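
To make the meaning of `0.0.0.0/0` concrete, here is a minimal sketch using Python's standard `ipaddress` module. The narrower rule `203.0.113.25/32` and the helper `is_allowed()` are hypothetical and only illustrate the contrast; they are not part of the course setup.

```python
import ipaddress

# 0.0.0.0/0 has a zero-length prefix, so it covers the whole IPv4 space.
any_ipv4 = ipaddress.ip_network("0.0.0.0/0")

# Hypothetical narrower rule: one address from a documentation-only range.
office_only = ipaddress.ip_network("203.0.113.25/32")


def is_allowed(client_ip, rule):
    """Return True if client_ip falls inside the CIDR block of the rule."""
    return ipaddress.ip_address(client_ip) in rule


print(any_ipv4.num_addresses)                   # 4294967296
print(is_allowed("8.8.8.8", any_ipv4))          # True
print(is_allowed("8.8.8.8", office_only))       # False
print(is_allowed("203.0.113.25", office_only))  # True
```

A rule like the second one lets a single known address through, which is closer to what a production setup would typically use.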

## Prerequisites (2)

The four elements of a database connection

- Host: rsqltrain.ced04jhfjfgi.ap-northeast-1.rds.amazonaws.com
- Port: 3306
- Username: trainstudent
- Password: csietrain
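
The four elements can be kept together in one place and turned into a connection URL. The sketch below is one possible way to do that with the standard library only; the `ConnectionInfo` class and its `to_url()` method are hypothetical names introduced for illustration, and the `mysql+pymysql` prefix assumes the URL will later be handed to SQLAlchemy with the `pymysql` driver.

```python
from dataclasses import dataclass
from urllib.parse import quote_plus


@dataclass(frozen=True)
class ConnectionInfo:
    """Hypothetical container for the four connection elements."""
    host: str
    port: int
    user: str
    password: str

    def to_url(self, database, driver="mysql+pymysql"):
        # Percent-encode user and password so characters such as '@' or ':'
        # cannot be mistaken for URL separators.
        return "{}://{}:{}@{}:{}/{}".format(
            driver,
            quote_plus(self.user),
            quote_plus(self.password),
            self.host,
            self.port,
            database,
        )


info = ConnectionInfo(
    host="rsqltrain.ced04jhfjfgi.ap-northeast-1.rds.amazonaws.com",
    port=3306,
    user="trainstudent",
    password="csietrain",
)
url = info.to_url("world")
print(url)
```

Encoding the credentials matters once a password contains reserved characters: `quote_plus("p@ss:word")` yields `p%40ss%3Aword`.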

# MySQL Workbench

## How to install MySQL Workbench

<https://dev.mysql.com/downloads/workbench/?utm_source=tuicool>

- Windows users need to install these first:
    - [Microsoft .NET Framework 4.5](https://www.microsoft.com/en-us/download/details.aspx?id=30653)
    - [Visual C++ Redistributable for Visual Studio 2015](https://www.microsoft.com/en-us/download/details.aspx?id=48145)

## Add a new connection

![](img/workbench_01.png)

## Fill in the host address and username

![](img/workbench_02.png)

## Open the connection

![](img/workbench_03.png)

## Enter the password

![](img/workbench_04.png)

## Connection successful

![](img/workbench_05.png)

## Hello World!

![](img/workbench_06.png)

# R

## How to install the R development environment

- [R](https://cran.r-project.org/)
- [RStudio](https://www.rstudio.com/products/rstudio/download/)

## Downloading and loading packages

- Open RStudio and install the `RMySQL` package
- Load the `DBI` package

```r
install.packages("RMySQL")
library(DBI)
```

```r
# Create a connection with `dbConnect()`
con <- dbConnect(RMySQL::MySQL(), 
                 dbname = "world",
                 host = "rsqltrain.ced04jhfjfgi.ap-northeast-1.rds.amazonaws.com",
                 port = 3306,
                 user = "trainstudent",
                 password = "csietrain")
```

```r
# List all tables in the database
dbListTables(con)

# Read one table from the database
country <- dbReadTable(con, "country")
dbDisconnect(con) # Important: always close the connection!
View(country)
```

## Inspecting the table

- `country` is now an R data frame (`data.frame`) object
- Besides `View()`, several other R functions help inspect it:

```r
head(country)
tail(country)
nrow(country)
ncol(country)
dim(country)
names(country)
summary(country)
str(country)
```

## Reading part of a database table

```r
library(DBI)

con <- dbConnect(RMySQL::MySQL(), 
                 dbname = "world",
                 host = "rsqltrain.ced04jhfjfgi.ap-northeast-1.rds.amazonaws.com",
                 port = 3306,
                 user = "trainstudent",
                 password = "csietrain")

twn <- dbGetQuery(con, statement = "SELECT * FROM country WHERE Name = 'Taiwan'")
twn

dbDisconnect(con)
```

# Python

## How to set up the Python development environment

- Use the Google Colab service
- Sign up for a Google (Gmail) account

## How to use Google Colab

- Open Google Drive
- New
  - More
    - Connect more apps
      - Google Colaboratory

![New menu](https://storage.googleapis.com/pyprg/new.png)

![Connect more apps](https://storage.googleapis.com/pyprg/connect_more.png)

![Google Colaboratory](https://storage.googleapis.com/pyprg/connect_colab.png)

## Google Colab settings

- Set the Runtime to Python 3
- Keyboard shortcut for running a cell: Shift + Enter
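
A quick way to confirm the runtime setting is to run a small check in the first cell. This is only a sketch; the helper name `is_python3()` is hypothetical.

```python
import sys


def is_python3(version_info=sys.version_info):
    """Return True when the major version number is 3."""
    return version_info[0] == 3


# Show the interpreter version of the current runtime, then the check result.
print(sys.version.split()[0])
print(is_python3())
```

Paste it into a cell and press Shift + Enter to run it.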

## Connecting with two modules

- `pymysql`
- `sqlalchemy`

```python
import pymysql
from sqlalchemy import create_engine
```
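
Either module may be absent from a fresh runtime, in which case the imports above raise `ModuleNotFoundError`. The sketch below checks availability first; `missing_modules()` is a hypothetical helper, and the suggested `pip install` command assumes the packages are published under the same names as the modules.

```python
import importlib.util


def missing_modules(names):
    """Return the names that cannot be imported in this runtime."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_modules(["pymysql", "sqlalchemy"])
if missing:
    # In a notebook cell, prefix the command with '!' to run it in the shell.
    print("Install first: pip install " + " ".join(missing))
else:
    print("pymysql and sqlalchemy are available")
```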

## Reading data with pandas

```python
import pandas as pd
```

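```python
import pymysql
import pandas as pd

host = "rsqltrain.ced04jhfjfgi.ap-northeast-1.rds.amazonaws.com"
port = 3306
user = "trainstudent"
passwd = "csietrain"
db_name = "world"

# Pass every argument by keyword; recent PyMySQL releases no longer accept positional ones
conn = pymysql.connect(host=host, port=port, user=user, password=passwd, database=db_name)
```

```python
# Read the whole `country` table into a pandas DataFrame
df = pd.read_sql('SELECT * FROM country', con=conn)
df.head()
```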

| |Code|Name|Continent|Region|SurfaceArea|IndepYear|Population|LifeExpectancy|GNP|GNPOld|LocalName|GovernmentForm|HeadOfState|Capital|Code2|
|-|----|----|---------|------|-----------|---------|----------|--------------|---|------|---------|--------------|-----------|-------|-----|
|0|ABW|Aruba|North America|Caribbean|193.0| |103000|78.4|828.0|793.0|Aruba|Nonmetropolitan Territory of The Netherlands|Beatrix|129.0|AW|
|1|AFG|Afghanistan|Asia|Southern and Central Asia|652090.0|1919.0|22720000|45.9|5976.0| |Afganistan/Afqanestan|Islamic Emirate|Mohammad Omar|1.0|AF|
|2|AGO|Angola|Africa|Central Africa|1246700.0|1975.0|12878000|38.3|6648.0|7984.0|Angola|Republic|José Eduardo dos Santos|56.0|AO|
|3|AIA|Anguilla|North America|Caribbean|96.0| |8000|76.1|63.2| |Anguilla|Dependent Territory of the UK|Elisabeth II|62.0|AI|
|4|ALB|Albania|Europe|Southern Europe|28748.0|1912.0|3401200|71.6|3205.0|2500.0|Shqipëria|Republic|Rexhep Mejdani|34.0|AL|


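```python
from sqlalchemy import create_engine

# Use the pymysql driver (the module imported earlier) and the `db_name` variable defined earlier
engine = create_engine('mysql+pymysql://{}:{}@{}:{}/{}'.format(user, passwd, host, port, db_name))
```

```python
# pandas accepts a SQLAlchemy engine as the connection
df = pd.read_sql('SELECT * FROM country', con=engine)
df.head()
```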

| |Code|Name|Continent|Region|SurfaceArea|IndepYear|Population|LifeExpectancy|GNP|GNPOld|LocalName|GovernmentForm|HeadOfState|Capital|Code2|
|-|----|----|---------|------|-----------|---------|----------|--------------|---|------|---------|--------------|-----------|-------|-----|
|0|ABW|Aruba|North America|Caribbean|193.0| |103000|78.4|828.0|793.0|Aruba|Nonmetropolitan Territory of The Netherlands|Beatrix|129.0|AW|
|1|AFG|Afghanistan|Asia|Southern and Central Asia|652090.0|1919.0|22720000|45.9|5976.0| |Afganistan/Afqanestan|Islamic Emirate|Mohammad Omar|1.0|AF|
|2|AGO|Angola|Africa|Central Africa|1246700.0|1975.0|12878000|38.3|6648.0|7984.0|Angola|Republic|José Eduardo dos Santos|56.0|AO|
|3|AIA|Anguilla|North America|Caribbean|96.0| |8000|76.1|63.2| |Anguilla|Dependent Territory of the UK|Elisabeth II|62.0|AI|
|4|ALB|Albania|Europe|Southern Europe|28748.0|1912.0|3401200|71.6|3205.0|2500.0|Shqipëria|Republic|Rexhep Mejdani|34.0|AL|
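
The R section reads only part of a table and stresses closing the connection; the same two habits apply in Python. The sketch below shows them with a bound query parameter and `contextlib.closing`. So that it runs without network access, it swaps in the standard library's `sqlite3` with a hypothetical two-row stand-in for the `country` table rather than the course's MySQL server; `find_country()` is also a hypothetical helper.

```python
import sqlite3
from contextlib import closing


def find_country(conn, name):
    """Fetch rows whose Name matches, passing the value as a bound parameter."""
    with closing(conn.cursor()) as cur:
        # sqlite3 uses '?' placeholders; pymysql uses '%s' instead.
        cur.execute(
            "SELECT Code, Name, Continent FROM country WHERE Name = ?",
            (name,),
        )
        return cur.fetchall()


# closing() makes sure conn.close() runs even if a query raises an error.
with closing(sqlite3.connect(":memory:")) as conn:
    # Hypothetical stand-in data, not the real contents of the course database.
    conn.execute("CREATE TABLE country (Code TEXT, Name TEXT, Continent TEXT)")
    conn.executemany(
        "INSERT INTO country VALUES (?, ?, ?)",
        [("TWN", "Taiwan", "Asia"), ("ALB", "Albania", "Europe")],
    )
    rows = find_country(conn, "Taiwan")

print(rows)  # [('TWN', 'Taiwan', 'Asia')]
```

Binding the value instead of formatting it into the SQL string keeps the query safe when the filter comes from user input.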
