# Access MySQL with R

This notebook shows how to access a MySQL database when using R.

This notebook runs on R with Spark 2.0.

## Table of contents

1. [Setup](#Setup)
1. [Import the *RMySQL* library](#Import-the-RMySQL-library)
1. [Confirm that MySQL is running](#Confirm-that-MySQL-is-running)
1. [Identify and enter the database connection credentials](#Identify-and-enter-the-database-connection-credentials)
1. [Create the database connection](#Create-the-database-connection)
1. [Create a table](#Create-a-table)
1. [Insert data into a table](#Insert-data-into-a-table)
1. [Query data](#Query-data)
1. [Close the database connection](#Close-the-database-connection)
1. [Summary](#Summary)



## Setup

Before you begin, you need access to a *MySQL* database. MySQL is a widely used open-source relational database management system (RDBMS) built on a client–server model. To learn more, see the [MySQL website](https://www.mysql.com/).

You should have a MySQL instance installed and running in the cloud. You can use [Amazon RDS (Relational Database Service)](http://aws.amazon.com/rds/mysql/) to set up, operate, and scale a MySQL instance.  
__Note:__ If you are using Amazon RDS, make sure that the instance accepts connections from every IP address.


## Import the RMySQL library

__RMySQL__ is the R package that enables you to interact with MySQL (and MariaDB) databases. Run the commands below to install and import the RMySQL package:

In [1]:
install.packages("RMySQL")

Installing package into ‘/gpfs/global_fs01/sym_shared/YPProdSpark/user/s778-bfb6f75aebc10f-9bb95b1f072f/R/libs’
(as ‘lib’ is unspecified)


In [2]:
library(RMySQL)

Loading required package: DBI


## Confirm that MySQL is running

You can check whether MySQL is installed by opening a terminal and typing `mysql`. If you receive an error saying that MySQL cannot connect, MySQL is installed but not running.
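The same two checks can be roughed out from R itself. The sketch below is only an illustration: `is_port_open` is a hypothetical helper name, and an open port shows that something is listening there, not that it is MySQL.

```r
# Returns TRUE if a TCP connection to host:port opens within `timeout` seconds.
# is_port_open is a hypothetical helper, not part of RMySQL or DBI.
is_port_open <- function(host, port, timeout = 5) {
  tryCatch({
    con <- suppressWarnings(
      socketConnection(host = host, port = port, open = "r+",
                       blocking = TRUE, timeout = timeout)
    )
    close(con)
    TRUE
  }, error = function(e) FALSE)
}

# Sys.which() returns "" when a command is not found on the PATH.
client_installed <- unname(nzchar(Sys.which("mysql")))

# Example call (replace the host with your own server):
# is_port_open("localhost", 3306)
```

A `FALSE` result from `is_port_open` usually means the server is stopped or a firewall is blocking the port.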

## Identify and enter the database connection credentials

Connecting to a MySQL database requires the following information:
* Database name 
* Host DNS name or IP address 
* Host port
* User ID
* User password

All of this information is passed to the connection function in a subsequent step. Provide the MySQL connection information as shown:

In [3]:
# Enter the values for your database connection
dsn_database = " " # for example  "BLUDB"
dsn_hostname = " " # for example  "mydbinstance.cz6pjylrdjko.us-east-1.rds.amazonaws.com"
dsn_port = 3306    # default MySQL port; replace if yours differs (no quotation marks)
dsn_uid = " "      # for example  "user1"
dsn_pwd = " "      # for example  "7dBZ3jWt9xN6$o0JiX!m"
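If you would rather not type a password into the notebook, one option is to read the same five values from environment variables. This is a minimal sketch: the variable names (`MYSQL_DATABASE`, `MYSQL_HOST`, and so on) and the helper `read_mysql_settings` are assumptions made for illustration, not names that RMySQL looks for.

```r
# Hypothetical helper: collects the connection settings from environment
# variables and stops early if any of them is empty.
read_mysql_settings <- function() {
  settings <- list(
    database = Sys.getenv("MYSQL_DATABASE"),
    hostname = Sys.getenv("MYSQL_HOST"),
    port     = as.integer(Sys.getenv("MYSQL_PORT", unset = "3306")),
    uid      = Sys.getenv("MYSQL_USER"),
    pwd      = Sys.getenv("MYSQL_PASSWORD")
  )
  empty <- names(settings)[!nzchar(vapply(settings, as.character, character(1)))]
  if (length(empty) > 0) {
    stop("Missing settings: ", paste(empty, collapse = ", "), call. = FALSE)
  }
  settings
}

# Example use, once the variables are set in your environment:
# s <- read_mysql_settings()
# dsn_hostname <- s$hostname
```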

## Create the database connection

The following code snippet creates a connection object, `conn`. No database is selected at this point; that happens in the next step:

In [4]:
conn = dbConnect(MySQL(), user=dsn_uid, password=dsn_pwd, host=dsn_hostname, port=dsn_port)
conn

<MySQLConnection:0,0>
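A wrong host name, a closed port, or a bad password makes `dbConnect` fail with an error. The sketch below wraps the call so that the message names the server it tried to reach. `connect_mysql` is a hypothetical helper, and the argument names mirror the ones used in the cell above.

```r
# Hypothetical wrapper around dbConnect that adds the host and port to the
# error message when the connection attempt fails.
connect_mysql <- function(user, password, host, port = 3306) {
  tryCatch(
    DBI::dbConnect(RMySQL::MySQL(), user = user, password = password,
                   host = host, port = port),
    error = function(e) {
      stop("Could not connect to MySQL at ", host, ":", port, ": ",
           conditionMessage(e), call. = FALSE)
    }
  )
}

# Example use with the variables defined earlier:
# conn <- connect_mysql(dsn_uid, dsn_pwd, dsn_hostname, dsn_port)
```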

## Create a table

Create a test table named Cars. The code below creates the database if it does not exist, selects it, drops the Cars table if it already exists, and then creates the new table:

In [5]:
create_command <- paste("CREATE DATABASE IF NOT EXISTS", dsn_database, sep=" ");
use_command <- paste("USE", dsn_database, sep=" ");
dbSendQuery(conn, create_command);
dbSendQuery(conn, use_command);
dbSendQuery(conn, 'DROP TABLE IF EXISTS Cars')
dbSendQuery(conn, 'CREATE TABLE Cars(Id INTEGER PRIMARY KEY, Name VARCHAR(20), Price INT)')

<MySQLResult:78795528,0,0>

<MySQLResult:2,0,1>

<MySQLResult:88868600,0,2>

<MySQLResult:2,0,3>
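Each `dbSendQuery` call above returns a result object that stays open until it is cleared. For statements that return no rows, a small helper can release the result straight away. This is a sketch under that assumption. `run_statement` is a hypothetical name, and recent versions of DBI also provide `dbExecute`, which wraps similar steps.

```r
# Hypothetical helper: sends a statement, reports the number of affected rows,
# and always clears the result set, even if an error occurs.
run_statement <- function(conn, sql) {
  rs <- DBI::dbSendQuery(conn, sql)
  on.exit(DBI::dbClearResult(rs), add = TRUE)
  DBI::dbGetRowsAffected(rs)
}

# Example use with an open connection:
# run_statement(conn, "DROP TABLE IF EXISTS Cars")
```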

## Insert data into a table

Run the following commands to create records in the new Cars table:

In [6]:
dbSendQuery(conn,"INSERT INTO Cars VALUES(1,'Audi',52642)")
dbSendQuery(conn,"INSERT INTO Cars VALUES(2,'Mercedes',57127)")
dbSendQuery(conn,"INSERT INTO Cars VALUES(3,'Skoda',9000)")
dbSendQuery(conn,"INSERT INTO Cars VALUES(4,'Volvo',29000)")
dbSendQuery(conn,"INSERT INTO Cars VALUES(5,'Bentley',350000)")
dbSendQuery(conn,"INSERT INTO Cars VALUES(6,'Citroen',21000)")
dbSendQuery(conn,"INSERT INTO Cars VALUES(7,'Hummer',41400)")
dbSendQuery(conn,"INSERT INTO Cars VALUES(8,'Volkswagen',21600)")

<MySQLResult:2,0,4>

<MySQLResult:23975032,0,5>

<MySQLResult:0,0,6>

<MySQLResult:22598968,0,7>

<MySQLResult:22519656,0,8>

<MySQLResult:40242264,0,9>

<MySQLResult:1,0,10>

<MySQLResult:22397144,0,11>
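When the rows already live in an R data frame, the individual `INSERT` statements can be collapsed into one multi-row statement. The sketch below builds such a statement for the first three cars. It assumes that the names contain no quote characters, because no escaping is applied, and the `cars` data frame is introduced only for this example.

```r
# The first three rows from the statements above, held in a data frame.
cars <- data.frame(
  Id    = c(1L, 2L, 3L),
  Name  = c("Audi", "Mercedes", "Skoda"),
  Price = c(52642L, 57127L, 9000L),
  stringsAsFactors = FALSE
)

# sprintf() is vectorised, so this yields one "(Id,'Name',Price)" tuple per row.
tuples <- sprintf("(%d,'%s',%d)", cars$Id, cars$Name, cars$Price)

# Join the tuples into a single multi-row INSERT statement.
insert_sql <- paste0("INSERT INTO Cars VALUES ", paste(tuples, collapse = ","))

# Run it in place of the separate statements, with an open connection:
# dbSendQuery(conn, insert_sql)
```

Sending one statement instead of several saves a round trip to the server for each row.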

## Query data

You can now use the connection object `conn` to query the database:

In [7]:
query = "SELECT * FROM Cars";
rs = dbSendQuery(conn, query);
df = fetch(rs, -1);
df

| Id | Name | Price |
|---|---|---|
| 1 | Audi | 52642 |
| 2 | Mercedes | 57127 |
| 3 | Skoda | 9000 |
| 4 | Volvo | 29000 |
| 5 | Bentley | 350000 |
| 6 | Citroen | 21000 |
| 7 | Hummer | 41400 |
| 8 | Volkswagen | 21600 |
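Filtering and sorting can be pushed into the SQL statement instead of being done in R afterwards. The sketch below builds a query for cars above a price threshold. `build_price_query` is a hypothetical helper, and the threshold is coerced to an integer before it is placed in the query text.

```r
# Hypothetical helper: returns a query for cars priced above min_price,
# most expensive first.
build_price_query <- function(min_price) {
  sprintf("SELECT Name, Price FROM Cars WHERE Price > %d ORDER BY Price DESC",
          as.integer(min_price))
}

query <- build_price_query(30000)

# With an open connection, dbGetQuery sends the query, fetches all rows,
# and clears the result set in one call:
# expensive <- dbGetQuery(conn, query)
```

With the eight rows inserted earlier, this query should select Bentley, Mercedes, Audi, and Hummer, in that order.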


## Close the database connection

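It is good practice to close your database connection when your work is done. The warning shown below appears because the result sets returned by the earlier `dbSendQuery` calls were never released with `dbClearResult`: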

In [8]:
dbDisconnect(conn)

“Closing open result sets”
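To make sure the connection is closed even when a step fails partway through, the work can be wrapped in a function that registers the disconnect with `on.exit`. This is a sketch of that pattern rather than anything RMySQL requires. `with_connection` and its arguments are hypothetical names.

```r
# Hypothetical helper: opens a connection with connect(), runs work(conn),
# and disconnects afterwards whether or not work() raised an error.
with_connection <- function(connect, work, disconnect = DBI::dbDisconnect) {
  conn <- connect()
  on.exit(disconnect(conn), add = TRUE)
  work(conn)
}

# Example use with the variables defined earlier:
# cars <- with_connection(
#   function() dbConnect(MySQL(), user = dsn_uid, password = dsn_pwd,
#                        host = dsn_hostname, port = dsn_port,
#                        dbname = dsn_database),
#   function(conn) dbGetQuery(conn, "SELECT * FROM Cars")
# )
```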

## Summary

This notebook demonstrated how to connect to a MySQL database from R with the RMySQL package, create and populate a table, query it, and close the connection.

## Want to learn more?
### Free courses on <a href="https://bigdatauniversity.com/courses/?utm_source=tutorial-dashdb-python&utm_medium=github&utm_campaign=bdu/" rel="noopener noreferrer" target="_blank">Big Data University</a>: <a href="https://bigdatauniversity.com/courses/?utm_source=tutorial-dashdb-python&utm_medium=github&utm_campaign=bdu" rel="noopener noreferrer" target="_blank"><img src = "https://ibm.box.com/shared/static/xomeu7dacwufkoawbg3owc8wzuezltn6.png" width=600px> </a>

### Authors

**Saeed Aghabozorgi**, PhD, is a Data Scientist at IBM with a track record of developing enterprise-level applications that substantially increase clients' ability to turn data into actionable knowledge. He is a researcher in the data mining field and an expert in developing advanced analytic methods like machine learning and statistical modelling on large data sets.

**Polong Lin** is a Data Scientist at IBM in Canada. Under the Emerging Technologies division, Polong is responsible for educating the next generation of data scientists through Big Data University. Polong is a regular speaker at conferences and meetups, and holds an M.Sc. in Cognitive Psychology.

Copyright © 2016 Big Data University. This notebook and its source code are released under the terms of the <a href="https://bigdatauniversity.com/mit-license/" rel="noopener noreferrer" target="_blank">MIT License</a>.