# Reading from External Databases

Learn how to ingest data from external databases using SQLAlchemy, PyMySQL, pandas, and V3IO Frames.

- [Overview](#readextdb-overview)
  - [Creating an SQLAlchemy Engine](#readextdb-sqlalchemy-engine-create)
- [Initialization](#readextdb-init)
  - [Importing Packages](#readextdb-import-packages)
  - [Creating a Frames Client](#readextdb-create-frames-client)
- [Reading from a MySQL Database](#readextdb-mysql)
  - [Reading Data in Chunks Using SQLAlchemy](#readextdb-mysql-chunks-sqlalchemy)
  - [Reading a Table as a Bulk Operation Using PyMySQL](#readextdb-mysql-bulk-pymysql)
  - [Reading Data in Chunks Using PyMySQL](#readextdb-mysql-chunks-pymysql)
- [Reading from a PostgreSQL Database](#readextdb-postgresql)
- [Reading from an Oracle Database](#readextdb-oracle)
- [Reading from a Microsoft SQL Server Database](#readextdb-mssql)
- [Cleanup](#readextdb-cleanup)

<a id="readextdb-overview"></a>
## Overview

This tutorial walks you through the steps for

- <a id="readextdb-overview-db-connect"></a>**Connecting to an external database** using either of the following methods:
  - <a id="readextdb-sqlalchemy"></a>Creating an SQLAlchemy engine.
    [SQLAlchemy](https://www.sqlalchemy.org/) is a Python SQL toolkit and Object Relational Mapper, which gives application developers the full power and flexibility of SQL.
    For more information, see the [SQLAlchemy documentation](https://docs.sqlalchemy.org).
  - <a id="readextdb-pymysql"></a>Creating a database connection using the [PyMySQL](https://github.com/PyMySQL/PyMySQL) client library (**pymysql**).
    For more information, see the [PyMySQL documentation](https://pymysql.readthedocs.io/).


- <a id="readextdb-overview-load-data"></a>**Loading data from the external database** to pandas DataFrames.
  You can read from an external database using the pandas [`read_sql_query` method](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read_sql_query.html) or [`read_sql` wrapper](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read_sql.html) (which calls `read_sql_query` when called with an SQL query, and `read_sql_table` when called with a database table name).
  Both methods accept an SQL query object, a database object (either an SQLAlchemy engine or a PyMySQL connection), and an optional chunk-size parameter for chunked reads (see more information in the [MySQL](#readextdb-mysql) section).


- <a id="readextdb-overview-ingest-to-platform"></a>**Ingesting the data** into the Iguazio Data Science Platform (**"the platform"**) using V3IO Frames (**"Frames"**).
  [Frames](https://github.com/v3io/frames) is a multi-model open-source data-access library, developed by Iguazio, which provides a unified high-performance DataFrame API for working with data in the platform's data store.
  Frames currently supports the NoSQL (key-value) and time-series (TSDB) data models via its `nosql`|`kv` and `tsdb` backends.
  The tutorial demonstrates how to use Frames to write the database data from the pandas DataFrames to the platform's data containers, thus persisting the data in the platform's data store.
  The tutorial also shows how to use Frames to read data from the platform's data store into pandas DataFrames.
  For more information about Frames, see the [**frames**](frames.ipynb) tutorial.

<a id="readextdb-sqlalchemy-engine-create"></a>
### Creating an SQLAlchemy Engine

Most of the examples in this tutorial use SQLAlchemy to connect to the external database.
This requires calling the SQLAlchemy `create_engine` function to create an `Engine` object for a specific database URL:
```python
engine = create_engine('<database URL>')
```

The typical form of a database URL that's used by the SQLAlchemy engine is `<dialect>+<driver>://<username>:<password>@<host>:<port>/<database>`.
The engine creates a `Dialect` object that's tailored towards a database instance (`DB_INSTANCE`) for the specified database URL.
The engine also creates a `Pool` object for establishing a DBAPI connection at the database IP address (`<host>:<port>` &mdash; `DB_HOST:DB_PORT`) when a connection request is first received.
The connection to the DBAPI is done using the driver whose name is specified in the database URL (`<driver>`); the driver name can be omitted to use the default driver for the specified dialect.

The following example creates a MySQL SQLAlchemy engine for the database URL `mysql+pymysql://scott:tiger@localhost/foo`:
```python
engine = create_engine('mysql+pymysql://scott:tiger@localhost/foo')
```

For more information, see the SQLAlchemy [Engine Configuration](https://docs.sqlalchemy.org/en/latest/core/engines.html#engine-configuration) documentation, and especially the [Database URLs](https://docs.sqlalchemy.org/en/latest/core/engines.html#database-urls) section.
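
As a minimal sketch of the URL form described above, the following snippet uses only the Python standard library to assemble a database URL and to split its scheme into the dialect and driver parts. The helper names `build_db_url` and `split_scheme`, the sample password, and port 3306 are illustrative assumptions and not part of SQLAlchemy. Percent-encoding the credentials matters when a password contains characters such as `@` or `/`, which would otherwise be read as URL delimiters.

```python
from urllib.parse import quote_plus, urlsplit


def build_db_url(dialect, username, password, host, port, database, driver=None):
    """Assemble <dialect>+<driver>://<username>:<password>@<host>:<port>/<database>."""
    # The driver part is optional; without it the dialect's default driver is used.
    scheme = f"{dialect}+{driver}" if driver else dialect
    # Percent-encode the credentials so that characters such as '@' or '/'
    # aren't mistaken for URL delimiters.
    credentials = quote_plus(username)
    if password:
        credentials += ":" + quote_plus(password)
    return f"{scheme}://{credentials}@{host}:{port}/{database}"


def split_scheme(db_url):
    """Return (dialect, driver); driver is None when the URL omits it."""
    dialect, _, driver = urlsplit(db_url).scheme.partition("+")
    return dialect, driver or None


# Hypothetical credentials, for illustration only
url = build_db_url("mysql", "scott", "p@ss/word", "localhost", 3306, "foo", driver="pymysql")
print(url)
print(split_scheme(url))
```

A string built this way can be passed to `create_engine` in the same way as the literal URLs used in this tutorial.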

<a id="readextdb-init"></a>
## Initialization

Start out by performing initialization steps that are required for executing all code examples in this tutorial &mdash; importing required packages and creating a Frames client.

<a id="readextdb-import-packages"></a>
### Importing Packages

Import Python packages used in this tutorial.

In [1]:
import os
import pandas as pd
import v3io_frames as v3f
from sqlalchemy.engine import create_engine

<a id="readextdb-create-frames-client"></a>
### Creating a Frames Client

Create a Frames `Client` object (named `client`), which references data in the platform's "users" data container.<br>
(User authentication is done by using the platform access key that's stored in the `V3IO_ACCESS_KEY` environment variable, which is automatically defined for the platform's Jupyter Notebook service.)

In [2]:
%time
client = v3f.Client('framesd:8081', container='users')

CPU times: user 4 µs, sys: 0 ns, total: 4 µs
Wall time: 7.87 µs


<a id="readextdb-mysql"></a>
## Reading from a MySQL Database

This section demonstrates different ways of ingesting data from an external MySQL database into the platform's data store.
All the examples use a pandas SQL query to read data from a MySQL database into pandas DataFrames, and Frames to write the data to the platform's NoSQL store and then read it back for verification.
The examples differ in the methods used for performing the read and write operations:

- <a id="readextdb-mysql-read-chunks"></a>**Chunked reads ("data streaming")** &mdash; the [first](#readextdb-mysql-chunks-sqlalchemy) and [third](#readextdb-mysql-chunks-pymysql) examples both read the data from the MySQL database into a pandas DataFrames iterator in chunks, essentially streaming the data.
  One example connects to the database using an SQLAlchemy engine and the other using a PyMySQL connection.
  Many pandas data engines (including SQL, CSV, and the Frames NoSQL backend) support data chunking.
  With chunking, the driver forms a continuous iterator for reading or writing data sequentially in chunks.
  Working in chunks is a common approach for handling data sets that are too large to fit into the available memory as a single object, such as one pandas DataFrame.
  Chunked reads are performed by setting the `chunksize` parameter of the pandas SQL-query method (`read_sql_query` or `read_sql`) to the number of rows (items) to read in each chunk (the chunk size).
  When this parameter is set, the read method returns the data in a pandas DataFrames iterator.
  This iterator can be passed as-is to a DataFrame write command, such as the Frames `write` method.
- <a id="readextdb-mysql-read-bulk"></a>**Bulk reads** &mdash; the [second example](#readextdb-mysql-bulk-pymysql) reads a table from the MySQL database into a single pandas DataFrame as a bulk operation.
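
The difference between the bulk and chunked reads described above can be sketched with a small, self-contained example. It uses an in-memory SQLite database from the Python standard library as a stand-in for the external database; the table contents are made up for illustration, and only the `family` table name and two column names are borrowed from this tutorial.

```python
import sqlite3

import pandas as pd

# In-memory SQLite stand-in for the external database (assumption for illustration)
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE family (rfam_acc TEXT, rfam_id TEXT)")
conn.executemany(
    "INSERT INTO family VALUES (?, ?)",
    [(f"RF{i:05d}", f"family_{i}") for i in range(1, 8)],  # 7 made-up rows
)
conn.commit()

query = "SELECT rfam_acc, rfam_id FROM family"

# Bulk read: without chunksize, the whole result is returned as one DataFrame.
bulk_df = pd.read_sql_query(query, conn)

# Chunked read: with chunksize, the call returns an iterator of DataFrames,
# each holding at most `chunksize` rows.
chunk_lengths = [len(chunk) for chunk in pd.read_sql_query(query, conn, chunksize=3)]

conn.close()
print(len(bulk_df), chunk_lengths)
```

With seven rows and a chunk size of three, the chunked read yields DataFrames of 3, 3, and 1 rows, while the bulk read returns all seven rows at once.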

All the examples use a public MySQL database named [Rfam](https://rfam.readthedocs.io/en/latest/database.html); the database URL is `mysql+pymysql://rfamro@mysql-rfam-public.ebi.ac.uk:4497/Rfam`.

<a id="readextdb-mysql-init"></a>
### Initialization

<a id="readextdb-mysql-create-query-obj"></a>
#### Create an SQL Query Object

Start out by creating a query object that identifies the data to download from the database.
This object is used in all of the MySQL database examples.

In [3]:
%time

mysql_query = 'select rfam_acc,rfam_id,auto_wiki,description,author,seed_source FROM family'

CPU times: user 3 µs, sys: 1 µs, total: 4 µs
Wall time: 9.3 µs


<a id="readextdb-mysql-create-pymysql-db-connection"></a>
#### Create a PyMySQL Connection to the MySQL Database

Import the PyMySQL Python MySQL client library (**pymysql**), installing it first if it isn't already available in your environment, and then use the library's `connect` method to create a database connection to the MySQL database.
This step is required for the examples that don't use the SQLAlchemy engine.

> **AWS Note:** If you're running the code in an AWS cloud with the persisted data and software package for eventual consistency, note that the creation of the database connection might take a bit of time to complete.

In [4]:
import pymysql
%time

conn = pymysql.connect(
    host=os.getenv('DB_HOST','mysql-rfam-public.ebi.ac.uk'),
    port=4497,
    user=os.getenv('DB_USER','rfamro'),
    password=os.getenv('DB_PASSWORD',''),  # `passwd` is a legacy alias of `password`
    database=os.getenv('DB_NAME','Rfam'),  # `db` is a legacy alias of `database`
    charset='utf8mb4')

CPU times: user 2 µs, sys: 0 ns, total: 2 µs
Wall time: 5.72 µs
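
If you want to keep the connection settings in one place, a small helper along the following lines is one option. This is only a sketch: the `mysql_connection_settings` function name and the `DB_PORT` environment variable are assumptions made for illustration (the connection cell above hard-codes the port), and the defaults are the public Rfam values used in this tutorial.

```python
import os


def mysql_connection_settings(env=os.environ):
    """Collect connection settings, falling back to the tutorial's public Rfam defaults."""
    return {
        "host": env.get("DB_HOST", "mysql-rfam-public.ebi.ac.uk"),
        # Environment values are strings, so convert the port to an int.
        # DB_PORT is a hypothetical variable name.
        "port": int(env.get("DB_PORT", "4497")),
        "user": env.get("DB_USER", "rfamro"),
        "password": env.get("DB_PASSWORD", ""),
        "database": env.get("DB_NAME", "Rfam"),
        "charset": "utf8mb4",
    }


# Override only the port; all other settings fall back to the defaults
settings = mysql_connection_settings({"DB_PORT": "3307"})
print(settings["host"], settings["port"])
```

The resulting dictionary can be unpacked into the connection call, as in `pymysql.connect(**mysql_connection_settings())`.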


<a id="readextdb-mysql-chunks-sqlalchemy"></a>
### Reading Data in Chunks Using SQLAlchemy

This example reads data from a MySQL database into a pandas DataFrame iterator, in chunks, using an SQLAlchemy database engine; writes the DataFrames to a platform NoSQL table, in chunks, using Frames; and reads from the NoSQL table using Frames.

<a id="readextdb-mysql-chunks-sqlalchemy-create-engine"></a>
#### Create a MySQL SQLAlchemy Engine

Create an SQLAlchemy engine using the [MySQL dialect](https://docs.sqlalchemy.org/en/latest/core/engines.html#mysql).

The MySQL dialect uses `mysqlclient` (a maintained fork of `MySQL-Python`) as the default DBAPI.
However, the following example sets the driver in the database URL to `pymysql` to use the PyMySQL DBAPI (database URL = `mysql+pymysql://rfamro@mysql-rfam-public.ebi.ac.uk:4497/Rfam`).

In [5]:
import pymysql
%time

mysql_URL = 'mysql+pymysql://rfamro@mysql-rfam-public.ebi.ac.uk:4497/Rfam'
engine = create_engine(mysql_URL)

CPU times: user 3 µs, sys: 0 ns, total: 3 µs
Wall time: 5.25 µs


<a id="readextdb-mysql-chunks-sqlalchemy-read-from-db"></a>
#### Read from the MySQL Database in Chunks ("Streaming")

Use the pandas `read_sql` function with the query and SQLAlchemy engine objects that you created in the previous steps to read data from the MySQL database, in chunks ("streaming"), into a pandas DataFrames iterator.

In [6]:
%time

CHUNK_SIZE = 100000

all_df = pd.read_sql(mysql_query,engine,chunksize = CHUNK_SIZE)

CPU times: user 3 µs, sys: 0 ns, total: 3 µs
Wall time: 5.72 µs


<a id="readextdb-mysql-chunks-sqlalchemy-frames-write"></a>
#### Write the pandas DataFrames to the Platform's NoSQL Store Using Frames

Use Frames to write the pandas DataFrames that were read in the previous step to the platform's NoSQL store as individual data chunks.

In [7]:
%time

mysql_table = os.path.join(os.getenv('V3IO_USERNAME'), 'examples', 'family')
backend = 'kv'

for df in all_df:
    df = df.reset_index()
    out = client.write(backend,mysql_table, df)

CPU times: user 3 µs, sys: 1e+03 ns, total: 4 µs
Wall time: 5.72 µs
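
Table paths in this tutorial are built from the `V3IO_USERNAME` environment variable. The following is a minimal sketch of a reusable helper for this step; the `user_table_path` name and the `demo_user` value are hypothetical. The helper fails with a clear message when the variable is unset, and it uses `posixpath.join` on the assumption that `/` is the separator wanted in the table path.

```python
import os
import posixpath


def user_table_path(*parts, username=None):
    """Join a table path under the user's directory, such as '<username>/examples/family'."""
    username = username or os.getenv("V3IO_USERNAME")
    if not username:
        # os.getenv returns None for an unset variable, and None + str raises a TypeError.
        raise RuntimeError("V3IO_USERNAME is not set")
    # posixpath.join always uses '/' separators, regardless of the local OS.
    return posixpath.join(username, *parts)


# "demo_user" is a made-up username for illustration
print(user_table_path("examples", "family", username="demo_user"))
```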


<a id="readextdb-mysql-chunks-sqlalchemy-read-from-platform"></a>
#### Read from the Platform's NoSQL Store Using Frames

Use Frames to read the platform NoSQL table that you created in the previous step.

In [8]:
%time

client.read(backend, mysql_table)

CPU times: user 4 µs, sys: 0 ns, total: 4 µs
Wall time: 8.34 µs


idx,author,auto_wiki,description,index,rfam_acc,rfam_id,seed_source
2787,Argasinska J,2554,Bacillus sRNA 1,2787,RF02888,BtsR1,Argasinska J
366,Moxon SJ,1644,Insertion sequence IS1222 ribosomal frameshift...,366,RF00383,IS1222_FSE,PMID:15126494
362,Moxon SJ,2397,ydaO/yuaA leader,362,RF00379,ydaO-yuaA,"Barrick JE, Breaker RR"
1915,Boursnell C,1287,microRNA mir-1253,1915,RF02006,mir-1253,Predicted; ClustalW2
881,Wilkinson A,1287,microRNA mir-582,881,RF00927,mir-582,miRBase; Wilkinson A
...,...,...,...,...,...,...,...
2305,"Osuch I, Eberhardt R",1269,Pseudomonas sRNA P34,2305,RF02405,P34,INFERNAL
997,Gardner PP,1322,Saccharomyces telomerase,997,RF01050,Sacc_telomerase,Published; PMID:15242611
2327,Eberhardt R,2341,Streptococcus sRNA SpF10,2327,RF02427,SpF10_sRNA,Eberhardt R
1284,Wilkinson A,1265,CRISPR RNA direct repeat element,1284,RF01352,CRISPR-DR43,Predicted; WAR; Wilkinson A


<a id="readextdb-mysql-bulk-pymysql"></a>
### Reading a Table as a Bulk Operation Using PyMySQL

This example reads a table from the MySQL database into a pandas DataFrame as a bulk operation, using a PyMySQL database connection; writes the contents of the DataFrame to a platform NoSQL table using Frames; and reads from the NoSQL table using Frames.

<a id="readextdb-mysql-bulk-pymysql-read-from-db"></a>
#### Read from the Database as a Bulk Operation

Use the pandas `read_sql_query` function with the query and PyMySQL database-connection objects that you created in the previous steps to read a table from the MySQL database, as a bulk operation, into a pandas DataFrame.

In [9]:
%time

# Read from the external database
df = pd.read_sql_query(mysql_query, conn) 

CPU times: user 2 µs, sys: 0 ns, total: 2 µs
Wall time: 5.72 µs


In [10]:
%time

# Display the last 10 rows (items) of the read DataFrame
df.tail(10)

CPU times: user 3 µs, sys: 0 ns, total: 3 µs
Wall time: 5.72 µs


Unnamed: 0,rfam_acc,rfam_id,auto_wiki,description,author,seed_source
3014,RF03116,aCoV-5UTR,2707,Alphacoronavirus 5'UTR,Lamkiewicz K,Lamkiewicz K
3015,RF03117,bCoV-5UTR,2707,Betacoronavirus 5'UTR,Lamkiewicz K,Lamkiewicz K
3016,RF03118,gCoV-5UTR,2707,Gammacoronavirus 5'UTR,Lamkiewicz K,Lamkiewicz K
3017,RF03119,dCoV-5UTR,2707,Deltacoronavirus 5'UTR,Lamkiewicz K,Lamkiewicz K
3018,RF03120,Sarbecovirus-5UTR,2707,Sarbecovirus 5'UTR,Lamkiewicz K,Lamkiewicz K
3019,RF03121,aCoV-3UTR,2708,Alphacoronavirus 3'UTR,Lamkiewicz K,Kevin Lamkiewicz
3020,RF03122,bCoV-3UTR,2708,Betacoronavirus 3'UTR,Lamkiewicz K,Lamkiewicz K
3021,RF03123,gCoV-3UTR,2708,Gammacoronavirus 3'UTR,Lamkiewicz K,Lamkiewicz K
3022,RF03124,dCoV-3UTR,2708,Deltacoronavirus 3'UTR,Lamkiewicz K,Lamkiewicz K
3023,RF03125,Sarbecovirus-3UTR,2708,Sarbecovirus 3'UTR,Lamkiewicz K,Lamkiewicz K


<a id="readextdb-mysql-bulk-pymysql-write-to-platform"></a>
#### Write the Table to the Platform's NoSQL Store Using Frames

Use Frames to write the pandas DataFrame that was read in the previous step to the platform's NoSQL store.

In [11]:
%time

mysql_table_1 = os.path.join(os.getenv('V3IO_USERNAME'), 'examples', 'family1')

client.write(backend, mysql_table_1, df)

CPU times: user 3 µs, sys: 0 ns, total: 3 µs
Wall time: 5.48 µs


<a id="readextdb-mysql-bulk-pymysql-read-from-platform"></a>
#### Read from the Platform's NoSQL Store Using Frames

Use Frames to read the platform NoSQL table that you created in the previous step.

In [12]:
%time

client.read(backend, mysql_table_1)

CPU times: user 3 µs, sys: 0 ns, total: 3 µs
Wall time: 7.39 µs


idx,author,auto_wiki,description,rfam_acc,rfam_id,seed_source
2852,Weinberg Z,2601,DUF3577 RNA,RF02954,DUF3577,Weinberg Z
2433,Eberhardt R,2348,Hepatitis A virus (HAV) cis-acting replication...,RF02533,HAV_CRE,PMID:18684812
267,Moxon SJ,1550,Small nucleolar RNA SNORD49,RF00277,SNORD49,"Moxon SJ, INFERNAL"
825,Wilkinson A,1287,microRNA mir-BART17,RF00863,mir-BART17,miRBase; Wilkinson A
2640,Argasinska J,2518,Soft rot Enterobacteriaceae Fwd 6 3'UTR,RF02740,Fwd6_3p_UTR,Argasinska J
...,...,...,...,...,...,...
1814,Boursnell C,1287,microRNA mir-142,RF01896,mir-142,Predicted; ClustalW2
2818,"Corbino K, Weinberg Z",2571,Fusobacteriales-1 RNA,RF02920,Fusobacteriales-1,Weinberg Z
2682,Argasinska J,2437,CpoB/ybgF ICR thermometer,RF02782,CpoB_ybgF_thermometer,Argasinska J
2323,Eberhardt R,2451,Burkholderia sRNA Bp1_Cand871_SIPHT,RF02423,Bp1_781,Predicted; CMfinder


<a id="readextdb-mysql-chunks-pymysql"></a>
### Reading Data in Chunks Using PyMySQL

This example reads data from a MySQL database into a pandas DataFrame iterator in chunks, using a PyMySQL database connection; writes the DataFrames iterator to a platform NoSQL table using Frames; and reads from the NoSQL table using Frames.

<a id="readextdb-mysql-chunks-pymysql-read-from-db"></a>
#### Read from the MySQL Database in Chunks ("Streaming")

Use the pandas `read_sql` function with the query and PyMySQL database-connection objects that you created in the previous steps to read data from the MySQL database, in chunks ("streaming"), into a pandas DataFrames iterator.

In [13]:
%time

mysql_table_2 = os.path.join(os.getenv('V3IO_USERNAME'), 'examples', 'family2')
CHUNK_SIZE = 1000

dfIterator = pd.read_sql(mysql_query, conn, chunksize=CHUNK_SIZE)

CPU times: user 2 µs, sys: 1e+03 ns, total: 3 µs
Wall time: 5.72 µs


<a id="readextdb-mysql-chunks-pymysql-write-to-platform"></a>
#### Write the pandas DataFrames Iterator to the Platform's NoSQL Store Using Frames

Use Frames to write the pandas DataFrames iterator that was read in the previous step to the platform's NoSQL store.

In [14]:
client.write(backend, mysql_table_2, dfIterator)
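
If each chunk needs a small transformation before it's written, one option is to wrap the DataFrames iterator in a generator so that the data still streams chunk by chunk. The following sketch adds a running row number across chunks. The `with_row_numbers` name and the `row_num` column are hypothetical, the two small DataFrames stand in for the iterator returned by a chunked read, and whether the Frames `write` method accepts such a generator in place of the original iterator is an assumption that you should verify in your environment.

```python
import pandas as pd


def with_row_numbers(chunks):
    """Lazily add a running 'row_num' column to each DataFrame chunk."""
    offset = 0
    for chunk in chunks:
        chunk = chunk.copy()  # avoid modifying the caller's DataFrame
        chunk["row_num"] = range(offset, offset + len(chunk))
        offset += len(chunk)
        yield chunk


# Two made-up chunks standing in for the result of a chunked read
fake_chunks = iter([
    pd.DataFrame({"rfam_id": ["a", "b"]}),
    pd.DataFrame({"rfam_id": ["c"]}),
])

# Concatenate only to inspect the result; a real pipeline would consume the
# generator chunk by chunk instead of materializing all the data.
result = pd.concat(with_row_numbers(fake_chunks), ignore_index=True)
print(result["row_num"].tolist())
```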

<a id="readextdb-mysql-chunks-pymysql-read-from-platform"></a>
#### Read from the Platform's NoSQL Store Using Frames

Use Frames to read the platform NoSQL table that you created in the previous step.

In [15]:
%time

client.read(backend, mysql_table_2)

CPU times: user 3 µs, sys: 0 ns, total: 3 µs
Wall time: 5.48 µs


idx,author,auto_wiki,description,rfam_acc,rfam_id,seed_source
1529,Osuch I,1264,small nucleolar RNA snoR23,RF01598,snoR23,INFERNAL
496,Barrick JE,1258,ybhL leader,RF00520,ybhL,Barrick JE
320,"Moxon SJ, Daub J",1600,Small nucleolar RNA SNORA3/SNORA45 family,RF00334,SNORA3,INFERNAL
851,Wilkinson A,1986,microRNA MIR854,RF00893,MIR854,miRBase; Wilkinson A
1368,Osuch I,1264,small nucleolar RNA snR10,RF01437,S_pombe_snR10,INFERNAL
...,...,...,...,...,...,...
1052,Wilkinson A,1651,Pseudoknot of upstream pseudoknot domain (UPD)...,RF01111,SBWMV2_UPD-PKk,Pseudobase
1804,Eberhardt R,2228,Heat shock gene hsromega conserved region 1,RF01885,HSR-omega_1,Predicted; WAR; Eberhardt R
1015,Wilkinson A,1651,Pseudoknot of upstream pseudoknot domain (UPD)...,RF01072,TMV_UPD-PK3,Pseudobase
368,Moxon SJ,1646,Infectious bronchitis virus D-RNA,RF00385,IBV_D-RNA,PMID:11119581


<a id="readextdb-postgresql"></a>
## Reading from a PostgreSQL Database

This example uses a pandas `read_sql` query with an SQLAlchemy engine for the [PostgreSQL dialect](https://docs.sqlalchemy.org/en/latest/core/engines.html#postgresql) to load data from an external PostgreSQL database into a pandas DataFrame; and then uses Frames to write the data to the platform's NoSQL store and read it back for verification.

The PostgreSQL dialect uses `psycopg2` as the default DBAPI, so you can optionally omit the name of the DBAPI driver from the database URL that you pass in the `create_engine` call.
For example:
```python
# Default DBAPI
engine = create_engine('postgresql://scott:tiger@localhost/mydatabase')

# DBAPI = psycopg2
engine = create_engine('postgresql+psycopg2://scott:tiger@localhost/mydatabase')
```

> **Note:** Before you run the code, you must edit the definition of the database-URL variable (`postgresql_URL`) to use a valid PostgreSQL database.
> You can also optionally edit the SQL query (`postgresql_query`) to use a custom read filter.

<a id="readextdb-postgresql-sqlalchemy-create-engine"></a>
#### Create a PostgreSQL SQLAlchemy Engine

Create an SQLAlchemy engine using the PostgreSQL dialect.

In [None]:
import psycopg2

# !! TODO !! Replace the <...> placeholders in the database URL
postgresql_URL = 'postgresql+psycopg2://<username>:<password>@<host>:<port>/<database>'
engine = create_engine(postgresql_URL)

<a id="readextdb-postgresql-sqlachemy-read-from-db"></a>
#### Read from the PostgreSQL Database

Create an SQL query object and use the pandas `read_sql` function with this object and the SQLAlchemy engine object that you created in the previous step to read data from the PostgreSQL database into a pandas DataFrame.

In [None]:
# Create an SQL query object
postgresql_query = 'select * from table'

In [None]:
# Read from the external database
df = pd.read_sql(postgresql_query,engine)

In [None]:
# Display the contents of the read DataFrame
print(df)

<a id="readextdb-postgresql-sqlalchemy-write-to-platform"></a>
#### Write the Data to the Platform's NoSQL Store Using Frames

Use Frames to write the pandas DataFrame that was read in the previous step to the platform's NoSQL store.

In [None]:
postgresql_table = os.path.join(os.getenv('V3IO_USERNAME'), 'examples', 'postgresql_table')
backend = 'kv'
client.write(backend, postgresql_table, df)

<a id="readextdb-postgresql-sqlalchemy-read-from-platform"></a>
#### Read from the Platform's NoSQL Store Using Frames

Use Frames to read the platform NoSQL table that you created in the previous step.

In [None]:
client.read(backend, postgresql_table)

<a id="readextdb-oracle"></a>
## Reading from an Oracle Database

This example uses a pandas `read_sql` query with an SQLAlchemy engine for the [Oracle dialect](https://docs.sqlalchemy.org/en/latest/core/engines.html#oracle) to load data from an external Oracle database into a pandas DataFrame; and then uses Frames to write the data to the platform's NoSQL store and read it back for verification.

The Oracle dialect uses `cx_oracle` as the default DBAPI, so you can optionally omit the name of the DBAPI driver from the database URL that you pass in the `create_engine` call.
For example:
```python
# Default DBAPI
engine = create_engine('oracle://scott:tiger@localhost/mydatabase')

# DBAPI = cx_oracle
engine = create_engine('oracle+cx_oracle://scott:tiger@localhost/mydatabase')
```

> **Note:** Before you run the code, you must edit the definition of the database-URL variable (`oracle_URL`) to use a valid Oracle database.
> You can also optionally edit the SQL query (`oracle_query`) to use a custom read filter.

<a id="readextdb-oracle-sqlalchemy-create-engine"></a>
#### Create an Oracle SQLAlchemy Engine

Create an SQLAlchemy engine using the Oracle dialect.

In [None]:
import cx_Oracle  # The Python module name is case sensitive

# !! TODO !! Replace the <...> placeholders in the database URL
oracle_URL = 'oracle://<username>:<password>@<host>:<port>/<database>'
engine = create_engine(oracle_URL)

<a id="readextdb-oracle-sqlachemy-read-from-db"></a>
#### Read from the Oracle Database

Create an SQL query object and use the pandas `read_sql` function with this object and the SQLAlchemy engine object that you created in the previous step to read data from the Oracle database into a pandas DataFrame.

In [None]:
# Create an SQL query object
oracle_query = 'select * from table'

In [None]:
# Read from the external database
df = pd.read_sql(oracle_query,engine)

In [None]:
# Display the contents of the read DataFrame
print(df)

<a id="readextdb-oracle-sqlalchemy-write-to-platform"></a>
#### Write the Data to the Platform's NoSQL Store Using Frames

Use Frames to write the pandas DataFrame that was read in the previous step to the platform's NoSQL store.

In [None]:
oracle_table = os.path.join(os.getenv('V3IO_USERNAME'), 'examples', 'oracle_table')
backend = 'kv'
client.write(backend, oracle_table, df)

<a id="readextdb-oracle-sqlalchemy-read-from-platform"></a>
#### Read from the Platform's NoSQL Store Using Frames

Use Frames to read the platform NoSQL table that you created in the previous step.

In [None]:
client.read(backend, oracle_table)

<a id="readextdb-mssql"></a>
## Reading from a Microsoft SQL Server Database

This example uses a pandas `read_sql` query with an SQLAlchemy engine for the [Microsoft SQL Server dialect](https://docs.sqlalchemy.org/en/latest/core/engines.html#microsoft-sql-server) to load data from an external SQL Server database into a pandas DataFrame; and then uses Frames to write the data to the platform's NoSQL store and read it back for verification.

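The Microsoft SQL Server dialect uses `pyodbc` as the default DBAPI, which is the driver that SQLAlchemy selects when you omit the driver name from the database URL that you pass in the `create_engine` call.
This tutorial uses the `pymssql` DBAPI instead, so the driver name is set explicitly in the database URL.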
For example:
```python
# Default DBAPI
engine = create_engine('mssql://scott:tiger@localhost/mydatabase')

# DBAPI = pymssql
engine = create_engine('mssql+pymssql://scott:tiger@localhost/mydatabase')
```

> **Note:** Before you run the code, you must edit the definition of the database-URL variable (`mssql_URL`) to use a valid Microsoft SQL Server database.
> You can also optionally edit the SQL query (`mssql_query`) to use a custom read filter.

<a id="readextdb-mssql-sqlalchemy-create-engine"></a>
#### Create a Microsoft SQL Server SQLAlchemy Engine

Create an SQLAlchemy engine using the Microsoft SQL Server dialect.

In [None]:
import pymssql

# !! TODO !! Replace the <...> placeholders in the database URL
mssql_URL = 'mssql+pymssql://<username>:<password>@<host>:<port>/<database>'
engine = create_engine(mssql_URL)

<a id="readextdb-mssql-sqlachemy-read-from-db"></a>
#### Read from the SQL Server Database

Create an SQL query object and use the pandas `read_sql` function with this object and the SQLAlchemy engine object that you created in the previous step to read data from the Microsoft SQL Server database into a pandas DataFrame.

In [None]:
# Create an SQL query object
mssql_query = 'select * from table'

In [None]:
# Read from the external database
df = pd.read_sql(mssql_query,engine)

In [None]:
# Display the contents of the read DataFrame
print(df)

<a id="readextdb-mssql-sqlalchemy-write-to-platform"></a>
#### Write the Data to the Platform's NoSQL Store Using Frames

Use Frames to write the pandas DataFrame that was read in the previous step to the platform's NoSQL store.

In [None]:
mssql_table = os.path.join(os.getenv('V3IO_USERNAME'), 'examples', 'mssql_table')
backend = 'kv'
client.write(backend, mssql_table, df)

<a id="readextdb-mssql-sqlalchemy-read-from-platform"></a>
#### Read from the Platform's NoSQL Store Using Frames

Use Frames to read the platform NoSQL table that you created in the previous step.

In [None]:
client.read(backend, mssql_table)

<a id="readextdb-cleanup"></a>
## Cleanup

Use the Frames `delete` client method to delete data that was ingested into the platform's NoSQL store as part of this tutorial.

In [16]:
from v3io_frames import frames_pb2 as fpb

In [17]:
client.delete(backend, mysql_table, if_missing=fpb.IGNORE)

In [18]:
client.delete(backend, mysql_table_1, if_missing=fpb.IGNORE)

In [19]:
client.delete(backend, mysql_table_2, if_missing=fpb.IGNORE)

In [None]:
client.delete(backend, postgresql_table, if_missing=fpb.IGNORE)

In [None]:
client.delete(backend, oracle_table, if_missing=fpb.IGNORE)

In [None]:
client.delete(backend, mssql_table, if_missing=fpb.IGNORE)