In [None]:
# Upgrade Oracle ADS to pick up latest features and maintain compatibility with Oracle Cloud Infrastructure.

!pip install -U oracle-ads

Oracle Data Science service sample notebook.

Copyright (c) 2020, 2022 Oracle, Inc. All rights reserved. Licensed under the [Universal Permissive License v 1.0](https://oss.oracle.com/licenses/upl).

---

# <font color="red">Introduction to SQL Magic</font>
<p style="margin-left:10%; margin-right:10%;">by the <font color="teal">Oracle Cloud Infrastructure Data Science Service.</font></p>

---

# Overview:

This notebook demonstrates how to use SQL Magic to work with a database. Magic commands are a set of functions that are not valid Python code but can be run in Jupyter notebooks. There are two types of magic commands: line magics and cell magics. Line magics start with `%` and operate on a single line of input. Cell magics start with `%%` and operate on multiple lines in a cell.

The IPython SQL magic extension (`ipython-sql`) lets you write SQL queries directly in Jupyter notebook cells.

Compatible conda pack: [General Machine Learning](https://docs.oracle.com/en-us/iaas/data-science/using/conda-gml-fam.htm) for CPU on Python 3.8 (version 1.0)

## Contents:

 - <a href='#setup'>Setting Up `ipython-sql`</a>
 - <a href='#sql_DML'>Data Manipulation Language Commands</a>
 - <a href='#sql_DQL'>Data Query Language Commands</a>
 - <a href='#sql_var'>Variable Bindings</a>
 - <a href='#sql_viz'>Data Visualization</a>
 - <a href='#reference'>References</a>

---


Datasets are provided as a convenience.  Datasets are considered third-party content and are not considered materials 
under your agreement with Oracle.

---


In [None]:
import pandas as pd
from IPython import get_ipython

<a id='setup'></a>
# Setting Up `ipython-sql`

`ipython-sql` uses a number of IPython magic commands to interact directly with the database. The following sections cover these magic commands:
* `%config SqlMagic`
* `%load_ext sql`
* `%sql`
* `%%sql`

In the following cell, the `ipython-sql` package is loaded with the magic command `%load_ext sql`. Note that it is not loaded with an `import` statement. `ipython-sql` supports a variety of databases. The command `%sql sqlite://` makes a connection to an in-memory SQLite database. This database is now bound to the notebook, and future `%sql` commands are run against it.

In [None]:
%load_ext sql
%sql sqlite://
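
As a rough point of comparison, an in-memory SQLite database can also be opened with Python's standard-library `sqlite3` module. This sketch is not what `ipython-sql` runs internally (it connects through SQLAlchemy). It only illustrates that each in-memory database is private to its connection and starts out empty. The table name `demo` is a hypothetical placeholder.

```python
import sqlite3

# ":memory:" is sqlite3's name for a private, in-memory database.
conn_a = sqlite3.connect(":memory:")
conn_b = sqlite3.connect(":memory:")

# Create a table through the first connection only.
conn_a.execute("CREATE TABLE demo (x)")

list_tables = "SELECT name FROM sqlite_master WHERE type = 'table'"
tables_a = conn_a.execute(list_tables).fetchall()
tables_b = conn_b.execute(list_tables).fetchall()

print(tables_a)  # the table exists for the connection that created it
print(tables_b)  # the second in-memory database is separate and still empty

conn_a.close()
conn_b.close()
```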

You can configure `ipython-sql` using the `%config SqlMagic` magic command. When run without arguments, it prints the current configuration: a description of each option, its current value, and the values that can be set.

In [None]:
%config SqlMagic

The `%config SqlMagic` command also allows options to be set, using the form `%config SqlMagic.<option>=<value>`, where `<option>` is the name of the option and `<value>` is the value to assign. The `%config SqlMagic` command lists the options and valid values. The name is case-sensitive, so it must be written `SqlMagic`.

The command `%config SqlMagic.<option>` returns the current value of the option.

In [None]:
%config SqlMagic.autocommit

In [None]:
%config SqlMagic.autocommit=False
%config SqlMagic.autocommit
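
To make concrete what it means for changes to stay uncommitted, here is a minimal sketch using the standard-library `sqlite3` module rather than `ipython-sql`. It assumes `sqlite3`'s default transaction handling, where an `INSERT` opens a transaction that stays pending until `commit()` or `rollback()` is called. The table name `demo` and the helper `count_rows` are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE demo (x)")
conn.commit()  # make sure the table itself is committed


def count_rows():
    """Return the number of rows currently visible in the demo table."""
    return conn.execute("SELECT COUNT(*) FROM demo").fetchone()[0]


# An uncommitted insert can be undone with rollback().
conn.execute("INSERT INTO demo VALUES (1)")
conn.rollback()
after_rollback = count_rows()

# Once committed, a later rollback() has nothing left to undo.
conn.execute("INSERT INTO demo VALUES (2)")
conn.commit()
conn.rollback()
after_commit = count_rows()

print(after_rollback, after_commit)
conn.close()
```

With autocommit turned off, it is worth remembering to commit explicitly when changes need to be kept.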

<a id='sql_DML'></a>
# Data Manipulation Language Commands

A data manipulation language (DML) command can be issued with the `%%sql` command once a database is bound to the `ipython-sql` module. The statements in the next cell create a table called `author` and populate it with three authors.

In [None]:
%%sql
DROP TABLE IF EXISTS author;
CREATE TABLE author (given_name, family_name, year_of_death);
INSERT INTO author VALUES ('William', 'Shakespeare', 1616);
INSERT INTO author VALUES ('Bertold', 'Brecht', 1956);
INSERT INTO author VALUES ('Virginia', 'Woolf', 1941);
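
The `CREATE TABLE` statement above declares its columns without types, which SQLite permits. As a hedged illustration outside the notebook, the following sketch runs the same statements through the standard-library `sqlite3` module and uses SQLite's `typeof()` function to show how the values end up stored. It is a stand-in for the `%%sql` cell, not a reproduction of how `ipython-sql` executes it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# executescript() runs several semicolon-separated statements at once.
conn.executescript(
    """
    DROP TABLE IF EXISTS author;
    CREATE TABLE author (given_name, family_name, year_of_death);
    INSERT INTO author VALUES ('William', 'Shakespeare', 1616);
    INSERT INTO author VALUES ('Bertold', 'Brecht', 1956);
    INSERT INTO author VALUES ('Virginia', 'Woolf', 1941);
    """
)

row_count = conn.execute("SELECT COUNT(*) FROM author").fetchone()[0]

# typeof() reports the storage class SQLite chose for each stored value.
storage_classes = conn.execute(
    "SELECT DISTINCT typeof(given_name), typeof(year_of_death) FROM author"
).fetchall()

print(row_count)
print(storage_classes)
conn.close()
```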

The `--persist <variable>` option copies a dataset into a new table. The table takes the same name as the variable. The following cells create a Pandas DataFrame and then issue several `ipython-sql` commands. The first command drops the `animals` table if it exists, because persisting into a table that already exists raises an error. The `--persist` option then copies the DataFrame into the database as a new table. The final command queries all the records in the newly created `animals` table.

In [None]:
animals = pd.DataFrame(
    {
        "num_legs": [2, 4, 8, 0],
        "num_wings": [2, 0, 0, 0],
        "num_specimen_seen": [10, 2, 1, 8],
    },
    index=["falcon", "dog", "spider", "fish"],
)

In [None]:
%sql DROP TABLE IF EXISTS 'animals'
%sql --persist animals
%sql SELECT * FROM animals
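
The reason for dropping the table first can be illustrated with plain SQLite. This sketch does not reproduce how `--persist` is implemented. It only shows, using the standard-library `sqlite3` module, that creating a table that already exists fails, and that `DROP TABLE IF EXISTS` makes the step safe to re-run. The two-column layout (`name`, `num_legs`) is a simplified, hypothetical version of the `animals` DataFrame.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE animals (name, num_legs)")

# Creating the same table a second time raises an error.
try:
    conn.execute("CREATE TABLE animals (name, num_legs)")
    error_message = None
except sqlite3.OperationalError as exc:
    error_message = str(exc)

# Dropping first makes the create-and-load step repeatable.
conn.execute("DROP TABLE IF EXISTS animals")
conn.execute("CREATE TABLE animals (name, num_legs)")
conn.executemany(
    "INSERT INTO animals VALUES (?, ?)",
    [("falcon", 2), ("dog", 4), ("spider", 8), ("fish", 0)],
)

rows = conn.execute("SELECT * FROM animals ORDER BY num_legs").fetchall()
print(error_message)
print(rows)
conn.close()
```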

<a id='sql_DQL'></a>
# Data Query Language Commands

A data query language (DQL) command can be used to obtain records from the database. 
If your query is short, you can write it on one line:

In [None]:
%sql SELECT * FROM author WHERE year_of_death >=1950;

The previous cell printed the results of the query into the notebook. It is also possible to capture the results into a Python object. If the query can fit on a single line then the `<variable> = %sql <DQL>` command can be used. This will store the results in the specified variable. In the following cell, this approach is used to obtain authors that died before 1950.

In [None]:
%config SqlMagic.autopandas=False
old_author = %sql SELECT * FROM author WHERE year_of_death < 1950;
old_author
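
For readers working outside a notebook, the same capture pattern can be approximated with the standard-library `sqlite3` module. This is a sketch under the assumption that the `author` table holds the three rows inserted earlier. Here the result is a plain list of tuples rather than an `ipython-sql` result object.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE author (given_name, family_name, year_of_death)")
conn.executemany(
    "INSERT INTO author VALUES (?, ?, ?)",
    [
        ("William", "Shakespeare", 1616),
        ("Bertold", "Brecht", 1956),
        ("Virginia", "Woolf", 1941),
    ],
)

# fetchall() returns the matching records as a list of tuples.
recent_author = conn.execute(
    "SELECT * FROM author WHERE year_of_death >= 1950"
).fetchall()
old_author = conn.execute(
    "SELECT * FROM author WHERE year_of_death < 1950 ORDER BY year_of_death"
).fetchall()

print(recent_author)
print(old_author)
conn.close()
```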

For longer SQL commands use
```
%%sql <variable> << 
<DQL>
``` 
The result is stored in the `<variable>` variable. 

In [None]:
%config SqlMagic.autopandas=False

In [None]:
%%sql author << 
SELECT given_name, family_name, year_of_death 
FROM author;

In [None]:
author

In the preceding cell, `author` is an object of class `sql.run.ResultSet`. It can be converted to a Pandas DataFrame using the `DataFrame()` method.

In [None]:
df = author.DataFrame()
type(df)

To have `ipython-sql` return record sets in a Pandas DataFrame, set the `autopandas` option to `True`.

In [None]:
%config SqlMagic.autopandas=True
author = %sql SELECT given_name, family_name, year_of_death FROM author
type(author)

<a id='sql_var'></a>
# Variable Bindings

Python variables can be bound to the SQL commands with the `:<variable>`, `'{variable}'` or `$variable` syntax. In the next cell, the variables `first_name`, `last_name`, and `death_century` are set to William, Shakespeare, and 1600. The query then returns any records where `given_name` matches `first_name`, `family_name` matches `last_name`, and `year_of_death` is at least `death_century`.

In [None]:
first_name = "William"
last_name = "Shakespeare"
death_century = 1600

In [None]:
%%sql 
SELECT * 
FROM author 
WHERE 
    given_name LIKE :first_name 
    AND family_name LIKE '{last_name}'
    AND year_of_death >= CAST('$death_century' AS INT)
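
Named placeholders similar in spirit to the `:<variable>` form are also available in the standard-library `sqlite3` module, where values are passed separately from the SQL text instead of being pasted into it. The sketch below assumes the same three `author` rows as earlier. The `params` dictionary and the `O'Brien` lookup are hypothetical additions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE author (given_name, family_name, year_of_death)")
conn.executemany(
    "INSERT INTO author VALUES (?, ?, ?)",
    [
        ("William", "Shakespeare", 1616),
        ("Bertold", "Brecht", 1956),
        ("Virginia", "Woolf", 1941),
    ],
)

query = """
    SELECT *
    FROM author
    WHERE given_name LIKE :first_name
      AND family_name LIKE :last_name
      AND year_of_death >= :death_century
"""

# Each :name placeholder is filled from the matching dictionary key.
params = {"first_name": "William", "last_name": "Shakespeare", "death_century": 1600}
rows = conn.execute(query, params).fetchall()

# A value containing a quote is handled safely because it is never spliced
# into the SQL string itself.
no_match = conn.execute(query, {**params, "last_name": "O'Brien"}).fetchall()

print(rows)
print(no_match)
conn.close()
```

Bound parameters are generally the safer choice when values come from user input, since text substitution can change the meaning of the SQL statement.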

<a id='sql_viz'></a>
# Data Visualization

Record sets of the class `sql.run.ResultSet` have the methods `.plot()`, `.pie()`, and `.bar()`. These are convenient for making quick plots.

In [None]:
old_author.bar()

In [None]:
old_author.plot()

In [None]:
old_author.pie()

<a id="reference"></a>
# References

- [ADS Library Documentation](https://accelerated-data-science.readthedocs.io/en/latest/index.html)
- [Data Science YouTube Videos](https://www.youtube.com/playlist?list=PLKCk3OyNwIzv6CWMhvqSB_8MLJIZdO80L)
- [ipython-sql](https://pypi.org/project/ipython-sql/)
- [OCI Data Science Documentation](https://docs.cloud.oracle.com/en-us/iaas/data-science/using/data-science.htm)
- [Oracle Data & AI Blog](https://blogs.oracle.com/datascience/)
- [SQLite Tutorial](https://www.sqlitetutorial.net/)