<div style="background-color: #1B1A21; text-align: right; margin-bottom: -1px">
    <img src="https://raw.githubusercontent.com/singlestore-labs/spaces-notebooks/master/common/images/singlestore-banner.png" style="padding: 0px; padding-right: 20px; margin: 0px; padding-top: 20px; height: 60px"/>
    <img src="https://raw.githubusercontent.com/singlestore-labs/spaces-notebooks/master/common/images/banner-colors.png" style="width:100%; height: 50px; padding: 0px; margin: 0px; margin-bottom: -8px"/>
</div>

# Developing in a SingleStoreDB notebook

Prototyping applications or analyzing datasets with notebooks in SingleStoreDB Cloud follows the same general principles as developing with a Jupyter Notebook. SingleStoreDB Cloud supports internal and external datasources. Internal datasources are databases that exist within your workspace. An external datasource could be, for example, an AWS S3 bucket. In this notebook we cover:

1. Connecting to a SingleStoreDB instance
2. Connecting to an external datasource, including firewall settings
3. Using SQL in a cell
4. Using Python in a cell
5. Using both SQL & Python
6. Installing libraries
7. Using magic commands

*To learn more about working with SingleStoreDB notebooks check out our [docs](https://docs.singlestore.com/managed-service/en/developer-resources/notebooks.html)!*

## 1. Connecting to SingleStoreDB

Once you select a workspace, you can access all of the databases attached to that workspace. You cannot connect to databases that are not attached to the workspace you are using.

First select a workspace and the `information_schema` database from the drop-down menu at the top of this notebook.

<img src="https://raw.githubusercontent.com/singlestore-labs/spaces-notebooks/master/notebooks/Notebook%20Basics/images/select-workspace-and-database.png" style="width: 500px; border: 1px solid darkorchid"/>

With the database selected, the `connection_url` variable in the Python environment is updated with that information,
and we can use the `%%sql` magic command to query the selected database.

In [5]:
%%sql
SELECT * FROM users
    LIMIT 3;

USER,HOST,TYPE,CONNECTIONS,IS_DELETED,LAST_UPDATED,DEFAULT_RESOURCE_POOL,IS_LOCAL,CREATED,PASSWORD_UPDATED,EFFECTIVE_FAILED_LOGIN_ATTEMPTS,EFFECTIVE_PASSWORD_LOCK_TIME,ACCOUNT_STATUS,PASSWORD_EXPIRATION
acomet@singlestore.com,%,JWT,0,0,2023-06-21 16:25:32,,0,2023-02-16 05:48:22,2023-02-16 05:48:22,0,0,,
d3505f24-d8d4-4258-bb60-fea6cc07931d,%,JWT,10,0,2023-06-21 16:25:32,,0,2023-02-16 05:48:22,2023-02-16 05:48:22,0,0,,
admin,%,NATIVE,0,0,2023-06-22 22:44:56,,0,2023-02-16 05:48:22,2023-05-12 15:38:57,0,0,,


To run SQL commands against a different database, you can select it explicitly in your
SQL code with the `USE` command:

In [6]:
%%sql
USE information_schema;

SELECT * FROM users
    LIMIT 3;

USER,HOST,TYPE,CONNECTIONS,IS_DELETED,LAST_UPDATED,DEFAULT_RESOURCE_POOL,IS_LOCAL,CREATED,PASSWORD_UPDATED,EFFECTIVE_FAILED_LOGIN_ATTEMPTS,EFFECTIVE_PASSWORD_LOCK_TIME,ACCOUNT_STATUS,PASSWORD_EXPIRATION
acomet@singlestore.com,%,JWT,0,0,2023-06-21 16:25:32,,0,2023-02-16 05:48:22,2023-02-16 05:48:22,0,0,,
d3505f24-d8d4-4258-bb60-fea6cc07931d,%,JWT,10,0,2023-06-21 16:25:32,,0,2023-02-16 05:48:22,2023-02-16 05:48:22,0,0,,
admin,%,NATIVE,0,0,2023-06-22 22:44:56,,0,2023-02-16 05:48:22,2023-05-12 15:38:57,0,0,,


Alternatively, you can qualify the table name with the database name in the query itself.

In [8]:
%%sql
SELECT * FROM information_schema.users
    LIMIT 3;

USER,HOST,TYPE,CONNECTIONS,IS_DELETED,LAST_UPDATED,DEFAULT_RESOURCE_POOL,IS_LOCAL,CREATED,PASSWORD_UPDATED,EFFECTIVE_FAILED_LOGIN_ATTEMPTS,EFFECTIVE_PASSWORD_LOCK_TIME,ACCOUNT_STATUS,PASSWORD_EXPIRATION
acomet@singlestore.com,%,JWT,0,0,2023-06-21 16:25:32,,0,2023-02-16 05:48:22,2023-02-16 05:48:22,0,0,,
d3505f24-d8d4-4258-bb60-fea6cc07931d,%,JWT,11,0,2023-06-21 16:25:32,,0,2023-02-16 05:48:22,2023-02-16 05:48:22,0,0,,
admin,%,NATIVE,0,0,2023-06-22 22:44:56,,0,2023-02-16 05:48:22,2023-05-12 15:38:57,0,0,,


## Connecting with SQLAlchemy

You can also connect to your SingleStoreDB datasource using Python and SQLAlchemy. As mentioned above,
the notebook environment automatically populates the `connection_url` variable when you select a
database in the drop-down menu at the top of the notebook.

In [16]:
import sqlalchemy as sa

db_connection = sa.create_engine(connection_url).connect()
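
To see what a connection URL contains without printing the password, the URL can be split into its parts with the standard library. The sketch below is only an illustration: it uses a made-up URL in place of the real `connection_url`, and the user, password, and host values are placeholders. It assumes the URL follows the `singlestoredb://user:password@host:port/database` shape used elsewhere in this notebook.

```python
from urllib.parse import urlsplit

# Placeholder URL for illustration; in a notebook, pass the real `connection_url`.
example_url = "singlestoredb://admin:s3cret@db.example.com:3306/information_schema"

parts = urlsplit(example_url)

# Collect the non-secret components; the password is deliberately left out.
summary = {
    "scheme": parts.scheme,
    "user": parts.username,
    "host": parts.hostname,
    "port": parts.port,
    "database": parts.path.lstrip("/"),
}
print(summary)
```

This is handy for confirming which database the drop-down selection points at before running queries.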

You can also explicitly define a URL using the individual connection components.

In [17]:
database_name = "information_schema"

connection_url2 = f"singlestoredb://{connection_user}:{connection_password}@{connection_host}:{connection_port}/{database_name}"

db_connection2 = sa.create_engine(connection_url2).connect()
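
One caveat when assembling a URL by hand: a password containing characters such as `@` or `/` can confuse URL parsing unless it is percent-encoded first. The sketch below shows one way to do that with the standard library. The `build_connection_url` helper and all credential values are hypothetical and exist only for illustration.

```python
from urllib.parse import quote_plus


def build_connection_url(user, password, host, port, database):
    """Hypothetical helper: assemble a SQLAlchemy-style URL, escaping user and password."""
    return (
        f"singlestoredb://{quote_plus(user)}:{quote_plus(password)}"
        f"@{host}:{port}/{database}"
    )


# Placeholder credentials for illustration only.
url = build_connection_url("admin", "p@ss/word", "db.example.com", 3306, "information_schema")
print(url)
```

In a notebook, the same helper could be called with `connection_user`, `connection_password`, `connection_host`, and `connection_port` in place of the placeholder values.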

Using either connection, we can run queries much like with the `%%sql` command. The cell below uses `db_connection2`.

In [18]:
query1 = 'SELECT * FROM users LIMIT 3;'

# Wrap the SQL string in sa.text(); SQLAlchemy 2.0 no longer accepts plain strings.
for row in db_connection2.execute(sa.text(query1)):
    print(row)

('acomet@singlestore.com', '%', 'JWT', 0, 0, datetime.datetime(2023, 6, 21, 16, 25, 32), '', 0, datetime.datetime(2023, 2, 16, 5, 48, 22), datetime.datetime(2023, 2, 16, 5, 48, 22), 0, 0, 'n/a', None)
('d3505f24-d8d4-4258-bb60-fea6cc07931d', '%', 'JWT', 8, 0, datetime.datetime(2023, 6, 21, 16, 25, 32), '', 0, datetime.datetime(2023, 2, 16, 5, 48, 22), datetime.datetime(2023, 2, 16, 5, 48, 22), 0, 0, 'n/a', None)
('admin', '%', 'NATIVE', 0, 0, datetime.datetime(2023, 6, 22, 22, 50, 47), '', 0, datetime.datetime(2023, 2, 16, 5, 48, 22), datetime.datetime(2023, 5, 12, 15, 38, 57), 0, 0, 'n/a', None)


## 2. Connecting to an external datasource

You can securely connect to external endpoints from your SingleStoreDB notebooks. By default, connections are limited to SingleStoreDB databases; however, you can enable and disable connections to other external endpoints via the allowlist. To add or remove endpoints from the allowlist:

1. In the left navigation, select Notebooks.
2. Select the Firewall tab in the main window.
3. Select Edit to add new endpoints:

<img src="https://raw.githubusercontent.com/singlestore-labs/spaces-notebooks/master/notebooks/Notebook%20Basics/images/new-endpoints.png">

4. In the Edit Allowlist dialog, you can add a Fully Qualified Domain Name (FQDN) or select from a list of suggested FQDNs (for example `pypi.org` or `github.com`). You can provide wildcard access to an endpoint by using the `*` character. For example, to access AWS S3 endpoints, you can use the following syntax: `*.s3.*.amazonaws.com`
5. Select Save.

<img src="https://raw.githubusercontent.com/singlestore-labs/spaces-notebooks/master/notebooks/Notebook%20Basics/images/connect-to-an-external-datasource.png" style="width: 500px">
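
To get a feel for which hostnames a wildcard entry would cover, the pattern can be tried out locally. The sketch below treats `*` as a glob-style wildcard using the standard library's `fnmatch` module. That is an assumption for illustration only: the firewall's actual matching rules are not described here, and the `is_allowed` helper and the sample hostnames are hypothetical.

```python
from fnmatch import fnmatchcase

# Example allowlist entries taken from the text above.
ALLOWLIST = ["pypi.org", "github.com", "*.s3.*.amazonaws.com"]


def is_allowed(hostname, patterns=ALLOWLIST):
    """Hypothetical check: True if the hostname matches any glob-style pattern."""
    return any(fnmatchcase(hostname.lower(), pattern) for pattern in patterns)


for host in ["my-bucket.s3.us-east-1.amazonaws.com", "pypi.org", "example.com"]:
    print(host, is_allowed(host))
```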

## 3. Using SQL

The default language for SingleStoreDB Cloud notebooks is Python. However, the `%%sql` magic command can be used to
submit SQL code for an entire cell.

In [24]:
%%sql
SELECT * FROM users
    LIMIT 3;

USER,HOST,TYPE,CONNECTIONS,IS_DELETED,LAST_UPDATED,DEFAULT_RESOURCE_POOL,IS_LOCAL,CREATED,PASSWORD_UPDATED,EFFECTIVE_FAILED_LOGIN_ATTEMPTS,EFFECTIVE_PASSWORD_LOCK_TIME,ACCOUNT_STATUS,PASSWORD_EXPIRATION
acomet@singlestore.com,%,JWT,0,0,2023-06-21 16:25:32,,0,2023-02-16 05:48:22,2023-02-16 05:48:22,0,0,,
d3505f24-d8d4-4258-bb60-fea6cc07931d,%,JWT,12,0,2023-06-21 16:25:32,,0,2023-02-16 05:48:22,2023-02-16 05:48:22,0,0,,
admin,%,NATIVE,0,0,2023-06-22 22:50:47,,0,2023-02-16 05:48:22,2023-05-12 15:38:57,0,0,,


By default, the results are displayed as a table. We can also store the result in a variable for use later in the
notebook. The following code includes the `result1 <<` syntax, which indicates that the output of the SQL code
should be stored in the `result1` variable in the Python environment.

In [25]:
%%sql result1 <<
SELECT * FROM users
    LIMIT 3;

We now have access to the `result1` variable and can convert it to a DataFrame!

In [26]:
import pandas as pd

df = pd.DataFrame(result1)
df

Unnamed: 0,USER,HOST,TYPE,CONNECTIONS,IS_DELETED,LAST_UPDATED,DEFAULT_RESOURCE_POOL,IS_LOCAL,CREATED,PASSWORD_UPDATED,EFFECTIVE_FAILED_LOGIN_ATTEMPTS,EFFECTIVE_PASSWORD_LOCK_TIME,ACCOUNT_STATUS,PASSWORD_EXPIRATION
0,acomet@singlestore.com,%,JWT,0,0,2023-06-21 16:25:32,,0,2023-02-16 05:48:22,2023-02-16 05:48:22,0,0,,
1,d3505f24-d8d4-4258-bb60-fea6cc07931d,%,JWT,12,0,2023-06-21 16:25:32,,0,2023-02-16 05:48:22,2023-02-16 05:48:22,0,0,,
2,admin,%,NATIVE,0,0,2023-06-22 22:50:47,,0,2023-02-16 05:48:22,2023-05-12 15:38:57,0,0,,


## 4. Using Python in a code cell

By default, Python is the language for code cells. In the cell below, we use a SQLAlchemy connection to execute
the same query as in the previous example. The result of this query can be converted into a DataFrame in the same manner
as above.

In [27]:
# sa.text() marks the string as SQL; SQLAlchemy 2.0 no longer accepts plain strings.
result = db_connection.execute(sa.text('SELECT * FROM users LIMIT 3;'))

df = pd.DataFrame(result)
df

Unnamed: 0,USER,HOST,TYPE,CONNECTIONS,IS_DELETED,LAST_UPDATED,DEFAULT_RESOURCE_POOL,IS_LOCAL,CREATED,PASSWORD_UPDATED,EFFECTIVE_FAILED_LOGIN_ATTEMPTS,EFFECTIVE_PASSWORD_LOCK_TIME,ACCOUNT_STATUS,PASSWORD_EXPIRATION
0,acomet@singlestore.com,%,JWT,0,0,2023-06-21 16:25:32,,0,2023-02-16 05:48:22,2023-02-16 05:48:22,0,0,,
1,d3505f24-d8d4-4258-bb60-fea6cc07931d,%,JWT,8,0,2023-06-21 16:25:32,,0,2023-02-16 05:48:22,2023-02-16 05:48:22,0,0,,
2,admin,%,NATIVE,0,0,2023-06-22 22:50:47,,0,2023-02-16 05:48:22,2023-05-12 15:38:57,0,0,,


## 5. Using both SQL & Python in a code cell

We can run a single line of SQL within a Python cell using the `%sql` line magic. Below we combine SQL and
Python in the same cell to capture the output in the `result` variable. We then convert it to a DataFrame
as in previous examples.

In [28]:
result = %sql SELECT * FROM users LIMIT 3;

df = pd.DataFrame(result)
df

Unnamed: 0,USER,HOST,TYPE,CONNECTIONS,IS_DELETED,LAST_UPDATED,DEFAULT_RESOURCE_POOL,IS_LOCAL,CREATED,PASSWORD_UPDATED,EFFECTIVE_FAILED_LOGIN_ATTEMPTS,EFFECTIVE_PASSWORD_LOCK_TIME,ACCOUNT_STATUS,PASSWORD_EXPIRATION
0,acomet@singlestore.com,%,JWT,0,0,2023-06-21 16:25:32,,0,2023-02-16 05:48:22,2023-02-16 05:48:22,0,0,,
1,d3505f24-d8d4-4258-bb60-fea6cc07931d,%,JWT,11,0,2023-06-21 16:25:32,,0,2023-02-16 05:48:22,2023-02-16 05:48:22,0,0,,
2,admin,%,NATIVE,0,0,2023-06-22 22:50:47,,0,2023-02-16 05:48:22,2023-05-12 15:38:57,0,0,,


## 6. Preinstalled libraries

By default, a SingleStoreDB notebook has a large number of preinstalled libraries. Run the cell below to see what libraries are already installed!

In [29]:
!pip list

Package                       Version
----------------------------- -----------
aiofiles                      22.1.0
aiosqlite                     0.18.0
alembic                       1.10.2
anyio                         3.6.2
argon2-cffi                   21.3.0
argon2-cffi-bindings          21.2.0
arrow                         1.2.3
asttokens                     2.2.1
async-generator               1.10
attrs                         22.2.0
Babel                         2.12.1
backcall                      0.2.0
backoff                       2.2.1
backports.functools-lru-cache 1.6.4
beautifulsoup4                4.11.2
bleach                        6.0.0
blinker                       1.5
brotlipy                      0.7.0
build                         0.10.0
certifi                       2022.12.7
certipy                       0.1.3
cffi                          1.15.1
charset-normalizer            2.1.1
click                         8.1.3
colorama                      0.4.6
comm     

Our notebooks support libraries available from https://pypi.org/. For example, run the cell below to install the `opendatasets` package, the [Kaggle open dataset library](https://pypi.org/project/opendatasets/).

In [30]:
!pip3 install opendatasets

Collecting opendatasets
  Downloading opendatasets-0.1.22-py3-none-any.whl (15 kB)
Collecting kaggle
  Downloading kaggle-1.5.13.tar.gz (63 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 63.3/63.3 kB 16.4 MB/s eta 0:00:00
  Preparing metadata (setup.py) ... done
Collecting python-slugify
  Downloading python_slugify-8.0.1-py2.py3-none-any.whl (9.7 kB)
Collecting text-unidecode>=1.3
  Downloading text_unidecode-1.3-py2.py3-none-any.whl (78 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 78.2/78.2 kB 28.0 MB/s eta 0:00:00
Building wheels for collected packages: kaggle
  Building wheel for kaggle (setup.py) ... done
  Created wheel for kaggle: filename=kaggle-1.5.13-py3-none-any.whl size=77717 sha256=408b603709857a9b678b5adda6a96ffbbcb0ba5c4afb2a76cdd7aecd2e59f06c
  Stored in directory: /home/jovyan/.cache/pip/wheels/f3/16/ff/34e7d368370d4fd68bb749a59f1d2639ed66f3c14358e340a1
Succe

You can even upgrade a preinstalled library. Run the cell below to upgrade Plotly to the latest available version.

In [31]:
!pip3 install plotly --upgrade

Collecting plotly
  Downloading plotly-5.15.0-py2.py3-none-any.whl (15.5 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 15.5/15.5 MB 97.7 MB/s eta 0:00:00
Installing collected packages: plotly
  Attempting uninstall: plotly
    Found existing installation: plotly 5.13.0
    Uninstalling plotly-5.13.0:
      Successfully uninstalled plotly-5.13.0
Successfully installed plotly-5.15.0
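
After installing or upgrading a package, it can be useful to confirm from Python which version is now present. The sketch below does this with the standard library's `importlib.metadata`. The `installed_version` helper is a hypothetical convenience wrapper, and the versions printed depend on what is installed in your environment.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string, or None if the package is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


for name in ["plotly", "opendatasets"]:
    print(name, installed_version(name))
```

If a package was already imported in the running kernel before the upgrade, the new code is typically picked up only after the kernel restarts.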


## 7. Magic commands

Magic commands in Jupyter Notebook are special commands that allow you to perform various tasks that are not part of the standard Python language. We have demonstrated two of the included magic commands already: `%%sql` for submitting entire cells of
SQL code and `%sql` for submitting a single query in the context of a Python code cell.

There are many other magic commands as well for everything from file system access to debugging your Python code.
For information about the full list of magic commands available, run the code cell below.

In [70]:
%quickref


IPython -- An enhanced Interactive Python - Quick Reference Card

obj?, obj??      : Get help, or more help for object (also works as
                   ?obj, ??obj).
?foo.*abc*       : List names in 'foo' containing 'abc' in them.
%magic           : Information about IPython's 'magic' % functions.

Magic functions are prefixed by % or %%, and typically take their arguments
without parentheses, quotes or even commas for convenience.  Line magics take a
single % and cell magics are prefixed with two %%.

Example magic function calls:

%alias d ls -F   : 'd' is now an alias for 'ls -F'
alias d ls -F    : Works if 'alias' not a python name
alist = %alias   : Get list of aliases to 'alist'
cd /usr/share    : Obvious. cd -<tab> to choose from visited dirs.
%cd??            : See help AND source for magic %cd
%timeit x=10     : time the 'x=10' statement with high precision.
%%timeit x=2**100
x**100           : time 'x**100' with a setup of 'x=2**100'; setup code is not
                   co
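
Magic commands only work inside IPython or a notebook. As a rough point of comparison, the `%%timeit` example from the quick reference can be approximated in plain Python with the standard library's `timeit` module. This is a sketch, not a replacement: the fixed loop count of 10,000 is an arbitrary choice made for this illustration.

```python
import timeit

LOOPS = 10_000  # arbitrary loop count chosen for this sketch

# Roughly what `%%timeit x=2**100` followed by `x**100` measures:
# the setup statement runs once and is excluded from the measured time.
elapsed = timeit.timeit(stmt="x**100", setup="x = 2**100", number=LOOPS)

print(f"{elapsed / LOOPS * 1e9:.1f} ns per loop")
```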

**Learn more about SingleStoreDB notebooks [here](https://docs.singlestore.com/managed-service/en/developer-resources/notebooks.html) and get started with your first notebook!**

<img src="https://raw.githubusercontent.com/singlestore-labs/spaces-notebooks/master/common/images/banner-colors-reverse.png" style="width: 100%; margin: 0; padding: 0"/>