# Introduction to Analytical SQL Cell

For the last decade, the data ecosystem has focused mainly on technologies for storing and processing big datasets, the bigger the better. Later, the Modern Data Stack emerged as a cloud-native suite of products used for data integration and analytics by the more technology-forward companies. Data warehouses are now a default piece of the Modern Data Stack, and Snowflake’s rapid rise has been the poster child of this trend.

But in real life, most analytical workloads aren’t massive. Users prefer fast, easy answers to their questions over waiting for the cloud to spin up. Rather than requiring a distributed database in the cloud, most analyses can be handled by an optimized engine on a laptop, with the cloud leveraged only when needed.

In addition, the current warehouse paradigm is designed for a client-server use case, which is cumbersome to integrate with Jupyter Notebook, the lingua franca of the data science community. Enthusiasts and practitioners who prefer to process analytical workloads with SQL are underserved in Jupyter Notebook: they either have to climb the learning curve of Python or R to do their daily work, or put up with the low performance of transactional database solutions.

But SQL is more viable than ever. According to the Stack Overflow Developer Survey 2022, SQL is the third most commonly used programming language.

To fill the gap, we are thrilled to announce Analytical SQL Cell, a free and open source Jupyter Widget that offers a Personal Data Lake experience to Jupyter Notebook users.

# Installation

You can install Analytical SQL Cell using `pip`:

In [1]:
%pip install asqlcell --upgrade

Note: you may need to restart the kernel to use updated packages.


# Usage

Then import the module and load the Analytical SQL Cell magic extension with `%load_ext`:

In [2]:
import asqlcell

%load_ext asqlcell

Analytical SQL Cell supports the *line* magic `%sql`, which runs a SQL statement and returns the result as a Pandas dataframe:

In [3]:
data = %sql SELECT * FROM 'introduction.csv'

You can further explore `data` as follows:

In [4]:
data.head()

Unnamed: 0,id,price,normal
0,0,94.830479,True
1,1,72.969358,True
2,2,59.427536,True
3,3,8.669261,True
4,4,11.028436,True


Analytical SQL Cell also supports the *cell* magic `%%sql`, so that the SQL statement can span the whole cell. Note that you need to specify the name of the variable that will hold the result set. In the following example, the result set is stored in the variable `data`, as named on the cell magic line:

In [5]:
%%sql data

SELECT
*
FROM 'introduction.csv'
WHERE price > 50

SqlCellWidget(cache='{"tabValue": "table"}', chart_config='{"type": null, "x": {"label": null, "field": null, …

As usual, you can further explore `data` as follows:

In [6]:
data.head()

Unnamed: 0,id,price,normal
0,0,94.830479,True
1,1,72.969358,True
2,2,59.427536,True
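The effect of the `WHERE price > 50` clause can be checked without the extension installed. The sketch below is not how Analytical SQL Cell executes queries: it loads the five rows printed by the first `data.head()` call into an in-memory SQLite table (a stand-in chosen only because it ships with Python) and applies the same filter. The table name `introduction` and the column types are assumptions made for illustration.

```python
import sqlite3

# The five rows shown by the first data.head() call (id, price, normal).
rows = [
    (0, 94.830479, True),
    (1, 72.969358, True),
    (2, 59.427536, True),
    (3, 8.669261, True),
    (4, 11.028436, True),
]

# In-memory SQLite is a stand-in engine, used only so the example runs
# without the extension; table name and column types are assumptions.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE introduction (id INTEGER, price REAL, normal BOOLEAN)")
con.executemany("INSERT INTO introduction VALUES (?, ?, ?)", rows)

# Same predicate as the %%sql cell: keep only rows with price above 50.
filtered = con.execute(
    "SELECT id, price FROM introduction WHERE price > 50 ORDER BY id"
).fetchall()
con.close()

print(filtered)
```

Of the five sample rows, only those with `id` 0, 1, and 2 pass the filter, which agrees with the three rows displayed after the `%%sql` cell.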


The docstring of the magic is shown below:

In [7]:
%sql?

Docstring:
::

  %execute [-o OUT] [-c CON] [-e EXPLAIN] [-d DB]
               [output] [line [line ...]]

Execute the magic extension. This could be a line magic or a cell magic.

positional arguments:
  output                The variable name for the result dataframe.
  line                  The SQL statement.

optional arguments:
  -o OUT, --out OUT     The variable name for the result dataframe.
  -c CON, --con CON     The variable name for database connection.
  -e EXPLAIN, --explain EXPLAIN
                        Return Sql Explain or not.
  -d DB, --db DB        Use database.
File:      ~/miniconda3/envs/asqlcell/lib/python3.8/site-packages/asqlcell/magic.py
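The usage line in the docstring follows the familiar `argparse` layout. As a rough illustration of how such an argument line could be interpreted, the sketch below rebuilds a parser with the same flags and positionals using the standard library. It is a hypothetical reconstruction from the docstring only, not the extension's actual source, and the variable names `cell_args` and `flag_args` are invented for the example.

```python
import argparse

# Hypothetical parser mirroring the usage line in the docstring;
# this is NOT taken from asqlcell's source code.
parser = argparse.ArgumentParser(prog="%execute", add_help=False)
parser.add_argument("-o", "--out", help="The variable name for the result dataframe.")
parser.add_argument("-c", "--con", help="The variable name for database connection.")
parser.add_argument("-e", "--explain", help="Return Sql Explain or not.")
parser.add_argument("-d", "--db", help="Use database.")
parser.add_argument("output", nargs="?", help="The variable name for the result dataframe.")
parser.add_argument("line", nargs="*", help="The SQL statement.")

# Argument line of a cell magic such as `%%sql data`:
# the bare word lands in the positional `output`.
cell_args = parser.parse_args(["data"])

# The same variable name passed through the -o flag instead.
flag_args = parser.parse_args(["-o", "data"])

print(cell_args.output, flag_args.out)
```

Under this reading, `%%sql data` and `%%sql -o data` are two ways of naming the result variable; whether the extension treats them identically is not stated in the docstring.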

# Tutorials

Please refer to the tutorials for further information:

[World Development in Numbers](gapminder.ipynb)