# FugueSQL Integrations

[FugueSQL](https://fugue-tutorials.readthedocs.io/tutorials/fugue_sql/index.html) is a related project that aims to provide a unified SQL interface for a variety of different computing frameworks, including Dask.
While it offers a SQL engine with a larger set of supported commands, this comes at the cost of slower performance when using Dask in comparison to dask-sql.
In order to offer a "best of both worlds" solution, dask-sql can easily be integrated with FugueSQL, using its faster implementation of SQL commands when possible and falling back on FugueSQL's implementation when necessary.

## Setup

FugueSQL offers the cell magic `%%fsql`, which can be used to define and execute queries entirely in SQL, with no need for external Python code!

To use this cell magic, users must install [fugue-jupyter](https://pypi.org/project/fugue-jupyter/), which also provides SQL syntax highlighting (note that the kernel must be restarted after installing):

In [None]:
!pip install fugue-jupyter

And run `fugue_jupyter.setup()` to register the magic:

In [1]:
from fugue_jupyter import setup

setup()

We will also start up a Dask client, which can be specified as an execution engine for FugueSQL queries:

In [2]:
from dask.distributed import Client

client = Client()

## dask-sql as a FugueSQL execution engine

When dask-sql is installed, its `DaskSQLExecutionEngine` is automatically registered as the default engine for FugueSQL queries run on Dask.
We can then use it to run queries with the `%%fsql` cell magic, specifying `dask` as the execution engine to run the query on:

In [3]:
%%fsql dask

CREATE [["xyz"], ["xxx"]] SCHEMA a:str
SELECT * WHERE a LIKE '%y%'
PRINT

|   | a   |
|---|-----|
| 0 | xyz |
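
For readers less familiar with SQL `LIKE`, the filter in the query above can be sketched in plain Python. This is only an illustration of the pattern semantics, not how dask-sql or FugueSQL evaluate the query. The helper name `like_to_regex` is hypothetical, it handles only the two standard wildcards (`%` and `_`), and it assumes case-sensitive matching:

```python
import re


def like_to_regex(pattern: str) -> re.Pattern:
    """Translate a SQL LIKE pattern into a regular expression.

    Hypothetical helper for illustration only. It supports the two
    standard wildcards, % (any run of characters) and _ (exactly one
    character), and ignores ESCAPE clauses.
    """
    parts = []
    for ch in pattern:
        if ch == "%":
            parts.append(".*")
        elif ch == "_":
            parts.append(".")
        else:
            # Every other character must match literally.
            parts.append(re.escape(ch))
    return re.compile("".join(parts), re.DOTALL)


rows = [["xyz"], ["xxx"]]  # same data as the CREATE statement
matcher = like_to_regex("%y%")

# LIKE must match the whole value, hence fullmatch rather than search.
selected = [row for row in rows if matcher.fullmatch(row[0])]
print(selected)  # [['xyz']]
```

Only `xyz` contains a `y`, which is why the query prints a single row.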


We can also use the `YIELD` keyword to register the results of our queries into Python objects:

In [4]:
%%fsql dask
src = CREATE [["xyz"], ["xxx"]] SCHEMA a:str

a = SELECT a AS b WHERE a LIKE '%y%'
    YIELD DATAFRAME AS test

b = SELECT CONCAT(a, '-') AS b FROM src WHERE a LIKE '%xx%'
    YIELD DATAFRAME AS test1

SELECT * FROM a UNION SELECT * FROM b
PRINT

|   | b    |
|---|------|
| 0 | xyz  |
| 1 | xxx- |


The yielded DataFrames can then be used outside of SQL:

In [5]:
test.native  # a Dask DataFrame

|               | b      |
|---------------|--------|
| npartitions=2 |        |
|               | object |
|               | ...    |
|               | ...    |


In [6]:
test1.native.compute()

|   | b    |
|---|------|
| 1 | xxx- |
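
To make the data flow of the multi-statement query above explicit, here is a rough plain-Python sketch of what each named step holds. It uses ordinary lists instead of Dask DataFrames and simple substring checks in place of `LIKE`, so it is an approximation of the query's logic rather than what the engine actually executes:

```python
src = ["xyz", "xxx"]  # column `a` created by the CREATE statement

# a = SELECT a AS b WHERE a LIKE '%y%'  (yielded as `test`)
# With no FROM clause, the statement reads from the previous result, here src.
test = [value for value in src if "y" in value]

# b = SELECT CONCAT(a, '-') AS b FROM src WHERE a LIKE '%xx%'  (yielded as `test1`)
test1 = [value + "-" for value in src if "xx" in value]

# SELECT * FROM a UNION SELECT * FROM b
# UNION drops duplicate rows. dict.fromkeys keeps first-seen order here,
# whereas SQL itself does not guarantee any row order.
union = list(dict.fromkeys(test + test1))

print(test)   # ['xyz']
print(test1)  # ['xxx-']
print(union)  # ['xyz', 'xxx-']
```

This matches the outputs shown above: `test` holds only the `xyz` row, `test1` only the `xxx-` row, and the final `PRINT` shows both.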


We can also run the equivalent of these queries in Python code using `fugue_sql.fsql`, passing the Dask client to its `run` method to specify Dask as the execution engine:

In [7]:
from fugue_sql import fsql

fsql("""
CREATE [["xyz"], ["xxx"]] SCHEMA a:str
SELECT * WHERE a LIKE '%y%'
PRINT
""").run(client)

|   | a   |
|---|-----|
| 0 | xyz |


DataFrames()

In [8]:
result = fsql("""
CREATE [["xyz"], ["xxx"]] SCHEMA a:str
SELECT * WHERE a LIKE '%y%'
YIELD DATAFRAME AS test2
""").run(client)

result["test2"].native  # a Dask DataFrame

|               | a      |
|---------------|--------|
| npartitions=2 |        |
|               | object |
|               | ...    |
|               | ...    |
