# BlazingContext API

## BlazingContext()

### Single GPU
[Docs](https://docs.blazingdb.com/docs/blazingcontext) | [BlazingSQL Notebooks](https://app.blazingsql.com/jupyter/user-redirect/lab/workspaces/auto-b/tree/Welcome_to_BlazingSQL_Notebooks/docs/blazingsql.ipynb#BlazingContext)

Create a basic (single-GPU) BlazingContext instance.

In [None]:
from blazingsql import BlazingContext

In [None]:
bc = BlazingContext()

Configure some options when creating a BlazingContext instance.

In [None]:
bc = BlazingContext(dask_client=None,
                    network_interface=None,
                    allocator='managed',
                    pool=False,
                    initial_pool_size=None,
                    enable_logging=False,
                    config_options={})
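
As a rough sketch of how these options combine, the helper below assembles keyword arguments for a context backed by a memory pool. It assumes `initial_pool_size` is expressed in bytes and only matters when `pool=True`. `pooled_context_kwargs` and `make_pooled_context` are hypothetical helper names used for illustration, not part of BlazingSQL.

```python
def pooled_context_kwargs(pool_size_gib, enable_logging=False):
    """Build BlazingContext keyword arguments for a pooled allocator.

    Hypothetical helper; assumes initial_pool_size is given in bytes.
    """
    if pool_size_gib <= 0:
        raise ValueError("pool_size_gib must be positive")
    return {
        "allocator": "managed",
        "pool": True,
        "initial_pool_size": int(pool_size_gib * 1024**3),  # GiB -> bytes
        "enable_logging": enable_logging,
    }


def make_pooled_context(pool_size_gib):
    """Create a pooled BlazingContext (needs BlazingSQL and a GPU)."""
    # Imported lazily so the kwargs helper can be inspected on any machine.
    from blazingsql import BlazingContext
    return BlazingContext(**pooled_context_kwargs(pool_size_gib))


kwargs = pooled_context_kwargs(4)
print(kwargs)

# bc = make_pooled_context(4)  # uncomment on a machine with BlazingSQL installed
```

Keeping the argument construction separate from the constructor call makes it possible to check the settings before any GPU memory is reserved.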

### Multi-GPU

Distribute BlazingSQL query execution across multiple GPUs with Dask. Read more in our blog post "[Distributed SQL with Dask](https://blog.blazingdb.com/distributed-sql-with-dask-2979262acc8a?source=friends_link&sk=077319064cd7d9e18df8c0292eb5d33d)".

#### Single Node Multi-GPU
[Docs](https://docs.blazingdb.com/docs/distributed) | [BlazingSQL Notebooks](query_tables.ipynb#Distributed-Queries)

Distribute BlazingSQL query execution across all GPUs within the same system. 

In [None]:
from dask_cuda import LocalCUDACluster
from dask.distributed import Client

cluster = LocalCUDACluster()
client = Client(cluster)

In [None]:
from blazingsql import BlazingContext

bc = BlazingContext(dask_client=client, network_interface='lo')

#### Multi Node Multi-GPU (MNMG)
[Docs](https://docs.blazingdb.com/docs/distributed#multiple-node---multiple-gpu-with-dask-scheduler-and-dask-worker)

Distribute BlazingSQL query execution across multiple GPUs within multiple nodes.

In [None]:
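# # on one terminal, start the scheduler (listens on port 8786 by default)
# conda activate bsql
# dask-scheduler

# # on a second terminal, start a worker on GPU 0
# conda activate bsql
# CUDA_VISIBLE_DEVICES=0 dask-worker 127.0.0.1:8786

# # on a third terminal, start a worker on GPU 1
# conda activate bsql
# CUDA_VISIBLE_DEVICES=1 dask-worker 127.0.0.1:8786

# # repeat for other GPUs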

In [None]:
# from blazingsql import BlazingContext
# from dask.distributed import Client

# client = Client('127.0.0.1:8786')

# bc = BlazingContext(dask_client=client, network_interface='eth0')

### .create_table()
[Docs](https://docs.blazingdb.com/docs/create_table) | [BlazingSQL Notebooks](create_tables.ipynb)


In [None]:
from blazingsql import BlazingContext
bc = BlazingContext()

Create a table from a local file.

In [None]:
bc.create_table('iris', '../../../data/iris.csv')

Create a table from a `cudf.DataFrame`.

In [None]:
import cudf
df = cudf.read_csv('https://raw.githubusercontent.com/BlazingDB/Welcome_to_BlazingSQL_Notebooks/branch-0.15/data/iris.csv')

bc.create_table('iris', df)

Create a table from an AWS S3 bucket.

In [None]:
bc.s3('bsql', bucket_name='blazingsql-colab')

bc.create_table('taxi', 's3://bsql/yellow_taxi/1_0_0.parquet')

### .drop_table()
[Docs](https://docs.blazingdb.com/docs/drop_table) | [BlazingSQL Notebooks](https://app.blazingsql.com/jupyter/user-redirect/lab/workspaces/auto-b/tree/Welcome_to_BlazingSQL_Notebooks/docs/blazingsql/blazingcontext_api/blazingcontext.ipynb#.drop_table())

Drop a BlazingSQL table.

In [None]:
bc.drop_table('taxi')

### .sql()
[Docs](https://docs.blazingdb.com/docs/bcsql) | [BlazingSQL Notebooks](query_tables.ipynb)

Query a BlazingSQL table.

In [None]:
bc.sql('SELECT * FROM iris')

In [None]:
query = '''
        SELECT
            petal_width,
            sepal_length,
            target
        FROM
            iris
            '''
bc.sql(query)

In [None]:
bc.sql(query).to_pandas().plot(kind='scatter', x='sepal_length', y='petal_width')

### .explain()
[Docs](https://docs.blazingdb.com/docs/explain)

To see how a query will be executed, call BlazingContext's `.explain()` method, which returns the query's Logical Relational Algebra plan. At runtime, this logical plan is converted internally into a Physical Relational Algebra plan, which breaks some of the logical steps down into further steps.


In [None]:
from blazingsql import BlazingContext
bc = BlazingContext()

bc.create_table('iris', '../../../data/iris.csv')

In [None]:
query = '''
        SELECT
            CAST(sepal_length AS INT) int_length,
            CASE 
                WHEN sepal_width >= 1 THEN 1 
                ELSE 0 END case_width
        FROM
            iris
            '''
bc.explain(query)

In [None]:
exp = bc.explain(query)

print(exp)

In [None]:
bc.sql(query)

### .log()
[Docs](https://docs.blazingdb.com/docs/explain) | [BlazingSQL Notebooks](../../../intro_notebooks/bsql_logs.ipynb)

BlazingSQL keeps an internal log that records events from every node for every query run. The events include runtime query step execution information, performance timings, errors, and warnings. The log table is called `bsql_logs`. You can query it like any other table, except that you use the `.log()` method instead of `.sql()`.

In [None]:
from blazingsql import BlazingContext
bc = BlazingContext()

How long did each successfully run query take?

In [None]:
bc.log("SELECT log_time, query_id, duration FROM bsql_logs WHERE info = 'Query Execution Done' ORDER BY log_time DESC")
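
To show the shape of the result without a GPU, the sketch below runs the same query text against a stand-in `bsql_logs` table in SQLite. This is only an illustration: the column types and sample rows are made up, the real log table has more columns than the four used here, and BlazingSQL's engine is not involved.

```python
import sqlite3

# Stand-in for bsql_logs holding only the columns the query refers to.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE bsql_logs "
    "(log_time TEXT, query_id INTEGER, duration REAL, info TEXT)"
)
conn.executemany(
    "INSERT INTO bsql_logs VALUES (?, ?, ?, ?)",
    [
        # Made-up sample events for illustration only.
        ("2020-01-01 10:00:00", 1, 120.0, "Query Execution Done"),
        ("2020-01-01 10:00:05", 2, 15.0, "some other event"),
        ("2020-01-01 10:01:00", 3, 450.0, "Query Execution Done"),
    ],
)

log_query = (
    "SELECT log_time, query_id, duration FROM bsql_logs "
    "WHERE info = 'Query Execution Done' ORDER BY log_time DESC"
)

# Only completed queries remain, newest first.
rows = conn.execute(log_query).fetchall()
for row in rows:
    print(row)
```

With BlazingSQL installed, the same `log_query` string can be passed to `bc.log()`.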

### .partition()
[Docs](https://docs.blazingdb.com/docs/unify_partitions)

BlazingSQL lets you partition a `dask_cudf` DataFrame on one or more columns. This is useful as a preparation step before running a function on each partition with Dask, when that function expects its input to be partitioned on certain columns. Partitioning uses a hash partition algorithm.

In [None]:
from blazingsql import BlazingContext
from dask.distributed import Client
from dask_cuda import LocalCUDACluster

cluster = LocalCUDACluster()
client = Client(cluster)
bc = BlazingContext(dask_client=client)

bc.create_table('product_reviews', "product_reviews/*.parquet")

query_1 = """
        SELECT pr_item_sk,
            pr_review_content,
            pr_review_sk
        FROM product_reviews
        WHERE pr_review_content IS NOT NULL
    """

product_reviews_df = bc.sql(query_1)
product_reviews_df = bc.partition(product_reviews_df, 
                                  by=["pr_item_sk", 
                                      "pr_review_content", 
                                      "pr_review_sk"])

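# create_sentences_from_reviews is a user-defined function (not shown here)
# that is applied to each partition
sentences = product_reviews_df.map_partitions(create_sentences_from_reviews)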
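
The idea behind hash partitioning can be sketched in plain Python. This is not BlazingSQL's implementation: the hash function, the partition count, and the `hash_partition` helper are all assumptions made for illustration. It only demonstrates the property the text relies on, namely that rows sharing the same key columns always land in the same partition.

```python
from collections import defaultdict


def hash_partition(rows, by, n_partitions):
    """Assign each row (a dict) to a partition from the hash of its key columns."""
    partitions = defaultdict(list)
    for row in rows:
        key = tuple(row[col] for col in by)
        # Equal keys hash equally, so they map to the same partition id.
        partitions[hash(key) % n_partitions].append(row)
    return dict(partitions)


# Made-up rows reusing column names from the example above.
rows = [
    {"pr_item_sk": 1, "pr_review_sk": 10},
    {"pr_item_sk": 2, "pr_review_sk": 11},
    {"pr_item_sk": 1, "pr_review_sk": 12},
    {"pr_item_sk": 3, "pr_review_sk": 13},
]

parts = hash_partition(rows, by=["pr_item_sk"], n_partitions=2)
for pid, members in sorted(parts.items()):
    print(pid, [r["pr_review_sk"] for r in members])
```

A function mapped over the partitions can then treat each key group as complete, since no key is split across two partitions.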


### Version Info
[Docs](https://docs.blazingdb.com/docs/blazingsql-version-info)

In [None]:
import blazingsql

In [None]:
blazingsql.__version__

In [None]:
blazingsql.__info__()

# BlazingSQL Docs
**[Table of Contents](../TABLE_OF_CONTENTS.ipynb) | [Issues (GitHub)](https://github.com/BlazingDB/Welcome_to_BlazingSQL_Notebooks/issues)**