In [0]:
%run ./Setup

In [0]:
spark.conf.set("spark.datasource.singlestore.ddlEndpoint", cluster)
spark.conf.set("spark.datasource.singlestore.user", "admin")
spark.conf.set("spark.datasource.singlestore.password", password)
spark.conf.set("spark.datasource.singlestore.disablePushdown", "false")

In [0]:
df = (spark.read
      .format("singlestore")
      .load("weather.temperatures_all"))

In [0]:
df.createOrReplaceTempView("temperatures")

In [0]:
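def convert_to_c(f):
  # Convert Fahrenheit to Celsius, rounded to two decimals; pass NULLs through.
  if f is None:
    return None
  c = (f - 32) * (5 / 9)
  return round(c, 2)

# Without an explicit return type, PySpark registers the UDF as returning a string.
spark.udf.register("convert_to_c", convert_to_c, "double")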

In [0]:
spark.sql(
  "SELECT Date, convert_to_c(Max) as Max_C, convert_to_c(Min) as Min_C FROM temperatures WHERE City = 'San Francisco'"
).explain()

Calling ```.explain()``` prints the physical execution plan. The final plan shows a single projection on top of a scan.

The SingleStore Connector was able to push down the following to SingleStore:<br/>
```SELECT Date, ...```<br/>
and<br/>
```WHERE City = 'San Francisco'```<br/>

Evaluation of the UDF on the fields <strong>Max</strong> and <strong>Min</strong> was left to Spark, since that is where the UDF lives.
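
For comparison, the same conversion can be written with built-in SQL arithmetic instead of a Python UDF. The sketch below only builds the query string; ```celsius_expr``` is a hypothetical helper introduced for illustration, and whether the connector pushes the resulting expression down to SingleStore is an assumption that depends on the connector version and the expressions it supports.

```python
# Sketch: express the Fahrenheit-to-Celsius conversion in plain SQL so that
# no Python UDF is involved. celsius_expr is a hypothetical helper, not part
# of the SingleStore Connector or of Spark.
def celsius_expr(column):
  # Same formula as convert_to_c, using built-in SQL arithmetic and ROUND.
  return f"ROUND(({column} - 32) * 5 / 9, 2)"

query = (
  f"SELECT Date, {celsius_expr('Max')} AS Max_C, {celsius_expr('Min')} AS Min_C "
  "FROM temperatures WHERE City = 'San Francisco'"
)
print(query)
```

Running ```spark.sql(query).explain()``` in this notebook would show whether the projection is still evaluated by Spark or is handled inside the scan.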

In [0]:
display(spark.sql(
  "SELECT Date, convert_to_c(Max) as Max_C, convert_to_c(Min) as Min_C FROM temperatures WHERE City = 'San Francisco'"
))

Benefits of the SingleStore Connector:
- Implemented as a native Spark SQL plugin.
- Accelerates ingest from Spark via compression.
- Supports data loading and extraction from database tables and Spark DataFrames.
- Integrates with the Catalyst query optimiser and supports robust SQL pushdown.
- Accelerates ML workloads.