# Spark SQL Views

## Overview

The previous section introduced SQL tables in Spark. Tables persist beyond the Spark session in which they were created.
In this section, we look at Spark views. A view in Spark is defined on top of an existing table and can be either global, i.e.
visible across all SparkSessions of a Spark application on a given cluster, or session-scoped, i.e. visible only to the
SparkSession that created it [1]. In both cases views are temporary, i.e. once the Spark application terminates, they disappear [1].

## Spark SQL views

Creating a view in Spark has a similar syntax to creating a table. The difference between a table and a view
is that a view does not actually hold any data [1]. Once a view is created, we can query it just like we
can query a table. For example, a global temporary view over an existing table can be created as follows:


```
query = "CREATE OR REPLACE GLOBAL TEMP VIEW iris_sepal_patal_width_view AS SELECT sepal_width, petal_width FROM iris_dataset"
spark.sql(query)
```

A local (session-scoped) view is created in the same way, omitting the GLOBAL keyword:


```
query = "CREATE OR REPLACE TEMP VIEW iris_sepal_patal_width_view AS SELECT sepal_width, petal_width FROM iris_dataset"
spark.sql(query)
```

The same effect can be achieved using the API exposed by _DataFrame_, as shown below:


```
# Global temporary view, registered in the global_temp database
df.createOrReplaceGlobalTempView("iris_sepal_patal_width_view")
# Session-scoped temporary view
df.createOrReplaceTempView("iris_sepal_patal_width_view")
```

Note that when querying a global temporary view we must qualify its name with the ```global_temp``` prefix, i.e. ```global_temp.<your-view-name>```. This is
because Spark creates global temporary views in a global temporary database called ```global_temp``` [1]. For example,

```
SELECT * FROM global_temp.<your-view-name>
```


Similarly, we can drop a view using either SQL or the catalog API exposed by the _SparkSession_.

```
# Using SQL
spark.sql("DROP VIEW IF EXISTS <your-view-name>")
# Using the catalog API
spark.catalog.dropGlobalTempView("<your-view-name>")
spark.catalog.dropTempView("<your-view-name>")
```



## Summary

In this section we discussed views in Spark. Views are like tables but they do not hold any data and they are temporary, i.e. they do
not outlive the Spark application that created them. We can have either global or local views. The former can be accessed by
every _SparkSession_ that is active within the Spark application, whereas the latter can be accessed only by the specific session that created it.

## References

1. Jules S. Damji, Brooke Wenig, Tathagata Das, Denny Lee, _Learning Spark: Lightning-Fast Data Analytics_, 2nd Edition, O'Reilly.