
<div  style="text-align: center; line-height: 0; padding-top: 9px;">
  <img src="https://raw.githubusercontent.com/derar-alhussein/Databricks-Certified-Data-Engineer-Associate/main/Includes/images/bookstore_schema.png" alt="Databricks Learning" style="width: 600">
</div>

In [0]:
%run ../Includes/GenData


## Reading Stream

In [0]:
(spark.readStream
      .table("books")
      .createOrReplaceTempView("books_streaming_tmp_vw")
)


## Displaying Streaming Data

Before displaying any data, you must create a volume to store the checkpoints.
Run the following commands to create a catalog named `demo`, a schema named `dwh` inside it, and a volume named `temporal` inside that schema.

```sql
CREATE CATALOG IF NOT EXISTS demo;
CREATE SCHEMA IF NOT EXISTS demo.dwh;
CREATE VOLUME IF NOT EXISTS demo.dwh.temporal;
```

In [0]:
df = spark.table("books_streaming_tmp_vw")
display(
    df,
    checkpointLocation="/Volumes/demo/dwh/temporal/chkpoint/read01"
)

## Applying Transformations

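You must use a different checkpoint location than in the previous cell for this query to work, because each streaming query needs its own checkpoint directory.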

In [0]:
df = spark.table("books_streaming_tmp_vw")
agg_df = (
    df.groupBy("category")
    .count()
    .withColumnRenamed("count", "total_books")
)
display(
    agg_df,
    checkpointLocation="/Volumes/demo/dwh/temporal/chkpoint/readgb"
)
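
Each streaming query in this notebook points at its own directory under the same volume. A small helper can keep those paths consistent. The sketch below is only an illustration: `checkpoint_path` and `CHECKPOINT_ROOT` are hypothetical names that are not part of the course material, and the root path is the one assumed by the cells above.

```python
# Hypothetical helper (not part of the course material).
# Assumes the volume created earlier: demo.dwh.temporal
CHECKPOINT_ROOT = "/Volumes/demo/dwh/temporal/chkpoint"


def checkpoint_path(query_name, root=CHECKPOINT_ROOT):
    """Return a dedicated checkpoint directory for one streaming query."""
    if not query_name or "/" in query_name:
        raise ValueError("query_name must be a non-empty name without '/'")
    return f"{root.rstrip('/')}/{query_name}"


print(checkpoint_path("readgb"))
# prints /Volumes/demo/dwh/temporal/chkpoint/readgb
```

The result could then be passed as `checkpointLocation=checkpoint_path("readgb")` in a `display` call like the one above.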


## Unsupported Operations

In [0]:
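# Expected to fail: sorting a streaming DataFrame is only supported
# after an aggregation and in complete output mode.
display(
  spark.table("books_streaming_tmp_vw").orderBy("author"),
  checkpointLocation="/Volumes/demo/dwh/temporal/chkpoint/readau"
)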
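
The cell above sorts the raw stream, which Structured Streaming does not support. Sorting is allowed once the stream has been aggregated and the query runs in complete output mode. The following is a minimal sketch of that pattern, not the course's own solution: `top_authors_stream` is a hypothetical function name, and the checkpoint path and target table in the commented usage are assumptions.

```python
def top_authors_stream(streaming_df):
    """Aggregate first, then sort.

    Sorting is applied to the aggregated result, which Structured
    Streaming accepts when the query runs in complete output mode.
    """
    return (
        streaming_df.groupBy("author")
        .count()
        .withColumnRenamed("count", "total_books")
        .orderBy("total_books", ascending=False)
    )


# Usage in the notebook (assumes the temp view from "Reading Stream" exists;
# the checkpoint path and table name below are hypothetical):
# (top_authors_stream(spark.table("books_streaming_tmp_vw"))
#       .writeStream
#       .trigger(availableNow=True)
#       .outputMode("complete")
#       .option("checkpointLocation", "/Volumes/demo/dwh/temporal/chkpoint/readsorted")
#       .table("top_authors")
# )
```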


## Persisting Streaming Data

In [0]:
%sql
CREATE OR REPLACE TEMP VIEW author_counts_tmp_vw AS (
  SELECT author, count(book_id) AS total_books
  FROM books_streaming_tmp_vw
  GROUP BY author
)

In [0]:
# Wait for the run to finish so the table exists before it is queried.
(spark.table("author_counts_tmp_vw")
      .writeStream
      .trigger(availableNow=True)
      .outputMode("complete")
      .option("checkpointLocation", "/Volumes/demo/dwh/temporal/chkpoint/readvw")
      .table("author_counts")
      .awaitTermination()
)

In [0]:
%sql
SELECT *
FROM author_counts

## Adding New Data

In [0]:
%sql
INSERT INTO books
values ("B19", "Introduction to Modeling and Simulation", "Mark W. Spong", "Computer Science", 25),
        ("B20", "Robot Modeling and Control", "Mark W. Spong", "Computer Science", 30),
        ("B21", "Turing's Vision: The Birth of Computer Science", "Chris Bernhardt", "Computer Science", 35)

## Streaming in Batch Mode 

In [0]:
%sql
INSERT INTO books
values ("B16", "Hands-On Deep Learning Algorithms with Python", "Sudharsan Ravichandiran", "Computer Science", 25),
        ("B17", "Neural Network Methods in Natural Language Processing", "Yoav Goldberg", "Computer Science", 30),
        ("B18", "Understanding digital signal processing", "Richard Lyons", "Computer Science", 35)

In [0]:
(spark.table("author_counts_tmp_vw")
      .writeStream
      .trigger(availableNow=True)
      .outputMode("complete")
      .option("checkpointLocation", "/Volumes/demo/dwh/temporal/chkpoint/readvw/wr")
      .table("author_counts")
      .awaitTermination()
)

In [0]:
%sql
SELECT *
FROM author_counts
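
Both writes above use `outputMode("complete")`, which means the whole aggregated result is written on each trigger, not just the changed rows. The following plain-Python analogy may help to picture this. It is not Spark code: `run_batch` and `state` are hypothetical names, and only the rows inserted in this notebook are used, so the counts ignore whatever `GenData` loaded initially.

```python
from collections import Counter


def run_batch(state, new_rows):
    """Fold one batch of (book_id, author) rows into the running counts
    and return the full result table, as complete mode would emit it."""
    state.update(author for _book_id, author in new_rows)
    return dict(state)


state = Counter()  # running aggregation state, kept between batches

# Rows from the two INSERT statements in this notebook (book_id, author only)
batch_1 = [
    ("B19", "Mark W. Spong"),
    ("B20", "Mark W. Spong"),
    ("B21", "Chris Bernhardt"),
]
batch_2 = [
    ("B16", "Sudharsan Ravichandiran"),
    ("B17", "Yoav Goldberg"),
    ("B18", "Richard Lyons"),
]

table_after_1 = run_batch(state, batch_1)
table_after_2 = run_batch(state, batch_2)

# The second result still contains every author from the first batch.
print(table_after_2)
```

After the second batch the emitted table holds all five authors, including the ones whose counts did not change.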