## Preview Data using console

Let us preview the data consumed from a Kafka topic using Spark Structured Streaming APIs.
* We can use either `console` or `memory` as the `writeStream.format` to preview the data.
* First we will see how to preview the data using `console`. For that we need to launch the PySpark CLI and run the relevant code.

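Launch PySpark using one of the commands below, then run the Spark Structured Streaming code. The `--packages` coordinate should match the Spark and Scala versions on your cluster; `spark-sql-kafka-0-10_2.12:3.0.1` targets Spark 3.0.1 built with Scala 2.12.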

**Using Pyspark2**

```shell
export PYSPARK_PYTHON=python3

pyspark2 \
    --master yarn \
    --packages org.apache.spark:spark-sql-kafka-0-10_2.12:3.0.1 \
    --conf spark.ui.port=0 \
    --conf spark.sql.warehouse.dir=/user/${USER}/warehouse
```

**Using Pyspark3**

```shell
export PYSPARK_PYTHON=python3

pyspark3 \
    --master yarn \
    --packages org.apache.spark:spark-sql-kafka-0-10_2.12:3.0.1 \
    --conf spark.ui.port=0 \
    --conf spark.sql.warehouse.dir=/user/${USER}/warehouse
```

* Let us create a data frame and preview the schema.

```python
import getpass
username = getpass.getuser()

kafka_bootstrap_servers = 'w01.itversity.com:9092,w02.itversity.com:9092'

df = spark. \
  readStream. \
  format('kafka'). \
  option('kafka.bootstrap.servers', kafka_bootstrap_servers). \
  option('subscribe', f'{username}_retail'). \
  load()

df.printSchema()
```
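
The Kafka source returns the same set of columns regardless of the topic, so the output of `printSchema` can be checked against a known list. The following is a minimal sketch, assuming Spark 3.0 defaults (the `includeHeaders` option is not set). `KAFKA_SOURCE_COLUMNS` and `missing_kafka_columns` are hypothetical names introduced here for illustration.

```python
# Columns exposed by the Kafka source; key and value arrive as binary
KAFKA_SOURCE_COLUMNS = [
    'key', 'value', 'topic', 'partition',
    'offset', 'timestamp', 'timestampType'
]


def missing_kafka_columns(columns):
    """Return the expected Kafka source columns that are absent from `columns`."""
    return [c for c in KAFKA_SOURCE_COLUMNS if c not in columns]


# Standalone demonstration with a plain list instead of df.columns
print(missing_kafka_columns(['key', 'value']))
```

In the PySpark CLI, `missing_kafka_columns(df.columns)` should return an empty list, and `df.isStreaming` should be `True`.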

We can use the code snippet below in the CLI to preview the data using `console`.

```python
# Kafka delivers key and value as binary; cast both to strings so they are readable
df.selectExpr("CAST(key AS STRING)", "CAST(value AS STRING)"). \
    writeStream. \
    format("console"). \
    option('truncate', 'false'). \
    trigger(processingTime='5 seconds'). \
    start()
```
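
`start()` returns a streaming query handle, and the console sink keeps printing into the CLI until the query is stopped. The following is a minimal sketch that wraps the snippet above in a function so that the handle is kept for later use. `preview_on_console` and its `interval` parameter are hypothetical names introduced here, not Spark APIs.

```python
def preview_on_console(df, interval='5 seconds'):
    """Start a console preview of a Kafka data frame and return the query handle."""
    return df.selectExpr("CAST(key AS STRING)", "CAST(value AS STRING)"). \
        writeStream. \
        format("console"). \
        option('truncate', 'false'). \
        trigger(processingTime=interval). \
        start()
```

In the CLI this could be used as `query = preview_on_console(df)`. After that, `query.status` shows what the query is currently doing and `query.stop()` ends the preview.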