# Processing Orders

## Create the required topics

Assuming we are using the `curso800` account, let's start by creating the `orders.curso800` and `manufacturing.curso800` topics in Kafka.

The first topic receives the customer orders from the producer script; the second is where the streaming app writes the manufacturing orders it generates.

```bash
module load kafka

export BROKER="10.133.29.20:9092"

kafka-topics.sh --bootstrap-server $BROKER --topic orders.curso800 --create --partitions 1 --replication-factor 1

kafka-topics.sh --bootstrap-server $BROKER --topic manufacturing.curso800 --create --partitions 1 --replication-factor 1
```
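If you are working with a different account, the same two commands apply with the account suffix changed. As a minimal sketch (the helper names `lab_topics` and `create_topic_command` are illustrative and not part of the lab files), the topic names and commands can be derived like this:

```python
import shlex
from typing import List


def lab_topics(account: str) -> List[str]:
    """Return the two topic names used in this lab for a given account."""
    return [f"orders.{account}", f"manufacturing.{account}"]


def create_topic_command(broker: str, topic: str) -> str:
    """Build the kafka-topics.sh command shown above for one topic."""
    args = [
        "kafka-topics.sh",
        "--bootstrap-server", broker,
        "--topic", topic,
        "--create",
        "--partitions", "1",
        "--replication-factor", "1",
    ]
    # shlex.join quotes any argument that the shell would otherwise split.
    return shlex.join(args)


# Print the commands for the example account and broker used in this section.
for topic in lab_topics("curso800"):
    print(create_topic_command("10.133.29.20:9092", topic))
```

The printed lines can be pasted into the shell after `module load kafka`.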

## Review the consumer app code and launch it

Review the consumer app code: [Unit_8_processing_orders_lab.py](exercises/Unit_8_processing_orders_lab.py)

**Update the topic names `orders.curso800` and `manufacturing.curso800` to the values that match your own account.**

Remove the checkpoint dir (in case it exists) and launch the app:

- Option 1: Using Spark 3.4.3:

```bash
module load spark/3.4.3
export BROKER="<broker_ip_address>:9092"

# Remove previous checkpoint dir (in case it exists)
hdfs dfs -rm -r -f orders_checkpoint_dir

spark-submit --conf spark.dynamicAllocation.enabled=false --num-executors 2 --packages org.apache.spark:spark-sql-kafka-0-10_2.12:3.4.3 Unit_8_processing_orders_lab.py
```


- Option 2: Using Spark 2.4.0:

```bash
export BROKER="<broker_ip_address>:9092"

# Remove previous checkpoint dir (in case it exists)
hdfs dfs -rm -r -f orders_checkpoint_dir

spark-submit --conf spark.dynamicAllocation.enabled=false --num-executors 2 --packages org.apache.spark:spark-sql-kafka-0-10_2.11:2.4.0 Unit_8_processing_orders_lab.py
```

NOTE: It is important to **remove the previous checkpoint dir** (in case it exists) if you modify the app or if you want to restart from the beginning. Otherwise you will get strange error messages.
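To see where the checkpoint dir comes into play, here is a minimal sketch of the general shape of a Kafka-to-Kafka Structured Streaming job. It is not the lab script: the function name `start_passthrough` and the pass-through transformation are assumptions made for illustration, and the real app generates manufacturing orders instead of copying records. Spark records its progress (the Kafka offsets already processed) under `checkpointLocation`, so a restart with an existing dir resumes from that point instead of starting over.

```python
def start_passthrough(spark, broker, in_topic, out_topic, checkpoint_dir):
    """Copy records from in_topic to out_topic unchanged (illustrative only).

    `spark` is an existing SparkSession; the job needs the
    spark-sql-kafka package passed to spark-submit via --packages.
    """
    # Source: subscribe to the input topic on the given broker.
    orders = (
        spark.readStream.format("kafka")
        .option("kafka.bootstrap.servers", broker)
        .option("subscribe", in_topic)
        .load()
    )

    # Kafka delivers key and value as binary. The Kafka sink expects a
    # "value" column (and optionally "key"), so cast both to strings.
    records = orders.selectExpr(
        "CAST(key AS STRING) AS key",
        "CAST(value AS STRING) AS value",
    )

    # Sink: write to the output topic. Progress is tracked in checkpoint_dir,
    # which is why a stale dir affects the next run.
    return (
        records.writeStream.format("kafka")
        .option("kafka.bootstrap.servers", broker)
        .option("topic", out_topic)
        .option("checkpointLocation", checkpoint_dir)
        .start()
    )
```

In a script launched with `spark-submit`, this would be called with a session from `SparkSession.builder.getOrCreate()` and `"orders_checkpoint_dir"` as the last argument, followed by `awaitTermination()` on the returned query.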

## Review the producer code and launch it

Review the producer code: [Unit_8_orders_producer_kafka-python.py](exercises/Unit_8_orders_producer_kafka-python.py)

As you can see, it is a standalone Python program that uses `kafka-python`.

**Update the topic name `orders.curso800` to the value that matches your own account.**


Launch the producer script:

```bash
module load anaconda3/2024.02-1
python Unit_8_orders_producer_kafka-python.py
```
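For orientation, a standalone `kafka-python` producer of this kind typically has the shape sketched below. This is not the lab script: the order fields (`order_id`, `product`, `quantity`), the `PRODUCTS` list and the function names are all hypothetical, chosen only to illustrate the send loop.

```python
import json
import random
import time

PRODUCTS = ["widget", "gadget", "gizmo"]  # hypothetical catalogue


def make_order(order_id, rng):
    """Build one illustrative order as a plain dict."""
    return {
        "order_id": order_id,
        "product": rng.choice(PRODUCTS),
        "quantity": rng.randint(1, 10),
    }


def serialize(order):
    """Encode an order as UTF-8 JSON bytes, the form sent to Kafka."""
    return json.dumps(order).encode("utf-8")


def run(broker, topic, count=10, delay=1.0):
    """Send `count` orders to `topic`, one every `delay` seconds."""
    # Imported here so the helpers above can be used without kafka-python.
    from kafka import KafkaProducer

    producer = KafkaProducer(bootstrap_servers=broker, value_serializer=serialize)
    rng = random.Random()
    try:
        for order_id in range(count):
            producer.send(topic, value=make_order(order_id, rng))
            time.sleep(delay)
    finally:
        # Make sure buffered messages are delivered before exiting.
        producer.flush()
        producer.close()
```

Calling `run("<broker_ip_address>:9092", "orders.curso800")` would send ten such orders to the orders topic.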

## See how new manufacturing orders are generated in real-time

With a console consumer we can now watch new manufacturing orders being generated by the streaming app and sent to the `manufacturing.curso800` topic:

```bash
module load kafka
export BROKER="10.133.29.20:9092"
export TOPIC="manufacturing.curso800"
kafka-console-consumer.sh --bootstrap-server $BROKER --topic $TOPIC --from-beginning
```
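The same check can be done from Python with `kafka-python`, which the producer already uses. The sketch below is an assumption-laden equivalent of the console consumer, not part of the lab files: `format_record` and `tail_topic` are hypothetical names, and values are shown as plain UTF-8 text because the exact message format depends on the streaming app. Using `auto_offset_reset="earliest"` with no consumer group mirrors `--from-beginning`.

```python
def format_record(partition, offset, value):
    """Render one Kafka record as a single printable line."""
    text = value.decode("utf-8", errors="replace")
    return f"[{partition}:{offset}] {text}"


def tail_topic(broker, topic):
    """Print every record in `topic`, starting from the oldest one."""
    # Imported here so format_record can be used without kafka-python.
    from kafka import KafkaConsumer

    consumer = KafkaConsumer(
        topic,
        bootstrap_servers=broker,
        auto_offset_reset="earliest",  # start at the beginning of the topic
        group_id=None,                 # no consumer group, no committed offsets
    )
    try:
        # Iterating blocks and yields records as they arrive; stop with Ctrl+C.
        for record in consumer:
            print(format_record(record.partition, record.offset, record.value))
    finally:
        consumer.close()
```

For example, `tail_topic("10.133.29.20:9092", "manufacturing.curso800")` would print the manufacturing orders as they arrive.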