[Reference](https://levelup.gitconnected.com/mlops-building-a-real-time-data-pipeline-with-kafka-two-projects-a-step-by-step-guide-d93bace2676c)

# Installation Steps:

## 1. Install Prerequisites:

In [2]:
# brew install openjdk@11

In [3]:
# echo 'export PATH="/usr/local/opt/openjdk@11/bin:$PATH"' >> ~/.zshrc

## 2. Install Kafka:

In [4]:
# brew install kafka

## 3. Start ZooKeeper:

In [5]:
# zookeeper-server-start /usr/local/etc/kafka/zookeeper.properties

## 4. Start Kafka Server:

In [6]:
# kafka-server-start /usr/local/etc/kafka/server.properties
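
Before moving on, it can help to confirm that the broker is accepting connections. The following is a minimal sketch, assuming the default listener `localhost:9092` used in the rest of this guide. `is_port_open` is a hypothetical helper, not part of Kafka, and a successful TCP connect only shows that something is listening on that port.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unresolved hosts.
        return False


if __name__ == "__main__":
    # 9092 is the Kafka listener port assumed throughout this guide.
    status = "reachable" if is_port_open("localhost", 9092) else "not reachable"
    print(f"Kafka broker on localhost:9092 is {status}")
```

If the check reports "not reachable", make sure ZooKeeper was started first and that the Kafka server terminal shows no startup errors.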

# Data Flow:

![Data flow](https://miro.medium.com/v2/resize:fit:2000/format:webp/1*2M4FggXUsHHy1iJ9d4VajA.png)

# Topic & Producer:

## Task 1: Create a Kafka Producer for Simulated Weather Data:


In [7]:
# Create a topic named "weather_forecasts"
# (with a Homebrew install, the script is named kafka-topics, without the .sh suffix)
kafka-topics.sh --create --bootstrap-server localhost:9092 --replication-factor 1 --partitions 1 --topic weather_forecasts
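
The topic can also be created from Python. This is a sketch, assuming the `kafka-python` package and a broker running on `localhost:9092`. `create_topic_if_missing` is a hypothetical helper name introduced here; the partition and replication values mirror the command above.

```python
def create_topic_if_missing(name: str,
                            bootstrap_servers: str = "localhost:9092",
                            partitions: int = 1,
                            replication: int = 1) -> bool:
    """Create a topic; return False if it already exists."""
    # Imported lazily so this module loads even when kafka-python is absent.
    from kafka.admin import KafkaAdminClient, NewTopic
    from kafka.errors import TopicAlreadyExistsError

    admin = KafkaAdminClient(bootstrap_servers=bootstrap_servers)
    try:
        topic = NewTopic(name=name,
                         num_partitions=partitions,
                         replication_factor=replication)
        admin.create_topics(new_topics=[topic])
        return True
    except TopicAlreadyExistsError:
        return False
    finally:
        admin.close()
```

Calling `create_topic_if_missing("weather_forecasts")` should then have the same effect as the shell command, and it is safe to call again if the topic already exists.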

## Task 2: Import the Kafka Producer and Necessary Libraries

In [8]:
# from kafka import KafkaProducer
# import json  # If you need to send JSON-formatted data
# import time  # For any time-related tasks like sleep
# import random  # If you're simulating data and need random values


## Task 3: Create the Producer in Python
## Task 4: Generate Random Weather Data
## Task 5: Send Weather Data to the Topic

In [9]:
from kafka import KafkaProducer
import time
import random

producer = KafkaProducer(bootstrap_servers='localhost:9092')

WEATHER_STATES = ['Sunny', 'Rainy', 'Windy', 'Cloudy', 'Snowy']

while True:
    weather = random.choice(WEATHER_STATES)
    temperature = random.randint(-5, 35)  # temperatures from -5°C to 35°C
    humidity = random.randint(10, 90)  # humidity percentage

    message = f"Weather: {weather}, Temperature: {temperature}°C, Humidity: {humidity}%"
    producer.send('weather_forecasts', value=message.encode('utf-8'))
    time.sleep(5)  # Send every 5 seconds
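
The import cell above mentions `json` for JSON-formatted data. As an alternative to the formatted string, here is a minimal sketch that sends each reading as JSON through the `value_serializer` option of `KafkaProducer`. The field names and the helpers `make_reading`, `to_json_bytes`, and `run_producer` are illustrative assumptions, not part of the original scripts. A consumer reading these messages would decode them with `json.loads` instead of string splitting.

```python
import json
import random
import time

WEATHER_STATES = ['Sunny', 'Rainy', 'Windy', 'Cloudy', 'Snowy']


def make_reading() -> dict:
    """Simulate one reading with the same value ranges as the producer above."""
    return {
        "weather": random.choice(WEATHER_STATES),
        "temperature_c": random.randint(-5, 35),
        "humidity_pct": random.randint(10, 90),
    }


def to_json_bytes(reading: dict) -> bytes:
    """Serialize a reading to UTF-8 encoded JSON."""
    return json.dumps(reading).encode("utf-8")


def run_producer(topic: str = "weather_forecasts", interval_s: float = 5.0) -> None:
    """Send one JSON reading every interval_s seconds until interrupted."""
    from kafka import KafkaProducer  # needs kafka-python and a running broker

    producer = KafkaProducer(bootstrap_servers="localhost:9092",
                             value_serializer=to_json_bytes)
    try:
        while True:
            producer.send(topic, value=make_reading())
            time.sleep(interval_s)
    except KeyboardInterrupt:
        pass
    finally:
        producer.flush()  # deliver any buffered messages before exiting
        producer.close()
```

Structured fields avoid the need to parse units such as `°C` out of the message text on the consumer side.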

# Consumer:

## Task 6: Import the Kafka Consumer and Necessary Libraries
## Task 7: Create the Consumer in Python
## Task 8: Decode and Visualize the Weather Data

In [15]:
from kafka import KafkaConsumer
import matplotlib.pyplot as plt

consumer = KafkaConsumer('weather_forecasts', bootstrap_servers='localhost:9092')

temperatures, humidities = [], []

plt.ion()  # Turn on interactive mode

while True:
    for message in consumer:
        data = message.value.decode()
        temp = int(data.split("Temperature: ")[1].split("°C")[0])
        humidity = int(data.split("Humidity: ")[1].split("%")[0])

        temperatures.append(temp)
        humidities.append(humidity)

        if len(temperatures) > 50:
            temperatures.pop(0)
            humidities.pop(0)

        plt.clf()  # clear the plot
        plt.subplot(2, 1, 1)
        plt.plot(temperatures, label="Temperature (°C)")
        plt.legend()

        plt.subplot(2, 1, 2)
        plt.plot(humidities, label="Humidity (%)")
        plt.legend()

        plt.pause(5)  # refresh every 5 seconds
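
The string splitting in the consumer raises an error on any message that does not match the producer's format. A slightly more defensive sketch, assuming the exact message layout produced above, uses a regular expression and a bounded `collections.deque` for the 50-point window. `parse_weather`, `PATTERN`, and `WINDOW` are hypothetical names introduced here.

```python
import re
from collections import deque
from typing import Optional, Tuple

# Matches e.g. "Weather: Sunny, Temperature: -5°C, Humidity: 40%"
PATTERN = re.compile(
    r"Weather: (?P<weather>\w+), "
    r"Temperature: (?P<temp>-?\d+)°C, "
    r"Humidity: (?P<humidity>\d+)%"
)

WINDOW = 50  # same window size as the consumer above


def parse_weather(text: str) -> Optional[Tuple[str, int, int]]:
    """Return (weather, temperature, humidity), or None if text is malformed."""
    match = PATTERN.fullmatch(text.strip())
    if match is None:
        return None
    return match["weather"], int(match["temp"]), int(match["humidity"])


# deque(maxlen=...) drops the oldest item automatically, replacing pop(0).
temperatures = deque(maxlen=WINDOW)
humidities = deque(maxlen=WINDOW)

sample = "Weather: Snowy, Temperature: -3°C, Humidity: 72%"
parsed = parse_weather(sample)
if parsed is not None:
    _, temp, humidity = parsed
    temperatures.append(temp)
    humidities.append(humidity)
```

In the consumer loop, `parse_weather(message.value.decode())` could replace the two `split` lines, with malformed messages skipped when the result is `None`.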

# Execution of the Pipeline:

1. Complete installation steps 1 to 4, from installing the prerequisites to starting the Kafka server.
2. Create the Kafka topic as described above.
3. Set up a Python environment. Install virtualenv if it is not already installed; virtual environments help manage Python dependencies per project.

4. Create a virtual environment:

In [13]:
virtualenv kafka_env

5. Activate the virtual environment:

In [14]:
source kafka_env/bin/activate  # macOS/Linux
kafka_env\Scripts\activate     # Windows

6. Install the necessary Python libraries:

In [11]:
pip install kafka-python matplotlib

7. Write the Kafka Producer script: <br>
Save the producer code from the Topic & Producer section as a Python script (e.g., producer.py) that produces the simulated weather data.

8. Run the Kafka Producer script:

In [16]:
python producer.py

9. Run the Consumer Script in another terminal <br>
Navigate to the directory where your consumer script (consumer.py or any other name you've chosen) is saved.

Activate your virtual environment if it’s not already activated:

In [17]:
source kafka_env/bin/activate
python consumer.py

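10. Stop the Kafka server and ZooKeeper when you are done, then deactivate the virtual environment:

In [18]:
# With a Homebrew install, omit the .sh suffix (kafka-server-stop, zookeeper-server-stop)
kafka-server-stop.sh
zookeeper-server-stop.sh

# Deactivate the virtual environment:
deactivate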