# Kafka Producer — Stock Market Data Stream

This notebook acts as a **Kafka producer**. It reads stock market data from a processed CSV (`indexProcessed.csv`), draws one random row per second, and publishes each row as a JSON message to the Kafka topic `stock-market-index`, simulating a real-time stream of stock index data into your Kafka cluster.

Before executing, make sure the Kafka broker is running and the topic exists. Run the consumer notebook in parallel (or on another machine) to read the same stream.

In [None]:
%pip install kafka-python pandas

In [None]:
import pandas as pd # type: ignore
from kafka import KafkaProducer # type: ignore
from time import sleep
from json import dumps

In [None]:
# Configuration — change these to match your environment
BOOTSTRAP_SERVERS = ['localhost:9092']  # e.g. ['<EC2_IP>:9092'] for remote broker
TOPIC_NAME = 'stock-market-index'
CSV_PATH = 'indexProcessed.csv'
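
If the notebook is run in more than one environment, the settings above could be read from environment variables instead of being edited by hand. The sketch below is one possible way to do that. The variable names `KAFKA_BOOTSTRAP_SERVERS`, `KAFKA_TOPIC`, and `STOCK_CSV_PATH` and the helper `load_config` are hypothetical and are not defined by kafka-python.

```python
import os

def load_config(env=None):
    """Return (bootstrap_servers, topic, csv_path), falling back to the notebook defaults.

    The environment variable names used here are assumptions for illustration.
    """
    env = os.environ if env is None else env
    # Accept a comma-separated broker list, e.g. "host1:9092,host2:9092"
    raw = env.get("KAFKA_BOOTSTRAP_SERVERS", "localhost:9092")
    servers = [s.strip() for s in raw.split(",") if s.strip()]
    topic = env.get("KAFKA_TOPIC", "stock-market-index")
    csv_path = env.get("STOCK_CSV_PATH", "indexProcessed.csv")
    return servers, topic, csv_path

BOOTSTRAP_SERVERS, TOPIC_NAME, CSV_PATH = load_config()
```

With none of the variables set, this yields the same values as the configuration cell above.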

In [None]:
producer = KafkaProducer(
    bootstrap_servers=BOOTSTRAP_SERVERS,
    value_serializer=lambda x: dumps(x).encode('utf-8')
)
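
To see what the `value_serializer` puts on the wire, the same transformation can be run without a broker. The sketch below serializes a sample record and decodes it back the way a consumer's deserializer might. The helper names `serialize` and `deserialize` are illustrative, and the record fields are made up rather than taken from `indexProcessed.csv`.

```python
from json import dumps, loads

def serialize(value):
    # Same transformation as the producer's value_serializer: dict -> JSON -> UTF-8 bytes
    return dumps(value).encode("utf-8")

def deserialize(payload):
    # Inverse step a consumer could apply: UTF-8 bytes -> JSON -> dict
    return loads(payload.decode("utf-8"))

# Hypothetical record; real rows depend on the CSV's columns
sample = {"Index": "EXAMPLE", "Close": 101.5}
payload = serialize(sample)
print(payload)
print(deserialize(payload) == sample)
```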

In [None]:
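# Optional: send a test message to verify connectivity.
# send() is asynchronous, so block on the returned future to surface connection errors.
producer.send(TOPIC_NAME, value={'status': 'ok', 'source': 'producer'}).get(timeout=10)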

In [None]:
df = pd.read_csv(CSV_PATH)

In [None]:
df.head()

In [None]:
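# Publish one randomly sampled row per second; interrupt the kernel to stop
while True:
    # Each iteration draws independently, so the same row can be sent more than once
    dict_stock = df.sample(1).to_dict(orient="records")[0]
    producer.send(TOPIC_NAME, value=dict_stock)
    sleep(1)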
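
The streaming loop in this notebook runs until the kernel is interrupted. For short test runs, a bounded variant can be easier to work with. The sketch below is an alternative under stated assumptions, not the notebook's own method: `stream_records` is a hypothetical helper that takes a list of row dicts and any `send` callable, so it can be tried without a broker.

```python
import random
from time import sleep

def stream_records(records, send, n_messages=10, delay=1.0):
    """Send n_messages randomly chosen records, pausing `delay` seconds after each.

    Returns the number of records actually sent.
    """
    sent = 0
    try:
        for _ in range(n_messages):
            send(random.choice(records))
            sent += 1
            sleep(delay)
    except KeyboardInterrupt:
        # Interrupting the kernel ends the stream early instead of raising
        pass
    return sent
```

With the notebook's objects this might be called as `stream_records(df.to_dict(orient="records"), lambda row: producer.send(TOPIC_NAME, value=row), n_messages=30)`.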

In [None]:
# After interrupting the loop, block until all buffered messages are delivered
producer.flush()