# Real-Time Prediction with Apache Kafka and Neural Networks

In this notebook, we build a real-time data processing system with Apache Kafka and integrate a neural network that predicts next-day stock prices. We simulate the data, train a small feedforward model with TensorFlow/Keras, and implement a producer-consumer architecture in which the producer streams price messages and the consumer scores them as they arrive.

In [None]:
# Install required libraries (uncomment if necessary)
# !pip install kafka-python tensorflow

import json
import time
import random
import threading
import uuid
import datetime

from kafka import KafkaProducer, KafkaConsumer

import numpy as np
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

print('Libraries imported successfully!')

## Global Configuration

Here we define our Kafka broker URL, topics, and other parameters.

In [None]:
# Kafka configuration
broker_url = 'localhost:9092'
input_topic = 'input-data'
pred_topic = 'predictions'

# Other configurations
producer_interval = 1  # seconds between messages

print('Global configuration set.')

## Neural Network Training

We simulate a dataset of stock prices: 1,000 evenly spaced values from 100 to 200. For simplicity, each price is used as the input, and the same price plus Gaussian noise (standard deviation 2) stands in for the next day's price as the target. Then we build and train a small feedforward neural network.

In [None]:
# Simulate training data
np.random.seed(42)
prices = np.linspace(100, 200, num=1000)  # simulated prices
noise = np.random.normal(0, 2, prices.shape)
next_day_prices = prices + noise  # next day prices with noise

X = prices.reshape(-1, 1)
y = next_day_prices.reshape(-1, 1)

print('Training data created.')

# Build a simple neural network model
model = Sequential([
    keras.Input(shape=(1,)),  # one input feature: today's price
    Dense(64, activation='relu'),
    Dense(32, activation='relu'),
    Dense(1)  # output layer for regression
])

model.compile(optimizer='adam', loss='mean_squared_error')

print('Training the model...')
model.fit(X, y, epochs=50, batch_size=32, verbose=0)
print('Model training complete.')

# Save the trained model to disk so that the consumer can load it
model.save('stock_model.h5')
print('Model saved as stock_model.h5')
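
Because the simulated target is just the input price plus zero-mean Gaussian noise, the best any model can do on this data is learn roughly the identity mapping, with an error floor near the noise standard deviation of 2. The check below is a sketch of that reasoning, not part of the original pipeline: it fits an ordinary least-squares line with `np.polyfit` as a baseline. The names `prices_check`, `targets_check`, and `baseline_rmse` are introduced here purely for illustration.

```python
import numpy as np

# Recreate the simulated data with the same seed and parameters as the training cell.
rng_state = np.random.RandomState(42)
prices_check = np.linspace(100, 200, num=1000)
targets_check = prices_check + rng_state.normal(0, 2, prices_check.shape)

# Ordinary least-squares line as a baseline: target ~ slope * price + intercept.
slope, intercept = np.polyfit(prices_check, targets_check, deg=1)

# Root-mean-squared error of the baseline; it should sit near the noise std of 2.
residuals = targets_check - (slope * prices_check + intercept)
baseline_rmse = float(np.sqrt(np.mean(residuals ** 2)))

print(f"slope={slope:.3f}, intercept={intercept:.3f}, rmse={baseline_rmse:.3f}")
```

If the network's final training loss (a mean squared error) stays far above `baseline_rmse ** 2`, the unscaled inputs in the range 100 to 200 are one plausible thing to investigate first.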

## Producer Implementation

The producer simulates real-time streaming: every `producer_interval` seconds it publishes a JSON message containing a unique ID, a random stock price between 100 and 200, and a UTC timestamp to the `input-data` Kafka topic.

In [None]:
def start_producer():
    producer = KafkaProducer(
        bootstrap_servers=broker_url,
        value_serializer=lambda v: json.dumps(v).encode('utf-8')
    )
    
    while True:
        # Simulate a stock price
        price = random.uniform(100, 200)
        data = {
            'id': str(uuid.uuid4()),
            'features': {'price': price},
            # utcnow() is deprecated since Python 3.12; use a timezone-aware UTC timestamp
            'timestamp': datetime.datetime.now(datetime.timezone.utc).isoformat()
        }
        producer.send(input_topic, value=data)
        print('Produced:', data)
        time.sleep(producer_interval)

# To test the producer by itself, uncomment the following line:
# start_producer()
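
The message format can be exercised without a running broker. The sketch below builds one payload with the same keys as the producer cell and round-trips it through the same JSON/UTF-8 encoding that the `value_serializer` and `value_deserializer` lambdas apply. The helper names `make_message`, `serialize`, and `deserialize` are hypothetical and exist only for this illustration.

```python
import datetime
import json
import uuid


def make_message(price):
    """Build one payload with the same keys the producer cell uses."""
    return {
        'id': str(uuid.uuid4()),
        'features': {'price': price},
        'timestamp': datetime.datetime.now(datetime.timezone.utc).isoformat(),
    }


def serialize(value):
    # Mirrors the producer's value_serializer: dict -> JSON text -> UTF-8 bytes.
    return json.dumps(value).encode('utf-8')


def deserialize(raw):
    # Mirrors the consumer's value_deserializer: UTF-8 bytes -> JSON text -> dict.
    return json.loads(raw.decode('utf-8'))


message = make_message(150.25)
wire_bytes = serialize(message)
round_tripped = deserialize(wire_bytes)
print(type(wire_bytes).__name__, round_tripped == message)
```

Keeping the timestamp as an ISO 8601 string rather than a `datetime` object matters here, because `json.dumps` cannot serialize `datetime` values directly.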

## Consumer Implementation with Neural Network Prediction

The consumer loads the pre-trained neural network once at startup, subscribes to the `input-data` topic, and predicts the next day's price for each incoming message. The result (the message ID, the original price, the prediction, and the time the prediction was made) is then published to the `predictions` topic.

In [None]:
def start_consumer():
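    # Load the pre-trained model; compile=False skips restoring the optimizer
    # and loss, which are not needed for inference
    model = keras.models.load_model('stock_model.h5', compile=False)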
    
    consumer = KafkaConsumer(
        input_topic,
        bootstrap_servers=broker_url,
        auto_offset_reset='earliest',
        value_deserializer=lambda m: json.loads(m.decode('utf-8'))
    )
    
    producer = KafkaProducer(
        bootstrap_servers=broker_url,
        value_serializer=lambda v: json.dumps(v).encode('utf-8')
    )
    
    for msg in consumer:
        data = msg.value
        price = data['features']['price']
        
        # Preprocess the data (for our simple example, no scaling is applied)
        input_array = np.array([[price]])
        
        # Predict the next day price using the model
        # verbose=0 suppresses the progress bar that predict() prints on every call
        pred = model.predict(input_array, verbose=0)
        prediction = pred[0, 0]
        
        result = {
            'id': data['id'],
            'original_price': price,
            'predicted_next_day_price': float(prediction),
            # utcnow() is deprecated since Python 3.12; use a timezone-aware UTC timestamp
            'prediction_timestamp': datetime.datetime.now(datetime.timezone.utc).isoformat()
        }
        
        # Publish the prediction to the predictions topic
        producer.send(pred_topic, value=result)
        print('Consumed and Predicted:', result)

# To test the consumer by itself, uncomment the following line:
# start_consumer()
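
The consumer loop indexes `data['features']['price']` directly, so a single malformed message would raise an exception and end the loop. One way to guard against that, sketched below under the assumption that unusable messages should simply be skipped, is a small validation helper. The function `extract_price` is hypothetical and not part of the original notebook.

```python
import math


def extract_price(data):
    """Return the price as a float, or None if the message is unusable."""
    if not isinstance(data, dict):
        return None
    features = data.get('features')
    if not isinstance(features, dict):
        return None
    price = features.get('price')
    # bool is a subclass of int, so exclude it explicitly.
    if isinstance(price, bool) or not isinstance(price, (int, float)):
        return None
    # Reject NaN and infinities, which would propagate into the prediction.
    if not math.isfinite(price):
        return None
    return float(price)


samples = [
    {'id': 'a', 'features': {'price': 123.4}},   # valid
    {'id': 'b', 'features': {}},                 # missing price
    {'id': 'c', 'features': {'price': 'n/a'}},   # wrong type
    {'id': 'd'},                                 # missing features
]
parsed = [extract_price(sample) for sample in samples]
print(parsed)  # [123.4, None, None, None]
```

Inside the consumer loop, a `None` result could be handled with `continue`, optionally after logging the offending message.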

## Running the Producer and Consumer

For demonstration purposes, we run both the producer and consumer in separate threads. In production, these would typically be separate services.

In [None]:
# Start producer and consumer in separate daemon threads
producer_thread = threading.Thread(target=start_producer, daemon=True)
consumer_thread = threading.Thread(target=start_consumer, daemon=True)

producer_thread.start()
consumer_thread.start()

print('Producer and consumer threads started. Interrupt the kernel (or press Ctrl+C in a terminal) to stop waiting.')

# Keep the notebook running
try:
    while True:
        time.sleep(1)
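except KeyboardInterrupt:
    # The interrupt only ends this wait loop. The daemon threads are not
    # signalled; they end when the kernel or Python process exits.
    print('Interrupted. Daemon threads stop when the kernel or process exits.')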
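
Daemon threads running `while True` loops have no way to be stopped cleanly from the notebook. A common pattern for cooperative shutdown is a shared `threading.Event` that each loop checks. The sketch below demonstrates the pattern with an in-memory `queue.Queue` standing in for the Kafka topic, so it runs without a broker; `demo_producer`, `demo_consumer`, `stop_event`, and `channel` are illustrative names, not part of the original code.

```python
import queue
import threading
import time

stop_event = threading.Event()
channel = queue.Queue()   # in-memory stand-in for the Kafka topic
received = []


def demo_producer():
    count = 0
    # Event.wait doubles as an interruptible sleep: it returns True once set.
    while not stop_event.wait(timeout=0.01):
        channel.put(count)
        count += 1


def demo_consumer():
    # Poll with a timeout so the stop flag is rechecked regularly.
    while not stop_event.is_set():
        try:
            received.append(channel.get(timeout=0.05))
        except queue.Empty:
            continue


workers = [
    threading.Thread(target=demo_producer),
    threading.Thread(target=demo_consumer),
]
for worker in workers:
    worker.start()

time.sleep(0.2)      # let the workers exchange a few messages
stop_event.set()     # request shutdown
for worker in workers:
    worker.join(timeout=2)

print('messages received:', len(received))
print('all threads stopped:', not any(w.is_alive() for w in workers))
```

Applying the same idea to the real consumer requires that iteration over `KafkaConsumer` does not block forever; kafka-python's `consumer_timeout_ms` constructor argument, or explicit calls to `consumer.poll(timeout_ms=...)`, are two options worth considering for that.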

## Real-Time Monitoring

The notebook output shows each message as it is produced and consumed, along with the prediction details. In a production system, you might integrate a dashboard to monitor metrics such as message throughput, latency, and prediction accuracy.
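
As a starting point for such metrics, the sketch below accumulates a message count and the end-to-end latency between the producer's `timestamp` and the consumer's `prediction_timestamp`. It assumes the consumer has both ISO 8601 strings available for each message and that both are either naive or timezone-aware (mixing the two raises a `TypeError` on subtraction). The class `StreamMetrics` is hypothetical, and the timestamps in the usage lines are made-up illustration values.

```python
import datetime


class StreamMetrics:
    """Accumulates message count and end-to-end latency (hypothetical helper)."""

    def __init__(self):
        self.count = 0
        self.total_latency = 0.0
        self.max_latency = 0.0

    def record(self, produced_at, predicted_at):
        # Both arguments are ISO 8601 strings as written by isoformat().
        start = datetime.datetime.fromisoformat(produced_at)
        end = datetime.datetime.fromisoformat(predicted_at)
        latency = (end - start).total_seconds()
        self.count += 1
        self.total_latency += latency
        self.max_latency = max(self.max_latency, latency)

    def summary(self):
        mean = self.total_latency / self.count if self.count else 0.0
        return {'messages': self.count, 'mean_latency_s': mean,
                'max_latency_s': self.max_latency}


metrics = StreamMetrics()
metrics.record('2024-01-01T00:00:00.000000', '2024-01-01T00:00:00.250000')
metrics.record('2024-01-01T00:00:01.000000', '2024-01-01T00:00:01.750000')
print(metrics.summary())
# {'messages': 2, 'mean_latency_s': 0.5, 'max_latency_s': 0.75}
```

In the consumer loop, `metrics.record(data['timestamp'], result['prediction_timestamp'])` would be the corresponding call, with `summary()` printed periodically or exported to a monitoring system.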