# Kafka Producer Basics

This notebook demonstrates how to create and use Kafka producers to send messages to topics.

## Topics Covered:
- Connecting to a Kafka cluster
- Creating a simple producer
- Sending synchronous messages
- Sending asynchronous messages with callbacks
- Sending to a specific partition
- Batching and compression for throughput
- Error handling

## 1. Setup and Configuration

In [None]:
import os
import json
import time
from datetime import datetime
from kafka import KafkaProducer
from kafka.errors import KafkaError

# Kafka cluster connection
KAFKA_SERVERS = os.getenv('KAFKA_BOOTSTRAP_SERVERS', 'kafka1:29092,kafka2:29093,kafka3:29094')
print(f"Connecting to Kafka at: {KAFKA_SERVERS}")
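
A malformed `KAFKA_BOOTSTRAP_SERVERS` value is easier to diagnose before the producer tries to connect. The following is a minimal sketch of such a check; `parse_bootstrap_servers` is a hypothetical helper (not part of kafka-python) and it assumes plain `host:port` entries without IPv6 literals.

```python
from typing import List, Tuple

def parse_bootstrap_servers(servers: str) -> List[Tuple[str, int]]:
    """Split 'host:port,host:port' into (host, port) pairs, rejecting malformed entries."""
    pairs = []
    for entry in servers.split(','):
        entry = entry.strip()
        if not entry:
            continue  # tolerate trailing commas and stray whitespace
        host, sep, port = entry.rpartition(':')
        if not sep or not host or not port.isdigit():
            raise ValueError(f"Expected host:port, got {entry!r}")
        pairs.append((host, int(port)))
    if not pairs:
        raise ValueError("No bootstrap servers configured")
    return pairs

# The default value used in this notebook parses into three brokers.
print(parse_bootstrap_servers('kafka1:29092,kafka2:29093,kafka3:29094'))
```

Running this before creating the producer turns a typo in the environment variable into an immediate `ValueError` instead of a connection timeout.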

## 2. Create a Simple Producer

In [None]:
# Create producer with JSON serialization
producer = KafkaProducer(
    bootstrap_servers=KAFKA_SERVERS.split(','),
    value_serializer=lambda v: json.dumps(v).encode('utf-8'),
    key_serializer=lambda k: k.encode('utf-8') if k else None,
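    acks='all',  # Wait for all in-sync replicas to acknowledge
    retries=3,   # Retry failed sends up to 3 times
    max_in_flight_requests_per_connection=1  # One request at a time preserves ordering across retries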
)

print("✓ Producer created successfully!")
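
The two serializers can be tried on their own, without a broker, to see exactly which bytes would be sent. This sketch re-creates them as named functions (`serialize_value` and `serialize_key` are illustrative names, not part of kafka-python) with the same logic as the lambdas above.

```python
import json
from datetime import datetime

def serialize_value(value):
    # Same logic as the value_serializer lambda: JSON text encoded as UTF-8 bytes.
    return json.dumps(value).encode('utf-8')

def serialize_key(key):
    # Same logic as the key_serializer lambda: falsy keys (None or '') become None.
    return key.encode('utf-8') if key else None

print(serialize_value({'type': 'test', 'id': 1}))  # b'{"type": "test", "id": 1}'
print(serialize_key('msg-1'))                      # b'msg-1'
print(serialize_key(''))                           # None: an empty key is treated as "no key"

# datetime objects are not JSON serializable, which is why the messages
# in this notebook store datetime.now().isoformat() rather than the object.
try:
    serialize_value({'timestamp': datetime.now()})
except TypeError as exc:
    print(f"TypeError: {exc}")
```

Note that an empty-string key ends up as `None`, so such a message is partitioned as if it had no key.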

## 3. Send Messages Synchronously

In [None]:
topic_name = 'test-topic'

# Send a single message
message = {
    'timestamp': datetime.now().isoformat(),
    'message': 'Hello from Kafka Playground!',
    'type': 'test'
}

try:
    # Send synchronously and wait for acknowledgment
    future = producer.send(topic_name, value=message, key='msg-1')
    record_metadata = future.get(timeout=10)
    
    print("✓ Message sent successfully!")
    print(f"  Topic: {record_metadata.topic}")
    print(f"  Partition: {record_metadata.partition}")
    print(f"  Offset: {record_metadata.offset}")
except KafkaError as e:
    print(f"✗ Error sending message: {e}")
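
The send-then-wait pattern above can be wrapped in a small function so later cells do not repeat the `try`/`except`. The sketch below is one possible shape: `send_and_wait`, `StubProducer`, `StubFuture`, and `StubMetadata` are all hypothetical names introduced here, and the stubs exist only so the example runs without a broker. With a real producer you would pass `error_type=KafkaError`.

```python
from collections import namedtuple

# Stand-ins so the sketch runs without a broker; they are NOT kafka-python classes.
StubMetadata = namedtuple('StubMetadata', ['topic', 'partition', 'offset'])

class StubFuture:
    def __init__(self, result=None, error=None):
        self._result, self._error = result, error

    def get(self, timeout=None):
        if self._error is not None:
            raise self._error
        return self._result

class StubProducer:
    def __init__(self, fail=False):
        self._fail = fail
        self._next_offset = 0

    def send(self, topic, value=None, key=None):
        if self._fail:
            return StubFuture(error=TimeoutError('no acknowledgment received'))
        metadata = StubMetadata(topic, 0, self._next_offset)
        self._next_offset += 1
        return StubFuture(result=metadata)

def send_and_wait(producer, topic, value, key=None, timeout=10.0, error_type=Exception):
    """Send one record, block for its acknowledgment, and return the metadata or None."""
    try:
        future = producer.send(topic, value=value, key=key)
        return future.get(timeout=timeout)
    except error_type as exc:
        print(f"✗ Error sending to {topic}: {exc}")
        return None

ok = send_and_wait(StubProducer(), 'test-topic', {'type': 'test'}, key='msg-1')
print(ok)       # metadata from the stub
failed = send_and_wait(StubProducer(fail=True), 'test-topic', {'type': 'test'})
print(failed)   # None, after the error line is printed
```

Returning `None` on failure keeps the calling cell simple, at the cost of hiding the exception; re-raising would be the better choice where a lost message must stop the run.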

## 4. Send Multiple Messages

In [None]:
# Send 10 messages, blocking on each acknowledgment before sending the next
for i in range(10):
    message = {
        'id': i,
        'timestamp': datetime.now().isoformat(),
        'data': f'Message number {i}',
        'value': i * 10
    }
    
    future = producer.send(topic_name, value=message, key=f'msg-{i}')
    record_metadata = future.get(timeout=10)
    print(f"Sent message {i} to partition {record_metadata.partition} at offset {record_metadata.offset}")
    
    time.sleep(0.1)  # Small delay

print("\n✓ All messages sent!")

## 5. Asynchronous Sending with Callbacks

In [None]:
# Define callback functions
def on_send_success(record_metadata):
    print(f"✓ Success! Topic: {record_metadata.topic}, Partition: {record_metadata.partition}, Offset: {record_metadata.offset}")

def on_send_error(excp):
    print(f"✗ Error: {excp}")

# Send messages asynchronously
for i in range(5):
    message = {
        'id': i,
        'timestamp': datetime.now().isoformat(),
        'message': f'Async message {i}'
    }
    
    future = producer.send(topic_name, value=message)
    future.add_callback(on_send_success)
    future.add_errback(on_send_error)

# Block until every buffered message has been sent and its callback has fired
producer.flush()
print("\n✓ All async messages flushed!")
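
Printing from every callback gets noisy once more than a few messages are in flight. As a sketch of an alternative, the class below tallies outcomes instead. `DeliveryTracker` is a hypothetical helper, and the lock is there on the assumption that callbacks may run on the producer's background thread rather than the notebook's main thread. `FakeMetadata` only mimics the `partition` and `offset` attributes used in this notebook.

```python
import threading
from collections import namedtuple

class DeliveryTracker:
    """Tally asynchronous send outcomes in a thread-safe way."""

    def __init__(self):
        self._lock = threading.Lock()
        self.succeeded = 0
        self.failed = 0
        self.last_offsets = {}  # partition -> highest acknowledged offset

    def on_success(self, record_metadata):
        with self._lock:
            self.succeeded += 1
            partition = record_metadata.partition
            previous = self.last_offsets.get(partition, -1)
            self.last_offsets[partition] = max(previous, record_metadata.offset)

    def on_error(self, exc):
        with self._lock:
            self.failed += 1

    def summary(self):
        with self._lock:
            return f"{self.succeeded} delivered, {self.failed} failed"

# Demonstration without a broker: feed the tracker hand-made results.
FakeMetadata = namedtuple('FakeMetadata', ['topic', 'partition', 'offset'])
tracker = DeliveryTracker()
tracker.on_success(FakeMetadata('test-topic', 0, 41))
tracker.on_success(FakeMetadata('test-topic', 0, 42))
tracker.on_success(FakeMetadata('test-topic', 1, 7))
tracker.on_error(RuntimeError('simulated failure'))
print(tracker.summary())      # 3 delivered, 1 failed
print(tracker.last_offsets)   # {0: 42, 1: 7}
```

With a real producer, `tracker.on_success` and `tracker.on_error` would be passed to `add_callback` and `add_errback` in place of the printing functions, and `tracker.summary()` read after `producer.flush()`.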

## 6. Send to Specific Partition

In [None]:
# Send message to a specific partition
message = {
    'timestamp': datetime.now().isoformat(),
    'message': 'Message for partition 0'
}

future = producer.send(topic_name, value=message, partition=0)
record_metadata = future.get(timeout=10)

print(f"✓ Sent to partition: {record_metadata.partition}")
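
When `partition` is omitted and a key is supplied, the partition is derived from a hash of the key instead. The toy function below illustrates that idea only: it uses CRC32 as a stand-in, whereas Kafka's default partitioner uses a murmur2 hash, so the partition numbers it prints will not match what a real cluster assigns. `toy_partition_for` and the partition count of 3 are assumptions made for the example.

```python
import zlib

def toy_partition_for(key: str, num_partitions: int) -> int:
    """Illustrative only: hash the key bytes and map the hash onto a partition index."""
    if num_partitions <= 0:
        raise ValueError("num_partitions must be positive")
    key_bytes = key.encode('utf-8')
    return zlib.crc32(key_bytes) % num_partitions  # stand-in for Kafka's murmur2

# The same key always maps to the same partition while the partition count is fixed.
for key in ['msg-1', 'msg-2', 'msg-1']:
    print(f"{key} -> partition {toy_partition_for(key, 3)}")

# Changing the partition count can move a key to a different partition.
print(toy_partition_for('msg-1', 3), toy_partition_for('msg-1', 4))
```

Passing `partition=0` explicitly, as in the cell above, bypasses this key-based assignment entirely.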

## 7. Batch Sending for Performance

In [None]:
# Create a producer optimized for batching
batch_producer = KafkaProducer(
    bootstrap_servers=KAFKA_SERVERS.split(','),
    value_serializer=lambda v: json.dumps(v).encode('utf-8'),
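    batch_size=16384,  # Maximum bytes per partition batch (the kafka-python default)
    linger_ms=10,      # Wait up to 10 ms for more messages before sending a batch
    compression_type='gzip'  # Compress each batch before it is sent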
)

start_time = time.time()

# Queue 100 messages; the producer groups them into batches in the background
for i in range(100):
    message = {
        'id': i,
        'timestamp': datetime.now().isoformat(),
        'data': f'Batch message {i}'
    }
    batch_producer.send(topic_name, value=message)

batch_producer.flush()
elapsed = time.time() - start_time

print(f"\n✓ Sent 100 messages in {elapsed:.2f} seconds")
print(f"  Throughput: {100/elapsed:.2f} messages/second")

batch_producer.close()
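
Batching and compression work together because similar messages compress far better as a group than one at a time. The sketch below shows the effect with the standard-library `gzip` module on payloads shaped like the ones above (timestamps are left out so the numbers are repeatable). It is only an approximation: it does not reproduce Kafka's record batch format or its per-record overhead.

```python
import gzip
import json

# 100 JSON payloads similar to the batch messages sent above.
messages = [{'id': i, 'data': f'Batch message {i}'} for i in range(100)]
payloads = [json.dumps(m).encode('utf-8') for m in messages]

raw_bytes = sum(len(p) for p in payloads)
# Compressing each message separately pays the gzip header cost 100 times.
compressed_individually = sum(len(gzip.compress(p)) for p in payloads)
# Compressing them as one block lets gzip exploit the repetition between messages.
compressed_together = len(gzip.compress(b''.join(payloads)))

print(f"Raw:                     {raw_bytes} bytes")
print(f"Compressed individually: {compressed_individually} bytes")
print(f"Compressed as one batch: {compressed_together} bytes")
```

This is the reason a nonzero `linger_ms` tends to help when compression is enabled: fuller batches give the compressor more repeated structure to work with.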

## 8. Cleanup

In [None]:
# Close the producer
producer.close()
print("✓ Producer closed")
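
If a cell raises before the cleanup cell runs, the producer stays open. One way to guard against that in a script is `contextlib.closing`, which calls `close()` on exit even when an exception occurs. The sketch below uses a stub so it runs without a broker; `StubProducer` is a hypothetical stand-in, and the same pattern is assumed to apply to any object exposing a `close()` method, including `KafkaProducer`.

```python
from contextlib import closing

class StubProducer:
    """Stand-in with the two methods the example needs; NOT a kafka-python class."""

    def __init__(self):
        self.closed = False
        self.sent = []

    def send(self, topic, value=None):
        self.sent.append((topic, value))

    def close(self):
        self.closed = True

stub = StubProducer()
try:
    with closing(stub) as p:
        p.send('test-topic', value={'id': 1})
        raise RuntimeError("simulated failure mid-cell")
except RuntimeError as exc:
    print(f"Caught: {exc}")

print(f"Closed after failure: {stub.closed}")  # True
```

In a notebook, where the producer is shared across cells, the explicit cleanup cell above remains the simpler choice; the context-manager form fits better once the code moves into a script.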

## Key Takeaways

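1. **acks='all'**: Waits for all in-sync replicas to acknowledge each write (most durable, at the cost of higher latency)
2. **Batching**: `batch_size` and `linger_ms` let the producer group messages into fewer, larger requests, raising throughput in exchange for a few milliseconds of latency
3. **Compression**: `compression_type` compresses whole batches, reducing network bandwidth in exchange for some CPU
4. **Async sending**: Callbacks avoid blocking on every acknowledgment, which suits high-throughput scenarios
5. **Partitioning**: With the default partitioner, messages with the same key go to the same partition as long as the partition count does not change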

## Next Steps

Try the Consumer notebook (02_kafka_consumer_basics.ipynb) to read these messages!