# Install Python Kafka client

To install the **Python Kafka client**, you'll use the popular library `kafka-python`.



## ✅ Install `kafka-python`

### 🔧 Using pip:

```bash
pip install kafka-python
```

If `pip` on your system points to a Python 2 installation, use `pip3` so the package is installed for Python 3:

```bash
pip3 install kafka-python
```



## 🧪 Verify Installation

In a Python shell:

```python
from kafka import KafkaProducer, KafkaConsumer
```

If the import finishes without an `ImportError`, the library is installed ✅.
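
To also see which version was installed, a small check like the one below can help. It is only a sketch that uses the standard library (`importlib.metadata`, available from Python 3.8); the helper name `installed_kafka_python_version` is made up for this example.

```python
from importlib import metadata
from typing import Optional


def installed_kafka_python_version() -> Optional[str]:
    """Return the installed kafka-python version, or None if it is not installed."""
    try:
        return metadata.version("kafka-python")
    except metadata.PackageNotFoundError:
        return None


version = installed_kafka_python_version()
if version is None:
    print("kafka-python is not installed in this environment")
else:
    print(f"kafka-python {version} is installed")
```

Run it inside the same environment (or virtual environment) you installed the package into, otherwise it may report a different interpreter's packages.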



## 🛠️ Optional: Create Virtual Environment (Recommended)

```bash
python3 -m venv kafka-env
source kafka-env/bin/activate
pip install kafka-python
```




## 🧩 Problem Statement:

**"Send temperature readings from an IoT sensor to a Kafka topic named `temperature-readings`."**

Each message should include:

* `sensor_id`
* `temperature`
* `timestamp`
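
The three fields above can be sketched as a small helper that builds one message and a check that a message is complete. The names `build_reading`, `is_valid_reading` and `REQUIRED_FIELDS` are illustrative, and the UTC ISO 8601 timestamp is an assumption about the desired format, not a requirement from the problem statement.

```python
from datetime import datetime, timezone

# Field names taken from the problem statement
REQUIRED_FIELDS = ("sensor_id", "temperature", "timestamp")


def build_reading(sensor_id: str, temperature: float) -> dict:
    """Build one message containing the three required fields."""
    return {
        "sensor_id": sensor_id,
        "temperature": round(float(temperature), 2),
        # Assumption: timestamps are timezone-aware UTC in ISO 8601 format
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }


def is_valid_reading(reading: dict) -> bool:
    """Check that every required field is present and the temperature is numeric."""
    has_all_fields = all(field in reading for field in REQUIRED_FIELDS)
    return has_all_fields and isinstance(reading.get("temperature"), (int, float))


reading = build_reading("sensor_1", 24.537)
print(reading["sensor_id"], reading["temperature"], is_valid_reading(reading))
```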



## ✅ Kafka Producer in Python

```python
from kafka import KafkaProducer
import json
from datetime import datetime, timezone
import random
import time

# Initialize Kafka Producer; every value is serialized to UTF-8 encoded JSON
producer = KafkaProducer(
    bootstrap_servers='localhost:9092',
    value_serializer=lambda v: json.dumps(v).encode('utf-8')
)

# Send 10 temperature readings, one per second
for _ in range(10):
    data = {
        "sensor_id": f"sensor_{random.randint(1, 3)}",
        "temperature": round(random.uniform(20.0, 35.0), 2),
        # Timezone-aware UTC timestamp, unambiguous for consumers in other time zones
        "timestamp": datetime.now(timezone.utc).isoformat()
    }

    # send() is asynchronous: the record is queued and delivered in the background
    producer.send('temperature-readings', value=data)
    print(f"✅ Queued: {data}")
    time.sleep(1)

# Block until all queued records are delivered, then release resources
producer.flush()
producer.close()
```
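
`producer.send()` returns a future instead of waiting for the broker. When a script needs to know that a record actually arrived, one option is to block on that future. The sketch below assumes a kafka-python `KafkaProducer` like the one above; the helper name `send_and_confirm` and the 10-second timeout are illustrative choices.

```python
def send_and_confirm(producer, topic: str, value: dict, timeout_s: float = 10.0) -> str:
    """Send one record and wait until the broker acknowledges it.

    `producer` is expected to be a kafka-python KafkaProducer. If delivery
    fails or times out, the exception raised by `future.get()` propagates.
    """
    future = producer.send(topic, value=value)   # asynchronous: returns a future
    metadata = future.get(timeout=timeout_s)     # blocks until acknowledged
    return f"{metadata.topic}[{metadata.partition}] @ offset {metadata.offset}"


# Usage with a real producer (requires a running broker):
# print(send_and_confirm(producer, 'temperature-readings', {"temperature": 21.5}))
```

Waiting on every record trades throughput for certainty, so for bulk sends the single `flush()` at the end is usually the cheaper choice.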



### 🛠️ Before Running:

1. Ensure a Kafka broker is running at `localhost:9092`.
2. Create the topic:

   ```bash
   bin/kafka-topics.sh --create \
     --topic temperature-readings \
     --bootstrap-server localhost:9092 \
     --partitions 1 --replication-factor 1
   ```
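
The topic can also be created from Python instead of the shell script. This is a sketch using kafka-python's `KafkaAdminClient`; the helper name `ensure_topic` is made up, and the defaults simply mirror the command above (one partition, replication factor 1 on a local broker).

```python
def ensure_topic(topic: str, bootstrap_servers: str = "localhost:9092",
                 partitions: int = 1, replication_factor: int = 1) -> bool:
    """Create `topic` if it does not exist. Return True if it was created."""
    # Imported here so this module still loads where kafka-python is missing
    from kafka.admin import KafkaAdminClient, NewTopic
    from kafka.errors import TopicAlreadyExistsError

    admin = KafkaAdminClient(bootstrap_servers=bootstrap_servers)
    try:
        admin.create_topics([
            NewTopic(name=topic, num_partitions=partitions,
                     replication_factor=replication_factor)
        ])
        return True
    except TopicAlreadyExistsError:
        # Someone already created the topic; nothing to do
        return False
    finally:
        admin.close()


# Usage (requires a running broker):
# ensure_topic("temperature-readings")
```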




## ✅ Kafka Consumer in Python

```python
from kafka import KafkaConsumer
import json

# Initialize Kafka Consumer
consumer = KafkaConsumer(
    'temperature-readings',
    bootstrap_servers='localhost:9092',
    auto_offset_reset='earliest',  # start at the oldest record if the group has no committed offset
    enable_auto_commit=True,
    group_id='temp-consumer-group',
    value_deserializer=lambda m: json.loads(m.decode('utf-8'))
)

print("🚀 Listening for temperature data...\n")

# Read and print messages (the loop blocks and waits for new records; stop it with Ctrl+C)
for message in consumer:
    data = message.value
    print(f"🌡️ Sensor: {data['sensor_id']}, Temp: {data['temperature']}°C, Time: {data['timestamp']}")
```



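### 📝 Before Running:

1. Ensure the **Kafka broker** is running.
2. The topic `temperature-readings` must exist.
3. Start order does not matter on the first run: with `auto_offset_reset='earliest'` and no committed offsets for the group, the consumer reads the topic from the beginning. On later runs the group resumes from its committed offsets, so rerun the producer to see fresh data.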



## 🔍 Serialization & Deserialization in Kafka (with JSON in Python)



### 🔑 What is Serialization?

**Serialization** is the process of converting Python objects (like dictionaries) into a format that can be transmitted or stored, such as a **JSON string encoded as bytes**.

Kafka **transmits and stores messages as bytes**, so the producer must serialize each value before sending it.



### 🔑 What is Deserialization?

**Deserialization** is the reverse process: converting byte data (received from Kafka) **back into Python objects** so they can be used in code.



## 🔁 Example Flow in Kafka with JSON

| Component       | Action                     | Format                         |
| --------------- | -------------------------- | ------------------------------ |
| Python Producer | Python dict → JSON → bytes | `{"temp": 30}` → `b'{"temp": 30}'` |
| Kafka Topic     | Stores the message as bytes | binary                            |
| Python Consumer | bytes → JSON → Python dict | `b'{"temp": 30}'` → `{"temp": 30}` |



## ✅ Serialization in Kafka Producer (Python)

```python
from kafka import KafkaProducer
import json

# Serialize Python dict to JSON bytes
producer = KafkaProducer(
    bootstrap_servers='localhost:9092',
    value_serializer=lambda v: json.dumps(v).encode('utf-8')  # <-- Serialization
)
```

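This tells the producer to **automatically convert** every value passed to `send()` into a JSON string and then into UTF-8 bytes. The value must be JSON-serializable (dicts, lists, strings, numbers, booleans, `None`); other types, such as `datetime`, make `json.dumps` raise a `TypeError`.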
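
Plain `json.dumps` rejects `datetime` objects, which matters when a message carries a timestamp. One way around this is a named serializer with a `default` hook, sketched below. The names `serialize_value` and `_json_default` are illustrative, and converting datetimes to ISO 8601 strings is an assumption about the wire format.

```python
import json
from datetime import datetime, timezone


def _json_default(obj):
    """Convert types that json.dumps cannot handle on its own."""
    if isinstance(obj, datetime):
        return obj.isoformat()
    raise TypeError(f"Cannot serialize {type(obj).__name__} to JSON")


def serialize_value(value) -> bytes:
    """Serialize a Python object to UTF-8 encoded JSON bytes."""
    return json.dumps(value, default=_json_default).encode("utf-8")


payload = serialize_value({
    "sensor_id": "sensor_1",
    "temperature": 29.5,
    "timestamp": datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc),
})
print(payload)
```

Such a function can be passed to the producer in place of the lambda, for example `KafkaProducer(bootstrap_servers='localhost:9092', value_serializer=serialize_value)`.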



## ✅ Deserialization in Kafka Consumer (Python)

```python
from kafka import KafkaConsumer
import json

# Deserialize JSON bytes to Python dict
consumer = KafkaConsumer(
    'topic-name',  # placeholder: use your own topic, e.g. 'temperature-readings'
    bootstrap_servers='localhost:9092',
    auto_offset_reset='earliest',
    value_deserializer=lambda m: json.loads(m.decode('utf-8'))  # <-- Deserialization
)
```

This tells the consumer to **automatically decode** the received bytes as UTF-8 and parse the JSON string back into a Python object (a dictionary in this example).
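
A deserializer that raises on malformed input will typically surface that exception in the consumer loop, so a single bad record can stop processing. A more tolerant variant is sketched below; the name `safe_json_deserializer` is illustrative, and returning `None` for unreadable records is one possible policy, not the only one.

```python
import json
from typing import Any, Optional


def safe_json_deserializer(raw: Optional[bytes]) -> Any:
    """Decode UTF-8 JSON bytes; return None instead of raising on bad input."""
    if raw is None:          # e.g. a record without a value
        return None
    try:
        return json.loads(raw.decode("utf-8"))
    except (UnicodeDecodeError, json.JSONDecodeError):
        return None


print(safe_json_deserializer(b'{"temp": 30}'))   # {'temp': 30}
print(safe_json_deserializer(b'not json'))       # None
```

With `value_deserializer=safe_json_deserializer`, the consumer loop would need to skip messages whose `message.value` is `None`.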



## 🧪 Test Example:

```python
import json

# Serialized (producer sends)
data = {"sensor_id": "123", "temp": 29}
json_str = json.dumps(data)                # dict → JSON string
json_bytes = json_str.encode("utf-8")      # JSON string → bytes

# Deserialized (consumer receives)
decoded_str = json_bytes.decode("utf-8")   # bytes → JSON string
final_data = json.loads(decoded_str)       # JSON string → dict

assert final_data == data                  # the round trip preserves the data
```



## ⚠️ Why It's Important

Kafka doesn’t care what your data **means** — it just stores **bytes**.

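Producers and consumers must therefore agree on a format. To keep your applications interoperable and efficient:

* Use **JSON** for human readability and easy debugging
* Use **Avro** or **Protobuf** for compact binary encoding with an explicit schema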
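
To get a feel for the size difference between text and binary encodings, the sketch below compares JSON with a fixed binary layout built with the standard-library `struct` module. Note that `struct` is only a stand-in here: it is not Avro or Protobuf, and the numeric `sensor_id` and epoch-seconds `timestamp` are assumptions made so the values fit a fixed layout.

```python
import json
import struct

# Assumption: numeric sensor id and epoch-seconds timestamp (not the string fields used earlier)
reading = {"sensor_id": 2, "temperature": 29.5, "timestamp": 1700000000}

# Text encoding: field names are repeated inside every message
json_bytes = json.dumps(reading).encode("utf-8")

# Fixed binary layout: unsigned 16-bit int, 32-bit float, unsigned 64-bit int.
# "<" means little-endian with no padding. Field names live in the agreed
# layout (the "schema"), not in the message itself.
LAYOUT = "<HfQ"
binary_bytes = struct.pack(
    LAYOUT, reading["sensor_id"], reading["temperature"], reading["timestamp"]
)

# Both sides must know LAYOUT to read the message back
sensor_id, temperature, timestamp = struct.unpack(LAYOUT, binary_bytes)

print(f"JSON: {len(json_bytes)} bytes, binary: {len(binary_bytes)} bytes")
```

The binary message is much smaller but unreadable without the layout, which is the trade-off that schema-based formats manage for you.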

