# **Kafka Consumer to Amazon S3 Writer**

## **1. Importing Required Libraries**

Import the libraries used to consume messages from Kafka and write them to S3.

In [None]:
import json
from time import sleep
from json import dumps, loads
from s3fs import S3FileSystem
from kafka import KafkaConsumer

- `json` (with `dumps` and `loads`): For serializing and parsing JSON data.
- `time.sleep`: To add delays if necessary; it is not used in the cells below.
- `s3fs.S3FileSystem`: To interact with Amazon S3 storage.
- `kafka.KafkaConsumer`: To consume messages from Kafka topics.

## **2. Setting Up the Kafka Consumer**

Configure the Kafka consumer to subscribe to the `'kafka_test'` topic.

In [None]:
consumer = KafkaConsumer(
    'kafka_test',
    bootstrap_servers=['<your_ip>:9092'],
    value_deserializer=lambda x: loads(x.decode('utf-8'))
)

- `bootstrap_servers`: The `host:port` address of one or more Kafka brokers used for the initial connection.
- `value_deserializer`: Decodes each message's raw bytes as UTF-8 and parses the result as JSON.

**Note**: Replace `<your_ip>` with the IP address of your Kafka server.
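
To see what `value_deserializer` does without connecting to a broker, the same logic can be run on a hand-made payload. This is a minimal sketch: the function name `deserialize_value` and the sample payload are illustrative assumptions, not part of the original notebook.

```python
import json

def deserialize_value(raw: bytes):
    """Decode UTF-8 bytes and parse them as JSON, mirroring the lambda above."""
    return json.loads(raw.decode("utf-8"))

# Hypothetical payload, shaped like what a producer might send after
# json.dumps(...).encode("utf-8"). The field names are made up.
sample = json.dumps({"symbol": "ABC", "price": 101.5}).encode("utf-8")

record = deserialize_value(sample)
print(record["symbol"], record["price"])
```

If the producer sends something other than UTF-8 encoded JSON, this function raises an exception, which is worth keeping in mind when debugging a consumer that stops unexpectedly.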

## **3. Optional: Printing Consumed Messages**

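To print the messages to the console for debugging, uncomment the following lines. Iterating over the consumer waits for new messages and does not exit on its own, so interrupt the cell before running the rest of the notebook.
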
In [None]:
# for message in consumer:
#     print(message.value)
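
When only a quick look at a few messages is needed, the loop can be capped with `itertools.islice`. The sketch below is an assumption-laden illustration: `peek_messages` is a hypothetical helper, and `fake_consumer` is a stand-in generator so the code runs without a broker. With a real `KafkaConsumer` you would pass `consumer` instead.

```python
from itertools import islice
from types import SimpleNamespace

def peek_messages(messages, limit=5):
    """Return the values of at most `limit` records from any iterable of records."""
    return [message.value for message in islice(messages, limit)]

# Stand-in records exposing a `.value` attribute, like Kafka consumer records do.
fake_consumer = (SimpleNamespace(value={"row": i}) for i in range(100))

for value in peek_messages(fake_consumer, limit=3):
    print(value)
```

Against a live topic, this would still wait if fewer than `limit` messages arrive, and any messages it reads are not replayed to a later loop over the same consumer.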

## **4. Connecting to Amazon S3**

Initialize the S3 filesystem object to interact with your S3 bucket.

In [None]:
s3 = S3FileSystem()

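- Called with no arguments, `S3FileSystem()` relies on the standard AWS credential sources (environment variables, the shared credentials file, or an IAM role), so make sure one of them is configured with access to your S3 bucket.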
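
As a quick sanity check before writing, one could test whether credentials are present in the environment. This is only a sketch: `env_credentials_present` is a hypothetical helper, and it covers just one credential source, so a negative result does not mean access will fail.

```python
import os

def env_credentials_present(env=os.environ):
    """Return True if both standard AWS credential environment variables are set and non-empty."""
    return bool(env.get("AWS_ACCESS_KEY_ID")) and bool(env.get("AWS_SECRET_ACCESS_KEY"))

if env_credentials_present():
    print("AWS credentials found in environment variables.")
else:
    # Credentials may still come from a shared credentials file or an IAM role.
    print("No AWS credentials in environment variables.")
```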

## **5. Consuming Messages and Writing to S3**

Iterate over the consumed messages and write each one to a separate JSON file in S3.

In [None]:
for count, message in enumerate(consumer):
    s3_path = f"s3://your-bucket-name/stock_market_{count}.json"
    with s3.open(s3_path, 'w') as file:
        json.dump(message.value, file)

- `enumerate(consumer)`: Provides a counter `count`, starting at 0, for each message.
- `s3.open()`: Opens a file in S3 for writing.
- `json.dump()`: Writes the JSON data to the file.

**Note**: Replace `your-bucket-name` with the name of your S3 bucket, and make sure your AWS credentials have permission to write to it.
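
Because `count` starts again at 0 whenever the loop is restarted, a rerun would write to the same file names as before and could overwrite earlier files. One possible alternative, sketched here as an assumption rather than the notebook's method, is to build the path from each record's `topic`, `partition`, and `offset` attributes. The helper name `s3_path_for` and the stand-in record are illustrative.

```python
from types import SimpleNamespace

def s3_path_for(message, bucket="your-bucket-name"):
    """Build an S3 path from the record's topic, partition, and offset."""
    return (
        f"s3://{bucket}/stock_market_"
        f"{message.topic}_{message.partition}_{message.offset}.json"
    )

# Stand-in for a consumed record; real records expose the same three attributes.
example = SimpleNamespace(topic="kafka_test", partition=0, offset=42, value={})
print(s3_path_for(example))
```

In the loop above, `s3_path = s3_path_for(message)` would then take the place of the `count`-based f-string, and `enumerate` would no longer be needed.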