# SPCS Networking Connectivity Test: Azure Confluent Kafka dedicated clusters

**Note: This Notebook should be run in an SPCS Container for testing to be valid**

## Purpose

This notebook tests SPCS network connectivity to Confluent Cloud dedicated Kafka clusters on Azure in preparation for configuring Snowflake Openflow Kafka connectors:

- **[Openflow Connector for Kafka](https://docs.snowflake.com/en/user-guide/data-integration/openflow/connectors/kafka/about)** - Ingests real-time events from Kafka topics into Snowflake tables using Snowpipe Streaming
- **[Openflow Connector for Snowflake to Kafka](https://docs.snowflake.com/en/user-guide/data-integration/openflow/connectors/snowflake-to-kafka/about)** - Replicates Snowflake tables to Kafka using CDC for real-time insights distribution

Both connectors require External Access Integration (EAI) configuration to enable network connectivity from SPCS to your Kafka brokers. This notebook validates that connectivity before deploying Openflow connectors.

## Supported Platforms

- **Confluent Cloud** on Azure, dedicated clusters (single-AZ or multi-AZ)

## Steps

1. Configure your Kafka bootstrap server URLs and authentication details
2. **(Optional)** Set up PyPI access if the confluent-kafka library needs to be installed
3. Install the Confluent Kafka Python client library
4. Run the connectivity test to verify network access
5. If tests fail, create and attach the Kafka External Access Integration (EAI)
6. Restart the notebook session and retest
7. Once successful, proceed with Openflow connector configuration


## Step 1: Configure Kafka Connection Settings

Update the configuration below with your actual Kafka cluster details.

In [None]:
# Kafka Connectivity Test Configuration on Azure Confluent
# Update these values with your actual Kafka cluster details

# ============================================================================
# KAFKA BOOTSTRAP SERVER CONFIGURATION
# ============================================================================
KAFKA_BOOTSTRAP_SERVERS = ["<cluster_id>.az1.<id>.<region>.azure.confluent.cloud:9092",
                           "<cluster_id>.az2.<id>.<region>.azure.confluent.cloud:9092",
                           "<cluster_id>.az3.<id>.<region>.azure.confluent.cloud:9092"]

# ============================================================================
# AUTHENTICATION CONFIGURATION
# ============================================================================
KAFKA_SASL_USERNAME = "Your API key"
KAFKA_SASL_PASSWORD = "Your API key secret"

# SASL Mechanism
# - "PLAIN" is the mechanism used with a Confluent Cloud API key and secret
KAFKA_SASL_MECHANISM = "PLAIN"

# Security Protocol
# - Most production clusters: "SASL_SSL" (SASL over TLS/SSL)
# - Options: PLAINTEXT, SSL, SASL_PLAINTEXT, SASL_SSL
KAFKA_SECURITY_PROTOCOL = "SASL_SSL"

# ============================================================================
# SNOWFLAKE ROLE CONFIGURATION
# ============================================================================
# This role will be used to create the EAI and other objects if necessary
IMPLEMENTATION_ROLE = "ACCOUNTADMIN"
OPENFLOW_RUNTIME_ROLE = "OPENFLOW_ADMIN"

# ============================================================================
# AUTO-EXTRACT CONFIGURATION FOR NETWORK RULES
# ============================================================================
import re

# Extract hostname and port from bootstrap servers
first_server = KAFKA_BOOTSTRAP_SERVERS[0].strip()
match = re.match(r'([^:]+):(\d+)', first_server)
if match:
    KAFKA_HOST = match.group(1)
    KAFKA_PORT = match.group(2)
else:
    KAFKA_HOST = first_server
    KAFKA_PORT = "9092"

print("=" * 70)
print("KAFKA CONFIGURATION SUMMARY")
print("=" * 70)
print(f"Bootstrap Server(s): {KAFKA_BOOTSTRAP_SERVERS}")
print(f"SASL Mechanism: {KAFKA_SASL_MECHANISM}")
print(f"Security Protocol: {KAFKA_SECURITY_PROTOCOL}")
print(f"\nNetwork Rule Configuration:")
print(f"  Primary Host: {KAFKA_HOST}")
print(f"  Primary Port: {KAFKA_PORT}")
print("=" * 70)
print("\n✓ Configuration loaded. Ready to test connectivity...")
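
The cell above derives `KAFKA_HOST` and `KAFKA_PORT` from the first bootstrap server only. For a multi-AZ cluster you will likely want every endpoint when writing a network rule, so a minimal sketch that parses all entries might look like the following. The helper name `parse_bootstrap_servers` and the sample hostnames are hypothetical; the default port of 9092 mirrors the fallback used in the cell above.

```python
def parse_bootstrap_servers(servers, default_port=9092):
    """Split 'host:port' strings into (host, port) tuples.

    Hypothetical helper: entries without a numeric port fall back to
    default_port, mirroring the fallback in the configuration cell.
    """
    endpoints = []
    for server in servers:
        server = server.strip()
        host, sep, port = server.rpartition(":")
        if sep and port.isdigit():
            endpoints.append((host, int(port)))
        else:
            endpoints.append((server, default_port))
    return endpoints


# Placeholder hostnames for illustration only
example_servers = [
    "broker.az1.example.azure.confluent.cloud:9092",
    " broker.az2.example.azure.confluent.cloud:9092 ",
    "broker.az3.example.azure.confluent.cloud",
]

for host, port in parse_bootstrap_servers(example_servers):
    print(f"{host}:{port}")
```

In the notebook you would pass `KAFKA_BOOTSTRAP_SERVERS` instead of the placeholder list.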


## Step 2a: PyPI Setup (Optional)

Run these cells if you need to install the confluent-kafka library from PyPI. This creates the necessary network rules and External Access Integration for PyPI access.

**Skip this section if you already have confluent-kafka installed or have PyPI access configured.**


In [None]:
-- Create Network Rule and External Access Integration for PyPI
-- Run this cell to enable installing Python packages from PyPI

USE ROLE {{IMPLEMENTATION_ROLE}};

CREATE OR REPLACE NETWORK RULE pypi_network_rule
  MODE = EGRESS
  TYPE = HOST_PORT
  VALUE_LIST = ('pypi.org', 'pypi.python.org', 'pythonhosted.org', 'files.pythonhosted.org');

CREATE OR REPLACE EXTERNAL ACCESS INTEGRATION pypi_access_integration
  ALLOWED_NETWORK_RULES = (pypi_network_rule)
  ENABLED = true
  COMMENT = 'External Access Integration for PyPI package installation';

-- Grant usage on the integration
GRANT USAGE ON INTEGRATION pypi_access_integration TO ROLE {{IMPLEMENTATION_ROLE}};

SHOW EXTERNAL ACCESS INTEGRATIONS LIKE 'pypi_access_integration';

In [None]:
-- Apply PyPI integration to this notebook
-- Run this after creating the PyPI integration above

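-- Replace EAI_KAFKA with the name of this notebook.
-- CONFLUENT_KAFKA_EAI must already exist; if it has not been created yet,
-- list only 'pypi_access_integration' here and add the Kafka EAI later.
ALTER NOTEBOOK EAI_KAFKA
  SET EXTERNAL_ACCESS_INTEGRATIONS = ('pypi_access_integration', 'CONFLUENT_KAFKA_EAI');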

-- Restart your Notebook session after applying an EAI

## Step 2b: Install Confluent Kafka Client Library

If the install fails with connection errors, configure PyPI access first (Step 2a).
You may need to run this cell twice: the first run installs the library and the second confirms that it imports.


In [None]:
# Install the Confluent Kafka Python client library
# Make sure PyPI access is configured first if you get connection errors
# You can run this cell twice; the first to install the library, the second to confirm it is imported

try:
    from confluent_kafka import Producer
    print("✅ confluent-kafka already available")
except ImportError:
    print("📦 Installing confluent-kafka...")
    %pip install confluent-kafka
    print("✅ confluent-kafka installed")
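
To confirm the result of the install without re-running the cell, a small check along these lines may help. It is a sketch using only the standard library; the helper name `package_status` is hypothetical, and it reports whether the module can be found and which version of the distribution is installed.

```python
import importlib
import importlib.metadata
import importlib.util


def package_status(module_name, distribution_name=None):
    """Return (importable, version) without importing the module itself."""
    # Refresh import caches so packages installed during this session are seen
    importlib.invalidate_caches()
    importable = importlib.util.find_spec(module_name) is not None
    version = None
    if importable:
        try:
            version = importlib.metadata.version(distribution_name or module_name)
        except importlib.metadata.PackageNotFoundError:
            version = None  # importable, but no distribution metadata found
    return importable, version


importable, version = package_status("confluent_kafka", "confluent-kafka")
print(f"confluent_kafka importable: {importable}, version: {version}")
```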


## Step 3: Connectivity Tests

Run these test cells to verify network connectivity and authentication to your Kafka cluster.


In [None]:
### Test 3a: Socket Connectivity

# Test basic network connectivity to the Kafka broker
import socket

print("=" * 60)
print("TEST 3a: SOCKET CONNECTIVITY")

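def test_socket_connection(host, port, timeout=10):
    """Try to connect to a host:port and return True if successful."""
    try:
        # create_connection resolves the host and tries each returned address;
        # the context manager closes the socket even when an error occurs
        with socket.create_connection((host, int(port)), timeout=timeout):
            return True
    except (OSError, ValueError):
        # OSError covers DNS failures, refusals, and timeouts;
        # ValueError covers a non-numeric port
        return False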

# ==============================================================
# RUN TESTS
# ==============================================================
results = {}

for server in KAFKA_BOOTSTRAP_SERVERS:
    # Split on the last colon only and tolerate surrounding whitespace
    host, port = server.strip().rsplit(":", 1)
    success = test_socket_connection(host, port)
    results[server] = success

# ==============================================================
# SUMMARIZE RESULTS
# ==============================================================
print("=" * 60)
if all(results.values()):
    print("✅ SUCCESS: Socket connection to all AZs established")
else:
    print("❌ FAILED: One or more AZ connections could not be established")
print("-" * 60)

# Print per-broker results
for server, ok in results.items():
    icon = "✅" if ok else "❌"
    print(f"{icon} {server}")

print("=" * 60)
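
When the socket test fails, it can help to know whether name resolution or the TCP connection was the step that failed. The following is a minimal sketch of that split, not part of the original test; the function name `diagnose_endpoint` and its return labels are hypothetical. The demonstration at the end connects to a throwaway local listener so the block runs without a Kafka cluster.

```python
import socket


def diagnose_endpoint(host, port, timeout=10):
    """Classify a connection attempt as 'ok', 'dns_failure', or 'tcp_failure'."""
    try:
        # Step 1: name resolution only
        addresses = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror:
        return "dns_failure"

    # Step 2: try each resolved address until one accepts the connection
    for family, socktype, proto, _canonname, sockaddr in addresses:
        try:
            with socket.socket(family, socktype, proto) as sock:
                sock.settimeout(timeout)
                sock.connect(sockaddr)
                return "ok"
        except OSError:
            continue
    return "tcp_failure"


# Demonstration against a throwaway local listener (no Kafka needed)
with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as listener:
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)
    local_port = listener.getsockname()[1]
    print(diagnose_endpoint("127.0.0.1", local_port, timeout=2))
```

In the notebook you would call it once per entry in `KAFKA_BOOTSTRAP_SERVERS`, passing the host and port of each server.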

In [None]:
### Test 3b: Kafka Producer & Metadata

# Test Kafka client connection and fetch cluster metadata
from confluent_kafka import Producer, KafkaException
import random

print("=" * 60)
print("TEST 3b: KAFKA PRODUCER & METADATA")
print("=" * 60)
print(f"\nConnecting to Kafka cluster...")

try:
    # Randomly select one bootstrap server from the list
    BOOTSTRAP_SERVER = random.choice(KAFKA_BOOTSTRAP_SERVERS)
    print(f"  Using bootstrap server: {BOOTSTRAP_SERVER}")

    # Kafka configuration
    producer_conf = {
        'bootstrap.servers': BOOTSTRAP_SERVER,
        'security.protocol': KAFKA_SECURITY_PROTOCOL,
        'sasl.mechanism': KAFKA_SASL_MECHANISM,
        'sasl.username': KAFKA_SASL_USERNAME,
        'sasl.password': KAFKA_SASL_PASSWORD
    }

    # Create producer instance
    producer = Producer(producer_conf)

    # Fetch cluster metadata to verify connection
    print(f"  Fetching cluster metadata (timeout: 10s)...")
    metadata = producer.list_topics(timeout=10)

    if metadata and metadata.brokers:
        print(f"\n✅ SUCCESS: Connected to Kafka cluster")
        print(f"   Cluster ID: {getattr(metadata, 'cluster_id', 'N/A')}")
        print(f"   Number of brokers: {len(metadata.brokers)}")
        print(f"   List of brokers: {metadata.brokers}")
        print(f"   Number of topics: {len(metadata.topics)}")
    else:
        print(f"\n❌ FAILED: No broker information received")
        print(f"   Action: Verify network connectivity and broker configuration")

except KafkaException as e:
    print(f"\n❌ FAILED: Kafka error")
    print(f"   Error: {e}")
    print(f"   Action: Verify credentials and SASL configuration")

except Exception as e:
    print(f"\n❌ FAILED: Unexpected error")
    print(f"   Error: {e}")
    print(f"   Action: Check configuration and network access")

print("=" * 60)
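
The producer, admin, and consumer cells each repeat the same connection settings and pick one bootstrap server at random. Since `bootstrap.servers` accepts a comma-separated list, one possible alternative is to build the shared settings once and pass every server. The sketch below assumes that approach; the helper names `build_kafka_config` and `redact` are hypothetical, and the hostnames and credentials are placeholders.

```python
def build_kafka_config(servers, username, password,
                       security_protocol="SASL_SSL", sasl_mechanism="PLAIN",
                       extra=None):
    """Build a confluent-kafka style config dict shared by all client types."""
    config = {
        "bootstrap.servers": ",".join(s.strip() for s in servers),
        "security.protocol": security_protocol,
        "sasl.mechanism": sasl_mechanism,
        "sasl.username": username,
        "sasl.password": password,
    }
    if extra:
        config.update(extra)  # e.g. {"group.id": ...} for consumers
    return config


def redact(config, secret_keys=("sasl.password",)):
    """Return a copy of the config that is safe to print."""
    return {k: ("***" if k in secret_keys else v) for k, v in config.items()}


# Placeholder values for illustration only
example = build_kafka_config(
    ["broker.az1.example:9092", "broker.az2.example:9092"],
    "example-api-key",
    "example-api-secret",
    extra={"group.id": "connectivity-test"},
)
print(redact(example))
```

Picking a single random server, as the original cells do, exercises one AZ at a time; passing the full list instead lets the client fall back to another server if one is unreachable.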



In [None]:
# ==============================================================
# CREATE TOPIC ON CONFLUENT CLOUD
# ==============================================================

from confluent_kafka.admin import AdminClient, NewTopic
from confluent_kafka import KafkaException
import random

# Randomly select one bootstrap server from the list
BOOTSTRAP_SERVER = random.choice(KAFKA_BOOTSTRAP_SERVERS)
print(f"  Using bootstrap server: {BOOTSTRAP_SERVER}")

admin_conf = {
    'bootstrap.servers': BOOTSTRAP_SERVER,
    'security.protocol': KAFKA_SECURITY_PROTOCOL,
    'sasl.mechanism': KAFKA_SASL_MECHANISM,
    'sasl.username': KAFKA_SASL_USERNAME,
    'sasl.password': KAFKA_SASL_PASSWORD
}

topic_name = "example_new_topic2"

try:
    admin = AdminClient(admin_conf)

    # Check if topic already exists
    metadata = admin.list_topics(timeout=10)
    if topic_name in metadata.topics:
        print(f"✅ Topic '{topic_name}' already exists.")
    else:
        print(f"🛠️ Creating topic '{topic_name}'...")
        new_topic = NewTopic(topic=topic_name, num_partitions=3, replication_factor=3)
        fs = admin.create_topics([new_topic])

        # Wait for operation to finish
        for topic, f in fs.items():
            try:
                f.result()  # raises exception if creation failed
                print(f"✅ Topic '{topic}' created successfully!")
            except Exception as e:
                print(f"❌ Failed to create topic {topic}: {e}")

except KafkaException as e:
    print(f"Kafka error: {e}")
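
If the topic was created only for this test, you may want to remove it afterwards. The sketch below assumes an object with a `delete_topics` method that returns a dictionary of futures keyed by topic name, which is how `AdminClient.delete_topics` behaves. The helper `delete_topics_and_wait` and the `FakeAdmin` stand-in are hypothetical; the stand-in exists only so the block runs without a cluster.

```python
from concurrent.futures import Future


def delete_topics_and_wait(admin, topic_names, timeout=30):
    """Delete topics via an AdminClient-like object; return {topic: error or None}."""
    outcomes = {}
    futures = admin.delete_topics(list(topic_names))
    for topic, future in futures.items():
        try:
            future.result(timeout=timeout)  # raises if the deletion failed
            outcomes[topic] = None
        except Exception as exc:
            outcomes[topic] = exc
    return outcomes


class FakeAdmin:
    """Stand-in for confluent_kafka.admin.AdminClient so the sketch runs offline."""

    def delete_topics(self, topics):
        futures = {}
        for name in topics:
            future = Future()
            future.set_result(None)  # pretend every deletion succeeded
            futures[name] = future
        return futures


print(delete_topics_and_wait(FakeAdmin(), ["example_new_topic2"]))
```

Against a real cluster you would pass the `admin` object created in the cell above in place of `FakeAdmin()`.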



In [None]:
# ==============================================================
# READ FROM EXISTING TOPIC
# ==============================================================

from confluent_kafka import Consumer, KafkaException
import random

topic_name = "topic1"  # <---- set your topic name

# Randomly select one bootstrap server from the list
BOOTSTRAP_SERVER = random.choice(KAFKA_BOOTSTRAP_SERVERS)
print(f"  Using bootstrap server: {BOOTSTRAP_SERVER}")

consumer_conf = {
    'bootstrap.servers': BOOTSTRAP_SERVER,
    'security.protocol': KAFKA_SECURITY_PROTOCOL,
    'sasl.mechanism': KAFKA_SASL_MECHANISM,
    'sasl.username': KAFKA_SASL_USERNAME,
    'sasl.password': KAFKA_SASL_PASSWORD,
    'group.id': 'dummygroupid',
    'auto.offset.reset': 'latest'  # with no committed offsets, read only new messages; use 'earliest' to start from the beginning
}

consumer = Consumer(consumer_conf)
consumer.subscribe([topic_name])

print(f"📡 Listening for messages on topic '{topic_name}'...")

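import time

message_count = 0
max_messages = 5
max_wait_seconds = 60  # stop waiting on an idle topic instead of looping forever
deadline = time.monotonic() + max_wait_seconds

try:
    while message_count < max_messages and time.monotonic() < deadline:
        msg = consumer.poll(timeout=1.0)
        if msg is None:
            continue
        if msg.error():
            print(f"⚠️ Consumer error: {msg.error()}")
            continue

        print(f"🧩 Received message: {msg.value().decode('utf-8')} (partition {msg.partition()})")
        message_count += 1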

except KeyboardInterrupt:
    print("🛑 Stopping consumer...")

finally:
    print(f"✅ Processed {message_count} messages. Closing consumer.")
    consumer.close()
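
The consumer cell decodes every message value as UTF-8, which raises an error for records with a null value or a binary payload. A more tolerant rendering could look like the sketch below. The helper name `describe_payload` and its placeholder strings are hypothetical.

```python
def describe_payload(value, max_chars=80):
    """Render a Kafka message value for logging without raising on bad input."""
    if value is None:
        return "<null>"  # record with no value, such as a tombstone
    try:
        text = value.decode("utf-8")
    except UnicodeDecodeError:
        return f"<{len(value)} bytes, not UTF-8>"
    # Truncate long payloads so the log stays readable
    return text if len(text) <= max_chars else text[:max_chars] + "..."


for sample in (b'{"id": 1}', None, b"\xff\xfe"):
    print(describe_payload(sample))
```

In the loop above, `describe_payload(msg.value())` could stand in for `msg.value().decode('utf-8')`.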


## Step 4: Restart and Retest

After creating the Kafka EAI and attaching it to the notebook:
1. **Restart your Notebook session** (this is required for the EAI to take effect)
2. Re-run the configuration cell (Step 1)
3. Re-run the connectivity test (Step 3)

The tests should now pass if the EAI was configured correctly.
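
For reference, the Kafka EAI itself can follow the same pattern as the PyPI integration in Step 2a. The sketch below renders the SQL text from the bootstrap server list so that you can review it before running it in a SQL cell. It assumes that a plain `HOST_PORT` egress rule is sufficient for your cluster's networking. The rule name `confluent_kafka_network_rule` and the helper `render_kafka_eai_sql` are hypothetical; `CONFLUENT_KAFKA_EAI` matches the name used in the `ALTER NOTEBOOK` cell.

```python
def render_kafka_eai_sql(servers,
                         rule_name="confluent_kafka_network_rule",
                         integration_name="CONFLUENT_KAFKA_EAI",
                         role="ACCOUNTADMIN"):
    """Render SQL for a Kafka network rule and EAI (text only; nothing is executed)."""
    value_list = ", ".join(f"'{s.strip()}'" for s in servers)
    return f"""USE ROLE {role};

CREATE OR REPLACE NETWORK RULE {rule_name}
  MODE = EGRESS
  TYPE = HOST_PORT
  VALUE_LIST = ({value_list});

CREATE OR REPLACE EXTERNAL ACCESS INTEGRATION {integration_name}
  ALLOWED_NETWORK_RULES = ({rule_name})
  ENABLED = true;
"""


# Placeholder hostnames for illustration only
print(render_kafka_eai_sql([
    "broker.az1.example.azure.confluent.cloud:9092",
    "broker.az2.example.azure.confluent.cloud:9092",
]))
```

The rendered rule covers only the listed bootstrap endpoints. A Kafka client also connects to the brokers returned in cluster metadata, so if those hostnames differ from the bootstrap servers they may need to be added to the rule as well.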
