# Snowflake AI Complete: Image Analysis and Entity Extraction

This notebook demonstrates how to use Snowflake Cortex's multimodal AI Complete function (`SNOWFLAKE.CORTEX.COMPLETE`) to analyze a knowledge graph image and generate SQL DDL that models its entities as a star schema.

## Overview
1. Upload knowledge graph image to Snowflake stage
2. Use AI Complete with an image-capable model to analyze the image
3. Generate SQL code to create star schema entities
4. Execute the generated code to create tables and relationships


In [None]:
# Step 1: Setup and Configuration
# This notebook runs directly in Snowflake, so no connection setup needed

print("Snowflake context will be set in the next SQL cell")


In [None]:
-- Set the context for your Snowflake session
USE WAREHOUSE COMPUTE_WH;
USE DATABASE DEMO_DB;
USE SCHEMA PUBLIC;


In [None]:
# Step 2: Create Stage and Upload Image
# This will be handled in the next SQL cell


In [None]:
-- Create a stage for storing the knowledge graph image
-- Cortex reads staged files only from stages with server-side encryption
CREATE OR REPLACE STAGE knowledge_graph_stage
    DIRECTORY = (ENABLE = TRUE)
    ENCRYPTION = (TYPE = 'SNOWFLAKE_SSE')
    COMMENT = 'Stage for storing knowledge graph images';

-- Upload the knowledge graph image to the stage
-- Note: PUT runs from a client such as SnowSQL, not from a notebook or worksheet;
-- alternatively, upload the file through the Snowsight web interface
-- PUT file:///path/to/your/knowledge_graph.jpg @knowledge_graph_stage
--     AUTO_COMPRESS = FALSE
--     OVERWRITE = TRUE;

-- Verify the stage contents
LIST @knowledge_graph_stage;


In [None]:
# Step 3: Use AI Complete to Analyze the Image
# This will be handled in the next SQL cell using Snowflake's native AI Complete


In [None]:
-- Use AI Complete to analyze the knowledge graph image and generate SQL
-- TO_FILE passes the staged image to the model, so no manual base64 encoding is needed
-- The model name is an assumption: substitute any image-capable Cortex model
-- available in your region
SELECT SNOWFLAKE.CORTEX.COMPLETE(
    'claude-3-5-sonnet',
    'Analyze this knowledge graph image and generate SQL DDL statements to create a star schema based on the entities and relationships shown in the graph. Requirements: 1) Identify all entities (nodes) in the graph, 2) Identify all relationships between entities, 3) Create a fact table for the central entity, 4) Create dimension tables for related entities, 5) Include proper foreign key relationships, 6) Use appropriate data types for each field, 7) Declare primary keys and omit CREATE INDEX statements, which Snowflake standard tables do not support. The SQL should be production-ready and follow Snowflake best practices. Return only the SQL DDL statements, no explanations.',
    TO_FILE('@knowledge_graph_stage', 'knowledge_graph.jpg')
) AS generated_sql;


In [None]:
# Step 4: Execute Generated SQL Code
# After running the AI analysis query above, copy the generated SQL and execute it here


In [None]:
-- Execute the generated SQL code from the AI analysis
-- Copy the SQL generated by the previous query and paste it here
-- Example of what the AI might generate:

/*
-- Fact Table
CREATE OR REPLACE TABLE fact_central_entity (
    fact_id NUMBER AUTOINCREMENT PRIMARY KEY,
    entity_id VARCHAR(50) NOT NULL,
    dimension_1_id VARCHAR(50),
    dimension_2_id VARCHAR(50),
    dimension_3_id VARCHAR(50),
    measure_1 NUMBER,
    measure_2 NUMBER,
    created_date TIMESTAMP DEFAULT CURRENT_TIMESTAMP(),
    updated_date TIMESTAMP DEFAULT CURRENT_TIMESTAMP()
);

-- Dimension Tables
CREATE OR REPLACE TABLE dim_entity_1 (
    entity_1_id VARCHAR(50) PRIMARY KEY,
    entity_1_name VARCHAR(100) NOT NULL,
    entity_1_description TEXT,
    entity_1_category VARCHAR(50),
    created_date TIMESTAMP DEFAULT CURRENT_TIMESTAMP(),
    updated_date TIMESTAMP DEFAULT CURRENT_TIMESTAMP()
);

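-- Add foreign key constraints
-- (Snowflake records these on standard tables but does not enforce them)
ALTER TABLE fact_central_entity 
ADD CONSTRAINT fk_dim_entity_1 
FOREIGN KEY (dimension_1_id) REFERENCES dim_entity_1(entity_1_id);

-- Standard Snowflake tables do not support CREATE INDEX.
-- For large fact tables, consider a clustering key instead:
-- ALTER TABLE fact_central_entity CLUSTER BY (entity_id);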
*/
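
The model may wrap its answer in Markdown code fences or add comment lines, even when the prompt asks for SQL only. The sketch below shows one way to reduce such a response to individual statements before pasting them into the cell above. The helper name `extract_sql_statements` is hypothetical, and the split on semicolons is deliberately naive: it assumes no semicolons appear inside string literals.

```python
import re

# Markdown code-fence marker, built indirectly so this cell stays easy to embed.
FENCE = "`" * 3
FENCED_BLOCK = re.compile(
    FENCE + r"(?:sql)?[ \t]*\n(.*?)" + FENCE,
    re.DOTALL | re.IGNORECASE,
)


def extract_sql_statements(response: str) -> list[str]:
    """Strip Markdown fences and comment lines, then split into statements."""
    # Prefer the contents of fenced blocks if the model wrapped its answer.
    blocks = FENCED_BLOCK.findall(response)
    text = "\n".join(blocks) if blocks else response
    # Drop full-line "--" comments so they do not travel with the statements.
    lines = [ln for ln in text.splitlines() if not ln.strip().startswith("--")]
    # Naive split: assumes no semicolons inside string literals.
    return [s.strip() for s in "\n".join(lines).split(";") if s.strip()]
```

Each returned string is a single statement without its trailing semicolon, which makes it easier to review the statements one at a time before running them.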


In [None]:
# Step 5: Verify Created Schema
# This will be handled in the next SQL cell


In [None]:
-- Verify the created schema by listing tables and their structure
SHOW TABLES;

-- Describe each table structure (uncomment and run for each table)
-- DESCRIBE TABLE fact_central_entity;
-- DESCRIBE TABLE dim_entity_1;
-- DESCRIBE TABLE dim_entity_2;
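
To make the check above less manual, the table names returned by `SHOW TABLES` can be compared against the names you expected the generated SQL to create. The following is a minimal sketch of that comparison only; the helper name `missing_tables` is hypothetical, and fetching the existing names from the session is left out. The comparison ignores case because Snowflake stores unquoted identifiers in upper case.

```python
def missing_tables(expected: list[str], existing: list[str]) -> list[str]:
    """Return the expected table names that are absent from `existing`."""
    # Unquoted identifiers are stored upper-cased, so compare case-insensitively.
    present = {name.upper() for name in existing}
    return [name for name in expected if name.upper() not in present]


expected = ["fact_central_entity", "dim_entity_1", "dim_entity_2"]
existing = ["FACT_CENTRAL_ENTITY", "DIM_ENTITY_1"]
print(missing_tables(expected, existing))  # prints ['dim_entity_2']
```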


In [None]:
# Step 6: Cleanup (Optional)
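# The stage and any generated tables persist after the notebook finishes.
# Drop them in a SQL cell when they are no longer needed, for example:
#   DROP STAGE IF EXISTS knowledge_graph_stage;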


In [None]:
# Summary
print("Notebook completed! The AI Complete analysis has generated SQL code for creating a star schema based on your knowledge graph image.")
print("Next steps:")
print("1. Review the generated SQL from the AI analysis")
print("2. Execute the SQL to create your database schema")
print("3. Verify the created tables and relationships")
print("4. Load sample data to test the schema")


## Alternative Approach: Using Snowflake SQL with AI Complete

If you prefer to run the workflow from a SQL worksheet instead of a notebook, the same steps condense to the following:

### 1. Create Stage and Upload Image
```sql
-- Create stage (server-side encryption is required for Cortex to read staged files)
CREATE OR REPLACE STAGE knowledge_graph_stage
    DIRECTORY = (ENABLE = TRUE)
    ENCRYPTION = (TYPE = 'SNOWFLAKE_SSE')
    COMMENT = 'Stage for storing knowledge graph images';

-- Upload image (run PUT from a client such as SnowSQL, or upload through Snowsight)
PUT file:///path/to/your/knowledge_graph.jpg @knowledge_graph_stage
    AUTO_COMPRESS = FALSE
    OVERWRITE = TRUE;
```

### 2. Use AI Complete with Image Analysis
```sql
-- Analyze the knowledge graph image and generate SQL
-- The model name is an assumption: use any image-capable Cortex model in your region
SELECT SNOWFLAKE.CORTEX.COMPLETE(
    'claude-3-5-sonnet',
    'Analyze this knowledge graph image and generate SQL DDL statements to create a star schema based on the entities and relationships shown in the graph. Include fact tables, dimension tables, and proper relationships.',
    TO_FILE('@knowledge_graph_stage', 'knowledge_graph.jpg')
) AS generated_sql;
```

### 3. Execute Generated SQL
```sql
-- Copy the generated SQL from the previous query and execute it
-- (The AI will generate CREATE TABLE statements that you can run)
```
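
Because the generated SQL comes from a model, it is worth screening before you run it. The sketch below flags a few statement types that deserve a second look. The `CHECKS` list and the `review_statements` helper are hypothetical and illustrative, not an exhaustive validator, and the input is assumed to be a list of individual statements.

```python
import re

# (pattern, warning) pairs; illustrative only, extend for your own policies.
CHECKS = [
    (re.compile(r"^\s*CREATE\s+(UNIQUE\s+)?INDEX\b", re.IGNORECASE),
     "CREATE INDEX is not supported on standard Snowflake tables"),
    (re.compile(r"^\s*CREATE\s+OR\s+REPLACE\b", re.IGNORECASE),
     "replaces any existing object with the same name"),
    (re.compile(r"^\s*(DROP|TRUNCATE|DELETE)\b", re.IGNORECASE),
     "destructive statement; confirm it is intended"),
]


def review_statements(statements: list[str]) -> list[tuple[str, str]]:
    """Return (statement, warning) pairs for statements needing manual review."""
    findings = []
    for stmt in statements:
        for pattern, warning in CHECKS:
            if pattern.search(stmt):
                findings.append((stmt, warning))
    return findings
```

An empty result does not mean the SQL is safe or correct; it only means none of the listed patterns matched, so a manual read-through is still advisable.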

## Prerequisites

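1. **Snowflake Account**: Access to a Snowflake account in a region where Cortex AI functions are available
2. **Model Access**: An image-capable model available to Cortex in your region (or through cross-region inference)
3. **Permissions**: A role with the `SNOWFLAKE.CORTEX_USER` database role and privileges to create stages and tables in the target schema; ACCOUNTADMIN is not required
4. **Image File**: The knowledge_graph.jpg file uploaded to a server-side-encrypted internal stage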

## Key Features Demonstrated

- **Multimodal AI Analysis**: Using AI Complete to analyze both text and images
- **Custom Prompting**: Tailored prompts for specific business requirements
- **Code Generation**: AI-generated SQL DDL for database schema creation
- **Star Schema Design**: Proper data warehouse design patterns
- **Review Before Execution**: Generated SQL is inspected and edited before it is run
- **Schema Verification**: `SHOW TABLES` and `DESCRIBE TABLE` checks on the created database objects
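
As an illustration of the custom prompting point, the numbered-requirements prompt used in this notebook can be assembled from a list, which makes individual requirements easier to add or remove. This is a minimal sketch: the helper names `build_prompt` and `sql_string_literal` are hypothetical, and the escaping assumes the prompt will be embedded in a single-quoted Snowflake string constant, where both single quotes and backslashes need escaping.

```python
def build_prompt(task: str, requirements: list[str]) -> str:
    """Join a task sentence and numbered requirements into one prompt string."""
    numbered = ", ".join(
        f"{i}) {req}" for i, req in enumerate(requirements, start=1)
    )
    return (
        f"{task} Requirements: {numbered}. "
        "Return only the SQL DDL statements, no explanations."
    )


def sql_string_literal(text: str) -> str:
    """Quote text for use as a single-quoted SQL string constant."""
    # Escape backslashes first, then double any single quotes.
    escaped = text.replace("\\", "\\\\").replace("'", "''")
    return "'" + escaped + "'"
```

Keeping the requirements in a list also makes it simple to try variations of the prompt and compare the SQL each one produces.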

## Next Steps

1. Set the warehouse, database, and schema in the context cell to match your environment
2. Upload the knowledge_graph.jpg file to your Snowflake stage
3. Run the notebook cells sequentially
4. Review and modify the generated SQL as needed
5. Test the created schema with sample data
