A full-stack financial-analytics demo that integrates Snowflake Cortex AI, PostgreSQL with pgvector, and Streamlit for intelligent data analysis and natural-language interaction.
- Real-time Budget Dashboard - Track daily, weekly, and monthly spending with interactive visualizations
- Transaction Management - Approve, decline, or cancel pending transactions with full audit trail
- Financial Data Storage - Robust PostgreSQL backend with comprehensive data models
- Live Data Queries - Dynamic queries with SQLAlchemy ORM
- Natural Language to SQL - Convert plain English questions into PostgreSQL queries using Cortex Complete
- AI-Powered Financial Insights - Get intelligent spending recommendations and budget analysis
- Cortex Analyst Integration - Enterprise-grade natural language query interface
- Snowflake Data Visualization - Display and analyze data from Snowflake tables
- Cortex AI Agent - Interactive chat interface for financial queries
- Context-Aware Responses - Agent remembers conversation history and context
- Data Retrieval & Updates - Agent can both read and write to PostgreSQL
- Subscription Management Demo - Intelligent subscription analysis and cancellation recommendations
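The Natural Language to SQL feature above boils down to a single Snowflake function call. Below is a minimal sketch of how it might be invoked from Python — the model name, prompt wording, and `build_complete_query` helper are illustrative assumptions, not this repo's actual code:

```python
# Hypothetical sketch: sending a natural-language question to Snowflake's
# SNOWFLAKE.CORTEX.COMPLETE SQL function. Model name and schema hint are
# illustrative assumptions.

def build_complete_query(question: str, schema_hint: str) -> tuple[str, tuple]:
    """Build a parameterized call to SNOWFLAKE.CORTEX.COMPLETE."""
    prompt = (
        "You translate questions about personal finances into PostgreSQL. "
        f"Schema: {schema_hint}\nQuestion: {question}\nReturn only SQL."
    )
    # Bind the prompt as a parameter so user input can't break the SQL string.
    return "SELECT SNOWFLAKE.CORTEX.COMPLETE('mistral-large', %s)", (prompt,)

sql, params = build_complete_query(
    "How much did I spend on groceries last week?",
    "transactions(id, txn_date, amount, category, status)",
)
# cursor.execute(sql, params) would run this on an open Snowflake connection.
```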
Showcases three progressively sophisticated search techniques:
- ILIKE Pattern Matching - Basic SQL substring search
- pg_trgm Fuzzy Search - Typo-tolerant trigram matching
- pgvector Semantic Search - AI-powered contextual search with embeddings
- Embedding Generation - Create vector embeddings for semantic search
- pgvector Storage - Store and query embeddings in PostgreSQL
- Intelligent Search - Find transactions by meaning, not just keywords
```
┌─────────────────┐        ┌──────────────────┐        ┌─────────────────┐
│  Streamlit UI   │ ─────► │  Python Backend  │ ─────► │   PostgreSQL    │
│   (Frontend)    │        │  (Application)   │        │   (Database)    │
└─────────────────┘        └──────────────────┘        └─────────────────┘
        │                           │                           │
        │                           ▼                           │
        │                 ┌──────────────────┐                  │
        └────────────────►│ Snowflake Cortex │                  │
                          │   AI Services    │                  │
                          └──────────────────┘                  │
                                    │                           │
                                    ▼                           ▼
                          ┌──────────────────┐        ┌─────────────────┐
                          │  Cortex Analyst  │        │    pgvector     │
                          │  Cortex Agent    │        │  (Embeddings)   │
                          │  Cortex Complete │        └─────────────────┘
                          └──────────────────┘                  ▲
                                                                │
                                                       ┌────────┴────────┐
                                                       │   OpenAI API    │
                                                       │  (Embeddings)   │
                                                       └─────────────────┘
```
- Active Snowflake account with appropriate permissions
- Cortex Analyst enabled on your account
- Cortex AI Agent configured and deployed
- Personal Access Token (PAT) for API authentication
- Warehouse with sufficient compute resources
- PostgreSQL 16+ (cloud or self-hosted)
- pgvector extension installed (for semantic search)
- pg_trgm extension installed (for fuzzy search)
- SSL/TLS connection support recommended
- Python 3.11 or higher
- pip package manager
- Virtual environment (recommended)
- OpenAI API Key (optional, for semantic search with embeddings)
- Get from: https://platform.openai.com/api-keys
- Git for version control
- Text editor or IDE
- Terminal/command line access
```bash
git clone <repository-url>
cd cortex-data-analysis-with-postgres

# Create virtual environment (recommended)
python3 -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt
```

```sql
-- Connect to your PostgreSQL database
CREATE EXTENSION IF NOT EXISTS vector;
CREATE EXTENSION IF NOT EXISTS pg_trgm;
```

```bash
# Run the setup script
python3 scripts/setup_transaction_management.py

# Copy the template
cp .streamlit/secrets_template.toml .streamlit/secrets.toml

# Edit with your credentials
nano .streamlit/secrets.toml
```

Required Configuration:
```toml
# PostgreSQL Connection
[postgres]
host = "your-postgres-host.com"
port = 5432
database = "your_database"
user = "your_username"
password = "your_password"
sslmode = "require"

# Snowflake Connection
[connections.snowflake]
account = "YOUR_ACCOUNT"
user = "YOUR_USERNAME"
password = "YOUR_PASSWORD"
role = "ACCOUNTADMIN"
warehouse = "YOUR_WAREHOUSE"
database = "YOUR_DATABASE"
schema = "PUBLIC"

# Snowflake Cortex Agent
[agent]
SNOWFLAKE_PAT = "your-personal-access-token"
SNOWFLAKE_HOST = "YOUR_ACCOUNT.snowflakecomputing.com"

# OpenAI (Optional - for semantic search)
[openai]
api_key = "sk-proj-your-key-here"
```

Choose one of the following options:
```bash
# Option 1: Load sample transaction data
python3 scripts/load_sample_data.py

# Option 2: Load expanded dataset
python3 data_loaders/bulk_insert_expanded_data.py

# Option 3: Load from SQL backup (see Sample Data section below)
psql -h your-host -U your-user -d your-database -f docs/sample_data/postgres_sample_data.sql
```

If you want to use the pgvector semantic search feature:

```bash
# Generate embeddings for existing transactions
python3 scripts/setup_embeddings.py
```

Start the app:

```bash
streamlit run streamlit_app.py
```

The app will open in your browser at http://localhost:8501.
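Conceptually, the embedding setup step turns each transaction description into a vector that pgvector can store and compare. A hedged sketch of that flow — `to_pgvector` and `embed_descriptions` are illustrative helpers, not this repo's code, and the real OpenAI call is shown commented out:

```python
# Hypothetical sketch of an embedding-setup pass. The placeholder vectors
# stand in for real OpenAI embeddings so the snippet runs offline.

def to_pgvector(values: list[float]) -> str:
    """Format a list of floats as a pgvector literal, e.g. '[0.100000,0.200000]'."""
    return "[" + ",".join(f"{v:.6f}" for v in values) + "]"

def embed_descriptions(descriptions: list[str]) -> list[str]:
    # Real version (assumes the openai>=1.0 client):
    # from openai import OpenAI
    # resp = OpenAI().embeddings.create(model="text-embedding-3-small",
    #                                   input=descriptions)
    # vectors = [d.embedding for d in resp.data]
    vectors = [[0.0] * 3 for _ in descriptions]  # placeholder 3-dim vectors
    return [to_pgvector(v) for v in vectors]

literals = embed_descriptions(["STARBUCKS #1234", "WHOLE FOODS MARKET"])
# Each literal can then be written into a vector(N) column with a plain UPDATE.
```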
Load sample financial data into your PostgreSQL database:

```bash
python3 scripts/load_sample_data.py
```

Or download and restore the sample data backup:

```bash
# Restore to your database
psql -h your-host -U your-user -d your-database -f docs/sample_data/postgres_sample_data.sql
```

Included Data:
- 500+ sample transactions
- 5 account profiles (Checking, Savings, Credit Card, Investment, Emergency Fund)
- Categories: Groceries, Dining, Shopping, Transportation, Utilities, Entertainment, etc.
- Date range: Last 6 months
- Various transaction statuses: pending, approved, completed, declined
Load sample data into your Snowflake account:

```sql
-- Download the Snowflake setup script
-- [Link to snowflake_sample_data.sql will be added here]

-- Run in a Snowflake worksheet
USE DATABASE YOUR_DATABASE;
USE SCHEMA PUBLIC;

-- Create and populate the transactions table
-- (from the SnowSQL CLI: !source snowflake_sample_data.sql)
```

Included Data:
- Transactions table with 1000+ records
- Monthly aggregations
- Category breakdowns
- Spending trends over time
```bash
python3 scripts/snowflake_loader_final.py
```

- Today's Budget Status - Real-time spending vs daily budget
- Weekly Comparison - Current week vs previous week trends
- Monthly Tracking - Visual chart showing budget progress
- Category Breakdown - Spending by category with progress bars
- Smart Insights - AI-powered recommendations based on spending patterns
- Natural Language Interface - Ask questions in plain English
- SQL Generation - Automatic conversion to PostgreSQL queries
- Account Selection - Filter queries by specific accounts
- Query History - Review past queries and results
- Result Visualization - Tables, metrics, and charts
Example Queries:
"How much did I spend on groceries last week?"
"Show me all transactions over $100 this month"
"What's my average daily spending?"
"Which category did I spend the most on?"
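Since Cortex Complete returns free-form text, the generated SQL presumably needs a sanity check before it is run against PostgreSQL. A minimal read-only guard sketch — this helper is an assumption for illustration, not the repo's actual validation logic:

```python
import re

def is_safe_select(generated: str) -> bool:
    """Allow only a single read-only SELECT statement."""
    sql = generated.strip().rstrip(";")
    if ";" in sql:  # reject multi-statement payloads
        return False
    if not re.match(r"(?is)^\s*select\b", sql):
        return False
    # Reject statements that smuggle in writes or DDL.
    forbidden = re.compile(r"(?i)\b(insert|update|delete|drop|alter|create|grant)\b")
    return not forbidden.search(sql)

ok = is_safe_select("SELECT * FROM transactions WHERE amount > 100")   # True
bad = is_safe_select("DROP TABLE transactions")                        # False
```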
- Pending Transactions View - See all transactions awaiting approval
- AI Analysis - Automatic detection of unusual or high-amount transactions
- One-Click Actions - Approve or cancel transactions
- Cancellation Audit Trail - Full history with reasons
- Manual Management - Override AI suggestions when needed
- Interactive Conversation - Natural dialogue with AI agent
- Context Awareness - Agent remembers previous messages
- Data Retrieval - Query PostgreSQL and Snowflake data
- Subscription Management - Identify and cancel unused subscriptions
- Spending Analytics - Get insights from Snowflake aggregations
Three search methods to compare:

1. ILIKE - Traditional SQL pattern matching
   - Fast and simple
   - Exact substring matches
   - Case-insensitive

2. pg_trgm - Fuzzy text search
   - Typo-tolerant
   - Similarity scoring
   - Handles misspellings

3. pgvector - Semantic search
   - AI-powered understanding
   - Finds conceptually similar results
   - Language-agnostic
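The three methods differ only in the shape of the query each one issues. A sketch of what those queries might look like — the table and column names (`transactions.description`, `embedding`) and the `search_sql` helper are assumptions, with psycopg2-style placeholders:

```python
def search_sql(mode: str) -> str:
    """Return an illustrative query for each search technique."""
    if mode == "ilike":   # exact substring, case-insensitive
        return ("SELECT id, description FROM transactions "
                "WHERE description ILIKE '%%' || %(q)s || '%%'")
    if mode == "trgm":    # trigram similarity (requires pg_trgm; %% escapes
        return ("SELECT id, description, "          # the % operator for psycopg2)
                "similarity(description, %(q)s) AS score "
                "FROM transactions WHERE description %% %(q)s "
                "ORDER BY score DESC")
    if mode == "vector":  # cosine distance (requires pgvector)
        return ("SELECT id, description FROM transactions "
                "ORDER BY embedding <=> %(q_vec)s::vector LIMIT 10")
    raise ValueError(f"unknown mode: {mode}")
```

Each string would be executed with `cursor.execute(sql, {"q": term})` (or a pgvector literal for the vector mode) on a live connection.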
```
cortex-data-analysis-with-postgres/
│
├── streamlit_app.py                      # Main application entry point (170 lines)
├── requirements.txt                      # Python dependencies
├── README.md                             # This file
├── .gitignore                            # Git ignore rules
│
├── src/                                  # Application modules
│   ├── __init__.py                       # Package initialization
│   ├── postgres_utils.py                 # PostgreSQL connection utilities
│   ├── budget_dashboard.py               # Budget tracking interface
│   ├── cortex_queries.py                 # Cortex AI query functionality
│   ├── transaction_manager_ui.py         # Transaction management UI
│   ├── cortex_agent.py                   # Snowflake agent chat interface
│   ├── db_utils.py                       # Database utility functions
│   ├── db.py                             # Database session management
│   ├── models.py                         # SQLAlchemy data models
│   └── models_finance.py                 # Financial domain models
│
├── .streamlit/
│   ├── secrets_template.toml             # Configuration template
│   └── secrets.toml                      # Your credentials (git-ignored)
│
├── pages/
│   └── search.py                         # Search demo page
│
├── data_loaders/                         # Data loading utilities
│   ├── bulk_insert_*.py                  # Bulk data import scripts
│   └── *.csv                             # Sample data files
│
├── scripts/                              # Setup and utility scripts
│   ├── setup_embeddings.py               # Generate pgvector embeddings
│   ├── setup_transaction_management.py   # Initialize database
│   ├── load_sample_data.py               # Load sample transactions
│   ├── migrate_add_status.py             # Database migrations
│   └── *.sql                             # SQL utility scripts
│
├── tests/                                # Test and debug scripts
│   ├── test_*.py                         # Test files
│   └── debug_*.py                        # Debug utilities
│
└── docs/                                 # Documentation
    ├── sample_data/                      # SQL backup files
    │   ├── README.md                     # Sample data guide
    │   ├── postgres_sample_data.sql
    │   └── snowflake_sample_data.sql
    └── screenshots/                      # Application screenshots
```
As an alternative to secrets.toml, you can use environment variables:

```bash
# PostgreSQL
export PG_HOST="your-host"
export PG_PORT="5432"
export PG_DB="your-database"
export PG_USER="your-username"
export PG_PASSWORD="your-password"
export PG_SSLMODE="require"

# OpenAI (optional)
export OPENAI_API_KEY="sk-proj-your-key"
```

- Create the agent in Snowflake:
```sql
CREATE OR REPLACE CORTEX AGENT POSTGRES_AGENT
  WAREHOUSE = YOUR_WAREHOUSE
  DATABASE = YOUR_DATABASE
  SCHEMA = AGENTS
  PROMPT = 'You are a financial analysis assistant...';
```

- Generate a Personal Access Token:
  - Go to Snowflake UI → Profile → Personal Access Tokens
  - Click "Generate New Token"
  - Copy the token into secrets.toml

- Configure in secrets.toml:

```toml
[agent]
SNOWFLAKE_PAT = "your-token-here"
SNOWFLAKE_HOST = "YOUR_ACCOUNT.snowflakecomputing.com"
```

```sql
-- Install required extensions
CREATE EXTENSION IF NOT EXISTS vector;   -- For semantic search
CREATE EXTENSION IF NOT EXISTS pg_trgm;  -- For fuzzy search

-- Verify installation
SELECT * FROM pg_extension WHERE extname IN ('vector', 'pg_trgm');
```

- Enable PostgreSQL in the sidebar
- Navigate to Budget Dashboard section
- View real-time spending metrics
- Check category breakdowns
- Review AI insights and recommendations
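The "spending vs daily budget" metric the dashboard shows reduces to simple pro-rating arithmetic. A sketch with assumed numbers — `budget_status` is an illustrative helper, not the repo's actual dashboard code:

```python
def budget_status(spent_today: float, monthly_budget: float,
                  days_in_month: int = 30) -> dict:
    """Compare today's spending against a pro-rated daily budget."""
    daily_budget = monthly_budget / days_in_month
    remaining = daily_budget - spent_today
    return {
        "daily_budget": round(daily_budget, 2),
        "spent": round(spent_today, 2),
        "remaining": round(remaining, 2),
        "over_budget": remaining < 0,
    }

status = budget_status(spent_today=48.75, monthly_budget=1200.0)
# daily budget is 40.00, so this day is 8.75 over
```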
- Go to AI Queries section
- Select an account (optional)
- Type your question: "How much did I spend on dining last month?"
- Click Run Query
- View SQL generation and results
- Go to Transaction Manager
- Click Analyze Pending Transactions
- Review AI-flagged suspicious transactions
- Click cancel button for unwanted transactions
- View confirmation and audit trail
- Navigate to Search Demo page
- Select pgvector Semantic Search
- Enter search term: "morning coffee"
- View contextually similar results
- Compare with ILIKE and pg_trgm results
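Under the hood, a query like "morning coffee" is embedded and compared to stored vectors by cosine distance (pgvector's `<=>` operator). A pure-Python sketch of that comparison — the toy 3-dimensional vectors stand in for real 1536-dimensional OpenAI embeddings, and the row data is invented for illustration:

```python
import math

def cosine_distance(a: list[float], b: list[float]) -> float:
    """What pgvector's <=> operator computes: 1 - cosine similarity."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return 1.0 - dot / (na * nb)

# Toy embeddings standing in for real transaction-description vectors.
rows = {
    "STARBUCKS #1234":   [0.9, 0.1, 0.0],
    "SHELL GAS STATION": [0.0, 0.2, 0.9],
}
query = [0.8, 0.2, 0.1]  # illustrative embedding of "morning coffee"
best = min(rows, key=lambda d: cosine_distance(query, rows[d]))
# best is the coffee-shop row, matched by meaning rather than substring
```

This is why "morning coffee" surfaces coffee-shop transactions even though neither word appears in the description, which ILIKE and pg_trgm cannot do.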
- Streamlit Documentation
- Snowflake Cortex Documentation
- PostgreSQL Documentation
- pgvector GitHub
- OpenAI API Reference
This project is licensed under the MIT License - see the LICENSE file for details.
Built with ❤️ using Streamlit, PostgreSQL, Snowflake Cortex, and AI
Last updated: October 2025




