Transform ML workflows with enterprise database integration. Comprehensive demo portfolio showcasing IntegratedML's flexible model integration through 4 progressive examples: Credit Risk, Fraud Detection, Sales Forecasting, and DNA Similarity.

intersystems-community/integratedml-custom-models

IntegratedML Custom Models

Deploy custom Python ML models directly within SQL queries using InterSystems IRIS 2025.2



🚀 Early Access Program (EAP)

Welcome, EAP participants! IntegratedML Custom Models will reach General Availability (GA) in the InterSystems IRIS 2026.1 release. Until then, this repository is the source of documentation and information about the feature.

Getting Started with EAP

  1. Read the EAP Guide - Understand the program, timeline, and expectations
  2. Install Custom Models - Complete installation in <30 minutes (target)
  3. Check Known Issues - Review current limitations before reporting bugs
  4. Review Roadmap - See what's coming from EAP to GA

How to Provide Feedback

Your feedback directly shapes the final product! Choose your preferred channel:

  • Survey (recommended): Survey links provided by Data Platforms Product Team
  • Email: thomas.dyar@intersystems.com
  • GitHub Issues (if enabled): Technical bugs and feature requests

Response time: 1-2 business days during EAP

For questions or support, see EAP FAQ or email thomas.dyar@intersystems.com.


Overview

IntegratedML Custom Models extends InterSystems IRIS IntegratedML with a powerful new capability: deploy your own Python models directly within SQL queries. While IntegratedML has provided automated ML for years, this feature gives data scientists full control: custom preprocessing, any scikit-learn compatible model, and third-party libraries like Prophet or LightGBM, all executing in-database without data movement.

-- Train your custom Python model with a single SQL command
CREATE MODEL CreditRiskAssessment
PREDICTING (default_risk)
FROM CreditApplications
USING {
    "model_name": "CustomCreditRiskClassifier",
    "path_to_classifiers": "/opt/iris/mgr/python/custom_models/classifiers",
    "user_params": {
        "enable_debt_ratio": 1,
        "enable_risk_scoring": 1
    }
}

-- Get predictions instantly
SELECT customer_id,
       PREDICT(CreditRiskAssessment) as risk_score
FROM NewApplications
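
If you generate the statement from Python, the USING clause can be produced from an ordinary dictionary. The sketch below is only a formatting aid: `build_create_model_sql` is a hypothetical helper, not part of this repository or of IRIS, and it assumes the identifiers you pass in are trusted, since they are interpolated as-is.

```python
import json


def build_create_model_sql(model, target, table, using):
    """Compose a CREATE MODEL statement with a JSON USING clause.

    Hypothetical helper for illustration: it only formats text. It does not
    validate identifiers and does not connect to IRIS.
    """
    using_json = json.dumps(using, indent=4)
    return (
        f"CREATE MODEL {model}\n"
        f"PREDICTING ({target})\n"
        f"FROM {table}\n"
        f"USING {using_json}"
    )


sql = build_create_model_sql(
    model="CreditRiskAssessment",
    target="default_risk",
    table="CreditApplications",
    using={
        "model_name": "CustomCreditRiskClassifier",
        "path_to_classifiers": "/opt/iris/mgr/python/custom_models/classifiers",
        "user_params": {"enable_debt_ratio": 1, "enable_risk_scoring": 1},
    },
)
print(sql)
```

Keeping the parameters in a dictionary makes it easier to version them alongside the model code and to toggle flags such as `enable_debt_ratio` per experiment.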

Key Features

  • SQL Integration: Deploy scikit-learn compatible models directly in SQL
  • Low Latency: Sub-50ms prediction latency for real-time applications
  • Custom Models: Use your own Python models with domain-specific logic
  • In-Database Processing: Train and predict without data exports
  • Scalable: Designed for production workloads
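
To check the latency figure in your own environment, one rough approach is to time individual prediction calls and look at the median and the 95th percentile. The sketch below assumes a `predict` callable that you supply; `dummy_predict` is a placeholder, and in practice it would wrap a `SELECT PREDICT(...)` round trip through whatever database connection you use.

```python
import statistics
import time


def measure_latency_ms(predict, rows, warmup=5):
    """Time predict(row) for each row and summarize the latency in milliseconds."""
    # Warm-up calls are excluded so one-time setup costs do not skew the result.
    for row in rows[:warmup]:
        predict(row)

    timings = []
    for row in rows:
        start = time.perf_counter()
        predict(row)
        timings.append((time.perf_counter() - start) * 1000.0)

    timings.sort()
    p95_index = min(len(timings) - 1, int(0.95 * len(timings)))
    return {"median_ms": statistics.median(timings), "p95_ms": timings[p95_index]}


def dummy_predict(row):
    # Placeholder standing in for a real prediction call (assumption).
    return sum(row) > 1.0


stats = measure_latency_ms(dummy_predict, [[0.2, 0.9]] * 200)
print(stats)
```

Reporting a percentile alongside the median helps expose occasional slow calls that an average would hide.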

Demo Applications

1. Credit Risk Assessment

Financial risk modeling with custom feature engineering for loan default prediction.

  • Model: Custom ensemble classifier
  • Test Data: 10,000 records
  • Training Time: ~2.3 seconds
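
As an illustration of the kind of feature engineering involved, the sketch below derives a debt-to-income ratio that can be switched on or off by a flag, mirroring the `enable_debt_ratio` parameter shown in the overview. The field names `monthly_debt` and `monthly_income` are assumptions made for this example, not the demo's actual schema.

```python
def add_engineered_features(row, enable_debt_ratio=1):
    """Return a copy of one application record with derived features added.

    Field names are hypothetical; adapt them to your own table.
    """
    features = dict(row)
    if enable_debt_ratio:
        income = features["monthly_income"]
        # Guard against division by zero for applicants reporting no income.
        features["debt_ratio"] = (
            features["monthly_debt"] / income if income > 0 else float("inf")
        )
    return features


applicant = {"customer_id": 1, "monthly_debt": 1200.0, "monthly_income": 4000.0}
enriched = add_engineered_features(applicant)
print(enriched["debt_ratio"])  # 0.3
```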

2. Fraud Detection

Transaction fraud detection using ensemble methods.

  • Model: Multi-model ensemble (Neural + Rules + Anomaly)
  • Test Data: 25,000 transactions
  • Latency: <50ms per prediction
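
One common way to combine several detectors is a weighted average of their scores followed by a threshold. The sketch below shows that idea only; the weights, the threshold, and the function names are assumptions for illustration and are not taken from the demo's implementation.

```python
def ensemble_fraud_score(neural, rules, anomaly, weights=(0.5, 0.3, 0.2)):
    """Blend three component scores, each expected to lie in [0, 1]."""
    scores = (neural, rules, anomaly)
    if any(not 0.0 <= s <= 1.0 for s in scores):
        raise ValueError("component scores must lie in [0, 1]")
    # Normalize by the weight total so the result also stays in [0, 1].
    return sum(w * s for w, s in zip(weights, scores)) / sum(weights)


def is_flagged(score, threshold=0.5):
    """Flag a transaction when the blended score reaches the threshold."""
    return score >= threshold


score = ensemble_fraud_score(neural=0.9, rules=1.0, anomaly=0.4)
print(round(score, 2), is_flagged(score))
```

A weighted average keeps each component's contribution easy to explain; a stacked meta-model is another option when labeled data is plentiful.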

3. Sales Forecasting

Time-series forecasting combining Prophet with LightGBM.

  • Model: Prophet + LightGBM hybrid
  • Forecast error: 26.9% MAPE
  • Features: Seasonality, holidays
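
MAPE (mean absolute percentage error) is the average of |actual - forecast| / |actual|, expressed as a percentage, so lower is better. The sketch below is a minimal implementation with made-up numbers; skipping periods whose actual value is zero is a choice made for this example, since the ratio is undefined there.

```python
def mape(actual, forecast):
    """Mean absolute percentage error, in percent.

    Periods with an actual value of zero are skipped (an assumption of this
    sketch) because the percentage error is undefined for them.
    """
    if len(actual) != len(forecast):
        raise ValueError("actual and forecast must have the same length")
    pairs = [(a, f) for a, f in zip(actual, forecast) if a != 0]
    if not pairs:
        raise ValueError("no periods with a non-zero actual value")
    return 100.0 * sum(abs((a - f) / a) for a, f in pairs) / len(pairs)


actual = [100.0, 200.0, 50.0]
forecast = [90.0, 220.0, 50.0]
print(round(mape(actual, forecast), 2))  # 6.67
```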

4. DNA Similarity Analysis

Sequence analysis using custom similarity algorithms.

  • Model: K-NN with custom distance metrics
  • Test Data: 5,000 sequences
  • Features: GC content, motif search
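
To make the ingredients concrete, the sketch below computes GC content and runs a small k-NN vote with a pluggable distance function. The GC-based distance and the toy sequences are stand-ins chosen for illustration; the demo's actual distance metrics are not reproduced here.

```python
from collections import Counter


def gc_content(seq):
    """Fraction of bases that are G or C (0.0 for an empty sequence)."""
    seq = seq.upper()
    return sum(base in "GC" for base in seq) / len(seq) if seq else 0.0


def gc_distance(a, b):
    """Illustrative custom metric: absolute difference in GC content."""
    return abs(gc_content(a) - gc_content(b))


def knn_predict(query, labeled, k=3, distance=gc_distance):
    """Return the majority label among the k labeled sequences nearest to query."""
    nearest = sorted(labeled, key=lambda item: distance(query, item[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


training = [
    ("GGCCGGCC", "gc_rich"),
    ("GCGCGCAT", "gc_rich"),
    ("GGGCCCTA", "gc_rich"),
    ("ATATATAT", "at_rich"),
    ("AATTAAGC", "at_rich"),
    ("TTTTAAAC", "at_rich"),
]
print(gc_content("ATGC"))                 # 0.5
print(knn_predict("GGGCGCAA", training))  # gc_rich
```

Passing the distance as a parameter keeps the voting logic unchanged when you swap in a different metric, such as an edit distance or a k-mer profile comparison.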

Quick Start

Prerequisites

  • InterSystems IRIS 2025.2+
  • Python 3.8+
  • Docker & Docker Compose

Installation

# Clone the repository
git clone https://github.com/intersystems-community/integratedml-custom-models.git
cd integratedml-custom-models

# Setup environment (installs dependencies + starts IRIS)
make setup

# Run all demos
make demos

Running Individual Demos

# Credit Risk Assessment
make demo-credit

# Fraud Detection
make demo-fraud

# Sales Forecasting
make demo-sales

# DNA Similarity
make demo-dna

Documentation

This project's documentation is organized into four main areas:

🔶 EAP Documentation (Start Here!)

Essential guides for Early Access Program participants:

📚 Core Documentation (docs/)

Cross-cutting technical documentation for the entire project:

🎯 Demo Applications (demos/)

Working examples with demo-specific setup instructions:

📋 Feature Specifications (specs/)

Design documents and implementation plans for new features:

  • Feature specifications with user stories and acceptance criteria
  • Implementation plans with architecture decisions
  • Task breakdowns and validation results

Architecture

┌──────────────────────────────────────────────────────────────┐
│                        SQL Interface                         │
│  CREATE MODEL | TRAIN MODEL | VALIDATE | PREDICT()           │
└─────────────────────┬────────────────────────────────────────┘
                      │
┌─────────────────────▼────────────────────────────────────────┐
│                    IntegratedML Engine                       │
│  • Model Management  • Parameter Handling  • Serialization   │
└─────────────────────┬────────────────────────────────────────┘
                      │
┌─────────────────────▼────────────────────────────────────────┐
│                  Custom Python Models                        │
│  • Scikit-learn Compatible  • Domain-Specific Logic          │
│  • Feature Engineering      • Custom Algorithms              │
└──────────────────────────────────────────────────────────────┘

Testing

# Run all tests
make test

# Run E2E test with enhanced data volumes
python tests/test_all_demos_e2e.py

# Run specific demo tests
pytest demos/credit_risk/tests/ -v
pytest demos/fraud_detection/tests/ -v

Test Results (Latest)

| Demo | Data Volume | Training Time | Performance |
|---|---|---|---|
| Credit Risk | 10,000 records | 2.3s | 100% accuracy |
| Fraud Detection | 25,000 transactions | 11.7s | 192 flagged |
| Sales Forecasting | 365 days × 5 stores | 0.4s | 26.9% MAPE |
| DNA Similarity | 5,000 sequences | 1.7s | 50.5% accuracy |

Development

Project Structure

integratedml-custom-models/
├── demos/                    # Demo applications
│   ├── credit_risk/         # Credit risk assessment
│   ├── fraud_detection/     # Fraud detection system
│   ├── sales_forecasting/   # Time series forecasting
│   └── dna_similarity/      # DNA sequence analysis
├── shared/                  # Shared components
│   ├── models/             # Base model classes
│   ├── database/           # IRIS connection utilities
│   └── utils/              # Helper functions
├── docker/                  # Docker configuration
├── notebooks/              # Jupyter notebooks
├── tests/                  # Test suites
└── scripts/                # Utility scripts

Creating Custom Models

  1. Extend the base model class:
from shared.models.base import IntegratedMLBaseModel

class MyCustomModel(IntegratedMLBaseModel):
    def fit(self, X, y, **params):
        # Your training logic
        pass

    def predict(self, X):
        # Your prediction logic
        pass
  2. Deploy to IRIS:
CREATE MODEL MyModel
PREDICTING (target)
FROM MyTable
USING {
    "model_name": "MyCustomModel",
    "path_to_classifiers": "/path/to/models"
}
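
For a sense of what a filled-in subclass might look like, here is a self-contained toy example. The `IntegratedMLBaseModel` class defined below is a stand-in so the snippet runs on its own; it assumes only the `fit`/`predict` interface shown above and is not the repository's actual base class. `ThresholdModel` is likewise hypothetical.

```python
class IntegratedMLBaseModel:
    """Stand-in for shared.models.base.IntegratedMLBaseModel (assumed interface)."""

    def fit(self, X, y, **params):
        raise NotImplementedError

    def predict(self, X):
        raise NotImplementedError


class ThresholdModel(IntegratedMLBaseModel):
    """Toy binary classifier that thresholds the first feature.

    The threshold is the midpoint between the two class means.
    """

    def fit(self, X, y, **params):
        pos = [row[0] for row, label in zip(X, y) if label == 1]
        neg = [row[0] for row, label in zip(X, y) if label == 0]
        if not pos or not neg:
            raise ValueError("need at least one example of each class")
        pos_mean = sum(pos) / len(pos)
        neg_mean = sum(neg) / len(neg)
        self.threshold_ = (pos_mean + neg_mean) / 2
        # Remember on which side of the threshold the positive class lies.
        self.positive_above_ = pos_mean >= self.threshold_
        return self  # returning self follows the scikit-learn convention

    def predict(self, X):
        return [int((row[0] >= self.threshold_) == self.positive_above_) for row in X]


model = ThresholdModel().fit([[0.1], [0.2], [0.8], [0.9]], [0, 0, 1, 1])
print(model.predict([[0.15], [0.85]]))  # [0, 1]
```

Storing learned state in attributes with a trailing underscore (`threshold_`) mirrors scikit-learn's naming for fitted parameters, which makes it clear what must survive serialization.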

Contributing

We welcome contributions! Please see our Contributing Guide for details.

  1. Fork the repository
  2. Create your feature branch (git checkout -b feature/AmazingFeature)
  3. Commit your changes (git commit -m 'Add some AmazingFeature')
  4. Push to the branch (git push origin feature/AmazingFeature)
  5. Open a Pull Request

License

This project is licensed under the MIT License - see the LICENSE file for details.

Support

For support with IntegratedML Custom Models:
