# Low-Level API: Complete Control

**Goal:** Direct access to all internal operations. Maximum flexibility, maximum responsibility.

This is for advanced users who need complete control over every aspect of the transfer.

## Why This Matters

- **Complete Control**: Access every internal operation
- **Raw Performance**: Bypass all abstractions when needed  
- **Custom Logic**: Implement any transfer pattern imaginable
- **Deep Integration**: Build tools on top of the core primitives

## Real World Usage

This level is aimed at database engineers, tool builders, and performance specialists who need access to the raw machinery.


In [2]:
from snowpark_db_api import LowLevelTransferEngine
from snowpark_db_api.core import DataTransfer
from snowpark_db_api.config import get_config
from snowpark_db_api.snowflake_connection import SnowflakeConnection, ConnectionConfig

# Also import Snowpark directly for raw operations
from snowflake.snowpark import Session
import logging

## 1. Raw Transfer Engine

Direct access to the core transfer machinery. You handle everything explicitly.


In [3]:
config = get_config()
engine = LowLevelTransferEngine(config)

In [4]:
print("Establishing raw connections...")
connections = engine.establish_raw_connections()
print(f"Connection success: {connections['success']}")

Establishing raw connections...
2025-07-03 01:58:03 - snowpark_db_api.core - INFO - Setting up connections
2025-07-03 01:58:05 - snowpark_db_api.core - INFO - All connections established
Connection success: True


In [5]:
raw_query = "SELECT TOP 10 ID, Column0 FROM dbo.RandomDataWith100Columns"
source_results = engine.execute_raw_query(raw_query)
print(f"Source query returned {len(source_results)} rows")

2025-07-03 01:58:48 - snowpark_db_api.connections - INFO - Successfully connected to SQL Server: alingtestserver.database.windows.net
Source query returned 10 rows


In [8]:
# Build a Snowpark DataFrame from a small table first (dbo.ORDERS)
df = engine.create_snowpark_dataframe("dbo.ORDERS")
print(f"DataFrame created from dbo.ORDERS table: {type(df)}")

2025-07-03 01:59:17 - snowpark_db_api.connections - INFO - Successfully connected to SQL Server: alingtestserver.database.windows.net
2025-07-03 01:59:18 - snowpark_db_api.connections - INFO - Successfully connected to SQL Server: alingtestserver.database.windows.net
2025-07-03 01:59:20 - snowpark_db_api.connections - INFO - Successfully connected to SQL Server: alingtestserver.database.windows.net
DataFrame created from dbo.ORDERS table: <class 'snowflake.snowpark.table.Table'>


In [10]:
df.toPandas().head(10)

| | O_ORDERKEY | O_CUSTKEY | O_TOTALPRICE | O_ORDERDATE |
|---|---|---|---|---|
| 0 | 10101 | 98 | 3689.52 | 2024-02-11 |
| 1 | 10102 | 99 | 3210.76 | 2024-02-10 |
| 2 | 10103 | 100 | 2956.73 | 2024-02-09 |
| 3 | 10104 | 1 | 1478.52 | 2024-02-08 |
| 4 | 9101 | 98 | 1269.5 | 2024-11-07 |
| 5 | 9102 | 99 | 1087.15 | 2024-11-06 |
| 6 | 9103 | 100 | 1523.31 | 2024-11-05 |
| 7 | 9104 | 1 | 1388.9 | 2024-11-04 |
| 8 | 9105 | 2 | 1861.24 | 2024-11-03 |
| 9 | 9106 | 3 | 3525.5 | 2024-11-02 |


In [18]:
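print("Testing raw query execution...")
simple_raw_query = "SELECT TOP 5 ID, Column0 FROM dbo.RandomDataWith100Columns"
raw_results = engine.execute_raw_query(simple_raw_query)
# Print the rows explicitly; an assignment alone displays nothing in a notebook cell
print(raw_results)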

Testing raw query execution...
2025-07-03 02:01:33 - snowpark_db_api.connections - INFO - Successfully connected to SQL Server: alingtestserver.database.windows.net
[(1, 77600), (2, 18748), (3, 15938), (4, 20309), (5, 99893)]


In [19]:
engine.cleanup_raw_connections()
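Because cleanup is manual at this level, an exception between `establish_raw_connections()` and `cleanup_raw_connections()` would leave connections open. One way to guard against that is a small context manager. The following is a minimal sketch, not part of the library: `raw_connections` and `FakeEngine` are hypothetical names, and it assumes the object returned by `establish_raw_connections()` behaves like a dict with a `success` key, as the output above suggests.

```python
from contextlib import contextmanager


@contextmanager
def raw_connections(engine):
    """Open raw connections and always release them, even on error."""
    connections = engine.establish_raw_connections()
    try:
        # Assumption: the result is dict-like and exposes a 'success' flag.
        if not connections.get("success"):
            raise RuntimeError("raw connections could not be established")
        yield connections
    finally:
        engine.cleanup_raw_connections()


class FakeEngine:
    """Hypothetical stand-in so the sketch runs without a database."""

    def __init__(self):
        self.cleaned_up = False

    def establish_raw_connections(self):
        return {"success": True}

    def cleanup_raw_connections(self):
        self.cleaned_up = True


fake_engine = FakeEngine()
with raw_connections(fake_engine) as conns:
    print(f"Connection success: {conns['success']}")
print(f"Cleaned up: {fake_engine.cleaned_up}")
```

With a real engine, the body of the `with` block would hold the `execute_raw_query` and `create_snowpark_dataframe` calls shown in this section.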

## 2. Direct DataTransfer Usage

Use the core `DataTransfer` class directly. This is what all higher-level APIs build on.


In [20]:
# Create DataTransfer instance directly
config = get_config()
config.snowflake.database = "DB_API_MSSQL"
config.snowflake.create_db_if_missing = False
# Manually configure the transfer settings
config.transfer.source_table = "dbo.Orders"
config.transfer.destination_table = "DIRECT_ORDERS"
config.transfer.mode = "overwrite"
config.transfer.fetch_size = 1000  # Custom fetch size
config.transfer.max_workers = 4    # Custom parallelism

transfer = DataTransfer(config)
success = transfer.setup_connections()

2025-07-03 02:02:56 - snowpark_db_api.core - INFO - Setting up connections
2025-07-03 02:02:57 - snowpark_db_api.core - INFO - All connections established


In [25]:
source_conn = transfer.source_connection()
cursor = source_conn.cursor()
cursor.execute("SELECT TOP 5 ID FROM dbo.RandomDataWith100Columns")
results = cursor.fetchall()
cursor.close()
print(f"Direct query returned {len(results)} rows")

2025-07-03 02:04:26 - snowpark_db_api.connections - INFO - Successfully connected to SQL Server: alingtestserver.database.windows.net
Direct query returned 5 rows
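Each call to the source connection factory appears to open a new connection (note the "Successfully connected" log line), so the caller is responsible for closing both the cursor and the connection. The sketch below shows that pattern with `contextlib.closing`. It uses the standard-library `sqlite3` module purely as a stand-in for the SQL Server driver so that it runs anywhere; `make_connection_factory` and `fetch_rows` are hypothetical helper names, and the query is written for SQLite rather than T-SQL.

```python
import sqlite3
from contextlib import closing


def make_connection_factory(database=":memory:"):
    """Return a zero-argument callable that opens a new DB-API connection."""
    def connection_factory():
        return sqlite3.connect(database)
    return connection_factory


def fetch_rows(connection_factory, sql):
    """Run one query on a fresh connection, then close cursor and connection."""
    with closing(connection_factory()) as conn:
        with closing(conn.cursor()) as cursor:
            cursor.execute(sql)
            return cursor.fetchall()


factory = make_connection_factory()
rows = fetch_rows(factory, "SELECT 1 AS ID UNION ALL SELECT 2 UNION ALL SELECT 3")
print(f"Direct query returned {len(rows)} rows")
```

The same `fetch_rows` shape should work with any DB-API 2.0 factory, which is presumably what `transfer.source_connection` is, though that is an inference from the output in this notebook rather than a documented guarantee.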


In [26]:
# Test Snowflake session
sf_result = transfer.session.sql("SELECT CURRENT_DATABASE()").collect()
print(f"Current database: {sf_result[0][0]}")

Current database: DB_API_MSSQL


In [27]:
print(f"Source connection factory: {transfer.source_connection}")
print(f"Snowflake session: {type(transfer.session)}")
print(f"Snowflake connection: {type(transfer.snowflake_connection)}")

Source connection factory: <function create_sqlserver_connection.<locals>.connection_factory at 0xffff757456c0>
Snowflake session: <class 'snowflake.snowpark.session.Session'>
Snowflake connection: <class 'snowpark_db_api.snowflake_connection.SnowflakeConnection'>


In [30]:
# Execute the transfer with custom parameters
transfer_success = transfer.transfer_table(
    query="(SELECT TOP 5 ID FROM dbo.RandomDataWith100Columns) as transfer_test",  # Custom query, passed as an aliased subquery
    limit_rows=100  # Upper bound on rows; this query returns only 5
)

2025-07-03 02:05:59 - snowpark_db_api.core - INFO - Starting transfer using custom query -> DIRECT_ORDERS
2025-07-03 02:06:00 - snowpark_db_api.connections - INFO - Successfully connected to SQL Server: alingtestserver.database.windows.net
2025-07-03 02:06:01 - snowpark_db_api.connections - INFO - Successfully connected to SQL Server: alingtestserver.database.windows.net
2025-07-03 02:06:02 - snowpark_db_api.connections - INFO - Successfully connected to SQL Server: alingtestserver.database.windows.net
2025-07-03 02:06:05 - snowpark_db_api.core - INFO - Transferring 5 rows
2025-07-03 02:06:06 - snowpark_db_api.core - INFO - Transfer completed: 5 rows in 6.5s
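The custom query above is passed in the form `(SELECT ...) as alias`. If queries are built programmatically, a small helper can produce that shape consistently. This is a sketch under the assumption that the parenthesized, aliased form is what `transfer_table` expects, as the call above suggests; `as_aliased_subquery` is a hypothetical name, not a library function.

```python
import re


def as_aliased_subquery(select_sql, alias):
    """Wrap a SELECT statement as '(SELECT ...) AS alias'."""
    # Keep the alias to a plain identifier so it cannot inject SQL.
    if not re.fullmatch(r"[A-Za-z_][A-Za-z0-9_]*", alias):
        raise ValueError(f"invalid alias: {alias!r}")
    # Drop surrounding whitespace and a trailing semicolon, if any.
    body = select_sql.strip().rstrip(";").strip()
    if not body.upper().startswith("SELECT"):
        raise ValueError("expected a SELECT statement")
    return f"({body}) AS {alias}"


query = as_aliased_subquery(
    "SELECT TOP 5 ID FROM dbo.RandomDataWith100Columns;", "transfer_test"
)
print(query)
```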


In [31]:
# Access detailed transfer statistics
stats = transfer.transfer_stats
print("Transfer statistics:")
print(f"  - Rows transferred: {stats.get('rows_transferred', 'N/A')}")
print(f"  - Duration: {stats.get('duration_seconds', 'N/A')} seconds")
print(f"  - Memory used: {stats.get('memory_used_mb', 'N/A')} MB")
print(f"  - Errors: {stats.get('errors', 'N/A')}")

Transfer statistics:
  - Rows transferred: 5
  - Duration: 6.452975 seconds
  - Memory used: 0.0 MB
  - Errors: 0
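The raw statistics can be turned into a throughput figure, which is often more useful than duration alone when comparing `fetch_size` or `max_workers` settings. A minimal sketch, assuming the stats object is dict-like with the keys printed above; `rows_per_second` is a hypothetical helper and the sample dict simply copies the values from this run.

```python
def rows_per_second(stats):
    """Return rows transferred per second, or None if it cannot be computed."""
    rows = stats.get("rows_transferred")
    seconds = stats.get("duration_seconds")
    # Missing keys or a zero duration make the rate undefined.
    if rows is None or not seconds:
        return None
    return rows / seconds


# Values copied from the transfer statistics printed above.
sample_stats = {
    "rows_transferred": 5,
    "duration_seconds": 6.452975,
    "memory_used_mb": 0.0,
    "errors": 0,
}

rate = rows_per_second(sample_stats)
print(f"Throughput: {rate:.2f} rows/s")
```

For a 5-row transfer the figure is dominated by connection setup time, so it says little about bulk performance; it becomes meaningful on larger tables.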


In [32]:
transfer.cleanup()

## 3. Custom Configuration & Advanced Patterns

Build your own configuration objects and implement custom transfer logic.


In [36]:
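# # Build custom configuration programmatically
# # (Config is assumed to live in snowpark_db_api.config alongside the other config classes)
# from snowpark_db_api.config import (
#     Config, SourceConfig, SnowflakeConfig, TransferConfig, DatabaseType
# )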

# # Create completely custom configuration
# custom_source = SourceConfig(
#     host="custom-sql-server.company.com",
#     port=1433,
#     database="CustomDatabase",
#     username="custom_user",
#     password="custom_password",
#     driver="ODBC Driver 17 for SQL Server"
# )

# custom_snowflake = SnowflakeConfig(
#     account="custom_account",
#     user="custom_user", 
#     password="custom_password",
#     role="CUSTOM_ROLE",
#     warehouse="CUSTOM_WH",
#     database="CUSTOM_DB",
#     db_schema="CUSTOM_SCHEMA",
#     create_db_if_missing=True
# )

# custom_transfer = TransferConfig(
#     source_table="",  # Will be set per transfer
#     destination_table="",  # Will be set per transfer
#     mode="overwrite",
#     fetch_size=5000,  # High performance setting
#     query_timeout=300,  # 5 minute timeout
#     max_workers=8,  # Maximum parallelism
#     save_metadata=True  # Save detailed logs
# )

# # Combine into complete config
# custom_config = Config(
#     database_type=DatabaseType.SQLSERVER,
#     source=custom_source,
#     snowflake=custom_snowflake,
#     transfer=custom_transfer,
#     log_level="INFO"
# )

# print("🔧 Custom configuration created")
# print(f"Source: {custom_config.source.host}:{custom_config.source.port}")
# print(f"Target: {custom_config.snowflake.account}.{custom_config.snowflake.database}")
# print(f"Performance: {custom_config.transfer.max_workers} workers, {custom_config.transfer.fetch_size} fetch size")

# # Use the custom config with raw transfer_data function
# custom_config.transfer.source_table = "dbo.Orders"
# custom_config.transfer.destination_table = "CUSTOM_ORDERS"

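# # This is the lowest level - direct function call
# # (assumes transfer_data is importable from snowpark_db_api.core; adjust the import if it lives elsewhere)
# from snowpark_db_api.core import transfer_data
# success = transfer_data(
#     config=custom_config,
#     query="(SELECT TOP 50 * FROM dbo.Orders WHERE O_ORDERDATE >= '2023-01-01') AS recent_orders_custom"
# )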

# print(f"Custom low-level transfer: {success}")
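The commented-out configuration above hardcodes credentials as placeholders. In practice these values usually come from the environment. The sketch below collects the source settings from environment variables and returns them under the same keyword names used for `SourceConfig` above. The function name and the `SOURCE_*` variable names are hypothetical, chosen for illustration only.

```python
import os


def source_settings_from_env(prefix="SOURCE_", environ=None):
    """Collect source connection settings from environment variables."""
    env = os.environ if environ is None else environ
    required = ("HOST", "DATABASE", "USERNAME", "PASSWORD")
    missing = [prefix + name for name in required if not env.get(prefix + name)]
    if missing:
        raise KeyError("missing environment variables: " + ", ".join(missing))
    return {
        "host": env[prefix + "HOST"],
        # 1433 is the default SQL Server port, as in the example above.
        "port": int(env.get(prefix + "PORT", "1433")),
        "database": env[prefix + "DATABASE"],
        "username": env[prefix + "USERNAME"],
        "password": env[prefix + "PASSWORD"],
    }


# A plain dict stands in for os.environ so the sketch is reproducible.
example_env = {
    "SOURCE_HOST": "custom-sql-server.company.com",
    "SOURCE_DATABASE": "CustomDatabase",
    "SOURCE_USERNAME": "custom_user",
    "SOURCE_PASSWORD": "custom_password",
}
settings = source_settings_from_env(environ=example_env)
print({k: ("***" if k == "password" else v) for k, v in settings.items()})
```

If `SourceConfig` accepts these keyword arguments as shown in the commented code, the result could be passed as `SourceConfig(**settings, driver="ODBC Driver 17 for SQL Server")`.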


## 4. Direct Snowpark Operations

For the most advanced users: direct access to Snowpark DataFrames and operations.


In [35]:
# # Create a DataTransfer instance to get access to Snowpark session
# config = get_config()
# transfer = DataTransfer(config)
# transfer.setup_connections()

# # Get the raw Snowpark session
# session = transfer.session
# print(f"Snowpark session: {session}")

# # Create DataFrame with complete Snowpark control
# df = session.read.dbapi(
#     transfer.source_connection,
#     query="SELECT TOP 10 * FROM dbo.Orders",
#     fetch_size=100,
#     query_timeout=30,
#     max_workers=2
# )

# # Direct Snowpark DataFrame operations
# print("🔍 Direct Snowpark DataFrame operations:")
# print(f"Schema: {df.schema}")
# print(f"Columns: {df.columns}")

# # Apply Snowpark transformations
# from snowflake.snowpark.functions import col, when, lit

# # Custom transformations using Snowpark directly
# transformed_df = (df
#     .select(
#         col("O_ORDERKEY").alias("order_id"),
#         col("O_CUSTKEY").alias("customer_id"), 
#         col("O_TOTALPRICE").alias("total_price"),
#         when(col("O_TOTALPRICE") > 1000, lit("HIGH"))
#         .when(col("O_TOTALPRICE") > 500, lit("MEDIUM"))
#         .otherwise(lit("LOW")).alias("price_category")
#     )
#     .filter(col("O_TOTALPRICE") > 0)
#     .order_by(col("O_TOTALPRICE").desc())
# )

# print("📊 DataFrame transformed with custom Snowpark logic")

# # Show the results
# print("Sample data:")
# transformed_df.show(5)

# # Write with complete control over Snowpark write options
# # (a blocking save_as_table call returns None, so there is no result object to capture)
# (transformed_df
#     .write
#     .mode("overwrite")
#     .option("compression", "gzip")
#     .save_as_table("ADVANCED_ORDERS_ANALYSIS"))

# print("Advanced Snowpark write completed: ADVANCED_ORDERS_ANALYSIS")

# # Clean up
# transfer.cleanup()
# print("⚡ Direct Snowpark operations completed")
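The `when(...).when(...).otherwise(...)` chain above is evaluated top to bottom, and the first matching branch wins. A plain-Python equivalent makes the boundary behavior easy to check before running it against Snowflake. This sketch mirrors only the thresholds in the commented code; `price_category` is a hypothetical helper, and it covers non-null numeric input only.

```python
def price_category(total_price):
    """Mirror the Snowpark when/otherwise chain for a single non-null value."""
    if total_price > 1000:
        return "HIGH"
    if total_price > 500:
        return "MEDIUM"
    return "LOW"


# The first value comes from the sample output earlier; the rest probe the boundaries.
for price in (3689.52, 1000, 750.0, 500, 12.5):
    print(f"{price}: {price_category(price)}")
```

Both comparisons are strict, so a total of exactly 1000 is classified as MEDIUM and exactly 500 as LOW.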


## Summary: When to Use Low-Level API

**Use this API when:**
- ✅ You need complete control over every operation
- ✅ You're building tools on top of the core functionality
- ✅ You need custom performance optimizations
- ✅ You want direct access to Snowpark DataFrames
- ✅ You're implementing complex custom logic
- ✅ You need detailed debugging and monitoring

**Don't use this API when:**
- ❌ You just want to transfer data quickly (use High-Level API)  
- ❌ You want productive workflows (use Mid-Level API)
- ❌ You don't need the complexity
- ❌ You're not comfortable with manual resource management

**What you learned:**
- `LowLevelTransferEngine` - Raw access to all transfer operations
- `DataTransfer` - Core transfer class with manual control
- `transfer_data()` - Lowest level transfer function
- Custom `Config` building - Programmatic configuration
- Direct Snowpark operations - Complete DataFrame control
- Manual connection and resource management

**Key Principle:** With great power comes great responsibility. You control everything, but you must handle everything.

**Journey Complete:** You now have three levels of API to choose from based on your needs!
