Skip to content

Migration script generates invalid Python syntax with MySQL/PostgreSQL #3497

@dedsm

Description

@dedsm

Describe the bug

The db_migration.sh script generates migration files with invalid Python syntax when used with MySQL or PostgreSQL databases. The issue occurs during Alembic's autogenerate step when it encounters DynamicPickleType and DynamicJSON custom types.

Alembic generates code like:

type_=google.adk.sessions.database_session_service.DynamicPickleType(
    pickler=<module 'pickle' from '/usr/local/lib/python3.13/pickle.py'>, 
    impl=PickleType()
)

This causes two errors:

  1. SyntaxError: pickler=<module 'pickle'...> is invalid Python syntax
  2. NameError: pickle and PickleType are not imported in the generated file

To Reproduce

  1. Install ADK with MySQL support:

    pip install google-adk pymysql
  2. Set up ADK with MySQL database using DatabaseSessionService:

    from google.adk.sessions.database_session_service import DatabaseSessionService
    
    session_service = DatabaseSessionService(
        db_url="mysql+pymysql://user:pass@localhost:3306/mydb"
    )
  3. Run the migration script:

    curl -fsSL https://raw.githubusercontent.com/google/adk-python/main/scripts/db_migration.sh \
      | sh -s -- "mysql+pymysql://user:pass@localhost:3306/mydb" \
                 "google.adk.sessions.database_session_service"
  4. The script completes successfully but leaves migration files in alembic/versions/*.py

  5. Try to run the migration manually:

    alembic upgrade head
  6. Observe SyntaxError or NameError when Python tries to import the generated migration file

Expected behavior

Generated migration files should have valid Python syntax:

import google.adk.sessions.database_session_service
import pickle
from sqlalchemy import PickleType

def upgrade() -> None:
    op.alter_column('events', 'actions',
        type_=google.adk.sessions.database_session_service.DynamicPickleType(
            pickler=pickle, 
            impl=PickleType()
        ),
        existing_nullable=False)

Desktop (please complete the following information):

  • OS: Linux (also affects macOS/Windows)
  • Python version: 3.13 (also affects 3.11+)
  • ADK version: 1.18.0

Model Information:

N/A - This issue is database-related and doesn't involve LLM models.

Additional context

Root Cause:

This issue is MySQL/PostgreSQL-specific because:

  1. DynamicPickleType uses dialect-specific implementations:

    • SQLite: Uses default PickleType (simple BLOB)
    • MySQL: Uses mysql.LONGBLOB with custom pickling
    • PostgreSQL: Likely similar to MySQL
  2. Alembic's autogenerate tries to preserve the exact type definition for MySQL's LONGBLOB, including constructor parameters

  3. The pickler parameter is set to the pickle module object, which Alembic serializes as <module 'pickle'...> instead of the name pickle

  4. The generated migration files are missing required imports

The original script works perfectly with SQLite because SQLite's type system is simpler and doesn't trigger complex type serialization.

Proposed Solution:

I was able to work around this by modifying the migration script to add custom type rendering logic to alembic/env.py, but the proper fix would be to modify the DynamicPickleType and DynamicJSON classes in src/google/adk/sessions/database_session_service.py to ensure Alembic generates correct code.

Potential approaches:

  1. Add cache_ok = True to both type classes to signal they are cacheable and stateless
  2. Override __repr__ to control how the types are serialized
  3. Make impl a property that returns a properly instantiated type
  4. Provide a render helper that can be imported and used in migration scripts

Workaround:

As a temporary workaround, I've modified the migration script to add custom type rendering. This allows the script to work with MySQL/PostgreSQL until a proper fix is implemented in the ADK codebase.

Modified migration script (click to expand)

The key changes are in sections 4.5 and 7.5:

#!/bin/bash

# This script is to update sessions DB that is created in previous ADK version,
# to schema that current ADK version use. The sample usage is in the samples/migrate_session_db.
#
# Usage:
# ./db_migration.sh "sqlite:///%(here)s/sessions.db" "google.adk.sessions.database_session_service"
# ./db_migration.sh "mysql+pymysql://user:pass@localhost/mydb" "google.adk.sessions.database_session_service"
# First argument is the sessions DB url.
# Second argument is the model import path.

# --- Configuration ---
ALEMBIC_DIR="alembic"
INI_FILE="alembic.ini"
ENV_FILE="${ALEMBIC_DIR}/env.py"

# --- Functions ---
print_usage() {
    echo "Usage: $0 <sqlalchemy_url> <model_import_path>"
    echo "  <sqlalchemy_url>: The full SQLAlchemy connection string."
    echo "  <model_import_path>: The Python import path to your models (e.g., my_project.models)"
    echo ""
    echo "Example:"
    echo "  $0 \"sqlite:///%(here)s/sessions.db\" \"google.adk.sessions.database_session_service\""
}

# --- Argument Validation ---
if [ "$#" -ne 2 ]; then
    print_usage
    exit 1
fi

DB_URL=$1
MODEL_PATH=$2

echo "Setting up Alembic..."
echo "  Database URL: ${DB_URL}"
echo "  Model Path:   ${MODEL_PATH}"
echo ""

# --- Safety Check ---
if [ -f "$INI_FILE" ] || [ -d "$ALEMBIC_DIR" ]; then
    echo "Error: 'alembic.ini' or 'alembic/' directory already exists."
    echo "Please remove them before running this script."
    exit 1
fi

# --- 1. Run alembic init ---
echo "Running 'alembic init ${ALEMBIC_DIR}'..."
alembic init ${ALEMBIC_DIR}
if [ $? -ne 0 ]; then
    echo "Error: 'alembic init' failed. Is alembic installed?"
    exit 1
fi
echo "Initialization complete."
echo ""

# --- 2. Set sqlalchemy.url in alembic.ini ---
echo "Configuring ${INI_FILE}..."
# Use a different delimiter (#) for sed to avoid escaping slashes in the URL
sed -i.bak "s#sqlalchemy.url = driver://user:pass@localhost/dbname#sqlalchemy.url = ${DB_URL}#" "${INI_FILE}"
if [ $? -ne 0 ]; then
    echo "Error: Failed to set sqlalchemy.url in ${INI_FILE}."
    exit 1
fi
echo "  Set sqlalchemy.url"

# --- 3. Set target_metadata in alembic/env.py ---
echo "Configuring ${ENV_FILE}..."

# Edit 1: Uncomment and replace the model import line
sed -i.bak "s/# from myapp import mymodel/from ${MODEL_PATH} import Base/" "${ENV_FILE}"
if [ $? -ne 0 ]; then
    echo "Error: Failed to set model import in ${ENV_FILE}."
    exit 1
fi

# Edit 2: Set the target_metadata to use the imported Base
sed -i.bak "s/target_metadata = None/target_metadata = Base.metadata/" "${ENV_FILE}"
if [ $? -ne 0 ]; then
    echo "Error: Failed to set target_metadata in ${ENV_FILE}."
    exit 1
fi

echo "  Set target_metadata"
echo ""

# --- 4. Clean up backup files ---
echo "Cleaning up backup files..."
rm "${INI_FILE}.bak"
rm "${ENV_FILE}.bak"

# --- 4.5. Patch env.py for custom type rendering ---
echo "Patching ${ENV_FILE} for custom type rendering..."

# Create a Python file with the render_item function
cat > /tmp/render_item_patch.py << 'RENDER_EOF'


# Custom rendering for ADK types to prevent syntax errors in migrations
def render_item(type_, obj, autogen_context):
    """Custom rendering for special SQLAlchemy types.
    
    This prevents Alembic from generating invalid Python syntax when
    encountering ADK's custom types like DynamicPickleType.
    """
    import pickle
    from sqlalchemy import PickleType
    
    # Handle DynamicPickleType with pickle module parameter
    if hasattr(obj, '__class__'):
        class_name = obj.__class__.__name__
        if class_name == 'DynamicPickleType':
            module = obj.__class__.__module__
            return f"{module}.DynamicPickleType(pickler=pickle, impl=PickleType())"
        elif class_name == 'DynamicJSON':
            module = obj.__class__.__module__
            return f"{module}.DynamicJSON()"
    
    # Fall back to default rendering
    return False

RENDER_EOF

# Find the last import line in env.py and insert the render_item function after it
LAST_IMPORT=$(grep -n "^import \|^from " "${ENV_FILE}" | tail -1 | cut -d: -f1)
if [ -z "$LAST_IMPORT" ]; then
    echo "Error: Could not find import statements in ${ENV_FILE}"
    exit 1
fi

# Insert the render_item function after the last import
sed -i "${LAST_IMPORT}r /tmp/render_item_patch.py" "${ENV_FILE}"

# Patch the context.configure() calls to include render_item parameter
sed -i 's/context\.configure(/context.configure(\n        render_item=render_item,/' "${ENV_FILE}"

echo "  Added custom type rendering logic"
rm -f /tmp/render_item_patch.py
echo ""

# --- 5. Run alembic stamp head ---
echo "Running 'alembic stamp head'..."
alembic stamp head
if [ $? -ne 0 ]; then
    echo "Error: 'alembic stamp head' failed."
    exit 1
fi
echo "stamping complete."
echo ""

# --- 6. Run alembic upgrade ---
echo "Running 'alembic revision --autogenerate'..."
alembic revision --autogenerate -m "ADK session DB upgrade"
if [ $? -ne 0 ]; then
    echo "Error: 'alembic revision' failed."
    exit 1
fi
echo "revision complete."
echo ""

# --- 7. Add import statement to version files ---
echo "Adding import statement to version files..."
for f in ${ALEMBIC_DIR}/versions/*.py; do
  if [ -f "$f" ]; then
    # Check if the first line is already the import statement
    FIRST_LINE=$(head -n 1 "$f")
    IMPORT_STATEMENT="import ${MODEL_PATH}"
    if [ "$FIRST_LINE" != "$IMPORT_STATEMENT" ]; then
      echo "Adding import to $f"
      sed -i.bak "1s|^|${IMPORT_STATEMENT}\n|" "$f"
      rm "${f}.bak"
    else
      echo "Import already exists in $f"
    fi
  fi
done
echo "Import statements added."

# --- 7.5. Add pickle and PickleType imports ---
echo "Adding pickle and PickleType imports to migration files..."
for f in ${ALEMBIC_DIR}/versions/*.py; do
  if [ -f "$f" ]; then
    # Add pickle import after the existing import at line 1
    # Check if imports already exist
    if ! grep -q "^import pickle" "$f"; then
      sed -i "2i\\import pickle" "$f"
      echo "  Added 'import pickle' to $f"
    fi
    if ! grep -q "^from sqlalchemy import PickleType" "$f"; then
      sed -i "3i\\from sqlalchemy import PickleType" "$f"
      echo "  Added 'from sqlalchemy import PickleType' to $f"
    fi
  fi
done
echo ""

# --- 8. Run alembic upgrade ---
echo "running 'alembic upgrade'..."
alembic upgrade head
if [ $? -ne 0 ]; then
    echo "Error: 'alembic upgrade' failed. "
    exit 1
fi
echo "upgrade complete."
echo ""

echo "---"
echo "✅ ADK session DB is Updated!"

This workaround demonstrates what's needed to make the script work, but ideally the fix should be in the ADK type definitions themselves so that Alembic naturally generates correct code.

Testing:

It would be beneficial to include MySQL and PostgreSQL in the test matrix for DatabaseSessionService since they are listed as supported databases in the service registry.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions