feat(db): set up Alembic migrations with full initial schema #64

Merged
RegardV merged 1 commit into main from production-inkypyrus on Feb 22, 2026

RegardV commented on Feb 22, 2026

User description

Summary

  • Completes the empty Alembic setup that existed as a stub
  • alembic/env.py: async engine, imports all 20 models, reads DATABASE_URL from env
  • alembic/script.py.mako: migration template
  • e5c18ff2255d_initial_schema.py: initial migration capturing all 20 tables
  • Existing DB stamped at head — no destructive changes

Usage going forward

# After changing a model:
alembic revision --autogenerate -m "add_column_x_to_table_y"

# Apply on deployment:
alembic upgrade head

# Roll back last migration:
alembic downgrade -1

🤖 Generated with Claude Code


PR Type

Enhancement


Description

  • Complete Alembic setup with async engine and environment configuration

  • Initial migration capturing 20 database tables with full schema

  • Database URL read from environment variable at runtime

  • Migration template and logging configuration for future schema changes


Diagram Walkthrough

flowchart LR
  A["Alembic Configuration"] --> B["env.py: Async Engine Setup"]
  B --> C["Import All 20 Models"]
  C --> D["Initial Migration File"]
  D --> E["20 Database Tables Created"]
  F["alembic.ini: Logging Config"] --> B
  G["script.py.mako: Template"] --> D

File Walkthrough

Relevant files
Enhancement
env.py
Async engine setup with model imports                                       

journal-platform-backend/alembic/env.py

  • Implements async engine configuration using async_engine_from_config
  • Imports all 20 models to enable Alembic autogenerate detection
  • Reads DATABASE_URL environment variable at runtime for database
    connection
  • Supports both offline and online migration modes with async support
+80/-0   
e5c18ff2255d_initial_schema.py
Initial schema migration for 20 tables                                     

journal-platform-backend/alembic/versions/e5c18ff2255d_initial_schema.py

  • Creates 20 tables: users, themes, projects, journal_entries,
    journal_templates, journal_media, export_jobs, export_files,
    export_history, export_queue, export_templates, kdp_submissions,
    agent_runs, email_verifications, password_resets, refresh_tokens,
    oauth_accounts, login_attempts, security_events,
    inventory_team_activity, inventory_generation_context,
    inventory_quick_actions
  • Defines foreign key relationships between tables
  • Creates indexes on key columns for performance optimization
  • Includes both upgrade and downgrade functions for reversibility
+644/-0 
script.py.mako
Migration file template structure                                               

journal-platform-backend/alembic/script.py.mako

  • Provides template for generating new migration files
  • Includes revision identifiers and dependency tracking
  • Defines upgrade and downgrade function stubs
  • Supports custom imports for migration-specific code
+26/-0   
Configuration changes
alembic.ini
Logging and database configuration                                             

journal-platform-backend/alembic.ini

  • Adds database URL configuration with PostgreSQL async driver
  • Configures logging with root, sqlalchemy, and alembic loggers
  • Sets up console handler with formatted output
  • Enables path separator and sys path configuration
+41/-0   

- Complete alembic.ini with logging config
- alembic/env.py: async engine, imports all 20 models for autogenerate,
  reads DATABASE_URL from environment at runtime
- alembic/script.py.mako: migration file template
- e5c18ff2255d_initial_schema.py: captures full current schema (20 tables)
  users, themes, projects, journal_entries, journal_templates, journal_media,
  export_jobs, export_files, export_history, export_queue, export_templates,
  kdp_submissions, agent_runs, email_verifications, password_resets,
  refresh_tokens, oauth_accounts, login_attempts, security_events,
  inventory_team_activity, inventory_generation_context, inventory_quick_actions

Existing DB stamped at head — no destructive changes to running instance.
Future schema changes: alembic revision --autogenerate -m "description"
Apply migrations: alembic upgrade head

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
RegardV merged commit a2932a1 into main on Feb 22, 2026
3 of 7 checks passed
@qodo-code-review

CI Feedback 🧐

A test triggered by this PR failed. Here is an AI-generated analysis of the failure:

Action: review

Failed stage: Set up job [❌]

Failed test name: ""

Failure summary:

The workflow failed during the "Prepare all required actions" step because GitHub Actions could not
download the referenced action qodo-ai/pr-agent@v0.29.0.
- Error: Unable to resolve action qodo-ai/pr-agent@v0.29.0, unable to find version v0.29.0
- This indicates the tag/ref v0.29.0 does not exist (or is not accessible) in the qodo-ai/pr-agent
  repository, so the runner cannot resolve and fetch that action version.

Relevant error logs:
1:  ##[group]Runner Image Provisioner
2:  Hosted Compute Agent
...

13:  ##[group]Runner Image
14:  Image: ubuntu-24.04
15:  Version: 20260201.15.1
16:  Included Software: https://github.com/actions/runner-images/blob/ubuntu24/20260201.15/images/ubuntu/Ubuntu2404-Readme.md
17:  Image Release: https://github.com/actions/runner-images/releases/tag/ubuntu24%2F20260201.15
18:  ##[endgroup]
19:  ##[group]GITHUB_TOKEN Permissions
20:  Contents: read
21:  Metadata: read
22:  Packages: read
23:  ##[endgroup]
24:  Secret source: Actions
25:  Prepare workflow directory
26:  Prepare all required actions
27:  Getting action download info
28:  ##[error]Unable to resolve action `qodo-ai/pr-agent@v0.29.0`, unable to find version `v0.29.0`

@qodo-code-review

PR Compliance Guide 🔍

Below is a summary of compliance checks for this PR:

Security Compliance
🔴
Hardcoded DB credentials

Description: The Alembic config includes a hardcoded database connection string with embedded
credentials (postgresql+asyncpg://user:password@localhost/journal_platform), which risks
credential leakage if committed or reused outside local development.
alembic.ini [6-8]

Referred Code
# Database URL — overridden at runtime by env.py reading DATABASE_URL env var
sqlalchemy.url = postgresql+asyncpg://user:password@localhost/journal_platform
Sensitive token storage

Description: The initial schema introduces multiple columns intended to store highly sensitive
secrets/tokens in plaintext (e.g., users.openai_api_key, users.verification_token,
users.reset_password_token, refresh_tokens.token, and
oauth_accounts.access_token/oauth_accounts.refresh_token), which increases impact if the
DB is accessed or dumped and typically warrants encryption/at-rest protection and strict
access controls.
e5c18ff2255d_initial_schema.py [47-215]

Referred Code
op.create_table('users',
sa.Column('email', sa.String(length=255), nullable=False),
sa.Column('username', sa.String(length=100), nullable=True),
sa.Column('full_name', sa.String(length=255), nullable=True),
sa.Column('password_hash', sa.String(length=255), nullable=False),
sa.Column('is_verified', sa.Boolean(), nullable=True),
sa.Column('verification_token', sa.String(length=255), nullable=True),
sa.Column('reset_password_token', sa.String(length=255), nullable=True),
sa.Column('reset_password_expires', sa.DateTime(timezone=True), nullable=True),
sa.Column('google_id', sa.String(length=255), nullable=True),
sa.Column('github_id', sa.String(length=255), nullable=True),
sa.Column('oauth_provider', sa.String(length=50), nullable=True),
sa.Column('preferences', sa.JSON(), nullable=True),
sa.Column('timezone', sa.String(length=50), nullable=True),
sa.Column('language', sa.String(length=10), nullable=True),
sa.Column('profile_type', sa.String(length=50), nullable=True),
sa.Column('subscription', sa.String(length=20), nullable=True),
sa.Column('library_access', sa.Boolean(), nullable=True),
sa.Column('openai_api_key', sa.String(length=255), nullable=True),
sa.Column('ai_provider', sa.String(length=50), nullable=True),
sa.Column('is_premium', sa.Boolean(), nullable=True),


 ... (clipped 148 lines)
Ticket Compliance
🎫 No ticket provided
  • Create ticket/issue
Codebase Duplication Compliance
Codebase context is not defined

Follow the guide to enable codebase context checks.

Custom Compliance
🟢
Generic: Meaningful Naming and Self-Documenting Code

Objective: Ensure all identifiers clearly express their purpose and intent, making code
self-documenting

Status: Passed

Learn more about managing compliance generic rules or creating your own custom rules

Generic: Secure Error Handling

Objective: To prevent the leakage of sensitive system information through error messages while
providing sufficient detail for internal debugging.

Status: Passed

🔴
Generic: Robust Error Handling and Edge Case Management

Objective: Ensure comprehensive error handling that provides meaningful context and graceful
degradation

Status:
Missing failure handling: The async migration path creates and uses an engine without guarding failures (e.g.,
connect/config errors) and without a try/finally to ensure connectable.dispose() executes
on exceptions.

Referred Code
async def run_async_migrations() -> None:
    """Run migrations in 'online' mode using async engine."""
    connectable = async_engine_from_config(
        config.get_section(config.config_ini_section, {}),
        prefix="sqlalchemy.",
        poolclass=pool.NullPool,
    )
    async with connectable.connect() as connection:
        await connection.run_sync(do_run_migrations)
    await connectable.dispose()
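
One way to address the missing failure handling is to wrap the connection block in try/finally so the engine is disposed even when a migration raises. The sketch below shows only the control flow. `FakeEngine` and `FakeConnection` are hypothetical stand-ins for the SQLAlchemy async engine so the example runs without a database, and `do_run_migrations` is reduced to a stub.

```python
import asyncio


class FakeConnection:
    """Hypothetical stand-in for an async connection."""

    def __init__(self, fail: bool) -> None:
        self.fail = fail

    async def __aenter__(self) -> "FakeConnection":
        return self

    async def __aexit__(self, exc_type, exc, tb) -> bool:
        return False  # do not swallow exceptions

    async def run_sync(self, fn) -> None:
        if self.fail:
            raise RuntimeError("simulated migration failure")
        fn(self)


class FakeEngine:
    """Hypothetical stand-in for the engine built by async_engine_from_config."""

    def __init__(self, fail: bool = False) -> None:
        self.fail = fail
        self.disposed = False

    def connect(self) -> FakeConnection:
        return FakeConnection(self.fail)

    async def dispose(self) -> None:
        self.disposed = True


def do_run_migrations(connection) -> None:
    """Stub; the real function configures the Alembic context and runs migrations."""


async def run_async_migrations(connectable: FakeEngine) -> None:
    """Run migrations and always dispose the engine, even on failure."""
    try:
        async with connectable.connect() as connection:
            await connection.run_sync(do_run_migrations)
    finally:
        await connectable.dispose()
```

In the real env.py the same try/finally would presumably wrap the engine returned by `async_engine_from_config`; the exception still propagates, so Alembic exits with an error as before.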

Generic: Secure Logging Practices

Objective: To ensure logs are useful for debugging and auditing without exposing sensitive
information like PII, PHI, or cardholder data.

Status:
Unstructured log format: The logging formatter is plain text rather than structured (e.g., JSON), which does not
meet the checklist requirement for structured logs for easy auditing.

Referred Code
[formatter_generic]
format = %(levelname)-5.5s [%(name)s] %(message)s
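
If structured logs are required, one option is a small JSON formatter built on the standard library. This is a minimal sketch: `JsonFormatter` and `format_example` are hypothetical names, the chosen fields are an assumption, and wiring the class into alembic.ini is left out.

```python
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each log record as a single-line JSON object."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        if record.exc_info:
            payload["exc_info"] = self.formatException(record.exc_info)
        return json.dumps(payload)


def format_example() -> str:
    """Format one hand-built record to show the output shape."""
    record = logging.LogRecord(
        name="alembic.runtime.migration",
        level=logging.INFO,
        pathname="env.py",
        lineno=0,
        msg="Running upgrade -> %s",
        args=("e5c18ff2255d",),
        exc_info=None,
    )
    return JsonFormatter().format(record)
```

A handler would pick this up with `handler.setFormatter(JsonFormatter())`.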

Generic: Comprehensive Audit Trails

Objective: To create a detailed and reliable record of critical system actions for security analysis
and compliance.

Status:
Audit logging unclear: The PR adds schema (e.g., login_attempts, security_events) but the diff does not
demonstrate that critical actions are actually logged with user ID, timestamp, action
description, and outcome at runtime.

Referred Code
op.create_table('login_attempts',
sa.Column('email', sa.String(length=255), nullable=False),
sa.Column('ip_address', sa.String(length=45), nullable=False),
sa.Column('user_agent', sa.Text(), nullable=True),
sa.Column('success', sa.Boolean(), nullable=False),
sa.Column('failure_reason', sa.String(length=255), nullable=True),
sa.Column('attempted_at', sa.DateTime(timezone=True), server_default=sa.text('now()'), nullable=True),
sa.Column('user_id', sa.Integer(), nullable=True),
sa.Column('id', sa.Integer(), autoincrement=True, nullable=False),
sa.Column('created_at', sa.DateTime(timezone=True), server_default=sa.text('now()'), nullable=False),
sa.Column('updated_at', sa.DateTime(timezone=True), server_default=sa.text('now()'), nullable=False),
sa.ForeignKeyConstraint(['user_id'], ['users.id'], ),
sa.PrimaryKeyConstraint('id')
)
op.create_index(op.f('ix_login_attempts_email'), 'login_attempts', ['email'], unique=False)
op.create_index(op.f('ix_login_attempts_id'), 'login_attempts', ['id'], unique=False)
op.create_table('oauth_accounts',
sa.Column('user_id', sa.Integer(), nullable=False),
sa.Column('provider', sa.String(length=50), nullable=False),
sa.Column('provider_id', sa.String(length=255), nullable=False),
sa.Column('provider_email', sa.String(length=255), nullable=True),


 ... (clipped 60 lines)

Generic: Security-First Input Validation and Data Handling

Objective: Ensure all data inputs are validated, sanitized, and handled securely to prevent
vulnerabilities

Status:
URL input not validated: The PR reads DATABASE_URL from the environment and applies it directly to Alembic
configuration without demonstrating validation/sanitization or constraints on accepted
schemes/hosts.

Referred Code
# Override sqlalchemy.url with DATABASE_URL env var if set
database_url = os.getenv("DATABASE_URL")
if database_url:
    config.set_main_option("sqlalchemy.url", database_url)
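
A basic check on DATABASE_URL could look like the sketch below. It is an illustration, not the PR's code: `validated_database_url` and `ALLOWED_SCHEMES` are hypothetical names, and accepting only the `postgresql+asyncpg` scheme (the one in alembic.ini) and requiring a host are assumptions that would reject, for example, Unix-socket URLs.

```python
import os
from urllib.parse import urlsplit

# Assumption: only the async PostgreSQL driver used in alembic.ini is accepted.
ALLOWED_SCHEMES = {"postgresql+asyncpg"}


def validated_database_url(env=os.environ):
    """Return DATABASE_URL if set and well-formed, None if unset; raise otherwise."""
    url = env.get("DATABASE_URL")
    if not url:
        return None
    parts = urlsplit(url)
    if parts.scheme not in ALLOWED_SCHEMES:
        raise ValueError(f"unsupported DATABASE_URL scheme: {parts.scheme!r}")
    if not parts.hostname:
        raise ValueError("DATABASE_URL must include a host")
    return url
```

env.py could then call `config.set_main_option("sqlalchemy.url", url)` only when the function returns a value. The error messages deliberately omit the full URL so credentials do not end up in logs.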

Compliance status legend
🟢 - Fully Compliant
🟡 - Partially Compliant
🔴 - Not Compliant
⚪ - Requires Further Human Verification
🏷️ - Compliance label

@qodo-code-review

PR Code Suggestions ✨

Explore these optional code suggestions:

Category    Suggestion    Impact
High-level
Consider normalizing the database schema

The database schema is denormalized, particularly the users table which has over
40 columns for different concerns. Refactor the schema into smaller, normalized
tables to improve data integrity and maintainability.

Examples:

journal-platform-backend/alembic/versions/e5c18ff2255d_initial_schema.py [47-94]
    op.create_table('users',
    sa.Column('email', sa.String(length=255), nullable=False),
    sa.Column('username', sa.String(length=100), nullable=True),
    sa.Column('full_name', sa.String(length=255), nullable=True),
    sa.Column('password_hash', sa.String(length=255), nullable=False),
    sa.Column('is_verified', sa.Boolean(), nullable=True),
    sa.Column('verification_token', sa.String(length=255), nullable=True),
    sa.Column('reset_password_token', sa.String(length=255), nullable=True),
    sa.Column('reset_password_expires', sa.DateTime(timezone=True), nullable=True),
    sa.Column('google_id', sa.String(length=255), nullable=True),

 ... (clipped 38 lines)

Solution Walkthrough:

Before:

# alembic/versions/e5c18ff2255d_initial_schema.py

def upgrade():
    op.create_table('users',
        sa.Column('id', ...),
        sa.Column('email', ...),
        sa.Column('password_hash', ...),
        # Profile columns
        sa.Column('full_name', ...),
        sa.Column('bio', ...),
        sa.Column('avatar_url', ...),
        # Subscription columns
        sa.Column('is_premium', ...),
        sa.Column('subscription_id', ...),
        sa.Column('stripe_customer_id', ...),
        # Settings columns
        sa.Column('preferences', sa.JSON(), ...),
        sa.Column('email_notifications', ...),
        # ... and 30+ other columns
    )

After:

# alembic/versions/e5c18ff2255d_initial_schema.py

def upgrade():
    op.create_table('users',
        sa.Column('id', ...),
        sa.Column('email', ...),
        sa.Column('password_hash', ...),
        # ... core user columns
    )
    op.create_table('user_profiles',
        sa.Column('user_id', sa.ForeignKey('users.id'), ...),
        sa.Column('full_name', ...),
        sa.Column('bio', ...),
        sa.Column('avatar_url', ...),
    )
    op.create_table('user_subscriptions',
        sa.Column('user_id', sa.ForeignKey('users.id'), ...),
        sa.Column('is_premium', ...),
        sa.Column('stripe_customer_id', ...),
    )
Suggestion importance[1-10]: 9

Why: The suggestion correctly identifies a critical architectural issue in the initial database schema, where tables like users are highly denormalized, which will negatively impact future maintainability, performance, and data integrity.

High
Security
Encrypt sensitive API keys at rest

Encrypt the openai_api_key column in the users table before storing it in the
database to prevent exposure of sensitive credentials.

journal-platform-backend/alembic/versions/e5c18ff2255d_initial_schema.py [65]

-sa.Column('openai_api_key', sa.String(length=255), nullable=True),
+sa.Column('openai_api_key_encrypted', sa.LargeBinary(), nullable=True),
Suggestion importance[1-10]: 9

Why: This suggestion correctly identifies a critical security vulnerability by pointing out that the openai_api_key is stored in plaintext, and it proposes a standard and effective solution.

High
Encrypt OAuth tokens before database storage

Encrypt the access_token and refresh_token columns in the oauth_accounts table
to protect user credentials from database compromise.

journal-platform-backend/alembic/versions/e5c18ff2255d_initial_schema.py [166-172]

 op.create_table('oauth_accounts',
 sa.Column('user_id', sa.Integer(), nullable=False),
 sa.Column('provider', sa.String(length=50), nullable=False),
 sa.Column('provider_id', sa.String(length=255), nullable=False),
 sa.Column('provider_email', sa.String(length=255), nullable=True),
-sa.Column('access_token', sa.Text(), nullable=True),
-sa.Column('refresh_token', sa.Text(), nullable=True),
+sa.Column('access_token_encrypted', sa.LargeBinary(), nullable=True),
+sa.Column('refresh_token_encrypted', sa.LargeBinary(), nullable=True),
 ...

[To ensure code accuracy, apply this suggestion manually]

Suggestion importance[1-10]: 9

Why: The suggestion correctly points out a significant security risk of storing OAuth tokens in plaintext and recommends encryption, which is the correct mitigation strategy.

High
Store refresh token hashes, not plaintext

Hash the token in the refresh_tokens table instead of storing it in plaintext to
improve security against session hijacking.

journal-platform-backend/alembic/versions/e5c18ff2255d_initial_schema.py [198-214]

 op.create_table('refresh_tokens',
 sa.Column('user_id', sa.Integer(), nullable=False),
-sa.Column('token', sa.String(length=500), nullable=False),
+sa.Column('token_hash', sa.String(length=255), nullable=False),
 sa.Column('is_revoked', sa.Boolean(), nullable=True),
 ...
 sa.PrimaryKeyConstraint('id')
 )
 op.create_index(op.f('ix_refresh_tokens_id'), 'refresh_tokens', ['id'], unique=False)
-op.create_index(op.f('ix_refresh_tokens_token'), 'refresh_tokens', ['token'], unique=True)
+op.create_index(op.f('ix_refresh_tokens_token_hash'), 'refresh_tokens', ['token_hash'], unique=True)

[To ensure code accuracy, apply this suggestion manually]

Suggestion importance[1-10]: 9

Why: The suggestion proposes a best-practice security enhancement by hashing refresh tokens instead of storing them in plaintext, significantly reducing the risk of session hijacking.

High
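
The application side of the token_hash suggestion might look like the sketch below. The helper names are hypothetical and not taken from the PR. Plain SHA-256 is used on the assumption that refresh tokens are long random strings; it would not be appropriate for passwords.

```python
import hashlib
import hmac
import secrets


def hash_token(token: str) -> str:
    """SHA-256 hex digest; 64 characters, so it fits String(length=255)."""
    return hashlib.sha256(token.encode("utf-8")).hexdigest()


def new_refresh_token() -> tuple[str, str]:
    """Return (token to hand to the client, digest to store in token_hash)."""
    token = secrets.token_urlsafe(32)
    return token, hash_token(token)


def token_matches(presented: str, stored_hash: str) -> bool:
    """Compare digests in constant time."""
    return hmac.compare_digest(hash_token(presented), stored_hash)
```

Because the digest is deterministic, the unique index on token_hash can still be used to look up a presented token by hashing it first.
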
Replace hardcoded DB URL

Remove the hardcoded database URL from alembic.ini and rely on the DATABASE_URL
environment variable for better security.

journal-platform-backend/alembic.ini [7]

-sqlalchemy.url = postgresql+asyncpg://user:password@localhost/journal_platform
+# sqlalchemy.url = postgresql+asyncpg://user:password@localhost/journal_platform
+sqlalchemy.url = ${DATABASE_URL}
Suggestion importance[1-10]: 7

Why: This is a valid security best practice. Although the PR already includes logic to override this value with an environment variable, removing the hardcoded default credentials prevents accidental exposure or usage.

Medium
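
The `${DATABASE_URL}` form in the suggested change may not be expanded when the file is read. The sketch below assumes alembic.ini is parsed by Python's `configparser` with its default interpolation, which is worth confirming against the installed Alembic version; `read_url` is a hypothetical helper used only for this check.

```python
import configparser

INI_TEXT = """
[alembic]
sqlalchemy.url = ${DATABASE_URL}
"""


def read_url(ini_text: str = INI_TEXT) -> str:
    """Parse the ini text the way a default ConfigParser would."""
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    return parser.get("alembic", "sqlalchemy.url")
```

Under that assumption `read_url()` returns the literal string `${DATABASE_URL}`, since default interpolation only handles the `%(name)s` form. Keeping the override in env.py, as this PR already does, and leaving a credential-free placeholder in the ini file would avoid the problem.
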
Possible issue
Remove invalid ini options

Remove or correct the invalid prepend_sys_path and version_path_separator
options in alembic.ini to prevent breaking the Alembic configuration.

journal-platform-backend/alembic.ini [3-4]

-prepend_sys_path = .
-version_path_separator = os
+# Removed invalid options
+# prepend_sys_path = .
+# version_path_separator = os
Suggestion importance[1-10]: 8

Why: This suggestion correctly identifies that version_path_separator = os is an invalid configuration that will cause Alembic to fail, preventing migrations from running.

Medium
Remove undefined handler args

Remove the args = (sys.stderr,) line from alembic.ini as sys is not available in
this context and will break the configuration.

journal-platform-backend/alembic.ini [37]

-args = (sys.stderr,)
+# args = (sys.stderr,)
Suggestion importance[1-10]: 8

Why: This suggestion correctly identifies that args = (sys.stderr,) is an invalid configuration that will cause Alembic to fail, preventing migrations from running.

Medium
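
The claim that `sys` is unavailable can be checked directly. The standard library's `logging.config.fileConfig` evaluates a handler's `args` in the `logging` package namespace, where `sys` is imported. The sketch below assumes env.py loads the logging sections through `fileConfig`, as the stock Alembic template does; the section and key names are modelled on the settings quoted in this thread, not copied from the PR's file.

```python
import configparser
import logging
import logging.config
import sys

# Trimmed logging config; names are assumptions modelled on the quoted settings.
INI_TEXT = """
[loggers]
keys = root

[handlers]
keys = console

[formatters]
keys = generic

[logger_root]
level = WARNING
handlers = console

[handler_console]
class = StreamHandler
args = (sys.stderr,)
level = NOTSET
formatter = generic

[formatter_generic]
format = %(levelname)-5.5s [%(name)s] %(message)s
"""


def install_logging(ini_text: str = INI_TEXT) -> list[logging.Handler]:
    """Apply the ini logging config and return the root logger's handlers."""
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    logging.config.fileConfig(parser, disable_existing_loggers=False)
    return logging.getLogger().handlers
```

If this loads without error, `args = (sys.stderr,)` is accepted by `fileConfig`, and commenting it out would be unnecessary.
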
Use timezone-aware datetime columns consistently

Update the created_at, started_at, completed_at, and updated_at columns in the
agent_runs table to be timezone-aware by using sa.DateTime(timezone=True).

journal-platform-backend/alembic/versions/e5c18ff2255d_initial_schema.py [302-329]

 op.create_table('agent_runs',
 sa.Column('id', sa.Integer(), nullable=False),
 ...
 sa.Column('generate_kdp', sa.Boolean(), nullable=True),
-sa.Column('created_at', sa.DateTime(), nullable=False),
-sa.Column('started_at', sa.DateTime(), nullable=True),
-sa.Column('completed_at', sa.DateTime(), nullable=True),
-sa.Column('updated_at', sa.DateTime(), nullable=True),
+sa.Column('created_at', sa.DateTime(timezone=True), server_default=sa.text('now()'), nullable=False),
+sa.Column('started_at', sa.DateTime(timezone=True), nullable=True),
+sa.Column('completed_at', sa.DateTime(timezone=True), nullable=True),
+sa.Column('updated_at', sa.DateTime(timezone=True), server_default=sa.text('now()'), nullable=False),
 sa.ForeignKeyConstraint(['project_id'], ['projects.id'], ),
 ...

[To ensure code accuracy, apply this suggestion manually]

Suggestion importance[1-10]: 7

Why: The suggestion correctly identifies an inconsistency in using timezone-naive DateTime columns in the agent_runs table, which can lead to bugs. Enforcing consistency is good practice.

Medium
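
The kind of bug this suggestion guards against can be reproduced in plain Python. In this sketch the naive value stands in for what a `sa.DateTime()` column would typically yield and the aware value for `sa.DateTime(timezone=True)`; that mapping is an assumption about driver behaviour, and the function names are hypothetical.

```python
from datetime import datetime, timezone


def is_aware(value: datetime) -> bool:
    """True when the datetime carries a usable UTC offset."""
    return value.tzinfo is not None and value.utcoffset() is not None


def compare_mixed() -> str:
    """Try to order a naive and an aware datetime and report what happens."""
    naive = datetime(2026, 2, 22, 12, 0)
    aware = datetime(2026, 2, 22, 12, 0, tzinfo=timezone.utc)
    try:
        return str(naive < aware)
    except TypeError as exc:
        return f"TypeError: {exc}"
```

Ordering a naive and an aware datetime raises `TypeError`, so code that compares `agent_runs` timestamps with timestamps from the other tables could fail at runtime.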