Batch migration on SQLite truncates TIMESTAMP values to just the year #391

Closed
sqlalchemy-bot opened this Issue Oct 7, 2016 · 12 comments

sqlalchemy-bot commented Oct 7, 2016

Migrated issue, originally created by Jiri Bajer (@sarimak)

Value stored in TIMESTAMP column of SQLite DB gets truncated to its year after migration in batch mode:

$ rm test.db; python3 init.py; echo ".schema test" | sqlite3 test.db; echo "select * from test;" | sqlite3 test.db; alembic upgrade head; echo "select * from test;" | sqlite3 test.db

CREATE TABLE test (
	identifier INTEGER NOT NULL, 
	start TIMESTAMP, 
	PRIMARY KEY (identifier)
);
1|1970-01-02 03:04:00.000000
INFO  [alembic.runtime.migration] Context impl SQLiteImpl.
INFO  [alembic.runtime.migration] Will assume non-transactional DDL.
INFO  [alembic.runtime.migration] Running upgrade  -> 123456789123
1|1970

model.py:

from sqlalchemy import Table, Column, Integer
from sqlalchemy.types import TIMESTAMP
from sqlalchemy.ext.declarative import declarative_base


BaseModel = declarative_base()


class Test(BaseModel):
    __tablename__ = "test"

    identifier = Column(Integer(), primary_key=True)
    start = Column(TIMESTAMP())

init.py:

import datetime

from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker

from model import *


engine = create_engine("sqlite:///test.db")
BaseModel.metadata.create_all(engine)
session = sessionmaker(bind=engine)()

row = Test(start=datetime.datetime(1970, 1, 2, 3, 4))
session.add(row)
session.commit()

env.py:

import sys
from logging.config import fileConfig

from alembic import context
from sqlalchemy import create_engine

sys.path.append(".")
from model import *


fileConfig(context.config.config_file_name)
engine = create_engine("sqlite:///test.db")

with engine.connect() as connection:
    context.configure(connection=connection, target_metadata=BaseModel.metadata)

    with context.begin_transaction():
        context.run_migrations()

revisions/123456789123_v1.py

revision = "123456789123"
down_revision = None
branch_labels = None
depends_on = None

import sys

from alembic import op
import sqlalchemy as sa
from sqlalchemy import Column, Integer
from sqlalchemy.types import TIMESTAMP


def upgrade():
    with op.batch_alter_table("test") as batch_op:
        batch_op.alter_column("start", type_=TIMESTAMP)


def downgrade():
    pass

sqlalchemy-bot commented Oct 7, 2016

Jiri Bajer (@sarimak) wrote:

alembic (0.8.6), SQLAlchemy (1.0.12), Python 3.5.2, Ubuntu 16.04.1

sqlalchemy-bot commented Oct 7, 2016

Changes by Jiri Bajer (@sarimak):

  • edited description

sqlalchemy-bot commented Oct 7, 2016

Michael Bayer (@zzzeek) wrote:

Well, SQLite thinks this makes sense:

sqlite> select CAST('1970-01-02 03:04:00.000000' AS TIMESTAMP);
1970

Which of course is a bug in SQLite, just like the fact that it does not support ALTER is a bug, and neither will ever be fixed.

Why are you changing the datatype on SQLite? If you already have TIMESTAMP, there's no need to specify type_. Works fine if you just omit it. I can see documentation notes here and maybe at some point the ability to inject custom casting functions, but that's about it.
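The CAST truncation above can be reproduced without Alembic at all, using Python's standard sqlite3 module (a minimal sketch, independent of this project's code):

```python
import sqlite3

# SQLite gives the unrecognized type name TIMESTAMP NUMERIC affinity, so the
# CAST keeps only the leading numeric prefix of the string -- the year.
conn = sqlite3.connect(":memory:")
value = conn.execute(
    "SELECT CAST('1970-01-02 03:04:00.000000' AS TIMESTAMP)"
).fetchone()[0]
print(value)  # 1970
conn.close()
```

This is the same truncation the migration output shows: `1|1970-01-02 03:04:00.000000` becoming `1|1970`.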

sqlalchemy-bot commented Oct 7, 2016

Jiri Bajer (@sarimak) wrote:

Aha, I thought that type_ was just a hint for Alembic. I had to use it because I use a custom subclass of TIMESTAMP (via TypeDecorator) with redefined process_result_value and process_bind_param, so that my code works with timezone-aware timestamps while they are persisted as timezone-naive values converted to the local timezone. My primary database is MariaDB, and it choked if I didn't specify the type:

ERROR [alembic.util.messaging] All MySQL CHANGE/MODIFY COLUMN operations require the existing type.
  FAILED: All MySQL CHANGE/MODIFY COLUMN operations require the existing type.

import pytz
from sqlalchemy import types

UTC_TIMEZONE = pytz.utc  # assumed definition; not shown in the original snippet
SERVER_TIMEZONE = pytz.timezone("...")

class DateTimeUTC(types.TypeDecorator):
    impl = types.TIMESTAMP

    def process_result_value(self, value, _):
        if value is not None:
            value_local_timezone = SERVER_TIMEZONE.localize(value)
            value_utc_timezone = value_local_timezone.astimezone(UTC_TIMEZONE)
            return value_utc_timezone
        else:
            return None

    def process_bind_param(self, value, _):
        if value is not None:
            if not value.tzinfo:
                value_utc_timezone = UTC_TIMEZONE.localize(value)
            else:
                value_utc_timezone = value
            value_local_timezone = value_utc_timezone.astimezone(SERVER_TIMEZONE)
            value_local_naive = value_local_timezone.replace(tzinfo=None)
            return value_local_naive
        else:
            return None

BTW: I know that persisting in any other timezone than UTC is a bad practice but I was forced to do so by an existing PHP GUI which relies on it...

sqlalchemy-bot commented Oct 7, 2016

Jiri Bajer (@sarimak) wrote:

The problem can be worked around in the migration script by detecting the dialect and conditionally adding the type_ cast only for MySQL. Far from elegant, but it works:

versions/123456789123_v1.py:

if ALEMBIC_DATABASE_URI.startswith("sqlite:"):
    kwargs = dict()  # Avoids corrupting the data by SQLite CAST AS TIMESTAMP bug which truncates the timestamp just to a year
else:
    kwargs = dict(type_=DateTimeUTC)  # Avoids failing on "All MySQL CHANGE/MODIFY COLUMN operations require the existing type."
...
    with op.batch_alter_table("my_table") as batch_op:
        batch_op.alter_column("my_column", nullable=True, **kwargs)
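A variant of this workaround avoids sniffing the database URI: inside a migration, Alembic exposes the active connection via op.get_bind(), whose dialect.name is "sqlite", "mysql", etc. A minimal sketch of the kwargs selection as a plain function (batch_alter_kwargs is a hypothetical helper name, not part of Alembic):

```python
def batch_alter_kwargs(dialect_name, type_):
    """Build alter_column() kwargs: pass type_ everywhere except SQLite,
    where the redundant CAST would truncate TIMESTAMP data to the year."""
    if dialect_name == "sqlite":
        return {}
    return {"type_": type_}

# Inside a migration this would be used roughly like:
#     with op.batch_alter_table("my_table") as batch_op:
#         batch_op.alter_column(
#             "my_column", nullable=True,
#             **batch_alter_kwargs(op.get_bind().dialect.name, DateTimeUTC))
print(batch_alter_kwargs("sqlite", None))  # {}
print(batch_alter_kwargs("mysql", int))    # {'type_': <class 'int'>}
```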

sqlalchemy-bot commented Oct 7, 2016

Michael Bayer (@zzzeek) wrote:

im assuming you're doing batch to avoid production table locking in MySQL, right?

if sqlite is not your production platform I'd just forego doing migrations with it.

sqlalchemy-bot commented Oct 7, 2016

Michael Bayer (@zzzeek) wrote:

it's possible that batch mode shouldn't do CAST() for sqlite at all. SQLite doesn't really care about the data in the column.

sqlalchemy-bot commented Oct 7, 2016

Jiri Bajer (@sarimak) wrote:

My original motivation for the batch mode (and for the SQLAlchemy/Alembic combo as well) was: I want to develop and test primarily on local SQLite before each commit and re-test on remote MySQL before finishing the two week increment. I strongly prefer having a single codebase which handles both backends reasonably well. As my migration scripts need to do ALTER TABLE and SQLite lacks a native support for non-trivial changes, I followed http://alembic.zzzcomputing.com/en/latest/batch.html and switched my migration code to batch mode so SQLite is handled and MySQL doesn't care. I don't care that much about MySQL locking because I can afford to migrate offline (with the web application shut down). But I have 200k rows now in the MySQL (SQLite DB is tiny) and will have much more (~10M) so migrating the DDL under a minute is fine for my use case.

sqlalchemy-bot commented Oct 7, 2016

Michael Bayer (@zzzeek) wrote:

yeah I don't like that use case which is why I resisted having any kind of special SQLite feature for a long time. SQLite really causes lots more trouble than it's worth. mysql installs easily.

sqlalchemy-bot commented Nov 9, 2016

Jiri Bajer (@sarimak) wrote:

Thank you for the clarification. I am currently using my workaround and it works fine.

Note: I have realized that I need it whenever I call alter_column() within a batch_op. This includes whenever I add a column with nullable=False to an existing table (existing rows have no value for it, so I have to add it with nullable=True, update the data, and then alter the column back to nullable=False). Which happens quite often in my case.
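The add-then-tighten workflow described in that note can be sketched as a migration fragment (illustrative only: my_table/my_column are made-up names, and on MySQL the existing type_ would still have to be passed, as discussed above):

```python
from alembic import op
import sqlalchemy as sa


def upgrade():
    # Step 1: add the column as nullable so existing rows remain valid.
    op.add_column("my_table", sa.Column("my_column", sa.Integer(), nullable=True))

    # Step 2: backfill a value for the existing rows.
    op.execute("UPDATE my_table SET my_column = 0 WHERE my_column IS NULL")

    # Step 3: tighten to NOT NULL; batch mode so SQLite can rebuild the table.
    with op.batch_alter_table("my_table") as batch_op:
        batch_op.alter_column("my_column", nullable=False)
```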

sqlalchemy-bot commented Nov 23, 2016

Michael Bayer (@zzzeek) wrote:

cast() types in batch only if type_affinity is different

Batch mode will not use CAST() to copy data if type_ is given, however
the basic type affinity matches that of the existing type. This to
avoid SQLite's CAST of TIMESTAMP which results in truncation of the
data, in those cases where the user needs to add redundant type_ for
other reasons.

Change-Id: I20f7b399cd3cd7740d67ff7d624aa1da874ebc71
Fixes: #391

3926ac6

sqlalchemy-bot commented Nov 23, 2016

Changes by Michael Bayer (@zzzeek):

  • changed status to closed