First check
I used the GitHub search to find a similar issue and didn't find it.
I searched the Prefect documentation for this issue.
I checked that this issue is related to Prefect and not one of its dependencies.
Bug summary
After upgrading to 2.7.11, Orion started crash-looping; the logs showed `sqlalchemy.exc.OperationalError: (sqlite3.OperationalError) duplicate column name: has_data`.
Rolled back the migrations, downgraded to 2.7.10, and tried to investigate.
The Alembic upgrade never terminates (see Error).
The culpable revision is f92143d30c25, introduced in PR #8164.
It prepares the `flow_run_state` and `task_run_state` tables for the subsequent migration of artifact data (in f92143d30c26).
The relevant part is that it calculates and sets the (boolean) value of the `has_data` column based on the value of the `data` column. `data` can be `NULL`, the string `'null'`, or a string containing a JSON value.
The logic in `populate_flow_has_data_in_batches` is, I believe, intended to set `has_data` to 1 if `data` contains a JSON value and to 0 otherwise.
The current logic does not achieve that goal.
Problems
1. Comparing anything with `NULL` via an (in)equality operator results in `NULL` (see 2.).
2. Since the two parts of the condition are exclusive, the disjunction leads to the condition evaluating to 1 for `data = 'null'` (the same would happen for `data = NULL` if not for problem 1).
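Problem 1 is SQLite's three-valued logic in action; a minimal check (using the stdlib `sqlite3` module, as in the reproduction below) shows both the comparison and the disjunction collapsing to `NULL`:

```python
import sqlite3

cur = sqlite3.connect(":memory:").cursor()
# Any (in)equality comparison with NULL yields NULL, and 0 OR NULL is NULL,
# not 0 -- so the buggy condition is NULL (returned as None) for data = NULL.
row = cur.execute("SELECT NULL != 'null', (NULL IS NOT NULL) OR (NULL != 'null')").fetchone()
print(row)  # (None, None)
```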
| `data` | Goal | `data IS NOT NULL` | `data != 'null'` | `data IS NOT NULL or data != 'null'` |
|---|---|---|---|---|
| `NULL` | 0 | 0 | `NULL` | `NULL` |
| `'null'` | 0 | 1 | 0 | 1 |
| JSON | 1 | 1 | 1 | 1 |
Problem 2 results in false positives, migrating `data` from every row to the `artifact` table. This affects runs created on every version, though I'm not sure what the impact is besides bloat.
Problem 1 results in the upgrade never terminating if more than 500 rows with `data = NULL` are present in a table.
It seems that in versions >= 2.6.0 `data` defaults to `'null'`, so this should only affect databases with runs created on <= 2.5.0.
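To illustrate how that can make the upgrade loop forever: the sketch below is an assumed, simplified version of the batching pattern (the real query lives in revision f92143d30c25 and may differ). Because the buggy condition evaluates to `NULL` for `data = NULL`, those rows keep `has_data IS NULL` after the `UPDATE`, so each batch selects the same rows again and the loop makes no progress:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, data TEXT, has_data BOOLEAN)")
cur.executemany("INSERT INTO t (data) VALUES (?)", [(None,)] * 3)

batch_size = 2
for _ in range(5):  # bounded here; the real migration loops until no rows remain
    cur.execute(
        """UPDATE t SET has_data = (data IS NOT NULL OR data != 'null')
           WHERE id IN (SELECT id FROM t WHERE has_data IS NULL LIMIT ?)""",
        (batch_size,),
    )
    remaining = cur.execute("SELECT COUNT(*) FROM t WHERE has_data IS NULL").fetchone()[0]
    if remaining == 0:
        break

# has_data was set to NULL (not 0) for every data = NULL row, so the
# "has_data IS NULL" filter matches the same rows on every iteration.
print(remaining)  # still 3
```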
Solution
Use `IS NOT` instead of `!=` (again, see 2.) and `and` instead of `or`.
Workaround
Stamp the database with f92143d30c25, downgrade to bb38729c471a, replace the two lines in revision f92143d30c25, and run the migrations.
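The corrected condition can be checked on a toy table. This is a sketch: the table below is a minimal stand-in for `flow_run_state`, not the real schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
# Minimal stand-in for flow_run_state: only the columns relevant to the bug.
cur.execute("CREATE TABLE flow_run_state (id INTEGER PRIMARY KEY, data TEXT, has_data BOOLEAN)")
cur.executemany(
    "INSERT INTO flow_run_state (data) VALUES (?)",
    [(None,), ("null",), ('{"x": 1}',)],  # NULL, the string 'null', a JSON value
)

# Corrected condition: IS NOT instead of !=, and AND instead of OR.
cur.execute("UPDATE flow_run_state SET has_data = (data IS NOT NULL AND data IS NOT 'null')")

rows = cur.execute("SELECT data, has_data FROM flow_run_state ORDER BY id").fetchall()
print(rows)  # [(None, 0), ('null', 0), ('{"x": 1}', 1)] -- matches the Goal column
```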
Reproduction
SQL:
```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
stmt = """
    SELECT
        (PLACEHOLDER IS NOT NULL),
        (PLACEHOLDER != 'null'),
        (PLACEHOLDER IS NOT 'null'),
        (PLACEHOLDER IS NOT NULL or PLACEHOLDER != 'null'),
        (PLACEHOLDER IS NOT NULL and PLACEHOLDER != 'null'),
        (PLACEHOLDER IS NOT NULL or PLACEHOLDER IS NOT 'null'),
        (PLACEHOLDER IS NOT NULL and PLACEHOLDER IS NOT 'null')
"""
cur.execute(stmt.replace("PLACEHOLDER", "NULL"))
print(cur.fetchall())
cur.execute(stmt.replace("PLACEHOLDER", "'null'"))
print(cur.fetchall())
cur.execute(stmt.replace("PLACEHOLDER", "'JSON'"))
print(cur.fetchall())
```
Complete:
1. `pip install prefect==2.4.2`
2. `export PREFECT_ORION_DATABASE_CONNECTION_URL=sqlite+aiosqlite:///test.db`
3. `export PREFECT_ORION_DATABASE_ECHO=True`
4. Start Orion, start an agent, upload a deployment, and run it at least once.
5. `pip install prefect==2.7.11`
6. You need more rows with `data = NULL` in `flow_run_state` than the `batch_size` used here, so lower the value manually.
7. Start Orion: the migrations will never terminate and the server will never start. Restarting the server will leave the database in a broken state, since `has_data` has already been added but the migrations terminated prematurely.
Error
Versions
Additional context
No response