-
Notifications
You must be signed in to change notification settings - Fork 4.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Unpin alembic #5249
Unpin alembic #5249
Conversation
@@ -31,7 +31,8 @@ def upgrade(): | |||
# operation is expected to fail under certain circumstances, we execute `drop_constraint()` | |||
# outside of the batch operation context. | |||
try: | |||
op.drop_constraint(constraint_name="status", table_name="runs", type_="check") | |||
# op.drop_constraint(constraint_name="status", table_name="runs", type_="check") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why drop this ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's just temporarily commented out to see how the table definition looks like without this line.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How the runs table definition looks like for each database:
sqlite:
CREATE TABLE runs (
run_uuid VARCHAR(32) NOT NULL,
name VARCHAR(250),
source_type VARCHAR(20),
source_name VARCHAR(500),
entry_point_name VARCHAR(50),
user_id VARCHAR(256),
status VARCHAR(9),
start_time BIGINT,
end_time BIGINT,
source_version VARCHAR(50),
lifecycle_stage VARCHAR(20),
artifact_uri VARCHAR(200),
experiment_id INTEGER,
CONSTRAINT run_pk PRIMARY KEY (run_uuid),
FOREIGN KEY(experiment_id) REFERENCES experiments (experiment_id),
CONSTRAINT runs_lifecycle_stage CHECK (lifecycle_stage IN ('active', 'deleted')),
CONSTRAINT source_type CHECK (source_type IN ('NOTEBOOK', 'JOB', 'LOCAL', 'UNKNOWN', 'PROJECT')),
-- 👇 Unnamed check constraint, expression looks correct
-- the reason it's unnamed is probably because we don't specify `name`
-- when constructing `Enum`:
-- https://github.com/mlflow/mlflow/pull/5249/files#diff-3492d101d4bd194139919dcac84b713b0ee4526b79d32e45c44db3655f95e838R46
CHECK (status IN ('SCHEDULED', 'FAILED', 'FINISHED', 'RUNNING', 'KILLED'))
)
postgres:
CREATE TABLE runs (
run_uuid VARCHAR(32) NOT NULL,
name VARCHAR(250),
source_type VARCHAR(20),
source_name VARCHAR(500),
entry_point_name VARCHAR(50),
user_id VARCHAR(256),
status VARCHAR(9),
start_time BIGINT,
end_time BIGINT,
source_version VARCHAR(50),
lifecycle_stage VARCHAR(20),
artifact_uri VARCHAR(200),
experiment_id INTEGER,
CONSTRAINT run_pk PRIMARY KEY (run_uuid),
CONSTRAINT runs_experiment_id_fkey FOREIGN KEY(experiment_id) REFERENCES experiments (experiment_id),
CONSTRAINT source_type CHECK ((source_type)::text = ANY ((ARRAY['NOTEBOOK'::character varying, 'JOB'::character varying, 'LOCAL'::character varying, 'UNKNOWN'::character varying, 'PROJECT'::character varying])::text[])),
CONSTRAINT runs_lifecycle_stage CHECK ((lifecycle_stage)::text = ANY ((ARRAY['active'::character varying, 'deleted'::character varying])::text[])),
-- 👇 Named check constraint, expression looks correct
CONSTRAINT runs_status_check CHECK ((status)::text = ANY ((ARRAY['SCHEDULED'::character varying, 'FAILED'::character varying, 'FINISHED'::character varying, 'RUNNING'::character varying, 'KILLED'::character varying])::text[]))
)
mysql:
CREATE TABLE runs (
run_uuid VARCHAR(32) NOT NULL,
name VARCHAR(250),
source_type VARCHAR(20),
source_name VARCHAR(500),
entry_point_name VARCHAR(50),
user_id VARCHAR(256),
status VARCHAR(9),
start_time BIGINT,
end_time BIGINT,
source_version VARCHAR(50),
lifecycle_stage VARCHAR(20),
artifact_uri VARCHAR(200),
experiment_id INTEGER,
PRIMARY KEY (run_uuid),
CONSTRAINT runs_ibfk_1 FOREIGN KEY(experiment_id) REFERENCES experiments (experiment_id),
-- 👇 Duplicate
CONSTRAINT runs_chk_1 CHECK ((`status` in (_utf8mb4'SCHEDULED',_utf8mb4'FAILED',_utf8mb4'FINISHED',_utf8mb4'RUNNING',_utf8mb4'KILLED'))),
CONSTRAINT runs_lifecycle_stage CHECK ((`lifecycle_stage` in (_utf8mb4'active',_utf8mb4'deleted'))),
CONSTRAINT source_type CHECK ((`source_type` in (_utf8mb4'NOTEBOOK',_utf8mb4'JOB',_utf8mb4'LOCAL',_utf8mb4'UNKNOWN',_utf8mb4'PROJECT'))),
-- 👇 Duplicate
CONSTRAINT status CHECK ((`status` in (_utf8mb4'SCHEDULED',_utf8mb4'FAILED',_utf8mb4'FINISHED',_utf8mb4'RUNNING')))
)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like we need this drop_constraint
operation for mysql.
Signed-off-by: harupy <hkawamura0130@gmail.com> Signed-off-by: harupy <17039389+harupy@users.noreply.github.com>
Signed-off-by: harupy <hkawamura0130@gmail.com> Signed-off-by: harupy <17039389+harupy@users.noreply.github.com>
Signed-off-by: harupy <hkawamura0130@gmail.com> Signed-off-by: harupy <17039389+harupy@users.noreply.github.com>
Signed-off-by: harupy <hkawamura0130@gmail.com> Signed-off-by: harupy <17039389+harupy@users.noreply.github.com>
Signed-off-by: harupy <17039389+harupy@users.noreply.github.com>
Signed-off-by: harupy <17039389+harupy@users.noreply.github.com>
Signed-off-by: harupy <17039389+harupy@users.noreply.github.com>
Signed-off-by: harupy <17039389+harupy@users.noreply.github.com>
Signed-off-by: harupy <17039389+harupy@users.noreply.github.com>
Signed-off-by: harupy <17039389+harupy@users.noreply.github.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@harupy Looks good, just need to address the check constraint issue for MySQL. Thanks for doing this!
Signed-off-by: harupy <17039389+harupy@users.noreply.github.com>
Signed-off-by: harupy <17039389+harupy@users.noreply.github.com>
Confirmed the new migration script works for both old and new versions of alembic: |
@WeichenXu123 @dbczumar Could you take another look again and approve if everything looks ok? |
docker-compose run mlflow-sqlite python run_checks.py --schema-output schemas/sqlite.sql | ||
docker-compose run mlflow-postgres python run_checks.py --schema-output schemas/postgres.sql | ||
docker-compose run mlflow-mysql python run_checks.py --schema-output schemas/mysql.sql | ||
docker-compose run mlflow-mssql ./init-mssql-db.sh | ||
docker-compose run mlflow-mssql python run_checks.py --schema-output schemas/mssql.sql | ||
docker-compose down --rmi all --volumes --remove-orphans | ||
docker-compose run mlflow-sqlite | ||
docker-compose run mlflow-postgres | ||
docker-compose run mlflow-mysql | ||
docker-compose run mlflow-mssql | ||
docker-compose down --volumes --remove-orphans --rmi all |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cleaned up this step by adding commands in docker-compose.yml.
COPY dist ./dist | ||
|
||
RUN pip install dist/*.whl | ||
RUN pip install psycopg2 pymysql mysqlclient | ||
COPY dist ./dist | ||
RUN pip install dist/mlflow-*.whl |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For rebuilding the image a bit faster.
restart: always | ||
environment: | ||
MYSQL_ROOT_PASSWORD: root-password | ||
MYSQL_DATABASE: mlflowdb | ||
MYSQL_USER: mlflowuser | ||
MYSQL_PASSWORD: mlflowpassword | ||
command: mysqld --default-authentication-plugin=mysql_native_password |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In MySQL >= 8.0.4, this command is required to log in using a password.
# Ensure the following migration scripts work correctly: | ||
# - cfd24bdc0731_update_run_status_constraint_with_killed.py | ||
# - 0a8213491aaa_drop_duplicate_killed_constraint.py | ||
client = mlflow.tracking.MlflowClient() | ||
client.set_terminated(run_id=run.info.run_id, status="KILLED") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added a check to ensure the updated migration scripts work properly.
docker-compose pull | ||
docker image ls | grep -E '(REPOSITORY|postgres|mysql|mssql)' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Show database image versions for debugging.
"alembic<=1.4.1", | ||
"alembic", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🎉
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
What changes are proposed in this pull request?
Unpin alembic and fix migration scripts affected by this change.
Closes #5245, #4810, #4215
How is this patch tested?
Existing tests
Does this PR change the documentation?
ci/circleci: build_doc
check. If it's successful, proceed to thenext step, otherwise fix it.
Details
on the right to open the job page of CircleCI.Artifacts
tab.docs/build/html/index.html
.Release Notes
Is this a user-facing change?
(Details in 1-2 sentences. You can just refer to another PR with a description if this PR is part of a larger change.)
What component(s), interfaces, languages, and integrations does this PR affect?
Components
area/artifacts
: Artifact stores and artifact loggingarea/build
: Build and test infrastructure for MLflowarea/docs
: MLflow documentation pagesarea/examples
: Example codearea/model-registry
: Model Registry service, APIs, and the fluent client calls for Model Registryarea/models
: MLmodel format, model serialization/deserialization, flavorsarea/projects
: MLproject format, project running backendsarea/scoring
: MLflow Model server, model deployment tools, Spark UDFsarea/server-infra
: MLflow Tracking server backendarea/tracking
: Tracking Service, tracking client APIs, autologgingInterface
area/uiux
: Front-end, user experience, plotting, JavaScript, JavaScript dev serverarea/docker
: Docker use across MLflow's components, such as MLflow Projects and MLflow Modelsarea/sqlalchemy
: Use of SQLAlchemy in the Tracking Service or Model Registryarea/windows
: Windows supportLanguage
language/r
: R APIs and clientslanguage/java
: Java APIs and clientslanguage/new
: Proposals for new client languagesIntegrations
integrations/azure
: Azure and Azure ML integrationsintegrations/sagemaker
: SageMaker integrationsintegrations/databricks
: Databricks integrationsHow should the PR be classified in the release notes? Choose one:
rn/breaking-change
- The PR will be mentioned in the "Breaking Changes" sectionrn/none
- No description will be included. The PR will be mentioned only by the PR number in the "Small Bugfixes and Documentation Updates" sectionrn/feature
- A new user-facing feature worth mentioning in the release notesrn/bug-fix
- A user-facing bug fix worth mentioning in the release notesrn/documentation
- A user-facing documentation change worth mentioning in the release notes