Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Add systemdb support for MS SQL Server #6167

Merged

Conversation

JulesHuisman
Copy link
Contributor

Relates to issue #3238.

The main improvement is limiting all String types in the alembic migrations. This allows the possibility to use both MSSQL and MYSQL as a Meltano backend database.

Both databases were manually tested. It might be valuable to add automatic tests to the Github actions.

To use MYSQL run: pip install mysqlclient
To use MSSQL run: pip install pyodbc

@netlify
Copy link

netlify bot commented Jun 12, 2022

👷 Deploy request for meltano pending review.

Visit the deploys page to approve it

Name Link
🔨 Latest commit 7fa5e19

@pandemicsyn
Copy link
Contributor

Hi @JulesHuisman awesome to see you pick this back up, personally very excited to get some support for other DB's into Meltano 🥳

@aaronsteers this is a continuation of this thread: https://gitlab.com/meltano/meltano/-/issues/3315#note_908238671, I think what @JulesHuisman has here plus some docs covering how to use "unofficial backends like MSSQL and MySQL" are a great start and easy way to test the water.

We could follow it up and setup an actions workflow to test migrations against MySQL/MSSQL when a PR with new migrations is created. @aaronsteers In the original thread you'd expressed some concerns about being able to test against MSSQL, looks like the MSSQL on linux docker container is pretty popular and well-supported (https://hub.docker.com/_/microsoft-mssql-server), but just having a GH actions gate for migrations would go along way towards ensuring that future migrations don't having breaking changes for MySQL or MSSQL.

@aaronsteers
Copy link
Contributor

@JulesHuisman and @pandemicsyn - I would be happy to add CI tests for DBs. I am also okay to merge this and add those tests in a follow-on issue, depending on @JulesHuisman's availability and/or others who can help ship this in CI.

A quick google search shows tests may be fairly pretty easy in GitHub Actions - https://datastation.multiprocess.io/blog/2021-12-16-sqlserver-in-github-actions.html

@aaronsteers
Copy link
Contributor

aaronsteers commented Jun 14, 2022

@pandemicsyn and @edgarrmondragon - I'm removing myself from reviewer role but please re-add me if you would like me to rereview. I'll defer to you both - when you feel this is ready to merge, I'm all for it. 👍

And - again - since risk here is super low (adding str max when where they were otherwise omitted), I'm okay to merge without additional CI checks added.

The one thing I'd ask to watch out for is that I'd rather a stringlen max be overly permissive than to run a risk of truncating or failing on use cases that might introduce longer strings. (Most modern data platforms don't have any benefit to shorter lengths anyway, since they only use the num bytes that each row requires.)

@edgarrmondragon
Copy link
Collaborator

edgarrmondragon commented Jun 14, 2022

Correct mapping of SQLAlchemy models to migration scripts

Jobs

Plugin Settings

embed_tokens

NA

Subscriptions

User

These all really have max length of 255, so migration needs to reflect that.

Role

This also has max length of 255, so migration needs to reflect that.

Role Permissions

OAuth

@edgarrmondragon
Copy link
Collaborator

edgarrmondragon commented Jun 14, 2022

@pandemicsyn we used to have a sort of test matrix running tests on both SQLite and PostgreSQL:

  • PYTEST_BACKEND = os.getenv("PYTEST_BACKEND", "sqlite")
    pytest_plugins = [
    "fixtures.db",
    "fixtures.fs",
    "fixtures.core",
    "fixtures.api",
    "fixtures.cli",
    ]
    if PYTEST_BACKEND == "sqlite":
    pytest_plugins.append("fixtures.db.sqlite")
    elif PYTEST_BACKEND == "postgresql":
    pytest_plugins.append("fixtures.db.postgresql")
    else:
    raise Exception(f"Unsuported backend: {PYTEST_BACKEND}.")
    BACKEND = ["sqlite", "postgresql"]
  • .pytest_sqlite:
    extends: .pytest
    variables:
    PYTEST_BACKEND: sqlite
    # `target-sqlite` configuration
    SQLITE_DATABASE: pytest_warehouse
  • pytest_postgres:
    extends: .pytest_postgres
    variables:
    PYTEST_MARKERS: not concurrent

I suggest we migrate those (either before or after this PR merges) and include both MySQL and MSSQL in the mix.

@JulesHuisman
Copy link
Contributor Author

Thanks for taking a look guys,

In some places it might be possible to make the max string length longer. I chose 128 because in some places MYSQL would throw errors when it was longer (450).

I can also take a look at testing the migrations. The MSSQL docker container works quite well, I also used it when developing this PR.

@JulesHuisman
Copy link
Contributor Author

@edgarrmondragon @pandemicsyn Hi guys, what needs to be fixed or adjusted to get this PR through?

And do we want the tests included in this PR, or can that be included in a separate PR?

Thanks!

@edgarrmondragon
Copy link
Collaborator

edgarrmondragon commented Jun 21, 2022

@JulesHuisman can you merge main into this branch? We just added Postgres to the test matrix there.

@codecov
Copy link

codecov bot commented Jun 22, 2022

Codecov Report

Merging #6167 (7fa5e19) into main (ef4428f) will increase coverage by 0.02%.
The diff coverage is 86.11%.

@@            Coverage Diff             @@
##             main    #6167      +/-   ##
==========================================
+ Coverage   96.52%   96.55%   +0.02%     
==========================================
  Files          95       96       +1     
  Lines        8349     8500     +151     
  Branches      398      405       +7     
==========================================
+ Hits         8059     8207     +148     
  Misses        227      227              
- Partials       63       66       +3     
Impacted Files Coverage Δ
tests/conftest.py 91.48% <66.66%> (+0.28%) ⬆️
tests/fixtures/db/mssql.py 87.50% <87.50%> (ø)
tests/fixtures/db/postgresql.py 100.00% <100.00%> (ø)
tests/meltano/cli/test_config.py 89.55% <0.00%> (-3.05%) ⬇️
tests/meltano/core/test_settings_store.py 99.14% <0.00%> (-0.07%) ⬇️
tests/meltano/core/runner/test_runner.py 99.10% <0.00%> (-0.02%) ⬇️
tests/asserts.py 100.00% <0.00%> (ø)
tests/fixtures/api.py 100.00% <0.00%> (ø)
tests/fixtures/db/sqlite.py 100.00% <0.00%> (ø)
tests/meltano/cli/test_ui.py 100.00% <0.00%> (ø)
... and 84 more

📣 Codecov can now indicate which changes are the most critical in Pull Requests. Learn more

@JulesHuisman
Copy link
Contributor Author

Ah alright, so it is running into issues with the string length. I will look into it!

@edgarrmondragon
Copy link
Collaborator

@JulesHuisman an options seems to be off with the mssql service, and it's causing all workflows to fail.

We moved away from the services approach in favor of using docker run explicitly in a step (see #6268). What do you think of following a similar approach for mysql and mssql?

@JulesHuisman
Copy link
Contributor Author

JulesHuisman commented Jun 22, 2022

@JulesHuisman an options seems to be off with the mssql service, and it's causing all workflows to fail.

We moved away from the services approach in favor of using docker run explicitly in a step (see #6268). What do you think of following a similar approach for mysql and mssql?

Ah yes, looks good! I will merge the new main branch and switch over mssql and mysql. (I think my health check needs a health check itself 😃)

@joaopamaral
Copy link

Hi @JulesHuisman! First of all, thanks for this awesome work!

I just had to include a small change to make it work in my environment in the src/meltano/migrations/versions/5b43800443d1_rename_job_to_job_run_and_job_id_to_job_.py migration file to include the existing_type required by alembic.

The new file looks like this after the changes:

import sqlalchemy as sa
from alembic import op

from meltano.migrations.utils.dialect_typing import (
    get_dialect_name,
    max_string_length_for_dialect,
)

# revision identifiers, used by Alembic.
revision = "5b43800443d1"
down_revision = "13e8639c6d2b"
branch_labels = None
depends_on = None


def upgrade():
    dialect_name = get_dialect_name(op)
    max_string_length = max_string_length_for_dialect(dialect_name)

    op.alter_column("job", "job_id", 
        new_column_name="job_name",
        existing_type=sa.types.String(max_string_length)
    )
    op.rename_table("job", "runs")


def downgrade():
    dialect_name = get_dialect_name(op)
    max_string_length = max_string_length_for_dialect(dialect_name)
    
    op.rename_table("runs", "job")
    op.alter_column("job", "job_name", 
        new_column_name="job_id",
        existing_type=sa.types.String(max_string_length)
    )

After this change, I managed to make the db migration work with a MySQL instance.

@edgarrmondragon
Copy link
Collaborator

@joaopamaral thanks for sharing! I've put your changes in diff format in case @JulesHuisman or someone else wants to try them out:

 import sqlalchemy as sa
 from alembic import op
 
+from meltano.migrations.utils.dialect_typing import (
+    get_dialect_name,
+    max_string_length_for_dialect,
+)
+
 # revision identifiers, used by Alembic.
 revision = "5b43800443d1"
 down_revision = "13e8639c6d2b"
 
 
 def upgrade():
-    op.alter_column("job", "job_id", new_column_name="job_name")
+    dialect_name = get_dialect_name(op)
+    max_string_length = max_string_length_for_dialect(dialect_name)
+
+    op.alter_column("job", "job_id", 
+        new_column_name="job_name",
+        existing_type=sa.types.String(max_string_length)
+    )
     op.rename_table("job", "runs")
 
 
 def downgrade():
+    dialect_name = get_dialect_name(op)
+    max_string_length = max_string_length_for_dialect(dialect_name)
+    
     op.rename_table("runs", "job")
-    op.alter_column("job", "job_name", new_column_name="job_id")
+    op.alter_column("job", "job_name", 
+        new_column_name="job_id",
+        existing_type=sa.types.String(max_string_length)
+    )

@edgarrmondragon
Copy link
Collaborator

Heads up @JulesHuisman: I'm gonna merge main into this branch to try to resolve the conflicts.

@edgarrmondragon edgarrmondragon changed the title feat: add systemdb support for MySQL, MS SQL Server, and others feat: Add systemdb support for MS SQL Server Jul 28, 2022
@edgarrmondragon
Copy link
Collaborator

@pandemicsyn @aaronsteers this is ready for review for MSSQL support!

Copy link
Contributor

@pandemicsyn pandemicsyn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome! 🥳

@JulesHuisman
Copy link
Contributor Author

Thanks @edgarrmondragon, I like the idea of moving the Mysql code to another branch. That created a really weird bug in testing which I could not figure out.

@edgarrmondragon
Copy link
Collaborator

Thanks @edgarrmondragon, I like the idea of moving the Mysql code to another branch. That created a really weird bug in testing which I could not figure out.

@JulesHuisman I started a draft PR #6528, but I (or someone else) need to take some to debug those failing state tests.

Copy link
Contributor

@aaronsteers aaronsteers left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome work here! Very excited to get this in!

Copy link
Collaborator

@tayloramurphy tayloramurphy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor comments but I'm approving assuming those can get fixed so it's not a blocker

docs/src/_concepts/project.md Outdated Show resolved Hide resolved
docs/src/_guide/advanced-topics.md Outdated Show resolved Hide resolved
@edgarrmondragon edgarrmondragon merged commit 9dc9075 into meltano:main Aug 5, 2022
@JulesHuisman
Copy link
Contributor Author

Thanks for taking this over the finish line @edgarrmondragon!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants