Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[3541] Augmenting datasources uniqueness constraints #3583

Merged
merged 1 commit into from Nov 20, 2017

Conversation

john-bodley
Copy link
Member

@john-bodley john-bodley commented Oct 4, 2017

This PR fixes issue #3541 by replacing the uniqueness constraint on the datasource_name column of the datasources table with the a combination of the cluster_name and datasource_name columns.

Tested by ensuring that it was valid to have multiple datasources with the same name under the condition that the clusters differed, i.e.,
screen shot 2017-10-03 at 5 38 21 pm

The previous unique datasource_name key was used as a foreign key constraint for the columns and metrics tables. Given that datasource_name is no longer unique, I'm now using the id column (which is the primary key which one could argue is a better candidate to use) as the foreign key.

Note during testing I discovered that there were duplicate foreign key constraints in the columns table (and possibly metrics) as this was defined in multiple migrations (1226819ee0e3, 3b626e2a6783). The upgrade (which actually removes these constraints and the underlying column) resolves this issue which means an upgrade followed by a downgrade will leave the database in a more viable state.

Also we should be using the inspector for the Alembic bind rather than relying on the actual Superset DB engine for inspecting the table schema, given that there may be differences in the perspective of the DB state. For example in MySQL when one performs an upgrade on a new DB the set of foreign key constraints differ between what the DB engine reports (2) and the current schema state (4). Hence the following functions are now deprecated:

  • generic_find_constraint_name
  • table_has_constraint

though regrettably need to remain to ensure consistent historical behavior for upgrades/downgrades.

Finally I incremented the id for several Druid test cases as 10003 was used in multiple test cases.

I tested the data migration, component, i.e., populating the datasource_id column in the columns and metrics table by:

  1. Creating a Druid cluster, datasource, column, and metric on the master branch.
  2. Switching to my branch.
  3. Ran superset db upgrade and confirmed that the columns and metrics tables had been correctly updated.
  4. Ran superset db downgrade and confirmed that the columns and metrics tables had been correctly reverted.

Finally for MySQL, Postgres, and SQLite, I ran the following commands,

  • superset db upgrade
  • superset db downgrade
    to validate the logic across the major SQLA connectors.

@coveralls
Copy link

coveralls commented Oct 4, 2017

Coverage Status

Coverage decreased (-0.07%) to 70.077% when pulling 09440f6a985372f5acb56ed9801cbb917424195d on john-bodley:john-bodley-issue-3541 into 076f9cd on apache:master.

@mistercrunch
Copy link
Member

Looks like Postgres is choking on this...
https://travis-ci.org/apache/incubator-superset/jobs/283000354

Given the try statement, and the fact that removing the constraint may fail on some DB, you may want to remove the original creation of the constraint in the previous migration...

@coveralls
Copy link

coveralls commented Oct 4, 2017

Coverage Status

Coverage decreased (-0.07%) to 69.935% when pulling 9df0b1850f362fbff22ffbfe96f9ddbc8d63afaf on john-bodley:john-bodley-issue-3541 into e95132d on apache:master.

@mistercrunch
Copy link
Member

Knowing that deleting constraints is iffy on some platform with alembic, I vote for also removing this line while at it:
https://github.com/apache/incubator-superset/blob/master/superset/migrations/versions/4e6a06bad7a8_init.py#L71

@john-bodley
Copy link
Member Author

@mistercrunch regardless of whether the line which defines the uniqueness constraint exists, given that superset db upgrade needs to work on all existing instances (which have this constraint) doesn't it seem prudent that the upgrade logic works in the presence of the constraint.

@mistercrunch
Copy link
Member

@john-bodley it's just to insure that new installs that would fail the try block would also not have the constraint

Otherwise, or complementarily, this function, and related commit/PRs hisotry may have hints as to how to handle deleting constraints in Alembic.
https://github.com/apache/incubator-superset/blob/master/superset/migrations/versions/1226819ee0e3_fix_wrong_constraint_on_table_columns.py#L23

@coveralls
Copy link

coveralls commented Oct 6, 2017

Coverage Status

Coverage decreased (-0.2%) to 69.844% when pulling 33b6ea0f33d6742ccced965ffbc445a8bbf3dae0 on john-bodley:john-bodley-issue-3541 into 6f1351f on apache:master.

@coveralls
Copy link

Coverage Status

Coverage decreased (-0.5%) to 69.627% when pulling b931bb147e37ebc0a120c464e8ef0085fb8bbe98 on john-bodley:john-bodley-issue-3541 into 6f1351f on apache:master.

@john-bodley john-bodley changed the title [3541] Augmenting datasources uniqueness constraints [3541] Augmenting datasources uniqueness constraints [WIP] Oct 7, 2017
@coveralls
Copy link

coveralls commented Oct 7, 2017

Coverage Status

Coverage decreased (-0.02%) to 70.065% when pulling fe06fa836326412fbd4ae2fe6bb70c56fb2681ed on john-bodley:john-bodley-issue-3541 into 6f1351f on apache:master.

@coveralls
Copy link

coveralls commented Oct 7, 2017

Coverage Status

Coverage decreased (-0.02%) to 70.065% when pulling dd80dbf58f815b386d730e991660367c20d64867 on john-bodley:john-bodley-issue-3541 into 6f1351f on apache:master.

@coveralls
Copy link

coveralls commented Oct 7, 2017

Coverage Status

Coverage decreased (-0.08%) to 70.003% when pulling 93ebaf7dcf20ccc57ee62540f58a909332d878a3 on john-bodley:john-bodley-issue-3541 into 6f1351f on apache:master.

@coveralls
Copy link

Coverage Status

Coverage decreased (-0.09%) to 69.993% when pulling cbcd4ae16a24e92d0dd2378649beb075f2eed327 on john-bodley:john-bodley-issue-3541 into 6f1351f on apache:master.

3 similar comments
@coveralls
Copy link

Coverage Status

Coverage decreased (-0.09%) to 69.993% when pulling cbcd4ae16a24e92d0dd2378649beb075f2eed327 on john-bodley:john-bodley-issue-3541 into 6f1351f on apache:master.

@coveralls
Copy link

Coverage Status

Coverage decreased (-0.09%) to 69.993% when pulling cbcd4ae16a24e92d0dd2378649beb075f2eed327 on john-bodley:john-bodley-issue-3541 into 6f1351f on apache:master.

@coveralls
Copy link

Coverage Status

Coverage decreased (-0.09%) to 69.993% when pulling cbcd4ae16a24e92d0dd2378649beb075f2eed327 on john-bodley:john-bodley-issue-3541 into 6f1351f on apache:master.

@coveralls
Copy link

coveralls commented Oct 7, 2017

Coverage Status

Coverage decreased (-0.09%) to 69.993% when pulling 1504cc9c3af3e6472390268b2c71d8d73aac19b4 on john-bodley:john-bodley-issue-3541 into 6f1351f on apache:master.

1 similar comment
@coveralls
Copy link

Coverage Status

Coverage decreased (-0.09%) to 69.993% when pulling 1504cc9c3af3e6472390268b2c71d8d73aac19b4 on john-bodley:john-bodley-issue-3541 into 6f1351f on apache:master.

@john-bodley john-bodley changed the title [3541] Augmenting datasources uniqueness constraints [WIP] [3541] Augmenting datasources uniqueness constraints Oct 7, 2017
@john-bodley
Copy link
Member Author

@mistercrunch could you PTAL at this PR?

@mistercrunch
Copy link
Member

super_tv

@mistercrunch mistercrunch self-requested a review October 25, 2017 23:17
Copy link
Member

@mistercrunch mistercrunch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

that was a hairy problem, thanks for tackling it!

@coveralls
Copy link

Coverage Status

Coverage decreased (-0.09%) to 71.366% when pulling 16beb4602b98d2c7bc49636a8edaf0dd8a58ea13 on john-bodley:john-bodley-issue-3541 into b059506 on apache:master.

1 similar comment
@coveralls
Copy link

coveralls commented Nov 6, 2017

Coverage Status

Coverage decreased (-0.09%) to 71.366% when pulling 16beb4602b98d2c7bc49636a8edaf0dd8a58ea13 on john-bodley:john-bodley-issue-3541 into b059506 on apache:master.

@mistercrunch
Copy link
Member

@john-bodley let me know if you want me to merge this once the conflicts are resolved.

@john-bodley john-bodley force-pushed the john-bodley-issue-3541 branch 7 times, most recently from 70ec3b1 to ace9437 Compare November 18, 2017 04:54
@john-bodley
Copy link
Member Author

@mistercrunch I've resolve all the merge conflicts.

@mistercrunch mistercrunch merged commit 3c72e1f into apache:master Nov 20, 2017
michellethomas pushed a commit to michellethomas/panoramix that referenced this pull request May 24, 2018
wenchma pushed a commit to wenchma/incubator-superset that referenced this pull request Nov 16, 2018
@mistercrunch mistercrunch added 🏷️ bot A label used by `supersetbot` to keep track of which PR where auto-tagged with release labels 🚢 0.21.0 labels Feb 27, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
🏷️ bot A label used by `supersetbot` to keep track of which PR where auto-tagged with release labels 🚢 0.21.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants