Fix constraints to use .drop_all() instead of dropping database #1403

erseco · 2020-05-13T15:56:24Z

Fix constraints to use use_alter=True in the foreign key between Users and Teams, this allows to use the sqlalchemy's built-in .drop_all() function instead of dropping database via sqlalchemy-utils' drop_database().

Drop a running database in production is a totally discouraged practice, so a better approach is dropping all database objects. This is the main reason why sqlalchemy don't have built-in methods for dropping and creating databases.

This PR improves PostgreSQL support to try to maintain currently supported databases in the upcoming 3.x versión

This PR also fixes #1397 so I can close that PR

More info about why using use_alter=True to avoid constraint errors: https://docs.sqlalchemy.org/en/13/core/constraints.html#creating-dropping-foreign-key-constraints-via-alter

ColdHeat · 2020-05-13T16:12:12Z

Maintaining database compatibility causes nothing but problems and I do not see any benefit to keeping Postgres support inside of CTFd.

For example, by specifying the name you likely stop the automatic foreign key name generation so there's probably a weird oddly named foreign key inside of MySQL (the actual CTFd database).

You can already deploy MySQL/MariaDB on Heroku so I don't see why you need to fight to support Postgres.

codecov · 2020-05-13T16:20:26Z

Codecov Report

❗ No coverage uploaded for pull request base (master@148bdcc). Click here to learn what that means.
The diff coverage is 66.66%.

@@            Coverage Diff            @@
##             master    #1403   +/-   ##
=========================================
  Coverage          ?   89.01%           
=========================================
  Files             ?      105           
  Lines             ?     5591           
  Branches          ?        0           
=========================================
  Hits              ?     4977           
  Misses            ?      614           
  Partials          ?        0

Impacted Files	Coverage Δ
CTFd/utils/migrations/__init__.py	`78.78% <60.00%> (ø)`
CTFd/utils/exports/__init__.py	`69.87% <66.66%> (ø)`
CTFd/models/__init__.py	`95.55% <100.00%> (ø)`
CTFd/api/v1/awards.py	`97.77% <0.00%> (ø)`
CTFd/utils/encoding/__init__.py	`81.81% <0.00%> (ø)`
CTFd/schemas/users.py	`95.87% <0.00%> (ø)`
CTFd/schemas/submissions.py	`88.88% <0.00%> (ø)`
CTFd/utils/events/__init__.py	`89.47% <0.00%> (ø)`
CTFd/auth.py	`89.10% <0.00%> (ø)`
... and 98 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 148bdcc...6aa0da9. Read the comment docs.

erseco · 2020-05-13T16:35:30Z

PostgreSQL on Heroku brings a 10k rows database for free. The options for MySQL/MariaDB only brings a 5Mb databases for free, this is around 1K rows, maybe less so is impossible to use in a minimal CTFd setup.

Also, the resource usage of PostgreSQL in Docker is lower than MySQL/MariaDB so my students can develop in our low resource machines. PostgreSQL docker container is taking 8Mb of RAM while MySQL/MariaDB is taking more than 200Mb in idle.

These are the main reasons to want to try to maintain the compatibility with PostgreSQL, I would like to collaborate with this project giving patches, but if it's a problem I can maintain my own fork.

Best regards!

ColdHeat · 2020-05-13T17:23:38Z

Thanks for your PR, I'm appreciative that you're using CTFd for education. There isn't an issue with sending a PR but I take issue with PRs that solve one-off problems specific to one particular use case because they end up very disorganized. In this case this tramples on a general expectation that CTFd only supports Postgres unofficially and starts adding warts on MySQL.

I admittedly have considered retaining Postgres support because I have read the literature that it's "superior" but then I remember the hours that I've spent investigating weird differences between how MySQL and Postgres handle data. There's a lot of added complexity for a database that's not really very widely used when the recommendation is MySQL.

To retain Postgres support I would have to see most if not all of the Postgres specific code removed from the CTFd code base and replaced with db-agnostic code.

erseco · 2020-05-13T19:46:02Z

Thanks to you for your project, I started the course doing an easy CTF about informatics, later they learn how to deploy locally, later to deploy on the cloud, and finally how to modify the code. This project fits all needs of the course.

I agree about don't add complexity to support different databases, that is why I did this new PR, the constraint problem is also in MySQL, so this fixes it, you can see the raised error when calling to app.db.drop_all() function.

sqlalchemy.exc.CircularDependencyError: Can't sort tables for DROP; an unresolvable foreign key dependency exists between tables: teams, users.  Please ensure that the ForeignKey and ForeignKeyConstraint objects involved in the cycle have names so that they can be dropped using DROP CONSTRAINT.

This PR fixes this error and seems a cleaner approach because is the recommended usage in sqlachemy docs. More info about why using use_alter=True to avoid constraint errors: https://docs.sqlalchemy.org/en/13/core/constraints.html#creating-dropping-foreign-key-constraints-via-alter

In the future, my idea is removing the sqlachemy-utils dependency using just sqlalchemy for all the tasks.

I checked the code and maybe the messiest problem with different database engines is in the import/export system, that can be changed for an easier approach using the built-in sqlalchemy serializer exporting to pickle. Another approach is loading the data in the ORM and let it set the correct ids instead of doing tricky things to reset the auto-increments.

About your reads: I worked as DevOps for years and yes, PostgreSQL is so much better, that's why it is the default engine in other projects like Django.

ColdHeat

I'm leaving this review as a record of my comments and thoughts. As it stands, without the upgrade line the code won't work for old imports so that at the minimum must be fixed.

Overall, I'm not sold that this PR has significant benefits and I don't think supporting Postgres outweighs the benefits of standardization.

ColdHeat · 2020-05-18T15:23:55Z

CTFd/utils/exports/__init__.py

@@ -165,16 +159,11 @@ def import_ctf(backup, erase=True):

    members = first + members

-    upgrade(revision=alembic_version)


By removing this line, tables are created to the latest migration and imports will fail if they are from a previous version that had different data.

Restored that line

ColdHeat · 2020-05-18T15:28:18Z

CTFd/utils/migrations/__init__.py

+        # will fail: https://github.com/kvesteri/sqlalchemy-utils/pull/372
+        # in that case we asume that the database was previously created and we can't
+        # drop or create the database
+        exists = database_exists_util(url)


Even if we ignore the Postgres support discussion, I don't know if I would accept this change. Too hacky for my tastes and probably could affect other things. Maybe better to only exception handle under Postgres.

We would need to wait for kvesteri/sqlalchemy-utils#372

ColdHeat · 2020-05-18T15:31:15Z

CTFd/utils/exports/__init__.py

@@ -123,8 +118,7 @@ def import_ctf(backup, erase=True):
        )

    if erase:
-        drop_database()
-        create_database()
+        app.db.drop_all()


While I was working on #1419, I experimented with this functionality to see if it resolved an issue I was running into but it didn't. This might be better practice but I am not convinced that it has an actual benefit in CTFd.

ColdHeat · 2020-05-18T15:35:39Z

CTFd/models/__init__.py

-    team_id = db.Column(db.Integer, db.ForeignKey("teams.id"))
+    # use_alter=True along with name='' adds this foreign key after Teams has been created to avoid circular dependency
+    team_id = db.Column(
+        db.Integer, db.ForeignKey("teams.id", use_alter=True, name="users_team_id_fkey")


I don't like pinning the name here. If we can tap into the Sqlalchemy built-in name generator, I would be more willing to merge this in but I still am not certain I want to go down the path of supporting multiple databases overall.

I removed the name for alter=true is mandatory manually set the name of the constraint using its internal conventions: https://alembic.sqlalchemy.org/en/latest/naming.html

erseco · 2020-05-18T17:53:17Z

Closing this PR because don't seems more needed

Fix constraints to use .drop_all() instead of dropping database

16410e8

Merge branch 'upstream' into fix-drop-all-constraints

8ce4195

ColdHeat requested changes May 18, 2020

View reviewed changes

Some code cleanup

6aa0da9

erseco closed this May 18, 2020

sveinugu mentioned this pull request Jun 21, 2020

database_exists: don't connect to 'postgres' data base for database existence check kvesteri/sqlalchemy-utils#372

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix constraints to use .drop_all() instead of dropping database #1403

Fix constraints to use .drop_all() instead of dropping database #1403

erseco commented May 13, 2020

ColdHeat commented May 13, 2020

codecov bot commented May 13, 2020 •

edited

Loading

erseco commented May 13, 2020

ColdHeat commented May 13, 2020

erseco commented May 13, 2020

ColdHeat left a comment

ColdHeat May 18, 2020

erseco May 18, 2020

ColdHeat May 18, 2020

ColdHeat May 18, 2020

ColdHeat May 18, 2020

erseco May 18, 2020 •

edited

Loading

erseco commented May 18, 2020

		@@ -165,16 +159,11 @@ def import_ctf(backup, erase=True):

		members = first + members

		upgrade(revision=alembic_version)

Fix constraints to use .drop_all() instead of dropping database #1403

Fix constraints to use .drop_all() instead of dropping database #1403

Conversation

erseco commented May 13, 2020

ColdHeat commented May 13, 2020

codecov bot commented May 13, 2020 • edited Loading

Codecov Report

erseco commented May 13, 2020

ColdHeat commented May 13, 2020

erseco commented May 13, 2020

ColdHeat left a comment

Choose a reason for hiding this comment

ColdHeat May 18, 2020

Choose a reason for hiding this comment

erseco May 18, 2020

Choose a reason for hiding this comment

ColdHeat May 18, 2020

Choose a reason for hiding this comment

ColdHeat May 18, 2020

Choose a reason for hiding this comment

ColdHeat May 18, 2020

Choose a reason for hiding this comment

erseco May 18, 2020 • edited Loading

Choose a reason for hiding this comment

erseco commented May 18, 2020

codecov bot commented May 13, 2020 •

edited

Loading

erseco May 18, 2020 •

edited

Loading