Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix constraints to use .drop_all() instead of dropping database #1403

Closed
wants to merge 3 commits into from

Conversation

erseco
Copy link
Contributor

@erseco erseco commented May 13, 2020

Fix constraints to use use_alter=True in the foreign key between Users and Teams, this allows to use the sqlalchemy's built-in .drop_all() function instead of dropping database via sqlalchemy-utils' drop_database().

Drop a running database in production is a totally discouraged practice, so a better approach is dropping all database objects. This is the main reason why sqlalchemy don't have built-in methods for dropping and creating databases.

This PR improves PostgreSQL support to try to maintain currently supported databases in the upcoming 3.x versión

This PR also fixes #1397 so I can close that PR

More info about why using use_alter=True to avoid constraint errors: https://docs.sqlalchemy.org/en/13/core/constraints.html#creating-dropping-foreign-key-constraints-via-alter

@ColdHeat
Copy link
Member

Maintaining database compatibility causes nothing but problems and I do not see any benefit to keeping Postgres support inside of CTFd.

For example, by specifying the name you likely stop the automatic foreign key name generation so there's probably a weird oddly named foreign key inside of MySQL (the actual CTFd database).

You can already deploy MySQL/MariaDB on Heroku so I don't see why you need to fight to support Postgres.

@codecov
Copy link

codecov bot commented May 13, 2020

Codecov Report

❗ No coverage uploaded for pull request base (master@148bdcc). Click here to learn what that means.
The diff coverage is 66.66%.

Impacted file tree graph

@@            Coverage Diff            @@
##             master    #1403   +/-   ##
=========================================
  Coverage          ?   89.01%           
=========================================
  Files             ?      105           
  Lines             ?     5591           
  Branches          ?        0           
=========================================
  Hits              ?     4977           
  Misses            ?      614           
  Partials          ?        0           
Impacted Files Coverage Δ
CTFd/utils/migrations/__init__.py 78.78% <60.00%> (ø)
CTFd/utils/exports/__init__.py 69.87% <66.66%> (ø)
CTFd/models/__init__.py 95.55% <100.00%> (ø)
CTFd/api/v1/awards.py 97.77% <0.00%> (ø)
CTFd/utils/encoding/__init__.py 81.81% <0.00%> (ø)
CTFd/schemas/users.py 95.87% <0.00%> (ø)
CTFd/schemas/submissions.py 88.88% <0.00%> (ø)
CTFd/utils/events/__init__.py 89.47% <0.00%> (ø)
CTFd/auth.py 89.10% <0.00%> (ø)
... and 98 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 148bdcc...6aa0da9. Read the comment docs.

@erseco
Copy link
Contributor Author

erseco commented May 13, 2020

PostgreSQL on Heroku brings a 10k rows database for free. The options for MySQL/MariaDB only brings a 5Mb databases for free, this is around 1K rows, maybe less so is impossible to use in a minimal CTFd setup.

Also, the resource usage of PostgreSQL in Docker is lower than MySQL/MariaDB so my students can develop in our low resource machines. PostgreSQL docker container is taking 8Mb of RAM while MySQL/MariaDB is taking more than 200Mb in idle.

These are the main reasons to want to try to maintain the compatibility with PostgreSQL, I would like to collaborate with this project giving patches, but if it's a problem I can maintain my own fork.

Best regards!

@ColdHeat
Copy link
Member

Thanks for your PR, I'm appreciative that you're using CTFd for education. There isn't an issue with sending a PR but I take issue with PRs that solve one-off problems specific to one particular use case because they end up very disorganized. In this case this tramples on a general expectation that CTFd only supports Postgres unofficially and starts adding warts on MySQL.

I admittedly have considered retaining Postgres support because I have read the literature that it's "superior" but then I remember the hours that I've spent investigating weird differences between how MySQL and Postgres handle data. There's a lot of added complexity for a database that's not really very widely used when the recommendation is MySQL.

To retain Postgres support I would have to see most if not all of the Postgres specific code removed from the CTFd code base and replaced with db-agnostic code.

@erseco
Copy link
Contributor Author

erseco commented May 13, 2020

Thanks to you for your project, I started the course doing an easy CTF about informatics, later they learn how to deploy locally, later to deploy on the cloud, and finally how to modify the code. This project fits all needs of the course.

I agree about don't add complexity to support different databases, that is why I did this new PR, the constraint problem is also in MySQL, so this fixes it, you can see the raised error when calling to app.db.drop_all() function.

sqlalchemy.exc.CircularDependencyError: Can't sort tables for DROP; an unresolvable foreign key dependency exists between tables: teams, users.  Please ensure that the ForeignKey and ForeignKeyConstraint objects involved in the cycle have names so that they can be dropped using DROP CONSTRAINT.

This PR fixes this error and seems a cleaner approach because is the recommended usage in sqlachemy docs. More info about why using use_alter=True to avoid constraint errors: https://docs.sqlalchemy.org/en/13/core/constraints.html#creating-dropping-foreign-key-constraints-via-alter

In the future, my idea is removing the sqlachemy-utils dependency using just sqlalchemy for all the tasks.

I checked the code and maybe the messiest problem with different database engines is in the import/export system, that can be changed for an easier approach using the built-in sqlalchemy serializer exporting to pickle. Another approach is loading the data in the ORM and let it set the correct ids instead of doing tricky things to reset the auto-increments.

About your reads: I worked as DevOps for years and yes, PostgreSQL is so much better, that's why it is the default engine in other projects like Django.

Copy link
Member

@ColdHeat ColdHeat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm leaving this review as a record of my comments and thoughts. As it stands, without the upgrade line the code won't work for old imports so that at the minimum must be fixed.

Overall, I'm not sold that this PR has significant benefits and I don't think supporting Postgres outweighs the benefits of standardization.

@@ -165,16 +159,11 @@ def import_ctf(backup, erase=True):

members = first + members

upgrade(revision=alembic_version)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By removing this line, tables are created to the latest migration and imports will fail if they are from a previous version that had different data.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Restored that line

# will fail: https://github.com/kvesteri/sqlalchemy-utils/pull/372
# in that case we asume that the database was previously created and we can't
# drop or create the database
exists = database_exists_util(url)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even if we ignore the Postgres support discussion, I don't know if I would accept this change. Too hacky for my tastes and probably could affect other things. Maybe better to only exception handle under Postgres.

We would need to wait for kvesteri/sqlalchemy-utils#372

@@ -123,8 +118,7 @@ def import_ctf(backup, erase=True):
)

if erase:
drop_database()
create_database()
app.db.drop_all()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While I was working on #1419, I experimented with this functionality to see if it resolved an issue I was running into but it didn't. This might be better practice but I am not convinced that it has an actual benefit in CTFd.

team_id = db.Column(db.Integer, db.ForeignKey("teams.id"))
# use_alter=True along with name='' adds this foreign key after Teams has been created to avoid circular dependency
team_id = db.Column(
db.Integer, db.ForeignKey("teams.id", use_alter=True, name="users_team_id_fkey")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't like pinning the name here. If we can tap into the Sqlalchemy built-in name generator, I would be more willing to merge this in but I still am not certain I want to go down the path of supporting multiple databases overall.

Copy link
Contributor Author

@erseco erseco May 18, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I removed the name for alter=true is mandatory manually set the name of the constraint using its internal conventions: https://alembic.sqlalchemy.org/en/latest/naming.html

@erseco
Copy link
Contributor Author

erseco commented May 18, 2020

Closing this PR because don't seems more needed

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants