-
-
Notifications
You must be signed in to change notification settings - Fork 43
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Replace SQLAlchemy ORM with asyncpg db driver plus optional query builder #254
Comments
SQLAlchemy is a great package. While there is a small performance overhead compared to using the driver directly, I think the trade off is worth it. If we use psycopg directly, I'm not sure how we could manage the database models in a maintainable way. We currently also rely in automatic database migration (creation) on the first start of FMTM (e.g. for local dev servers), and this relies on SQLAlchemy. |
I think we'll be doing this in a much simpler way, and just load an sql file that's a dump of the schema. Other than using sqlalchemy to create the database, I was using the lower level function to execute SQL anyway. This will also make it easier to create test cases, since they'll all need to create a database with test data. It's not a performance issue, mostly just reducing complexity since pretty soon we'll be doing major triage on the schema. |
There are advantages to both approaches - I agree an SQL file is very simple given basic knowledge of SQL. Do you propose we change to psycopg prior to the schema edits for role types? |
Yes, as it'll make refactoring the schema less messy. At this point we haven't figured out where to add the roles, so that refactoring is probably a week or two away. |
I'll just note this is not a high priority, nor in the critical path for now. :-) But at least it's worth discussing it. |
We need to make a decision if we want to move ahead with this, or continue with an ORM. If we continue with SQLAlchemy, then we need a way to migrate the database after updates (e.g. alembic). |
I personally find psycopg2 simpler since we'll never talk to anything but a postgresql database with postgis support. In other projects rather than use sqlalchemy to setup the database schema, I just load a an SQL file, which makes it very easy to update the schema. Sqlalchemy is great if you need to talk to different databases, but FMTM doesn't do that. I do know that in most of the queries I added to the backend I use the lower level API anyway using execute(). The classic question is how much time to refactor ? I'd guess at least a week ? Not sure if I want to add alembic as another dependency. |
This idea is growing on me. Work requiredI estimate we currently have about 10 database queries to re-write in pure SQL, to be able to remove SQLAlchemy - I could work on this after finishing the docker config I am working on. Additional benefits
For consideration
|
I think stripping out the geo functionality is a separate issue. I know all the SQL queries I added to the backend were using the lower level support in SqlAlchemy to just call execute(), those would be trivial to convert. I think the biggest difference is that without SqlAlchemy we'd have to have docker load an SQL file to create all the tables. Personally I find it easier to edit an SQL text file than change what SqlAlchemy uses. I like the idea of going all the way and using psycopg3. In the long run, I think dropping SqlAlchemy will make the code simpler and more maintainable. |
Is anyone working on this? I made a start this evening using psycopg3. I slightly underestimated the task - it's a pretty big refactor, but I do think the end result is easy to understand. The downside is that we lose the overview of what tables and fields are present in the DB. Plus the ORM provides centralisation of tables and fields that we don't get with raw SQL (updating or adding a new field requires updating in multiple places). We would also have to be wary of SQL injection on raw queries. This risk is reduced by passing in params instead of substituting variables directly into the string. Note 1: We can use pooling of connections with psycopg and probably even use async queries. Note 2: I stripped out a lot of unused dependencies and updated others (I had never actually gone through the package list we inherited before). |
After discussion in PR #441, we decided to use asyncpg over psycopg3, and try out a lightweight query generator to use alongside raw SQL. |
Sqlalchemy is a high level framework to work with multiple database type, like mysql, postgresql, etc... It's overkill for this project as we will only use postgresql with postgis. For postgresql access, sqlalchemy uses psycopg2. We should simplify the code base to use psycopg2 directly. This will make it much easier to change the database schema, and simplifying the code base will make it easier to maintain.
The text was updated successfully, but these errors were encountered: