Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor to make supporting additional DBs easier #77

Closed
10 tasks done
jamadden opened this issue Jun 28, 2016 · 1 comment
Closed
10 tasks done

Refactor to make supporting additional DBs easier #77

jamadden opened this issue Jun 28, 2016 · 1 comment
Milestone

Comments

@jamadden
Copy link
Member

jamadden commented Jun 28, 2016

We would like to implement support for a few new databases (MS SQL Server for production #29, SQLite for testing). Maybe we'd even like to support some sort of simple plug-in system so database support can be contributed and distributed as third-party packages. This currently seems challenging due to the organization of the code.

Currently, code to support a specific database is scattered in several places:

  • adapters/<dbname>py
    The main knowledge about how to connect to a database and pull together all the other db-specific pieces. Each file defines an implementation of IRelStorageAdapter and a subclass of AbstractConnectionManager. (oracle.py also defines another subclass of OracleScriptRunner from adapters/scriptrunner.py)
  • adapters/batch.py
    Each supported database has a subclass of RowBatcher here.
  • adapters/locker.py
    Each supported database has a subclass of Locker here.
  • adapters/oidallocator.py
    Each supported database has a unique implementation of IOIDAllocator here.
  • adapters/packundo.py
    Each database has two subclasses of a PackUndo class.
  • adapters/scriptrunner.py
    Has a subclass of ScriptRunner for oracle.
  • adapters/stats
    Has a custom object (with no interface or superclass) for each database.
  • adapters/txncontrol.py
    Has a subclass of TransactionControl for each object.

There are two files that are particularly difficult.

  • adapters/mover.py
    This contains one class, ObjectMover, that has three versions of each method, one for each supported database. At runtime, it selects a set of methods to use based on the database adapter in use.
  • adapters/schema.py
    In addition to a number of strings named for each database, there are two large strings (one history-free, one history-preserving) that have a custom syntax to make the specific statement to run to create the same logical object for each database.

I believe these files (especially the last two) are organized like this to keep similar objects together across database in order to make it easier to maintain parity between them.

However, it has a couple of less-nice consequences too. A person attempting to understand what's specific about MySQL, say, has to visit all of these other files, and wade through unrelated code. Similarly, trying to add support for a new database requires editing every file in the adapters directory, which carries the possibility of unintended changes for other databases.

Maybe we can find a nice middle ground.

@jamadden
Copy link
Member Author

My initial idea is to introduce one package (directory) for each supported database. They would all mirror each other in contents (so mysql/objectmover.py, postgresql/objectmover.py, oracle/objectmover.py). The adapters/ directory would only contain abstract classes and interfaces. The schema file, in particular, would move from strings to a class with an interface or abstract methods/properties for each specific object to be created.

While this does spread out similar items, it has the desirable property that, if you forget to make a change for, say, oracle, those files will be conspicuously absent from the diff/git commit log, hopefully making it easier to remember and track what has been done for each database. (Now, you have to manually review each file or class to make sure that all three appropriate pieces have been updated. Because everything lives in one file, git status or the git commit log is of no help.)

@jamadden jamadden modified the milestones: 2.0b4, 2.0b5 Jul 16, 2016
jamadden added a commit that referenced this issue Jul 21, 2016
jamadden added a commit that referenced this issue Jul 25, 2016
* Begining refactor to fix #77.

* Split schema into specific packages.

* Refactor batchers for dbs.

* Move locker implementations to db-specific places

* split oidallocator.

* split scriptrunner.

* split txncontrol.

* split stats.

* split packundo.

* Factor ObjectMover. Not quite finished, Oracle still needs some attention and there's one more massive method that needs attention.

* oracle query formatting

* consolidate oraclescriptrunner

* split oracle connmanager.

* cleaner __init__ for oracle.

* clean up postgresql adapter.

* split mysql.

* refactor the big ObjectMover.move_from_temp method.

* Minor opts.

* minor fixes for oracle. test suite is clean for oracle now.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

1 participant