Add support for attaching multiple DuckDB Databases #5764
Merged
…ase instance and into the attached database
…llow creating non-internal entries in system catalog and vice versa
…d CatalogTransaction to avoid having to create a connection + transaction to fill built-in functions
…nges the name of the main database
…ead of a pointer comparison
Closes #5048
Closes #1985
This PR adds support for using ATTACH and DETACH to attach multiple DuckDB databases to the same running database instance. This is a major refactor: it does not just add support for reading external databases, it adds full support for operating on multiple databases concurrently, including creating new tables, views and schemas, inserting, updating and deleting data, altering tables, etc.
Example Usage
A list of all attached databases can be obtained using the command SHOW databases.

SHOW databases;
┌─────────┐
│  name   │
│ varchar │
├─────────┤
│ memory  │
│ system  │
│ temp    │
└─────────┘
Object Resolution & Qualified Names
The full qualified name of all objects now contains the catalog in addition to the schema. For example:
The catalog search path determines the set of schemas in which objects are looked up by default. The default catalog search path includes the system catalog, the temporary catalog and the initially attached database together with the main schema.

We can change the default database + schema pair using the USE command.

When providing only a single qualification, the system can interpret this as either a catalog or a schema, as long as there are no conflicts. For example:

If we create a conflict (i.e. we have both a schema and a catalog with the same name) the system requests that a fully qualified path is used instead:
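
The single-qualifier rule described in this section can be sketched roughly as below. This is a hypothetical, heavily simplified model and not DuckDB's binder code: the `NameResolver` struct, its members and the error texts are all invented for illustration, and the sketch assumes that a bare catalog qualifier falls back to that catalog's `main` schema and that schema qualifiers are only checked against the current default catalog.

```cpp
#include <set>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical model of qualified-name resolution; not DuckDB's actual code.
struct NameResolver {
    std::set<std::string> catalogs;                        // attached databases
    std::set<std::pair<std::string, std::string>> schemas; // (catalog, schema) pairs
    std::string default_catalog = "memory";
    std::string default_schema = "main";

    // Models the USE command: change the default catalog + schema pair.
    void Use(const std::string &catalog, const std::string &schema) {
        default_catalog = catalog;
        default_schema = schema;
    }

    // Resolve the single qualifier `q` (as in `q.tbl`) to a (catalog, schema) pair.
    std::pair<std::string, std::string> ResolveSingle(const std::string &q) const {
        const bool is_catalog = catalogs.count(q) > 0;
        const bool is_schema = schemas.count(std::make_pair(default_catalog, q)) > 0;
        if (is_catalog && is_schema) {
            // Conflict: the caller has to spell out catalog.schema.name instead.
            throw std::invalid_argument("ambiguous qualifier: " + q);
        }
        if (is_catalog) {
            return std::make_pair(q, std::string("main")); // assumed fallback schema
        }
        if (is_schema) {
            return std::make_pair(default_catalog, q);
        }
        throw std::invalid_argument("unknown catalog or schema: " + q);
    }
};
```

With catalogs `memory` and `db1` and a schema `memory.s1`, `ResolveSingle("db1")` yields `(db1, main)` and `ResolveSingle("s1")` yields `(memory, s1)`. Once a schema named `db1` is also created inside `memory`, the same call throws, which mirrors the request for a fully qualified path.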
Database Manager & Attached Databases
Adding support for attaching multiple databases has several interesting consequences for system design. After this rework, DuckDB now supports having multiple catalogs, multiple storage managers, and multiple active (running) transactions.
This is achieved by moving the Catalog, StorageManager and TransactionManager classes out of the DatabaseInstance class and into a separate AttachedDatabase class. The DatabaseManager class is added to manage the set of currently attached databases.

Transactions
The Transaction object inside the connection has been replaced with a MetaTransaction, which is responsible for managing the (potentially multiple) active transactions when reading and writing to different attached databases. The actual underlying transactions are started lazily. That is to say, calling BEGIN TRANSACTION no longer begins an actual transaction in a database but only starts a MetaTransaction. When an attached database is referenced, a transaction is started in that attached database. On commit or rollback all active transactions are ended.

SET immediate_transaction_mode=true can be toggled to change this behavior to eagerly start transactions in all attached databases instead. This is primarily useful for writing tests involving multiple transactions.

While multiple transactions can be active at a time, the system only supports writing to a single attached database in a single transaction. If you try to write to multiple attached databases in a single transaction the following error will be thrown:
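
The lazy start and the single-writer restriction can be pictured with a small model like the one below. It is a sketch under stated assumptions, not the real implementation: the class names are borrowed from this PR, but every member, method name (`Reference`, `MarkModified`, `Finish`, `ActiveCount`) and the error text is hypothetical, and a plain counter stands in for each database's own TransactionManager.

```cpp
#include <cstddef>
#include <map>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical stand-in for an attached database; the counter replaces the
// per-database TransactionManager of the real system.
struct AttachedDatabase {
    std::string name;
    int started_transactions = 0;
    explicit AttachedDatabase(std::string n) : name(std::move(n)) {}
};

// Hypothetical model of a MetaTransaction; not DuckDB's actual class.
class MetaTransaction {
public:
    // Lazily start a transaction in `db` the first time it is referenced.
    void Reference(AttachedDatabase &db) {
        if (active_.count(db.name) == 0) {
            db.started_transactions++;
            active_[db.name] = &db;
        }
    }

    // Record a write; only one attached database may be modified per transaction.
    void MarkModified(AttachedDatabase &db) {
        if (modified_ != nullptr && modified_ != &db) {
            throw std::runtime_error("already wrote to database " + modified_->name);
        }
        Reference(db);
        modified_ = &db;
    }

    // Commit or rollback: every transaction that was started is ended.
    void Finish() {
        active_.clear();
        modified_ = nullptr;
    }

    std::size_t ActiveCount() const { return active_.size(); }

private:
    std::map<std::string, AttachedDatabase *> active_;
    AttachedDatabase *modified_ = nullptr;
};
```

In this model, constructing a `MetaTransaction` starts nothing. Referencing `db1` twice starts exactly one transaction there. Reading `db2` after writing to `db1` is allowed, while a subsequent write to `db2` throws. An eager mode along the lines of immediate_transaction_mode would simply call `Reference` on every attached database up front.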
Catalogs & System Functions
As there are now multiple catalogs, the Catalog::GetCatalog(context) function has been removed. It has been replaced by two functions:

Similarly, Transaction::Get now requires a catalog to be specified:

Transaction &Get(ClientContext &context, Catalog &catalog);
The system catalog is a database that is always attached on start-up that holds all system data - including system functions, built-in views, etc. The system catalog is special in that it does not have any attached storage - and hence does not support storing tables. Extensions also generally register their new functions in the system catalog.
Temporary Entries
In addition to the system catalog, temporary objects have also been moved from a schema within the main catalog to a separate catalog. Each client creates their own attached in-memory catalog called temp that holds temporary objects.
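
The effect of giving each client its own temp catalog can be illustrated with the toy model below. All names here (`SharedDatabase`, `Client`, `CreateTable`, `CanSee`) are hypothetical and chosen only for this sketch; it models catalogs as sets of table names and ignores schemas, storage and transactions.

```cpp
#include <set>
#include <string>

// Hypothetical model: one shared database, visible to every client.
struct SharedDatabase {
    std::set<std::string> tables;
};

// Hypothetical model of a client that owns a private "temp" catalog.
struct Client {
    SharedDatabase &db;
    std::set<std::string> temp; // temporary objects of this client only

    explicit Client(SharedDatabase &shared) : db(shared) {}

    // Temporary tables go to the client's own catalog, all others to the shared one.
    void CreateTable(const std::string &name, bool temporary) {
        (temporary ? temp : db.tables).insert(name);
    }

    // A table is visible if it is in this client's temp catalog or in the shared database.
    bool CanSee(const std::string &name) const {
        return temp.count(name) > 0 || db.tables.count(name) > 0;
    }
};
```

If two clients share one `SharedDatabase`, a temporary table created by the first is invisible to the second, while a regular table created by either is visible to both.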