support snapshots or similar #2906

mickroll · 2020-09-29T11:30:50Z

One of the use cases for h2 databases is definitley being a storage backend for testing. We use it for our integration tests and really like its speed and ease to use 👍

Now we want to isolate our tests and for that, we currently rely on basically 3 schemes:

transaction rollback, hoping the test did not commit any data: this is error prone, and won't catch half the use cases
manually coding data-removal commands: this is error prone, one test misses to clean all data and another unrelated test fails due to residual data. It is very hard to find the culprit.
clearing the whole database and rebuilding it from scratch using scripts (or flyway/liquibase): takes ages regarding the amount of tests and complexity of data.

We generally have a few megabytes (10..100) of database data as a 'base line', and each test creates a very small amount of data on top of that.

I am aware of SCRIPT/RUNSCRIPT or the possibility to 'manually' backup/restore the underlying database files. But all of these solutions have their own drawbacks:

creating and parsing a large sql files takes time
changing the database files while h2 is running doesn't seem right and asks for trouble

A really nice feature would be to have a createsnapshot/restoresnapshot functionality built into h2.

From my point of view it could work as follows:
createsnapshot:

disallow new transactions from being created / disallow commands outside of transactions
wait for already running transactions to complete
copy underlying database files, maybe to a subdirectory with name of the snapshot
continue normal operation

restoresnapshot:

disallow new transactions from being created / disallow commands outside of transactions
wait for already running transactions to complete
delete underlying database files
copy snapshot database files to original place
continue normal operation

Any in-memory-objects could:

either be saved to disk as well
or held in memory: snapshot/restore would be blazingly fast

Now, what do you think?

The text was updated successfully, but these errors were encountered:

andreitokar · 2020-09-29T23:31:30Z

changing the database files while h2 is running doesn't seem right and asks for trouble

and a dozen lines later

delete underlying database files

copy snapshot database files to original place

Looks like a contradiction to me.
On the other hand, how this would be different from shutting down database, copying file from a known good copy and starting again. Should not take ages.

mickroll · 2020-09-30T09:28:29Z

I do not think it is a contradiction, because when triggered from within h2, the database can be prepared for the whole process. I see a lot of reasons why the whole snapshot-process should be covered by the database itself, and not from the outside:

Shutting down the database is not an option, because components using this database may continue to run across multiple tests. They would loose their connection etc.
Using the file-copy-mechanism should definitely be handled internally by h2 and not from the outside. And this is only one possible solution.
In-memory-databases do not rely on files and may be switched extremely fast.
Caching structures may also be stored alongside the database data.
Maybe snapshots are incorporated into the database files in the future, instead of duplicating them.
Maybe some kind of delta-storage is implemented in the future.

None of these future changes would break existing code that relies on the proposed createsnapshot/restoresnapshot functionality.

andreitokar · 2020-09-30T23:16:58Z

Shutting down the database is not an option, because components using this database may continue to run across multiple tests. They would loose their connection etc.

But what is there to loose, besides connection itself, if all transactions need to be closed, therefore all connections will be idle anyway? Also IMHO, it is not a good idea to keep connection open between test cases.

As far as use of in-memory database in conjunction with copying to/from real file - that popped up a few times before, i.e. here.

In any case, it is not a trivial exercise, but approach with shutdown / copy / start, or shutdown / start with initialization script (for in-memory case) may solve your problem today.

mickroll · 2020-10-01T08:35:36Z

i.e. here

These are the messages in the mentioned google group:

niklas...@gmail.com
20.06.2019, 11:32:38
an H2 Database
Hi,

We are using H2 for end-to-end tests for our JEE application.

So basically the setup is:

* start H2 (in memory) and application server
* run many testcases
* stop H2 and application server

Each testcase brings its own data and clears all tables before running.
Although H2 is pretty fast the setup of the data takes some time.

What I am thinking about is finding a way to start H2 e.g. from a file that contains some prefilled data and then only add the special data the test needs on top.
But I have not found any way to reset the database to the initial state after the test. 
Of course copying the file again and restarting H2 would work, but this is probably also slow when running very many tests.

Has anybody a similar setup and have you solved this problem somehow? Running in-memory but backed by a read-only file for example?

Thanks & best regards,
  Niklas

Noel Grandin
20.06.2019, 11:37:30
an h2-da...@googlegroups.com

This has come up several times before, and the reply I gave there was that we have a FilePath abstraction which sits
between our engine and various kind of storage backends (memory, disk, etc).

It should be possible to
(a) restore a disk file to an in-memory backend, since the FilePath abstraction looks like a filesystem
(b) make a copy of a populated database from an in-memory backend to a disk file.
(c) overwrite an existing in-memory database "virtual file" from another in-memory "virtual file"

However, to my knowledge, no-one has ever done the work to actually make this happen.

(not including a mention of savepoint/rollback, because this would need a single running transaction)

manticore-projects · 2020-10-01T15:30:54Z

Hi Mick.

Pardon my ignorance, but have you tried to simply Copy that H2 Database file (e. g. using Apache Commons IO) in order to create your snapshot?
We do exactly that on quite large databases (> 2GByte) running in server mode, even on a Windows Filesystem.
We ensure that there is no User Connected and have never seen any corruption or negative side effects.

Our biggest challenge from that practise is, that we have now a large zoo of such H2 database files laying around and when we want to migrate to a newer H2 version, we need to migrate all of them, while each may have its own incompatibilities (depending, when exactly they have been created). But that's the only Contra argument I have in mind.

Cheers

manticore-projects · 2020-10-01T15:35:05Z

Alternatively: Why not just loop through all schemas and tables (e.g. via JDBC DatabaseMetaData) and create snapshots as per sql CREATE TABLE target AS SELECT * FROM source;

If the schemas do not change, then you could add a timestamp field in order to identify your snapshot.
If the schemas change, then create new schemas with a timestamp alike suffix (e.g. MY_SCHEMA_20200930120000)

Cheers

mickroll · 2020-10-07T08:08:02Z

Hi @manticore-projects,

thanks for your ideas, but:

copying files won't work with our in-memory databases
looping through tables is not enough. There are sequences, views, changed constraints, maybe procedures... the lot

We now went with SCRIPT command for dumping and DROP ALL OBJECTS + executing the generated script for restoring.
This comes with the (now usual) workaround for #2390 (sorting VIEW-related commands to the end of the dump).

I hoped for a more native approach but it seems this won't happen in the near future.

regards,
Michael

mickroll closed this as completed Dec 9, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

support snapshots or similar #2906

support snapshots or similar #2906

mickroll commented Sep 29, 2020 •

edited

Loading

andreitokar commented Sep 29, 2020 •

edited

Loading

mickroll commented Sep 30, 2020 •

edited

Loading

andreitokar commented Sep 30, 2020

mickroll commented Oct 1, 2020

manticore-projects commented Oct 1, 2020

manticore-projects commented Oct 1, 2020

mickroll commented Oct 7, 2020 •

edited

Loading

support snapshots or similar #2906

support snapshots or similar #2906

Comments

mickroll commented Sep 29, 2020 • edited Loading

andreitokar commented Sep 29, 2020 • edited Loading

mickroll commented Sep 30, 2020 • edited Loading

andreitokar commented Sep 30, 2020

mickroll commented Oct 1, 2020

manticore-projects commented Oct 1, 2020

manticore-projects commented Oct 1, 2020

mickroll commented Oct 7, 2020 • edited Loading

mickroll commented Sep 29, 2020 •

edited

Loading

andreitokar commented Sep 29, 2020 •

edited

Loading

mickroll commented Sep 30, 2020 •

edited

Loading

mickroll commented Oct 7, 2020 •

edited

Loading