Context / Goal
The tool is designed to help validate migration of data across different schema types, and even across (relational) database implementations, and works primarily on hashed data representing each row in a dataset. We therefore want a way to validate that it actually works correctly for this purpose.
Because drivers can differ in how they handle things such as character encodings, number types, and timestamp/date types, we want to ensure that the hashed data representing a value of a given type in one database is hash-identical to that of the equivalent value in another. If it is not, there should be a way to make it so using something simple in SQL, or we should probably change our implementation.
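To make the failure mode concrete, here is a minimal Kotlin sketch (not the actual `HashedRow` implementation) of how hashing raw driver values ties the digest to whichever JVM type the driver happened to return:

```kotlin
import java.nio.ByteBuffer
import java.security.MessageDigest

// Minimal sketch: hashing the raw value a driver returns makes the digest
// depend on the JVM type (Int vs Long), not just the logical value.
fun naiveHash(value: Any): String {
    val bytes = when (value) {
        is Int -> ByteBuffer.allocate(4).putInt(value).array()
        is Long -> ByteBuffer.allocate(8).putLong(value).array()
        is String -> value.toByteArray(Charsets.UTF_8)
        else -> value.toString().toByteArray(Charsets.UTF_8)
    }
    return MessageDigest.getInstance("SHA-256")
        .digest(bytes)
        .joinToString("") { "%02x".format(it) }
}

fun main() {
    // One driver surfaces an INT column as Int, another surfaces BIGINT as Long:
    println(naiveHash(10))   // digest of a 4-byte representation
    println(naiveHash(10L))  // digest of an 8-byte representation -- different!
}
```

A SQL-side workaround would be an explicit cast in the source query (e.g. `SELECT CAST(int_col AS BIGINT) ...`), but requiring that for every mismatched type pair pushes the burden onto users.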
Expected Outcome
- Modify/refactor the `MultiDataSourceConnectivityIntegrationTest` from "Configure integration test tooling to allow testing across DB types" #28 so that, instead of just testing R2DBC connectivity via Micronaut, it runs a suite of simple scenarios focused on real integration testing of type differences
- It is likely that the hashing impl in `HashedRow` is not going to work correctly. An `int` with value `10` in one DB will likely not be considered equal to a `long` with value `10` in another DB, and similarly for other types. We will have to make some decisions about how values should be canonicalized, and how configurable that needs to be (see the sketch after this list)
  - Should a string of `"10"` be considered equal to a `bigint` of `10`?
- Run simple but real reconciliations that focus on ensuring that hashed values of a given type from one DB are equivalent to those from a different DB
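One possible shape for canonicalization, purely as a sketch (the `canonicalize` function and its rules are hypothetical, not a committed design): widen each value to a single representative per type family before hashing. It deliberately does not coerce strings to numbers, reflecting the open `"10"` vs `bigint` question above.

```kotlin
import java.math.BigDecimal
import java.time.Instant
import java.time.LocalDate
import java.time.LocalDateTime
import java.time.ZoneOffset

// Hypothetical canonicalization: widen each value to one representative per
// type family so an INT 10 from MySQL and a BIGINT 10 from Postgres hash
// identically. Strings are intentionally left alone (open design question).
fun canonicalize(value: Any?): Any? = when (value) {
    null -> null
    is Byte, is Short, is Int, is Long -> (value as Number).toLong()
    is Float, is Double -> BigDecimal.valueOf((value as Number).toDouble()).stripTrailingZeros()
    is BigDecimal -> value.stripTrailingZeros()          // drop scale differences: 10.00 == 10
    is LocalDateTime -> value.toInstant(ZoneOffset.UTC)  // assume UTC for wall-clock types
    is LocalDate -> value.atStartOfDay().toInstant(ZoneOffset.UTC)
    is Instant -> value
    else -> value.toString()
}
```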
Out of Scope
- Anything we don't support in terms of types (without SQL coercion) from "Support mapping/hashing of more common data types" #27
Additional context / implementation notes
- At time of writing we are using the Exposed framework in test code to generate schemas for testing with. This may not give us the level of control over data types in the databases that we require, and may need to be re-evaluated.
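For reference, the Exposed DSL in the test code looks roughly like this (simplified; the table and column names are illustrative). Exposed picks the dialect-specific SQL type for each column, which is convenient but limits precise control over the exact column type under test:

```kotlin
import org.jetbrains.exposed.sql.Table

// Simplified sketch of Exposed schema generation. integer() maps to whatever
// the dialect's default integer type is (INT on MySQL, INTEGER on Postgres),
// so we cannot easily pin down variants like DATETIME vs DATETIME2 on MSSQL.
object TestData : Table("testdata") {
    val id = integer("id")
    val testTypeColumn = varchar("test_type_column", 255)
    override val primaryKey = PrimaryKey(id)
}
```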
- Possibly these tests could run in a matrix style, with simple dataset queries on each side like `SELECT id as MigrationKey, test_type_column FROM testdata`, creating a single table with a single test data column per test (see the sketch below)
  - Dimension 1: DB (mysql, postgres, mssql)
  - Dimension 2: DB type under test (`CHAR`/`VARCHAR`, `INT`/`INTEGER`, `BIGINT`, `NUMERIC`/`DECIMAL`/`REAL`/`FLOAT`, `DATETIME`/`DATE`/`TIME`/`TIMESTAMP`, etc.)
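A rough sketch of how that matrix could be driven as a JUnit 5 parameterized test (all class, method, and type names here are hypothetical, and the real Micronaut/Testcontainers wiring is omitted):

```kotlin
import org.junit.jupiter.params.ParameterizedTest
import org.junit.jupiter.params.provider.Arguments
import org.junit.jupiter.params.provider.MethodSource

// Hypothetical matrix test: each case pairs a source and target DB with the
// column type under test; the body would run
//   SELECT id as MigrationKey, test_type_column FROM testdata
// on both sides and assert the hashed rows reconcile with zero mismatches.
class TypeMatrixReconciliationTest {

    companion object {
        @JvmStatic
        fun typeMatrix(): List<Arguments> {
            val databases = listOf("mysql", "postgres", "mssql")
            val columnTypes = listOf("VARCHAR(255)", "INTEGER", "BIGINT", "DECIMAL(10,2)", "TIMESTAMP")
            // Cross-product of Dimension 1 (source DB x target DB) and Dimension 2 (type)
            return databases.flatMap { source ->
                databases.flatMap { target ->
                    columnTypes.map { type -> Arguments.of(source, target, type) }
                }
            }
        }
    }

    @ParameterizedTest
    @MethodSource("typeMatrix")
    fun `hashes match across databases`(source: String, target: String, columnType: String) {
        // Sketch only: create testdata with a single column of columnType in
        // both DBs, load identical values, reconcile, assert zero mismatches.
    }
}
```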