JDBC to Spanner support #60
JDBC to Spanner is already supported and we are testing the behavior. Could you clarify how to specify the config file? It currently fails to execute a job (maybe my config is at fault).
Precondition:
Cloud SQL table: a_table
Spanner table: ATable
Config manifest:
{
"sources": [
{
"name": "input",
"module": "jdbc",
"parameters": {
"query": "SELECT column_a, column_b, column_c FROM public.a_table",
"url": "jdbc:postgresql:///xxx?cloudSqlInstance=yyy:asia-northeast1:zzz&socketFactory=com.google.cloud.sql.postgres.SocketFactory",
"driver": "org.postgresql.Driver",
"user": "jjjj",
"password": "kkkk"
}
}
],
"sinks": [
{
"name": "spanner",
"module": "spanner",
"input": "input",
"parameters": {
"projectId": "xxxxxx",
"instanceId": "yyyyyyy",
"databaseId": "xxxxxxxxxx",
"table": "TableA",
"createTable": false,
"keyFields": ["ColumnA", "ColumnB", "ColumnC"] // extract part of the whole schema to choose supported types (exclude ARRAY and NUMERIC)
}
}
]
}
Execution error
Does this mean column names must be the same between Cloud SQL and Spanner, or is there any way to map columns between them?
Thank you for sharing the details of the use case! Is it possible to convert or refine the field names in the jdbc source module's query parameter, as follows?
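A minimal sketch of such a query rewrite, assuming the Spanner columns are named ColumnA, ColumnB, and ColumnC as in the keyFields above (only the jdbc source entry is shown):
{
  "name": "input",
  "module": "jdbc",
  "parameters": {
    "query": "SELECT column_a AS ColumnA, column_b AS ColumnB, column_c AS ColumnC FROM public.a_table",
    "url": "jdbc:postgresql:///xxx?cloudSqlInstance=yyy:asia-northeast1:zzz&socketFactory=com.google.cloud.sql.postgres.SocketFactory",
    "driver": "org.postgresql.Driver",
    "user": "jjjj",
    "password": "kkkk"
  }
}
Aliasing the selected columns to the Spanner column names lets the sink match fields without a separate mapping configuration.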
Thank you so much! I tried your suggestion and it worked perfectly as intended! I confirmed the original issue is resolved.
It seems that support for ARRAY and NUMERIC types will be necessary when submitting data to Spanner, so please feel free to continue with this issue here!
JDBC -> Spanner now supports NUMERIC, JSON, and ARRAY types: https://github.com/mercari/DataflowTemplate/tree/enhance-jdbc
Also, we added a new option. If the table you want to retrieve data from is huge and it takes a long time to retrieve the data, please consider using this option. The following is an example: https://github.com/mercari/DataflowTemplate/blob/enhance-jdbc/examples/jdbc-table-to-spanner.json
IAM database authentication using the Dataflow service account is now supported.
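A hypothetical end-to-end config exercising these new types is sketched below; the column names (id, tags, price, payload), their Postgres types (text[], numeric, jsonb), and the assumption that they map to ARRAY, NUMERIC, and JSON columns on the Spanner side are illustrative, not details from this thread:
{
  "sources": [
    {
      "name": "input",
      "module": "jdbc",
      "parameters": {
        "query": "SELECT id AS Id, tags AS Tags, price AS Price, payload AS Payload FROM public.a_table",
        "url": "jdbc:postgresql:///xxx?cloudSqlInstance=yyy:asia-northeast1:zzz&socketFactory=com.google.cloud.sql.postgres.SocketFactory",
        "driver": "org.postgresql.Driver",
        "user": "jjjj",
        "password": "kkkk"
      }
    }
  ],
  "sinks": [
    {
      "name": "spanner",
      "module": "spanner",
      "input": "input",
      "parameters": {
        "projectId": "xxxxxx",
        "instanceId": "yyyyyyy",
        "databaseId": "xxxxxxxxxx",
        "table": "ATable",
        "createTable": false,
        "keyFields": ["Id"]
      }
    }
  ]
}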
Thank you for the enhancement! We are testing with the Postgres schema below (there are other columns, but this one may be the only significant difference).
Postgres schema
Spanner schema
Actual data sample in Postgres
Error message
Other tables including the following columns succeeded in migration.
Thanks for trying it out right away!
Thank you for the prompt fix! I confirmed the above case worked correctly 👍 I'll try other cases, such as NUMERIC, and report if I find something!
WHAT
I was impressed that Spanner to JDBC (Cloud SQL) is supported. JDBC to Spanner would also be deeply appreciated.
WHY
We are migrating Cloud SQL for PostgreSQL to Spanner, and data migration is one of the most difficult parts.
When the column types used in PostgreSQL are primitive, pg_dump as CSV and the Cloud Storage Text to Cloud Spanner template are available. However, when using more complicated types, such as ARRAY and NUMERIC, we need a different method. HarbourBridge is introduced in the GCP docs, but according to its spec it doesn't work for data migration. The Avro to Spanner templates require using the same manifest format as the Spanner to Avro export, so it is difficult to use them for data migration even if we can use a data export tool such as spotify/dbeam.
If this feature is supported, many migrators will be happy!!!
Features