More advanced NUMERIC data conversion #101
The proposed change makes sense to me. Any thoughts @ewencp? I know you were looking at #89 before. @clumsy does this also take care of point 1 you described for JDBC `NUMERIC` columns?
I think there are compatibility concerns to take care of, since we'd start producing different types for columns that currently come through as `Decimal`. If it's just Oracle, the mapping for `NUMERIC` columns could perhaps be special-cased.
Yes, it does cover point 1. One has to put a `CAST` around such expressions to get a usable scale. I would not say that the change is breaking, though. I believe there are several ways we can go, but one thing I'm certain of is that I don't want my primitive columns to be stored as `BigDecimal`.
@clumsy Yeah, this all makes sense and this is definitely a common problem folks are running into (with lots of confused "why am I just getting a bunch of bytes" questions). I agree the solution you propose makes sense -- the way it is implemented now is definitely simpler to explain (one JDBC type maps to a single Connect/Java type), but isn't great for some databases. Re: breaking changes, this definitely is a breaking change, i.e. backwards incompatible, for anyone already relying on `Decimal` values.
@ewencp Agree, makes sense. The pull request will follow shortly.
The PR is done.
Hi,

```sql
SQL> create table bdtk3 (ID integer NOT NULL, lastname varchar(100), CID integer);

Table created.

SQL> desc bdtk3
 ID    NOT NULL NUMBER(38)
```

Then the check `if (metadata.getScale(col) == 0 && precision < 20)` returns false for integers (Oracle reports them as `NUMBER(38)`), and we still hit the error: "Invalid type for incrementing column: BYTES"
@bdrouvot Thank you for trying this out. If the column has the default precision of 38, we can't safely treat that as an int even with the patch. You will have to explicitly create it with a precision that we can map to the int type, e.g. scale 0 and a precision below 20 so that the check you quoted passes.
@shikhar on second thought, what if we substitute the casting logic with an additional configuration parameter where a user can specify the desired type to match their expectations? The idea is that one can supply a list of columns and the types to enforce in the result schema. Whenever a type is explicitly specified, the connector will try to convert to the expected value by mirroring the current logic, but the other way around. The value resolution in this case would work something like the sketch below:
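The original code block here did not survive extraction; as an illustration of that resolution order, here is a minimal hedged sketch (`OverridingResolver`, `typeOverrides`, and `resolve` are names invented for this sketch, not the commenter's code):

```java
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.Map;

import org.apache.kafka.connect.data.Schema;

// Hypothetical sketch: resolve a column's value by first honouring an
// explicit user override, then falling back to the automatic logic.
public class OverridingResolver {
    private final Map<String, Schema.Type> typeOverrides; // column name -> requested type

    public OverridingResolver(Map<String, Schema.Type> typeOverrides) {
        this.typeOverrides = typeOverrides;
    }

    Object resolve(ResultSet rs, String column) throws SQLException {
        Schema.Type requested = typeOverrides.get(column);
        if (requested != null) {
            // Mirror the current conversion logic "the other way around":
            // read the value coerced into the user-requested type.
            switch (requested) {
                case INT32:   return rs.getInt(column);
                case INT64:   return rs.getLong(column);
                case FLOAT64: return rs.getDouble(column);
                case STRING:  return rs.getString(column);
                default:      break; // fall through to the automatic mapping
            }
        }
        return rs.getObject(column); // stand-in for the connector's existing logic
    }
}
```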
@clumsy I agree that the automatic precision mapping as in #104 is messy and may not do what the user actually wants. There are many use-cases where it's tempting to add advanced configuration like you are describing. Instead of doing it on a per-connector basis, we would like to add framework-level support for reusable, configurable transformations. Then you can imagine configuring something like the example below:
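The concrete example was lost in extraction. As a hedged sketch of how such a configuration might look, expressed as a Java config map: only the `transforms` / `transforms.<alias>.type` key pattern follows the real Connect single message transform convention; the `TypeMarshaller` class and its `spec` key are hypothetical, matching the idea discussed in this thread.

```java
import java.util.Map;

public class TransformConfigSketch {
    public static void main(String[] args) {
        Map<String, String> connectorConfig = Map.of(
            "connector.class", "io.confluent.connect.jdbc.JdbcSourceConnector",
            "transforms", "marshal",
            "transforms.marshal.type", "com.example.TypeMarshaller", // hypothetical class
            "transforms.marshal.spec", "ID:int32,AMOUNT:float64"     // hypothetical syntax
        );
        connectorConfig.forEach((k, v) -> System.out.println(k + "=" + v));
    }
}
```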
@shikhar This is exactly what I was thinking of (please check my previous updated post). P.S.
@clumsy that was just a proposition; we'd need some framework-level support for transformations and a specific transformer implementation like the 'TypeMarshaller'. It would only be able to kick in after the JDBC source connector has handed over records to the framework, and so wouldn't be able to rely on the typed JDBC `ResultSet` anymore. If you need this today, another solution is to use a custom `Converter`.
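To make the shape of that idea concrete, here is a hedged skeleton against Connect's `org.apache.kafka.connect.transforms.Transformation` API; `TypeMarshaller` is still the hypothetical name from the comment above, and the body only marks where a real implementation would rebuild the schema and value:

```java
import java.util.Map;

import org.apache.kafka.common.config.ConfigDef;
import org.apache.kafka.connect.connector.ConnectRecord;
import org.apache.kafka.connect.transforms.Transformation;

// Hedged sketch of a single message transform. A real implementation
// would rebuild the record's schema and value according to a user spec.
public class TypeMarshaller<R extends ConnectRecord<R>> implements Transformation<R> {

    @Override
    public void configure(Map<String, ?> configs) {
        // parse a user-supplied column -> type spec here
    }

    @Override
    public R apply(R record) {
        // coerce fields of record.value() to the requested types, then
        // return record.newRecord(...) with the rebuilt schema and value
        return record;
    }

    @Override
    public ConfigDef config() {
        return new ConfigDef();
    }

    @Override
    public void close() {
    }
}
```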
@clumsy Is your enhancement available?
@stewartbryson only as a PoC I use locally. I can share a gist.
hi @clumsy, thanks for raising this issue, and for the fix related to it. My case is a bit different, but it's still interesting to see how you handled this: I'm using Connect to sync two Postgres databases, and all my numeric data is converted to the `bytea` type, which is unreadable.
@clumsy I am also interested in this. Can you share the gist, or do you have a PR already that I could try out?
OK, due to multiple inquiries, here is the PoC that I used in my project: clumsy/kafka-connect-jdbc@754c79fc517f8bfe75b7b07e670b5d1a64505dbf What this PoC does:
Please note:
I do believe that this is the only sane solution for JDBC data sources; there are just too many implementations out there, and unless we tell them what we need, we will always be surprised. Hope this helps.
A note to others who might run into this issue: a NUMBER column with undefined precision in Oracle DB is reported by the driver as having scale = -127. After adding a workaround for this, I got most of our use cases working.
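A minimal sketch of one possible workaround, assuming the guard sits wherever the connector reads `ResultSetMetaData` (the class and method names are invented, and normalising the sentinel to scale 0 is only one choice; what substitute is right depends on the data):

```java
import java.sql.ResultSetMetaData;
import java.sql.SQLException;

// Hedged sketch: Oracle's driver reports scale = -127 for NUMBER columns
// declared without precision. Normalising that sentinel before any
// precision/scale check keeps such columns from being handled with a
// nonsensical negative scale.
public class OracleScaleWorkaround {
    static int effectiveScale(ResultSetMetaData metadata, int col) throws SQLException {
        int scale = metadata.getScale(col);
        // -127 is Oracle's "undefined" marker, not a real scale
        return scale == -127 ? 0 : scale;
    }
}
```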
When's the ETA for this getting resolved? I still have this problem.
Hello, I'm encountering the issue described in #33; the toLogical conversion seems to fail.
@nylund have you resolved this? I have the same issue with type NUMBER (no precision).
@boristyukin I worked around this specific issue in a local branch but eventually gave up on kafka-connect with Oracle, as we faced a bunch of other issues.
@niknyl thanks man. I ended up not using Connect for Oracle and used StreamSets instead, as I needed to create a quick demo. This is unfortunate, as Oracle is still the king of the database world. I wish I could fix this myself, but I have not touched Java in years :) Sqoop had similar issues with our source system, but at least it offered a custom mapping. I did try a Kafka single message transform to convert the data types, but there is another GitHub issue with that one, as it does not support number-type/byte conversion right now either. Looks like a dead end.
The JDBC connector now supports a `numeric.mapping` configuration option.
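For reference, `numeric.mapping=best_fit` asks the source connector to map `NUMERIC` columns to the narrowest matching Connect type instead of `Decimal` bytes. A sketch of a source configuration carrying the option, written as a Java map (the connection details are placeholders):

```java
import java.util.Map;

public class NumericMappingExample {
    public static void main(String[] args) {
        Map<String, String> config = Map.of(
            "connector.class", "io.confluent.connect.jdbc.JdbcSourceConnector",
            "connection.url", "jdbc:oracle:thin:@//db.example.com:1521/ORCL", // placeholder
            "mode", "incrementing",
            "incrementing.column.name", "ID",
            // map NUMERIC to int8/int16/int32/int64/float64 when precision and scale allow
            "numeric.mapping", "best_fit"
        );
        config.forEach((k, v) -> System.out.println(k + "=" + v));
    }
}
```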
Hi. I tried to use the `Cast` transform and hit the exception below:

```
org.apache.kafka.connect.errors.ConnectException: Tolerance exceeded in error handler
	at org.apache.kafka.connect.runtime.errors.RetryWithToleranceOperator.execAndHandleError(RetryWithToleranceOperator.java:178)
	at org.apache.kafka.connect.runtime.errors.RetryWithToleranceOperator.execute(RetryWithToleranceOperator.java:104)
	at org.apache.kafka.connect.runtime.TransformationChain.apply(TransformationChain.java:44)
	at org.apache.kafka.connect.runtime.WorkerSinkTask.convertAndTransformRecord(WorkerSinkTask.java:532)
	at org.apache.kafka.connect.runtime.WorkerSinkTask.convertMessages(WorkerSinkTask.java:490)
	at org.apache.kafka.connect.runtime.WorkerSinkTask.poll(WorkerSinkTask.java:321)
	at org.apache.kafka.connect.runtime.WorkerSinkTask.iteration(WorkerSinkTask.java:225)
	at org.apache.kafka.connect.runtime.WorkerSinkTask.execute(WorkerSinkTask.java:193)
	at org.apache.kafka.connect.runtime.WorkerTask.doRun(WorkerTask.java:175)
	at org.apache.kafka.connect.runtime.WorkerTask.run(WorkerTask.java:219)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.kafka.connect.errors.DataException: Unexpected type in Cast transformation: BYTES
	at org.apache.kafka.connect.transforms.Cast.convertFieldType(Cast.java:206)
	at org.apache.kafka.connect.transforms.Cast.getOrBuildSchema(Cast.java:168)
	at org.apache.kafka.connect.transforms.Cast.applyWithSchema(Cast.java:137)
	at org.apache.kafka.connect.transforms.Cast.apply(Cast.java:107)
	at org.apache.kafka.connect.runtime.TransformationChain.lambda$apply$0(TransformationChain.java:44)
	at org.apache.kafka.connect.runtime.errors.RetryWithToleranceOperator.execAndRetry(RetryWithToleranceOperator.java:128)
	at org.apache.kafka.connect.runtime.errors.RetryWithToleranceOperator.execAndHandleError(RetryWithToleranceOperator.java:162)
	... 14 more
```

For my use case, I can use … Can this be a situation that just occurs in Microsoft SQL Server? Or is this happening with the Oracle Numeric/Decimal datatype too? Any help will be appreciated at this point.
Hi,
I was trying to use kafka-connect-jdbc to feed from an Oracle DB.
Here are a few blocker issues I was able to identify:
1. For `NUMERIC` columns that are the result of aggregation, the driver will say that the scale is 0 even though the result is a floating-point value, which results in an exception when trying to create a `BigDecimal` with scale 0 from a value with a fraction. This happens because the default rounding policy for `BigDecimal` is `ROUND_UNNECESSARY`, which throws the exception. There's already an issue raised for that: BigDecimal has mismatching scale value for given Decimal schema #44. Users should be advised to use a `CAST` function or an alternative to tackle such problems.
2. Oracle has no dedicated `BIT`, `TINYINT`, `SMALLINT`, `INTEGER`, or `BIGINT` types. Instead it represents them with `NUMBER(precision,scale)`, which according to the current `DataConverter` implementation maps to `NUMERIC` and is handled by the `DECIMAL` conversion, resulting in `BigDecimal` values. Using `BigDecimal` is overkill for storing values that are known to be in the range of the datatypes listed above.

I suggest either providing a way of specifying a custom type mapper or changing the default one to be like this:
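The code block that followed here did not survive extraction; the following is a hedged reconstruction of the idea from the surrounding discussion, not the author's original proposal (class and method names are invented, and the precision thresholds simply mirror the ranges of the Java integer types):

```java
import org.apache.kafka.connect.data.Decimal;
import org.apache.kafka.connect.data.SchemaBuilder;

// Hedged sketch: narrow scale-0 NUMERIC columns to Connect integer types
// when the precision is known to fit; otherwise keep the Decimal logical type.
public class NumericSchemaMapper {
    static SchemaBuilder schemaFor(int precision, int scale) {
        if (scale == 0 && precision > 0) {
            if (precision < 3)  return SchemaBuilder.int8();   // fits TINYINT
            if (precision < 5)  return SchemaBuilder.int16();  // fits SMALLINT
            if (precision < 10) return SchemaBuilder.int32();  // fits INTEGER
            if (precision < 19) return SchemaBuilder.int64();  // fits BIGINT
        }
        return Decimal.builder(scale); // fall back to the current behaviour
    }
}
```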
This will also require the following modification to the value conversion:
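Again a hedged reconstruction rather than the original snippet: the read side would pick the typed JDBC getter matching the narrowed schema chosen above.

```java
import java.sql.ResultSet;
import java.sql.SQLException;

// Hedged sketch of the matching value conversion: use the typed JDBC
// getter corresponding to the narrowed schema.
public class NumericValueConverter {
    static Object valueFor(ResultSet rs, int col, int precision, int scale) throws SQLException {
        if (scale == 0 && precision > 0) {
            if (precision < 3)  return rs.getByte(col);
            if (precision < 5)  return rs.getShort(col);
            if (precision < 10) return rs.getInt(col);
            if (precision < 19) return rs.getLong(col);
        }
        return rs.getBigDecimal(col); // current Decimal path
    }
}
```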
I can provide a pull-request if you agree with the proposed change.