Support for Redshift types like NUMERIC(20, 0) #39

joeschmid · 2019-11-06T13:52:03Z

Thanks for the work on this project! We're just trying out Singer for moving data from MySQL to Redshift. In MySQL we have a column of type bigint(18) unsigned. Some values in this column don't fit in Redshift's bigint column type, and we get errors like: Overflow (Long valid range -9223372036854775808 to 9223372036854775807)

Typically we declare a Redshift column as NUMERIC(20, 0) to hold these values. Is there a way to tell target-redshift to use that type for a particular Redshift column?
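For anyone following along, here is a minimal sketch of why the overflow happens and why 20 digits are enough. It assumes only the standard ranges of signed and unsigned 64-bit integers; the helper names are made up for illustration and are not part of target-redshift.

```python
# Signed 64-bit range, i.e. what a Redshift BIGINT column can hold.
SIGNED_BIGINT_MIN = -(2 ** 63)
SIGNED_BIGINT_MAX = 2 ** 63 - 1

# Largest value a MySQL BIGINT UNSIGNED column can hold.
UNSIGNED_BIGINT_MAX = 2 ** 64 - 1


def fits_redshift_bigint(value: int) -> bool:
    """True if the value fits in a signed 64-bit integer."""
    return SIGNED_BIGINT_MIN <= value <= SIGNED_BIGINT_MAX


def fits_numeric_20_0(value: int) -> bool:
    """True if the value has at most 20 decimal digits (precision 20, scale 0)."""
    return len(str(abs(value))) <= 20


print(fits_redshift_bigint(UNSIGNED_BIGINT_MAX))  # False: this is the overflow case
print(fits_numeric_20_0(UNSIGNED_BIGINT_MAX))     # True: 2**64 - 1 has 20 digits
```

Any unsigned value above 9223372036854775807 fails the first check but passes the second, which is the gap NUMERIC(20, 0) covers.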

AlexanderMann · 2019-11-07T19:20:31Z

@joeschmid thanks for the kind words! We're always looking to make Target-Redshift better, so we really appreciate questions like this.

There is currently no supported way to do what you're asking. There have been conversations in the past about building tooling to detect data widths, so that we can use tighter constraints inside Redshift and avoid the penalties of having TEXT columns everywhere instead of, say, VARCHAR(20).

There is some work coming down the pipe which will make a number of these improvements simpler in the future, but what the "future" here means is pretty up in the air.

Given this, I don't think the most expedient way for you to resolve your issue is to wait for this feature.

I'd be happy to help walk you through what changes I would expect you'd need to make to get things working if that's useful to you?

joeschmid · 2019-11-07T19:48:35Z

@AlexanderMann thanks very much for the update and explanation. That all makes sense. If you wouldn't mind walking through the changes to get this scenario working, I'd appreciate it. (And maybe others who come across similar issues would see the explanation here and it would help them out.)

AlexanderMann · 2019-11-11T16:40:32Z

@joeschmid no problem. So I will start by saying that the way to "get this working" is to fork this repo, and start trying to get what you're after working. I'm also not sure if it'll "work" or end up being a 🐰 🕳

Worth noting, Stitch also doesn't "support" this: https://www.stitchdata.com/docs/destinations/redshift/#data-limits

Integer range
-9223372036854775808 to 9223372036854775807
Integer values outside of this range will be rejected and logged in the _sdc_rejected table.

Easiest Option

Make all integers NUMERIC(20, 0)

Pros

Prolly be straightforward and simple.

Cons

Column widths will balloon for all integers. Redshift (last I checked) uses the full declared width of a column for every value in that column, whereas PostgreSQL only uses as much space as the data in each row needs.

Changes

In these lines, you're just going to map JSONSchema's integer type to Redshift's NUMERIC(20, 0): https://github.com/datamill-co/target-redshift/blob/master/target_redshift/redshift.py#L97-L118

For more examples of what that'd look like, check in here: https://github.com/datamill-co/target-postgres/blob/master/target_postgres/postgres.py#L806-L870
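As a rough illustration of the kind of mapping described above, here is a standalone sketch. It is not copied from target-redshift or target-postgres: the lookup table, the function name, and the nullability handling are all assumptions made for the example, and the real code at the linked lines may be structured differently. NUMERIC(20, 0) here means precision 20 and scale 0, as in the issue title.

```python
from typing import Any, Dict, List

# Hypothetical lookup table for illustration only; not taken from the
# target-redshift source.
JSON_TYPE_TO_REDSHIFT: Dict[str, str] = {
    "integer": "NUMERIC(20, 0)",  # widened from BIGINT to hold unsigned 64-bit values
    "number": "DOUBLE PRECISION",
    "boolean": "BOOLEAN",
}


def json_schema_to_sql_type(schema: Dict[str, Any]) -> str:
    """Return a Redshift column type for a simple JSON Schema property."""
    raw_type = schema.get("type", [])
    json_types: List[str] = [raw_type] if isinstance(raw_type, str) else list(raw_type)

    # JSON Schema marks nullable fields by including "null" in the type list.
    nullable = "null" in json_types
    concrete = [t for t in json_types if t != "null"]
    if len(concrete) != 1 or concrete[0] not in JSON_TYPE_TO_REDSHIFT:
        raise ValueError(f"unsupported JSON Schema type: {raw_type!r}")

    sql_type = JSON_TYPE_TO_REDSHIFT[concrete[0]]
    return sql_type if nullable else f"{sql_type} NOT NULL"


print(json_schema_to_sql_type({"type": ["null", "integer"]}))
print(json_schema_to_sql_type({"type": "integer"}))
```

A fork would likely also need the reverse direction (reading an existing NUMERIC column back as a JSON Schema integer) to stay consistent, but that part is a guess and is not shown here.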

awm33 · 2020-05-16T21:09:10Z

@joeschmid I'm not sure if you resolved this, but a hack (for you and for anyone looking at this issue) would be to create a view where that column is a text/string type, then use a SQL transform to parse it into a custom numeric type after replication.
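A sketch of what that workaround could look like, written as two small Python helpers that only build SQL strings. The table, view, and column names are hypothetical, identifiers are interpolated without quoting (fine for a sketch, not for untrusted input), and whether your tap replicates views is something to verify for your setup.

```python
from typing import Sequence


def mysql_text_view_sql(table: str, view: str, big_column: str,
                        other_columns: Sequence[str]) -> str:
    """Build a MySQL CREATE VIEW that exposes the unsigned bigint column as text."""
    select_list = ", ".join(
        list(other_columns) + [f"CAST({big_column} AS CHAR) AS {big_column}"]
    )
    return f"CREATE VIEW {view} AS SELECT {select_list} FROM {table}"


def redshift_cast_sql(table: str, column: str) -> str:
    """Build a Redshift SELECT that parses the replicated text into NUMERIC(20, 0)."""
    return (
        f"SELECT CAST({column} AS NUMERIC(20, 0)) AS {column}_numeric "
        f"FROM {table}"
    )


# Hypothetical names, for illustration only.
print(mysql_text_view_sql("orders", "orders_for_sync", "account_id", ["id", "created_at"]))
print(redshift_cast_sql("orders_for_sync", "account_id"))
```

The first statement would run on the MySQL side before replication; the second is the post-replication transform on the Redshift side.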

Atif8Ted mentioned this issue Nov 17, 2021

mysql to redshift datatype mapping for bigint causing overflow error transferwise/pipelinewise-target-redshift#113

Open
