RedshiftTarget update_id too long for marker table #1003

Closed
kianho opened this issue Jun 11, 2015 · 8 comments

Comments

@kianho
Contributor

kianho commented Jun 11, 2015

The following exception was raised when executing an S3CopyToTable task:

Traceback (most recent call last):
  File "/home/kian/workspaces/contrib/luigi/luigi/worker.py", line 137, in run
    new_deps = self._run_get_new_deps()
  File "/home/kian/workspaces/contrib/luigi/luigi/worker.py", line 88, in _run_get_new_deps
    task_gen = self.task.run()
  File "/home/kian/workspaces/contrib/luigi/luigi/contrib/redshift.py", line 181, in run
    self.output().touch(connection)
  File "/home/kian/workspaces/contrib/luigi/luigi/postgres.py", line 163, in touch
    str(datetime.datetime.now())))
DataError: value too long for type character varying(256)

This occurred because the value inserted into the marker table's update_id
column was longer than 256 characters.

The offending code:

https://github.com/spotify/luigi/blob/master/luigi/task.py#L277

which is called from here:

https://github.com/spotify/luigi/blob/master/luigi/contrib/rdbms.py#L98-L102

The default update_id is the task_id, a string containing the task class
name and its parameters. In my case it was:

MyRedshiftTask(host=<really long AWS endpoint>:5439,
               database=dev, user=<username>,
               password=<really long password>,
               table=test_redshift_table_5439,
               local_tsv=./test.tsv,
               aws_access_key_id=xxxxxxxxxxxxxxxxxxxx,
               aws_secret_access_key=xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx,
               s3_load_path=s3://xxxxxxx/test.tsv)

which was considerably longer than 256 chars, due to the long AWS
endpoint, keys, and login details.

(edit: the proposed correction was wrong, removed it to avoid further confusion, see #1463 for the correct solution)

@joeshaw
Contributor

joeshaw commented Sep 3, 2015

Just ran into this issue as well. Thanks @kianho for figuring out the root cause.

@drewfustin

Same as above, thanks @kianho.

@matthewdu
Contributor

Yup same. Thanks @kianho

@dlstadther
Collaborator

I have run into this error before and just overwrote the update_id (as you did). If this is just an issue where we will handle the edge cases ourselves, can we close this?
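
For readers looking for the kind of override mentioned above, here is a minimal sketch of one way to keep the marker value within the 256-character column. It does not import Luigi, and it is not the fix that was eventually merged. The helper name `short_update_id`, the constant `MARKER_COLUMN_WIDTH`, and the choice of a SHA-1 digest are all assumptions made for illustration.

```python
import hashlib

# Column width taken from the error message: character varying(256).
MARKER_COLUMN_WIDTH = 256


def short_update_id(task_id, max_len=MARKER_COLUMN_WIDTH):
    """Hypothetical helper: return task_id unchanged if it fits in the
    marker column, otherwise '<TaskFamily>_<sha1 hex digest>'."""
    if len(task_id) <= max_len:
        return task_id
    # The task family is the part before the first "(" in the task_id.
    family = task_id.split("(", 1)[0]
    digest = hashlib.sha1(task_id.encode("utf-8")).hexdigest()
    # Reserve room for the underscore and the 40-character digest.
    return "{}_{}".format(family[: max_len - len(digest) - 1], digest)


# A task_id shaped like the one in the report, with placeholder values.
long_id = "MyRedshiftTask(host={}, password={})".format("h" * 200, "p" * 100)
update_id = short_update_id(long_id)
print(len(long_id), len(update_id))
```

The shortened value still differs for every distinct parameter set, which is what the marker table needs, and it no longer stores the parameter values (including credentials) in plain text.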

@joeshaw
Contributor

joeshaw commented Apr 20, 2016

I think this was fixed by #1444. Unfortunately, it introduces a migration problem; see #1578.

@drewfustin

Very timely comment @joeshaw. Appreciate it. I was just trying to figure out why my hacked update_id was all of a sudden causing "psycopg2.ProgrammingError: can't adapt type 'method'" and your message explained it. Just need to go through my ETL now and undo all my subclassing of S3CopyToTable.

@joeshaw
Contributor

joeshaw commented Apr 21, 2016

@drewfustin the "method" thing bit me as well. It turns out that update_id was changed from a method to a property. I found #1463 very helpful in showing me what I needed to change in my own code.
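
A minimal sketch of why a method-style override breaks once the base attribute is a property. It uses plain Python classes rather than Luigi's; the class names and `marker_value` are hypothetical. It assumes the calling code reads `self.update_id` without calling it, which would explain a bound method reaching psycopg2.

```python
class Base(object):
    """Stand-in for a target whose update_id is now a property."""

    @property
    def update_id(self):
        return "default_id"

    def marker_value(self):
        # Assumed behaviour: the attribute is read, not called.
        return self.update_id


class StaleOverride(Base):
    # Old-style override written when update_id was still a method.
    def update_id(self):
        return "short_id"


class FixedOverride(Base):
    # Override updated to match the property-based interface.
    @property
    def update_id(self):
        return "short_id"


stale = StaleOverride().marker_value()
fixed = FixedOverride().marker_value()
print(type(stale).__name__, fixed)
```

With the stale override, `marker_value()` returns a bound method instead of a string; adding `@property` to the override restores the string value.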

The number of breaking changes generally introduced in each Luigi version, and in 2.1.0 specifically, is very frustrating. The release notes for 2.1.0 make no mention of them, so unless you follow all the Luigi pull requests between releases, you're on your own when your code breaks inexplicably.

@dlstadther
Collaborator

Appears to have been fixed.


5 participants