New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Error when writing data to postgres #37

Closed
sslivitzky opened this Issue Aug 30, 2013 · 16 comments

Comments

Projects
None yet
2 participants
@sslivitzky

sslivitzky commented Aug 30, 2013

Hi,

I get this error when migrating a mysql db:

Traceback (most recent call last):
File "/usr/local/bin/py-mysql2pgsql", line 5, in
pkg_resources.run_script('py-mysql2pgsql==0.1.6', 'py-mysql2pgsql')
File "/usr/local/lib/python2.7/dist-packages/setuptools-1.1-py2.7.egg/pkg_resources.py", line 492, in run_script
self.require(requires)[0].run_script(script_name, ns)
File "/usr/local/lib/python2.7/dist-packages/setuptools-1.1-py2.7.egg/pkg_resources.py", line 1350, in run_script
execfile(script_filename, namespace, namespace)
File "/usr/local/lib/python2.7/dist-packages/py_mysql2pgsql-0.1.6-py2.7.egg/EGG-INFO/scripts/py-mysql2pgsql", line 38, in
mysql2pgsql.Mysql2Pgsql(options).convert()
File "/usr/local/lib/python2.7/dist-packages/py_mysql2pgsql-0.1.6-py2.7.egg/mysql2pgsql/mysql2pgsql.py", line 31, in convert
Converter(reader, writer, self.file_options, self.run_options.verbose).convert()
File "/usr/local/lib/python2.7/dist-packages/py_mysql2pgsql-0.1.6-py2.7.egg/mysql2pgsql/lib/converter.py", line 51, in convert
self.writer.write_contents(table, self.reader)
File "/usr/local/lib/python2.7/dist-packages/py_mysql2pgsql-0.1.6-py2.7.egg/mysql2pgsql/lib/init.py", line 86, in decorated_function
ret = f(_args, *_kwargs)
File "/usr/local/lib/python2.7/dist-packages/py_mysql2pgsql-0.1.6-py2.7.egg/mysql2pgsql/lib/postgres_db_writer.py", line 193, in write_contents
self.copy_from(f, '"%s"' % table.name, ['"%s"' % c['name'] for c in table.columns])
File "/usr/local/lib/python2.7/dist-packages/py_mysql2pgsql-0.1.6-py2.7.egg/mysql2pgsql/lib/postgres_db_writer.py", line 116, in copy_from
columns=columns
psycopg2.extensions.QueryCanceledError: COPY from stdin failed: error in .read() call
CONTEXT: COPY tupdate, line 1

I'm running mysql 5.5, postgresql 9.2 and python 2.7.3.

I've read other posts on this issue and I've tried py-mysql2pgsql 0.1.5 with the same result.

Any help would be greatly appreciated.

Thanks,

Serge

@sslivitzky

This comment has been minimized.

sslivitzky commented Sep 3, 2013

I dig more and found out that the table giving the error has a column defined as longtext containing rows over 6000 characters maybe this is the problem. As a test I deleted all rows from the problematic table and the migration worked fine so I just need to fix this.

@kworr

This comment has been minimized.

Collaborator

kworr commented Sep 3, 2013

Probably I'm little late but this errors can be a result of malformed MySQL query too. You had to check both postgresql and mysql logs.

Anyway are we closing this one?

@sslivitzky

This comment has been minimized.

sslivitzky commented Sep 4, 2013

There's no error in the mysql logs but I get those one in postgres:

...
2013-09-04 08:48:26 EDT WARNING: nonstandard use of \ in a string literal at character 413
2013-09-04 08:48:26 EDT HINT: Use the escape string syntax for backslashes, e.g., E''.
2013-09-04 08:48:26 EDT WARNING: nonstandard use of \ in a string literal at character 205
2013-09-04 08:48:26 EDT HINT: Use the escape string syntax for backslashes, e.g., E''.
2013-09-04 08:48:26 EDT WARNING: nonstandard use of \ in a string literal at character 154
2013-09-04 08:48:26 EDT HINT: Use the escape string syntax for backslashes, e.g., E''.
2013-09-04 08:48:26 EDT WARNING: nonstandard use of \ in a string literal at character 174
2013-09-04 08:48:26 EDT HINT: Use the escape string syntax for backslashes, e.g., E''.
2013-09-04 08:48:26 EDT WARNING: nonstandard use of \ in a string literal at character 165
2013-09-04 08:48:26 EDT HINT: Use the escape string syntax for backslashes, e.g., E''.
2013-09-04 08:48:26 EDT WARNING: nonstandard use of \ in a string literal at character 139
2013-09-04 08:48:26 EDT HINT: Use the escape string syntax for backslashes, e.g., E''.
2013-09-04 08:48:26 EDT WARNING: nonstandard use of \ in a string literal at character 107
2013-09-04 08:48:26 EDT HINT: Use the escape string syntax for backslashes, e.g., E''.
2013-09-04 08:48:28 EDT WARNING: nonstandard use of \ in a string literal at character 103
2013-09-04 08:48:28 EDT HINT: Use the escape string syntax for backslashes, e.g., E''.
2013-09-04 08:48:28 EDT WARNING: nonstandard use of \ in a string literal at character 115
2013-09-04 08:48:28 EDT HINT: Use the escape string syntax for backslashes, e.g., E''.
2013-09-04 08:48:28 EDT WARNING: nonstandard use of \ in a string literal at character 225
2013-09-04 08:48:28 EDT HINT: Use the escape string syntax for backslashes, e.g., E''.
2013-09-04 08:48:28 EDT WARNING: nonstandard use of \ in a string literal at character 126
2013-09-04 08:48:28 EDT HINT: Use the escape string syntax for backslashes, e.g., E''.
2013-09-04 08:48:28 EDT WARNING: nonstandard use of \ in a string literal at character 232
2013-09-04 08:48:28 EDT HINT: Use the escape string syntax for backslashes, e.g., E''.
2013-09-04 08:48:28 EDT ERROR: COPY from stdin failed: error in .read() call
2013-09-04 08:48:28 EDT CONTEXT: COPY tupdate, line 1
2013-09-04 08:48:28 EDT STATEMENT: COPY "tupdate"("id","type","id_update_package","filename","checksum","previous_checksum","svn_version","data","data_rollback","description","db_table_name","db_field_name","db_field_value") FROM stdin WITH DELIMITER AS ' ' NULL AS '\N'

@sslivitzky

This comment has been minimized.

sslivitzky commented Sep 4, 2013

I tried to output to a file and I got this error:

Traceback (most recent call last):
File "/usr/local/bin/py-mysql2pgsql", line 5, in
pkg_resources.run_script('py-mysql2pgsql==0.1.6', 'py-mysql2pgsql')
File "/usr/local/lib/python2.7/dist-packages/setuptools-1.1-py2.7.egg/pkg_resources.py", line 492, in run_script
self.require(requires)[0].run_script(script_name, ns)
File "/usr/local/lib/python2.7/dist-packages/setuptools-1.1-py2.7.egg/pkg_resources.py", line 1350, in run_script
execfile(script_filename, namespace, namespace)
File "/usr/local/lib/python2.7/dist-packages/py_mysql2pgsql-0.1.6-py2.7.egg/EGG-INFO/scripts/py-mysql2pgsql", line 38, in
mysql2pgsql.Mysql2Pgsql(options).convert()
File "/usr/local/lib/python2.7/dist-packages/py_mysql2pgsql-0.1.6-py2.7.egg/mysql2pgsql/mysql2pgsql.py", line 31, in convert
Converter(reader, writer, self.file_options, self.run_options.verbose).convert()
File "/usr/local/lib/python2.7/dist-packages/py_mysql2pgsql-0.1.6-py2.7.egg/mysql2pgsql/lib/converter.py", line 51, in convert
self.writer.write_contents(table, self.reader)
File "/usr/local/lib/python2.7/dist-packages/py_mysql2pgsql-0.1.6-py2.7.egg/mysql2pgsql/lib/init.py", line 86, in decorated_function
ret = f(_args, *_kwargs)
File "/usr/local/lib/python2.7/dist-packages/py_mysql2pgsql-0.1.6-py2.7.egg/mysql2pgsql/lib/postgres_file_writer.py", line 135, in write_contents
pr(table, row)
File "/usr/local/lib/python2.7/dist-packages/py_mysql2pgsql-0.1.6-py2.7.egg/mysql2pgsql/lib/postgres_writer.py", line 153, in process_row
row[index] = Binary(row[index]).getquoted()[1:-8] if row[index] else row[index]
TypeError: can't escape unicode to binary

@kworr

This comment has been minimized.

Collaborator

kworr commented Sep 4, 2013

04.09.2013 18:48, Serge wrote:

I tried to output to a file and I got this error:

Can you try something like this:

Index: mysql2pgsql/lib/postgres_writer.py

--- mysql2pgsql/lib/postgres_writer.py (revision 63)
+++ mysql2pgsql/lib/postgres_writer.py (working copy)
@@ -150,7 +150,7 @@
row[index] = bin(ord(row[index]))[2:]
elif isinstance(row[index], (str, unicode, basestring)):
if column_type == 'bytea':

  •                row[index] = Binary(row[index]).getquoted()[1:-8] 
    
    if row[index] else row[index]
  •                row[index] = 
    
    Binary(row[index].encode('utf-8')).getquoted()[1:-8] if row[index] else
    row[index]
    elif 'text[' in column_type:
    row[index] = '{%s}' % ','.join('"%s"' %
    v.replace('"', r'"') for v in row[index].split(','))
    else:

The better way would be some sample table with data that yields this error.

Sphinx of black quartz, judge my vow.

@sslivitzky

This comment has been minimized.

sslivitzky commented Sep 4, 2013

I can try to work with what you gave me but I also can send you a mysqldump of the problematic table.

@sslivitzky

This comment has been minimized.

sslivitzky commented Sep 4, 2013

The zip dump file is 3.3 MB.

@sslivitzky

This comment has been minimized.

sslivitzky commented Sep 4, 2013

I did the modification and I don't have an error any more, the migration complete successfully. Do you still want some sample data?

@kworr

This comment has been minimized.

Collaborator

kworr commented Sep 5, 2013

Well... I'd prefer to thoroughly test such changes and don't commit blind patches to the code. And I'm not feeling like I did everything right.

I think it's time to hand-craft sample mysql database creation script to be included for testing purposes...

@sslivitzky

This comment has been minimized.

sslivitzky commented Sep 5, 2013

Let me know the best way to send you the dump file, I didn't see any upload button.

@kworr

This comment has been minimized.

Collaborator

kworr commented Sep 5, 2013

Let's start from the simple things. Can you provide table schema for this table without any data so I can try to guess what's going on? On MySQL you can get table schema with "SHOW CREATE TABLE <table_name>;".

@sslivitzky

This comment has been minimized.

sslivitzky commented Sep 5, 2013

CREATE TABLE tupdate (
id int(11) unsigned NOT NULL,
type enum('code','db_data','db_schema','binary') DEFAULT NULL,
id_update_package int(11) unsigned NOT NULL DEFAULT '0',
filename varchar(250) DEFAULT '',
checksum varchar(250) DEFAULT '',
previous_checksum varchar(250) DEFAULT '',
svn_version int(4) unsigned NOT NULL DEFAULT '0',
data longtext,
data_rollback longtext,
description text,
db_table_name varchar(140) DEFAULT '',
db_field_name varchar(140) DEFAULT '',
db_field_value varchar(1024) DEFAULT '',
PRIMARY KEY (id),
KEY id_update_package (id_update_package),
CONSTRAINT tupdate_ibfk_1 FOREIGN KEY (id_update_package) REFERENCES tupdate_package (id) ON DELETE CASCADE ON UPDATE CASCADE
) ENGINE=InnoDB DEFAULT CHARSET=latin1

@kworr

This comment has been minimized.

Collaborator

kworr commented Sep 5, 2013

Yes, I was totally wrong. Second column is misdetected as bytea because it's definition contains predefined word 'binary'. You can confirm this by looking at the contents of the second field (which should contain garbage after converting).

Better fix would be putting enum higher so that it would grab enum with any content that further types can claim:

Index: postgres_writer.py
===================================================================
--- postgres_writer.py  (revision 63)
+++ postgres_writer.py  (working copy)
@@ -104,10 +104,6 @@
                     return default, 'time with time zone'
                 else:
                     return default, 'time without time zone'
-            elif 'blob' in column['type'] or 'binary' in column['type']:
-                return default, 'bytea'
-            elif column['type'] in ('tinytext', 'mediumtext', 'longtext', 'text'):
-                return default, 'text'
             elif re.search(r'^enum', column['type']):
                 default = (' %s::character varying' % default) if t(default) else None
                 enum = re.sub(r'^enum\(|\)$', '', column['type'])
@@ -114,6 +110,10 @@
                 # TODO: will work for "'.',',',''''" but will fail for "'.'',','.'"
                 max_enum_size = max([len(e.replace("''", "'")) for e in enum.split("','")])
                 return default, ' character varying(%s) check(%s in (%s))' % (max_enum_size,     column['name'], enum)
+            elif 'blob' in column['type'] or 'binary' in column['type']:
+                return default, 'bytea'
+            elif column['type'] in ('tinytext', 'mediumtext', 'longtext', 'text'):
+                return default, 'text'
             elif 'bit(' in column['type']:
                 return ' DEFAULT %s' % column['default'].upper() if column['default'] else column['default'], 'varbit(%s)' % re.search(r'\((\d+)\)', column['type']).group(1)
             elif 'set(' in column['type']:

Please test whether second field in your table would turn into the string.

@sslivitzky

This comment has been minimized.

sslivitzky commented Sep 5, 2013

This is strange. After setting the bytea_output to escape, the value for the second column are shown correctly in psql.

@sslivitzky

This comment has been minimized.

sslivitzky commented Sep 9, 2013

I removed the previous mod we did and applied this one.

The migration worked without any error and the column value was ok.

pandora=> \d tupdate
Table "public.tupdate"
Column | Type | Modifiers
-------------------+-------------------------+-------------------------------
id | bigint | not null
type | character varying(9) |
id_update_package | bigint | not null default 0
filename | character varying(250) | default ''::character varying
checksum | character varying(250) | default ''::character varying
previous_checksum | character varying(250) | default ''::character varying
svn_version | bigint | not null default 0
data | text |
data_rollback | text |
description | text |
db_table_name | character varying(140) | default ''::character varying
db_field_name | character varying(140) | default ''::character varying
db_field_value | character varying(1024) | default ''::character varying
Indexes:
"tupdate_id_pkey" PRIMARY KEY, btree (id)
"tupdate_id_update_package" btree (id_update_package)
Check constraints:
"tupdate_type_check" CHECK (type::text = ANY (ARRAY['code'::character varying, 'db_data'::character varying, 'db_schema'::character varying, 'binary'::character varying]::text[]))
Foreign-key constraints:
"tupdate_id_update_package_fkey" FOREIGN KEY (id_update_package) REFERENCES tupdate_package(id)
Referenced by:
TABLE "tupdate_journal" CONSTRAINT "tupdate_journal_id_update_fkey" FOREIGN KEY (id_update) REFERENCES tupdate(id)

Thanks.

@kworr

This comment has been minimized.

Collaborator

kworr commented Oct 1, 2013

I've pushed a final fix. Closing.

@kworr kworr closed this Oct 1, 2013

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment