Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RealDictRow.items() behave inconsistently on duplicate row names #884

Closed
ProgVal opened this Issue Apr 4, 2019 · 6 comments

Comments

Projects
None yet
3 participants
@ProgVal
Copy link

ProgVal commented Apr 4, 2019

Hi,

When a SELECT query has two columns with the same name, cur.fetchone().items() behaves inconsistently:

>>> import psycopg2, psycopg2.extras
>>> db = psycopg2.connect('service=swh-replica', cursor_factory=psycopg2.extras.RealDictCursor)
>>> cur = db.cursor()
>>> cur.execute('SELECT 1 AS a, 2 AS a;')
>>> row = cur.fetchone()
>>> list(row.items())
[('a', 2), ('a', 2)]
>>> row
{'a': 2}
>>> len(row)
1

row.items() having a length not equal to len(row) causes bugs in users of this dictionary that expect them to be equal.

eg. we noticed this bug in a script that queries psycopg2 with a duplicate column name (unintentionally), and the RealDictRow ends up being passed to msgpack.packb, which produces corrupted data when this happens.

This issue happens on psycopg2 2.8, but not 2.7.7.

@dvarrazzo

This comment has been minimized.

Copy link
Member

dvarrazzo commented Apr 4, 2019

The bug is in the query: if you are using a name-based mapping object and you are passing a query with duplicate names you are asking for troubles.

There is nothing we can reasonably fix here. Fix your query.

@dvarrazzo dvarrazzo closed this Apr 4, 2019

@ProgVal

This comment has been minimized.

Copy link
Author

ProgVal commented Apr 4, 2019

I understand this is caused by my query, but this issue took some time to track down: it only happened in a particular build because it doesn't happen versions of psycopg2 + it seemed to happen in an unrelated module (msgpack).

Could psycopg2 raise an error when instead of building such a dict?

@dvarrazzo

This comment has been minimized.

Copy link
Member

dvarrazzo commented Apr 4, 2019

I'll look into that

dvarrazzo added a commit that referenced this issue Apr 6, 2019

More robust RealDictRow population
In the presence of repeated columns the mapping would have remained into
into the dictionary. Now it is removed.

Fix #884
@dvarrazzo

This comment has been minimized.

Copy link
Member

dvarrazzo commented Apr 6, 2019

Issue fixed on master, but it slipped from release 2.8.1. Because of fixing #886, queries with repeated columns will fail in a different way. Call it "undefined result".

After 2.8.2 the result will be likely to leave only the last value in the record.

@ProgVal

This comment has been minimized.

Copy link
Author

ProgVal commented Apr 6, 2019

Thanks!

@erakli

This comment has been minimized.

Copy link

erakli commented Apr 11, 2019

Moreover, in 2.8.1 this behaviour appears in conjunction with RealDictCursor (closely related to #886):

>>> postgres = psycopg2.connect(...)
>>> cur = postgres.cursor(cursor_factory=psycopg2.extras.RealDictCursor)
>>> cur.execute("select 1 as a, 2 as a")
>>> cur.fetchall()
[RealDictRow([(<class 'psycopg2.extras.RealDictRow'>, ['a', 'a']), ('a', 2)])]

This key - instance of <class 'psycopg2.extras.RealDictRow'> - breaks kwargs expansion with error TypeError: __init__() keywords must be strings.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.