Add support for postgres bytea type #6987

villebro · 2019-03-06T19:40:18Z

Psycopg2 returns a memoryview object for the bytea type, which can be read by calling .tobytes(). Fixes #6981
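For readers unfamiliar with memoryview, here is a minimal sketch of the conversion described above. The value is built by hand as a stand-in for what a driver would return for a bytea column, so the payload is an assumption and no database is involved.

```python
# Stand-in for a bytea value as returned by the driver (assumed payload).
raw = memoryview(b"\x00\x01hello")

# .tobytes() copies the underlying buffer into an ordinary bytes object.
data = raw.tobytes()

# Decoding to text only works when the bytes happen to be valid UTF-8.
text = memoryview(b"hello").tobytes().decode("utf-8")
```

Note that the last step is only safe for textual payloads; arbitrary binary content may not decode.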

Before:

After:

codecov-io · 2019-03-06T20:04:40Z

Codecov Report

Merging #6987 into master will decrease coverage by <.01%.
The diff coverage is 50%.

@@            Coverage Diff             @@
##           master    #6987      +/-   ##
==========================================
- Coverage   64.38%   64.38%   -0.01%     
==========================================
  Files         421      421              
  Lines       20574    20576       +2     
  Branches     2251     2251              
==========================================
+ Hits        13247    13248       +1     
- Misses       7194     7195       +1     
  Partials      133      133

Impacted Files	Coverage Δ
superset/utils/core.py	`88.22% <50%> (-0.14%)`	⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update c1ba914...e876db5. Read the comment docs.

mmuru · 2019-03-16T21:34:44Z

@villebro: I tried to verify the fix in this PR. Now both preview and the SQL Lab run query throw the following exception:

2019-03-16 14:25:34,661:ERROR:root:'utf-8' codec can't decode byte 0xac in position 0: invalid start byte
Traceback (most recent call last):
  File "/Users/muru/muru-superset/superset/views/core.py", line 2613, in sql_json
    encoding=None,
  File "/Users/muru/muru-superset/venv367/lib/python3.6/site-packages/simplejson/__init__.py", line 399, in dumps
    **kw).encode(obj)
  File "/Users/muru/muru-superset/venv367/lib/python3.6/site-packages/simplejson/encoder.py", line 296, in encode
    chunks = self.iterencode(o, _one_shot=True)
  File "/Users/muru/muru-superset/venv367/lib/python3.6/site-packages/simplejson/encoder.py", line 378, in iterencode
    return _iterencode(o, 0)
  File "/Users/muru/muru-superset/superset/utils/core.py", line 378, in pessimistic_json_iso_dttm_ser
    return json_iso_dttm_ser(obj, pessimistic=True)
  File "/Users/muru/muru-superset/superset/utils/core.py", line 360, in json_iso_dttm_ser
    val = base_json_conv(obj)
  File "/Users/muru/muru-superset/superset/utils/core.py", line 344, in base_json_conv
    return str(obj.tobytes(), 'utf8')
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xac in position 0: invalid start byte
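The failure in the traceback can be reproduced without a database. This is a minimal sketch that applies the same expression as line 344 of the traceback to a hand-made memoryview; the two-byte payload is an assumption chosen only because it starts with 0xac, not the reporter's actual data.

```python
# Hypothetical binary payload; only the leading 0xac byte matters here.
value = memoryview(b"\xac\x00")

try:
    # Same conversion as in the traceback: decode the raw bytes as UTF-8.
    decoded = str(value.tobytes(), "utf8")
    error_message = None
except UnicodeDecodeError as exc:
    decoded = None
    error_message = str(exc)
```

The byte 0xac is a UTF-8 continuation byte, so it cannot start a character, which is why decoding fails at position 0.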

villebro · 2019-03-17T08:27:40Z

Ok let's reopen #6981 and take another stab at this. Any additional info you can give (postgres version, create table script, sample data that throws the error etc) will help track down the problem.

mmuru · 2019-03-18T23:00:24Z

@villebro:
The data must be in binary format. As I mentioned, it was a decoding issue: binary data was being decoded as UTF-8.

Here is a test case to reproduce the issue:

create table if not exists test_bytea (
b_data bytea
);

Please unzip the attached file and load the data using copy:
bytea.dat.zip

copy test_bytea from '/Users/muru/Downloads/bytea.dat' WITH (FORMAT Binary);

Ping me if you need any other information.

villebro · 2019-03-19T06:39:11Z

@mmuru I was unable to get preview mode working without explicit handling for memoryview, but making this work seems fairly straightforward. Would this be expected behaviour?

mmuru · 2019-03-19T14:22:12Z

@villebro: Yes, that's correct. I think that instead of displaying the content of the binary data, we should simply show "binary data" or similar text. In the released Superset version 0.28.1, preview mode shows "Unserializable [<class 'memoryview'>]" for a bytea column without any error. The issue is that the user should be able to run select * from table (where the table contains a bytea column), especially if the table has a lot of columns.
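One way to picture the placeholder idea is a JSON `default` hook that decodes when it can and falls back to a fixed label otherwise. This is only a sketch under assumptions: the function name `binary_safe_default`, the placeholder string, and the sample row are all hypothetical, it uses the standard-library `json` module rather than simplejson, and it is not the change that was eventually merged.

```python
import json


def binary_safe_default(obj):
    """Hypothetical JSON fallback for values json cannot serialize natively."""
    if isinstance(obj, memoryview):
        obj = obj.tobytes()
    if isinstance(obj, bytes):
        try:
            # Textual payloads are shown as-is.
            return obj.decode("utf-8")
        except UnicodeDecodeError:
            # Non-text payloads get a fixed placeholder instead of an error.
            return "<binary data>"
    raise TypeError(f"Unserializable object of type {type(obj).__name__}")


# Assumed sample row with one binary and one textual bytea value.
row = {"id": 1, "b_data": memoryview(b"\xac\x00"), "note": memoryview(b"ok")}
encoded = json.dumps(row, default=binary_safe_default)
```

With a hook like this, a select * over a table with a bytea column would serialize every row, at the cost of hiding the binary content.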

Add handling for memoryview

e876db5

villebro changed the title ~~Add handling for memoryview~~ Add support for postgres bytea type Mar 8, 2019

john-bodley reviewed Mar 13, 2019

mistercrunch merged commit 5e66008 into apache:master Mar 15, 2019

villebro mentioned this pull request Mar 19, 2019

Improve handling of bytes data #7062

Merged

mistercrunch added the 🏷️ bot and 🚢 0.34.0 labels Feb 28, 2024
