Bad UTF8 handling in caching ? #841

Closed
marcinkoziej opened this issue Jul 28, 2016 · 2 comments
Labels
#bug:cant-reproduce Bugs that cannot be reproduced

Comments

@marcinkoziej

I use Caravel against a MySQL database encoded in UTF-8, with the default client charset in /etc/mysql/my.cnf set to utf8.

Despite that, I get UTF-8 exceptions related to Polish province names (stored in a column marked groupable and filterable).

I am running Caravel from the amancevice/caravel Docker image. I tried modifying the image to add a sitecustomize.py file that calls sys.setdefaultencoding('utf8'), but this did not solve my problem.
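
One thing that may be worth checking at this point (an assumption, not something confirmed in this thread): Python MySQL drivers do not necessarily read /etc/mysql/my.cnf, so the client charset set there may not apply to the connection Caravel opens. SQLAlchemy's MySQL dialects pass a charset query parameter from the database URI to the driver. The sketch below only shows how such a parameter could be appended to a URI; the helper name with_charset and the example URI (user, password, host, database) are hypothetical placeholders.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_charset(uri, charset="utf8"):
    """Return `uri` with its `charset` query parameter set to `charset`.

    Hypothetical helper for illustration; it is not part of Caravel.
    """
    parts = urlsplit(uri)
    # Drop any charset already present, then append the requested one.
    query = [(key, value) for key, value in parse_qsl(parts.query) if key != "charset"]
    query.append(("charset", charset))
    return urlunsplit(parts._replace(query=urlencode(query)))


# Placeholder credentials and host, not taken from the issue.
print(with_charset("mysql://user:password@db-host/mydb"))
```

Whether this changes the behaviour reported here is untested; it only rules out one possible source of a non-UTF-8 connection.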

The offending piece of data (obj) and the stack trace follow.

The offending string is probably '?\xf3dzkie', which should be 'łódzkie'. Many characters are replaced with ?, but for some reason 'ó' is converted differently. When I connect to the database with the mysql console, I see the correct characters.

{u'json_endpoint':
'/caravel/explore/table/9/?force=false&slice_name=&series=province&entity=campaign_title&show_legend=false&show_legend=y&granularity_sqla=campaign_start&size=count&flt_op_0=in&viz_type=bubble&since=1+year+ago&json=true&until=now&collapsed_fieldsets=&datasource_id=9&y=signature_count&flt_eq_0=&flt_col_0=campaign_title&slice_id=&where=&previous_viz_type=bubble&datasource_type=table&y_log_scale=false&limit=50&datasource_name=ad_contact_activity_first_campaign&x=campaign_start_metric&x_log_scale=false&time_grain_sqla=Time+Column&having=&max_bubble_size=25',
u'form_data': {'slice_name': u'', 'entity': u'campaign_title',
'show_legend': True, 'granularity_sqla': u'campaign_start', 'size':
u'count', 'flt_op_5': u'in', 'flt_op_4': u'in', 'flt_op_7': u'in',
'flt_op_6': u'in', 'flt_op_1': u'in', 'flt_op_0': u'in', 'flt_op_3':
u'in', 'flt_op_2': u'in', 'flt_op_9': u'in', 'flt_op_8': u'in',
'json': u'true', 'until': u'now', 'flt_col_7': u'campaign_title',
'flt_col_6': u'campaign_title', 'limit': u'50', 'async': u'', 'where':
u'', 'max_bubble_size': u'25', 'extra_filters': u'', 'force':
u'false', 'series': u'province', 'viz_type': u'bubble', 'since': u'1
year ago', 'x': u'campaign_start_metric', 'collapsed_fieldsets': u'',
'time_grain_sqla': u'Time Column', 'flt_eq_8': u'', 'flt_eq_9': u'',
'flt_eq_6': u'', 'flt_eq_7': u'', 'flt_eq_4': u'', 'flt_eq_5': u'',
'flt_eq_2': u'', 'flt_eq_3': u'', 'flt_eq_0': u'', 'flt_eq_1': u'',
'flt_col_1': u'campaign_title', 'flt_col_0': u'campaign_title',
'flt_col_3': u'campaign_title', 'flt_col_2': u'campaign_title',
'flt_col_5': u'campaign_title', 'flt_col_4': u'campaign_title',
'slice_id': u'', 'standalone': u'', 'flt_col_9': u'campaign_title',
'flt_col_8': u'campaign_title', 'previous_viz_type': u'bubble',
'y_log_scale': False, 'y': u'signature_count', 'x_log_scale': False,
'having': u''}, u'cache_key': '56639cc76f977e76612835636b43d141',
u'standalone_endpoint':
'/caravel/explore/table/9/?force=false&slice_name=&series=province&entity=campaign_title&show_legend=false&show_legend=y&granularity_sqla=campaign_start&size=count&flt_op_0=in&viz_type=bubble&since=1+year+ago&until=now&collapsed_fieldsets=&datasource_id=9&y=signature_count&flt_eq_0=&flt_col_0=campaign_title&slice_id=&standalone=true&where=&previous_viz_type=bubble&datasource_type=table&y_log_scale=false&limit=50&datasource_name=ad_contact_activity_first_campaign&x=campaign_start_metric&x_log_scale=false&time_grain_sqla=Time+Column&having=&max_bubble_size=25',
u'csv_endpoint':
'/caravel/explore/table/9/?force=false&slice_name=&series=province&entity=campaign_title&show_legend=false&show_legend=y&granularity_sqla=campaign_start&size=count&flt_op_0=in&viz_type=bubble&since=1+year+ago&csv=true&until=now&collapsed_fieldsets=&datasource_id=9&y=signature_count&flt_eq_0=&flt_col_0=campaign_title&slice_id=&where=&previous_viz_type=bubble&datasource_type=table&y_log_scale=false&limit=50&datasource_name=ad_contact_activity_first_campaign&x=campaign_start_metric&x_log_scale=false&time_grain_sqla=Time+Column&having=&max_bubble_size=25',
u'query': u"SELECT province AS province,\n campaign_title AS
campaign_title,\n COUNT(*) AS count,\n signature_count AS
signature_count,\n year(campaign_start) + month(campaign_start)/12 AS
campaign_start_metric\nFROM ad_contact_activity_first_campaign\nINNER
JOIN\n (SELECT province AS province__,\n campaign_title AS
campaign_title__\n FROM ad_contact_activity_first_campaign\n WHERE
campaign_start >= '2015-07-28 14:12:08.000000'\n AND campaign_start <=
'2016-07-28 14:12:08.000000'\n GROUP BY province,\n campaign_title\n
ORDER BY COUNT(*) DESC LIMIT 50) AS anon_1 ON province =
province__\nAND campaign_title = campaign_title__\nWHERE
campaign_start >= '2015-07-28 14:12:08.000000'\n AND campaign_start <=
'2016-07-28 14:12:08.000000'\nGROUP BY province,\n
campaign_title\nORDER BY count DESC LIMIT 5000", u'cached_dttm':
u'2016-07-28T14:12:22', u'cache_timeout': 600, u'data': [{u'values':
[{u'province': 'pomorskie', u'count': 2498, u'group': 'pomorskie',
u'signature_count': 1, u'shape': u'circle', u'campaign_title': 'Na
zaostrzenie ustawy antyaborcyjnej nie pozwolimy!', u'y': 1, u'x':
2016.3333, u'campaign_start_metric': 2016.3333, u'size': 2498},
{u'province': 'pomorskie', u'count': 1732, u'group': 'pomorskie',
u'signature_count': 1, u'shape': u'circle', u'campaign_title':
'ulaskawienie', u'y': 1, u'x': 2015.9167, u'campaign_start_metric':
2015.9167, u'size': 1732}, {u'province': 'pomorskie', u'count': 984,
u'group': 'pomorskie', u'signature_count': 2, u'shape': u'circle',
u'campaign_title': 'TK_orzeczenie', u'y': 2, u'x': 2016.25,
u'campaign_start_metric': 2016.25, u'size': 984}], u'key':
'pomorskie'}, {u'values': [{u'province': 'mazowieckie', u'count':
11991, u'group': 'mazowieckie', u'signature_count': 1, u'shape':
u'circle', u'campaign_title': 'Na zaostrzenie ustawy antyaborcyjnej
nie pozwolimy!', u'y': 1, u'x': 2016.3333, u'campaign_start_metric':
2016.3333, u'size': 11991}, {u'province': 'mazowieckie', u'count':
8675, u'group': 'mazowieckie', u'signature_count': 8, u'shape':
u'circle', u'campaign_title': 'ulaskawienie', u'y': 8, u'x':
2015.9167, u'campaign_start_metric': 2015.9167, u'size': 8675},
{u'province': 'mazowieckie', u'count': 3372, u'group': 'mazowieckie',
u'signature_count': 1, u'shape': u'circle', u'campaign_title':
'TK_orzeczenie', u'y': 1, u'x': 2016.25, u'campaign_start_metric':
2016.25, u'size': 3372}, {u'province': 'mazowieckie', u'count': 1335,
u'group': 'mazowieckie', u'signature_count': 2, u'shape': u'circle',
u'campaign_title': 'prywatnosc', u'y': 2, u'x': 2016.0833,
u'campaign_start_metric': 2016.0833, u'size': 1335}, {u'province':
'mazowieckie', u'count': 1262, u'group': 'mazowieckie',
u'signature_count': 26, u'shape': u'circle', u'campaign_title':
'powietrze', u'y': 26, u'x': 2015.75, u'campaign_start_metric':
2015.75, u'size': 1262}, {u'province': 'mazowieckie', u'count': 938,
u'group': 'mazowieckie', u'signature_count': 3, u'shape': u'circle',
u'campaign_title': 'trybunal', u'y': 3, u'x': 2015.9167,
u'campaign_start_metric': 2015.9167, u'size': 938}, {u'province':
'mazowieckie', u'count': 912, u'group': 'mazowieckie',
u'signature_count': 1, u'shape': u'circle', u'campaign_title': 'M?dry
dyrektor szko?y uczy tolerancji a nie nienawi?ci. Podpisz apel', u'y':
1, u'x': 2016.25, u'campaign_start_metric': 2016.25, u'size': 912},
{u'province': 'mazowieckie', u'count': 734, u'group': 'mazowieckie',
u'signature_count': 1, u'shape': u'circle', u'campaign_title':
'Bronimy polskiego rolnictwa i ?ywno?ci', u'y': 1, u'x': 2016.3333,
u'campaign_start_metric': 2016.3333, u'size': 734}, {u'province':
'mazowieckie', u'count': 658, u'group': 'mazowieckie',
u'signature_count': 5, u'shape': u'circle', u'campaign_title':
'Alimenty', u'y': 5, u'x': 2015.8333, u'campaign_start_metric':
2015.8333, u'size': 658}, {u'province': 'mazowieckie', u'count': 603,
u'group': 'mazowieckie', u'signature_count': 2, u'shape': u'circle',
u'campaign_title': 'molestowanie', u'y': 2, u'x': 2016.1667,
u'campaign_start_metric': 2016.1667, u'size': 603}, {u'province':
'mazowieckie', u'count': 511, u'group': 'mazowieckie',
u'signature_count': 5, u'shape': u'circle', u'campaign_title':
'marsz', u'y': 5, u'x': 2015.8333, u'campaign_start_metric':
2015.8333, u'size': 511}], u'key': 'mazowieckie'}, {u'values':
[{u'province': '?wi?tokrzyskie', u'count': 545, u'group':
'?wi?tokrzyskie', u'signature_count': 1, u'shape': u'circle',
u'campaign_title': 'Na zaostrzenie ustawy antyaborcyjnej nie
pozwolimy!', u'y': 1, u'x': 2016.3333, u'campaign_start_metric':
2016.3333, u'size': 545}], u'key': '?wi?tokrzyskie'}, {u'values':
[{u'province': 'zachodniopomorskie', u'count': 1955, u'group':
'zachodniopomorskie', u'signature_count': 1, u'shape': u'circle',
u'campaign_title': 'Na zaostrzenie ustawy antyaborcyjnej nie
pozwolimy!', u'y': 1, u'x': 2016.3333, u'campaign_start_metric':
2016.3333, u'size': 1955}, {u'province': 'zachodniopomorskie',
u'count': 809, u'group': 'zachodniopomorskie', u'signature_count': 7,
u'shape': u'circle', u'campaign_title': 'ulaskawienie', u'y': 7, u'x':
2015.9167, u'campaign_start_metric': 2015.9167, u'size': 809}],
u'key': 'zachodniopomorskie'}, {u'values': [{u'province':
'kujawsko-pomorskie', u'count': 1475, u'group': 'kujawsko-pomorskie',
u'signature_count': 1, u'shape': u'circle', u'campaign_title': 'Na
zaostrzenie ustawy antyaborcyjnej nie pozwolimy!', u'y': 1, u'x':
2016.3333, u'campaign_start_metric': 2016.3333, u'size': 1475},
{u'province': 'kujawsko-pomorskie', u'count': 820, u'group':
'kujawsko-pomorskie', u'signature_count': 5, u'shape': u'circle',
u'campaign_title': 'ulaskawienie', u'y': 5, u'x': 2015.9167,
u'campaign_start_metric': 2015.9167, u'size': 820}], u'key':
'kujawsko-pomorskie'}, {u'values': [{u'province': '?\xf3dzkie',
u'count': 2544, u'group': '?\xf3dzkie', u'signature_count': 3,
u'shape': u'circle', u'campaign_title': 'Na zaostrzenie ustawy
antyaborcyjnej nie pozwolimy!', u'y': 3, u'x': 2016.3333,
u'campaign_start_metric': 2016.3333, u'size': 2544}, {u'province':
'?\xf3dzkie', u'count': 1388, u'group': '?\xf3dzkie',
u'signature_count': 1, u'shape': u'circle', u'campaign_title':
'ulaskawienie', u'y': 1, u'x': 2015.9167, u'campaign_start_metric':
2015.9167, u'size': 1388}, {u'province': '?\xf3dzkie', u'count': 645,
u'group': '?\xf3dzkie', u'signature_count': 2, u'shape': u'circle',
u'campaign_title': 'TK_orzeczenie', u'y': 2, u'x': 2016.25,
u'campaign_start_metric': 2016.25, u'size': 645}], u'key':
'?\xf3dzkie'}, {u'values': [{u'province': '?l?skie', u'count': 3749,
u'group': '?l?skie', u'signature_count': 1, u'shape': u'circle',
u'campaign_title': 'Na zaostrzenie ustawy antyaborcyjnej nie
pozwolimy!', u'y': 1, u'x': 2016.3333, u'campaign_start_metric':
2016.3333, u'size': 3749}, {u'province': '?l?skie', u'count': 2019,
u'group': '?l?skie', u'signature_count': 2, u'shape': u'circle',
u'campaign_title': 'ulaskawienie', u'y': 2, u'x': 2015.9167,
u'campaign_start_metric': 2015.9167, u'size': 2019}, {u'province':
'?l?skie', u'count': 976, u'group': '?l?skie', u'signature_count': 2,
u'shape': u'circle', u'campaign_title': 'TK_orzeczenie', u'y': 2,
u'x': 2016.25, u'campaign_start_metric': 2016.25, u'size': 976},
{u'province': '?l?skie', u'count': 932, u'group': '?l?skie',
u'signature_count': 9, u'shape': u'circle', u'campaign_title':
'powietrze', u'y': 9, u'x': 2015.75, u'campaign_start_metric':
2015.75, u'size': 932}], u'key': '?l?skie'}, {u'values':
[{u'province': 'dolno?l?skie', u'count': 3882, u'group':
'dolno?l?skie', u'signature_count': 1, u'shape': u'circle',
u'campaign_title': 'Na zaostrzenie ustawy antyaborcyjnej nie
pozwolimy!', u'y': 1, u'x': 2016.3333, u'campaign_start_metric':
2016.3333, u'size': 3882}, {u'province': 'dolno?l?skie', u'count':
1948, u'group': 'dolno?l?skie', u'signature_count': 3, u'shape':
u'circle', u'campaign_title': 'ulaskawienie', u'y': 3, u'x':
2015.9167, u'campaign_start_metric': 2015.9167, u'size': 1948},
{u'province': 'dolno?l?skie', u'count': 889, u'group': 'dolno?l?skie',
u'signature_count': 2, u'shape': u'circle', u'campaign_title':
'TK_orzeczenie', u'y': 2, u'x': 2016.25, u'campaign_start_metric':
2016.25, u'size': 889}], u'key': 'dolno?l?skie'}, {u'values':
[{u'province': 'lubelskie', u'count': 983, u'group': 'lubelskie',
u'signature_count': 1, u'shape': u'circle', u'campaign_title': 'Na
zaostrzenie ustawy antyaborcyjnej nie pozwolimy!', u'y': 1, u'x':
2016.3333, u'campaign_start_metric': 2016.3333, u'size': 983},
{u'province': 'lubelskie', u'count': 619, u'group': 'lubelskie',
u'signature_count': 1, u'shape': u'circle', u'campaign_title':
'ulaskawienie', u'y': 1, u'x': 2015.9167, u'campaign_start_metric':
2015.9167, u'size': 619}], u'key': 'lubelskie'}, {u'values':
[{u'province': 'warmi?sko-mazurskie', u'count': 749, u'group':
'warmi?sko-mazurskie', u'signature_count': 1, u'shape': u'circle',
u'campaign_title': 'Na zaostrzenie ustawy antyaborcyjnej nie
pozwolimy!', u'y': 1, u'x': 2016.3333, u'campaign_start_metric':
2016.3333, u'size': 749}], u'key': 'warmi?sko-mazurskie'}, {u'values':
[{u'province': 'lubuskie', u'count': 688, u'group': 'lubuskie',
u'signature_count': 1, u'shape': u'circle', u'campaign_title': 'Na
zaostrzenie ustawy antyaborcyjnej nie pozwolimy!', u'y': 1, u'x':
2016.3333, u'campaign_start_metric': 2016.3333, u'size': 688}],
u'key': 'lubuskie'}, {u'values': [{u'province': 'wielkopolskie',
u'count': 3507, u'group': 'wielkopolskie', u'signature_count': 1,
u'shape': u'circle', u'campaign_title': 'Na zaostrzenie ustawy
antyaborcyjnej nie pozwolimy!', u'y': 1, u'x': 2016.3333,
u'campaign_start_metric': 2016.3333, u'size': 3507}, {u'province':
'wielkopolskie', u'count': 2061, u'group': 'wielkopolskie',
u'signature_count': 1, u'shape': u'circle', u'campaign_title':
'ulaskawienie', u'y': 1, u'x': 2015.9167, u'campaign_start_metric':
2015.9167, u'size': 2061}, {u'province': 'wielkopolskie', u'count':
1084, u'group': 'wielkopolskie', u'signature_count': 1, u'shape':
u'circle', u'campaign_title': 'TK_orzeczenie', u'y': 1, u'x': 2016.25,
u'campaign_start_metric': 2016.25, u'size': 1084}], u'key':
'wielkopolskie'}, {u'values': [{u'province': 'opolskie', u'count':
492, u'group': 'opolskie', u'signature_count': 2, u'shape': u'circle',
u'campaign_title': 'Na zaostrzenie ustawy antyaborcyjnej nie
pozwolimy!', u'y': 2, u'x': 2016.3333, u'campaign_start_metric':
2016.3333, u'size': 492}], u'key': 'opolskie'}, {u'values':
[{u'province': 'podkarpackie', u'count': 595, u'group':
'podkarpackie', u'signature_count': 1, u'shape': u'circle',
u'campaign_title': 'Na zaostrzenie ustawy antyaborcyjnej nie
pozwolimy!', u'y': 1, u'x': 2016.3333, u'campaign_start_metric':
2016.3333, u'size': 595}], u'key': 'podkarpackie'}, {u'values':
[{u'province': 'podlaskie', u'count': 549, u'group': 'podlaskie',
u'signature_count': 1, u'shape': u'circle', u'campaign_title': 'Na
zaostrzenie ustawy antyaborcyjnej nie pozwolimy!', u'y': 1, u'x':
2016.3333, u'campaign_start_metric': 2016.3333, u'size': 549}],
u'key': 'podlaskie'}, {u'values': [{u'province': 'ma?opolskie',
u'count': 5165, u'group': 'ma?opolskie', u'signature_count': 7,
u'shape': u'circle', u'campaign_title': 'powietrze', u'y': 7, u'x':
2015.75, u'campaign_start_metric': 2015.75, u'size': 5165},
{u'province': 'ma?opolskie', u'count': 3122, u'group': 'ma?opolskie',
u'signature_count': 1, u'shape': u'circle', u'campaign_title': 'Na
zaostrzenie ustawy antyaborcyjnej nie pozwolimy!', u'y': 1, u'x':
2016.3333, u'campaign_start_metric': 2016.3333, u'size': 3122},
{u'province': 'ma?opolskie', u'count': 2863, u'group': 'ma?opolskie',
u'signature_count': 12, u'shape': u'circle', u'campaign_title':
'Alarm', u'y': 12, u'x': 2015.9167, u'campaign_start_metric':
2015.9167, u'size': 2863}, {u'province': 'ma?opolskie', u'count':
2166, u'group': 'ma?opolskie', u'signature_count': 2, u'shape':
u'circle', u'campaign_title': 'ulaskawienie', u'y': 2, u'x':
2015.9167, u'campaign_start_metric': 2015.9167, u'size': 2166},
{u'province': 'ma?opolskie', u'count': 1876, u'group': 'ma?opolskie',
u'signature_count': 11, u'shape': u'circle', u'campaign_title':
'krakowbezsmogu', u'y': 11, u'x': 2015.9167, u'campaign_start_metric':
2015.9167, u'size': 1876}, {u'province': 'ma?opolskie', u'count':
1096, u'group': 'ma?opolskie', u'signature_count': 2, u'shape':
u'circle', u'campaign_title': 'TK_orzeczenie', u'y': 2, u'x': 2016.25,
u'campaign_start_metric': 2016.25, u'size': 1096}], u'key':
'ma?opolskie'}]}


2016-07-28 14:12:22,681:ERROR:root:'utf8' codec can't decode byte 0xf3 in position 1: invalid continuation byte
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/caravel/views.py", line 755, in explore
    payload = obj.get_json()
  File "/usr/local/lib/python2.7/dist-packages/caravel/viz.py", line 286, in get_json
    return self.json_dumps(payload)
  File "/usr/local/lib/python2.7/dist-packages/caravel/viz.py", line 294, in json_dumps
    raise e
UnicodeDecodeError: 'utf8' codec can't decode byte 0xf3 in position 1: invalid continuation byte
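
The pattern in the dump above (most Polish letters replaced by ?, but 'ó' surviving as the byte 0xf3) is what a lossy conversion to Latin-1 would produce: 'ó' exists in Latin-1 as 0xf3, while 'ł', 'ś', 'ą', 'ę' and 'ń' do not. The following Python 3 sketch reproduces both the mangled strings and the error message under that assumption. It does not show where in the MySQL/Caravel stack the conversion happens, and the helper name decode_error is made up for this example.

```python
# Python 3 sketch; the traceback above comes from Python 2.7, where the same
# bytes would sit in a plain `str`.

# Province names as they should appear, taken from the report above.
expected_names = ["łódzkie", "śląskie", "małopolskie"]

# Assumption: somewhere the text was converted to Latin-1, with characters
# that Latin-1 cannot represent replaced by "?".
mangled_names = [name.encode("latin-1", errors="replace") for name in expected_names]
for raw in mangled_names:
    print(raw)


def decode_error(raw):
    """Return the UnicodeDecodeError raised when decoding `raw` as UTF-8, or None."""
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError as exc:
        return exc
    return None


# 0xf3 would start a multi-byte UTF-8 sequence, but "d" is not a continuation byte.
error = decode_error(mangled_names[0])
print(error)
```

The names that contain no 'ó' decode without error because "?" is plain ASCII, which would explain why only the 'łódzkie' rows trigger the exception.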
@mistercrunch
Member

Are you sure it comes from caching? Do you still see this error if you disable caching? What's the Caravel version in that Docker image?

@mistercrunch mistercrunch added the #bug:cant-reproduce Bugs that cannot be reproduced label Jul 29, 2016
@xrmx xrmx added the unicode label Aug 9, 2016
@mistercrunch
Member

Notice: this issue has been closed because it has been inactive for 622 days. Feel free to comment and request for this issue to be reopened.

zhaoyongjie pushed a commit to zhaoyongjie/incubator-superset that referenced this issue Nov 17, 2021
…apache#841)

While resizing chart sometimes top results were filtered out because their sizes were too big. This
solution makes sure that top 10% of results will always be displayed by gradually scaling down the
chart if needed.