[bugfix] Fixing regression introduced in #4396 #4500

john-bodley · 2018-02-28T06:49:04Z

This PR fixes a regression introduced in #4396 which returned the error No data even when the query was unsuccessful for other reasons. It also refactors the No data logic (previously defined in two places) and registers that the data was loaded only if the query didn't fail. Note that BaseViz.get_df(...) may never throw an exception given this (and thus I'm uncertain whether the try/except logic is needed), hence the additional check that the query succeeded is necessary to confirm that the data was loaded.

@mistercrunch from our offline conversation, I found it difficult to write specific test cases for this, i.e., we discovered this regression in Presto when the underlying table metadata had changed: either a column or the table no longer existed. When I tried to mock this behavior, I was unable to exercise the issue, as in SQLite the inconsistency was discovered whilst the query was being compiled rather than during query execution.
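
To illustrate the SQLite behavior described above, here is a minimal sketch using Python's built-in sqlite3 module. The table t, the column missing_col, and the helper compile_error are hypothetical names chosen for illustration. EXPLAIN asks SQLite to compile a statement without running it, and a reference to a missing column is already rejected at that stage, so the failure never reaches query execution.

```python
import sqlite3


def compile_error(sql):
    """Return the error SQLite raises while compiling `sql`, or None."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE t (a INTEGER)")  # hypothetical table
        # EXPLAIN compiles the statement into bytecode without running it,
        # so an error raised here comes from compilation, not execution.
        conn.execute("EXPLAIN " + sql)
        return None
    except sqlite3.OperationalError as ex:
        return str(ex)
    finally:
        conn.close()
```

Calling compile_error("SELECT missing_col FROM t") returns an error string, whereas compile_error("SELECT a FROM t") returns None.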

to: @mistercrunch
cc: @graceguo-supercat @michellethomas @timifasubaa

john-bodley · 2018-02-28T06:57:53Z

superset/viz.py

@@ -327,8 +322,9 @@ def get_df_payload(self, query_obj=None):
        if query_obj and not is_loaded:
            try:


The try/except block may no longer be required per the PR description. This would make the stacktrace obsolete as well.

john-bodley · 2018-02-28T16:27:30Z

superset/viz.py

        if self.status != utils.QueryStatus.FAILED:
+            if df is None or df.empty:
+                raise Exception('No data')


Note I'm not certain why we need to throw an exception here, as previously the error would be present in error_message.

mistercrunch · 2018-02-28T17:55:52Z

I feel like we need some unit tests here making sure errors bubble up somehow. It should be easy to generate a No Data exception with a filter. For the other test case that is more representative (currently says No Data but should say something else) you could reference a metric that does not exist in the query, and change the query method to make sure it actually raises a specific message Metric referenced does not exist anymore and make sure it comes through.

john-bodley · 2018-02-28T18:07:00Z

@mistercrunch for the latter, simply referencing an invalid metric doesn't actually cover the specific case we discovered, as the compiler detects this, i.e., prior to this fix it wouldn't return No data. I can look more into this.

john-bodley · 2018-02-28T19:10:09Z

@mistercrunch I've updated the logic and added a couple of unit tests which test the two scenarios.

john-bodley · 2018-02-28T19:12:12Z

superset/viz.py

@@ -151,9 +151,6 @@ def get_df(self, query_obj=None):
        # If the datetime format is unix, the parse will use the corresponding
        # parsing logic.
        if df is None or df.empty:


Note that the df.empty check is only required by the test_get_df_returns_empty_df unit test due to the mocking. Otherwise it would be safe to proceed with an empty pd.DataFrame.

john-bodley · 2018-02-28T19:12:52Z

superset/viz.py

        if self.status != utils.QueryStatus.FAILED:
-            payload['data'] = self.get_data(df)
+            if df is None or df.empty:
+                payload['error'] = 'No data'


It's more consistent to log the payload error as opposed to throwing an exception here.
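
As a rough sketch of the approach in this hunk (not the actual superset/viz.py code), the function below records No data on the payload instead of raising, and leaves an existing error message untouched when the query failed. The names build_payload, FAILED and SUCCESS are hypothetical stand-ins, and df.to_dict(...) stands in for self.get_data(df).

```python
import pandas as pd

FAILED = "failed"    # stand-in for utils.QueryStatus.FAILED
SUCCESS = "success"  # hypothetical success status


def build_payload(status, df, error_message=None):
    """Simplified, hypothetical version of the payload logic discussed here."""
    payload = {"status": status, "error": error_message, "data": None}
    if status != FAILED:
        if df is None or df.empty:
            # Report the empty result on the payload rather than raising.
            payload["error"] = "No data"
        else:
            # Stand-in for self.get_data(df) in the real viz classes.
            payload["data"] = df.to_dict(orient="records")
    return payload
```

With this shape, a failed query keeps its original error message, and only a successful but empty result is reported as No data.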

john-bodley · 2018-02-28T19:13:26Z

superset/viz.py

@@ -612,7 +611,7 @@ def query_obj(self):
        return None

    def get_df(self, query_obj=None):
-        return None
+        return pd.DataFrame()


I added this for consistency, i.e. get_df(...) should return a pd.DataFrame().

john-bodley · 2018-03-02T06:52:56Z

@mistercrunch would you mind taking another look at this? I made a few small tweaks after adding a couple of unit tests.

michellethomas · 2018-03-06T00:01:44Z

lgtm

mistercrunch · 2018-03-06T00:57:00Z

Sorry for the delay, was traveling last week with limited attention. Thanks @michellethomas for merging it!

mistercrunch · 2018-03-06T17:44:48Z

I think this broke the brittle markup viz in trunk. It has to do with the fact that the payload gets an error key that says No data, and the frontend then refuses to render the viz.

john-bodley · 2018-03-06T18:09:39Z

Sorry @mistercrunch. I think I may have a fix and will add an additional unit test for the markup test. I think the solution should be:

class MarkupViz(BaseViz):
    def get_df(self, query_obj=None):
        return None  # like before

and

def get_payload(self, query_obj=None):
    ...
    if df is not None and df.empty:  # previously if df is None or df.empty
        payload['error'] = 'No data'
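
A minimal sketch of the distinction this relies on, assuming pandas: None means no query was run (as for the markup viz), whereas an empty pd.DataFrame means a query ran and returned no rows. The helper name no_data_error is hypothetical.

```python
import pandas as pd


def no_data_error(df):
    """Return 'No data' only when a query ran and produced no rows."""
    # None signals that the viz issued no query, so there is nothing to report.
    if df is not None and df.empty:
        return "No data"
    return None
```

Here no_data_error(None) yields no error, so a viz without a query can still render, while no_data_error(pd.DataFrame()) yields No data.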

mistercrunch · 2018-03-06T18:43:50Z

Sounds about right, this area is pretty brittle around the no-query special cases...

(cherry picked from commit ef4e5ec)

* Cherry pick apache#4581
* Add flask-compress cherry
* Add shortner fix
* Add Return __time in Druid scan apache#4504
* Picking cherry Fixing regression from apache#4500 (apache#4549)
* [bugfix] SQL Lab 'MySQL has gone away'

It appears the 'MySQL has gone away' is triggered by the line of code I wrapped in a try block here. This is a temporary fix, there will be another PR shortly getting to the bottom of this.

Related: https://github.com/lyft/druidstream/issues/40

[bugfix] Fixing regression introduced in apache#4396


john-bodley force-pushed the john-bodley-fix-pr-4396 branch from 3408fd6 to 269754f Compare February 28, 2018 17:35

john-bodley force-pushed the john-bodley-fix-pr-4396 branch from 269754f to 11c9e8d Compare February 28, 2018 19:05


[payload] Fixing regression introducted in #apache#4396

7440d34

john-bodley force-pushed the john-bodley-fix-pr-4396 branch from 11c9e8d to 7440d34 Compare February 28, 2018 22:23

john-bodley changed the title ~~[payload] Fixing regression introduced in ##4396~~ [bugfix] Fixing regression introduced in ##4396 Mar 1, 2018

john-bodley changed the title ~~[bugfix] Fixing regression introduced in ##4396~~ [bugfix] Fixing regression introduced in #4396 Mar 1, 2018

john-bodley merged commit 48430a1 into apache:master Mar 6, 2018

john-bodley deleted the john-bodley-fix-pr-4396 branch March 6, 2018 00:51

john-bodley mentioned this pull request Mar 6, 2018

[bugfix] Fixing regression from #4500 #4549

Merged

john-bodley pushed a commit to john-bodley/superset that referenced this pull request Mar 6, 2018

[bugfix] Fixing regression from apache#4500

aef8ad3

mistercrunch pushed a commit that referenced this pull request Mar 7, 2018

[bugfix] Fixing regression from #4500 (#4549)

ef4e5ec

john-bodley added a commit to john-bodley/superset that referenced this pull request Mar 7, 2018

[bugfix] Fixing regression from apache#4500 (apache#4549)

7d65550

mistercrunch pushed a commit to lyft/incubator-superset that referenced this pull request Mar 14, 2018

[bugfix] Fixing regression from apache#4500 (apache#4549)

f9cd8a6

(cherry picked from commit ef4e5ec)

michellethomas pushed a commit to michellethomas/panoramix that referenced this pull request May 24, 2018

[bugfix] Fixing regression from apache#4500 (apache#4549)

e0e2179

wenchma pushed a commit to wenchma/incubator-superset that referenced this pull request Nov 16, 2018

Merge pull request apache#4500 from john-bodley/john-bodley-fix-pr-4396

95e48da

[bugfix] Fixing regression introduced in apache#4396

wenchma pushed a commit to wenchma/incubator-superset that referenced this pull request Nov 16, 2018

[bugfix] Fixing regression from apache#4500 (apache#4549)

c0319f6
