New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WiP] Shorten URL for explore #3795

Closed
graceguo-supercat opened this Issue Nov 7, 2017 · 4 comments

Comments

Projects
None yet
4 participants
@graceguo-supercat
Contributor

graceguo-supercat commented Nov 7, 2017

Currently Superset uses stateful url for explore chart. We append all query parameters in GET request. When user open a slice, or change one control to make a new query, Superset make a new GET request. Superset also provided a shorten url function, but this shorten url is only generated per user request (click a button from UI). Recently we see many failed requests within airbnb, when GET request's parameters are too many too long, which cause chrome browser throw 400 Bad Request error.

I am trying to provide a solution for this problem:

  1. For each explore view request, we generate a cache_key for it, which is the MD5 hash of sorted query parameters. The slice/explore URL will be like:
    http://localhost:8088/superset/explore/table/313/92bb019eda2cfb5ffc8a33cb7bae733f
    Currently we already stored form_data as part of cache content with cache_key.

  2. Server-side parses cache_key to get parameters, and pass it to front-end as part of bootstrap data.

  3. In front-end, when user interact with controls, or update markdown slice, the arbitrary amount of query parameters, as well as the content of Markdown slice, will be POST to the server in the body of the request message. see #2886.

  4. When generating dashboard bootstrap data, we should use shorten format as slice URL.

  5. Explore view will still accept query parameters in GET request. These url parameters will be merged with parameters hash.

@graceguo-supercat graceguo-supercat self-assigned this Nov 7, 2017

@graceguo-supercat graceguo-supercat changed the title from WiP Shorten URL for explore view to [WiP] Shorten URL for explore view Nov 7, 2017

@graceguo-supercat graceguo-supercat changed the title from [WiP] Shorten URL for explore view to [WiP] Shorten URL for explore Nov 7, 2017

@mistercrunch

This comment has been minimized.

Contributor

mistercrunch commented Nov 8, 2017

We mentioned a temporary solution at the meeting, @fabianmenges what did you use for compression? lz-strings?
http://pieroxy.net/blog/pages/lz-string/index.html

@fabianmenges

This comment has been minimized.

Contributor

fabianmenges commented Nov 8, 2017

I tried to use brotli but couldn't get it to work, the npm package seemed to be out of date and broken (can't remember which one that was). I ended up using lz-string as you suggested.

This is what I did in the swivel index.jsx, which keeps everything backwards compatible:

import { decompressFromBase64 } from 'lz-string';

let formData;
if (bootstrapData.lz_form_data) {
  formData = JSON.parse(decompressFromBase64(bootstrapData.lz_form_data));
} else if (bootstrapData.form_data) {
  formData = JSON.parse(bootstrapData.form_data);
} else {
  formData = {};
}

And the in the url generation in exploreUtils.js I added something like this:

import { compressToBase64 } from 'lz-string';
...
export function trimFormData(formData) {
  const cleaned = { ... formData };
  Object.entries(formData).forEach(([k, v]) => {
    if (v === null || v === undefined || v === false) {
      delete cleaned[k];
    }
  });
  return cleaned;
}

export function getExploreUrl(form_data, endpointType = 'base', force = false,...) {
  ...
  search.lz_form_data = compressToBase64(JSON.stringify(trimFormData(form_data)));
  ...
}

I didn't make any adaptions to the python side (for Swivel).
I can try to submit a PR early next week for making this work with the explore view, let me know.

@rumbin

This comment has been minimized.

Contributor

rumbin commented Nov 8, 2017

I am concerned whether this won't clutter up the url table fairly quickly. Or did I miss a point here?

Wouldn't it suffice to store the query parameters in the cache backend? Of course, the cache timeout for these entries should then be set to quite a long time, though...

@graceguo-supercat

This comment has been minimized.

Contributor

graceguo-supercat commented Nov 8, 2017

@rumbin I have same concern. I store this key in url table because I want to same some decoding time for each shorten URL request. For example, every time when request http://localhost:8088/superset/explore/table/313/?query=92bb019eda2cfb5ffc8a33cb7bae733f
coming in, i have to decode 92bb019eda2cfb5ffc8a33cb7bae733f to find the form_params are { a:1, b:2, c:true ...}.

Fabian and Max had a suggestion: we compress all form_data from front-end, and send compressed string as a request parameter. I assume in server-side we still need to decompress this lz_form_data parameter, so there is a similar cost for this solution as well.

I saw we had a payload data (which has charting data, form_data, etc) in cache, I probably just need to figure out how to get form_data from cache. Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment