New action: "All-column" vis when only few columns in dataframe #199

Closed
dorisjlee opened this issue Jan 6, 2021 · 1 comment

Comments

@dorisjlee
Member

dorisjlee commented Jan 6, 2021

When a dataframe has only two columns, one quantitative and one temporal, we currently display separate univariate charts. A line chart that uses both attributes would be the better chart to display here. We need to change the action logic to account for this use case.

import pandas as pd

tmax = pd.read_csv('https://raw.githubusercontent.com/koldunovn/python_for_geosciences/master/DelhiTmax.txt', delimiter=r"\s+", parse_dates=[[0,1,2]], header=None)
tmax.columns = ['Date', 'Temp']

image

This notebook has a good use case example for time series.
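
To make the proposed rule concrete, here is a minimal sketch of the check the action logic might perform. It is not Lux's implementation: `suggest_all_column_chart` is a hypothetical helper, the returned dict is an ad hoc format, and the demo values are made up as a stand-in for the Delhi temperature data above. It only relies on pandas dtype checks.

```python
import pandas as pd
from pandas.api import types as ptypes


def suggest_all_column_chart(df: pd.DataFrame):
    """Hypothetical helper: suggest one line chart for a two-column
    dataframe with one temporal and one quantitative column, else None."""
    if df.shape[1] != 2:
        return None
    temporal = [c for c in df.columns if ptypes.is_datetime64_any_dtype(df[c])]
    # Booleans count as numeric in pandas, so exclude them explicitly.
    quantitative = [
        c for c in df.columns
        if ptypes.is_numeric_dtype(df[c]) and not ptypes.is_bool_dtype(df[c])
    ]
    if len(temporal) == 1 and len(quantitative) == 1:
        return {"mark": "line", "x": temporal[0], "y": quantitative[0]}
    return None


# Small stand-in for the temperature data (values are made up).
demo = pd.DataFrame({
    "Date": pd.to_datetime(["2021-01-01", "2021-01-02", "2021-01-03"]),
    "Temp": [20.5, 21.0, 19.8],
})
suggestion = suggest_all_column_chart(demo)
print(suggestion)
```

Any dataframe that does not match the pattern returns `None`, so the existing univariate charts would still be shown in that case.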

dorisjlee added the enhancement label (New feature or request) Jan 6, 2021
dorisjlee changed the title from "Display time-series when only few columns is shown" to "Display all-column visualization when only few columns in dataframe" Jan 15, 2021
@dorisjlee
Member Author

A similar use case is a dataframe with one dimension and two quantitative variables: the visualization is currently shown as two separate charts.
image
image
A single visualization like this would be more appropriate:

image
image

Example code:

import pandas as pd

df = pd.DataFrame({'nPts': {0: 49999, 1: 71174, 2: 101317, 3: 144224, 4: 205303, 5: 292249, 6: 416016, 7: 592198, 8: 842993, 9: 1200000}, 't_heatmap': {0: 0.15121674537658691, 1: 0.1811518669128418, 2: 0.2179429531097412, 3: 0.2787730693817139, 4: 0.3973350524902344, 5: 0.4233138561248779, 6: 0.580251932144165, 7: 0.7928099632263184, 8: 1.0876789093017578, 9: 1.6242818832397459}, 't_color_heatmap': {0: 0.14242982864379886, 1: 0.18866705894470212, 2: 0.1566781997680664, 3: 0.16737699508666992, 4: 0.19900894165039065, 5: 0.2701129913330078, 6: 0.2533812522888184, 7: 0.37183785438537603, 8: 0.3830866813659668, 9: 0.39321017265319824}, 't_bar': {0: 0.01940608024597168, 1: 0.02618718147277832, 2: 0.024693727493286133, 3: 0.029685020446777344, 4: 0.03471803665161133, 5: 0.04173588752746582, 6: 0.04706382751464844, 7: 0.0667569637298584, 8: 0.08260798454284668, 9: 0.1006929874420166}, 't_cbar': {0: 0.04185795783996582, 1: 0.050965070724487305, 2: 0.052091121673583984, 3: 0.0610501766204834, 4: 0.07504606246948242, 5: 0.09924101829528807, 6: 0.10392117500305176, 7: 0.13044309616088867, 8: 0.18137121200561526, 9: 0.2069528102874756}, 't_hist': {0: 0.01846003532409668, 1: 0.018136978149414062, 2: 0.018748998641967773, 3: 0.01876473426818848, 4: 0.02434706687927246, 5: 0.025368213653564453, 6: 0.02530217170715332, 7: 0.02823114395141602, 8: 0.029034852981567383, 9: 0.034781217575073235}, 't_scatter': {0: 0.8179378509521484, 1: 1.0982017517089844, 2: 1.4875690937042236, 3: 2.146117925643921, 4: 3.1123709678649902, 5: 3.92416787147522, 6: 6.097048044204713, 7: 8.803220987319945, 8: 12.054464101791382, 9: 17.740845680236816}, 't_color_scatter': {0: 1.3474130630493164, 1: 1.7953579425811768, 2: 2.6268541812896733, 3: 3.646371126174927, 4: 4.901940107345581, 5: 6.974237203598023, 6: 10.197483777999876, 7: 14.134275913238525, 8: 21.13951110839844, 9: 27.127063989639282}})
# Reshape to long form: one row per (nPts, chart type) pair with its timing.
pdf = df.melt(
    id_vars=["nPts"],
    value_vars=["t_heatmap", "t_color_heatmap", "t_bar", "t_cbar", "t_hist", "t_scatter", "t_color_scatter"],
    var_name="type",
    value_name="time",
)
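
As a rough illustration of the single chart described above, the sketch below maps a long-form dataframe with one nominal and two quantitative columns to a Vega-Lite-style encoding: the two quantitative columns go on x and y, and the dimension goes on color. This is an assumption about how such an action could work, not Lux's code. `all_column_line_spec` is a hypothetical helper, the first quantitative column is arbitrarily placed on x, and the demo uses a few values rounded from the example above.

```python
import pandas as pd
from pandas.api import types as ptypes


def all_column_line_spec(long_df: pd.DataFrame):
    """Hypothetical helper: build one multi-series line chart spec for a
    dataframe with exactly two quantitative columns and one nominal column."""
    if long_df.shape[1] != 3:
        return None
    quantitative = [
        c for c in long_df.columns
        if ptypes.is_numeric_dtype(long_df[c]) and not ptypes.is_bool_dtype(long_df[c])
    ]
    # Treat any remaining non-datetime column as the nominal dimension.
    nominal = [
        c for c in long_df.columns
        if c not in quantitative and not ptypes.is_datetime64_any_dtype(long_df[c])
    ]
    if len(quantitative) != 2 or len(nominal) != 1:
        return None
    x_col, y_col = quantitative  # assumption: column order decides the axes
    return {
        "mark": "line",
        "encoding": {
            "x": {"field": x_col, "type": "quantitative"},
            "y": {"field": y_col, "type": "quantitative"},
            "color": {"field": nominal[0], "type": "nominal"},
        },
    }


# Tiny stand-in for the benchmark data (values rounded from the example).
demo = pd.DataFrame({
    "nPts": [49999, 71174],
    "t_bar": [0.019, 0.026],
    "t_hist": [0.018, 0.018],
})
long_demo = demo.melt(id_vars=["nPts"], var_name="type", value_name="time")
spec = all_column_line_spec(long_demo)
print(spec)
```

With the melted `pdf` from the example, the same function would put `nPts` on x, `time` on y, and `type` on color, giving one line per chart type.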

dorisjlee changed the title from "Display all-column visualization when only few columns in dataframe" to "New action: "All-column" vis when only few columns in dataframe" Jan 15, 2021
caitlynachen self-assigned this Mar 1, 2021
dorisjlee added a commit that referenced this issue Apr 18, 2021
Co-authored-by: Caitlyn Chen <caitlynachen@berkeley.edu>
Co-authored-by: Doris Lee <dorisjunglinlee@gmail.com>
dorisjlee added a commit that referenced this issue Apr 30, 2021
…ales (#262)

* Add support to improve temporal action to display different timescales

* Resolve PR comments

* Reformat files using black

* "All-column" vis when only few columns in dataframe #199 (#336)

Co-authored-by: Caitlyn Chen <caitlynachen@berkeley.edu>
Co-authored-by: Doris Lee <dorisjunglinlee@gmail.com>

* documentation and cleaning
* added notebook gallery
* update README
* removed scatterplot message in SQLExecutor
* fixed typo in SQL documentation

* update README and bump version

* bump version

* clear propagated vis data intent after PandasExecutor completes execute (#297)

* fix black to stable version

* Scalability: incorporate early pruning optimizations (#368)

* changes from perf branch to config
* added flag for turning on/off lazy maintain optimization

* merged in approx early pruning code

* increase overall sampling start and cap

* Adjust width and length criteria for early pruning vislist based on experiment results; Add warning message and test for early pruning

* black version update

* version lock on black

* fixed sql tests (added approx to execute constructor)
* fixed sampling config test
* improved Executor documentation

* timescale feature
* adding weekday
* adding docs
* bugfix for y axis line chart export
* fixing temporal axis by adding timescale variable in Clause

Co-authored-by: Doris Lee <dorisjunglinlee@gmail.com>
Co-authored-by: Caitlyn Chen <caitlynachen@gmail.com>
Co-authored-by: Caitlyn Chen <caitlynachen@berkeley.edu>