Update README, ensure table code is working #37

Closed
sharlagelfand opened this issue May 4, 2021 · 30 comments
Labels: documentation, r

Comments

@sharlagelfand
Collaborator

Once #23 is all closed out and we have a functional widget, we'll need to update the README to illustrate how it works! I don't think you can embed an htmlwidget into a static README, so this may have to be in the form of htmlwidget -> movie -> GIF, but will double check on that.

Since we haven't done anything with the table code yet, will take a pass through and ensure it all still works and can be left in.

sharlagelfand added the documentation and r labels on May 4, 2021
sharlagelfand self-assigned this on May 4, 2021
@sharlagelfand
Collaborator Author

Need to update the README with the widget for plotting, but confirming that the table code works and can be left in 👍

@sharlagelfand
Collaborator Author

sharlagelfand commented May 6, 2021

👍 plot examples in README work, all good:

"small_salary %>% group_by(Degree) %>% summarize(mean = mean(Salary))" %>%
  datamation_sanddance()
example_1.mov
"small_salary %>% group_by(Degree, Work) %>% summarize(mean = mean(Salary))" %>%
  datamation_sanddance()
example_2.mov

Some notes on discrepancies that remain between the originals (in the main branch README) and these versions, + beyond in the case of 3 grouping variables (e.g. penguins) and some general thoughts on what else we might want to fix. making a checklist so i can mark off as we add these in:

  • axes (+ titles) are missing (discussed in "Multiple axis in fake faceted views", #39)
  • y-axis should be autoscaled to range of data, instead of from 0 (discussed in "Decide where calculation of infogrid and jitter coordinates happens", #23)
  • values not centred on the x-axis, e.g. some appear on the edge of their facets instead of centred (discussed in "Hack faceted vega specs to build fake faceted view in a single plot", #32) - maybe just domain needs to be increased to show
  • x-axis values are numeric, not actual group labels in the case of 3+ variables (discussed in "Fix gemini axes, labels and other component issues", #33) - wondering if it'll work to have numeric values as the breaks, but the values (e.g. undefined, male, female) as the labels. similar to what i discussed/figured out here + beyond in that thread, but with sex instead of the islands on the x-axes.
  • Maybe we should add facet titles? so "Degree" as the title on the column facets and "Work" as the title on the row ones
  • Filled points instead of transparent? Seems pretty quick to change in vega lite and much more like the scatter plots I'm used to :) I can update in the specs I'm passing but likely needs to be updated on the JS side as well
  • Mostly just curious on this end... what's going on with the size of the facet labels changing? Goes smaller -> bigger -> smaller a few times

might have missed some things but will add to this as I find em! cc @jhofman @giorgi-ghviniashvili

@jhofman
Contributor

jhofman commented May 6, 2021

awesome, this looks great and thanks for summarizing the needed updates in an easy-to-follow checklist!

@giorgi-ghviniashvili
Collaborator

giorgi-ghviniashvili commented May 7, 2021

> Mostly just curious on this end... what's going on with the size of the facet labels changing? Goes smaller -> bigger -> smaller a few times

@sharlagelfand please update gemini.web.js with this to include latest tickPadding fix.

@giorgi-ghviniashvili
Collaborator

> Filled points instead of transparent? Seems pretty quick to change in vega lite and much more like the scatter plots I'm used to :) I can update in the specs I'm passing but likely needs to be updated on the JS side as well

yes, try to use filled: true.
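
For reference, a minimal sketch of what this might look like when a spec is built as an R list before being passed along. The `spec` object and its field names are illustrative assumptions, not the package's actual specs; `filled` is a standard Vega-Lite mark property.

```r
# Hypothetical Vega-Lite spec fragment written as an R list
# (illustrative only, not the spec the package actually generates).
spec <- list(
  mark = list(type = "point", filled = TRUE),  # filled = TRUE draws solid points
  encoding = list(
    x = list(field = "x", type = "quantitative"),
    y = list(field = "y", type = "quantitative")
  )
)

# Inspect the mark definition that would be serialized to JSON.
str(spec$mark)
```

As noted later in the thread, setting this in the spec alone may not be enough if the JS side overrides the mark definition.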

@giorgi-ghviniashvili
Collaborator

> Maybe we should add facet titles? so "Degree" as the title on the column facets and "Work" as the title on the row ones

You can add titles now, currently facet.row.title and facet.column.title are null.
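
A rough sketch of setting those titles, assuming the facet definition is assembled as an R list. The `facet` object is hypothetical; the field names come from the small_salary example above.

```r
# Hypothetical facet definition as an R list; replaces the null titles
# with the grouping variable names.
facet <- list(
  column = list(field = "Degree", type = "nominal", title = "Degree"),
  row    = list(field = "Work",   type = "nominal", title = "Work")
)

facet$column$title
facet$row$title
```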

@giorgi-ghviniashvili
Collaborator

> x-axis values are numeric, not actual group labels in the case of 3+ variables (discussed in #33) - wondering if it'll work to have numeric values as the breaks, but the values (e.g. undefined, male, female) as the labels. similar to what i discussed/figured out here + beyond in that thread, but with sex instead of the islands on the x-axes.

@sharlagelfand Since this is encoding.x.axis, can you try to set it yourself and just send it via the spec? Use tickExpr.
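
One way this could be sketched in R: keep numeric values as the breaks and build an expression that maps each break to its group label. This assumes the Vega-Lite axis properties `values` and `labelExpr` are what is meant here; `make_label_expr` and the break/label values are hypothetical.

```r
# Hypothetical helper: build a Vega expression that maps numeric tick
# values to group labels via a chain of ternaries.
make_label_expr <- function(breaks, labels) {
  stopifnot(length(breaks) == length(labels))
  cases <- sprintf("datum.value == %s ? '%s' : ", breaks, labels)
  paste0(paste0(cases, collapse = ""), "''")  # fall back to an empty label
}

breaks <- c(1, 2, 3)                       # illustrative numeric x positions
labels <- c("male", "female", "undefined") # labels to show at those positions

x_axis <- list(
  values = breaks,                             # ticks only at the group positions
  labelExpr = make_label_expr(breaks, labels)  # assumes Vega-Lite's axis labelExpr
)

x_axis$labelExpr
```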

@giorgi-ghviniashvili
Collaborator

> values not centred on the x-axis, e.g. some appear on the edge of their facets instead of centred (discussed in #32) - maybe just domain needs to be increased to show

Please try to add 0.5 for padding.

@sharlagelfand
Collaborator Author

sharlagelfand commented May 7, 2021

Sorry, just finally gone through all these now!

> Mostly just curious on this end... what's going on with the size of the facet labels changing? Goes smaller -> bigger -> smaller a few times

> @sharlagelfand please update gemini.web.js with this to include latest tickPadding fix.

Will try that - you don't have the latest version here, right? The sizes are still changing there. I've updated my version of gemini but I'm still running into this issue.

> Filled points instead of transparent? Seems pretty quick to change in vega lite and much more like the scatter plots I'm used to :) I can update in the specs I'm passing but likely needs to be updated on the JS side as well

> yes, try to use filled: true.

The specs here now all use filled: true but it doesn't propagate through the datamation here, so maybe it needs to be updated in the JS too?

> Maybe we should add facet titles? so "Degree" as the title on the column facets and "Work" as the title on the row ones

> You can add titles now, currently facet.row.title and facet.column.title are null.

I've updated the titles in the specs here, but it doesn't look like they're coming up.

> x-axis values are numeric, not actual group labels in the case of 3+ variables (discussed in #33) - wondering if it'll work to have numeric values as the breaks, but the values (e.g. undefined, male, female) as the labels. similar to what i discussed/figured out here + beyond in that thread, but with sex instead of the islands on the x-axes.

> @sharlagelfand Since this is encoding.x.axis, can you try to set it yourself and just send it via the spec? Use tickExpr.

Will try this on my end - I figured it out before so shouldn't be too bad to do again. Got this working!

> values not centred on the x-axis, e.g. some appear on the edge of their facets instead of centred (discussed in #32) - maybe just domain needs to be increased to show

> Please try to add 0.5 for padding.

Where should I add this? To the actual X values? Or padding the domain on the plot?

@jhofman
Contributor

jhofman commented May 10, 2021

@sharlagelfand, @giorgi-ghviniashvili: just checking in on this. is it realistic to try to update the README w/ the new examples today (before tonight's talk)?

it's okay if not, but just want to get a sense.

@sharlagelfand
Collaborator Author

@jhofman definitely realistic and my plan to! If @giorgi-ghviniashvili has time to answer some of the questions in my comment above we can make some progress on the more visual aspects of the datamations, but if not will definitely update with how it's looking now. The app is updated with the latest versions of everything.

Would you be able to merge the two existing PRs, then I can rebase on them, update the README and PR my refactor-test branch? Then if we are able to make additional changes before tonight I can do that on a new, smaller branch so we don't have to worry about this mammoth branch too late in the day 😁

@jhofman
Contributor

jhofman commented May 10, 2021

great!

btw, just tried out the app and saw some funniness in placement of the points on the last frame:

Screen Shot 2021-05-10 at 12 30 40 PM

@sharlagelfand
Collaborator Author

Not sure what's going on there, the last spec on its own looks fine so it might be something on the JS / axes faking side.

@sharlagelfand
Collaborator Author

+ bonus ahh, that's not good because the values are actually < 100 so that final plot is incorrect (not just e.g. axes don't go far enough), here's how it should look:

Screen Shot 2021-05-10 at 12 35 59 PM

@giorgi-ghviniashvili
Collaborator

giorgi-ghviniashvili commented May 10, 2021

> Where should I add this? To the actual X values? Or padding the domain on the plot?

axis.scale.domain

....

  • filled: true fixed
  • facet titles fixed
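
To make the domain-padding suggestion concrete, here is a small sketch of widening a domain by 0.5 on each side in R. `pad_domain` and the x positions are hypothetical, and exactly where the resulting domain belongs in the spec is left as an assumption.

```r
# Hypothetical helper: widen a numeric domain by a fixed padding on each side.
pad_domain <- function(x, padding = 0.5) {
  rng <- range(x, na.rm = TRUE)
  c(rng[1] - padding, rng[2] + padding)
}

x_positions <- c(1, 2, 3)  # illustrative x positions of the groups

# Domain that would be set on the x scale so edge groups are not clipped.
x_scale <- list(domain = pad_domain(x_positions))
x_scale$domain
```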

@giorgi-ghviniashvili
Collaborator

@jhofman Ok, I fixed it. The problem was that the distance between a facet title and its axis differs from the distance between a regular axis title and its axis, which made the top positions different.
In general, these kinds of things are a downside of "hacking facets": we need to have some hardcoded padding / spacing adjustments.

@sharlagelfand
Collaborator Author

Thanks @giorgi-ghviniashvili! Filled points and facet title are working now.

@sharlagelfand
Collaborator Author

Unfortunately doesn't look like I can do the axis.scale.domain stuff on my end (due to the fake facets, the specs fly off the screen then fly back on... not what we want!), was going to try to hack it today before the talk but will have to wait to be done properly in JS!

Working on updating the README now with updated GIFs, then I will PR @jhofman

@jhofman
Contributor

jhofman commented May 10, 2021

sounds good, thanks for the update.

@sharlagelfand
Collaborator Author

Looks like something is going on here with the scaling of the y-axis - it's scaled just to the domain for the jitter view (which has no axes, as shown in this comment) but then scaled 0 -> full domain for the summary view (which has axes).
The consequence is that briefly it shows the axis for the jitter view but with wrong values, so it looks like somehow values between 0 and 40 (Masters in Academia) have a mean of ~85, which of course isn't what's actually happening in the data

So the animation has these frames:

jitter view, no axes

Screen Shot 2021-05-10 at 3 48 17 PM

jitter view, incorrect axes

Screen Shot 2021-05-10 at 3 49 00 PM

summary view, correct axes (but domain is too large)

Screen Shot 2021-05-10 at 3 49 07 PM

@giorgi-ghviniashvili
Collaborator

@sharlagelfand good catch! That's because the axis domain of the real faceted view needs to match the domain of the hacked facets. I think I fixed it.

@sharlagelfand
Collaborator Author

sharlagelfand commented May 11, 2021

I've updated the README with how things are looking now, the main thing that could be updated before we merge is that the axes should show up one frame earlier (as soon as animation of infogrid -> jitter starts).

A couple other questions/things to keep an eye on:

  • the final frame doesn't show up for this pipeline palmerpenguins::penguins %>% group_by(species, island, sex) %>% summarize(mean = mean(bill_length_mm)): foiled by needing na.rm = TRUE, my bad! fixed this.
no_final_frame.mov
  • values moving across facets (seen in above video) - I have a feeling this is related to NAs so I will dig into it.

  • size of the initial info grid - seems really big especially when there's only one set of facets - maybe should only take up the room of the "final" frame? The first GIF in the README is a good example.

  • In general I think we could size down the widget, it doesn't really fit in my RStudio viewer (where I imagine most people would be generating these), I'll look into how to add sizing options as an argument too.

(Happy to move these ^^ notes to a new issue, just somewhere to collect my thoughts)
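
As a quick illustration of the na.rm point in the first note, using made-up values rather than the penguins data: a single missing value makes `mean()` return NA unless `na.rm = TRUE` is set, which would leave the summary frame with nothing to plot for that group.

```r
# Illustrative values with one missing entry (not the actual penguins data).
bill_length <- c(39.1, NA, 40.3)

mean(bill_length)                # NA: one missing value makes the mean NA
mean(bill_length, na.rm = TRUE)  # missing values are dropped before averaging
```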

cc @jhofman @giorgi-ghviniashvili

@sharlagelfand
Collaborator Author

Just to update where things are at, the app here has the latest of everything!

You can change the size now. There are some defaults set; it would be nice to have it auto-size a bit based on the number of row/column facets, but at least some control is nice! Still having some issues with values moving across facets but we can dig in more tomorrow.

@sharlagelfand
Collaborator Author

(Will also close this issue and move outstanding stuff to new issues tomorrow, since the README is updated!)

@jhofman
Contributor

jhofman commented May 12, 2021

thanks @sharlagelfand

@giorgi-ghviniashvili is it possible that there's still a bug in the axis positioning or labeling? seems like the averages in the last frame are in the 90k region, but i remember them being in the high 80s.

@giorgi-ghviniashvili
Collaborator

@jhofman no, the y values are in the 90s:
image

@sharlagelfand
Collaborator Author

Looks like these are right:

Screen Shot 2021-05-14 at 11 46 50 AM

  Degree   mean
1 Masters  90.2
2 PhD      88.2

so going to close this now!

@jhofman
Contributor

jhofman commented May 17, 2021

Looks like the shiny app still shows two values about 90. is that previous screenshot from the app or somewhere else?

Screen Shot 2021-05-17 at 10 31 08 AM

@sharlagelfand
Collaborator Author

Huh, you're right! And the data shown in the app is wrong too (but seems to match the values on the plot)

Screen Shot 2021-05-17 at 10 36 32 AM

Not sure what's going on here - I'll dig into it.

@sharlagelfand
Collaborator Author

Oooh - there's two different "small salary" data sets - small_salary and small_salary_data:

library(dplyr)
library(datamations)

small_salary
#> # A tibble: 100 x 6
#>       ID Degree  Work     Salary i     order
#>    <int> <fct>   <fct>     <dbl> <chr> <int>
#>  1    22 Masters Academia   81.9 id        1
#>  2    96 PhD     Academia   84.5 id        2
#>  3    10 Masters Academia   82.9 id        3
#>  4    42 PhD     Academia   83.8 id        4
#>  5    55 PhD     Academia   83.8 id        5
#>  6    14 PhD     Academia   85.3 id        6
#>  7    33 PhD     Industry   91.4 id        7
#>  8   100 PhD     Academia   85.3 id        8
#>  9    57 Masters Academia   83.3 id        9
#> 10     2 PhD     Industry   92.3 id       10
#> # … with 90 more rows

small_salary %>% 
  group_by(Degree) %>%
  summarise(mean = mean(Salary))
#> # A tibble: 2 x 2
#>   Degree   mean
#>   <fct>   <dbl>
#> 1 Masters  90.2
#> 2 PhD      88.2

small_salary_data
#> # A tibble: 30 x 3
#>    Degree  Work     Salary
#>    <chr>   <chr>     <dbl>
#>  1 Masters Industry     86
#>  2 Masters Academia     71
#>  3 PhD     Industry    104
#>  4 Masters Industry     94
#>  5 Masters Academia     93
#>  6 Masters Academia     96
#>  7 PhD     Academia    100
#>  8 Masters Industry     86
#>  9 PhD     Academia     80
#> 10 Masters Industry     85
#> # … with 20 more rows

small_salary_data %>%
  group_by(Degree) %>% 
  summarise(mean = mean(Salary))
#> # A tibble: 2 x 2
#>   Degree   mean
#>   <chr>   <dbl>
#> 1 Masters  90.6
#> 2 PhD      92.1

Will open a new issue for this.
