Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Dot, Lollipop and Cleveland Dot Plots #842

Merged
merged 26 commits into from Mar 20, 2021

Conversation

bclehmann
Copy link
Member

New Contributors:
please review CONTRIBUTING.md

Purpose:
#817

New Functionality:
So far this only adds Lollipop plots because I'm still working on it.

I'm also unsure whether dot plots are worth the effort, dot plots offer no benefit compared to bar plots that I know of other than being easy to draw with typewriters or ASCII art. And adding dot plot support looks like it would be non-trivial if one wants to minimize the number of fractional dots drawn. If someone has a compelling use case I can come back to them.

Lollipop

double[] values = { 26, 20, 23, 7, 16 };

var bar = plt.AddBar(values);
bar.DisplayStyle = BarStyle.Lollipop;
bar.HorizontalOrientation = true;

image

Also a modest refactor to bar rendering.
@bclehmann bclehmann marked this pull request as draft February 28, 2021 21:18
@bclehmann
Copy link
Member Author

Note that this commit does make a refactor to the rendering of bar plots (RenderBarVertical and RenderBarHorizontal both call RenderBarFromRect) but I did not refactor any of the errorbar related stuff or the drawing of values above the bars. Mostly because it didn't achieve anything with regard to this PR.

I'm not sure about showing the values above the bars, but errorbars probably could be refactored out into its own function which would be shared between both vertical and horizontal barcharts.

@bclehmann
Copy link
Member Author

I'm wondering if Cleveland Dot Plots should be their own plottable, or a type of bar chart? At first, I didn't think it would work, but bar charts have the YOffsets field, which was originally added in #463 #476 to allow for waterfall charts.

Let me know what you think, in terms of code it would be quite elegant, but conceptually I don't know that Cleveland Plots are really in the Bar Plot family.

@bclehmann
Copy link
Member Author

Cleveland Dot Plots look like this:
image

Code:

double[] home_wins = { 12, 17, 16, 18, 18 }; // Data collected from https://footystats.org/england/premier-league/home-away-league-table
double[] away_wins = { 11, 13, 16, 14, 14 };
away_wins = away_wins.Select((y, i) => y - home_wins[i]).ToArray(); // y2 should be absolute, not in terms of its distance from y1

string[] labels = { "2015/16", "2016/17", "2017/18", "2018/19", "2019/20" };
double[] label_positions = Enumerable.Range(0, labels.Length).Select(i => (double)i).ToArray();

var bar = plt.AddBar(away_wins);
bar.YOffsets = home_wins;
bar.DisplayStyle = BarStyle.ClevelandDot;
plt.XAxis.ManualTickPositions(label_positions, labels);
plt.Title("British Premier League Champion Home vs Away Wins");

While this method is convenient for the implementer, it's perhaps clunky for the end-user.

@bclehmann
Copy link
Member Author

With Legend support:

image

Code:

double[] home_wins = { 12, 17, 16, 18, 18 }; // Data collected from https://footystats.org/england/premier-league/home-away-league-table
double[] away_wins = { 11, 13, 16, 14, 14 };
away_wins = away_wins.Select((y, i) => y - home_wins[i]).ToArray(); // y2 should be absolute, not in terms of its distance from y1

string[] labels = { "2015/16", "2016/17", "2017/18", "2018/19", "2019/20" };
double[] label_positions = Enumerable.Range(0, labels.Length).Select(i => (double)i).ToArray();

var bar = plt.AddBar(away_wins);
bar.YOffsets = home_wins;
bar.DisplayStyle = BarStyle.ClevelandDot;
plt.XAxis.ManualTickPositions(label_positions, labels);
plt.Title("British Premier League Champion Home vs Away Wins");

bar.ClevelandLabel1 = "Home Wins";
bar.ClevelandLabel2 = "Away Wins";
plt.Legend();

@bclehmann bclehmann marked this pull request as ready for review March 1, 2021 01:19
@bclehmann
Copy link
Member Author

Also let me know if a neutral colour (instead of the colour of the bottom dot) would be better for the line between dots in Cleveland dot plot.

@swharden
Copy link
Member

swharden commented Mar 1, 2021

Hey @bclehmann, sorry it took a little while for me to get to this. So far things are looking great! Here are some quick thoughts:

Dot Plots

I agree with you this seems pretty low priority / not worth implementing at this time. Users who are really compelled to make these plots could probably do so by getting creative with scatter plots. However, it might make a good example for the FAQ page demonstrating how to create custom plot types? https://github.com/ScottPlot/Website/issues/6

Error Bars

I did not refactor any of the errorbar related stuff or the drawing of values above the bars. Mostly because it didn't achieve anything with regard to this PR

Seems fair! I'd be surprised if users wanted to mix error bars with these new bar styles anyway. If users wanted something custom, they could probably add a scatter plot with no lines or markers (just error bars) and customize that as desired.

Extend BarPlot or Inherit it?

I'm wondering if Cleveland Dot Plots should be their own plottable, or a type of bar chart?

My gut reaction is that inheriting BarPlot makes sense. These new plot types could have custom fields for customizations, and override its render method. I'd probably go with names like LollipopPlot and ClevelandPlot. Probably the best way to do this is to pull common functionality into a BarPlotBase that uses as many protected fields as possible. Error bar functionality could be exposed only from the BarPlot class. Cleveland plots could have properties like values1 and values2 that point to Ys and YOffsets under the hood. I recognize this is a good bit of refactoring, but it's probably how I'd approach this. It would also keep the functions shorter and avoid those awkward switch statements. What do you think about this design?

EDIT: Whether/how you refactor this way may depend on if/how you implement data object

Data Objects

I suspect we may soon want to refactor bar plots to accomodate a BarData data object (#841) to make it easier to add bars in real time. I'm reminded how similar Cleveland plots are to candlestick charts, and how much that plot type would benefit from a data object with an Add() method (so the user could hold on to the data object rather than the plottable itself).

I'm on the fence about if exploring this will make this PR too complicated or not, but if a lot of refactoring goes down it might be an opportunity to experiment with this concept a bit. Rather than suggest implementation details, I'm curious what you think of the idea and if you like it what your preferred implementation what it would look like. Perhaps you could make a data object for bar plots, I'll do the same with scatter plots, then we could compare our implementations and identify the best parts of both and converge onto a common style before creating data objects for other plot types? It sounds good in theory, but it also might just be clash of personal preferences in API design! I'm interested in your thoughts on this front. If you want to discuss it using shorter messages I'm happy to touch base on slack or discord. I'll probably work on this a little bit every day for the next week, so progress may be a bit slow on my end.

@swharden
Copy link
Member

swharden commented Mar 1, 2021

Also let me know if a neutral color (instead of the color of the bottom dot) would be better for the line between dots in Cleveland dot plot

Want to have 3 user-configurable colors? Rather than expose the fields, a setter might be a nice way to let the user control transparency. Thinking out loud we could make everything private and have style methods like this:

void SetStyleDot1(Color color, float size, MarkerShape shape) { /* */ }
void SetStyleDot2(Color color, float size, MarkerShape shape) { /* */ }
void SetStyleLine(Color color, float width, LineStyle style, double alpha = 1.0) { /* */ }

@bclehmann
Copy link
Member Author

Seems fair! I'd be surprised if users wanted to mix error bars with these new bar styles anyway.

Sorry, I wasn't clear. Error bars work with the new bar styles (although I question their usefulness). The refactoring I was referring to was moving the bar rendering into its own function, while the error bar code is there twice in RenderBarHorizontal and RenderBarVertical.

My gut reaction is that inheriting BarPlot makes sense.

I like this idea, they're conveniently implemented as extensions to bar charts but I don't know how intuitive that is from an API perspective. Inheritance addresses both concerns pretty well in my opinion.

Data Objects

I am not particularly fond of making our own data object. I agree that handling growing data is best handled in the datastructure, but I think most of our users want to append (and possibly remove) data either to the beginning or the end. That sounds like a deque to me. C# doesn't provide a deque class but they're not particularly difficult to implement.

I think if we create a Deque<T> class we can replace double[] ys or OHLC[] data with IList<double> ys and IList<OHLC> data. This is a dropin replacement for any array parameters, as T[] is compatible with IList<T>.

To me this has a few main benefits:

  • Allows users to use any IList they want, including arrays, collections in the Base Class Library, and 3rd party collections
  • Most users probably only want List<double> support anyways, as appending to the end is more common than the beginning
  • Writing one Deque<T> class is much easier than writing a BarData, SignalData, OHLCData... etc class
    • There are also plenty of deque implementations out there one can borrow

In principle, one could use IEnumerable<T> instead of IList<T> on some plot types (by no means all) which would allow users to use Queue<double> or LinkedList<double> as well. However, I think that's not a great idea as it means a lot more effort and it's likely only applicable to a few plottables.

I am open to discussing this on discord or slack or wherever else you would like, although unfortunately I'm pretty busy for the next week.

@bclehmann
Copy link
Member Author

Want to have 3 user-configurable colors?

Yeah that's probably the best option.

Thinking out loud we could make everything private and have style methods like this ...

Yeah I like that approach, it organizes the different properties of a dot more clearly.

@bclehmann
Copy link
Member Author

As an aside it just occurred to me that customizable marker shapes would probably be a good idea.

@bclehmann
Copy link
Member Author

I created a BarPlotBase which Bar, Lollipop and Cleveland Dot plots inherit from.

For the customization of the Cleveland Dot plot I added a SetDot1Style, but elected to leave the StemColor as just a field which is Color.Gray by default. I don't know that there's that much to customize with the stem so I don't know if a method makes sense.

I also didn't touch the BarData stuff as that seems like a bigger thing than just a new plottable.

@swharden
Copy link
Member

Hey @bclehmann, thanks again for all your work here! I finally got to start working on this (#813 took me way longer than I anticipated), and I was hoping to finish it today but I wasn't able to.

Your modules look great - I'm going to take this opportunity to refactor the BarPlot module a bit more to bring it into consistency with some of the other plot types. I'll work on this PR aggressively over the next couple days and merge it, then we can revisit some of the detailed topics addressed above 👍

@bclehmann
Copy link
Member Author

Those test failures came because YOffsets and YErrors were never initialized on Lollipop plots, so we got a null reference. They can still be overridden if the user were so inclined, they are just default initialized to an array of zeroes now.

@swharden
Copy link
Member

swharden commented Mar 15, 2021

Those test failures came because YOffsets and YErrors were never initialized on Lollipop plots, so we got a null reference. They can still be overridden if the user were so inclined, they are just default initialized to an array of zeroes now.

@bclehmann thanks for pointing that out and stepping in to fix it!

Something came up tonight and I tried to wrap up and commit my work before leaving the computer and in hindsight that was an irresponsible / rookie move! I'm super embarrassed I broke the build here and didn't have the time to fix it 😝

Thanks for the quick repair! I think I can finish my work on this PR tomorrow, and I'll try not to leave branches in a broken state again in the future 😅

EDIT: I'm slammed with some other stuff this week so it'll have to wait until the weekend

@swharden swharden mentioned this pull request Mar 19, 2021
48 tasks
The old field names only make sense for vertically-oriented bar graphs.

Original field names have been marked obsolete.
The bar plot base now contains data management only.

This commit leaves child classes with sub-ideal duplicated code. Plot-type-specific minimal-complexity render methods should be created for these child classes in the future.
@swharden
Copy link
Member

Hey @bclehmann, thanks for your work on this PR!

I continued to refactor these classes and basically removed the IPlottable interface from BarPlotBase and instead only implemented it in the child classes. This lets the base focus on data management (positions, orientation, offsets, etc.) and each child plot type implements its own render methods. It also reduces the complexities associated with abstract classes and overrides.

I didn't extensively review/refactor the render code in the new child classes. I suspect large portions of the render code can be deleted, and the render methods can become relatively simple now (e.g., lollipop plots don't need to know how to draw rectangles).

I really want to focus on the current high-priority tasks preventing me from releasing ScottPlot 4.1 (#716), so I'll probably put this down and pick it up again in a few weeks. I think this landed in a pretty good place for now, but I look forward to revisiting some of the discussions above about data management (live data, growing and auto-resizing arrays, etc.) after the release.

@swharden swharden merged commit febe832 into ScottPlot:master Mar 20, 2021
@swharden swharden linked an issue Mar 20, 2021 that may be closed by this pull request
@swharden swharden linked an issue Mar 21, 2021 that may be closed by this pull request
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Dot Plots, Lollipop and Cleveland Dot Plots
2 participants