Episode 03 - Visualizing Tabular Data : Suggestions #835

Open
xintin opened this issue Jun 21, 2020 · 2 comments
Comments

xintin commented Jun 21, 2020

Hi, while going through Episode 3, Visualizing Tabular Data, I came across a few things that I feel could be improved or added.

  1. It is a widely accepted convention to alias matplotlib.pyplot as plt. Can we replace
    import matplotlib.pyplot with import matplotlib.pyplot as plt?
    Ref: http://google.github.io/styleguide/pyguide.html#22-imports

  2. The introduction says:

    First, we will import the pyplot module from matplotlib and use two of its functions to create and display a heat map of our data

    Can we explain what imshow() and show() do? imshow() is not used anywhere else in the lesson, so I am left without a clear explanation of why I need imshow(data) along with show().

  3. matplotlib is used to plot bar charts, pie charts, histograms, scatter plots, etc. Shall we make the tutorial more varied with examples apart from line charts alone? We can also include this as an exercise.

  4. In this episode, we only use matplotlib. Shall we rename the episode to "Visualizing Tabular Data using Matplotlib"? Packages like seaborn, plotly, and ggplot are gaining popularity too.

  5. How about adding a small snippet using set_title() to distinguish sub-plots?

Kindly share your thoughts on the above points.

Thank you.

Contributor

ldko commented Jun 22, 2020

Hi @xintin ,
Thank you for providing these carefully considered suggestions to improve the Visualizing Tabular Data episode! Here are my responses to your separate points:

  1. It is a widely accepted convention to alias matplotlib.pyplot as plt. Can we replace
    import matplotlib.pyplot with import matplotlib.pyplot as plt?
    Ref: http://google.github.io/styleguide/pyguide.html#22-imports

This is something that has come up multiple times. For the time being we are not making this change. Please see issue #830 for reasoning.

  2. The introduction says:
    First, we will import the pyplot module from matplotlib and use two of its functions to create and display a heat map of our data
    Can we explain what imshow() and show() do? imshow() is not used anywhere else in the lesson, so I am left without a clear explanation of why I need imshow(data) along with show().

Yes, I think it would be helpful to briefly indicate what imshow and show do in no more than 1-2 sentences. Perhaps it could fit in before the sentence that starts "Blue pixels in this heat map represent". If you are willing to add this, please open a PR to add this text.
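For anyone picking this up, here is a rough sketch of what such an explanation could sit next to. It is only an illustration: the random array stands in for the inflammation data (an assumption, since the CSV is not loaded here), and the `Agg` backend line is only there so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; drop this line in a notebook
import matplotlib.pyplot
import numpy

# Stand-in for the inflammation data (assumption): 60 patients x 40 days.
rng = numpy.random.default_rng(0)
data = rng.integers(0, 20, size=(60, 40)).astype(float)

# imshow() draws the 2D array as an image on the current axes:
# one coloured pixel per array element, coloured by its value.
image = matplotlib.pyplot.imshow(data)

# show() asks matplotlib to display every figure created so far.
# Without it, a plain script builds the heat map but never puts it on screen.
matplotlib.pyplot.show()
```

In short, `imshow` builds the heat map and `show` displays it, which is roughly what the 1-2 sentences would need to say.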

  3. matplotlib is used to plot bar charts, pie charts, histograms, scatter plots, etc. Shall we make the tutorial more varied with examples apart from line charts alone? We can also include this as an exercise.

I think the time it would take to add more examples of different types of visualizations to the main episode body is prohibitive. I think seeing more of these visualization types would be of interest to learners though, so I think including some of them through exercise(s) that use the inflammation data would be worthwhile. That would help facilitate instructors bringing in more examples when they want to focus on the visualization but skip over them when there is greater need to move to other concepts in the lesson. If you would like to submit such examples, please create one PR per exercise you would like to see included.
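As a possible starting point for such an exercise, here is a minimal sketch of a bar chart and a histogram. The random array is a stand-in for the inflammation data (an assumption), and the choice of what to plot (mean per day, maximum per patient) is only a suggestion, not something the lesson currently does.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; drop this line in a notebook
import matplotlib.pyplot
import numpy

# Stand-in for the inflammation data (assumption): 60 patients x 40 days.
rng = numpy.random.default_rng(0)
data = rng.integers(0, 20, size=(60, 40)).astype(float)

# Bar chart: one bar per day, showing the mean inflammation across patients.
fig_bar, axes_bar = matplotlib.pyplot.subplots()
days = numpy.arange(data.shape[1])
bars = axes_bar.bar(days, numpy.mean(data, axis=0))
axes_bar.set_xlabel('day')
axes_bar.set_ylabel('average')

# Histogram: distribution of each patient's maximum inflammation value.
fig_hist, axes_hist = matplotlib.pyplot.subplots()
counts, bin_edges, patches = axes_hist.hist(numpy.max(data, axis=1), bins=10)
axes_hist.set_xlabel('maximum inflammation per patient')
axes_hist.set_ylabel('number of patients')

matplotlib.pyplot.show()
```

Each plot type could go in its own exercise, which fits the one-PR-per-exercise request.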

  4. In this episode, we only use matplotlib. Shall we rename the episode to "Visualizing Tabular Data using Matplotlib"? Packages like seaborn, plotly, and ggplot are gaining popularity too.

We do not tend to be that specific in the episode titles. I think the idea here may be to focus on the objective of visualizing data, not the specific libraries we will use to get the job done. We do mention matplotlib in Key Points and as the file name for the episode though. I am interested to hear others' opinions on specifying "Matplotlib" in the episode title.

  5. How about adding a small snippet using set_title() to distinguish sub-plots?

I think people would find this useful. It could be added to the existing code under the Grouping plots heading to add titles to each of the three plots. If you would like, please open a PR specifically for adding titles to the plots.
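A sketch of what that could look like, modelled on the three-panel figure under Grouping plots. The random array stands in for the inflammation data and the title strings are placeholders (both assumptions), so the exact wording is up to whoever opens the PR.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; drop this line in a notebook
import matplotlib.pyplot
import numpy

# Stand-in for the inflammation data (assumption): 60 patients x 40 days.
rng = numpy.random.default_rng(0)
data = rng.integers(0, 20, size=(60, 40)).astype(float)

fig = matplotlib.pyplot.figure(figsize=(10.0, 3.0))

# One row, three columns of subplots.
axes1 = fig.add_subplot(1, 3, 1)
axes2 = fig.add_subplot(1, 3, 2)
axes3 = fig.add_subplot(1, 3, 3)

# set_title() puts a heading above each individual subplot.
axes1.set_title('Average')
axes1.set_ylabel('average')
axes1.plot(numpy.mean(data, axis=0))

axes2.set_title('Maximum')
axes2.set_ylabel('max')
axes2.plot(numpy.max(data, axis=0))

axes3.set_title('Minimum')
axes3.set_ylabel('min')
axes3.plot(numpy.min(data, axis=0))

fig.tight_layout()
matplotlib.pyplot.show()
```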

Author

xintin commented Jun 22, 2020

Hi @ldko,
Thanks for your thoughts. I agree with the reasoning for point 1 above. For the rest, I will try to submit PRs for the feasible items over the weekend. In the meantime, if anyone else wants to contribute, please feel free to do so.
