senthilsnat/PythonDataVizTutorials

Python Data Visualization Tutorials

I've been working in data analytics pretty much since the back half of my junior year of college. When I first started at Rice, I was totally unsure of what I wanted to do career-wise, but I'd like to think I've gotten a little bit more clarity on my direction since then. Thanks to sports, I've come to really love working with data. Not just in the number-crunching or modeling sense, but also with data visualization.

My focus on data visualization has really picked up over the last year and a half or so. When I showed Elijah Meeks my network graphs from my first sports analytics project back in 2016, he audibly revolted (or as audibly as you can on a digital medium, I suppose). Everyone has to start somewhere. And that's important! Even today, I've still got 10 tabs of package documentation open as I code. But over the last couple of years dedicated to improving my coding and technical data viz skills, I've picked up some best practices and reusable modules that I like to think of as my guide points. I'm laying them out in this repository for anyone else looking to do data visualization in the wonderful world of Python.


Newly Updated!

  • Advanced Packages: While Seaborn and Matplotlib can get you fairly far, features like interactivity or unique chart types can be lacking. However, there are some really powerful third-party packages out there in the Python ecosystem. In this notebook, I'll take a little bit of time to discuss these packages, show how to get familiar with their syntax, and walk through some unique examples highlighting the capabilities of each one.
  • Style Sheets: The default Matplotlib styling can be a great source of consternation, and everybody will have their own unique styles and needs. However, it can be tedious to write the same repetitive code over and over to customize each graphic. Fortunately, we can pretty easily customize persistent configurations for our graphs, and I'll demonstrate the various ways of doing so in this notebook.
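To give a flavor of what "persistent configurations" can look like, here is a minimal sketch rather than the notebook's actual code. It assumes Matplotlib is installed; the `HOUSE_STYLE` dictionary and its values are made up for illustration. It shows two standard options: updating `rcParams` so every later figure picks up the settings, and using `plt.style.context` to borrow a built-in style for just one figure.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt

# Hypothetical "house style": these keys are real rcParams,
# but the values are illustrative, not taken from the notebook.
HOUSE_STYLE = {
    "figure.figsize": (8, 5),
    "axes.spines.top": False,
    "axes.spines.right": False,
    "axes.grid": True,
    "font.size": 12,
}

# Option 1: persistent for the rest of the session or notebook.
plt.rcParams.update(HOUSE_STYLE)

# Option 2: temporary; the built-in "ggplot" style applies only inside the block,
# and the previous settings are restored on exit.
with plt.style.context("ggplot"):
    fig_tmp, ax_tmp = plt.subplots()
    ax_tmp.plot([1, 2, 3], [2, 4, 3])
    plt.close(fig_tmp)

# Figures created afterwards use the house style again.
fig, ax = plt.subplots()
ax.plot([1, 2, 3], [2, 4, 3])
```

The same settings could also live in a `.mplstyle` file and be loaded with `plt.style.use`, which is handy once you want to share one style across several notebooks.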

Q: So what's the best way to learn data visualization in Python?

A: Reading documentation, as with most coding, is honestly the best way to get intimately familiar. But, also as with most coding, the second best way is to just start doing. I'm hoping that this repository eases your transition to "just doing" or maybe shows a technique that you may not have known previously. What really works for me is taking examples that I find in the docs for a library and then googling/reading documentation as I manipulate every aspect of the example until I'm satisfied with the variety of outputs. What also works for me is creating this repo so I don't forget the things that I've learned...

Q: So what's contained in this repo?

A: Ah, here we come to the crux of the subject. Using completely public or synthetic data, I've built a set of notebooks that span a range of topics in Python data viz, from fundamentals to different graph types to building complex objects like ridge plots. The following Jupyter notebooks (which can render completely on GitHub in your browser) are currently included:

  • Part 1 Key Principles: Fundamentals and best practices of data visualization using Matplotlib, Python's ground-level data viz library
  • Part 2 Archetypes of Viz: Creating a variety of data viz archetypes and discussing their use cases
  • Part 3 Analytics Viz: Demonstrating the incorporation of data viz as part of the scientific/research process
  • Part 4 Complex Viz Manipulation: Techniques for manipulating our fundamental viz templates to create some more complex data viz and data viz systems
  • Part 5 Style Sheets: Methods for setting up graph styling customizations that will persist through your notebook or script
  • Part 6 Advanced 3rd Party Packages: Introducing and discussing powerful packages to extend our data visualization capabilities in Python

Q: I'm not able to preview the dynamic graphs or interactive modules on GitHub. What do I do?

A: You could always clone the repo, but if you want to see the entire rendered notebook, I recommend using NBViewer, which, unlike GitHub's preview interface, will display JS and rich graphs.
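If you'd rather build the NBViewer link than paste the GitHub URL into the NBViewer search box, here is a small sketch of the URL pattern. The helper name `nbviewer_url`, the `master` branch, and the notebook filename are all placeholders for illustration; swap in the real branch and path of the notebook you want to view.

```python
from urllib.parse import quote


def nbviewer_url(owner: str, repo: str, branch: str, notebook_path: str) -> str:
    """Build an NBViewer link for a notebook hosted in a public GitHub repo."""
    # Percent-encode spaces and other special characters, but keep "/" separators.
    encoded_path = quote(notebook_path)
    return f"https://nbviewer.org/github/{owner}/{repo}/blob/{branch}/{encoded_path}"


# Placeholder branch and filename; check the repo for the actual values.
url = nbviewer_url(
    "senthilsnat", "PythonDataVizTutorials", "master", "Example Notebook.ipynb"
)
print(url)
```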

Q: What's still to come?

A: This repo is a living destination, and I consider it to be perpetually a work in progress, as I continue learning and get new inspiration. With that said, there are still some more notebooks that I'm already building out:

  • Interactive visualizations, showing how to build animations as well as interactive controls inside a Jupyter notebook

Q: Contact me with feedback!

A: So this is not really a question, but please reach out if you have any feedback or have any requests/ideas/inspiration. I can be found on Twitter @SENTHIS.

About

A set of comprehensive (but certainly non-exhaustive) tutorials for data visualization with Python
