A workshop from PyTennessee 2018
You can go through the tutorial online by clicking the following button:
Be aware that launching this environment online usually takes at least 10 minutes, even with a good internet connection. If your connection is slow or you don't want to wait that long, follow the Installing Environment & Dependencies section below instead (installing the environment locally takes about 5 minutes).
This tutorial covers several visualization techniques in Python, with a particular focus on exploratory research and data cleaning. It uses a dataset of Stack Overflow posts from the first two weeks of September 2017, compiled from the Stack Exchange Data Explorer.
- Data Completeness
- Text Visualization
- Visualize Time
- Visualize Network Connections
- Mapping Location
To use the scripts in this repository, you must have Anaconda installed on the machine that will run them. This simplifies the process of installing all the dependencies.
If you are unsure whether Anaconda is installed on your machine, run conda -h in your terminal. This should bring up a help message. If you get a command not found error, follow the installation instructions here. After installation, you may still need to add Anaconda to your PATH variable. If conda -h still doesn't work, see the instructions on adding Anaconda to your path in the Anaconda installation instructions.
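The same check can be scripted. The following is a minimal sketch using only the Python standard library; the function names `find_conda` and `conda_help_works` are illustrative and not part of this repository.

```python
import shutil
import subprocess


def find_conda():
    """Return the full path of the conda executable, or None if it is not on PATH."""
    return shutil.which("conda")


def conda_help_works(conda_path):
    """Run `conda -h` and report whether it exited successfully."""
    try:
        result = subprocess.run(
            [conda_path, "-h"],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except OSError:
        # The executable is missing or cannot be started.
        return False
    return result.returncode == 0


if __name__ == "__main__":
    path = find_conda()
    if path is None:
        print("conda was not found on PATH; install Anaconda or add it to PATH.")
    elif conda_help_works(path):
        print(f"conda is available at {path}")
    else:
        print(f"conda was found at {path}, but `conda -h` did not run cleanly.")
```

If the script reports that conda is missing right after an installation, opening a new terminal is often enough for the updated PATH to take effect.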
For reference on Anaconda environments, see: https://conda.io/docs/user-guide/tasks/manage-environments.html
The package comes with a Makefile that provides several useful functions. Use this Makefile to ensure that you have all the necessary dependencies, as well as the correct conda environment.
- Show all available functions in the Makefile
$: make show-help
Available rules:
clean Delete all compiled Python files
environment Set up python interpreter environment - Using environment.yml
remove_environment Delete python interpreter environment
test_environment Test python environment is setup correctly
update_environment Update python interpreter environment
- Create the environment from the environment.yml file:
make environment
- Activate the new environment, pytn_viz_2018:
source activate pytn_viz_2018
- Update the environment from the environment.yml file (when the required packages have changed):
make update_environment
- Deactivate the new environment:
source deactivate
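While the environment is active (that is, after the activation step and before deactivating), a quick way to confirm that Python is running inside pytn_viz_2018 is to inspect the environment variables. The sketch below assumes that conda exports `CONDA_DEFAULT_ENV` on activation, which current conda releases do; the helper names are illustrative and are not part of the Makefile's test_environment rule.

```python
import os

EXPECTED_ENV = "pytn_viz_2018"  # environment name used in this README


def active_conda_env(environ=None):
    """Return the name of the active conda environment, or None if none is active."""
    environ = os.environ if environ is None else environ
    # Assumption: conda sets CONDA_DEFAULT_ENV when an environment is activated.
    return environ.get("CONDA_DEFAULT_ENV") or None


def is_expected_env(environ=None, expected=EXPECTED_ENV):
    """Report whether the active conda environment matches the expected name."""
    return active_conda_env(environ) == expected


if __name__ == "__main__":
    name = active_conda_env()
    if is_expected_env():
        print(f"Running inside the {EXPECTED_ENV} environment.")
    else:
        print(f"Active environment is {name!r}; expected {EXPECTED_ENV!r}.")
```

The optional `environ` argument only exists so the check can be exercised with a plain dictionary instead of the real process environment.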
If you are trying to install the environment on a Mac and get an Xcode error, you may need to run
xcode-select --install
Thank you to:
- Victor Calderon - for helping me set up the environment
- Gayathri Narasimham - for inspiring me to do this talk and getting me interested in forum data
- Rachael Brady - for teaching me the power of visualization and being a guinea pig for this talk