# Welcome to DATA_SCI 7040 Big Data Visualization 

This course covers visualization techniques and methods for a broad range of data types prevalent in engineering, the life sciences, media, and business. Theoretical and practical aspects of information visualization and exploratory data visualization are taught with a hands-on approach, giving students experience in handling data with a range of tools and programming environments. Topics include visual perception and distortions, color theory, preattentive processing, data types and models, visual variables, efficient visualizations, design principles, the grammar of graphics, spatial visualization and maps, graph theory and network visualization, and data storytelling. Hands-on programming exercises create plots, charts, heatmaps, and spatial and network visualizations using R and Python libraries.


Please see below for the suggested schedule and course repository instructions. 

### Course Module Schedule


1. [Module One: Introduction to Data Visualization](modules/module1/Schedule.ipynb)

   - Introduction
   - Types of visualization: infovis vs. sciviz
   - Impact of visualization: classical examples
     - Anscombe's quartet, cholera map, Minard's map of 1812, etc.
   - Why visualize data
   - Practice and exercise: Anscombe's quartet recreation, simple scatter plot with ggplot2
---
2. [Module Two: Color and Human Visual Perception](modules/module2/Schedule.ipynb)

   - Perceptual distortions in color: simultaneous contrast, color blindness, etc.
   - Using colors for quantitative vs. qualitative data
     - choosing color maps: colorbrewer
     - bad example: avoid rainbow color map
   - Preattentive processing
     - basic visual properties
     - perceptual/spatial grouping
     - Gestalt principles
   - Practice and exercise: samples of color maps from colorbrewer, changing colormaps in ggplot2
---
3. [Module Three: Elements of visualization](modules/module3/Schedule.ipynb)

   - Data types: nominal, ordinal, quantitative
   - Data models vs. Conceptual models
   - Visual variables: position, size, color, shape, etc.
     - characteristics of visual variables
     - Stevens's power law
   - Efficient/effective vs. ineffective visualizations
   - Practice and exercise
     - bad example of visualization: scatterplots in R and Python with wrong visual variables     
     - Online design test: https://www.perceptualedge.com/files/GraphDesignIQ.html
     - ggplot2 practice: scatter plots, line graphs, bar graphs, box and whisker plots.
---
4. [Module Four: Visualization Design Principles](modules/module4/Schedule.ipynb)

   - Tufte's design principles
   - Lie factor
   - Graphical integrity
   - Data-ink ratio
   - Data density, chart junk
   - Visualization analysis
   - Practice and exercise
     - introduction to plot.ly: topic 3 examples in plot.ly
     - ggplot2 practice: Histograms, heat maps     
---
5. [Module Five: Grammar of Graphics](modules/module5/Schedule.ipynb)

   - Components of the layered grammar  
   - Separating visual design into levels (Tamara Munzner)
   - ggplot2 examples of layered approach
   - Practice and exercise
     - population pyramids, bubble charts
     - Online: gapminder.org - recreating gapminder-like bubble charts
---
6. [Module Six: Spatial Visualization](modules/module6/Schedule.ipynb)

   - Maps and projections
   - Choropleth maps
   - Visualizing trajectory data
   - Practice and exercise
     - Choropleth map examples in R and plot.ly (election map, population map, etc.)
     - Spatial data example: NYC taxi or similar data in R
---
7. [Module Seven: Network Visualization](modules/module7/Schedule.ipynb)

   - Graph theory
   - Basic graph drawing and layouts
   - Network Visualization and analysis
   - Trees (hyperbolic, radial, treemaps)
   - Practice and exercise
     - treemaps
     - plot.ly and/or D3 examples of network visualization
     - igraph and/or graphviz examples 
---
8. [Module Eight: Storytelling with Visualizations](modules/module8/Schedule.ipynb)

   - Narrative visualizations
   - Data storytelling
   - Online: www.presentationzen.com, http://www.storytellingwithdata.com/, http://style.org

---

### Resources
Weekly videos and discussion topics will be available on Canvas. Course help and communication will be on Slack. Each module will have reading links and other resources available here on JupyterHub. Please consult the weekly module schedule notebook for details and other particulars. 

This course requires programming in R and Python using visualization libraries such as ggplot2, plotly, plotnine, and seaborn. All of these software resources are freely available on the Internet and are already installed on JupyterHub for your use. You do not need to install anything on your local computer other than a **current version of the Chrome browser**.


### Weekly Module Download
Each Saturday, a new module becomes available. Use **```git pull```** to download the module material to your account on europa.dsa.missouri.edu (JupyterHub).
Clicking a module link above before its content has been pulled will return a 404 error.
#### Steps:
1. Open Terminal in JupyterHub
2. Change into the course folder
```bash
cd f22dsa7040_yourusername
```
3. Execute command:
```bash
git pull --no-edit upstream master
```
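
If the pull reports an error, it may help to confirm that a remote named `upstream` is configured before retrying. The following is a minimal sketch, not an official course script: the helper name `check_upstream` is hypothetical, and it assumes you run it from inside the course folder.

```bash
# Hypothetical helper (not part of the course repository):
# report whether a remote named "upstream" is configured here.
check_upstream() {
  if git remote get-url upstream >/dev/null 2>&1; then
    echo "upstream is configured: $(git remote get-url upstream)"
  else
    echo "no remote named upstream was found" >&2
    return 1
  fi
}
```

After defining the function in your terminal, `check_upstream && git pull --no-edit upstream master` pulls only when the remote exists.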


### Weekly Module Schedule
Each module runs for 7 days, from Saturday to the following Saturday, and can be worked through at your own pace within that window. **The only component that must be completed by a particular day is the discussion post, because other students are required to respond to several submissions. Posting by the intended date gives others adequate time to respond to your post.**

**Saturday - Wednesday**
 - Labs
 - Videos
 - Readings
 - Practices
 
**Wednesday**
 - Post Discussion 
 
**Thursday**
 - Respond to Discussions
 
**Wednesday - Saturday**
 - Exercises


**Module exercises will be collected at midnight Central Time on Saturdays.**
  To submit, push your exercises notebook (and any other notebooks, if you choose) to the git repository.
#### Steps:
1. Open Terminal in JupyterHub
2. Change into the course folder

```bash
cd f22dsa7040_yourusername
```
3. Execute command:

```bash
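# Stage all new, modified, and deleted files in the course folder
git add --all .
# Commit the staged changes with a message naming the module
git commit -a -m "completed exercises module X"
# Push the commit to origin for submission
git push origin master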
```
`X` stands for the module number. 
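
After pushing, you may want to confirm that nothing was left behind. The sketch below is one possible check, assuming the remote is named `origin` and the branch is `master`, as in the commands above. The function name `submission_check` is hypothetical and not part of the course tooling.

```bash
# Hypothetical helper: summarize what has not yet reached origin/master.
submission_check() {
  local uncommitted unpushed
  # Entries that are modified or untracked, so not yet committed.
  uncommitted=$(git status --porcelain | wc -l | tr -d ' ')
  # Commits that exist locally but not on the last known origin/master.
  unpushed=$(git rev-list --count origin/master..HEAD)
  echo "uncommitted entries: $uncommitted"
  echo "unpushed commits: $unpushed"
}
```

If both counts are 0 after `git push origin master`, your local work matches what the repository last received.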
