Data literacy workshop

Have suggestions or feedback? Please let me know: email rob [at] uohack [dot] com or file a pull request!

We'll be hosting this workshop Tuesdays from 6-8 p.m. in Knight Library room 144. Schedule below:

Date      Subject
Oct. 17   Getting started, the basics of data
Oct. 24   Cleaning data, spreadsheets and formulas
Oct. 31   Halloween, we'll take this week off
Nov. 7    Sharing data, the basics of data visualization
Nov. 14   Advanced analysis with Python Pandas

Introduction

Data is everywhere.

We interact with data every day in thousands of ways, yet data literacy is seldom taught in schools. No matter the industry you wish to enter, no matter what you want to do with your life, data will be a part of it.

Data is certainly nothing new.

But recently, processing data has become extremely cheap and efficient. More than 30 years ago, accountants had large spreadsheets laid out over big desks and counted numbers by hand. If one cell changed, it might take a day or more to update the rest of the sheet.

Now computers can render changes to a spreadsheet instantaneously and process thousands of data points in near-real time to give you complex analysis as the data changes.

Everyone should learn basic data literacy skills.

To show how important data has become, not just in America but around the world, let's look at some data.

I did a few quick Google searches to see how influential three key phrases are on the internet, as a rough snapshot of their overall popularity. The following information was gathered in June 2017.

First, I tried searching for the phrase "university of oregon," which returned 124 million results. Not bad, but let's expand that a little to a worldwide topic like "football," which returns 1.4 billion results (about 11 times as many as "university of oregon").

Now, let's try the word "data." This returns 5.6 billion results, four times as many as "football" and about 45 times as many as "university of oregon."
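The comparisons above are simple division. Here is a minimal Python sketch that reproduces them from the three result counts quoted in the text. The function name `times_as_many` is made up for this illustration.

```python
# Result counts quoted in the text (Google searches, June 2017).
results = {
    "university of oregon": 124_000_000,
    "football": 1_400_000_000,
    "data": 5_600_000_000,
}


def times_as_many(a, b):
    """Return how many times larger the count for term a is than for term b."""
    return results[a] / results[b]


print(round(times_as_many("football", "university of oregon"), 1))  # 11.3
print(round(times_as_many("data", "football"), 1))                  # 4.0
print(round(times_as_many("data", "university of oregon"), 1))      # 45.2
```

Rounding to one decimal place shows that "11 times" and "45 times" are approximations, while "four times" is exact.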

These numbers are big and can seem fuzzy when compared anecdotally. Let's look at them in two other ways. First, as a table:

term                         results
university of oregon     124,000,000
football               1,400,000,000
data                   5,600,000,000

Since this is a small, simple data set, seeing the figures in a table can help put them in perspective. In this case, right-aligning the numbers helps add context. Another way to visualize the data is, of course, a chart.
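If you want to see why right-aligning helps, here is a rough Python sketch that prints the same three rows with the counts lined up on the right. The column widths (22 and 15) and the function name `format_table` are arbitrary choices for this example.

```python
rows = [
    ("university of oregon", 124_000_000),
    ("football", 1_400_000_000),
    ("data", 5_600_000_000),
]


def format_table(rows):
    """Build a plain-text table: terms left-aligned, counts right-aligned."""
    lines = [f"{'term':<22}{'results':>15}"]
    for term, count in rows:
        # '<' left-aligns, '>' right-aligns, ',' adds thousands separators.
        lines.append(f"{term:<22}{count:>15,}")
    return "\n".join(lines)


print(format_table(rows))
```

With the digits lined up on the right, the longer numbers visibly stick out to the left, so you can compare magnitudes at a glance.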

[Chart: Google search results for each term]

This simple chart, made in Google Sheets, gives the reader an additional way to compare the numbers.

While all three ways of viewing data (anecdote, table and chart) are technically correct, they each provide a different experience. If you are going to be analyzing data, you also need to keep in mind how you will communicate your findings.

Data literacy is a skill that can be learned and should be practiced.

Course overview

This is a four-course introduction to data literacy. We will start from zero, with no prior knowledge required, and work our way up to advanced data analysis using Python.

My goal is to introduce you to these topics and give you the tools to begin working with data. You will see several examples, all of which use real-world data, and learn different techniques to work with the various types of information.

Like learning anything else, you will need to practice in order to get better. Unfortunately, this takes time and effort. Fortunately, the tools are largely open-source and free. If you pay attention to these four courses, you will be equipped with a solid foundation to tackle any data set.

Let's get going.

Course list

  • 01 - Getting started
    • Basic steps of working with data
    • CSV (Comma-Separated Values)
    • Get data into a spreadsheet
    • Example: Lane County pot shop delivery
  • 02 - Cleaning data
    • Basic spreadsheet formulas
    • Percent change
    • Example: Extract data from PDF (city budget) and clean
  • 03 - Sharing data
    • Being transparent with data
    • Types of data visualizations
    • Example: Query federal data and create map
  • 04 - Advanced analysis
    • Use Python pandas to analyze data
    • Example: Examine campus parking citations using pandas
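To give a taste of two topics from the list, CSV and percent change, here is a small sketch using only Python's standard library. It is not taken from the course materials. The CSV text reuses the search counts from the introduction, the function names are made up, and the figures passed to `percent_change` are hypothetical.

```python
import csv
import io

# The search-result table from the introduction, written as CSV text.
# Counts are plain digits because a comma inside a value would need quoting.
csv_text = """term,results
university of oregon,124000000
football,1400000000
data,5600000000
"""


def read_counts(text):
    """Parse CSV text into a dict mapping each term to an integer count."""
    reader = csv.DictReader(io.StringIO(text))
    return {row["term"]: int(row["results"]) for row in reader}


def percent_change(old, new):
    """Percent change from old to new: (new - old) / old * 100."""
    return (new - old) / old * 100


counts = read_counts(csv_text)
print(counts["data"])  # 5600000000

# Hypothetical figures, for illustration only: a value going from 200 to 250.
print(percent_change(200, 250))  # 25.0
```

A spreadsheet does the same work with a formula in a cell; the courses start there before moving to Python.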

Other resources

Of course, my four courses here are a very short introduction to a massive world of information. Here are some additional resources depending on what you're interested in.

What great books and blog posts am I missing? Let me know! rob [at] uohack [dot] com