Instructor: Dustin White, PhD
Email: drwhite@unomaha.edu
This course is designed to help you learn about the most important topics and methods that make up modern data analytics. We will discuss how to use data tools to improve your understanding of business decisions, how to communicate clearly with data analysts about the goals of your team, and how to convey results from data analysis projects to managers. This course is intended to help you build a bridge between technical teams and management: you will provide value by communicating business needs to technical teams and technical results to managers who may not be familiar with analytics methods.
We will cover eight topics during this course. Lectures/videos will be relatively short. The rest of your time in this course will be spent on a series of REQUIRED labs in which you will use data analytics to implement incremental class projects. The goal of this course is to improve your ability to take advantage of data analytics in your career. If there are things that we are not covering that you feel are critical, please bring them to my attention so that I can help you with the topics you care most about!
The reading assignments for this class come from two books:
- Predictive Analytics, by Eric Siegel (ISBN: 978-1119145677)
- Data Science for Business, by Foster Provost and Tom Fawcett (ISBN: 978-1449361327)
For each topic, you will write a one- to two-page summary that applies the readings to your job (or to your life or hobbies if you are a full-time student) and explains how they could be implemented by you or your team at work.
Topics 1-4 will cover the material needed for the first project in the course (which will be due after we begin topic 5), and topics 5-8 will correspond to the second project.
Introduction to Data Analytics – This week we will discuss what data analytics is and what it is not. Data analytics is kind of a hot thing right now, and we want to make sure that we understand some limitations of using data. Topics will include: defining data analytics, flaws in data, omitted variables, and asking the right questions. In lab, we will get acquainted with data by using Tableau software to visualize and get to know some interesting data.
Read PA: Intro and DSB: Ch. 1
SQL and Databases – This week we take a critical step in making use of very large datasets: we need to learn how to access small segments of them quickly, and to construct basic explanations of that data, so that we can understand what the data might be used for. I will walk you through what SQL is, and how to use it to query a database. In lab, we will use our time to find an interesting subset of a database for our first project, and construct summary statistics for that data.
Read PA: Ch. 1 and DSB: Ch. 2
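To make the lab goal concrete, here is a minimal sketch of pulling a subset of a database and computing summary statistics with SQL. It uses Python's built-in `sqlite3` module with an in-memory database. The `orders` table, its columns, and the values in it are hypothetical and exist only for illustration; the database used in lab may look quite different.

```python
import sqlite3

# Build a tiny in-memory database (hypothetical table and data).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [
        ("Midwest", 120.0),
        ("Midwest", 80.0),
        ("West", 200.0),
        ("Midwest", 100.0),
        ("West", 50.0),
    ],
)

# Step 1: select a small, interesting subset with WHERE.
subset = conn.execute(
    "SELECT region, amount FROM orders WHERE region = ? ORDER BY amount",
    ("Midwest",),
).fetchall()

# Step 2: summarize that subset with aggregate functions.
summary = conn.execute(
    "SELECT COUNT(*), AVG(amount), MIN(amount), MAX(amount) "
    "FROM orders WHERE region = ?",
    ("Midwest",),
).fetchone()

print(subset)
print(summary)
conn.close()
```

The same two steps, filtering with `WHERE` and summarizing with aggregate functions, carry over to much larger databases.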
Data Visualization – Now that we have done some basic work in both creating smaller, more tractable data, and in basic data visualization, we are ready to talk about the importance of visualizing data, and using that to inform our analysis. We will talk about good and bad visualizations, and how to tell the story of your data using visuals. In lab, we will prepare high-quality visualizations that we could use to present and explain important aspects of the data that we explored in Week 2.
Read PA: Ch. 2 and DSB: Ch. 3
Introduction to Regressions – After spending the past few weeks exploring our data, and getting our heads around the questions that we might want to ask of our data, we will talk about classical methods of data analysis. We will spend some time talking about what correlations and regressions are, and about when they are the best tools for the job. In lab, we will take our data, and explore the effects of variables that we choose on some outcome that we care about. We will be able to combine this information with our visualizations and summary statistics from previous weeks into a nice briefing for our “boss” about what we learned from our data, and why he/she should believe our discovery.
Read PA: Ch. 3 and DSB: Ch. 4
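As a rough illustration of what correlation and a simple regression measure, the sketch below computes both by hand for one explanatory variable. The data and the variable names (`ad_spend`, `sales`) are made up for this example; in lab you would use your own project data and most likely a statistics package rather than hand-written formulas.

```python
import math

# Hypothetical data: five observations of an explanatory variable and an outcome.
ad_spend = [1.0, 2.0, 3.0, 4.0, 5.0]
sales = [2.0, 4.0, 5.0, 4.0, 5.0]

n = len(ad_spend)
mean_x = sum(ad_spend) / n
mean_y = sum(sales) / n

# Sums of squares and cross-products around the means.
s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(ad_spend, sales))
s_xx = sum((x - mean_x) ** 2 for x in ad_spend)
s_yy = sum((y - mean_y) ** 2 for y in sales)

# Ordinary least squares for the line: sales = intercept + slope * ad_spend.
slope = s_xy / s_xx
intercept = mean_y - slope * mean_x

# Pearson correlation: strength of the linear relationship, between -1 and 1.
correlation = s_xy / math.sqrt(s_xx * s_yy)

print(f"slope={slope:.2f}, intercept={intercept:.2f}, r={correlation:.3f}")
```

The slope estimates how much the outcome changes per unit of the explanatory variable. On its own it does not show that one causes the other, which is where omitted variables from Topic 1 come back in.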
Classification and Supervised Learning – For the second half of the course, we will focus primarily on classification techniques. This week, we will discuss what classification is, and why it is so popular in data analytics. We will also define supervised and unsupervised learning, and look at an example of each. Lab this week will be focused on trying to visually classify individuals using plotting and other visualizations of our data.
Read PA: Ch. 4 and DSB: Ch. 5
Decision Trees – Decision trees are one of the simplest and most powerful analytic tools in use today. We will talk about what a decision tree is, what makes it different from a histogram classifier, how it is “grown”, and how it is “pruned” in order to come to the best tree for making out-of-sample classifications of observed data. Lab will consist of preparing data for use in decision tree algorithms, as well as training a decision tree in order to evaluate its accuracy in out-of-sample predictions.
Read PA: Ch. 5 and DSB: Ch. 6
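To give a feel for how a tree is "grown", the sketch below performs a single growth step: it tries every possible split point on one feature and keeps the one that leaves the two resulting groups as pure as possible. Gini impurity is used as the purity measure here by assumption; other measures (such as entropy) are also common, and the data is invented for illustration. A full tree repeats this step within each resulting group.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 labels (0 means perfectly pure)."""
    if not labels:
        return 0.0
    p_one = sum(labels) / len(labels)
    return 1.0 - p_one ** 2 - (1.0 - p_one) ** 2


def best_split(values, labels):
    """Return (threshold, weighted impurity) of the best single split."""
    pairs = sorted(zip(values, labels))
    best = (None, float("inf"))
    for i in range(1, len(pairs)):
        # Candidate threshold: midpoint between neighboring feature values.
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [label for value, label in pairs if value <= threshold]
        right = [label for value, label in pairs if value > threshold]
        weighted = (len(left) * gini(left) + len(right) * gini(right)) / len(pairs)
        if weighted < best[1]:
            best = (threshold, weighted)
    return best


# Hypothetical data: one numeric feature and a yes/no (1/0) outcome.
feature = [1, 2, 3, 4, 5, 6, 7, 8]
outcome = [0, 0, 1, 0, 1, 1, 1, 1]

best_threshold, best_impurity = best_split(feature, outcome)
print(best_threshold, best_impurity)
```

Pruning works in the opposite direction: splits that do not improve out-of-sample accuracy are removed.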
Ensemble Methods – Just like people, an individual algorithm might be biased. By combining many decision trees or other machine learning algorithms, we can come to a more accurate estimate based on the available data. We will discuss how ensemble methods work, their advantages, and their use in industry. In lab, we will generate random forests of decision trees in order to improve the accuracy of our predictions from last lab.
Read PA: Ch. 6 and DSB: Ch. 7
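The intuition behind combining many models can be seen in a small simulation. The sketch below is not a random forest. It assumes a set of classifiers that are each right 60% of the time and that make their mistakes independently, then compares one classifier with a majority vote of 101 of them. Both the 60% figure and the independence assumption are illustrative; trees in a real forest are correlated, so the gain in practice is smaller.

```python
import random


def majority_vote_accuracy(n_voters, p_correct, n_trials, rng):
    """Share of trials in which more than half of the voters are correct."""
    wins = 0
    for _ in range(n_trials):
        correct_votes = sum(rng.random() < p_correct for _ in range(n_voters))
        if correct_votes > n_voters / 2:
            wins += 1
    return wins / n_trials


rng = random.Random(0)  # fixed seed so the simulation is repeatable

single = majority_vote_accuracy(1, 0.6, 5000, rng)
ensemble = majority_vote_accuracy(101, 0.6, 5000, rng)

print(f"single classifier: {single:.3f}")
print(f"majority of 101:   {ensemble:.3f}")
```

Under these assumptions the vote is right far more often than any single voter, which is the effect a random forest tries to capture by training each tree on a different random sample of the data.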
Data Maturity and Cutting Edge Models – The tools we have learned so far have been easy to apply, and they work in ways that are easy to visualize. More advanced techniques such as neural networks perform the same tasks, but can be applied to many more difficult problems such as natural language processing and image processing. We will go over the basic concepts of neural networks in class, and explore a simple neural net in lab in order to compare it to the other techniques we have learned.
Read PA: Ch. 7
All assignments are due by 11:59 PM, and will be submitted through Canvas. No late assignments will be accepted for any reason. See the Late Policy below.
Assignment | Point Value |
---|---|
Reading Assignments | 200 |
Project 1 | 350 |
Project 2 | 350 |
Discussion Participation | 100 |
Grade | Threshold | Grade | Threshold |
---|---|---|---|
A | 93-100% | C | 73-76.9% |
A- | 90-92.9% | C- | 70-72.9% |
B+ | 87-89.9% | D+ | 67-69.9% |
B | 83-86.9% | D | 63-66.9% |
B- | 80-82.9% | D- | 60-62.9% |
C+ | 77-79.9% | F | 59.9% or less |
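If you want to check where you stand, the two tables above can be combined as in the sketch below. The point values and grade thresholds come from the tables; the example scores are hypothetical, and the sketch assumes percentages are not rounded before the letter grade is looked up, which this syllabus does not specify.

```python
# Points possible per category, from the assignment table.
POINTS_POSSIBLE = {
    "Reading Assignments": 200,
    "Project 1": 350,
    "Project 2": 350,
    "Discussion Participation": 100,
}

# Lowest percentage for each letter grade, from the grading scale.
GRADE_FLOORS = [
    (93, "A"), (90, "A-"),
    (87, "B+"), (83, "B"), (80, "B-"),
    (77, "C+"), (73, "C"), (70, "C-"),
    (67, "D+"), (63, "D"), (60, "D-"),
]


def letter_grade(points_earned):
    """Return (percentage, letter) for a dict of points earned per category."""
    total_possible = sum(POINTS_POSSIBLE.values())
    percent = 100 * sum(points_earned.values()) / total_possible
    for floor, letter in GRADE_FLOORS:
        if percent >= floor:  # assumption: no rounding is applied
            return percent, letter
    return percent, "F"


# Hypothetical student scores.
example_scores = {
    "Reading Assignments": 180,
    "Project 1": 300,
    "Project 2": 315,
    "Discussion Participation": 90,
}
percent, letter = letter_grade(example_scores)
print(percent, letter)
```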
I will not accept late assignments or quizzes for any reason, including technical or internet issues. Remember that work can be done ahead of time if you think that you will be unable to complete an assignment during the week that it is due! Having access to a computer and an internet connection is your responsibility when enrolling in an online course.
- Demonstrate understanding of the most important methods underlying modern data analytics
- Demonstrate the ability to use data tools to make business decisions
- Clearly communicate results from data analysis projects to a lay audience
If you run into technical problems, reach out immediately! Please note that technical issues are not a reason for accepting late assignments. Do not wait until the last minute to complete your assignments.
- Take your time to complete the readings, video modules, and projects. Waiting until the last minute to complete them is not recommended. I have taught this class often enough to know a last-minute project when I see one.
- Be involved in discussion. It's not required, but it IS where you can learn more and discuss applications!
Reasonable accommodations are provided for students who are registered with Disability Services and make their requests sufficiently in advance. For more information, contact Disability Services (MBSC 111, Phone: 554-2872, TTY: 554-3799) or go to the website: www.unomaha.edu/student-life/inclusion/disability-services. Please meet with me so we can discuss your unique accommodations.
All students are required to adhere to the highest standards of academic integrity and behavior and must satisfy the UNO Academic Integrity Policy and the Student Code of Conduct. It is the student’s responsibility to read, understand, and abide by these policies. Any cheating in the class (exams, quizzes, etc.) will result in the individual(s) involved failing the course and the name(s) of the individual(s) being reported to the UNO administration, which may result in expulsion from the College or University. Academic dishonesty includes, but is not limited to:
- Copying or attempting to copy (in whole or in part) from another student’s assignment.
- Allowing a student to copy or attempt to copy (in whole or in part) from your assignment.
- Copying or downloading another student’s assignment (in whole or in part) and submitting it as your own work.
- Using or attempting to use unauthorized materials or notes during an exam.
- Sharing information during an examination.
- Engaging or attempting to engage the assistance of another individual in misrepresenting academic performance on any graded assignment.
- Attempting to take credit for the intellectual creation of others as one’s own work.
- Helping or attempting to help another student to commit an act of academic dishonesty.
- Changing or destroying scores or grading marks on any graded assignment.
- Fabricating an excuse (such as illness, injury, or accident) to avoid or delay timely submission of any graded assignment.
- Copying questions (by hand or electronically) from exams.
- Photographing exams.
- Giving or receiving information about a test, practice problem or assignment to/from students in your online course or in other sections of the course.