syllabus.md

title: Syllabus
school: CU
year: 2022
semester: spring
course_name: Security Analytics
coursenum: MSBX 5500
slack:
Sections

{% for item in page.coursenum %}- {{ item }} {% endfor %}

Instructor : Dave Eargle (contact)

Class : Fridays, 9:30am - 12:00pm

Office Hours : See canvas

Slack : {{page.slack}}

Course Description

This class is offered within the security analytics track of the business analytics masters at CU Boulder.

This class explores the application of data analytics to the domain of information security. It uses Python machine learning libraries to build and deploy models with both supervised and unsupervised algorithms. Business problem contexts include classifying the likelihood that a file or website is malicious based on either extracted static indicators or dynamic behavioral analysis (predictive analytics), as well as network anomaly detection on organizational network traffic data or on user account usage (unsupervised machine learning).

Consider this sage prediction from 2020, still relevant for us today:

The year 2020 expects to see an increase in the preventative approach of deep learning environments, which will become outdated and dangerous. TTPs will continue to evolve cyber threats; we’ll fight AI with AI. Drones hovering outside office windows will discuss ML and AI to combat the threat landscape. These AI will announce a strike over Twitter, the first monumental disruption in 2020.

Real-time data and analytics and machine learning and AI creates unpreparedness by corporations and Big Tech companies. Managed detection engines are built on human made logic, but keeping this up-to-date against the latest studies costs almost three million cyber security. Perhaps the most attention raised by increasingly employed AI-based solutions is our need to reconsider our notions of what makes a mistake.

-- Kelly Shortridge, VP of product strategy at Capsule8's, bot.

Prereqs

This class has the following prerequisites:

  • Proficiency in (or concurrent coursework on) concepts of computer networking and information security management.
  • Proficiency in the basics of statistics and machine learning, as well as in using Python to perform the same.

Students in the security analytics track of the MSBA satisfy both prerequisites. MBAs may take the class, but only if they demonstrate competency to me in the two prerequisites above. This is not a "learn Python ML from scratch" class.

Learning Outcomes

Synopsis: Students will use security-related datasets to practice and demonstrate comprehension of principles of reproducible data science and deploying machine learning models.

Use code versioning and collaboration tools : Includes:

  • Use Git and GitHub responsibly
  • Write markdown
  • Submit pull requests

Use cloud computing for data science : Includes:

  • Spin up Jupyter notebooks on cloud instances. Fast.

Do reproducible data science : Includes:

  • Share your code, its results, and (maybe) your data
  • "Works on my machine": specify environments using tools like venv, MyBinder, and Docker
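As one illustration of pinning an environment, the sketch below lists every package installed in the active interpreter as `name==version` lines, similar in spirit to what `pip freeze` produces. It is a minimal example using only the standard library, not a course-mandated workflow; the function name `pinned_requirements` is made up for this illustration.

```python
from importlib import metadata


def pinned_requirements():
    """Return sorted 'name==version' lines for distributions in the active environment."""
    pins = set()
    for dist in metadata.distributions():
        name = dist.metadata["Name"]
        if name:  # skip broken installs that have no recorded name
            pins.add(f"{name}=={dist.version}")
    # Case-insensitive sort so the listing is stable and easy to diff.
    return sorted(pins, key=str.lower)


if __name__ == "__main__":
    print("\n".join(pinned_requirements()))
```

Run inside a virtual environment (for example one created with `python3 -m venv .venv`), the output can be saved to a requirements file and committed next to the code so that others can rebuild the same package set.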

Deploy machine learning models with python : Includes:

  • "Pickle" (serialize) models
  • Use "pipelines" for generalizable ML processes
  • Choose cutoff thresholds by optimizing F1
  • Create APIs to consume serialized models
  • Deploy models to cloud platforms such as Heroku, or to AWS or GCP ML API endpoints
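Two of the items above, choosing a cutoff by optimizing F1 and pickling a model, can be sketched with the standard library alone. This is a toy illustration, not the course's lab code: the "model" is just a dictionary holding a threshold, the labels and scores are invented, and the helper names (`best_f1_threshold`, `save_model`, `load_model`, `predict`) are hypothetical. In the labs a fitted estimator from a Python ML library would take the place of the dictionary.

```python
import pickle


def f1_score(y_true, y_pred):
    """F1 = 2*TP / (2*TP + FP + FN); returns 0.0 when that denominator is zero."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def best_f1_threshold(y_true, scores):
    """Try each distinct score as a cutoff; return the (threshold, f1) pair with the highest F1."""
    best_threshold, best_f1 = None, -1.0
    for threshold in sorted(set(scores)):
        y_pred = [1 if s >= threshold else 0 for s in scores]
        f1 = f1_score(y_true, y_pred)
        if f1 > best_f1:  # ties keep the lowest threshold
            best_threshold, best_f1 = threshold, f1
    return best_threshold, best_f1


def save_model(model):
    """Serialize ("pickle") the model object to bytes."""
    return pickle.dumps(model)


def load_model(blob):
    """Restore a model from pickled bytes. Only unpickle data you trust."""
    return pickle.loads(blob)


def predict(model, scores):
    """Label a score as malicious (1) when it meets the stored threshold."""
    return [1 if s >= model["threshold"] else 0 for s in scores]


# Invented example data: 1 = malicious, 0 = benign.
labels = [0, 0, 1, 0, 1, 1]
scores = [0.10, 0.35, 0.40, 0.55, 0.70, 0.90]

threshold, f1 = best_f1_threshold(labels, scores)
restored = load_model(save_model({"threshold": threshold}))
print(threshold, round(f1, 3), predict(restored, [0.2, 0.5]))
```

On this toy data the cutoff 0.40 gives the highest F1 (6/7, about 0.857). The threshold is stored with the pickled object so that whatever later consumes the model, such as an API, applies the same cutoff that was chosen during evaluation.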

Communication

I use Canvas for announcements and Slack for asynchronous watercooler-type banter and help requests.

You aren't required to read everything that happens on Slack to complete assignments. If it comes up on Slack that something in an assignment is wrong or broken, I'll make an effort to update the assignment webpage. Therefore, don't rely on offline, potentially out-of-date copies of assignment documents.

You are required to be aware of all announcements I make via Canvas.

Technology Requirements

You need Python 3 on a computer that you bring to class. Besides that, you just need a stable internet connection to use cloud computing resources.

Text Materials

You must grok this book:

Foster Provost and Tom Fawcett (2013), Data Science for Business: What You Need to Know About Data Mining and Data-Analytic Thinking. O’Reilly Media. Available on Amazon

Warning -- the Kindle version has crummy images. The book is not expensive.

Assignments

Labs

This is a labs-based class. I will typically give you at least a week to complete each assignment. I may assign as many as one lab per week.

Labs may include submitting zip files of your code repositories, or submitting screenshot evidence of task completion.

Readings

I will assign you some readings and associated open-book quizzes from Provost and Fawcett, and from other sources.

Final Exam

The final exam will include conceptual questions covering topics from lecture.

Participation

Most students will earn 80% of these points. Students who are exceptional and go above and beyond in enhancing the classroom experience may receive a higher score.

The following list is not comprehensive, but rather an example of items considered for the class participation score:

  • Attend and participate in class sessions (attendance is required!)
    • Make a good effort to complete in-class activities
  • Participate on the class Slack workspace
    • Be a community member
    • Ask good questions
    • Answer questions
    • Use Slack reactions

Point Distribution

| Category | Weight |
|---|---|
| Labs | 65% |
| Reading Quizzes | 5% |
| Participation | 10% |
| Final Exam | 20% |
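To see how the weights combine, here is a small sketch of the arithmetic. It assumes each weight is a percentage of the final grade and that each category is scored from 0 to 100. The example scores are invented, and the syllabus does not specify rounding or letter-grade cutoffs, so none are applied.

```python
# Category weights from the table above, as percentages of the final grade.
WEIGHTS = {
    "Labs": 65,
    "Reading Quizzes": 5,
    "Participation": 10,
    "Final Exam": 20,
}


def course_grade(scores):
    """Weighted average of category scores (each 0-100), returned on a 0-100 scale."""
    if sum(WEIGHTS.values()) != 100:
        raise ValueError("weights must sum to 100")
    return sum(WEIGHTS[category] * scores[category] for category in WEIGHTS) / 100


# Hypothetical student: the 80 for participation mirrors the typical score described above.
example = {"Labs": 90, "Reading Quizzes": 100, "Participation": 80, "Final Exam": 85}
print(course_grade(example))  # 88.5
```

Because labs carry 65% of the grade, a 10-point change in the lab average moves the final grade by 6.5 points, while the same change in reading quizzes moves it by only 0.5.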

Late Work

All assignments and projects are to be submitted on time or early, so plan accordingly. If you must miss class, please submit your assignment early. On rare occasions, an exception may be granted, allowing the student to submit the work late with a 20% penalty. Under no circumstances will anything be accepted more than a week late.

{% include required-syllabus-statements-spring-2022.md %}