This is the repository for the LinkedIn Learning course Protecting Data for Analysis and Machine Learning. The full course is available from LinkedIn Learning.
Data security is the process of maintaining the confidentiality, integrity, and availability of an organization’s data. More simply put, it is the process of protecting data from unauthorized access, corruption, or theft. Organizations and individuals can face serious consequences from poor data security practices. With data increasingly used in analysis, machine learning models, and AI, it is more important than ever for everyone in an organization to have a solid understanding of data security practices in order to help keep organizations, and themselves, safe. In this course, learn the basics of data security and the potential consequences of ignoring it. Instructor Monica Royal explores the best practices for protecting the data analytics pipeline and demonstrates some of the most common data anonymization techniques.
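As a rough illustration of what data anonymization can look like in practice, here is a minimal Python sketch of three widely used techniques: masking, pseudonymization with a keyed hash, and generalization. This is not taken from the course notebooks; the function names, the sample values, and the key are all hypothetical, and the techniques the course demonstrates may differ.

```python
import hashlib
import hmac


def mask_email(email: str) -> str:
    """Masking: keep the first character and the domain, hide the rest."""
    local, _, domain = email.partition("@")
    return local[:1] + "*" * max(len(local) - 1, 0) + "@" + domain


def pseudonymize(value: str, secret_key: bytes) -> str:
    """Pseudonymization: replace a value with a keyed hash (HMAC-SHA256).

    The same input and key always give the same token, so records can
    still be joined, but the original value is not stored.
    """
    return hmac.new(secret_key, value.encode("utf-8"), hashlib.sha256).hexdigest()


def generalize_age(age: int, width: int = 10) -> str:
    """Generalization: report an age range instead of an exact age."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


if __name__ == "__main__":
    # Hypothetical key for illustration only; a real key should be
    # generated randomly and stored outside the code.
    key = b"example-key-for-illustration-only"
    print(mask_email("jane.doe@example.com"))
    print(pseudonymize("jane.doe@example.com", key))
    print(generalize_age(37))
```

Which technique fits depends on the use case: masking keeps values readable for display, a keyed hash keeps records linkable across tables, and generalization trades precision for a lower risk of re-identification.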
This repository has several files you can use to protect your data!
Practice_Course-Notebook is the file you will use to follow along with the course code and write it yourself. You can add code and run it right in the Codespace. If prompted, install the recommended extensions, then select the Python kernel when asked which kernel to use.
This is the same as the Practice_Course-Notebook file, but with the code already filled in. You can run the code directly from this notebook if you prefer to work with it in its final state.
This is a helpful list of data security terms you can reference once the course is complete.
This has some professional organizations you can join if you want to stay current in data security.
Data Professional, Instructor, Host of Data Podcast for Nerds