# Submodule 2 - Data Science Life Cycles, FAIR Data Principles, Data-Centric AI/ML, and Responsible AI/ML


## Overview
This submodule covers the data science life cycle; the FAIR principles for responsible data management; data-centric AI/ML, the discipline of systematically engineering the data used to build an AI/ML system; and fairness, transparency, and accountability in AI/ML development and deployment.

## Learning Objectives
At the end of this submodule, you should be able to:

+ Describe the data science life cycle
+ Explain the FAIR data principles and the metrics used to measure the FAIRness of a digital resource
+ Describe data-centric AI/ML, the discipline of systematically engineering the data used to build an AI/ML system
+ Discuss how to develop and deploy AI/ML systems that are ethical, fair, transparent, and accountable

## Prerequisites
* An AWS account with access to Amazon SageMaker
* Basic understanding of Python programming

## Get Started
- Watch the Lecture Videos.
- Complete the Quizzes to solidify your understanding.
- Enhance your programming skills with Tutorials.
- Challenge yourself with the Exercises.

## Data Science Life Cycle

Data science has deep roots in business analytics, where decades of work have leveraged customer databases, financial data, and other metrics to optimize business models and predict their success. Many of the concepts that apply in business are relevant to every application of data science. In this lecture we examine the data science life cycle and generalize it into a framework that can be applied to any scientific data analysis.

### Lecture Video

In [None]:
from IPython.display import YouTubeVideo

# Embed the lecture video from YouTube
YouTubeVideo(id='data_science_life_cycles', height=200, width=400)

### Lecture Slides

Download the lecture slides [Data Science Life Cycle](Submodule_2/Lectures/Submodule_2_Lecture_1_Data_Science_Life_Cycle.pptx).

### Quizzes
+ [Data Science Life Cycle](Submodule_2/Quizzes/Data_Science_Life_Cycle_Quiz.ipynb)

## FAIR Data Principles and FAIRness Metrics
This section covers the FAIR data principles: data should be Findable, Accessible, Interoperable, and Reusable. As data has become more prevalent and our ability to analyze and reanalyze it has increased, there has been a growing need for standards that ensure data produced today will still be accessible and usable tomorrow. The FAIR data principles are a set of guidelines for achieving this. Funding agencies, including the NIH and NSF, are increasingly asking (or requiring) the research they fund to comply with these standards.
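To make the idea of measuring FAIRness more concrete, here is a minimal sketch that scores a metadata record against a small checklist. The field names (`identifier`, `access_url`, `vocabulary`, and so on), the grouping of fields under each principle, and the example record are all assumptions made for illustration; they are not an official FAIR metric and may differ from the metrics covered in the lecture.

```python
# Illustrative checklist only: the field names and their grouping are
# assumptions for teaching, not an official FAIR metric.
FAIR_CHECKS = {
    "Findable": ["identifier", "title", "keywords"],
    "Accessible": ["access_url"],
    "Interoperable": ["format", "vocabulary"],
    "Reusable": ["license", "provenance"],
}


def fair_report(metadata):
    """Return, per principle, the fraction of expected fields that are present and non-empty."""
    report = {}
    for principle, fields in FAIR_CHECKS.items():
        present = [field for field in fields if metadata.get(field)]
        report[principle] = len(present) / len(fields)
    return report


# Hypothetical metadata record for a dataset.
example_metadata = {
    "identifier": "doi:10.0000/example",  # made-up identifier
    "title": "Example dataset",
    "keywords": ["demo"],
    "access_url": "https://example.org/data.csv",
    "format": "text/csv",
    "license": "CC-BY-4.0",
}

report = fair_report(example_metadata)
for principle, score in report.items():
    print(f"{principle}: {score:.0%}")
```

In this record the `vocabulary` and `provenance` fields are missing, so the Interoperable and Reusable scores come out lower than the other two. A real assessment would test much more than field presence, for example whether the identifier actually resolves.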

### Lecture Video

In [None]:
from IPython.display import YouTubeVideo

# Embed the lecture video from YouTube
YouTubeVideo(id='fair_data_principles_fairness_metrics', height=200, width=400)

### Lecture Slides

Download the lecture slides [FAIR Data Principles and FAIRness Metrics](Submodule_2/Lectures/Submodule_2_Lecture_2_FAIR_Data_Principles_and_FAIRness_Metrics.pptx).

### Quizzes
+ [FAIR Data Principles and FAIRness Metrics](Submodule_2/Quizzes/FAIR_Data_Principles_and_FAIRness_Metrics_Quiz.ipynb)

## Responsible AI/ML 
This section introduces responsible AI/ML: common biases in AI/ML datasets, fairness across the machine learning life cycle, how to measure fairness, how to mitigate bias, and some tools for responsible AI/ML.
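As a small illustration of measuring fairness, the sketch below computes the demographic parity difference: the gap between the highest and lowest rate of positive predictions across groups. This is only one of several fairness metrics and may not be the one emphasized in the lecture. The function names, predictions, and group labels are made up for the example.

```python
def selection_rate(predictions):
    """Fraction of predictions that are positive (1)."""
    return sum(predictions) / len(predictions)


def demographic_parity_difference(y_pred, groups):
    """Return (gap, rates): the largest difference in selection rate between groups,
    and the per-group selection rates."""
    by_group = {}
    for pred, group in zip(y_pred, groups):
        by_group.setdefault(group, []).append(pred)
    rates = {group: selection_rate(preds) for group, preds in by_group.items()}
    return max(rates.values()) - min(rates.values()), rates


# Hypothetical binary predictions and a hypothetical sensitive attribute.
y_pred = [1, 0, 1, 1, 0, 0, 1, 0]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]

gap, rates = demographic_parity_difference(y_pred, groups)
print(rates)  # selection rate per group
print(gap)    # 0.0 would mean equal selection rates across groups
```

Here group A receives a positive prediction 75% of the time and group B 25% of the time, so the gap is 0.5. A gap near zero indicates similar selection rates, but it does not by itself show that a model is fair.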

### Lecture Video

In [None]:
from IPython.display import YouTubeVideo

# Embed the lecture video from YouTube
YouTubeVideo(id='responsible_ai_ml', height=200, width=400)

### Lecture Slides

Download the lecture slides [Responsible AI/ML](Submodule_2/Lectures/Submodule_2_Lecture_4_Responsible_AL_ML.pptx).

### Quizzes
+ [Responsible AI/ML](Submodule_2/Quizzes/Responsible_AI_ML_Quiz.ipynb)

## Tutorials


## Exercises


## Conclusions
After this submodule, learners should have a basic understanding of the data science life cycle, the FAIR data principles, data-centric AI/ML, and responsible AI/ML.

## Clean up
Remember to shut down your VM and delete any related resources.