# Advice for applying machine learning

## Deciding what to try next

#### **Key Considerations in Machine Learning Projects**
1. **Effective Decision-Making**
   - Efficient use of machine learning tools depends on making good decisions about next steps.
   - Teams vary in how quickly they build effective systems; skill and strategy matter.

2. **Typical Steps When Facing Errors**
   - **Regularized Linear Regression Example:**
     - If predictions are poor, consider:
       - **More Training Data:** Can improve performance, but is not guaranteed to help.
       - **Feature Reduction:** Simplify the model by using fewer features.
       - **Feature Expansion:** Add new features or polynomial terms (e.g., $x_1^2$, $x_2^2$).
       - **Regularization Parameter ($\lambda$):** Adjust $\lambda$ to control regularization strength.

3. **Diagnostic Testing**
   - **Purpose:** Gain insights into what is working or not, and guide improvements.
   - **Impact:** Diagnostics can save time by revealing whether efforts (e.g., collecting more data) are worthwhile.
   - **Example Diagnostics:**
     - **Data Collection Impact:** Determine if additional data will improve performance.
     - **Feature Engineering:** Assess whether new or modified features are beneficial.
     - **Regularization Tuning:** Check if adjusting $\lambda$ leads to better results.

4. **Performance Evaluation**
   - Before proceeding with extensive changes, evaluate current performance to understand where improvements are needed.
   - Use diagnostic tests to guide decisions on model adjustments.

5. **Importance of Diagnostics**
   - Diagnostics may take time but are crucial for optimizing model performance and resource allocation.

This summary covers essential strategies for building effective machine learning systems and emphasizes the importance of diagnostic testing in making informed decisions.

## Evaluating a model

#### **Evaluating Model Performance**

1. **Model Evaluation Overview**
   - Systematic evaluation of a model’s performance helps identify how well it generalizes to new data.
   - Example: Predicting housing prices with a polynomial regression model.

2. **Challenges with Model Evaluation**
   - **High Order Polynomial Issue:**
     - A fourth-order polynomial may fit the training data well yet fail to generalize to new examples (overfitting).
     - With a single feature you can plot the fit and see this, but with more than two features the model cannot be visualized, so a numerical evaluation is needed.

3. **Evaluation Procedure**
   - **Splitting Data:**
     - Divide the dataset into **training** and **test** sets (e.g., 70% training, 30% test).
     - **Training Set:** Used to fit model parameters.
     - **Test Set:** Used to evaluate model performance.

4. **Performance Metrics**
   - **Regression Problems:**
     - **Training Error (J_train):** Average squared error on the training set, excluding the regularization term.
       $$J_{\text{train}} = \frac{1}{2m_{\text{train}}} \sum_{i=1}^{m_{\text{train}}} (\hat{y}_i - y_i)^2$$
     - **Test Error (J_test):** Average squared error on the test set, excluding regularization term.
       $$J_{\text{test}} = \frac{1}{2m_{\text{test}}} \sum_{i=1}^{m_{\text{test}}} (\hat{y}_i - y_i)^2$$
     - Low training error combined with high test error suggests overfitting.

   - **Classification Problems:**
     - **Training Error:** Fraction of misclassified examples in the training set.
     - **Test Error:** Fraction of misclassified examples in the test set.
       $$J_{\text{test}} = \text{Fraction of test examples misclassified}$$
     - For logistic regression, you can report either the logistic loss or the misclassification rate on each set; the misclassification rate is usually the easier one to interpret.

5. **Systematic Evaluation Process**
   - Splitting data and calculating both training and test errors helps evaluate and compare models.
   - Helps determine whether a model is overfitting or underfitting.

6. **Next Steps**
   - Use evaluation metrics to guide decisions on model complexity and feature engineering.
   - Explore further refinements to choose optimal models based on performance metrics.

This summary outlines the process and importance of evaluating machine learning models, including how to split data, calculate training and test errors, and interpret results for improving model performance.
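
The split and the two error measures above can be sketched in a few lines of plain Python. This is a minimal illustration, not a definitive implementation: the function names, the fixed shuffling seed, and the toy data are assumptions made for the example.

```python
import random


def train_test_split(examples, train_fraction=0.7, seed=0):
    """Shuffle a copy of the examples and split it into (train, test)."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(train_fraction * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


def squared_error_cost(predictions, targets):
    """J = 1/(2m) * sum((y_hat - y)^2), with no regularization term."""
    m = len(targets)
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / (2 * m)


def misclassification_rate(predicted_labels, true_labels):
    """Fraction of examples whose predicted label differs from the true label."""
    m = len(true_labels)
    return sum(p != t for p, t in zip(predicted_labels, true_labels)) / m


def predict(x):
    """Hypothetical, already-fitted model that is off by 1 on every example."""
    return 3.0 * x + 1.0


# Toy (x, y) pairs, invented for illustration only.
data = [(x, 3.0 * x + 2.0) for x in range(10)]
train_set, test_set = train_test_split(data)  # 7 training, 3 test examples

j_test = squared_error_cost(
    [predict(x) for x, _ in test_set],
    [y for _, y in test_set],
)
print(len(train_set), len(test_set), j_test)
```

The same `squared_error_cost` call on `train_set` gives $J_{\text{train}}$; comparing the two numbers is what the evaluation procedure relies on.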

## Model selection and training/cross validation/test sets

To choose among several candidate models, you need a systematic way to measure how well each one performs on data it has not seen. The key steps and concepts are:

### **1. Training and Test Error**

- **Training Error** ($J_{\text{train}}$): This is the average error of your model on the training set. It’s calculated as the average squared error (for regression) or average misclassification rate (for classification) over the training data. A very low training error indicates the model fits the training data well, but this alone does not guarantee good performance on new, unseen data.

- **Test Error** ($J_{\text{test}}$): This is the average error of the model on a separate test set that was not used during training. It provides an estimate of how well the model generalizes to new data. If the test error is much higher than the training error, the model is probably overfitting the training data and not generalizing well.

### **2. Choosing the Model**

When choosing between different models (e.g., different polynomial degrees for regression), you might:
- Fit models with varying complexity (e.g., polynomial degrees) to your training data.
- Evaluate each model’s performance using the test set error ($J_{\text{test}}$).
- Select the model with the lowest test error as the best model.

However, if the test set is used to pick the model, the reported test error becomes an optimistic estimate of the true generalization error. The choice itself (e.g., the polynomial degree) has effectively been fitted to the test set, so the winning model's test error is biased downward.

### **3. Cross-Validation**

To avoid this bias, you use a more robust procedure involving three datasets:
- **Training Set**: Used to train the model.
- **Cross-Validation Set (or Validation Set)**: Used to evaluate model performance and select the best model. This helps in tuning model parameters and choosing among different models without using the test set.
- **Test Set**: Used only at the end to get an unbiased estimate of the generalization error of the final model.

### **4. Model Selection Procedure**

1. **Split the Data**:
   - Split your data into training, cross-validation (CV), and test sets.

2. **Train Models**:
   - Train models with different complexities or configurations using the training set.

3. **Evaluate on CV Set**:
   - Evaluate each model on the cross-validation set to determine which model performs best in terms of cross-validation error ($J_{\text{cv}}$).

4. **Choose the Best Model**:
   - Select the model with the lowest cross-validation error.

5. **Estimate Generalization Error**:
   - After choosing the best model, evaluate it on the test set to estimate its generalization error.
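
The five steps above can be sketched with NumPy as follows. This is an illustrative sketch under stated assumptions: the synthetic quadratic data, the 60/20/20 split, the candidate degrees 1 to 6, and the helper names `mse_cost` and `select_degree` are all invented for the example.

```python
import numpy as np


def mse_cost(y_hat, y):
    """J = 1/(2m) * sum((y_hat - y)^2)."""
    return float(np.sum((y_hat - y) ** 2) / (2 * len(y)))


def select_degree(x_train, y_train, x_cv, y_cv, degrees):
    """Fit one polynomial per degree on the training set and score it on the CV set."""
    results = {}
    for d in degrees:
        coeffs = np.polyfit(x_train, y_train, deg=d)
        results[d] = (coeffs, mse_cost(np.polyval(coeffs, x_cv), y_cv))
    best = min(results, key=lambda d: results[d][1])  # lowest J_cv wins
    return best, results


# Step 1: synthetic data (assumption: a noisy quadratic), split 60% / 20% / 20%.
rng = np.random.default_rng(0)
x = rng.uniform(-3, 3, size=200)
y = 1.0 + 2.0 * x - 0.5 * x**2 + rng.normal(0, 0.5, size=200)
x_train, y_train = x[:120], y[:120]
x_cv, y_cv = x[120:160], y[120:160]
x_test, y_test = x[160:], y[160:]

# Steps 2-4: train each candidate, evaluate on the CV set, keep the best.
best_degree, results = select_degree(x_train, y_train, x_cv, y_cv, range(1, 7))

# Step 5: the test set is touched only once, for the chosen model.
best_coeffs = results[best_degree][0]
j_test = mse_cost(np.polyval(best_coeffs, x_test), y_test)
print(best_degree, j_test)
```

Because the test set plays no part in choosing `best_degree`, `j_test` remains a fair estimate of the generalization error.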

### **5. Bias-Variance Tradeoff**

Finally, understanding bias and variance is crucial for diagnosing model performance. Bias refers to errors due to overly simplistic models that don’t capture the underlying patterns (underfitting). Variance refers to errors due to overly complex models that fit the noise in the training data rather than the underlying pattern (overfitting). Balancing bias and variance helps in selecting a model that generalizes well.

This approach ensures that your model selection process is robust and provides a fair estimate of how well your model will perform on new, unseen data.

# Bias and variance

## Diagnosing bias and variance

1. **Bias and Variance Overview**:
   - **Bias**: Error due to overly simplistic assumptions in the model. High bias means the model is underfitting the data.
   - **Variance**: Error due to excessive sensitivity to small fluctuations in the training data. High variance means the model is overfitting the data.

2. **Diagnostic Approach**:
   - **High Bias (Underfitting)**:
     - **Indicator**: High training error (`J_train`) and high cross-validation error (`J_cv`).
     - **Example**: A linear model for a dataset that requires more complexity (e.g., polynomial regression).

   - **High Variance (Overfitting)**:
     - **Indicator**: Low training error (`J_train`) but high cross-validation error (`J_cv`).
     - **Example**: A high-degree polynomial that fits the training data very well but performs poorly on unseen data.

3. **Bias-Variance Tradeoff**:
   - As the complexity of the model increases (e.g., increasing polynomial degree), `J_train` generally decreases because the model fits the training data better.
   - `J_cv` typically decreases up to a point but then increases as the model becomes too complex and starts overfitting.
   - The goal is to find the model complexity where both `J_train` and `J_cv` are low and close to each other, indicating a good balance.

4. **Visualization**:
   - Plotting `J_train` and `J_cv` against model complexity (e.g., polynomial degree) usually shows a curve where `J_train` decreases and `J_cv` decreases initially but then starts to increase.

5. **High Bias and High Variance Simultaneously**:
   - This scenario is less common but can occur, for example, in neural networks with inadequate architecture or when the model is not suitable for the problem. 
   - **Indicator**: Both `J_train` and `J_cv` are high, with `J_cv` being much higher than `J_train`.

6. **Model Selection**:
   - **Procedure**:
     1. Split data into training, cross-validation, and test sets.
     2. Train models of varying complexity (e.g., polynomial degrees).
     3. Use cross-validation to select the model with the lowest cross-validation error.
     4. Evaluate the final model on the test set for an unbiased estimate of performance.

7. **Regularization**:
   - Regularization can help manage the bias-variance tradeoff by penalizing model complexity, thus preventing overfitting and improving generalization.

### Key Takeaways

- **Bias** affects the ability of the model to fit the training data, while **variance** affects the ability of the model to generalize to new data.
- **High Bias**: Model too simple. Solution: Increase complexity (e.g., higher polynomial degree).
- **High Variance**: Model too complex. Solution: Regularize or simplify the model.
- **Best Practice**: Use a validation set (cross-validation) to tune model parameters and avoid overfitting, then test the final model on a separate test set.

Understanding and diagnosing bias and variance will help in making informed decisions on model complexity and regularization to achieve better performance.

## Regularization and bias/variance

1. **Effect of a Large $\lambda$**: 
   - When $\lambda$ is large (e.g., 10,000), the model will strongly prioritize keeping the parameters $w$ small, leading to underfitting. The model will have high bias, performing poorly on both the training set and the cross-validation set.
   - This results in both the training error $J_{\text{train}}$ and cross-validation error $J_{\text{cv}}$ being large.

2. **Effect of a Small $\lambda$**: 
   - With a very small $\lambda$ (even zero), regularization has little effect, and the model can overfit the training data. This leads to high variance, where $J_{\text{train}}$ is small but $J_{\text{cv}}$ is much larger due to poor generalization.

3. **Intermediate $\lambda$**: 
   - An optimal value of $\lambda$ balances bias and variance, resulting in both $J_{\text{train}}$ and $J_{\text{cv}}$ being small. This reflects a model that generalizes well.

4. **Choosing $\lambda$ Using Cross-Validation**:
   - Cross-validation allows you to test different values of $\lambda$ by computing the cross-validation error for each. The goal is to find the $\lambda$ that minimizes $J_{\text{cv}}$, indicating the best balance of bias and variance.

5. **Plotting Errors as a Function of $\lambda$**:
   - As $\lambda$ increases:
     - $J_{\text{train}}$ increases because the model fits the training set less accurately.
     - $J_{\text{cv}}$ forms a U-shape, with high values when $\lambda$ is too small (overfitting) or too large (underfitting). The minimum point of the curve represents the optimal $\lambda$.

6. **Comparison with Polynomial Degree**:
   - The error curves plotted against $\lambda$ are roughly a mirror image of the curves plotted against polynomial degree. A small $\lambda$ or a high degree leads to overfitting (high variance), while a large $\lambda$ or a low degree leads to underfitting (high bias).

In summary, regularization helps control overfitting, and cross-validation is key for selecting a good regularization parameter $\lambda$. This trade-off between bias and variance, controlled by $\lambda$, is central to building well-performing models.
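
As a rough sketch of choosing $\lambda$ by cross-validation, the example below fits a regularized degree-8 polynomial for a grid of $\lambda$ values and keeps the one with the lowest $J_{\text{cv}}$. The synthetic data, the doubling grid of $\lambda$ values, the closed-form ridge solver, and all function names are assumptions made for illustration.

```python
import numpy as np


def poly_features(x, degree):
    """Columns x^1 ... x^degree (the intercept is added separately)."""
    return np.column_stack([x**p for p in range(1, degree + 1)])


def fit_ridge(X, y, lam):
    """Minimize 1/(2m)*||Xw + b - y||^2 + lam/(2m)*||w||^2 in closed form."""
    m, n = X.shape
    Xb = np.column_stack([np.ones(m), X])
    penalty = lam * np.eye(n + 1)
    penalty[0, 0] = 0.0  # the intercept b is not regularized
    return np.linalg.solve(Xb.T @ Xb + penalty, Xb.T @ y)


def cost(X, y, theta):
    """Unregularized squared-error cost, used for both J_train and J_cv."""
    Xb = np.column_stack([np.ones(len(y)), X])
    return float(np.sum((Xb @ theta - y) ** 2) / (2 * len(y)))


# Synthetic data (assumption): a noisy sine curve.
rng = np.random.default_rng(1)
x = rng.uniform(-1, 1, size=60)
y = np.sin(3 * x) + rng.normal(0, 0.2, size=60)
X = poly_features(x, degree=8)
X_train, y_train = X[:40], y[:40]
X_cv, y_cv = X[40:], y[40:]

lambdas = [0.0, 0.01, 0.02, 0.04, 0.08, 0.16, 0.32, 0.64, 1.28, 2.56, 5.12, 10.24]
j_train, j_cv = [], []
for lam in lambdas:
    theta = fit_ridge(X_train, y_train, lam)
    j_train.append(cost(X_train, y_train, theta))
    j_cv.append(cost(X_cv, y_cv, theta))

best_lambda = lambdas[int(np.argmin(j_cv))]
print(best_lambda)
```

Plotting `j_train` and `j_cv` against `lambdas` gives the two curves described in point 5: `j_train` never decreases as $\lambda$ grows, while `j_cv` is what the selection is based on.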

## Establishing a baseline level of performance

In this example on speech recognition, we're using the error rates on the training set and cross-validation set to analyze whether a model suffers from high bias or high variance. The process of diagnosing bias and variance can be broken down as follows:

1. **Training error ($J_{\text{train}}$)**: This measures how well the algorithm performs on the training set. In the example, the training error is 10.8%.
2. **Cross-validation error ($J_{\text{cv}}$)**: This measures how well the algorithm generalizes to unseen data. In this case, the cross-validation error is 14.8%.
3. **Human-level performance (baseline)**: This is the error rate that even humans achieve due to the inherent difficulty of the task. For speech recognition, humans might have a 10.6% error rate because of noisy or unclear audio.

### Analyzing Bias and Variance:
- **Bias**: If the training error is significantly higher than the baseline (human-level performance), the algorithm has a high bias problem. In this example, the training error (10.8%) is close to the human-level error (10.6%), so the bias is low.
- **Variance**: The variance is indicated by the gap between the training error and the cross-validation error. In this example, the cross-validation error (14.8%) is significantly higher than the training error (10.8%), suggesting a variance problem.

### Key Concepts:
- **Bias Problem**: If the training error is much higher than the baseline, this indicates the model is underfitting or is too simple to capture the underlying patterns of the data.
- **Variance Problem**: If the cross-validation error is significantly higher than the training error, the model is overfitting: it does well on the training set but fails to generalize to new, unseen data.

### Example Judgments:
1. **Low Bias, High Variance**: In the speech recognition example, the small difference between the human-level error (10.6%) and the training error (10.8%) suggests low bias. However, the larger gap between the training error (10.8%) and cross-validation error (14.8%) suggests high variance.
   
2. **High Bias, Low Variance**: If the training error were, say, 15% and the cross-validation error 16%, this would indicate high bias (the training error is much higher than the baseline) but low variance (the cross-validation error is not much worse than the training error).

3. **High Bias, High Variance**: If the training error were much higher than the baseline and the cross-validation error were in turn much higher than the training error, the model would suffer from both underfitting and overfitting at the same time.

### Conclusion:
By looking at the gap between the human-level error and the training error, you can diagnose bias, and by looking at the gap between the training error and cross-validation error, you can diagnose variance. If both gaps are large, the model may have both high bias and high variance. This analysis is crucial in machine learning for deciding whether to work on reducing bias (e.g., by choosing a more complex model) or variance (e.g., by regularizing the model).
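
The two-gap comparison can be written as a small helper. This is only a sketch: the function name and the 2-percentage-point threshold for calling a gap "large" are assumptions, since what counts as a large gap depends on the application.

```python
def diagnose(baseline_error, train_error, cv_error, threshold=2.0):
    """Compare the two gaps (in percentage points) against a threshold.

    `threshold` is a hypothetical cut-off chosen for this example only.
    """
    bias_gap = train_error - baseline_error
    variance_gap = cv_error - train_error
    return {
        "bias_gap": bias_gap,
        "variance_gap": variance_gap,
        "high_bias": bias_gap > threshold,
        "high_variance": variance_gap > threshold,
    }


# Speech recognition example from the text: baseline 10.6%, train 10.8%, CV 14.8%.
speech = diagnose(10.6, 10.8, 14.8)
print(speech["high_bias"], speech["high_variance"])  # False True

# Second example judgment from the text: train 15%, CV 16%.
second = diagnose(10.6, 15.0, 16.0)
print(second["high_bias"], second["high_variance"])  # True False
```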

## Learning curves

The concept of **learning curves** is crucial for understanding how well your machine learning algorithm performs as it gains more experience or more training data. When we plot learning curves, we typically track two key metrics: the **training error (J_train)** and the **cross-validation error (J_cv)** as a function of the number of training examples (**m_train**).

### Understanding the Learning Curves:
- **J_train**: Measures how well the model fits the training data. As the number of training examples increases, it becomes harder for the model to perfectly fit all the data points, so the training error tends to increase.
- **J_cv**: Reflects how well the model generalizes to unseen data. As you provide more training examples, the model typically becomes better at generalizing, so the cross-validation error decreases.

#### Key Insights from the Learning Curves:
1. **Training Error Increases**: Initially, with very few training examples, the model can fit the data perfectly, resulting in a near-zero training error. However, as the dataset size grows, it becomes more challenging to perfectly fit all examples, so the training error rises.

2. **Cross-Validation Error Decreases**: As the model is trained on more examples, it learns to generalize better, resulting in a decrease in cross-validation error. This decrease slows down after a certain point, as adding more data doesn’t necessarily improve generalization much beyond a certain level.

### High Bias (Underfitting):
- In this case, the model is too simple to capture the underlying structure of the data (e.g., fitting a linear model to data that requires a more complex function).
- **J_train** and **J_cv** both flatten at high values, indicating poor fit for both the training set and validation set. The gap between them remains small because the model performs similarly on both datasets (but poorly).
- More data doesn’t help much because the model’s capacity is insufficient.

### High Variance (Overfitting):
- Here, the model is too complex and fits the training data almost perfectly, but it struggles to generalize to new, unseen data.
- **J_train** is very low, but **J_cv** is much higher, indicating a large gap. The model is doing well on the training data but poorly on the validation data.
- In this case, adding more training data can help reduce variance and bring the cross-validation error closer to the training error, improving the model's ability to generalize.

### Practical Implications:
- **High bias**: More training data won’t help much. The focus should be on increasing model complexity (e.g., choosing a more flexible model).
- **High variance**: More data can significantly improve performance, as the model can generalize better with a larger dataset.
  
You can plot these learning curves by training your model on increasingly large subsets of the data (e.g., 100, 200, 300 examples) and recording how the errors change. Because this means training a separate model for each subset size, it is computationally expensive and not done often. Even so, having a mental picture of these curves helps in diagnosing whether your model suffers from high bias or high variance.
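
A minimal sketch of that procedure, using NumPy and a deliberately too-simple model so that the high-bias pattern shows up. The synthetic quadratic data, the linear model, and the function names are assumptions made for the example.

```python
import numpy as np


def cost(coeffs, x, y):
    """J = 1/(2m) * sum((f(x) - y)^2) for a polynomial with the given coefficients."""
    return float(np.sum((np.polyval(coeffs, x) - y) ** 2) / (2 * len(y)))


def learning_curve(x_train, y_train, x_cv, y_cv, degree, sizes):
    """Return [(m, J_train, J_cv)] for models fit on the first m training examples."""
    curve = []
    for m in sizes:
        coeffs = np.polyfit(x_train[:m], y_train[:m], deg=degree)
        j_train = cost(coeffs, x_train[:m], y_train[:m])
        j_cv = cost(coeffs, x_cv, y_cv)
        curve.append((m, j_train, j_cv))
    return curve


# Synthetic data (assumption): a noisy quadratic.
rng = np.random.default_rng(0)
x = rng.uniform(-3, 3, size=400)
y = 1.0 + 2.0 * x - 0.5 * x**2 + rng.normal(0, 0.5, size=400)
x_train, y_train = x[:300], y[:300]
x_cv, y_cv = x[300:], y[300:]

sizes = [100, 200, 300]
# A straight line cannot represent the quadratic, so both errors stay high.
high_bias_curve = learning_curve(x_train, y_train, x_cv, y_cv, degree=1, sizes=sizes)
# A quadratic matches the data-generating process and reaches a much lower error.
good_fit_curve = learning_curve(x_train, y_train, x_cv, y_cv, degree=2, sizes=sizes)

for m, j_train, j_cv in high_bias_curve:
    print(m, round(j_train, 3), round(j_cv, 3))
```

Passing each list of tuples to a plotting library with `m` on the horizontal axis gives the learning curve for that model.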



## Deciding what to try next revisited

This lecture provides a detailed explanation of how to diagnose and address high bias and high variance issues in machine learning models. Here's a summary of the key points:

### Diagnosing Bias and Variance:
- **High Bias**: Your model performs poorly on the training set, indicating it's too simple or underfitting the data.
- **High Variance**: Your model performs well on the training set but poorly on the validation set, indicating it's too complex or overfitting the data.

### Approaches to Fix Bias and Variance:
1. **Getting More Training Data**: 
   - Helps with high variance (overfitting), as more data can reduce the model's tendency to overfit.
   - Doesn't help with high bias.

2. **Using a Smaller Set of Features**: 
   - Helps with high variance, as reducing the number of features can simplify the model and reduce overfitting.

3. **Adding Additional Features**:
   - Helps with high bias, as a more complex model can fit the training data better by incorporating more relevant features.

4. **Adding Polynomial Features**:
   - Helps with high bias, as this increases the flexibility of the model to capture complex patterns in the data.

5. **Decreasing Regularization (λ)**:
   - Helps with high bias by allowing the model more flexibility to fit the training data better.

6. **Increasing Regularization (λ)**:
   - Helps with high variance, as it prevents the model from fitting the training data too closely, improving generalization.

### General Guidelines:
- To fix **high bias** (underfitting), increase model complexity (e.g., add features or reduce regularization).
- To fix **high variance** (overfitting), simplify the model (e.g., reduce features or increase regularization).

Understanding bias and variance helps you decide what action to take next when tuning your model, and mastering this skill improves with experience.
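
The six remedies can be kept as a simple lookup keyed by diagnosis. The names `REMEDIES` and `next_steps` are hypothetical; the content of the lists is taken directly from the approaches listed above.

```python
REMEDIES = {
    "high_variance": [
        "get more training data",
        "use a smaller set of features",
        "increase lambda",
    ],
    "high_bias": [
        "add additional features",
        "add polynomial features",
        "decrease lambda",
    ],
}


def next_steps(high_bias, high_variance):
    """Return the candidate fixes that match the diagnosed problem(s)."""
    steps = []
    if high_bias:
        steps += REMEDIES["high_bias"]
    if high_variance:
        steps += REMEDIES["high_variance"]
    return steps


print(next_steps(high_bias=False, high_variance=True))
```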

## Bias/variance and neural networks

This explanation focuses on how neural networks, particularly large ones, can help alleviate the classic **bias-variance tradeoff**. Before neural networks gained popularity, balancing bias and variance was a major challenge in machine learning, where **high bias** results in underfitting, and **high variance** leads to overfitting. The goal was to find an optimal tradeoff between these two, often through model complexity adjustments or regularization.

Large neural networks, when trained on small to moderately sized datasets, tend to be **low-bias models**. This is because larger networks (with more hidden layers or units) are capable of fitting even complex patterns in the training set. The recommended strategy is to:

1. **Check the performance on the training set ($J_{\text{train}}$):** If the model does poorly, you likely have a high bias issue, and one approach is to increase the size of the network to reduce bias.
   
2. **Check the cross-validation performance ($J_{\text{cv}}$):** If there's a large gap between the training and cross-validation errors, you likely have a high variance problem. To reduce variance, the common approach is to **gather more data**.

Neural networks offer flexibility because, with the right regularization, even very large models can perform well without necessarily increasing the risk of overfitting. Regularization techniques, like **L2 regularization**, help prevent overfitting by adding a penalty to large weight values in the network, making it feasible to use larger networks without introducing high variance.

In practice, neural networks allow machine learning engineers to focus more on **variance issues** rather than bias. One caveat is the computational cost, as larger networks require more resources and time to train. However, advancements in hardware, such as **GPUs**, have made this more manageable.

The two key takeaways are:
1. **Larger networks are often advantageous** as long as you regularize them properly.
2. **Neural networks are typically low-bias models**, especially when they are large, so the primary challenge is managing variance.

This shift in perspective has driven the success of deep learning, particularly in fields where vast amounts of data are available.
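
For reference, a common way to write an L2-regularized cost for a neural network is shown below. The notation is an assumption rather than something fixed by the text above: $f$ is the network's output, $L$ is the per-example loss, $m$ is the number of training examples, and the second sum runs over every weight $w$ in the network (bias terms are usually left out of the penalty).

$$J(\mathbf{W}, \mathbf{B}) = \frac{1}{m}\sum_{i=1}^{m} L\left(f(\mathbf{x}^{(i)}),\, y^{(i)}\right) + \frac{\lambda}{2m}\sum_{\text{all weights } w} w^2$$

Setting $\lambda = 0$ recovers the unregularized cost, while a larger $\lambda$ pushes the weights toward zero, which is what lets a large network be used without a matching increase in variance.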

# Machine learning development process

## Iterative loop of ML development

This lecture covers the iterative process of developing a machine learning system, illustrating the steps using an email spam classifier as an example.

### Key Steps in Machine Learning Development:
1. **Decide System Architecture**: Choose your model type, data, and hyperparameters.
2. **Train the Model**: Implement the model and train it with your data.
3. **Diagnostics**: Evaluate your model's performance using diagnostics like bias-variance analysis and error analysis (covered in the next section).
4. **Make Improvements**: Based on diagnostics, refine the model by changing the architecture, adding or removing features, adjusting regularization, or increasing data.
5. **Iterate**: Repeat this loop until you reach satisfactory performance.

### Spam Classifier Example:
- **Feature Engineering**: The example discusses using text features from emails, such as the presence or absence of specific words, to train a model. One method is to create binary features for words (1 if present, 0 if absent) from a dictionary of 10,000 words.
- **Challenges**: Handling misspelled words, deciding on word importance, and considering email routing data (headers) for additional features.
- **Next Steps**: Use diagnostics to guide improvement decisions, like whether to focus on adding more data, developing better features, or adjusting the model architecture.

Error analysis, a key part of diagnostics, is discussed in more detail in the next section.
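
The binary word features described in the spam classifier example can be sketched as follows. The tokenizer, the function name, and the five-word vocabulary are assumptions for illustration; the example in the text uses a dictionary of 10,000 words.

```python
import re


def email_to_features(email_text, vocabulary):
    """Return a 0/1 vector: 1 if the vocabulary word occurs in the email, else 0."""
    words = set(re.findall(r"[a-z0-9']+", email_text.lower()))
    return [1 if word in words else 0 for word in vocabulary]


# Hypothetical tiny vocabulary; a real one would hold thousands of words.
vocabulary = ["a", "andrew", "buy", "deal", "discount"]
features = email_to_features("Buy now! Best DEAL of the year, buy today.", vocabulary)
print(features)  # [0, 0, 1, 1, 0]
```

A word that appears several times still maps to 1; replacing the membership test with a count would give the word-count variant of the same idea.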

## Error analysis

This lecture introduces the concept of **error analysis**, a technique to guide the improvement of machine learning models by examining misclassified examples and understanding where the model is going wrong. Error analysis is second in importance only to bias-variance analysis in helping decide which model adjustments might improve performance.

### Key Concepts in Error Analysis:
1. **Manual Examination of Errors**: For a given set of cross-validation errors, manually review the misclassified examples to identify common themes or error types.
   - Example: In a spam classifier, categorize errors into groups like pharmaceutical spam, phishing emails, deliberate misspellings, or unusual email routing.
   
2. **Prioritizing Fixes**: Count how many errors fall into each category and use this information to prioritize what to fix. For example, if most errors are due to pharmaceutical spam, focusing on this category might yield the best performance improvements.
   - Low-impact issues, like deliberate misspellings, may not be worth significant effort if they account for only a small fraction of errors.
   
3. **Sampling Larger Error Sets**: If the number of errors is large, review a random subset (e.g., 100 examples) to get insights. This can still give a good indication of the most common error types.

4. **Inspiration for Fixes**: Error analysis can inspire specific actions to improve the model, such as collecting more data for problematic categories (like pharmaceutical spam) or creating new features (e.g., detecting suspicious URLs in phishing emails).

### Relation to Bias-Variance Analysis:
- **Bias-Variance** helps determine whether adding more data is useful or if other fixes (e.g., model complexity adjustments) are needed.
- **Error Analysis** gives insights into what types of errors are most common and what features or data can help improve performance.

The limitation of error analysis is that it’s most effective for tasks where humans can easily identify errors (e.g., spam detection). For more abstract tasks (e.g., predicting user clicks), it’s harder to apply. However, when applicable, error analysis can prevent wasted time by helping prioritize the most promising improvements.
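
Once the misclassified examples have been tagged by hand, the counting step is easy to automate. The sketch below assumes each error has already been given a set of category tags by a human reviewer; the tag names, the tiny example list, and the function name are hypothetical.

```python
import random
from collections import Counter


def tally_error_categories(tagged_errors, sample_size=100, seed=0):
    """Count category tags over the errors, sampling if there are too many to review.

    `tagged_errors` is a list of sets of tags, one set per misclassified
    cross-validation example. An example may carry more than one tag.
    """
    if len(tagged_errors) > sample_size:
        tagged_errors = random.Random(seed).sample(tagged_errors, sample_size)
    counts = Counter(tag for tags in tagged_errors for tag in tags)
    return counts.most_common(), len(tagged_errors)


# Hypothetical hand-tagged errors from a spam classifier.
errors = [
    {"pharmaceutical"},
    {"pharmaceutical", "misspelling"},
    {"phishing"},
    {"pharmaceutical"},
    {"routing"},
]
ranked, reviewed = tally_error_categories(errors)
for tag, count in ranked:
    print(f"{tag}: {count} of {reviewed}")
```

Because one email can belong to several categories, the counts may add up to more than the number of reviewed examples.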

## Adding data

This lecture discusses several strategies for increasing the amount of data for machine learning applications. Key points include:

1. **Targeted Data Collection**: Instead of gathering more data across the board, focus on areas identified by error analysis. For example, if a system struggles with pharmaceutical spam, collecting more examples of that specific type of data will improve performance more efficiently.

2. **Data Augmentation**: This involves generating new training examples by applying transformations to existing data, especially in image and audio recognition. For instance, rotating or resizing an image without changing its label, or adding background noise to audio clips, can enhance model robustness by mimicking real-world distortions.

3. **Data Synthesis**: This technique creates entirely new examples from scratch. A common use case is generating synthetic images in computer vision tasks, such as creating new fonts for optical character recognition (OCR) tasks. Synthetic data helps increase the training set size significantly.

4. **Data-Centric Approach**: Instead of solely focusing on improving the algorithm (model-centric approach), enhancing the quality and quantity of training data can lead to significant improvements in performance.
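
As one concrete case of data augmentation, the sketch below overlays background-noise recordings on clean audio clips and keeps the original labels. It is a simplified illustration: clips are plain lists of samples, and the function names and the `noise_gain` value are assumptions.

```python
def mix_background_noise(clip, noise, noise_gain=0.3):
    """Overlay a noise recording on a clean clip, looping the noise if it is shorter."""
    return [s + noise_gain * noise[i % len(noise)] for i, s in enumerate(clip)]


def augment_audio_dataset(dataset, noises, noise_gain=0.3):
    """Return the original (clip, label) pairs plus one noisy copy per noise recording."""
    augmented = list(dataset)
    for clip, label in dataset:
        for noise in noises:
            # The transcript (label) is unchanged by the added noise.
            augmented.append((mix_background_noise(clip, noise, noise_gain), label))
    return augmented


# Hypothetical toy data: one clean clip and two background-noise recordings.
dataset = [([0.0, 0.5, -0.5, 0.25], "hello")]
noises = [[1.0, -1.0], [0.5]]
augmented = augment_audio_dataset(dataset, noises)
print(len(augmented))  # 3
```

The distortions should resemble what the system will meet in practice, which is why recorded background noise is used here rather than arbitrary random values.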

Finally, the lecture mentions **transfer learning**, which is useful when you don't have enough data. This involves leveraging data from a different task to improve your model's performance on the target task.

## Transfer learning: using data from a different task

This lecture provides an in-depth explanation of transfer learning, a powerful technique in machine learning that allows one to leverage a large dataset for a different but related task when there is limited data available for the specific task of interest. Here's a summary:

### How Transfer Learning Works:
1. **Initial Training on a Large Dataset**: Begin by training a neural network on a large dataset (e.g., one million images across various categories like cats, dogs, etc.). This process helps the model learn general features like edges, shapes, and corners.
   
2. **Transfer to Target Task**: After training the neural network, its hidden layers (which have learned useful general image features) are reused with their learned parameters. Only the output layer is replaced with a new layer that corresponds to the specific task (e.g., recognizing digits 0 through 9).
   
3. **Fine-tuning**: 
   - **Option 1**: Train only the new final layer while keeping the earlier layers fixed.
   - **Option 2**: Train all layers but initialize them using the learned parameters from the first task.

### When to Use Transfer Learning:
- If the dataset for the target task is small, transfer learning can significantly improve performance by reusing the general features learned from the larger dataset.
- The neural network for the first step (pre-training) should be trained on data with the same input type (e.g., images for images, audio for audio).

### Example Applications:
- Large pre-trained models such as GPT-3 and BERT, and image networks pre-trained on the ImageNet dataset, are well-known instances where transfer learning has been successfully applied. These models can be fine-tuned for specific tasks using smaller datasets.

### Key Points:
- **Supervised Pre-training**: Learning from a large dataset to extract general features.
- **Fine-tuning**: Adjusting the model's weights on the smaller, specific task dataset.
- **Community Contribution**: Many pre-trained models are available online, allowing researchers to build on each other’s work.

This technique is particularly valuable in applications where labeled data is scarce but pre-trained models can be adapted to specific tasks, enabling better model performance with less data.
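
The difference between the two fine-tuning options comes down to which parameters the optimizer is allowed to update. The sketch below only tracks that bookkeeping and does no actual training; the `Layer` class, the function name, and the placeholder weights are hypothetical.

```python
import copy
from dataclasses import dataclass


@dataclass
class Layer:
    name: str
    weights: list
    trainable: bool = True


def build_transfer_model(pretrained_layers, new_output_layer, train_all_layers):
    """Reuse every pre-trained layer except the output layer, then add a new output.

    train_all_layers=False: Option 1, reused layers are frozen.
    train_all_layers=True:  Option 2, all layers are trained, starting from
                            the pre-trained values.
    """
    reused = copy.deepcopy(pretrained_layers[:-1])
    for layer in reused:
        layer.trainable = train_all_layers
    return reused + [new_output_layer]


# Placeholder weights stand in for parameters learned on the large dataset.
pretrained = [
    Layer("hidden_1", [0.1, 0.2]),
    Layer("hidden_2", [0.3, 0.4]),
    Layer("output_1000_classes", [0.5]),
]

option_1 = build_transfer_model(pretrained, Layer("output_10_digits", [0.0]), False)
option_2 = build_transfer_model(pretrained, Layer("output_10_digits", [0.0]), True)
print([(layer.name, layer.trainable) for layer in option_1])
```

In a real framework the same idea is usually expressed by marking layers or parameters as non-trainable before fine-tuning.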

## Full cycle of a machine learning project

### 1. **Scoping the Project**
   - **Define the goal**: Determine what problem you're solving (e.g., building a speech recognition system for voice search).
   - **Project planning**: Set clear objectives and what you aim to achieve.

### 2. **Data Collection**
   - **Gathering necessary data**: For example, collecting audio samples and their transcripts.
   - **Data quality**: Ensure your dataset is representative of the problem you're solving.

### 3. **Model Training**
   - **Initial training**: Train your machine learning model.
   - **Iterative improvement**: Perform error analysis, then return to the data collection stage if needed, such as gathering more data for specific scenarios (e.g., speech in noisy environments).

### 4. **Deploying the Model**
   - **Inference server**: Deploy the model on a server for real-time prediction (e.g., an API that lets the mobile app send audio and receive a text transcript).
   - **Scalability and engineering**: Depending on user scale, you may need software engineering for efficient predictions and scalability.
   - **Monitor performance**: Continuously monitor the system to detect when performance drops (e.g., when new data, such as unfamiliar names, causes the model to falter).

### 5. **MLOps and Maintenance**
   - **Ongoing updates**: Retrain and update models as new data arrives.
   - **System reliability**: Ensure system uptime and cost-effective computation, particularly at scale.

### 6. **Ethical Considerations**
   - Building machine learning systems responsibly is critical, and understanding the ethical implications is essential, especially in applications affecting many people.

This approach provides a comprehensive view of the machine learning lifecycle, from scoping and data gathering to deployment, monitoring, and maintaining models over time.

## Fairness, bias, and ethics

### Ethical Considerations in Machine Learning

1. **Historical Examples of Bias and Unethical Use**
   - **Hiring Tools**: Systems that discriminated against women.
   - **Facial Recognition**: Higher misidentification rates for darker-skinned individuals.
   - **Loan Approval Systems**: Bias in bank loan approvals affecting certain subgroups.
   - **Deepfakes and Misinformation**: Ethical concerns about creating and using deepfakes without consent.
   - **Toxic Content**: Algorithms that spread harmful or incendiary content due to engagement optimization.

2. **Guidelines for Ethical Machine Learning**

   - **Diverse Teams**: Assemble teams with varied backgrounds to better anticipate potential issues and harms.
   - **Literature Search**: Review industry standards and guidelines for fairness and bias relevant to your application area.
   - **System Auditing**: Audit the system for biases and fairness issues before deployment.
   - **Mitigation Plan**: Develop a plan to address problems if they arise, including rolling back to previous versions if necessary.
   - **Ongoing Monitoring**: Continuously monitor the system after deployment to detect and address any emerging issues.

3. **Developing Ethical Systems**
   - **Different Implications**: Recognize that ethical considerations vary in significance depending on the application (e.g., roasting coffee vs. approving loans).
   - **Commitment to Improvement**: Aim to improve practices and avoid past mistakes, ensuring that systems do not cause harm.
