### Traditional Machine Learning vs. Deep Learning

Traditional machine learning (ML) and deep learning (DL) differ significantly in how much training time they need to reach high accuracy and in the types of data they handle well.

#### Traditional Machine Learning

- **Effective for Tabular Data**: Traditional ML algorithms, such as decision trees, random forests, and gradient boosting machines, are particularly effective for structured, tabular data.
- **Rapid Initial Improvement**: These algorithms tend to show rapid improvements in accuracy with relatively little computational effort. As an illustrative figure (not a benchmark), a model might reach around 96% accuracy early in training.
- **Saturation Point**: Beyond a certain point, further improvements in accuracy become slow and incremental, often reaching a plateau.
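
To make the tabular-data point concrete, here is a minimal sketch of a decision stump (a decision tree with a single split), written in plain Python. The dataset, the column meanings, and the helper names `fit_stump` and `predict_stump` are hypothetical and exist only for illustration; real projects would normally use a library implementation of trees, forests, or boosting.

```python
# Toy tabular data (hypothetical): each row is (age, income_k), label 1 = bought.
rows = [(22, 25), (25, 32), (31, 40), (38, 58), (45, 62), (52, 80)]
labels = [0, 0, 0, 1, 1, 1]


def fit_stump(rows, labels):
    """Find the (column, threshold) split that misclassifies the fewest rows.

    The stump predicts 1 when row[column] > threshold, else 0.
    """
    best = None  # (errors, column, threshold)
    for col in range(len(rows[0])):
        values = sorted({row[col] for row in rows})
        # Candidate thresholds: midpoints between consecutive distinct values.
        for lo, hi in zip(values, values[1:]):
            threshold = (lo + hi) / 2
            errors = sum(
                int(row[col] > threshold) != label
                for row, label in zip(rows, labels)
            )
            if best is None or errors < best[0]:
                best = (errors, col, threshold)
    return best


def predict_stump(row, col, threshold):
    """Apply the learned split to a single row."""
    return int(row[col] > threshold)


errors, col, threshold = fit_stump(rows, labels)
accuracy = 1 - errors / len(rows)
print(f"split on column {col} at {threshold}: training accuracy {accuracy:.2f}")
```

A single threshold on one column already separates this toy table, which hints at why tree-based methods get good results on structured data with little computation. Random forests and gradient boosting combine many such splits.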

#### Deep Learning

- **Powerful for High-Dimensional Data**: Deep learning models, which are multi-layer neural networks, excel at unstructured, high-dimensional data such as images, text (for example, LLMs), and audio. They are also the basis of generative AI (GenAI).
- **Higher Computational Cost**: Training deep learning models typically requires significantly more computational resources and time than traditional ML. These models ***might*** achieve higher accuracy than traditional ML on tabular data, but this is not always the case.
- **Continued Improvement**: Deep learning models improve more slowly at first, but they keep gaining accuracy as gradient descent continues to update the network's weights, and on complex tasks they can eventually surpass traditional ML models.
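
The gradual improvement described above comes from repeating small gradient descent updates. The following is a minimal sketch of that update rule on a one-parameter linear model rather than a neural network; the toy data, the learning rate, and the function names `mse` and `mse_gradient` are assumptions chosen for illustration.

```python
# Toy data generated from y = 3 * x, so the ideal weight is 3.0 (hypothetical).
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 6.0, 9.0, 12.0]


def mse(w):
    """Mean squared error of the one-parameter model y_hat = w * x."""
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def mse_gradient(w):
    """Derivative of the mean squared error with respect to w."""
    return sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)


w = 0.0               # initial guess
learning_rate = 0.01  # step size, chosen by hand for this toy problem
losses = []
for step in range(100):
    losses.append(mse(w))                 # record the loss before the update
    w -= learning_rate * mse_gradient(w)  # move against the gradient

print(f"learned w = {w:.4f}, first loss = {losses[0]:.2f}, last loss = {losses[-1]:.2e}")
```

Each step lowers the loss a little, so the model keeps improving for as long as training continues. A deep network applies the same idea to millions of weights at once, which is where the extra time and compute go.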

#### Clarification on Machine Learning Terminology

In some contexts, the term **machine learning (ML)** is used broadly to cover all forms of machine learning, including deep learning. However, in most cases, and throughout this course, **machine learning** refers to traditional statistical machine learning methods. We will therefore distinguish between traditional ML and deep learning (DL) throughout this course.

The following graph illustrates these concepts with synthetic curves (generated from formulas, not measured results) that compare the accuracy of traditional ML and DL models over time:

```python
import numpy as np
import plotly.graph_objects as go

# Synthetic curve for traditional machine learning:
# fast rise (time constant 2), plateau at about 0.97
time = np.linspace(0, 100, 500)  # Time from 0 to 100 with 500 points
accuracy_ml = 0.96 * (1 - np.exp(-time / 2)) + 0.01

# Synthetic curve for deep learning:
# slower rise (time constant 20), about 0.98 at time 100, approaching 0.99
accuracy_dl = 0.98 * (1 - np.exp(-time / 20)) + 0.01

# Create the plot
fig = go.Figure()

# Add traditional machine learning plot
fig.add_trace(go.Scatter(x=time, y=accuracy_ml, mode='lines', name='Traditional Machine Learning',
                         line=dict(color='blue')))

# Add deep learning plot
fig.add_trace(go.Scatter(x=time, y=accuracy_dl, mode='lines', name='Deep Learning',
                         line=dict(color='green')))

# Add annotations
fig.add_annotation(x=10, y=0.96,
                   text="ML: Effective for Tabular Data",
                   showarrow=True,
                   arrowhead=2,
                   ax=40, ay=-30)

fig.add_annotation(x=100, y=0.98,
                   text="DL: Powerful for High-Dimensional Data<br>(GenAI/Images/LLMs)",
                   showarrow=True,
                   arrowhead=2,
                   ax=40, ay=-30)

# Update layout
fig.update_layout(
    title='Model Accuracy vs. Time Needed: Traditional Machine Learning vs. Deep Learning',
    xaxis_title='Time Needed',
    yaxis_title='Accuracy of the Model',
    xaxis=dict(range=[0, 100]),  # Set x-axis range
    legend=dict(x=0.9, y=0.1),
    template='plotly_white'
)

# Show the plot
fig.show()
```
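
For these synthetic curves, the point where deep learning overtakes traditional ML can be computed directly. The sketch below re-implements the two formulas from the plot with the standard library and finds the crossover by bisection. The function names and the search bracket `[1, 100]` are assumptions made for this example, and the result describes only the illustrative curves, not any real model.

```python
import math


def accuracy_ml(t):
    """Synthetic traditional-ML curve from the plot: fast rise, plateau near 0.97."""
    return 0.96 * (1 - math.exp(-t / 2)) + 0.01


def accuracy_dl(t):
    """Synthetic deep-learning curve from the plot: slow rise toward 0.99."""
    return 0.98 * (1 - math.exp(-t / 20)) + 0.01


def crossover_time(lo=1.0, hi=100.0, tol=1e-9):
    """Bisection on accuracy_dl(t) - accuracy_ml(t), which changes sign on [lo, hi]."""
    def gap(t):
        return accuracy_dl(t) - accuracy_ml(t)

    assert gap(lo) < 0 < gap(hi), "the curves must cross inside the bracket"
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if gap(mid) < 0:
            lo = mid  # DL still behind: the crossover is later
        else:
            hi = mid  # DL already ahead: the crossover is earlier
    return (lo + hi) / 2


t_cross = crossover_time()
print(f"DL overtakes ML at t = {t_cross:.2f}")
```

With these parameters the crossover falls at roughly t ≈ 78, late in the plotted range. This matches the picture: the traditional ML curve leads for most of the time axis, and the deep learning curve only pulls ahead near the end.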


### The Power of Machine Learning, specifically XGBoost
source: https://www.linkedin.com/posts/tunguz_ive-worked-in-data-science-for-a-while-activity-7197641642722426880-b_yC/

![xgboost.png](attachment:image.png)