# Online Machine Learning

Online machine learning is an approach in which a model learns from incoming data one observation at a time, updating itself as each arrives. 🔄 Unlike batch learning, which trains once on a fixed dataset, it adapts to changing data streams. It suits workloads that need fast processing, such as streaming analytics 📈 or recommendation systems. Because observations need not be stored after they are used, it handles large data volumes efficiently, but it also faces challenges such as concept drift 🌊 (the data distribution changing over time) and choosing a suitable learning rate. Overall, online ML is a powerful tool for building adaptive models from continuous data streams.

Examples include real-time stock market prediction 📈, dynamic pricing in e-commerce 💰, personalized news recommendations 📰, and fraud detection in financial transactions 🕵️‍♂️. Online ML is vital for scenarios where data is generated rapidly and needs immediate processing for decision-making.

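#!pip install river
from river import datasets
from river import linear_model
from river import metrics
from river import preprocessing

# Phishing is a small binary classification dataset bundled with river
dataset = datasets.Phishing()

# Pipeline: standardize features online, then apply logistic regression
model = preprocessing.StandardScaler() | linear_model.LogisticRegression()

metric = metrics.Accuracy()

for x, y in dataset:
    y_pred = model.predict_one(x)  # predict before the true label is used
    model.learn_one(x, y)          # then update the model with the label
    metric.update(y, y_pred)       # score the prediction made before learning

print(f"Accuracy : {metric.get()}")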



Accuracy : 0.8928


**1. Online Machine Learning**
Definition: Online machine learning is a way for models to learn from data as it arrives, continuously updating themselves instead of being trained on a static dataset.

Key Point: It's like learning on the go, adapting quickly to new information.


**2. When to Use?**
**Usage Scenarios:**

- Real-time Data: When data is constantly streaming, like live sensor data or social media feeds.
- Changing Data Patterns: When the data changes over time, like stock prices or user behavior.
- Limited Storage: When you can’t store large amounts of data and need to process it immediately.

Key Point: Use it when data comes in continuously and may change over time.


**3. How to Implement?**

Steps:

- Choose a Model: Select a model that supports online updates (e.g., logistic regression, neural networks).
- Stream Data: Feed the model new data points as they arrive.
- Update Continuously: Adjust the model with each new piece of data to improve its predictions.

Key Point: Continuously train and update the model with new data as it arrives.
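
The three steps above can be sketched without any library. This is a minimal illustration, assuming a hand-written logistic regression updated by stochastic gradient descent; the class `OnlineLogisticRegression` and the generator `synthetic_stream` are hypothetical names made up for this example, and the data is synthetic.

```python
import math
import random


class OnlineLogisticRegression:
    """Logistic regression trained one example at a time with SGD (illustrative)."""

    def __init__(self, n_features, learning_rate=0.1):
        self.weights = [0.0] * n_features
        self.bias = 0.0
        self.learning_rate = learning_rate

    def predict_proba_one(self, x):
        z = self.bias + sum(w * xi for w, xi in zip(self.weights, x))
        z = max(-30.0, min(30.0, z))  # clamp so exp() stays well-behaved
        return 1.0 / (1.0 + math.exp(-z))

    def predict_one(self, x):
        return self.predict_proba_one(x) >= 0.5

    def learn_one(self, x, y):
        # For one example, the gradient of the log loss is (p - y) * x
        error = self.predict_proba_one(x) - float(y)
        for i, xi in enumerate(x):
            self.weights[i] -= self.learning_rate * error * xi
        self.bias -= self.learning_rate * error


def synthetic_stream(n, seed=0):
    """Hypothetical data stream: the label is True when x0 + x1 > 0."""
    rng = random.Random(seed)
    for _ in range(n):
        x = [rng.uniform(-1, 1), rng.uniform(-1, 1)]
        yield x, (x[0] + x[1] > 0)


# Step 1: choose a model that supports online updates
model = OnlineLogisticRegression(n_features=2)

correct = 0
total = 0
# Step 2: stream the data, one observation at a time
for x, y in synthetic_stream(2000):
    correct += model.predict_one(x) == y  # evaluate on the example first
    model.learn_one(x, y)                 # Step 3: then update the model
    total += 1

accuracy = correct / total
print(f"Progressive accuracy: {accuracy:.3f}")
```

Predicting before learning on each example means every prediction is scored on data the model has not yet seen, so the running accuracy is an honest estimate.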



**4. Learning Rate**
Definition: The learning rate is a parameter that controls how much the model adjusts its weights in response to new data.

Key Point: It's like the speed at which the model learns. Too high, and updates overshoot and become unstable; too low, and the model adapts slowly.
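
To make the effect of the learning rate concrete, here is a toy sketch. It assumes a model with a single weight trained on a squared-error loss toward a fixed target; the function `final_error` and all values are illustrative, not taken from any library.

```python
def final_error(learning_rate, target=10.0, steps=50):
    """Run `steps` online updates on one weight and return |w - target|."""
    w = 0.0
    for _ in range(steps):
        gradient = w - target          # derivative of 0.5 * (w - target) ** 2
        w -= learning_rate * gradient  # the learning rate scales each update
    return abs(w - target)


for lr in (0.01, 0.5, 2.5):
    print(f"learning_rate={lr}: final error = {final_error(lr):.3g}")
```

In this toy setup each update multiplies the remaining error by `1 - learning_rate`. With 0.01 the weight is still far from the target after 50 steps, with 0.5 it has effectively converged, and with 2.5 every step overshoots by more than the last, so the error grows.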



**5. Out-of-Core Learning**
Definition: Out-of-core learning handles data that doesn’t fit into memory all at once. The model processes small chunks of the data sequentially, updating itself after each chunk.

Key Point: It's like learning from a huge book one page at a time because the whole book is too big to carry.
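
The "one page at a time" idea can be sketched as follows. This is a minimal example, assuming the data is a CSV with a single `value` column; an in-memory `io.StringIO` stands in for a large file on disk, and the helper `iter_chunks` is a hypothetical name. The "model" here is just a running mean, chosen to keep the sketch short.

```python
import csv
import io
from itertools import islice


def iter_chunks(rows, chunk_size):
    """Yield lists of at most `chunk_size` rows without loading everything."""
    rows = iter(rows)
    while True:
        chunk = list(islice(rows, chunk_size))
        if not chunk:
            return
        yield chunk


# Stand-in for a large file; in practice this would be an open file handle
fake_file = io.StringIO("value\n" + "\n".join(str(i) for i in range(1, 1001)))
reader = csv.DictReader(fake_file)

count = 0
mean = 0.0
for chunk in iter_chunks(reader, chunk_size=100):
    for row in chunk:
        count += 1
        # Incremental mean: only `count` and `mean` persist between chunks
        mean += (float(row["value"]) - mean) / count

print(f"rows={count}, mean={mean}")
```

At any moment only one chunk of 100 rows is held in memory, yet the final statistic covers all 1000 rows. The same loop shape applies when the per-row update is a model's learning step instead of a mean.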


**6. Disadvantages**
Main Drawbacks:

- Potential Overfitting: The model might adapt too much to recent data and lose its ability to generalize to new, unseen data.
- Resource Intensive: Requires continuous processing power and can be complex to implement.

Key Point: The model might become too focused on recent data and miss the bigger picture.
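
The risk of focusing too much on recent data can be shown with a toy stream. This sketch assumes a single running estimate updated with a fixed step size; the function `online_estimate` and the numbers are made up for illustration.

```python
def online_estimate(stream, learning_rate):
    """Track a stream with a fixed-step update, starting from its first value."""
    estimate = stream[0]
    for value in stream[1:]:
        estimate += learning_rate * (value - estimate)
    return estimate


# A long stable period followed by a short burst of unusual values
stream = [10.0] * 200 + [50.0] * 5

fast = online_estimate(stream, learning_rate=0.5)
slow = online_estimate(stream, learning_rate=0.02)

print(f"aggressive updates: {fast:.2f}")
print(f"cautious updates  : {slow:.2f}")
```

After only five unusual values, the aggressively updated estimate has nearly abandoned the 200 earlier observations, while the cautious one has barely moved. Neither is right in general: if the burst is noise the fast estimate is misled, and if it is a real shift the slow one lags behind.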


**7. Batch vs Online Learning**
**Comparison:**

- Batch Learning: The model is trained on a fixed dataset all at once.
  - Pros: Simpler and more stable.
  - Cons: Can’t adapt to new data without being retrained.
- Online Learning: The model updates continuously with each new data point.
  - Pros: Adapts to new data in real time.
  - Cons: More complex and can be unstable if not managed well.

Key Point: Batch learning is like studying for a test all at once, while online learning is like learning continuously every day.
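
The contrast can be sketched with a deliberately simple "model" that only estimates the average level of a signal. The data and the step size below are hypothetical, chosen so that the level shifts after the batch model has been trained.

```python
# Hypothetical data: the level shifts after the training data was collected
history = [1.0] * 100    # data available at training time
new_data = [5.0] * 100   # data that arrives after deployment

# Batch: fit once on the fixed dataset, then never change
batch_estimate = sum(history) / len(history)

# Online: start from the same point, then adjust with every new observation
online_estimate = batch_estimate
learning_rate = 0.1
for value in new_data:
    online_estimate += learning_rate * (value - online_estimate)

print(f"batch estimate : {batch_estimate:.2f}")
print(f"online estimate: {online_estimate:.2f}")
```

The batch estimate stays at the old level until someone retrains it on fresh data, whereas the online estimate follows the shift on its own.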

