You want to decide which class something belongs to (class 0 or class 1) using just one feature, called **X**. Logistic regression helps us do that.

1. **Basic Idea**
   If you look at many data points and group them by similar X values, you will see how often each class occurs within each group. As you move from low X values to high X values, the probability of belonging to class 1 usually changes smoothly.
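
The binning idea can be tried on synthetic data. The following is a minimal sketch, not a prescribed procedure: the helper name `class1_fraction_per_bin`, the bin edges, and the two Gaussian samples are all assumptions made up for illustration.

```python
import random

def class1_fraction_per_bin(xs, labels, bin_edges):
    """Return the fraction of class-1 points in each bin [edge_i, edge_{i+1})."""
    n_bins = len(bin_edges) - 1
    counts = [0] * n_bins
    ones = [0] * n_bins
    for x, y in zip(xs, labels):
        for i in range(n_bins):
            if bin_edges[i] <= x < bin_edges[i + 1]:
                counts[i] += 1
                ones[i] += y
                break
    # None marks an empty bin, where no fraction can be estimated
    return [o / c if c else None for o, c in zip(ones, counts)]

# Synthetic data (assumed for illustration): class 0 centred at 0, class 1 at 2
rng = random.Random(0)
xs = [rng.gauss(0.0, 1.0) for _ in range(2000)] + [rng.gauss(2.0, 1.0) for _ in range(2000)]
labels = [0] * 2000 + [1] * 2000

edges = [-3, -1, 0, 1, 2, 3, 5]
fractions = class1_fraction_per_bin(xs, labels, edges)
for lo, hi, f in zip(edges, edges[1:], fractions):
    if f is not None:
        print(f"X in [{lo}, {hi}): fraction of class 1 = {f:.2f}")
```

On data like this, the printed fractions should rise from near 0 in the lowest bin to near 1 in the highest, which is the smooth change described above.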

2. **Goal**
   We want to find the value of X at which both classes are equally likely, that is, where the probability of class 1 is 0.5. That value becomes our decision boundary.

3. **Mathematical Setup**
   To derive logistic regression, we assume:

   * The feature X behaves like a normally distributed (Gaussian) variable inside each class.
   * Both classes have the **same variance** but different means.

4. **Probability Inside a Bin**
   For a small range of X values, we want the probability that the item is class 1. This probability depends on how likely class 1 is compared to class 0 at that value of X.
   The **ratio** of the class 1 probability to the class 0 probability at that X is called the **odds** (often loosely called the odds ratio).

5. **Using the Gaussian Assumption**
   When we substitute the normal distribution formulas for each class and simplify:

   * The odds become an exponential of a **linear function** of X, because the X² terms cancel when both classes share the same variance.
   * This is why logistic regression can use a straight-line formula inside an exponential.
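
To make the simplification concrete, here is a sketch of the algebra under the two assumptions above. The symbols \(\mu_0, \mu_1\) (class means), \(s^2\) (the shared variance, written this way to avoid confusion with the sigmoid symbol σ used later), and \(\pi_0, \pi_1\) (the overall class proportions) are notation introduced here for illustration.

\[
\frac{P(1 \mid x)}{P(0 \mid x)}
= \frac{\pi_1 \exp\!\left(-\frac{(x-\mu_1)^2}{2s^2}\right)}{\pi_0 \exp\!\left(-\frac{(x-\mu_0)^2}{2s^2}\right)}
= \exp\!\left(\frac{\mu_1-\mu_0}{s^2}\,x + \frac{\mu_0^2-\mu_1^2}{2s^2} + \ln\frac{\pi_1}{\pi_0}\right)
\]

The normalizing constants cancel because the variances are equal, and so do the \(x^2\) terms. Matching this to \(e^{\beta_0 + \beta_1 x}\) would give \(\beta_1 = (\mu_1-\mu_0)/s^2\) and \(\beta_0 = (\mu_0^2-\mu_1^2)/(2s^2) + \ln(\pi_1/\pi_0)\).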

6. **Arriving at the Logistic (Sigmoid) Function**
   Because probability = odds / (1 + odds), the odds give the final formula for the probability of class 1:
   \[
   \sigma(x) = \frac{1}{1 + e^{-z}}
   \]
   where \(z = \beta_0 + \beta_1 x\).
   This S-shaped curve is called the **sigmoid** or **logistic** function.
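
As a small illustration, the formula can be written as a function. This is a sketch; the names `sigmoid` and `prob_class1` are chosen here for illustration, and the branch on the sign of `z` is one common way to avoid floating-point overflow rather than a required part of the method.

```python
import math

def sigmoid(z):
    """Logistic function 1 / (1 + e^{-z}), written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)  # z < 0 here, so exp(z) cannot overflow
    return ez / (1.0 + ez)

def prob_class1(x, beta0, beta1):
    """Modelled probability of class 1 for the one-feature model z = beta0 + beta1 * x."""
    return sigmoid(beta0 + beta1 * x)

# Odds -> probability: p = odds / (1 + odds) agrees with the sigmoid of z = log(odds)
z = 0.8
odds = math.exp(z)
assert abs(odds / (1.0 + odds) - sigmoid(z)) < 1e-12
```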

7. **Finding the Threshold**
   The threshold is where the probability equals 0.5.
   Setting σ(x) = 0.5 gives a simple condition:
   \[
   \beta_0 + \beta_1 x = 0
   \]
   Solving for x gives the decision boundary x = −β₀/β₁ (assuming β₁ ≠ 0).
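
A short sketch of the threshold step follows. The function names and the coefficient values are illustrative assumptions, not fitted to any data.

```python
def decision_boundary(beta0, beta1):
    """Solve beta0 + beta1 * x = 0 for x; undefined when beta1 == 0."""
    if beta1 == 0:
        raise ValueError("beta1 is zero: the probability does not depend on x")
    return -beta0 / beta1

def predict(x, beta0, beta1):
    """Class 1 when z >= 0, i.e. when the modelled probability is at least 0.5."""
    return 1 if beta0 + beta1 * x >= 0 else 0

# Illustrative coefficients (assumed, not fitted)
beta0, beta1 = -3.0, 1.5
print(decision_boundary(beta0, beta1))  # 2.0
```

With a positive β₁, values of x above the boundary are assigned to class 1 and values below it to class 0; a negative β₁ would flip the sides.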

8. **How to Find β₀ and β₁ in Real Life**
   Earlier we reasoned as if the class means and the shared variance were known, but in practice they are not. Instead, we estimate β₀ and β₁ directly from labeled data with an optimization method.

9. **Likelihood and Cross-Entropy**

   * The **likelihood** tells us how well our model explains the actual data.
   * We want to pick β₀ and β₁ that make the data as probable as possible.
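   * To make this easier, we take the **logarithm** of the likelihood, which turns a product over data points into a sum.
   * The negative of this log-likelihood (usually averaged over the data points) is the **cross-entropy**.
   * Maximizing the likelihood is therefore the same as **minimizing cross-entropy**.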
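
The bullets above can be sketched as a single function. This is a minimal illustration: the name `cross_entropy`, the `eps` clipping, and the four-point toy dataset are assumptions made here, and the plain `math.exp` call assumes z stays moderate in size.

```python
import math

def cross_entropy(xs, labels, beta0, beta1, eps=1e-12):
    """Mean negative log-likelihood of 0/1 labels under the one-feature logistic model."""
    total_log_lik = 0.0
    for x, y in zip(xs, labels):
        p = 1.0 / (1.0 + math.exp(-(beta0 + beta1 * x)))
        p = min(max(p, eps), 1.0 - eps)  # keep log() away from 0
        # log-probability the model assigns to the label actually observed
        total_log_lik += y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return -total_log_lik / len(xs)

# Toy data (assumed): low x values are class 0, high x values are class 1
xs = [0.0, 1.0, 2.0, 3.0]
labels = [0, 0, 1, 1]

print(cross_entropy(xs, labels, 0.0, 0.0))   # uninformative model: every p is 0.5
print(cross_entropy(xs, labels, -3.0, 2.0))  # boundary at x = 1.5 fits these labels better
```

The second call should print a smaller value than the first, reflecting that a lower cross-entropy means the data are more probable under the model.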

10. **Final Logistic Regression Problem**
    Logistic regression becomes an optimization task:

    * Find β₀ and β₁ that minimize cross-entropy.
    * This gives us the best-fitting logistic curve.
    * This curve gives us probabilities and the class boundary.
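
One way to carry out this optimization is plain gradient descent, sketched below. The source does not name a specific optimizer, so this choice, the function name `fit_logistic_1d`, the learning rate, the step count, and the synthetic data are all assumptions made for illustration.

```python
import math
import random

def fit_logistic_1d(xs, labels, lr=0.5, steps=2000):
    """Fit beta0, beta1 by plain gradient descent on the mean cross-entropy."""
    beta0, beta1 = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        grad0 = grad1 = 0.0
        for x, y in zip(xs, labels):
            p = 1.0 / (1.0 + math.exp(-(beta0 + beta1 * x)))
            grad0 += p - y          # contribution to d(loss)/d(beta0)
            grad1 += (p - y) * x    # contribution to d(loss)/d(beta1)
        beta0 -= lr * grad0 / n
        beta1 -= lr * grad1 / n
    return beta0, beta1

# Synthetic data (assumed): class 0 centred at 0, class 1 at 2, equal class sizes
rng = random.Random(0)
xs = [rng.gauss(0.0, 1.0) for _ in range(300)] + [rng.gauss(2.0, 1.0) for _ in range(300)]
labels = [0] * 300 + [1] * 300

beta0, beta1 = fit_logistic_1d(xs, labels)
boundary = -beta0 / beta1
print(f"beta0 = {beta0:.2f}, beta1 = {beta1:.2f}, boundary = {boundary:.2f}")
```

With class means of 0 and 2 and equal class sizes, the fitted boundary should land near the midpoint 1. In practice a library routine with a more robust optimizer would normally be used instead of this hand-written loop.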

### **In One Sentence**

Logistic regression finds the best S-shaped curve that turns a single input value into a probability between 0 and 1, and it chooses its parameters by minimizing cross-entropy so that it separates the two classes as well as it can.

