> # `Scalars`
> What are scalars?

<details>
<summary>Click to expand</summary>

A **scalar** is the simplest object in linear algebra.

* **Definition:** A scalar is just a single number.
* **Examples:** $5$, $-3.14$, $\sqrt{2}$, or $0$.
* **Notation:** Usually written with lowercase letters like $a, b, c$.

### How scalars differ from other objects

* **Scalar vs Vector:**
  A vector has **magnitude + direction** (like an arrow in space). A scalar is a single number with **no direction**; it can be positive, negative, or zero.
* **Scalar vs Matrix:**
  A matrix is a rectangular grid of numbers. A scalar is just one of those numbers by itself.

### Why scalars matter in ML

* Learning rate in gradient descent = scalar.
* Weights of a neuron = each individual weight is a scalar (together they form a vector).
* Loss value = scalar (just one number representing error).

Think of scalars as the **atoms** of linear algebra — the smallest building blocks.
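
As a minimal sketch of the three ML scalars listed above, the snippet below runs one gradient-descent step on a toy one-parameter model. The data, the model `y = w * x`, and the names `w`, `learning_rate`, `loss`, and `grad` are illustrative assumptions, not part of the original notes.

```python
import numpy as np

# Toy data for the hypothetical model y = w * x (the true w is 2.0).
x = np.array([1.0, 2.0, 3.0])
y = np.array([2.0, 4.0, 6.0])

w = 0.0                # a single weight: a scalar
learning_rate = 0.1    # step size: a scalar

# Mean squared error collapses all per-sample errors into one scalar.
loss = np.mean((w * x - y) ** 2)

# With a single weight, the gradient of the loss is also a scalar.
grad = np.mean(2 * (w * x - y) * x)

# One gradient-descent update, then re-evaluate the loss.
w_new = w - learning_rate * grad
loss_new = np.mean((w_new * x - y) ** 2)

print(np.ndim(loss), loss, w_new, loss_new)
```

`np.ndim(loss)` is `0`, which is how NumPy reports a scalar, and the loss after the update is smaller than before it.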

---

</details>

> # `Vectors`

<html lang="en">
<head>
  <meta charset="utf-8" />
  <meta name="viewport" content="width=device-width,initial-scale=1" />
  <title></title>

  <!-- MathJax config: allow $...$ and $$...$$ -->
  <script>
    window.MathJax = {
      tex: {
        inlineMath: [['$', '$'], ['\\(', '\\)']],
        displayMath: [['$$','$$'], ['\\[','\\]']]
      },
      options: {
        skipHtmlTags: ['script','noscript','style','textarea','pre','code']
      }
    };
  </script>
  <script src="https://cdn.jsdelivr.net/npm/mathjax@3/es5/tex-mml-chtml.js" async></script>

  <style>
    :root{
      --bg: #f7fbff;
      --card: #ffffff;
      --muted: #566274;
      --accent: #0b63d6;
      --mono: ui-monospace, SFMono-Regular, Menlo, Monaco, "Roboto Mono", "Courier New", monospace;
    }
    html,body{height:100%}
    body{
      margin:24px;
      font-family: Inter, system-ui, -apple-system, "Segoe UI", Roboto, "Helvetica Neue", Arial;
      background: linear-gradient(180deg, #ffffff 0%, var(--bg) 100%);
      color: #0b2030;
      line-height:1.55;
      -webkit-font-smoothing:antialiased;
    }
    .container{
      max-width:980px;
      margin:0 auto;
      background:var(--card);
      border-radius:12px;
      padding:22px;
      box-shadow:0 8px 30px rgba(11,30,45,0.06);
      border:1px solid rgba(11,99,214,0.04);
    }
    header{
      display:flex;
      align-items:baseline;
      justify-content:space-between;
      gap:14px;
      margin-bottom:14px;
    }
    header h1{margin:0; font-size:20px;}
    header p{margin:0; color:var(--muted); font-size:13px;}
    details{margin:6px 0;}
    summary{
      cursor:pointer;
      font-weight:700;
      font-size:15px;
      list-style:none;
      outline:none;
    }
    details > summary::-webkit-details-marker { display: none; }
    details[open] > summary::after { content: "▾"; padding-left:8px; color:var(--muted); }
    details > summary::after { content: "▸"; padding-left:8px; color:var(--muted); }
    section{margin:12px 0;}
    h2{font-size:16px; margin:8px 0 6px;}
    p{margin:6px 0;}
    .example{background:#f2f8ff; padding:10px; border-radius:8px; border:1px solid rgba(11,99,214,0.06);}
    pre{background:#f7fbff; padding:10px; border-radius:8px; overflow:auto; border:1px solid rgba(11,30,45,0.03);}
    code{font-family:var(--mono); background:#eef6ff; padding:2px 6px; border-radius:6px;}
    ul, ol { margin:8px 0 8px 20px; }
    .hint{font-size:13px; color:var(--muted);}
    hr{border:0; border-top:1px solid rgba(11,30,45,0.06); margin:18px 0;}
  </style>
</head>
<body>
  <div class="container">
    <!-- <header>
      <h1>Vector Notes</h1>
      <p class="hint">HTML + MathJax — display math uses <code>$$...$$</code></p>
    </header> -->
    <details open>
      <summary>Click to expand</summary>
      <section>
        <h2>1. What are Vectors</h2>
        <p><strong>Definition (short):</strong> an ordered list of numbers representing a point or arrow in $ \mathbb{R}^n $. </p>
        <p><strong>Math (stepwise):</strong></p>
        <ol>
          <li>A vector $ \mathbf{v} \in \mathbb{R}^n $ written $ \mathbf{v} = (v_1, v_2, \dots, v_n) $.</li>
          <li>Components $v_i$ are scalars.</li>
          <li>Vector operations are defined componentwise (addition, scalar multiplication).</li>
        </ol>
        <p><strong>Geometry:</strong> think of $ \mathbf{v} $ as an arrow from the origin to the point $(v_1, \dots, v_n)$.</p>
        <div class="example">
          <p><strong>NumPy snippet</strong></p>
          <pre><code>import numpy as np
v = np.array([2.0, -1.0, 0.5])  # a vector in R^3
</code></pre>
        </div>
        <p><strong>Exercise (try):</strong> Create two 3D vectors and print their components, shapes and types. <span class="hint">Hint: use <code>np.array</code> and <code>.shape</code>.</span></p>
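        <p><strong>Pitfalls / summary:</strong> A vector is more than a plain Python list: its components have a fixed order, and operations such as addition require both vectors to have the same dimension. In Python, <code>+</code> concatenates lists but adds NumPy arrays componentwise.</p>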
      </section>
      <hr/>
      <section>
        <h2>2. Row Vector and Column Vector</h2>
        <p><strong>Definition:</strong> same components but different orientation.</p>
        <ul>
          <li>Row: $ \mathbf{r} = [r_1\ r_2\ \dots\ r_n] $ (1×n)</li>
          <li>Column:
            <div class="example">
              $$
              \mathbf{c} = \begin{bmatrix}
                c_1 \\
                \vdots \\
                c_n
              \end{bmatrix}
              \quad (n\times 1)
              $$
            </div>
          </li>
        </ul>
        <p><strong>Why it matters:</strong> matrix multiplication rules require correct orientation.</p>
        <p><strong>Stepwise:</strong></p>
        <ol>
          <li>Column vectors are the standard in linear algebra: treat $ \mathbf{x}\in\mathbb{R}^n $ as $n\times 1$.</li>
          <li>A row vector is transpose: $ \mathbf{r} = \mathbf{c}^\top $.</li>
        </ol>
        <div class="example">
          <p><strong>NumPy snippet</strong></p>
          <pre><code>x = np.array([1,2,3])        # 1D array (behaves like row/col depending on context)
x_col = x.reshape((3,1))     # explicit column vector
x_row = x.reshape((1,3))     # explicit row vector
</code></pre>
        </div>
        <p><strong>Exercise:</strong> reshape a vector into column and row and compute <code>x_col.T @ x_col</code>. <span class="hint">Hint: <code>.reshape((n,1))</code> and <code>.T</code>.</span></p>
        <p><strong>Pitfall:</strong> 1D <code>np.array</code> has no explicit row/col until you reshape — be careful with broadcasting.</p>
      </section>
      <hr/>
      <section>
        <h2>3. Distance from Origin</h2>
        <p><strong>Definition:</strong> length (Euclidean norm) of vector $ \mathbf{v} $:</p>
        <div class="example">
          $$
          \|\mathbf{v}\|_2 = \sqrt{\sum_i v_i^2}
          $$
        </div>
        <div class="example">
          <p><strong>NumPy</strong></p>
          <pre><code>np.linalg.norm(v)           # default is L2 norm
np.sqrt(np.dot(v,v))        # equivalent</code></pre>
        </div>
        <p><strong>Exercise:</strong> compute norm for <code>v=[3,4]</code> and confirm it equals 5. <span class="hint">Pythagorean theorem.</span></p>
        <p><strong>Pitfall:</strong> "Norm" can also mean other <code>p</code>-norms (see section 14); state which one you mean when it is not the Euclidean norm.</p>
      </section>
      <hr/>
      <section>
        <h2>4. Euclidean Distance between 2 vectors</h2>
        <p><strong>Definition:</strong></p>
        <div class="example">
          $$
          d(\mathbf{u},\mathbf{v}) = \|\mathbf{u}-\mathbf{v}\|_2
          $$
        </div>
        <div class="example">
          <p><strong>NumPy</strong></p>
          <pre><code>dist = np.linalg.norm(u - v)</code></pre>
        </div>
        <p><strong>Exercise:</strong> compute distance between <code>[1,0,0]</code> and <code>[0,1,0]</code>. <span class="hint">Answer: $\sqrt{2}$.</span></p>
        <p><strong>Pitfall:</strong> Don’t forget to center or scale features before computing distances in ML (units matter).</p>
      </section>
      <hr/>
      <section>
        <h2>5. Scalar + Vector (Shifting)</h2>
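        <p><strong>Definition:</strong> adding the same scalar $c$ to every component of a vector. This is not a standard vector-space operation, but NumPy allows it through broadcasting. Shifting by another vector (vector + vector) is more common.</p>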
        <p><strong>Math (broadcast):</strong> when allowed,
          $$
          \mathbf{v} + c = (v_1 + c, v_2 + c, \dots)
          $$
        </p>
        <div class="example">
          <pre><code>v + 3    # adds 3 to every component (broadcast)
# But usually shift by vector:
v + np.array([1,2,3])</code></pre>
        </div>
        <p><strong>Exercise:</strong> center a dataset <code>X</code> by subtracting its column means. <span class="hint">Use <code>X - X.mean(axis=0)</code>.</span></p>
        <p><strong>Pitfall:</strong> Broadcasting can silently do unwanted shifts if shapes differ.</p>
      </section>
      <hr/>
      <section>
        <h2>6. Scalar × Vector (Scaling)</h2>
        <p><strong>Definition:</strong> multiply each component by scalar $a$: $a\mathbf{v}$.</p>
        <p><strong>Math:</strong>
          $$
          a\mathbf{v} = (a v_1, \dots, a v_n)
          $$
        </p>
        <div class="example">
          <pre><code>2.5 * v</code></pre>
        </div>
        <p><strong>Exercise:</strong> scale <code>v=[2, -1]</code> by -3; check new length equals 3× original. <span class="hint">Use <code>np.linalg.norm</code>.</span></p>
        <p><strong>Pitfall:</strong> Mistaking scalar multiplication for elementwise product of two vectors.</p>
      </section>
      <hr/>
      <section>
        <h2>7. Vector + Vector / − Vector</h2>
        <p><strong>Definition:</strong> componentwise addition/subtraction.</p>
        <p><strong>Math:</strong>
          $$
          \mathbf{u}+\mathbf{v} = (u_1+v_1, \dots)
          $$
        </p>
        <div class="example">
          <pre><code>u + v
u - v</code></pre>
        </div>
        <p><strong>Exercise:</strong> draw (mentally or plot) <code>u=[1,0]</code>, <code>v=[0,2]</code>; compute <code>u+v</code> and verify head-to-tail geometry.</p>
        <p><strong>Pitfall:</strong> Vectors must have same dimension.</p>
      </section>
      <hr/>
      <section>
        <h2>8. Dot Product of 2 vectors</h2>
        <p><strong>Definition:</strong>
          $$
          \mathbf{u}\cdot\mathbf{v} = \sum_i u_i v_i
          $$
        </p>
        <p>Result is a scalar. Equivalently, $\mathbf{u}^\top \mathbf{v}$ (for column vectors).</p>
        <div class="example">
          <pre><code>np.dot(u, v)    # or u @ v</code></pre>
        </div>
        <p><strong>Exercise:</strong> compute <code>dot([1,2,3],[4,0,-1])</code> by hand and check code. <span class="hint">1*4 + 2*0 + 3*(-1) = 1.</span></p>
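        <p><strong>Pitfall:</strong> For arrays with more than two dimensions, <code>np.dot</code> and <code>@</code> follow different rules (<code>@</code> broadcasts over leading batch dimensions, <code>np.dot</code> does not), so check shapes carefully.</p>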
      </section>
      <hr/>
      <section>
        <h2>9. Angle between 2 vectors</h2>
        <p><strong>Formula:</strong></p>
        <div class="example">
          $$
          \cos\theta = \frac{\mathbf{u}\cdot\mathbf{v}}{\|\mathbf{u}\|\,\|\mathbf{v}\|}, \qquad
          \theta = \arccos\Big(\frac{\mathbf{u}\cdot\mathbf{v}}{\|\mathbf{u}\|\,\|\mathbf{v}\|}\Big)
          $$
        </div>
        <div class="example">
          <pre><code>cos_theta = np.dot(u,v) / (np.linalg.norm(u) * np.linalg.norm(v))
theta = np.arccos(np.clip(cos_theta, -1, 1))</code></pre>
        </div>
        <p><strong>Exercise:</strong> check angle between <code>[1,0]</code> and <code>[0,1]</code> is $\pi/2$. <span class="hint">dot=0 → cos=0 → angle=90°.</span></p>
        <p><strong>Pitfall:</strong> Floating errors can push cos slightly out of [-1,1]; use <code>np.clip</code>.</p>
      </section>
      <hr/>
      <section>
        <h2>10. Unit Vectors</h2>
        <p><strong>Definition:</strong> vector of length 1. Unit vector in direction $\mathbf{v}$:</p>
        <div class="example">
          $$
          \hat{\mathbf{v}} = \frac{\mathbf{v}}{\|\mathbf{v}\|}
          $$
        </div>
        <div class="example">
          <pre><code>v_hat = v / np.linalg.norm(v)</code></pre>
        </div>
        <p><strong>Exercise:</strong> convert <code>[3,4]</code> to a unit vector and check norm = 1. <span class="hint">norm is 5.</span></p>
        <p><strong>Pitfall:</strong> dividing by zero if $\mathbf{v}=\mathbf{0}$.</p>
      </section>
      <hr/>
      <section>
        <h2>11. Projection of a Vector</h2>
        <p><strong>Definitions:</strong></p>
        <p>Scalar (length of projection):
          $$
          \mathrm{comp}_{\mathbf{v}}(\mathbf{u}) = \frac{\mathbf{u}\cdot\mathbf{v}}{\|\mathbf{v}\|}
          $$
        </p>
        <p>Vector projection (shadow):
          $$
          \mathrm{proj}_{\mathbf{v}}(\mathbf{u}) = \frac{\mathbf{u}\cdot\mathbf{v}}{\|\mathbf{v}\|^2}\,\mathbf{v}
          $$
        </p>
        <div class="example">
          <pre><code>proj_u_on_v = (np.dot(u,v) / np.dot(v,v)) * v</code></pre>
        </div>
        <p><strong>Exercise:</strong> project <code>u=[1,2,2]</code> onto <code>v=[2,0,1]</code> and confirm algebraically and with code.</p>
        <p><strong>Pitfall:</strong> If <code>v</code> is zero or near-zero, projection is undefined / unstable.</p>
      </section>
      <hr/>
      <section>
        <h2>12. Basis Vectors</h2>
        <p><strong>Definition:</strong> a set of vectors $\{b_1,\dots,b_k\}$ that is linearly independent and spans a space is a basis of that space. Every basis of $\mathbb{R}^n$ has exactly $n$ vectors, so $k=n$.</p>
        <p><strong>Stepwise:</strong></p>
        <ol>
          <li>To express $\mathbf{x}$ in basis $B$, find coordinates $\mathbf{c}$ such that $\mathbf{x} = B \mathbf{c}$ where $B$ has basis vectors as columns.</li>
          <li>If $B$ is invertible ($n\times n$), $\mathbf{c} = B^{-1}\mathbf{x}$.</li>
        </ol>
        <div class="example">
          <pre><code>B = np.column_stack([b1,b2,b3])    # basis as columns
coords = np.linalg.solve(B, x)     # coordinates in basis B</code></pre>
        </div>
        <p><strong>Exercise:</strong> Verify standard basis <code>e1,e2</code> spans $\mathbb{R}^2$. Express <code>[3,4]</code> in that basis.</p>
        <p><strong>Pitfall:</strong> Basis must be independent — redundant vectors are not a basis.</p>
      </section>
      <hr/>
      <section>
        <h2>13. Equation of a Line in n-D</h2>
        <p><strong>Parametric form (most general):</strong></p>
        <div class="example">
          $$
          \mathbf{x}(t) = \mathbf{p} + t\mathbf{d},\quad t\in\mathbb{R}
          $$
        </div>
        <p>where $\mathbf{p}$ is a point and $\mathbf{d}$ a direction vector.</p>
        <div class="example">
          <pre><code>p = np.array([1,2,3])
q = np.array([3,4,6])
d = q - p
# point on line for t=0.5
x_t = p + 0.5 * d</code></pre>
        </div>
        <p><strong>Exercise:</strong> parametric equation of line through (1,0,0) and (0,1,0).</p>
        <p><strong>Pitfall:</strong> Even in dimensions &gt;2, a line is a translated 1D subspace (one free parameter $t$); don’t confuse it with a plane, which needs two parameters.</p>
      </section>
      <hr/>
      <section>
        <h2>14. Vector Norms (general)</h2>
        <p><strong>Definition (p-norm):</strong></p>
        <div class="example">
          $$
          \|\mathbf{v}\|_p = \left(\sum_i |v_i|^p\right)^{1/p}
          $$
        </div>
        <p>Special cases: $p=2$ (Euclidean), $p=1$ (Manhattan), $p=\infty$ (max).</p>
        <div class="example">
          <pre><code>np.linalg.norm(v, ord=2)   # L2
np.linalg.norm(v, ord=1)   # L1
np.linalg.norm(v, ord=np.inf)  # Linf</code></pre>
        </div>
        <p><strong>Exercise:</strong> compute L1 and L2 norms for <code>[1,-2,2]</code>.</p>
        <p><strong>Pitfall:</strong> Norms induce different geometry and hence different model behavior.</p>
      </section>
      <hr/>
      <section>
        <h2>15. Linear Independence</h2>
        <p><strong>Definition:</strong> vectors $v_1,\dots,v_k$ are independent if $c_1 v_1 + \dots + c_k v_k = 0$ implies all $c_i=0$.</p>
        <p><strong>Stepwise test:</strong></p>
        <ol>
          <li>Put vectors as columns into matrix $A$.</li>
          <li>Compute rank: if <code>rank(A) = k</code>, independent; otherwise dependent.</li>
        </ol>
        <div class="example">
          <pre><code>A = np.column_stack([v1,v2,v3])
np.linalg.matrix_rank(A)  # compare to number of vectors</code></pre>
        </div>
        <p><strong>Exercise:</strong> check independence of <code>v1=[1,0]</code>, <code>v2=[2,0]</code>. <span class="hint">rank = 1 → dependent.</span></p>
        <p><strong>Pitfall:</strong> Floating point noise may make near-dependent vectors show full rank — check condition numbers.</p>
      </section>
      <hr/>
      <section>
        <h2>16. Vector Spaces</h2>
        <p><strong>Definition:</strong> a set $V$ with vector addition and scalar multiplication, closed under those operations and satisfying the vector space axioms.</p>
        <p><strong>Common examples:</strong> $\mathbb{R}^n$, subspaces (plane through origin), polynomial spaces.</p>
        <p><strong>To verify a subspace $W \subset V$:</strong></p>
        <ol>
          <li>Contains the zero vector.</li>
          <li>Closed under addition.</li>
          <li>Closed under scalar multiplication.</li>
        </ol>
        <p><strong>Exercise:</strong> show that the set of vectors orthogonal to a given vector $a$ forms a subspace. <span class="hint">If $x\cdot a = 0$ and $y\cdot a = 0$ then $(x+y)\cdot a = 0$ and $(\lambda x)\cdot a = 0$.</span></p>
        <p><strong>Pitfall:</strong> an affine set (like a line not through the origin) is NOT a subspace.</p>
      </section>
    </details>
    <hr/>
  </div>
</body>
</html>
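
The projection exercise in section 11 of the vector notes above can be checked numerically. The sketch below computes the vector and scalar projections of `u=[1,2,2]` onto `v=[2,0,1]` and confirms that the leftover part of `u` is orthogonal to `v`. The helper name `project` and the tolerance `eps` are illustrative choices, not something the notes define.

```python
import numpy as np

def project(u, v, eps=1e-12):
    """Vector projection of u onto v (hypothetical helper for this example)."""
    vv = np.dot(v, v)
    if vv < eps:
        # Matches the pitfall in the notes: projecting onto ~0 is undefined.
        raise ValueError("cannot project onto a (near-)zero vector")
    return (np.dot(u, v) / vv) * v

u = np.array([1.0, 2.0, 2.0])
v = np.array([2.0, 0.0, 1.0])

proj = project(u, v)                       # (u.v / v.v) * v = (4/5) * v
comp = np.dot(u, v) / np.linalg.norm(v)    # scalar projection = 4 / sqrt(5)
residual = u - proj                        # part of u orthogonal to v

print(proj)
print(comp)
print(np.isclose(np.dot(residual, v), 0.0))
```

By hand, $\mathbf{u}\cdot\mathbf{v} = 4$ and $\|\mathbf{v}\|^2 = 5$, so the projection is $\tfrac{4}{5}\mathbf{v} = (1.6, 0, 0.8)$, which is what the code should reproduce.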


> # `Matrix`

<html lang="en">
<head>
  <meta charset="utf-8" />
  <meta name="viewport" content="width=device-width,initial-scale=1" />
  <title></title>

  <!-- MathJax configuration: allow $...$ and $$...$$ -->
  <script>
    window.MathJax = {
      tex: {
        inlineMath: [['$', '$'], ['\\(', '\\)']],
        displayMath: [['$$','$$'], ['\\[','\\]']]
      },
      options: {
        skipHtmlTags: ['script','noscript','style','textarea','pre','code']
      }
    };
  </script>
  <script src="https://cdn.jsdelivr.net/npm/mathjax@3/es5/tex-mml-chtml.js" async></script>

  <style>
    :root{
      --bg:#fbfcfd;
      --card:#ffffff;
      --muted:#657786;
      --accent:#0b63d6;
      --mono: "Courier New", Courier, monospace;
    }
    html,body{height:100%}
    body{
      margin:24px;
      font-family: Inter, ui-sans-serif, system-ui, -apple-system, "Segoe UI", Roboto, "Helvetica Neue", Arial;
      background:linear-gradient(180deg, #ffffff 0%, var(--bg) 100%);
      color: #0b2030;
      line-height:1.5;
    }
    .container{
      max-width:900px;
      margin:0 auto;
      background:var(--card);
      border-radius:12px;
      padding:22px;
      box-shadow:0 6px 20px rgba(13,30,45,0.08);
      border:1px solid rgba(11,99,214,0.04);
    }
    header{
      display:flex;
      align-items:center;
      gap:14px;
      margin-bottom:18px;
    }
    header h1{
      margin:0;
      font-size:20px;
      letter-spacing:-0.2px;
    }
    summary{cursor:pointer; font-weight:600}
    details > summary::-webkit-details-marker { display: none; }
    details[open] > summary::after { content: "▾"; padding-left:8px; color:var(--muted); }
    details > summary::after { content: "▸"; padding-left:8px; color:var(--muted); }
    section{margin:14px 0;}
    h2{font-size:16px; margin:10px 0 8px;}
    p{margin:6px 0;}
    .example, .note { background:#f7fbff; padding:10px; border-radius:8px; border:1px solid rgba(11,99,214,0.06); font-size:14px; }
    code{font-family:var(--mono); background:#f2f6fa; padding:2px 6px; border-radius:4px;}
    hr{border:0; border-top:1px solid rgba(11,30,45,0.06); margin:18px 0;}
    ol, ul { margin:8px 0 8px 20px; }
    .small{font-size:13px; color:var(--muted);}
  </style>
</head>
<body>
  <div class="container">
    <!-- <header>
      <h1>Matrix Notes — HTML + MathJax</h1>
      <div class="small">Rendered with MathJax (supports <code>$...$</code> and <code>$$...$$</code>)</div>
    </header> -->
    <details open>
      <summary>Click to expand — Matrix Notes</summary>
      <section>
        <h2>1. <strong>What are Matrices?</strong></h2>
        <p>A <strong>matrix</strong> is a rectangular array of numbers with rows and columns.</p>
        <p>Notation: $A = [a_{ij}]$, where $a_{ij}$ is the entry in row $i$ and column $j$.</p>
        <p>Dimensions: $m \times n$ (rows $\times$ columns).</p>
        <div class="example">
          <p><strong>Example:</strong></p>
          <p>
            $$
            A =
            \begin{bmatrix}
            1 & 2 & 3 \\
            4 & 5 & 6
            \end{bmatrix}
            \quad\text{is a }2\times 3\text{ matrix.}
            $$
          </p>
        </div>
      </section>
      <hr/>
      <section>
        <h2>2. <strong>Types of Matrices</strong></h2>
        <ul>
          <li><strong>Square matrix:</strong> $n\times n$ (same rows and columns).</li>
          <li><strong>Row matrix:</strong> $1\times n$ (only one row).</li>
          <li><strong>Column matrix:</strong> $m\times 1$ (only one column).</li>
          <li><strong>Zero (null) matrix:</strong> all entries $=0$.</li>
          <li><strong>Diagonal matrix:</strong> only diagonal entries may be nonzero.</li>
          <li><strong>Identity matrix $I_n$:</strong> diagonal entries $=1$, others $=0$.</li>
          <li><strong>Upper / Lower triangular:</strong> all entries below / above the main diagonal are zero.</li>
        </ul>
      </section>
      <hr/>
      <section>
        <h2>3. <strong>Orthogonal Matrices</strong></h2>
        <p>A square matrix $Q$ is <strong>orthogonal</strong> if</p>
        <p>
          $$
          Q^\top Q = Q Q^\top = I.
          $$
        </p>
        <p>Equivalently, $Q^{-1} = Q^\top$. Columns (and rows) of $Q$ form an orthonormal set. Orthogonal transforms preserve lengths and angles.</p>
      </section>
      <hr/>
      <section>
        <h2>4. <strong>Symmetric Matrices</strong></h2>
        <p>A matrix $A$ is symmetric if</p>
        <p>
          $$
          A = A^\top.
          $$
        </p>
        <p>Example:</p>
        <div class="example">
          $$
          \begin{bmatrix}
          1 & 2 \\
          2 & 3
          \end{bmatrix}.
          $$
        </div>
        <p>Symmetric matrices commonly appear as covariance matrices in ML and have real eigenvalues.</p>
      </section>
      <hr/>
      <section>
        <h2>5. <strong>Diagonal Matrices</strong></h2>
        <p>A diagonal matrix has nonzero entries only on the main diagonal:</p>
        <div class="example">
          $$
          D =
          \begin{bmatrix}
          2 & 0 & 0 \\
          0 & 5 & 0 \\
          0 & 0 & 7
          \end{bmatrix}.
          $$
        </div>
        <p>Diagonal matrices are easy to work with: multiplying a vector by $D$ scales each coordinate independently.</p>
      </section>
      <hr/>
      <section>
        <h2>6. <strong>Matrix Equality</strong></h2>
        <p>Two matrices $A$ and $B$ are equal if:</p>
        <ol>
          <li>They have the same dimensions, and</li>
          <li>$a_{ij} = b_{ij}$ for every entry $(i,j)$.</li>
        </ol>
      </section>
      <hr/>
      <section>
        <h2>7. <strong>Scalar Operations on Matrices</strong></h2>
        <p>For scalar $\alpha$ and matrix $A=[a_{ij}]$:</p>
        <ul>
          <li>$\alpha A = [\alpha a_{ij}]$ (multiply each entry by $\alpha$).</li>
          <li>$A + B = [a_{ij}+b_{ij}]$ (entrywise) when $A$ and $B$ share dimensions.</li>
        </ul>
      </section>
      <hr/>
      <section>
        <h2>8. <strong>Matrix Addition and Subtraction</strong></h2>
        <p>Addition/subtraction is element-wise and requires identical dimensions:</p>
        <div class="example">
          $$
          (A\pm B)_{ij} = a_{ij}\pm b_{ij}.
          $$
        </div>
      </section>
      <hr/>
      <section>
        <h2>9. <strong>Matrix Multiplication</strong></h2>
        <p>If $A$ is $m\times n$ and $B$ is $n\times p$, then the product $AB$ is $m\times p$ with entries</p>
        <div class="example">
          $$
          (AB)_{ij}=\sum_{k=1}^{n} a_{ik}\,b_{kj}.
          $$
        </div>
        <p>Matrix multiplication is generally <strong>not</strong> commutative: $AB\neq BA$ in general.</p>
      </section>
      <hr/>
      <section>
        <h2>10. <strong>Transpose</strong></h2>
        <p>The transpose of $A$ is $A^\top$ with</p>
        <div class="example">
          $$
          (A^\top)_{ij} = A_{ji}.
          $$
        </div>
      </section>
      <hr/>
      <section>
        <h2>11. <strong>Determinant</strong></h2>
        <p>The determinant is defined for square matrices. For a $2\times 2$ matrix</p>
        <div class="example">
          $$
          \det\begin{bmatrix} a & b \\ c & d \end{bmatrix} = ad - bc.
          $$
        </div>
        <p>A square matrix $A$ is invertible exactly when $\det(A)\neq 0$.</p>
      </section>
      <hr/>
      <section>
        <h2>12. <strong>Minor and Cofactor</strong></h2>
        <ul>
          <li><strong>Minor</strong> $M_{ij}$: determinant of the submatrix obtained by deleting row $i$ and column $j$.</li>
          <li><strong>Cofactor</strong> $C_{ij}$:
            <div class="example">
              $$
              C_{ij} = (-1)^{i+j} M_{ij}.
              $$
            </div>
          </li>
        </ul>
      </section>
      <hr/>
      <section>
        <h2>13. <strong>Adjugate (Classical Adjoint)</strong></h2>
        <p>The adjugate (classical adjoint) of $A$ is the transpose of the cofactor matrix:</p>
        <div class="example">
          $$
          \operatorname{adj}(A) = \big(C_{ij}\big)^\top.
          $$
        </div>
      </section>
      <hr/>
      <section>
        <h2>14. <strong>Inverse</strong></h2>
        <p>If $A$ is invertible ($\det(A)\neq 0$), then</p>
        <div class="example">
          $$
          A^{-1}=\frac{1}{\det(A)}\,\operatorname{adj}(A).
          $$
        </div>
      </section>
      <hr/>
      <section>
        <h2>15. <strong>Rank</strong></h2>
        <p>The <strong>rank</strong> of $A$ is the dimension of the column space (same as row space dimension). It equals the maximum number of linearly independent rows or columns.</p>
      </section>
      <hr/>
      <section>
        <h2>16. <strong>Column Space and Null Space</strong></h2>
        <ul>
          <li><strong>Column space (range):</strong> $\operatorname{Col}(A)=\operatorname{span}\{\text{columns of }A\}$.</li>
          <li><strong>Null space (kernel):</strong> $\operatorname{Null}(A)=\{\mathbf{x}:A\mathbf{x}=\mathbf{0}\}$.</li>
        </ul>
      </section>
      <hr/>
      <section>
        <h2>17. <strong>Change of Basis</strong></h2>
        <p>Change-of-basis expresses coordinates of vectors in a different basis. Used in PCA and diagonalization when switching to eigenvector coordinates.</p>
      </section>
      <hr/>
      <section>
        <h2>18. <strong>Solving Linear Systems</strong></h2>
        <p>Matrix form: $A\mathbf{x}=\mathbf{b}$. Common solution methods:</p>
        <ul>
          <li>Gaussian elimination (row reduction).</li>
          <li>If $A$ is invertible: $\mathbf{x}=A^{-1}\mathbf{b}$.</li>
          <li>LU decomposition for efficient solving.</li>
        </ul>
      </section>
      <hr/>
      <section>
        <h2>19. <strong>Linear Transformations</strong></h2>
        <p>A matrix defines a linear map $T(\mathbf{x})=A\mathbf{x}$. Examples: rotations, scalings, reflections.</p>
      </section>
      <hr/>
      <section>
        <h2>20. <strong>3D Linear Transformations</strong></h2>
        <p>$3\times 3$ matrices act on $\mathbb{R}^3$ and can rotate, scale, and reflect 3D objects; they are used in graphics and embeddings.</p>
      </section>
      <hr/>
      <section>
        <h2>21. <strong>Matrix Multiplication as Composition</strong></h2>
        <p>Applying $AB$ to a vector $\mathbf{x}$ means first apply $B$ then $A$:</p>
        <div class="example">
          $$
          (AB)\mathbf{x}=A\big(B\mathbf{x}\big).
          $$
        </div>
      </section>
      <hr/>
      <section>
        <h2>22. <strong>Non-square Linear Maps</strong></h2>
        <p>A non-square matrix still maps between vector spaces of different dimensions. Example: a $3\times 2$ matrix maps $\mathbb{R}^2\to\mathbb{R}^3$.</p>
      </section>
      <hr/>
      <section>
        <h2>23. <strong>Dot Product (Matrix Form)</strong></h2>
        <p>If $\mathbf{u},\mathbf{v}$ are column vectors, the dot product is</p>
        <div class="example">
          $$
          \mathbf{u}\cdot \mathbf{v} = \mathbf{u}^\top \mathbf{v}.
          $$
        </div>
      </section>
      <hr/>
      <section>
        <h2>24. <strong>Cross Product</strong></h2>
        <p>Defined in $\mathbb{R}^3$. For $\mathbf{u},\mathbf{v}\in\mathbb{R}^3$:</p>
        <ul>
          <li>$\mathbf{u}\times\mathbf{v}$ is orthogonal to both $\mathbf{u}$ and $\mathbf{v}$.</li>
          <li>Magnitude:
            <div class="example">
              $$
              \|\mathbf{u}\times\mathbf{v}\| = \|\mathbf{u}\|\,\|\mathbf{v}\|\,\sin\theta,
              $$
              where $\theta$ is the angle between $\mathbf{u}$ and $\mathbf{v}$.
            </div>
          </li>
          <li>Determinant (component) representation:
            <div class="example">
              $$
              \mathbf{u}\times\mathbf{v} =
              \begin{vmatrix}
              \mathbf{i} & \mathbf{j} & \mathbf{k} \\
              u_1 & u_2 & u_3 \\
              v_1 & v_2 & v_3
              \end{vmatrix}.
              $$
            </div>
          </li>
        </ul>
      </section>
    </details>
    <hr/>
  </div>
</body>
</html>
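
Sections 11 to 14 of the matrix notes above (determinant, minor and cofactor, adjugate, inverse) chain together into one formula, $A^{-1} = \operatorname{adj}(A)/\det(A)$. The sketch below follows that chain literally for a small example matrix and compares the result with `np.linalg.inv`. The helper name `cofactor_matrix` and the matrix `A` are made up for illustration; this is a teaching aid, not an efficient way to invert matrices.

```python
import numpy as np

def cofactor_matrix(A):
    """Cofactor matrix C with C[i, j] = (-1)**(i + j) * M[i, j]."""
    n = A.shape[0]
    C = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            # Minor M[i, j]: determinant of A with row i and column j deleted.
            sub = np.delete(np.delete(A, i, axis=0), j, axis=1)
            C[i, j] = (-1) ** (i + j) * np.linalg.det(sub)
    return C

A = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 4.0]])

det_A = np.linalg.det(A)          # 18 for this matrix, so A is invertible
adj_A = cofactor_matrix(A).T      # adjugate = transpose of the cofactor matrix
A_inv = adj_A / det_A             # only valid when det_A != 0

print(det_A)
print(np.allclose(A_inv, np.linalg.inv(A)))
print(np.allclose(A @ A_inv, np.eye(3)))
```

For real work, `np.linalg.inv` or, better, `np.linalg.solve` is the usual choice; the cofactor route grows very expensive as the matrix size increases.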

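Sections 3, 9, and 21 of the matrix notes above make three claims that are easy to test: an orthogonal matrix satisfies $Q^\top Q = I$ and preserves lengths, $(AB)\mathbf{x} = A(B\mathbf{x})$, and $AB \neq BA$ in general. The following sketch checks them with a 90° rotation `Q` and a scaling `S`; both matrices and the vector `x` are arbitrary examples chosen here.

```python
import numpy as np

theta = np.pi / 2
# Rotation by 90 degrees counter-clockwise: an orthogonal matrix.
Q = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
# Stretch along the x-axis: diagonal, but not orthogonal.
S = np.diag([2.0, 1.0])

x = np.array([3.0, 4.0])

is_orthogonal = np.allclose(Q.T @ Q, np.eye(2))                            # Q^T Q = I
keeps_length = np.isclose(np.linalg.norm(Q @ x), np.linalg.norm(x))       # |Qx| = |x|
composes = np.allclose((Q @ S) @ x, Q @ (S @ x))                          # S first, then Q
commutes = np.allclose(Q @ S, S @ Q)                                      # order matters

print(is_orthogonal, keeps_length, composes, commutes)
```

The first three checks hold and the last one fails: stretching and then rotating is a different map from rotating and then stretching.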

> # `Tensors`

<details>
<summary>Click to expand</summary>

## 1. What are Tensors?

* A **Tensor** is a **mathematical object** that generalizes **scalars, vectors, and matrices** to higher dimensions.
* Think of it as a **container for numbers**, arranged in a certain number of dimensions (also called **axes**).
* The number of dimensions of a tensor is called its **rank** (or order). This is a different notion from the rank of a matrix (the number of linearly independent columns).

### Examples of Tensors by rank:

* **Rank-0 Tensor** → Scalar (e.g., `7`)
* **Rank-1 Tensor** → Vector (e.g., `[3, 5, 7]`)
* **Rank-2 Tensor** → Matrix (e.g., `[[1,2],[3,4]]`)
* **Rank-3 Tensor** → Cube of numbers (e.g., image with RGB channels)
* **Rank-n Tensor** → Higher dimensional generalization

**Visual:**

```
Scalar → 5
Vector → [5, 2, 3]
Matrix → [[1,2,3],
          [4,5,6]]
3D Tensor → [[[...]]]
```
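
In NumPy, the rank described above is exposed as `ndim`, and the size along each axis as `shape`. A small sketch, with array contents chosen arbitrarily:

```python
import numpy as np

scalar = np.array(5)                        # rank 0
vector = np.array([5, 2, 3])                # rank 1
matrix = np.array([[1, 2, 3], [4, 5, 6]])   # rank 2
cube = np.zeros((2, 3, 4))                  # rank 3

# Print the number of axes and the size along each axis.
for name, t in [("scalar", scalar), ("vector", vector),
                ("matrix", matrix), ("cube", cube)]:
    print(name, t.ndim, t.shape)
```

A rank-0 array has the empty shape `()`, and each extra level of nesting adds one entry to `shape`.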

---

## 2. Importance of Tensors in Deep Learning

* **Data Representation:**

  * Images → 3D tensors (height × width × channels).
  * Videos → 4D tensors (frames × height × width × channels).
  * Text → 2D tensors (sequence length × embedding dimension).
* **Model Parameters:** Neural network weights are stored as tensors.
* **Operations:** All forward & backward computations (matrix multiplication, dot products, gradients) use tensor algebra.
* **Hardware Acceleration:** GPUs & TPUs are optimized for tensor operations (fast parallel computation).

**Key Point:** Tensors are the **language of neural networks**: data, parameters, and gradients are all stored and manipulated as tensors.

---

## 3. Tensor Operations

Just like vectors and matrices, tensors support operations:

* **Element-wise Operations:** Add, subtract, multiply, divide (performed element by element).
* **Reshaping:** Change the shape of a tensor without changing its data.

  * Example: reshape a `2×3` tensor into a `3×2`.
* **Transpose / Permutation:** Swap dimensions.

  * Example: image tensor (height × width × channels) → (channels × height × width).
* **Slicing / Indexing:** Extract sub-tensors (like cutting parts of a dataset).
* **Broadcasting:** Automatically stretch a smaller tensor (or a scalar) across a larger one so that element-wise operations work on compatible shapes.

  * Example: `[1,2,3] + 5 → [6,7,8]`
* **Dot Products / Matrix Multiplication:** Extend naturally to tensors.
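
The operations listed above can be sketched in a few lines. This assumes NumPy; the array contents and variable names are made up for illustration:

```python
import numpy as np

t = np.arange(6).reshape(2, 3)        # [[0, 1, 2], [3, 4, 5]]

# Element-wise: applied entry by entry
doubled = t * 2

# Reshaping: same 6 values, new shape 3x2
reshaped = t.reshape(3, 2)

# Permutation: (height, width, channels) -> (channels, height, width)
image = np.zeros((28, 28, 3))
channels_first = np.transpose(image, (2, 0, 1))

# Slicing: first row, last two columns
sub = t[0, 1:]

# Broadcasting: the scalar 5 is stretched to match the vector's shape
broadcast = np.array([1, 2, 3]) + 5

# Matrix multiplication: (2, 3) @ (3, 2) gives a (2, 2) result
product = t @ reshaped

print(channels_first.shape, sub, broadcast)
print(product)
```

Note that `reshape` and `transpose` are different operations: reshaping keeps the elements in their stored order, while transposing reorders the axes.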

---

## 4. Data Representation using Tensors

Different ML data types as tensors:

* **Scalars:** Single measurement (e.g., temperature = 35°C).
* **Vectors (1D tensor):** Features of one sample (e.g., `[height, weight, age]`).
* **Matrices (2D tensor):** Batch of samples (e.g., 100 students × 3 features each).
* **3D Tensor:** A single color image (height × width × channels).
* **4D/5D Tensor:** Batches of images (batch × height × width × channels), videos, 3D medical scans, etc.

**Example:**

* Black & white image: `28 × 28` (2D tensor).
* Colored image: `28 × 28 × 3` (3D tensor).
* Batch of 100 colored images: `100 × 28 × 28 × 3` (4D tensor).
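
A small sketch of the example above, assuming NumPy and using zero-filled placeholder arrays in place of real pixel data:

```python
import numpy as np

gray = np.zeros((28, 28))             # black & white image: 2D tensor
color = np.zeros((28, 28, 3))         # colored image: 3D tensor
batch = np.stack([color] * 100)       # 100 colored images: 4D tensor

print(gray.ndim, color.ndim, batch.ndim)
print(batch.shape)

# Indexing along the first (batch) axis recovers a single image
first_image = batch[0]
```

Stacking adds a new leading axis, which is why the batch dimension comes first in this layout.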

---

</details>

> # `Eigen Values and Vectors`

<html lang="en">
<head>
  <meta charset="utf-8" />
  <meta name="viewport" content="width=device-width,initial-scale=1" />
  <title></title>

  <!-- MathJax config: allow $...$ and $$...$$ -->
  <script>
    window.MathJax = {
      tex: {
        inlineMath: [['$', '$'], ['\\(', '\\)']],
        displayMath: [['$$','$$'], ['\\[','\\]']]
      },
      options: {
        skipHtmlTags: ['script','noscript','style','textarea','pre','code']
      }
    };
  </script>
  <script src="https://cdn.jsdelivr.net/npm/mathjax@3/es5/tex-mml-chtml.js" async></script>

  <style>
    :root{
      --bg: #fbfdff;
      --card: #ffffff;
      --muted: #6b7a90;
      --accent: #0b63d6;
      --mono: ui-monospace, SFMono-Regular, Menlo, Monaco, "Roboto Mono", "Courier New", monospace;
    }
    html,body{height:100%}
    body{
      margin:20px;
      font-family: Inter, system-ui, -apple-system, "Segoe UI", Roboto, "Helvetica Neue", Arial;
      background: linear-gradient(180deg, #ffffff 0%, var(--bg) 100%);
      color:#0b2030;
      -webkit-font-smoothing:antialiased;
      line-height:1.55;
    }
    .wrap{max-width:920px; margin:0 auto; background:var(--card); border-radius:12px; padding:22px; box-shadow:0 8px 30px rgba(11,30,45,0.06); border:1px solid rgba(11,99,214,0.04);}
    header{display:flex; justify-content:space-between; align-items:baseline; gap:10px; margin-bottom:12px;}
    header h1{margin:0; font-size:20px;}
    header p{margin:0; color:var(--muted); font-size:13px;}
    details{margin:10px 0;}
    summary{cursor:pointer; font-weight:700; font-size:15px;}
    details>summary::-webkit-details-marker { display: none; }
    details[open] > summary::after { content: "▾"; padding-left:8px; color:var(--muted); }
    details > summary::after { content: "▸"; padding-left:8px; color:var(--muted); }
    section{margin:12px 0;}
    h2{font-size:16px; margin:8px 0 6px;}
    p{margin:6px 0;}
    .example{background:#f1f8ff; padding:10px; border-radius:8px; border:1px solid rgba(11,99,214,0.06);}
    pre{background:#f7fbff; padding:10px; border-radius:6px; overflow:auto; border:1px solid rgba(11,30,45,0.03);}
    code{font-family:var(--mono); background:#eef6ff; padding:2px 6px; border-radius:6px;}
    ul, ol { margin:8px 0 8px 20px; }
    .hint{font-size:13px; color:var(--muted);}
    hr{border:0; border-top:1px solid rgba(11,30,45,0.06); margin:16px 0;}
  </style>
</head>
<body>
  <div class="wrap">
    <!-- <header>
      <h1>Eigenvalues & Eigenvectors — Notes</h1>
      <p class="hint">Math rendered with MathJax (use <code>$$...$$</code> for display math)</p>
    </header> -->
    <details open>
      <summary>Click to expand</summary>
      <section>
        <h2>1. Eigenvalues & Eigenvectors</h2>
        <ul>
          <li><strong>Definition.</strong> For a square matrix $A$, a nonzero vector $v$ is an <em>eigenvector</em> if
            $$
            A v = \lambda v,
            $$
            where $\lambda$ is a scalar called the <em>eigenvalue</em>.</li>
          <li><strong>Intuition.</strong> Applying $A$ to $v$ does not change $v$'s direction — only its magnitude (and possibly sign): it gets stretched/compressed (or flipped) by factor $\lambda$.</li>
        </ul>
        <div class="example">
          <p><strong>Quick check (worked):</strong> Let
            $$
            A = \begin{bmatrix}2 & 0 \\[4pt] 0 & 3 \end{bmatrix}.
            $$
            Test the standard basis vectors $e_1 = \begin{bmatrix}1\\0\end{bmatrix}$ and $e_2=\begin{bmatrix}0\\1\end{bmatrix}$:
          </p>
          <p>
            $$
            A e_1 = \begin{bmatrix}2 & 0\\ 0 & 3\end{bmatrix}\begin{bmatrix}1\\0\end{bmatrix} = \begin{bmatrix}2\\0\end{bmatrix} = 2 e_1,
            $$
            so $e_1$ is an eigenvector with eigenvalue $\lambda=2$.
          </p>
          <p>
            $$
            A e_2 = \begin{bmatrix}2 & 0\\ 0 & 3\end{bmatrix}\begin{bmatrix}0\\1\end{bmatrix} = \begin{bmatrix}0\\3\end{bmatrix} = 3 e_2,
            $$
            so $e_2$ is an eigenvector with eigenvalue $\lambda=3$.
          </p>
          <p class="hint"><strong>Answer:</strong> eigenvalues $\{2,3\}$ with eigenvectors along the coordinate axes (any nonzero scalar multiples of $e_1$ and $e_2$).</p>
        </div>
      </section>
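      <div class="example">
        <p><strong>NumPy (sketch):</strong> The worked check above can be reproduced numerically. This is a minimal illustration assuming NumPy; <code>np.linalg.eig</code> returns the eigenvalues together with a matrix whose columns are eigenvectors.</p>

```python
import numpy as np

A = np.array([[2.0, 0.0],
              [0.0, 3.0]])

# Eigenvalues and a matrix whose columns are the eigenvectors
eigenvalues, eigenvectors = np.linalg.eig(A)

for lam, v in zip(eigenvalues, eigenvectors.T):
    # Check the defining relation A v = lambda v for each pair
    assert np.allclose(A @ v, lam * v)
    print(f"lambda = {lam:.1f}, v = {v}")
```

</div>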
      <hr/>
      <section>
        <h2>2. Why eigenvalues / eigenvectors matter</h2>
        <ul>
          <li>They reveal the <strong>principal directions</strong> of a linear transformation — the directions that are only scaled, not rotated.</li>
          <li>In machine learning and applied math they appear in many places:
            <ul>
              <li>Stability analysis of dynamical systems (eigenvalues tell growth/decay rates).</li>
              <li>Principal Component Analysis (PCA): eigenvectors of the data covariance matrix give the main directions of variation.</li>
              <li>Dimensionality reduction, feature extraction, spectral clustering.</li>
            </ul>
          </li>
        </ul>
      </section>
      <hr/>
      <section>
        <h2>3. Eigenfaces (computer vision)</h2>
        <p><strong>Idea:</strong> Treat each face image as a vector (flatten pixels). Compute the covariance matrix of the mean-centered training images and its eigenvectors. The top eigenvectors (those with the largest eigenvalues) are the <em>Eigenfaces</em>, i.e. the dominant modes of variation.</p>
        <ul>
          <li>Each face can be approximated by a linear combination of a few eigenfaces.</li>
          <li>Applications: face recognition (project into eigenface subspace), compression.</li>
        </ul>
      </section>
      <hr/>
      <section>
        <h2>4. Principal Component Analysis (PCA)</h2>
        <p><strong>Goal:</strong> reduce dimensionality while retaining most variance.</p>
        <p><strong>Standard steps:</strong></p>
        <ol>
          <li>Center the data: subtract the column mean from each column (so each feature has zero mean).</li>
          <li>Compute the covariance matrix. The population form is
            $$
            C = \frac{1}{n}\, X^\top X
            $$
            where $X$ is the data matrix (rows = observations, columns = features) after centering. (Note: the unbiased sample covariance uses $\frac{1}{n-1}$ instead; the eigenvectors are the same and the eigenvalues differ only by a constant factor, so be consistent with your convention.)
          </li>
          <li>Find the eigenvectors and eigenvalues of $C$. The eigenvectors are the principal directions (principal components).</li>
          <li>Project the data onto the top $k$ eigenvectors (those with largest eigenvalues).</li>
        </ol>
        <p><strong>Connection:</strong> eigenvalues measure the variance explained by each component; sorting eigenvalues descending gives principal directions of decreasing variance.</p>
      </section>
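      <div class="example">
        <p><strong>NumPy (sketch):</strong> The four steps above can be followed literally on a small data set. This is a minimal illustration, not a tuned implementation: it assumes NumPy, uses synthetic data invented for the example, and uses the $\frac{1}{n}$ form of the covariance.</p>

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical data: 200 observations, 3 features with different spreads
X = rng.normal(size=(200, 3)) * np.array([3.0, 1.0, 0.2])

# Step 1: center each feature (column)
Xc = X - X.mean(axis=0)

# Step 2: covariance matrix (population form, 1/n)
n = Xc.shape[0]
C = (Xc.T @ Xc) / n

# Step 3: eigen-decomposition; eigh is meant for symmetric matrices
# and returns eigenvalues in ascending order
eigenvalues, eigenvectors = np.linalg.eigh(C)
order = np.argsort(eigenvalues)[::-1]            # sort descending
eigenvalues, eigenvectors = eigenvalues[order], eigenvectors[:, order]

# Step 4: project onto the top k principal directions
k = 2
Z = Xc @ eigenvectors[:, :k]

explained = eigenvalues / eigenvalues.sum()
print("explained variance ratio:", explained)
```

<p class="hint">The variance of each projected column of <code>Z</code> equals the corresponding eigenvalue, which is the sense in which eigenvalues measure explained variance.</p>
</div>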
      <hr/>
      <section>
        <h2>Visualization tip</h2>
        <p>Think of PCA as rotating the coordinate axes so that the new axes align with directions of maximum variance. Projecting onto the first $k$ axes keeps the most important variation.</p>
      </section>
    </details>
    <hr/>
  </div>
</body>
</html>


> # `Matrix Factorization`

<html lang="en">
<head>
  <meta charset="utf-8" />
  <meta name="viewport" content="width=device-width,initial-scale=1" />
  <title></title>

  <!-- MathJax: allow $...$ and $$...$$ -->
  <script>
    window.MathJax = {
      tex: {
        inlineMath: [['$', '$'], ['\\(', '\\)']],
        displayMath: [['$$','$$'], ['\\[','\\]']]
      },
      options: {
        skipHtmlTags: ['script','noscript','style','textarea','pre','code']
      }
    };
  </script>
  <script src="https://cdn.jsdelivr.net/npm/mathjax@3/es5/tex-mml-chtml.js" async></script>

  <style>
    :root{
      --bg:#fbfdff;
      --card:#ffffff;
      --muted:#6b7a90;
      --accent:#0b63d6;
      --mono: ui-monospace, SFMono-Regular, Menlo, Monaco, "Roboto Mono", "Courier New", monospace;
    }
    html,body{height:100%}
    body{
      margin:22px;
      font-family: Inter, system-ui, -apple-system, "Segoe UI", Roboto, "Helvetica Neue", Arial;
      background: linear-gradient(180deg,#ffffff 0%,var(--bg) 100%);
      color:#0b2030;
      -webkit-font-smoothing:antialiased;
      line-height:1.55;
    }
    .container{
      max-width:980px;
      margin:0 auto;
      background:var(--card);
      border-radius:12px;
      padding:22px;
      box-shadow:0 8px 28px rgba(11,30,45,0.06);
      border:1px solid rgba(11,99,214,0.04);
    }
    header{display:flex; justify-content:space-between; align-items:baseline; gap:12px; margin-bottom:14px;}
    header h1{margin:0; font-size:20px;}
    header p{margin:0; color:var(--muted); font-size:13px;}
    details{margin:8px 0;}
    summary{cursor:pointer; font-weight:700; font-size:15px;}
    details>summary::-webkit-details-marker{display:none;}
    details[open] > summary::after { content: "▾"; padding-left:8px; color:var(--muted); }
    details>summary::after { content: "▸"; padding-left:8px; color:var(--muted); }
    section{margin:12px 0;}
    h2{font-size:16px; margin:8px 0 6px;}
    p{margin:6px 0;}
    .example{background:#f1f8ff; padding:10px; border-radius:8px; border:1px solid rgba(11,99,214,0.06);}
    pre{background:#f7fbff; padding:10px; border-radius:8px; overflow:auto; border:1px solid rgba(11,30,45,0.03);}
    code{font-family:var(--mono); background:#eef6ff; padding:2px 6px; border-radius:6px;}
    ul, ol { margin:8px 0 8px 20px; }
    .hint{font-size:13px; color:var(--muted);}
    hr{border:0; border-top:1px solid rgba(11,30,45,0.06); margin:18px 0;}
    .answer{background:#fff8e1; border-left:4px solid #ffb020; padding:8px 12px; border-radius:6px; margin-top:8px;}
  </style>
</head>
<body>
  <div class="container">
    <!-- <header>
      <h1>Matrix Factorizations — Notes</h1>
      <p class="hint">MathJax enabled — block math uses <code>$$...$$</code></p>
    </header> -->
    <details open>
      <summary>Click to expand</summary>
      <section>
        <h2>1. LU Decomposition</h2>
        <p><strong>What it is:</strong> Factor a square matrix $A$ into</p>
        <div class="example">
          $$
          A = L\,U
          $$
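          where $L$ is lower-triangular (often with 1s on the diagonal) and $U$ is upper-triangular. In practice row pivoting is usually needed for existence and numerical stability, giving $PA = LU$ with a permutation matrix $P$.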
        </div>
        <p><strong>Use:</strong> Efficiently solve linear systems, e.g. once you have $L$ and $U$ you solve $Ax=b$ by forward/back substitution:</p>
        <ol>
          <li>solve $L y = b$ (forward substitution)</li>
          <li>solve $U x = y$ (back substitution)</li>
        </ol>
        <p><strong>ML connection:</strong> Speed up repeated solves (e.g. many right-hand sides) — useful in regression and numerical optimization.</p>
        <div class="answer"><strong>Quick check — why is LU easier?</strong>
        Because triangular systems are cheap to solve (forward/back substitution cost $\mathcal{O}(n^2)$ each), while running Gaussian elimination from scratch for every right-hand side costs $\mathcal{O}(n^3)$ each time. The factorization $A=LU$ pays that $\mathcal{O}(n^3)$ cost once; every additional $b$ then costs only $\mathcal{O}(n^2)$.</div>
      </section>
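      <div class="example">
        <p><strong>NumPy (sketch):</strong> To make the factor-once, solve-many idea concrete, here is a small teaching version. The helper names <code>lu_no_pivot</code>, <code>forward_sub</code> and <code>back_sub</code> are hypothetical, and the factorization skips pivoting, so it assumes no zero pivot is encountered. Library routines (for example SciPy's <code>scipy.linalg.lu_factor</code>) pivot for stability and should be preferred in real code.</p>

```python
import numpy as np

def lu_no_pivot(A):
    """Doolittle LU without pivoting; assumes no zero pivots are hit."""
    n = A.shape[0]
    L, U = np.eye(n), A.astype(float).copy()
    for k in range(n - 1):
        for i in range(k + 1, n):
            L[i, k] = U[i, k] / U[k, k]          # elimination multiplier
            U[i, k:] -= L[i, k] * U[k, k:]       # zero out entry below the pivot
    return L, U

def forward_sub(L, b):
    """Solve L y = b for lower-triangular L."""
    y = np.zeros_like(b, dtype=float)
    for i in range(len(b)):
        y[i] = (b[i] - L[i, :i] @ y[:i]) / L[i, i]
    return y

def back_sub(U, y):
    """Solve U x = y for upper-triangular U."""
    x = np.zeros_like(y, dtype=float)
    for i in reversed(range(len(y))):
        x[i] = (y[i] - U[i, i + 1:] @ x[i + 1:]) / U[i, i]
    return x

A = np.array([[4.0, 3.0],
              [6.0, 3.0]])
L, U = lu_no_pivot(A)                            # factor once ...

solutions = []
for b in (np.array([10.0, 12.0]), np.array([1.0, 0.0])):
    x = back_sub(U, forward_sub(L, b))           # ... reuse for every right-hand side
    assert np.allclose(A @ x, b)
    solutions.append(x)
```

</div>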
      <hr/>
      <section>
        <h2>2. QR Decomposition</h2>
        <p><strong>What it is:</strong> Factor a matrix $A$ as</p>
        <div class="example">
          $$
          A = Q R
          $$
          where $Q$ is orthogonal ($Q^\top Q = I$) and $R$ is upper-triangular.
        </div>
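        <p><strong>Use:</strong> Numerically stable solving of least squares. If $A$ is $m\times n$ with $m \ge n$ and full column rank, the least-squares solution to $\min_x\|Ax-b\|_2$ is obtained by solving the triangular system $R x = Q^\top b$ (using the reduced QR, where $Q$ is $m\times n$ with orthonormal columns and $R$ is $n\times n$).</p>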
        <p><strong>ML connection:</strong> regression solutions, Gram–Schmidt orthogonalization, constructing orthonormal features.</p>
        <div class="answer"><strong>Check — what does “orthogonal” mean for columns of $Q$?</strong>
        The columns are mutually orthonormal: their pairwise dot products are zero and each column has unit length. Formally, for columns $q_i, q_j$, $q_i^\top q_j = 0$ if $i\neq j$, and $q_i^\top q_i = 1$.</div>
      </section>
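      <div class="example">
        <p><strong>NumPy (sketch):</strong> A minimal illustration of least squares via QR, assuming NumPy and a random tall matrix invented for the example. <code>np.linalg.qr</code> returns the reduced factorization by default, and the result is compared against <code>np.linalg.lstsq</code> as a reference.</p>

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(6, 3))      # tall matrix (m > n); random, so full column rank in practice
b = rng.normal(size=6)

Q, R = np.linalg.qr(A)           # reduced QR: Q is 6x3, R is 3x3 upper-triangular

# Least-squares solution from the triangular system R x = Q^T b
x_qr = np.linalg.solve(R, Q.T @ b)

# Reference solution from NumPy's least-squares routine
x_ref, *_ = np.linalg.lstsq(A, b, rcond=None)
assert np.allclose(x_qr, x_ref)
```

<p class="hint">For clarity this uses the general solver <code>np.linalg.solve</code>; a dedicated back-substitution routine would exploit the triangular structure of $R$.</p>
</div>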
      <hr/>
      <section>
        <h2>3. Eigen Decomposition</h2>
        <p><strong>What it is:</strong> Factor a square matrix $A$ (diagonalizable case) as</p>
        <div class="example">
          $$
          A = V \Lambda V^{-1}
          $$
          where columns of $V$ are eigenvectors and $\Lambda$ is diagonal with eigenvalues.
        </div>
        <p><strong>Use:</strong> analyze linear transformations, solve systems in modal coordinates, exponentiate matrices (e.g., $e^{At}$ via eigen decomposition when possible).</p>
        <p><strong>ML connection:</strong> PCA (covariance eigen-decomposition), spectral clustering, covariance/correlation analyses.</p>
        <div class="answer"><strong>Check — if an eigenvalue is very small (≈ 0), what does that say?</strong>
        It indicates almost no variance (or negligible action) in that eigenvector direction — the data (or transformation) compresses that direction strongly, so it contributes little to variance and might be dropped for dimensionality reduction.</div>
      </section>
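      <div class="example">
        <p><strong>NumPy (sketch):</strong> A small check of $A = V \Lambda V^{-1}$, assuming NumPy and a $2\times 2$ matrix chosen for illustration (its eigenvalues are 2 and 5). Matrix powers are used here as a simple stand-in for the functions of $A$ mentioned above, since $A^k = V \Lambda^k V^{-1}$.</p>

```python
import numpy as np

A = np.array([[4.0, 1.0],
              [2.0, 3.0]])

eigenvalues, V = np.linalg.eig(A)     # columns of V are eigenvectors
Lam = np.diag(eigenvalues)
V_inv = np.linalg.inv(V)

# Reconstruct A from its factors
assert np.allclose(V @ Lam @ V_inv, A)

# Powers only require powers of the diagonal entries: A^5 = V Lam^5 V^{-1}
A5 = V @ np.diag(eigenvalues ** 5) @ V_inv
assert np.allclose(A5, np.linalg.matrix_power(A, 5))
```

</div>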
      <hr/>
      <section>
        <h2>4. Singular Value Decomposition (SVD)</h2>
        <p><strong>What it is:</strong> Factor any matrix $A$ (square or rectangular) as</p>
        <div class="example">
          $$
          A = U \Sigma V^\top
          $$
          where $U$ (left singular vectors) and $V$ (right singular vectors) are orthogonal, and $\Sigma$ is diagonal with non-negative singular values (usually ordered descending).
        </div>
        <p><strong>Use:</strong> dimension reduction, low-rank approximation (Eckart–Young theorem), denoising, pseudoinverse computation.</p>
        <p><strong>ML connection:</strong> PCA (SVD on centered data), Latent Semantic Analysis (LSA) in NLP, collaborative filtering in recommender systems.</p>
        <div class="answer"><strong>Check — why is SVD more general than eigen decomposition?</strong>
        SVD applies to any $m\times n$ matrix (even non-square, non-symmetric). Eigen decomposition requires a square matrix (and diagonalizability for $A=V\Lambda V^{-1}$). SVD gives orthonormal bases for domain and codomain simultaneously and always exists for real matrices.</div>
      </section>
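      <div class="example">
        <p><strong>NumPy (sketch):</strong> A minimal illustration of low-rank approximation, assuming NumPy and a random matrix invented for the example. It keeps the $k$ largest singular values and checks the Eckart–Young statement in the spectral norm, where the error equals the first discarded singular value.</p>

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(5, 4))

# Thin SVD; s holds the singular values in descending order
U, s, Vt = np.linalg.svd(A, full_matrices=False)
assert np.allclose(U @ np.diag(s) @ Vt, A)

# Best rank-k approximation: keep the k largest singular values
k = 2
A_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

# Spectral-norm error equals the first discarded singular value
error = np.linalg.norm(A - A_k, ord=2)
assert np.isclose(error, s[k])
```

</div>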
      <hr/>
      <section>
        <h2>5. Non-Negative Matrix Factorization (NMF)</h2>
        <p><strong>What it is:</strong> For a non-negative matrix $A$, find non-negative factors $W,H$ such that</p>
        <div class="example">
          $$
          A \approx W H \qquad\text{with } W,H \ge 0.
          $$
        </div>
        <p><strong>Use:</strong> parts-based, interpretable decomposition where components and coefficients are non-negative (often sparser and easier to interpret).</p>
        <p><strong>ML connection:</strong> topic modeling (topics × document weights), image decomposition (parts of an image), recommender systems with non-negative constraints.</p>
        <div class="answer"><strong>Check — why non-negativity helps interpretation?</strong>
        Forcing non-negativity prevents canceling positive and negative components (no subtractive combinations). Components add up to form data, making factors more "parts-based" and often easier to map to real-world parts (e.g., topics, facial parts), unlike SVD where singular vectors can have positive and negative entries and require cancellations to form original data.</div>
      </section>
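      <div class="example">
        <p><strong>NumPy (sketch):</strong> One common way to compute such a factorization is the multiplicative update scheme of Lee and Seung for the Frobenius-norm objective. The sketch below assumes NumPy, random non-negative data, and an arbitrarily chosen number of components <code>r</code>; it is an illustration of the idea, not the only or the best algorithm.</p>

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.random((6, 5))                 # non-negative data matrix
r = 2                                  # number of components (illustrative choice)
W = rng.random((6, r))                 # non-negative initial factors
H = rng.random((r, 5))
eps = 1e-10                            # guards against division by zero

initial_error = np.linalg.norm(A - W @ H)
for _ in range(200):
    # Each update multiplies by a non-negative ratio, so W and H stay non-negative
    H *= (W.T @ A) / (W.T @ W @ H + eps)
    W *= (A @ H.T) / (W @ H @ H.T + eps)
final_error = np.linalg.norm(A - W @ H)

print(f"reconstruction error: {initial_error:.3f} -> {final_error:.3f}")
```

<p class="hint">Because the updates never subtract, the factors can only combine additively, which is the source of the parts-based interpretation.</p>
</div>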
    </details>
    <hr/>
  </div>
</body>
</html>


> # `Advanced Topics`

<html lang="en">
<head>
  <meta charset="utf-8" />
  <meta name="viewport" content="width=device-width,initial-scale=1" />
  <title></title>

  <!-- MathJax configuration (supports $...$ and $$...$$) -->
  <script>
    window.MathJax = {
      tex: {
        inlineMath: [['$', '$'], ['\\(', '\\)']],
        displayMath: [['$$','$$'], ['\\[','\\]']]
      },
      options: {
        skipHtmlTags: ['script','noscript','style','textarea','pre','code']
      }
    };
  </script>
  <script src="https://cdn.jsdelivr.net/npm/mathjax@3/es5/tex-mml-chtml.js" async></script>

  <style>
    :root{
      --bg:#fbfdff;
      --card:#ffffff;
      --muted:#657786;
      --accent:#0b63d6;
      --mono: ui-monospace, SFMono-Regular, Menlo, Monaco, "Roboto Mono", "Courier New", monospace;
    }
    html,body{height:100%}
    body{
      margin:22px;
      font-family:Inter, system-ui, -apple-system, "Segoe UI", Roboto, "Helvetica Neue", Arial;
      background: linear-gradient(180deg,#ffffff 0%,var(--bg) 100%);
      color:#0b2030;
      -webkit-font-smoothing:antialiased;
      line-height:1.55;
    }
    .wrap{
      max-width:980px;
      margin:0 auto;
      background:var(--card);
      border-radius:12px;
      padding:22px;
      box-shadow:0 10px 30px rgba(11,30,45,0.06);
      border:1px solid rgba(11,99,214,0.04);
    }
    header{display:flex; justify-content:space-between; align-items:baseline; gap:12px; margin-bottom:12px;}
    header h1{margin:0; font-size:20px;}
    header p{margin:0; color:var(--muted); font-size:13px;}
    details{margin:10px 0;}
    summary{cursor:pointer; font-weight:700; font-size:15px;}
    details>summary::-webkit-details-marker { display: none; }
    details[open] > summary::after { content: "▾"; padding-left:8px; color:var(--muted); }
    details>summary::after { content: "▸"; padding-left:8px; color:var(--muted); }
    section{margin:12px 0;}
    h2{font-size:16px; margin:8px 0 6px;}
    p{margin:6px 0;}
    .example{background:#f1f8ff; padding:12px; border-radius:8px; border:1px solid rgba(11,99,214,0.06);}
    pre{background:#f7fbff; padding:10px; border-radius:8px; overflow:auto; border:1px solid rgba(11,30,45,0.03);}
    code{font-family:var(--mono); background:#eef6ff; padding:2px 6px; border-radius:6px;}
    .answer{background:#fff8e6; border-left:4px solid #ffb146; padding:10px; border-radius:6px; margin-top:10px;}
    ul, ol { margin:8px 0 8px 20px; }
    .hint{font-size:13px; color:var(--muted);}
    hr{border:0; border-top:1px solid rgba(11,30,45,0.06); margin:16px 0;}
  </style>
</head>
<body>
  <div class="wrap">
    <details open>
      <summary>Click to expand</summary>
      <section>
        <h2>1. Moore–Penrose Pseudoinverse</h2>
        <p><strong>What it is:</strong> a generalization of the matrix inverse for matrices that are non-square or singular. For $A\in\mathbb{R}^{m\times n}$, its pseudoinverse $A^+$ satisfies a set of Moore–Penrose conditions and gives best-fit/optimal solutions for least squares problems.</p>
        <p><strong>SVD formula:</strong> if
          $$
          A = U \Sigma V^\top,
          $$
          then
          $$
          A^+ = V\,\Sigma^+\,U^\top,
          $$
          where $\Sigma^+$ is formed by taking reciprocals of the nonzero singular values (and transposing the shape).</p>
        <div class="example">
          <p><strong>NumPy:</strong></p>
          <pre><code>import numpy as np
A = np.array([[1., 2., 3.],
              [4., 5., 6.]])   # 2x3
A_pinv = np.linalg.pinv(A)</code></pre>
        </div>
        <div class="answer">
          <strong>Check — why is pseudoinverse more flexible than a normal inverse?</strong>
          <p>Because $A^+$ exists for any matrix (every real matrix has an SVD), whereas $A^{-1}$ exists only for square, full-rank matrices. For any system $Ax=b$, $x = A^+ b$ is the least-squares solution of minimum norm, so the pseudoinverse handles overdetermined, underdetermined, and singular cases uniformly.</p>
        </div>
      </section>
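      <div class="example">
        <p><strong>NumPy (sketch):</strong> The SVD formula above can be checked directly against <code>np.linalg.pinv</code>. This is a minimal illustration assuming NumPy; the tolerance <code>tol</code> used to decide which singular values count as nonzero is an arbitrary choice for the example.</p>

```python
import numpy as np

A = np.array([[1., 2., 3.],
              [4., 5., 6.]])            # 2x3, rank 2

U, s, Vt = np.linalg.svd(A, full_matrices=False)

# Reciprocals of the nonzero singular values; zeros stay zero
tol = 1e-12
s_plus = np.array([1.0 / x if x > tol else 0.0 for x in s])

# A^+ = V Sigma^+ U^T, which has shape 3x2
A_pinv = Vt.T @ np.diag(s_plus) @ U.T
assert np.allclose(A_pinv, np.linalg.pinv(A))

# Minimum-norm solution of the underdetermined system A x = b
b = np.array([1., 2.])
x = A_pinv @ b
assert np.allclose(A @ x, b)
```

</div>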
      <hr/>
      <section>
        <h2>2. Quadratic Forms</h2>
        <p><strong>Definition:</strong> A quadratic form is</p>
        <div class="example">
          $$
          Q(x) = x^\top A x,
          $$
          where $x\in\mathbb{R}^n$ and typically $A$ is symmetric (replacing $A$ by $\tfrac{1}{2}(A + A^\top)$ leaves $Q$ unchanged, so symmetry can always be assumed).</div>
        <p><strong>Geometric meaning:</strong> Level sets $x^\top A x = c$ are quadric surfaces such as ellipsoids and hyperboloids; in two dimensions they are conic sections.</p>
        <div class="answer">
          <strong>Check — if $A=I$, what is $Q(x)=x^\top I x$?</strong>
          <p>Then
            $$
            Q(x)=x^\top x = \|x\|_2^2,
            $$
            the squared Euclidean norm. Its level sets are spheres (centered at the origin).</p>
        </div>
        <div class="example">
          <p><strong>NumPy (compute quadratic form)</strong></p>
          <pre><code>x = np.array([1.,2.,3.])
A = np.eye(3)
Q = x.T @ A @ x   # equals np.dot(x,x)</code></pre>
        </div>
      </section>
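      <div class="example">
        <p><strong>NumPy (sketch):</strong> A quick numerical check that symmetrizing does not change the quadratic form. It assumes NumPy, and the non-symmetric matrix and the vector are made up for illustration.</p>

```python
import numpy as np

A = np.array([[2.0, 3.0],
              [-1.0, 1.0]])            # deliberately not symmetric
A_sym = (A + A.T) / 2                  # symmetric part of A

x = np.array([1.0, 2.0])
Q = x @ A @ x                          # quadratic form with the original matrix
Q_sym = x @ A_sym @ x                  # quadratic form with the symmetric part
assert np.isclose(Q, Q_sym)
print(Q, Q_sym)
```

</div>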
      <hr/>
      <section>
        <h2>3. Positive Definite Matrices</h2>
        <p><strong>Definition:</strong> Symmetric $A$ is <em>positive definite</em> if</p>
        <div class="example">
          $$
          x^\top A x > 0 \quad\text{for all nonzero } x.
          $$
        </div>
        <p><strong>Properties:</strong></p>
        <ul>
          <li>All eigenvalues are positive.</li>
          <li>$A$ is invertible (no eigenvalue is zero), which keeps the linear solves used in optimization well-defined.</li>
          <li>Cholesky decomposition exists ($A = R^\top R$ for some upper-triangular $R$).</li>
        </ul>
        <div class="answer">
          <strong>Check — why do optimization algorithms “like” positive definite Hessians?</strong>
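          <p>Because a positive definite Hessian at a point means the objective is locally strictly convex there, so a stationary point with a positive definite Hessian is a strict local minimum. Algorithms (Newton, quasi-Newton) rely on inverting or factoring the Hessian; positive definiteness guarantees invertibility and that the Newton step is a descent direction, which gives reliable steps and fast (for Newton's method, typically quadratic) convergence near the minimum. It does not by itself guarantee good conditioning: a positive definite matrix can still have a very large ratio of largest to smallest eigenvalue.</p>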
        </div>
      </section>
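      <div class="example">
        <p><strong>NumPy (sketch):</strong> Two of the properties above double as practical tests for positive definiteness. This is a minimal illustration assuming NumPy; the matrices are invented for the example. Note that <code>np.linalg.cholesky</code> returns the lower-triangular factor $L$ with $A = L L^\top$, so $R = L^\top$ in the notation used above.</p>

```python
import numpy as np

A = np.array([[4.0, 2.0],
              [2.0, 3.0]])             # symmetric

# Property: all eigenvalues are positive (eigvalsh is for symmetric matrices)
eigenvalues = np.linalg.eigvalsh(A)
assert np.all(eigenvalues > 0)

# Property: Cholesky succeeds; R = L^T is the upper-triangular factor in A = R^T R
L = np.linalg.cholesky(A)
R = L.T
assert np.allclose(R.T @ R, A)

# For a symmetric matrix that is not positive definite, cholesky raises LinAlgError
B = np.array([[1.0, 2.0],
              [2.0, 1.0]])             # eigenvalues 3 and -1
try:
    np.linalg.cholesky(B)
    b_is_pd = True
except np.linalg.LinAlgError:
    b_is_pd = False
```

</div>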
      <hr/>
      <section>
        <h2>4. Hadamard Product</h2>
        <p><strong>Definition:</strong> The Hadamard product (element-wise product) of same-size matrices $A$ and $B$ is</p>
        <div class="example">
          $$
          (A\circ B)_{ij} = A_{ij}\,B_{ij}.
          $$
        </div>
        <p><strong>Not the same as</strong> matrix multiplication — it multiplies corresponding entries.</p>
        <div class="example">
          <p><strong>Quick check:</strong> If</p>
          $$
          A = \begin{bmatrix}1 & 2\\[4pt] 3 & 4\end{bmatrix},\quad
          B = \begin{bmatrix}5 & 6\\[4pt] 7 & 8\end{bmatrix},
          $$
          then
          $$
          A\circ B = \begin{bmatrix}1\cdot 5 & 2\cdot 6\\[4pt] 3\cdot 7 & 4\cdot 8\end{bmatrix}
                    = \begin{bmatrix}5 & 12\\[4pt] 21 & 32\end{bmatrix}.
          $$
        </div>
        <div class="example">
          <p><strong>NumPy:</strong></p>
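          <pre><code>A, B = np.array([[1, 2], [3, 4]]), np.array([[5, 6], [7, 8]])
A * B   # element-wise (Hadamard) product: [[5, 12], [21, 32]]; A @ B is the matrix product</code></pre>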
        </div>
      </section>
    </details>
    <hr/>
  </div>
</body>
</html>
