# Stochastic Gradient Descent PageRank (SGD PageRank)

This document describes an SGD PageRank implementation tailored to graphs derived from single-cell multiomic data. The aim is to rank the nodes (cells) by their topological importance, using SGD so that the computation scales to large datasets.

## Preliminary Steps

### Dynamic Neighborhood Hopping

To capture the extended topological features of the graph, we implement dynamic neighborhood hopping:

$$
A^{(\alpha)} = A^{\alpha}
$$

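where \( A \) is the adjacency matrix and \( \alpha \) is a predefined number of hops, chosen so that every node can reach every other node within \( \alpha \) hops. This \( \alpha \) is unrelated to the learning rate, which is also written \( \alpha \) in the SGD section.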
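As a rough illustration, the hop expansion can be sketched with NumPy's `matrix_power`. The helper name `hop_adjacency` and the toy path graph are assumptions made for this example, not part of the original implementation. A dense array is used for brevity; large single-cell graphs would normally be held in a sparse format.

```python
import numpy as np

def hop_adjacency(A, alpha):
    """Return A raised to the power alpha (hypothetical helper).

    Entry (i, j) of the result counts the walks of length exactly
    alpha that lead from node i to node j.
    """
    return np.linalg.matrix_power(A, alpha)

# Toy undirected path graph 0 - 1 - 2 (illustrative data only).
A = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]])

A2 = hop_adjacency(A, 2)
```

In this toy graph, `A2[0, 2]` is nonzero because node 2 is reachable from node 0 in two hops, even though the two nodes are not directly connected in `A`.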

### Scaling Factor Calculation (Si)

The scaling factor for each node is calculated to down-weight the influence of highly connected nodes:

$$
S_i = \frac{1}{\text{degree}(i)} + C(D_i)
$$

where \( C(D_i) \) represents a correction based on the node's adjacency set.
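A minimal sketch of the scaling factor follows. Because the text does not define \( C(D_i) \), the `correction` argument below is only a placeholder that defaults to zero; that default, the helper name `scaling_factors`, and the star-shaped toy graph are all assumptions.

```python
import numpy as np

def scaling_factors(A, correction=None):
    """Compute S_i = 1 / degree(i) + C(D_i) for every node.

    `correction` stands in for C(D_i), which is not specified in the
    text; it defaults to zero here (an assumption).
    """
    degree = np.asarray(A).sum(axis=1).astype(float)
    if np.any(degree == 0):
        raise ValueError("isolated node: degree is zero")
    if correction is None:
        correction = np.zeros_like(degree)
    return 1.0 / degree + correction

# Star graph: node 0 is connected to nodes 1, 2 and 3 (illustrative data).
A = np.array([[0, 1, 1, 1],
              [1, 0, 0, 0],
              [1, 0, 0, 0],
              [1, 0, 0, 0]])

S = scaling_factors(A)
```

The hub (node 0) receives the smallest factor, which is the down-weighting effect described above.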

### Matrix Scaling (Mij)

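We scale the k-nearest-neighbors (KNN) matrix by applying the scaling factors to both ends of every connection, so that entry \( (i, j) \) is multiplied by the factors of node \( i \) and node \( j \):

$$
M_{ij} \leftarrow S_i \cdot M_{ij} \cdot S_j
$$

where \( M \) is the KNN matrix. In matrix form this is \( \operatorname{diag}(S) \, M \, \operatorname{diag}(S) \).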
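The scaling step might look like the sketch below. It reads "both incoming and outgoing connections" as multiplying row \( i \) and column \( j \) of \( M \) by the corresponding scaling factors, which is one interpretation of the formula rather than a confirmed detail. The helper name `scale_knn_matrix` and the numbers are illustrative.

```python
import numpy as np

def scale_knn_matrix(M, S):
    """Multiply row i by S[i] and column j by S[j].

    Equivalent to diag(S) @ M @ diag(S), but uses broadcasting so the
    diagonal matrices are never built explicitly.
    """
    M = np.asarray(M, dtype=float)
    S = np.asarray(S, dtype=float)
    return S[:, None] * M * S[None, :]

# Tiny 2-node example (illustrative values only).
M = np.array([[0.0, 2.0],
              [4.0, 0.0]])
S = np.array([0.5, 0.25])

M_scaled = scale_knn_matrix(M, S)
```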

## SGD PageRank Algorithm

The main algorithm proceeds as follows:

```python
def SGDpagerank(M, num_iterations, mini_batch_size, initial_learning_rate, tolerance, d, full_batch_update_iters, dip_window, plateau_iterations, sampling_method, init_vect=None, **kwargs):
    # ... (implementation details)
```

### Learning Rate (Alpha)

The learning rate decays with the iteration count, so later updates take smaller steps:

$$
\alpha = \frac{1}{1 + \text{decay\_rate} \cdot \text{iteration}}
$$
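The schedule can be written as a one-line function. The name `learning_rate` and the decay value `0.5` are chosen only for illustration.

```python
def learning_rate(iteration, decay_rate):
    """Inverse-time decay: alpha = 1 / (1 + decay_rate * iteration)."""
    return 1.0 / (1.0 + decay_rate * iteration)

# Learning rates for the first four iterations with an example decay rate.
rates = [learning_rate(t, decay_rate=0.5) for t in range(4)]
```

The rate starts at 1 for iteration 0 and shrinks monotonically from there.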

### PageRank Initialization

A random rank vector \( v \) is initialized and normalized to unit L1 norm, so that its entries sum to 1:

$$
r = \text{rand}(N, 1), \qquad v = \frac{r}{\| r \|_1}
$$

where \( N \) is the number of nodes.
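One way to sketch this initialization with NumPy is shown below. The helper name `init_rank_vector` and the `seed` argument are assumptions added for reproducibility of the example.

```python
import numpy as np

def init_rank_vector(n_nodes, seed=None):
    """Draw a random column vector and normalize it to unit L1 norm."""
    rng = np.random.default_rng(seed)
    r = rng.random((n_nodes, 1))   # uniform samples in [0, 1)
    return r / np.abs(r).sum()     # entries now sum to 1

v0 = init_rank_vector(5, seed=0)
```

The same random draw is used in the numerator and the denominator, so the result is a valid probability vector.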

### Mini-Batch SGD Iterations

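At each iteration, a subset of nodes (the mini-batch) is selected, and the corresponding entries of the PageRank vector are updated:

$$
v_{\text{mini\_batch}} = d \cdot (\alpha \cdot M_{\text{mini\_batch}} @ v) + \left(\frac{1 - d}{N}\right)
$$

where \( @ \) denotes matrix-vector multiplication, \( d \) is the damping factor, and \( \alpha \) is the current learning rate.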
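A single mini-batch step could be sketched as follows. Two details are assumptions, since the text does not spell them out: \( M_{\text{mini\_batch}} \) is taken to be the rows of \( M \) belonging to the sampled nodes, and the batch is drawn uniformly at random. The helper name `minibatch_update` and the toy matrix are illustrative.

```python
import numpy as np

def minibatch_update(M, v, batch_idx, alpha, d):
    """Apply the mini-batch rule to the entries of v listed in batch_idx.

    Assumption: M_mini_batch is the set of rows of M for the sampled
    nodes; all other entries of v are left unchanged.
    """
    n = M.shape[0]
    v_new = v.copy()
    v_new[batch_idx] = d * (alpha * (M[batch_idx, :] @ v)) + (1.0 - d) / n
    return v_new

# Toy 3-node matrix and a uniform starting vector (illustrative values).
M = np.array([[0.0, 0.5, 0.5],
              [0.5, 0.0, 0.5],
              [0.5, 0.5, 0.0]])
v = np.full((3, 1), 1.0 / 3.0)

rng = np.random.default_rng(0)
# Uniform sampling without replacement (assumed sampling method).
batch_idx = rng.choice(M.shape[0], size=2, replace=False)
v_next = minibatch_update(M, v, batch_idx, alpha=0.5, d=0.85)
```

Only the two sampled entries change; the remaining entry keeps its previous value until it is drawn in a later batch.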

## Convergence Check

Convergence is declared when the L2 norm of the change in the PageRank vector between consecutive iterations falls below the tolerance:

$$
\| v_{\text{iter}} - v_{\text{prev}} \|_2 < \text{tolerance}
$$
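The check itself is short. The helper name `has_converged` and the example vectors are illustrative.

```python
import numpy as np

def has_converged(v_iter, v_prev, tolerance):
    """Return True when the L2 norm of the change is below tolerance."""
    diff = np.ravel(np.asarray(v_iter) - np.asarray(v_prev))
    return bool(np.linalg.norm(diff) < tolerance)

v_prev = np.array([[0.5], [0.5]])
tiny_change = has_converged(v_prev + 1e-9, v_prev, tolerance=1e-6)
large_change = has_converged(v_prev + 0.1, v_prev, tolerance=1e-6)
```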

## Full-Batch Updates

After the main SGD iterations, we perform a fixed number of full-batch updates, each covering all nodes at once, to fine-tune the result:

$$
v = d \cdot (M @ v) + \left(\frac{1 - d}{N}\right)
$$
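The full-batch phase can be sketched as a plain loop over the update rule. The helper name `full_batch_updates`, the toy matrix, and the choice of 50 updates are assumptions for the example.

```python
import numpy as np

def full_batch_updates(M, v, d, n_updates):
    """Repeat v = d * (M @ v) + (1 - d) / N over all nodes."""
    n = M.shape[0]
    for _ in range(n_updates):
        v = d * (M @ v) + (1.0 - d) / n
    return v

# Toy 3-node matrix whose columns sum to 1 (illustrative values).
M = np.array([[0.0, 0.5, 0.5],
              [0.5, 0.0, 0.5],
              [0.5, 0.5, 0.0]])
v_start = np.array([[0.7], [0.2], [0.1]])

v_final = full_batch_updates(M, v_start, d=0.85, n_updates=50)
```

Because this toy graph is fully symmetric, the iteration settles on the uniform vector, and the entries keep summing to 1 throughout.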

## Post-Processing

### Softmax Transformation

Once the PageRank vector is obtained, a softmax transformation is applied to obtain a probability distribution for sampling:

$$
P_i = \frac{e^{v_i}}{\sum_{j=1}^N e^{v_j}}
$$
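A numerically stable sketch of the softmax step follows. Subtracting the maximum before exponentiating does not change the result but avoids overflow; this is a standard trick added here, not something the text prescribes. The helper name `softmax` and the input values are illustrative.

```python
import numpy as np

def softmax(v):
    """Softmax over a 1-D or column vector, returned as a 1-D array."""
    v = np.asarray(v, dtype=float).ravel()
    shifted = v - v.max()          # stabilizes exp() for large inputs
    e = np.exp(shifted)
    return e / e.sum()

P = softmax([0.1, 0.2, 0.7])
```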

### Sampling Strategy

Finally, nodes are selected by sampling from the softmax-transformed PageRank scores:

- Sampling is repeated 100 times, with nodes drawn according to the softmax-transformed scores.
- Each node's observed sampling probability across these repetitions is then used to derive the output node indices at the requested proportion.
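The sampling procedure above leaves several details open, so the sketch below fills them with explicit assumptions: each of the 100 rounds draws `k = round(proportion * N)` nodes without replacement, the observed probability of a node is the fraction of rounds in which it was drawn, and the `k` nodes with the highest observed probability are returned. The helper name `sample_nodes` and its parameters are hypothetical.

```python
import numpy as np

def sample_nodes(P, proportion, n_rounds=100, seed=None):
    """Select a proportion of nodes by repeated weighted sampling.

    Assumptions (not stated in the text): each round draws k nodes
    without replacement, and the k most frequently drawn nodes win.
    """
    rng = np.random.default_rng(seed)
    P = np.asarray(P, dtype=float)
    n = P.size
    k = max(1, int(round(proportion * n)))
    counts = np.zeros(n)
    for _ in range(n_rounds):
        drawn = rng.choice(n, size=k, replace=False, p=P)
        counts[drawn] += 1
    observed = counts / n_rounds                     # per-node frequency
    selected = np.argsort(-observed, kind="stable")[:k]
    return np.sort(selected), observed

# Example probabilities for four nodes (illustrative values).
P = np.array([0.4, 0.3, 0.2, 0.1])
selected, observed = sample_nodes(P, proportion=0.5, seed=0)
```

Here `selected` holds the chosen node indices and `observed` holds the empirical sampling frequency of every node.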

This concludes the detailed algorithmic description of the SGD PageRank methodology.
