<a href="https://colab.research.google.com/github/y-oth/dst_assessment2/blob/main/report/Scaling.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Scaling

Having discussed why interpretability is essential, especially in medical settings where model transparency is critical, we now consider how the different interpretability methods scale to industrial or clinical deployment. In a hospital environment, scaling means more than processing data from more patients. An MRI scan is a 3D volume of approximately 100–250 slices, depending on the resolution of the scanner. In this project we used a single slice per patient, but in practice the full volume would be analysed. Scaling CNNs and interpretability methods therefore requires handling both more patients and more slices per patient, so the per-image cost of each method is multiplied by both factors.
Below we evaluate the scalability of each interpretability method used in this project.

**Vanilla gradients** compute the gradient of the class score with respect to each pixel (i, j), measuring how sensitive the prediction is to small changes in that pixel. This requires only a single backward pass and is therefore extremely fast and highly scalable to large medical datasets.

**SmoothGrad** adds Gaussian noise to the input, computes vanilla gradients for N noisy samples, and averages the resulting saliency maps. This requires N backward passes per image (with N = 50, roughly 50× the cost of vanilla gradients), making it computationally expensive and generally infeasible for large-scale MRI datasets.
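The N-fold cost can be made concrete with a minimal sketch. It assumes a toy stand-in for the CNN (`class_score`, a tanh of a weighted sum with hypothetical `WEIGHTS`) whose gradient is available analytically; it is not the model used in this project. A counter records how many "backward passes" SmoothGrad triggers.

```python
import math
import random

# Hypothetical toy model standing in for a CNN class score: f(x) = tanh(w . x).
WEIGHTS = [0.5, -1.0, 2.0, 0.25]


def class_score(pixels):
    return math.tanh(sum(w * p for w, p in zip(WEIGHTS, pixels)))


def vanilla_gradient(pixels, counter):
    """Analytic gradient of the score w.r.t. each pixel (one 'backward pass')."""
    counter["backward_passes"] += 1
    s = class_score(pixels)
    return [(1.0 - s * s) * w for w in WEIGHTS]


def smoothgrad(pixels, n_samples=50, sigma=0.1, seed=0):
    """Average vanilla gradients over n_samples Gaussian-perturbed copies."""
    counter = {"backward_passes": 0}
    rng = random.Random(seed)
    total = [0.0] * len(pixels)
    for _ in range(n_samples):
        noisy = [p + rng.gauss(0.0, sigma) for p in pixels]
        grad = vanilla_gradient(noisy, counter)
        total = [t + g for t, g in zip(total, grad)]
    return [t / n_samples for t in total], counter


image = [0.2, 0.1, -0.3, 0.4]
saliency, counter = smoothgrad(image, n_samples=50)
print(counter["backward_passes"])  # 50
```

The saliency map has the same shape as the input, but producing it cost 50 gradient evaluations instead of one.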

**Integrated gradients** (IG) starts from a baseline image (e.g. an all-zero image) and interpolates in m steps along the straight-line path from the baseline to the input image. At each step, it computes the gradient of the class score with respect to the pixels; the gradients are then averaged and multiplied by the difference between the input and the baseline. This requires m gradient evaluations (commonly 50–300). IG can scale to larger datasets if m is kept modest (around 50); otherwise the cost grows significantly.
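As an illustration, the path integral can be approximated with a midpoint Riemann sum. The sketch below again assumes a hypothetical toy model (`class_score` with made-up `WEIGHTS`) rather than the project's CNN. It also checks the completeness property of IG: the attributions should sum to approximately the difference between the score at the input and the score at the baseline.

```python
import math

# Hypothetical toy model standing in for a CNN class score: f(x) = tanh(w . x).
WEIGHTS = [0.5, -1.0, 2.0, 0.25]


def class_score(pixels):
    return math.tanh(sum(w * p for w, p in zip(WEIGHTS, pixels)))


def gradient(pixels):
    """Analytic gradient of the score w.r.t. each pixel."""
    s = class_score(pixels)
    return [(1.0 - s * s) * w for w in WEIGHTS]


def integrated_gradients(pixels, baseline, m_steps=50):
    """Midpoint Riemann approximation of IG along the straight-line path."""
    avg_grad = [0.0] * len(pixels)
    for k in range(m_steps):
        alpha = (k + 0.5) / m_steps  # interpolation coefficient in (0, 1)
        point = [b + alpha * (p - b) for p, b in zip(pixels, baseline)]
        grad = gradient(point)  # one gradient evaluation per step
        avg_grad = [a + g / m_steps for a, g in zip(avg_grad, grad)]
    # Scale the averaged gradient by the input-baseline difference.
    return [(p - b) * a for p, b, a in zip(pixels, baseline, avg_grad)]


image = [0.2, 0.1, -0.3, 0.4]
baseline = [0.0] * len(image)
attributions = integrated_gradients(image, baseline, m_steps=50)

# Completeness check: attributions should sum to f(input) - f(baseline).
gap = abs(sum(attributions) - (class_score(image) - class_score(baseline)))
print(gap < 1e-3)
```

For a smooth toy function like this one, m = 50 already satisfies completeness closely; a real CNN may need more steps, which is where the cost grows.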

**SG-IG** combines SmoothGrad and Integrated Gradients. For each of N noisy samples, IG is computed using m steps, resulting in N × m gradient evaluations per image (for example, N = 50 and m = 50 give 2,500 evaluations). This method is computationally heavy and does not scale well to datasets with many MRI slices or many patients.
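The per-image counts above can be turned into a rough budget for a whole cohort. The sketch below is back-of-the-envelope arithmetic only: the cohort size (1,000 patients) and the 150 slices per patient are hypothetical values chosen for illustration, and the function names are not from any library. LRP is left out because its cost is not measured in gradient evaluations.

```python
def gradient_evaluations(method, n_samples=50, m_steps=50):
    """Gradient evaluations needed to explain one image (one MRI slice)."""
    costs = {
        "vanilla": 1,
        "smoothgrad": n_samples,
        "integrated_gradients": m_steps,
        "sg_ig": n_samples * m_steps,
        "grad_cam": 1,
    }
    return costs[method]


def evaluations_per_cohort(method, patients, slices_per_patient, **kwargs):
    """Total gradient evaluations when every slice of every patient is explained."""
    return patients * slices_per_patient * gradient_evaluations(method, **kwargs)


# Hypothetical cohort: 1,000 patients, 150 slices per MRI volume.
for method in ("vanilla", "smoothgrad", "integrated_gradients", "sg_ig", "grad_cam"):
    total = evaluations_per_cohort(method, patients=1000, slices_per_patient=150)
    print(f"{method:22s} {total:>12,d}")
```

Under these assumptions, vanilla gradients and Grad-CAM need 150,000 evaluations, SmoothGrad and IG 7.5 million each, and SG-IG 375 million.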

**Grad-CAM** computes the gradient of the class score with respect to the final convolutional feature maps, uses these gradients to weight the feature maps, and upsamples the result to produce a heatmap. Because it requires only one backward pass and operates on low-resolution feature maps rather than pixels, Grad-CAM is extremely efficient and highly scalable in industrial applications.
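The weighting step can be sketched on small hand-written arrays. The example assumes the feature maps and their gradients have already been extracted from the network (here they are made-up 2 × 2 values for two channels), and it uses nearest-neighbour upsampling for simplicity, whereas practical implementations typically interpolate more smoothly.

```python
def grad_cam(feature_maps, gradients):
    """feature_maps, gradients: K channels, each an H x W nested list."""
    height, width = len(feature_maps[0]), len(feature_maps[0][0])
    # Channel weight = spatial mean of that channel's gradient.
    weights = [sum(sum(row) for row in g) / (height * width) for g in gradients]
    cam = [[0.0] * width for _ in range(height)]
    for w_k, fmap in zip(weights, feature_maps):
        for i in range(height):
            for j in range(width):
                cam[i][j] += w_k * fmap[i][j]
    # ReLU keeps only regions that increase the class score.
    return [[max(0.0, v) for v in row] for row in cam]


def upsample_nearest(heatmap, factor):
    """Nearest-neighbour upsampling of an H x W map by an integer factor."""
    return [
        [heatmap[i // factor][j // factor] for j in range(len(heatmap[0]) * factor)]
        for i in range(len(heatmap) * factor)
    ]


# Made-up feature maps and gradients for two channels.
feature_maps = [
    [[1.0, 0.0], [0.0, 2.0]],
    [[0.0, 3.0], [1.0, 0.0]],
]
gradients = [
    [[0.5, 0.5], [0.5, 0.5]],      # channel weight  0.5
    [[-1.0, -1.0], [-1.0, -1.0]],  # channel weight -1.0
]

cam = grad_cam(feature_maps, gradients)
heatmap = upsample_nearest(cam, factor=2)
print(cam)  # [[0.5, 0.0], [0.0, 1.0]]
```

All the arithmetic happens at feature-map resolution; only the final upsampling touches the full image size, which is why the method stays cheap.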

**LRP** stores activations during the forward pass and then backpropagates relevance layer by layer using specially designed propagation rules. This process is more memory-intensive and slower than gradient-based methods, making LRP less efficient and harder to scale to large MRI datasets or full 3D volumes.
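To show why the forward activations must be kept, here is a minimal sketch of one common propagation rule, the epsilon rule, applied to a single bias-free dense layer. The layer sizes, weights, and relevance values are made up for illustration, and the project's LRP implementation may use different rules.

```python
def lrp_epsilon_dense(activations, weights, relevance_out, eps=1e-6):
    """Redistribute relevance from a dense layer's outputs to its inputs.

    activations: stored forward-pass inputs to the layer (length n_in)
    weights[i][j]: weight from input i to output j (no bias term)
    relevance_out: relevance assigned to each output (length n_out)
    """
    n_in, n_out = len(activations), len(relevance_out)
    # Pre-activations recomputed from the stored forward activations.
    z = [sum(activations[i] * weights[i][j] for i in range(n_in)) for j in range(n_out)]
    relevance_in = [0.0] * n_in
    for j in range(n_out):
        # Epsilon stabiliser, signed so the denominator moves away from zero.
        denom = z[j] + (eps if z[j] >= 0 else -eps)
        for i in range(n_in):
            contribution = activations[i] * weights[i][j]
            relevance_in[i] += contribution / denom * relevance_out[j]
    return relevance_in


# Made-up layer: 3 inputs, 2 outputs.
activations = [1.0, 2.0, 0.5]
weights = [[0.2, -0.1], [0.4, 0.3], [-0.6, 0.8]]
relevance_out = [0.7, 0.3]

relevance_in = lrp_epsilon_dense(activations, weights, relevance_out)
# Relevance is approximately conserved across the layer.
print(abs(sum(relevance_in) - sum(relevance_out)) < 1e-4)
```

Because every layer needs its own stored `activations` to redistribute relevance, memory use grows with network depth and input size, which is the scaling concern for full 3D volumes.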
