# FLOPs analysis for ResNet-18 and caching
This notebook computes the FLOPs of the first layers of ResNet-18 and analyzes caching scenarios for reducing the computational load.

In [1]:
# Convolutional layers in the first 3 blocks (residual stages) of ResNet-18, as in the document
# Each tuple: (name, H, W, K, C_in, C_out, n_convs), with H x W the output resolution

layers = [
    ("Conv1", 112, 112, 7, 3, 64, 1),
    ("Block1_Conv1", 56, 56, 3, 64, 64, 1),
    ("Block1_Conv2", 56, 56, 3, 64, 64, 1),
    ("Block2_Conv1", 56, 56, 3, 64, 64, 1),
    ("Block2_Conv2", 56, 56, 3, 64, 64, 1),
    ("Block3_Conv1", 28, 28, 3, 64, 128, 1),
    ("Block3_Conv2", 28, 28, 3, 128, 128, 1),
    ("Block4_Conv1", 28, 28, 3, 128, 128, 1),
    ("Block4_Conv2", 28, 28, 3, 128, 128, 1),
    ("Block5_Conv1", 14, 14, 3, 128, 256, 1),
    ("Block5_Conv2", 14, 14, 3, 256, 256, 1),
    ("Block6_Conv1", 14, 14, 3, 256, 256, 1),
    ("Block6_Conv2", 14, 14, 3, 256, 256, 1),
]
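
The spatial sizes in the list above (112, 56, 28, 14) are consistent with a 224×224 input passing through the standard ResNet-18 strides. The following is a minimal sketch of that check, assuming the usual configuration: a 7×7 stem convolution with stride 2 and padding 3, a 3×3 max-pool with stride 2 and padding 1, and a stride-2 3×3 convolution with padding 1 at the start of each later stage. The helper `conv_out_size` and the 224 input size are assumptions introduced for illustration and do not appear in the notebook.

```python
def conv_out_size(size, kernel, stride, padding):
    """Output spatial size of a convolution or pooling layer (floor mode)."""
    return (size + 2 * padding - kernel) // stride + 1


size = 224                                # assumed input resolution
stem = conv_out_size(size, 7, 2, 3)       # Conv1
pooled = conv_out_size(stem, 3, 2, 1)     # max-pool before Block1
stage2 = conv_out_size(pooled, 3, 2, 1)   # Block3_Conv1 (stride 2)
stage3 = conv_out_size(stage2, 3, 2, 1)   # Block5_Conv1 (stride 2)
print(stem, pooled, stage2, stage3)       # 112 56 28 14
```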

In [2]:
# Compute the FLOPs of each convolution
results = []
for name, H, W, K, C_in, C_out, n_convs in layers:
    flops = 2 * H * W * (K ** 2) * C_in * C_out * n_convs
    results.append((name, flops))

for name, flops in results:
    print(f"{name}: {flops / 1e9:.4f} GFLOPs")

total_flops = sum(flops for _, flops in results)
print(f"Total GFLOPs: {total_flops / 1e9:.4f} GFLOPs")

Conv1: 0.2360 GFLOPs
Block1_Conv1: 0.2312 GFLOPs
Block1_Conv2: 0.2312 GFLOPs
Block2_Conv1: 0.2312 GFLOPs
Block2_Conv2: 0.2312 GFLOPs
Block3_Conv1: 0.1156 GFLOPs
Block3_Conv2: 0.2312 GFLOPs
Block4_Conv1: 0.2312 GFLOPs
Block4_Conv2: 0.2312 GFLOPs
Block5_Conv1: 0.1156 GFLOPs
Block5_Conv2: 0.2312 GFLOPs
Block6_Conv1: 0.2312 GFLOPs
Block6_Conv2: 0.2312 GFLOPs
Total GFLOPs: 2.7793 GFLOPs
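
The per-layer counts above follow the common convention of two FLOPs (one multiply and one add) per multiply-accumulate, and they leave out bias terms, batch normalization, and activations. With $H \times W$ the output resolution, $K$ the kernel size, and $C_{in}$, $C_{out}$ the channel counts, each entry is:

$$
\text{FLOPs} = 2 \cdot H \cdot W \cdot K^{2} \cdot C_{in} \cdot C_{out}
$$

For `Block1_Conv1`, for example, $2 \cdot 56 \cdot 56 \cdot 3^{2} \cdot 64 \cdot 64 = 231{,}211{,}008 \approx 0.2312$ GFLOPs. The extra `n_convs` factor in the code is 1 for every layer listed, so it does not change these values.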


### Optimistic Scenario:
In this scenario we assume that it is, ideally, possible to truncate the convolution itself. Specifically, we assume that we can drop output channels so that the 20% of channels we cached are never computed. (There is no way to do this with current libraries, and I think it would be hard to implement. Does it make sense to do it?)
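
Because the per-layer FLOPs are linear in `C_out`, skipping cached output channels shrinks the cost proportionally, but only up to rounding: 20% of 64, 128, or 256 channels is not a whole number. The following is a minimal sketch of that effect, assuming the number of cached channels is rounded down. `truncated_conv_flops` is a hypothetical helper introduced here, not something the notebook or any library provides.

```python
def truncated_conv_flops(H, W, K, C_in, C_out, fraction_cached):
    """Channels computed and FLOPs of a convolution that skips cached output channels.

    Assumption: the number of cached channels is rounded down, so at
    least (1 - fraction_cached) of the output channels are still computed.
    """
    cached = int(fraction_cached * C_out)
    kept = C_out - cached
    return kept, 2 * H * W * (K ** 2) * C_in * kept


for C_out in (64, 128, 256):
    kept, flops = truncated_conv_flops(56, 56, 3, 64, C_out, 0.2)
    print(f"C_out={C_out}: compute {kept} channels ({kept / C_out:.2%}), {flops / 1e9:.4f} GFLOPs")
```

Under this rounding assumption, a 64-channel layer still computes 52 channels (81.25%), so the real saving per layer would be slightly below the nominal 20%.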


In [32]:
# Optimistic scenario with 20% caching
fraction_cached = 0.2
fraction_uncached = 1.0 - fraction_cached
student_model_cost = 0.5 * total_flops
flops_after_caching = fraction_uncached * total_flops + student_model_cost
flops_saved = total_flops - flops_after_caching

print("\n --- Caching Analysis in optimistic scenario ---")
print(f"Total FLOPs before caching: {float(total_flops) / 1e9:.4f} GFLOPs")
print(f"FLOPs still computed in the original layers: {float(fraction_uncached * total_flops) / 1e9:.4f} GFLOPs ({fraction_uncached * 100:.2f}% of total FLOPs)")
print(f"Total FLOPs after caching: {float(flops_after_caching) / 1e9:.4f} GFLOPs, which is {float(flops_after_caching) / float(total_flops) * 100:.2f}% of the total FLOPs")
print(f"|-> Considering sum of \t the caching optimization {float(fraction_uncached * total_flops) / 1e9:.4f} and \n \t\t\t the student model cost {float(student_model_cost) / 1e9:.4f}")
print(f"FLOPs saved: {float(flops_saved) / 1e9:.4f} GFLOPs, which is {float(flops_saved) / float(total_flops) * 100:.2f}% of the total FLOPs")


 --- Caching Analysis in optimistic scenario ---
Total FLOPs before caching: 2.7793 GFLOPs
FLOPs still computed in the original layers: 2.2235 GFLOPs (80.00% of total FLOPs)
Total FLOPs after caching: 3.6132 GFLOPs, which is 130.00% of the total FLOPs
|-> Considering sum of 	 the caching optimization 2.2235 and 
 			 the student model cost 1.3897
FLOPs saved: -0.8338 GFLOPs, which is -30.00% of the total FLOPs


### Realistic Scenario:
In this scenario we assume that we can cache the outputs of the first layers and reuse them in the following layers, but we still have to compute the full output of those layers, so caching saves no FLOPs. This is a more realistic approach, as it reflects what current deep learning frameworks can do.

In [28]:
# Realistic (pessimistic) scenario: no savings from caching
flops_after_caching_pessimistic = total_flops + student_model_cost
flops_saved_pessimistic = total_flops - flops_after_caching_pessimistic

print(f"Total FLOPs before caching: {float(total_flops) / 1e9:.4f} GFLOPs")
print(f"FLOPs of cached layers: {total_flops / 1e9:.4f} GFLOPs (no caching)")
print(f"Total FLOPs after caching: {flops_after_caching_pessimistic / 1e9:.4f} GFLOPs, which is {flops_after_caching_pessimistic / total_flops * 100:.2f}% of the total FLOPs")
print(f"|-> Considering sum of \t the caching optimization {float(total_flops) / 1e9:.4f} and \n \t\t\t the student model cost {float(student_model_cost) / 1e9:.4f}")
print(f"FLOPs saved: {flops_saved_pessimistic / 1e9:.4f} GFLOPs, which is {flops_saved_pessimistic / total_flops * 100:.2f}% of the total FLOPs")

Total FLOPs before caching: 2.7793 GFLOPs
FLOPs of cached layers: 2.7793 GFLOPs (no caching)
Total FLOPs after caching: 4.1690 GFLOPs, which is 150.00% of the total FLOPs
|-> Considering sum of 	 the caching optimization 2.7793 and 
 			 the student model cost 1.3897
FLOPs saved: -1.3897 GFLOPs, which is -50.00% of the total FLOPs


### Optimistic Scenario with Caching:
In this scenario we assume that all the cached outputs of the first layers can be reused by the following layers, which reduces the computational load significantly. The document's discussion shows that this approach leads to a significant reduction in accuracy and performance.

In [29]:
# Optimistic scenario with full caching
fraction_cached_optimistic = 1.0
fraction_uncached_optimistic = 0.0
flops_after_caching_optimistic = fraction_uncached_optimistic * total_flops + student_model_cost
flops_saved_optimistic = total_flops - flops_after_caching_optimistic

print(f"Total FLOPs before caching: {float(total_flops) / 1e9:.4f} GFLOPs")
print(f"FLOPs still computed in the original layers: {float(fraction_uncached_optimistic * total_flops) / 1e9:.4f} GFLOPs ({fraction_uncached_optimistic * 100:.2f}% of total FLOPs)")
print(f"Total FLOPs after caching: {float(flops_after_caching_optimistic) / 1e9:.4f} GFLOPs, which is {float(flops_after_caching_optimistic) / float(total_flops) * 100:.2f}% of the total FLOPs")
print(f"|-> Considering sum of \t the caching optimization {float(fraction_uncached_optimistic * total_flops) / 1e9:.4f} and \n \t\t\t the student model cost {float(student_model_cost) / 1e9:.4f}")
print(f"FLOPs saved: {float(flops_saved_optimistic) / 1e9:.4f} GFLOPs, which is {float(flops_saved_optimistic) / float(total_flops) * 100:.2f}% of the total FLOPs")

Total FLOPs before caching: 2.7793 GFLOPs
FLOPs still computed in the original layers: 0.0000 GFLOPs (0.00% of total FLOPs)
Total FLOPs after caching: 1.3897 GFLOPs, which is 50.00% of the total FLOPs
|-> Considering sum of 	 the caching optimization 0.0000 and 
 			 the student model cost 1.3897
FLOPs saved: 1.3897 GFLOPs, which is 50.00% of the total FLOPs
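
All three scenarios apply the same cost model: the layers that are still computed cost `(1 - fraction_cached) * total_flops`, and the student model adds a fixed `0.5 * total_flops`. Under that model, caching only pays off when the cached fraction exceeds the student's relative cost. The following is a minimal sketch of that relationship. `caching_summary` and `student_fraction` are names introduced here for illustration, and the hard-coded total is the sum of the per-layer FLOPs computed earlier.

```python
def caching_summary(total_flops, fraction_cached, student_fraction=0.5):
    """Return (flops_after, flops_saved) under the notebook's cost model."""
    flops_after = (1.0 - fraction_cached) * total_flops + student_fraction * total_flops
    return flops_after, total_flops - flops_after


total_flops = 2_779_348_992  # sum of the per-layer FLOPs listed earlier
for f in (0.0, 0.2, 0.5, 1.0):
    after, saved = caching_summary(total_flops, f)
    print(f"cached={f:.0%}: after={after / 1e9:.4f} GFLOPs, saved={saved / total_flops:+.2%}")
```

With `student_fraction = 0.5` the break-even point is a cached fraction of 50%, which is why the 20% scenario ends up above the original cost while full caching halves it.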
