HanMoonSub · SEOYUNJE · Jun 1, 2026 · May 30, 2026 · Jun 1, 2026 · Jun 1, 2026
diff --git a/deepguard/MS_EffGCViT.md b/deepguard/MS_EffGCViT.md
@@ -45,32 +45,20 @@ much larger SOTA models while using a fraction of the parameters and compute.
 <br>
 <img src="../docs/benchmarks/celeb_df_v2_gcvit_2.png" width="100%">
 
-| Variant | Test@Acc | Test@AUC | Test@LogLoss |
-| :------ | :------: | :------: | :----------: |
-| ms_eff_gcvit_b0 | 0.9842 | 0.9965 | 0.0283 |
-| ms_eff_gcvit_b5 | 0.9981 | 0.9984 | 0.0089 |
 </details>
 
 <details>
 <summary><b>📊 FaceForensics++ — Accuracy & Efficiency</b></summary>
 <br>
 <img src="../docs/benchmarks/ff_gcvit.png" width="100%">
 
-| Variant | Test@Acc | Test@AUC | Test@LogLoss |
-| :------ | :------: | :------: | :----------: |
-| ms_eff_gcvit_b0 | 0.9808 | 0.9969 | 0.0637 |
-| ms_eff_gcvit_b5 | 0.9850 | 0.9974 | 0.0492 |
 </details>
 
 <details>
 <summary><b>📊 KoDF Competition — Accuracy Ranking</b></summary>
 <br>
 <img src="../docs/benchmarks/kodf_gcvit.png" width="100%">
 
-| Variant | Test@Acc | Test@AUC | Test@LogLoss |
-| :------ | :------: | :------: | :----------: |
-| ms_eff_gcvit_b0 | 0.9655 | 0.9792 | 0.1237 |
-| ms_eff_gcvit_b5 | 0.9792 | 0.9974 | 0.0492 |
 </details>
 
 ## Model Introduction

diff --git a/deepguard/MS_EffViT.md b/deepguard/MS_EffViT.md
@@ -12,6 +12,8 @@ This Repository presents the PyTorch implementation of **Multi Scale Efficient V
 
 This model is a **frame-level** and **spatial-domain** architecture, designed to perform classification tasks on both **static images** and **video sequences**
 
+<img src="../docs/benchmarks/celeb_df_v2_vit.png" width="900">
+
 ## 💥 News 💥
 
 - [**02.03.2026**] 🔥🔥 We have released **FaceForensics++** fine-tuned **MS-Eff-ViT B5** model weights for **384X384**
@@ -21,18 +23,32 @@ This model is a **frame-level** and **spatial-domain** architecture, designed to
 
 ## Model Performance
 
-MS_Eff_ViT achieves state-of-the-art (SOTA) results across deepfake video classification. On Celeb_DF(v2) dataset, MS_EFF_GCViT variants with `5.9M`, `52.0M` parameters achieve `0.9742`, `0.9900` Accuracy. Notably, the MS_EFF_ViT_B0 variant demonstrates exceptional efficiency, matching or exceeding SOTA performance even with a significantly lower parameter
+**MS-EFF-ViT achieves state-of-the-art (SOTA) results across two DeepFake benchmarks.**
+The model ships in two variants from a single architecture — **Fast (b0)** for real-time / edge
+deployment and **Pro (b5)** for enterprise-grade accuracy. Notably, **Fast** matches or exceeds
+much larger SOTA models while using a fraction of the parameters and compute.
 
+<p align="center">
+  <img src="../docs/benchmarks/vit_summary_bars.png" width="100%">
+</p>
 
-### Test Result of Celeb_DF(v2)
 
-<img src="../docs/benchmarks/celeb_df_v2_vit.png" width="900">
+> On **Celeb-DF(v2)**, Pro reaches **0.9900 Acc** (rank #2) and Fast **0.9742** (rank #4) among 20 architectures.
+
+<details>
+<summary><b>📊 Celeb-DF (v2) — Accuracy & Efficiency</b></summary>
+<br>
+<img src="../docs/benchmarks/celeb_df_v2_vit_2.png" width="100%">
+
+</details>
 
 <details>
-<summary><span style="font-size: 1.25em; font-weight: bold;">Test Result of FaceForensics++</span></summary>
-<img src="../docs/benchmarks/ff_vit.png" width="900">
+<summary><b>📊 FaceForensics++ — Accuracy & Efficiency</b></summary>
+<br>
+<img src="../docs/benchmarks/ff_vit.png" width="100%">
 </details>
 
+
 ## Model Introduction
 
 Multi Scale Efficient Vision Transformer is an optimized multi-scale hybrid architecture that integrates CNN-driven spatial inductive bias with self-attention mechanisms to identify both subtle (local) and macro (global) artifacts for robust deepfake forensics.

diff --git a/docs/benchmarks/celeb_df_v2_vit.png b/docs/benchmarks/celeb_df_v2_vit.png
diff --git a/docs/benchmarks/celeb_df_v2_vit_2.png b/docs/benchmarks/celeb_df_v2_vit_2.png
diff --git a/docs/benchmarks/ff_vit.png b/docs/benchmarks/ff_vit.png
diff --git a/docs/benchmarks/vit_summary_bars.png b/docs/benchmarks/vit_summary_bars.png