roadmap

yp-edu · Jun 25, 2024 · ef14d2e
1 parent 60e43d2
commit ef14d2e
Showing 2 changed files with 108 additions and 2 deletions.
diff --git a/assets/publications/css/index.css b/assets/publications/css/index.css
@@ -133,5 +133,9 @@ body {
   font-size: smaller;
 }
 
-
+.checkbox-list {
+  list-style-type: none !important; 
+  padding-left: 0;
+  float: left;
+}
 
diff --git a/pages/_publications/lczero-planning.html b/pages/_publications/lczero-planning.html
@@ -71,7 +71,7 @@ <h1 class="title is-1 publication-title">Contrastive Sparse Autoencoders for Int
                   <span class="icon">
                     <i class="fas fa-pen"></i>
                   </span>
-                  <span>OpenReview</span>
+                  <span>OpenReview (RLC)</span>
                 </a>
               </span>
 
@@ -163,7 +163,109 @@ <h2 class="subtitle has-text-centered">
 </section>
 <!-- End image carousel -->
 
+<section class="section hero">
+  <div class="container is-max-desktop">
+    <div class="columns is-centered has-text-centered">
+      <div class="column is-four-fifths">
+        <h2 class="title is-3">Reviews</h2>
+        <div class="is-size-5 publication-authors">
+          <span class="author-block">Mechanistic Interpretability Workshop @ ICML-2024 </span>
+        </div>
+        <div class="column has-text-centered">
+          <div class="publication-links">
+            <span class="link-block">
+              <a href="https://openreview.net/forum?id=tXe9BqcjNY" target="_blank"
+              class="external-link button is-normal is-rounded is-dark">
+                <span class="icon">
+                  <i class="fas fa-pen"></i>
+                </span>
+                <span>OpenReview (ICML)</span>
+              </a>
+            </span>
+          </div>
+        </div>
+        <div>
+          <span>Main criticisms:</span>
+        </div>
+        <div class="content has-text-justified">
+          <ul>
+            <li>
+              <strong>Not enough training details:</strong> The training of the CSAE is insufficiently detailed, particularly regarding the choice of hyperparameters, 
+              data generation, and evaluation metrics. 
+              The specific layers used for training and the integration of the contrastive loss are not clearly explained, making it difficult to fully replicate or understand the methodology.
+            </li>
+            <li>
+              <strong>Lack of comparison with other methods:</strong> The paper fails to compare the proposed CSAE 
+              method with other feature extraction techniques, such as standard Sparse Autoencoders (SAE), Independent Component Analysis (ICA), and other clustering or probing methods. 
+              This comparison is crucial to validate the efficacy and novelty of the CSAE over existing methods.
+            </li>
+            <li>
+              <strong>Lack of feature interpretation:</strong> The interpretation of features generated by CSAE is inadequate. The paper does not convincingly demonstrate that the identified features correspond to meaningful chess concepts, 
+              as only a few cherry-picked examples are provided without thorough validation of monosemanticity or broader representativeness.
+            </li>
+            <li>
+              <strong>No utilisation of the proposed clustering:</strong> Although clustering and dendrogram techniques are mentioned, they are not effectively used to enhance the understanding of the feature space. 
+              The paper does not provide labeled clusters or investigate the similarity within clusters to help readers understand the model’s internal representation.
+            </li>
+            <li>
+              <strong>Insufficient qualitative and quantitative evaluations:</strong> The qualitative assessments do not scale well 
+              with human effort, and there is a lack of extensive qualitative evidence to support the interpretability of the learned features. Quantitatively, performance metrics such as F1, precision, and recall are barely above threshold levels, and no robust statistical analysis is provided to support the findings. Moreover, the paper does not demonstrate 
+              how these features impact chess-playing decisions in practical scenarios.
+            </li>
+          </ul>
+        </div>
+      </div>
+    </div>
+  </div>
+</section>
 
+<section class="section hero">
+  <div class="container is-max-desktop">
+    <div class="columns is-centered has-text-centered">
+      <div class="column is-four-fifths">
+        <h2 class="title is-3">Roadmap</h2>
+        <div class="publication-authors">
+          <span>I propose the following roadmap to address the reviews:</span>
+        </div>
+        <div class="content has-text-justified">
+          <ul class="checkbox-list">
+            <li>
+              <input type="checkbox" onclick="return false;"/> <strong>Enhanced method evaluation:</strong> 
+              Currently, the paper lacks a comparative analysis with other methods. 
+              To address this, I plan to conduct evaluations against simple heuristic models to determine how 
+              effectively my method extracts meaningful features. Additionally, I will compare the performance of Contrastive Sparse Autoencoders (CSAEs) with standard Sparse Autoencoders (SAEs) and other relevant techniques to establish a clear benchmark.
+            </li>
+            <li>
+              <input type="checkbox" onclick="return false;"/> <strong>Feature ablation study:</strong> I will carry out an ablation study to assess the impact of individual features extracted by the CSAE on the decision-making process of the chess agent. 
+              This will help quantify the contribution of each feature towards enhancing the agent’s planning capabilities.
+            </li>
+            <li>
+              <input type="checkbox" onclick="return false;"/> <strong>Expose more features:</strong> In addition to the Hugging Face Space created for the paper, I will provide a more detailed
+              analysis of the features extracted by the CSAE.
+            </li>
+            <li>
+              <input type="checkbox" onclick="return false;"/> <strong>Remove or rethink the clustering approach:</strong> While the clustering approach could in theory 
+              help scale human analysis, it is not clear how it would be used in practice. 
+              I do not yet have a clear idea of how to address this issue and might remove it from the paper.
+            </li>
+          </ul>
+        </div>
+        <div>
+          <span>What I might leave for future work:</span>
+        </div>
+        <div class="content has-text-justified">
+          <ul class="checkbox-list">
+            <li>
+              <input type="checkbox" onclick="return false;"/> <strong>Establishing a proper benchmark:</strong> Setting up a robust benchmark, particularly in the context of chess, 
+                would address many criticisms and provide a more objective evaluation of the CSAE. However, due to the significant work required, 
+                I consider this an essential next step for future research rather than something to include in the current paper revision.
+            </li>
+          </ul>
+        </div>
+      </div>
+    </div>
+  </div>
+</section>
 
 
 <!-- Youtube video -->