From fa9bbb0075701e7bc3fc955b5f4aa8c70df3f285 Mon Sep 17 00:00:00 2001
From: Yoann Poupart
Date: Wed, 5 Jun 2024 17:19:20 +0200
Subject: [PATCH] TLDR

---
 pages/_publications/lczero-planning.html | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/pages/_publications/lczero-planning.html b/pages/_publications/lczero-planning.html
index 503a705..17ee3d0 100644
--- a/pages/_publications/lczero-planning.html
+++ b/pages/_publications/lczero-planning.html
@@ -1,6 +1,6 @@
 ---
-title: LCZero Planning
-tldr: TLDR
+title: Contrastive Sparse Autoencoders for Interpreting Planning of Chess-Playing Agents
+tldr: We propose contrastive sparse autoencoders (CSAE), a novel framework for studying pairs of game trajectories. Using CSAE, we are able to extract and interpret concepts that are meaningful to the chess-agent plans. We primarily focused on a qualitative analysis of the CSAE features before proposing an automated feature taxonomy. Furthermore, to evaluate the quality of our trained CSAE, we devise sanity checks to wave spurious correlations in our results.
 tags:
 - Chess
 - XAI
@@ -11,9 +11,9 @@ editedOn: 2024-06-05
 authors:
 - "[[Yoann Poupart]]"
-readingTime:
+readingTime: 25
 image: /assets/publications/images/lczero-planning_thumbnail.png
-description: TL;DR> TLDR
+description: TL;DR> We propose contrastive sparse autoencoders (CSAE), a novel framework for studying pairs of game trajectories. Using CSAE, we are able to extract and interpret concepts that are meaningful to the chess-agent plans. We primarily focused on a qualitative analysis of the CSAE features before proposing an automated feature taxonomy. Furthermore, to evaluate the quality of our trained CSAE, we devise sanity checks to wave spurious correlations in our results.
 ---