# Chain-of-thought

- 📺 **Video:** [https://youtu.be/tNGu3EqJbKc](https://youtu.be/tNGu3EqJbKc)

## Overview
This section covers prompting a model to generate a step-by-step reasoning chain before it gives its final answer. The idea is tied to Wei et al. 2022 (Google's “Chain-of-Thought Prompting Elicits Reasoning in Large Language Models”), which found that giving few-shot examples in which the solution is reached via intermediate steps lets large LMs (especially those with 100B+ parameters) solve significantly harder problems, such as multi-step math and logic puzzles, than prompting for a direct answer. A related zero-shot variant, appending a trigger phrase such as “Let's think step by step”, was reported by Kojima et al. 2022 (“Large Language Models are Zero-Shot Reasoners”).
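
The two prompting styles mentioned above can be sketched as plain string templates. This is a minimal illustration, not code from the video or the papers: the function names `zero_shot_cot_prompt` and `few_shot_cot_prompt`, the `Q:`/`A:` layout, and the exemplar are assumptions chosen for this example, and no model is called.

```python
# Hypothetical helpers: names and prompt layout are illustrative assumptions.

def zero_shot_cot_prompt(question: str) -> str:
    """Append a reasoning trigger so the model writes steps before answering."""
    return f"Q: {question}\nA: Let's think step by step."


def few_shot_cot_prompt(question: str, exemplars: list[tuple[str, str]]) -> str:
    """Prepend worked examples whose answers include intermediate steps."""
    blocks = [f"Q: {q}\nA: {worked}" for q, worked in exemplars]
    blocks.append(f"Q: {question}\nA:")
    return "\n\n".join(blocks)


# One made-up exemplar with its reasoning spelled out.
EXEMPLARS = [
    (
        "A box has 4 pens and you add 3 more. How many pens are there?",
        "Initially there are 4 pens. Adding 3 gives 4 + 3 = 7. "
        "The answer is 7.",
    ),
]

question = "A shelf holds 5 books and 2 more are added. How many books are there?"
print(zero_shot_cot_prompt(question))
print()
print(few_shot_cot_prompt(question, EXEMPLARS))
```

Either string would then be sent to whatever model is under test. The few-shot variant costs more prompt tokens but shows the model the reasoning format to imitate.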

In [None]:
import os, random
random.seed(0)
CI = os.environ.get('CI') == 'true'

## Key ideas
- The video likely walks through an example such as: Q: “If there are 3 apples and you take away 2, how many are left?” With a direct prompt the model may still get this simple case right (the answer is 1), but the same prompting style tends to fail on more complex, multi-step questions.
- With a CoT prompt, the model writes out its reasoning before the final answer: “Let's think step by step. Initially there are 3 apples. Taking away 2 leaves 3 - 2 = 1. So the answer is 1.”
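
Because a CoT response mixes reasoning with the result, code that consumes it typically has to pull the final answer out of the text. The sketch below does this for the apples example, under the assumption that the response states its result with a phrase like "answer is N". The helper name `extract_final_answer` and the sample response string are hypothetical, and real model outputs may need a more tolerant pattern.

```python
import re
from typing import Optional

# Assumed convention: the response states its result as "answer is <integer>".
ANSWER_PATTERN = re.compile(r"answer is\s*(-?\d+)", re.IGNORECASE)


def extract_final_answer(response: str) -> Optional[int]:
    """Return the last 'answer is N' integer in the response, or None."""
    matches = ANSWER_PATTERN.findall(response)
    return int(matches[-1]) if matches else None


# Hand-written stand-in for a model response (not real model output).
sample_response = (
    "Let's think step by step. Initially there are 3 apples. "
    "Taking away 2 leaves 3 - 2 = 1. So the answer is 1."
)

print(extract_final_answer(sample_response))   # 1
print(extract_final_answer("I am not sure."))  # None
```

Taking the last match rather than the first is a deliberate choice here: if the reasoning revises itself midway, the final stated answer is the one that gets scored.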

## Demo

In [None]:
print('Try the exercises below and follow the linked materials.')

## Try it
- Modify the demo
- Add a tiny dataset or counter-example


## References
- [The Mythos of Model Interpretability](https://arxiv.org/pdf/1606.03490.pdf)
- [Deep Unordered Composition Rivals Syntactic Methods for Text Classification](https://www.aclweb.org/anthology/P15-1162/)
- [Analysis Methods in Neural Language Processing: A Survey](https://arxiv.org/pdf/1812.08951.pdf)
- ["Why Should I Trust You?" Explaining the Predictions of Any Classifier](https://arxiv.org/pdf/1602.04938.pdf)
- [Axiomatic Attribution for Deep Networks](https://arxiv.org/pdf/1703.01365.pdf)
- [BERT Rediscovers the Classical NLP Pipeline](https://arxiv.org/pdf/1905.05950.pdf)
- [What Do You Learn From Context? Probing For Sentence Structure In Contextualized Word Representations](https://arxiv.org/pdf/1905.06316.pdf)
- [Annotation Artifacts in Natural Language Inference Data](https://www.aclweb.org/anthology/N18-2017/)
- [Hypothesis Only Baselines in Natural Language Inference](https://www.aclweb.org/anthology/S18-2023/)
- [Did the Model Understand the Question?](https://www.aclweb.org/anthology/P18-1176/)
- [Swag: A Large-Scale Adversarial Dataset for Grounded Commonsense Inference](https://www.aclweb.org/anthology/D18-1009.pdf)
- [Generating Visual Explanations](https://arxiv.org/pdf/1603.08507.pdf)
- [e-SNLI: Natural Language Inference with Natural Language Explanations](https://arxiv.org/abs/1812.01193)
- [Explaining Question Answering Models through Text Generation](https://arxiv.org/pdf/2004.05569.pdf)
- [Program Induction by Rationale Generation: Learning to Solve and Explain Algebraic Word Problems](https://arxiv.org/abs/1705.04146)
- [Chain-of-Thought Prompting Elicits Reasoning in Large Language Models](https://arxiv.org/abs/2201.11903)
- [The Unreliability of Explanations in Few-shot Prompting for Textual Reasoning](https://arxiv.org/abs/2205.03401)
- [Large Language Models are Zero-Shot Reasoners](https://arxiv.org/abs/2205.11916)
- [Complementary Explanations for Effective In-Context Learning](https://arxiv.org/pdf/2211.13892.pdf)
- [PAL: Program-aided Language Models](https://arxiv.org/abs/2211.10435)
- [Measuring and Narrowing the Compositionality Gap in Language Models](https://arxiv.org/abs/2210.03350)


*Links only; we do not redistribute slides or papers.*