## GitHub CI-compass Organization
- [GitHub Link: https://github.com/ci-compass/AI-working-scientist/tree/main](https://github.com/ci-compass/AI-working-scientist/tree/main)

# A guide to Generative AI for the Working Scientist

> This video is meant to be a rough guide to some of the concepts behind generative AI and to serve as preparation for the NSF CyberInfrastructure Center of Excellence [CI-Compass](https://ci-compass.org/) [Virtual Workshop - AI Meets CI: Intelligent Infrastructure for Major & Midscale Facilities](https://ci-compass.org/news-and-events/events/virtual-workshop-ai-meets-ci-intelligent-infrastructure-for-major-and-midscale-facilities/). The purpose is to start from the beginning and demystify **chatbot**-based generative AI.

![](https://ci-compass.org/assets/629872/300x/ai_meets_ci_recreation_1.png)

## "Prompt Engineering" vs "Context Engineering"

> "Context refers to the set of tokens included when sampling from a large-language model (LLM). The engineering problem at hand is optimizing the utility of those tokens against the inherent constraints of LLMs in order to consistently achieve a desired outcome. Effectively wrangling LLMs often requires thinking in context — in other words: considering the holistic state available to the LLM at any given time and what potential behaviors that state might yield."

![](https://www.anthropic.com/_next/image?url=https%3A%2F%2Fwww-cdn.anthropic.com%2Fimages%2F4zrzovbb%2Fwebsite%2Ffaa261102e46c7f090a2402a49000ffae18c5dd6-2292x1290.png&w=3840&q=75)

---
- [Anthropic Engineering, "Effective context engineering for AI agents", Sep 29, 2025, https://www.anthropic.com/engineering/effective-context-engineering-for-ai-agents](https://www.anthropic.com/engineering/effective-context-engineering-for-ai-agents)

## "Context Window -- Trained Model"
Think of an LLM as a kind of supercharged text predictor: you give it a sequence of tokens (words or subwords) and it predicts the next token, then the next, etc. What we call the context window is simply how many tokens the model can look back at when making each prediction.

- If a model has a context window of, say, 4,096 tokens, then when it’s about to predict token N, it only “sees” tokens N-4,095 through N-1 (plus whatever it absorbed into its weights during training). It cannot directly “see” tokens older than that.

- Everything the model uses to ground its prediction must be inside that window — the user prompt, the system instructions, examples, retrieved documents, conversation history, etc.
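
The window arithmetic above can be sketched in a few lines. This is a toy illustration only: `visible_tokens` is a hypothetical helper, whitespace-separated words stand in for real subword tokens, and the sketch assumes the oldest tokens are the ones dropped.

```python
def visible_tokens(tokens, n, window=4096):
    """Return the tokens a model could look back at when predicting tokens[n].

    Assumption (matching the bullet above): with a window of `window` tokens,
    the model sees at most the `window - 1` tokens that precede position n.
    """
    start = max(0, n - (window - 1))
    return tokens[start:n]


# Words stand in for tokens; real tokenizers split text into subwords.
words = "the quick brown fox jumps over the lazy dog today".split()

# With a tiny window of 4, predicting words[8] ("dog") only sees 3 words.
print(visible_tokens(words, 8, window=4))  # ['over', 'the', 'lazy']
```

Anything earlier than `start` is invisible to the prediction, which is why instructions, examples, and retrieved documents all have to be placed inside the window.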

### Why does the Context Window Matter?
1. **Scope of what the model knows in this invocation -- Stateless Model**
> The model is stateless: it does not remember everything ever said, only what fits into its window. If you want it to reference a piece of text, you must include it (or a compressed version of it) in the window.

2. **Management of context = performance trade-offs**
> The more tokens you feed (longer history, more retrieved docs, more examples), the richer the information the model has — but you are limited by the window size. If you exceed it, older tokens get truncated (lost). If you fill it with irrelevant stuff, you can confuse the model (context noise) rather than help it. Karpathy calls this “the delicate art and science of filling the context window with just the right information for the next step.”

3. **Analogy: human coworker with short-term memory**
> Karpathy uses an analogy: the LLM is like a coworker who has anterograde amnesia — they forget everything beyond a short timeframe. So if you want them to reference something older, you must remind them (i.e., re-include it in the window).
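
One way to picture "reminding the coworker" is a function that rebuilds the context on every call under a token budget. The sketch below is an assumption-laden illustration, not how any particular chatbot does it: `count_tokens` and `build_context` are hypothetical names, and word counts stand in for real token counts.

```python
def count_tokens(text):
    # Crude stand-in: one token per whitespace-separated word.
    return len(text.split())


def build_context(system, history, user_msg, budget):
    """Assemble the messages sent on one stateless call.

    The system prompt and the new user message are always kept; older turns
    are added newest-first until the budget is used up, so the oldest turns
    are the ones that get dropped.
    """
    fixed = count_tokens(system) + count_tokens(user_msg)
    if fixed > budget:
        raise ValueError("system prompt and user message exceed the budget")
    remaining = budget - fixed
    kept = []
    for turn in reversed(history):
        cost = count_tokens(turn)
        if cost > remaining:
            break
        kept.append(turn)
        remaining -= cost
    return [system, *reversed(kept), user_msg]


history = ["a b c", "d e", "f g h i"]
print(build_context("sys", history, "q now", budget=9))
# ['sys', 'd e', 'f g h i', 'q now']
```

Dropping the oldest turns is only one policy. Summarizing old turns or keeping only the turns relevant to the new question are alternatives that trade more work for less lost information.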

### Multimodal Context Window
> Now imagine we extend that idea: instead of feeding the model just text tokens, we also feed in image tokens, audio tokens, video frame tokens, sensor tokens, etc.

**Each modality has its own tokenizer:**

- Text → word/subword tokens

- Images → small patch tokens (like 16×16 pixels each)

- Audio → waveform chunks or spectrogram tokens

All of those get projected into the same vector space and concatenated into one long sequence.
That sequence is the multimodal context window.
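
To get a feel for how quickly non-text inputs consume the window, the sketch below counts patch tokens for an image and lays the modalities out as one sequence. It is a simplification under stated assumptions: the helper names are hypothetical, patches are non-overlapping 16×16 squares, and labeled placeholders stand in for the projected vectors a real model would use.

```python
def patch_token_count(height, width, patch=16):
    """Number of non-overlapping patch tokens for an image of the given size."""
    if height % patch or width % patch:
        raise ValueError("this sketch assumes dimensions divisible by the patch size")
    return (height // patch) * (width // patch)


def multimodal_sequence(text_tokens, image_size, audio_chunks):
    """Concatenate placeholder tokens from each modality into one sequence."""
    height, width = image_size
    seq = [("text", t) for t in text_tokens]
    seq += [("image", i) for i in range(patch_token_count(height, width))]
    seq += [("audio", i) for i in range(audio_chunks)]
    return seq


print(patch_token_count(224, 224))  # 196 patch tokens for one 224x224 image
seq = multimodal_sequence(["describe", "this"], (224, 224), audio_chunks=50)
print(len(seq))  # 248 = 2 text + 196 image + 50 audio
```

Under these assumptions a single modest image costs about as many tokens as a few paragraphs of text, so images, audio, and text all compete for the same window.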

## Recall and large context windows
- [N. F. Liu et al., “Lost in the middle: How language models use long contexts,” Trans. Assoc. Comput. Linguist., vol. 12, pp. 157–173, Feb. 2024. https://ar5iv.labs.arxiv.org/html/2307.03172](https://ar5iv.labs.arxiv.org/html/2307.03172)

<hr>

![](https://ar5iv.labs.arxiv.org/html/2307.03172/assets/x1.png)
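
The paper reports that models tend to use information best when it sits at the beginning or end of the context and worst when it sits in the middle. One heuristic that follows from this, shown here as an illustration rather than a method taken from the paper, is to place the most relevant documents at the edges. The function name is hypothetical, and the input is assumed to be already sorted from most to least relevant.

```python
def reorder_for_long_context(docs_by_relevance):
    """Put the most relevant documents at the start and end of the list.

    Documents are dealt alternately to the front and the back, so the least
    relevant ones end up in the middle of the returned order.
    """
    front, back = [], []
    for i, doc in enumerate(docs_by_relevance):
        (front if i % 2 == 0 else back).append(doc)
    return front + back[::-1]


ranked = ["d1", "d2", "d3", "d4", "d5"]  # d1 is the most relevant
print(reorder_for_long_context(ranked))  # ['d1', 'd3', 'd5', 'd4', 'd2']
```

Whether this helps for a given model and task is an empirical question. Retrieving fewer, better documents is often the simpler fix.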

In [None]:
from dialoghelper import *
fc_tool_info()

In [None]:
from fastcore.tools import *

Tools available from `fastcore.tools`:

- &`rg`: Run the `rg` command with the args in `argstr` (no need to backslash escape)
- &`sed`: Run the `sed` command with the args in `argstr` (e.g. for reading a section of a file)
- &`view`: View directory or file contents with optional line range and numbers
- &`create`: Creates a new file with the given content at the specified path
- &`insert`: Insert new_str at specified line number
- &`str_replace`: Replace first occurrence of old_str with new_str in file
- &`strs_replace`: Replace for each str pair in old_strs,new_strs
- &`replace_lines`: Replace lines in file using start and end line-numbers

In [None]:
tool_info()

Tools available from `dialoghelper`:

- &`curr_dialog`: Get the current dialog info.
- &`msg_idx`: Get absolute index of message in dialog.
- &`add_html`: Send HTML to the browser to be swapped into the DOM using hx-swap-oob.
- &`find_msg_id`: Get the current message id.
- &`find_msgs`: Find messages in current specific dialog that contain the given information.
  - (solveit can often get this id directly from its context, and will not need to use this if the required information is already available to it.)
- &`read_msg`: Get the message indexed in the current dialog.
  - To get the exact message use `n=0` and `relative=True` together with `msgid`.
  - To get a relative message use `n` (relative position index).
  - To get the nth message use `n` with `relative=False`, e.g. `n=0` first message, `n=-1` last message.
- &`del_msg`: Delete a message from the dialog.
- &`add_msg`: Add/update a message to the queue to show after code execution completes.
- &`update_msg`: Update an existing message.
- &`url2note`: Read URL as markdown, and add a note below current message with the result
- &`msg_insert_line`: Insert text at a specific location in a message.
- &`msg_str_replace`: Find and replace text in a message.
- &`msg_strs_replace`: Find and replace multiple strings in a message.
- &`msg_replace_lines`: Replace a range of lines in a message with new content.
  - Always first use `read_msg( msgid=msgid, n=0, relative=True, nums=True)` to view the content with line numbers.

### "Thinking Models and Chain of Thought" (DeepSeek-R1 from Karpathy Video)
![](https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgVgTjwA0IzKekrQoMziCmDXjO10QKjdDdzK1Oj8bZToPOI6VjVzTKXZ6vnWvAGOdVnWznJK2ZZjfBuTLojobayI_yrvlFzE3dCErF2j5wKLGFWAkuGP9-r-hMrqFivnjYhbCIu7HFINSmHu4wUjlKHfJxWHZ8Y7CYUowWvxTeRJhQEAUswGh2fUd3VHA/s2500/chainofthought.png)

- [Language Models Perform Reasoning via Chain of Thought (May 2022) https://research.google/blog/language-models-perform-reasoning-via-chain-of-thought/](https://research.google/blog/language-models-perform-reasoning-via-chain-of-thought/)
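
Chain-of-thought prompting, as described in the linked post, amounts to showing the model worked examples whose answers spell out the intermediate steps. A minimal sketch of building such a few-shot prompt follows. `cot_prompt` is a hypothetical helper, the `Q:`/`A:` layout is an assumed format, and the arithmetic problems are the tennis-ball and cafeteria examples used in the chain-of-thought work.

```python
def cot_prompt(examples, question):
    """Build a few-shot prompt whose worked answers include the reasoning steps."""
    blocks = [
        f"Q: {q}\nA: {steps} The answer is {answer}."
        for q, steps, answer in examples
    ]
    # The final question is left open so the model continues with its own steps.
    blocks.append(f"Q: {question}\nA:")
    return "\n\n".join(blocks)


examples = [(
    "Roger has 5 tennis balls. He buys 2 more cans of tennis balls. "
    "Each can has 3 tennis balls. How many tennis balls does he have now?",
    "Roger started with 5 balls. 2 cans of 3 tennis balls each is 6 tennis balls. "
    "5 + 6 = 11.",
    "11",
)]
prompt = cot_prompt(
    examples,
    "The cafeteria had 23 apples. If they used 20 to make lunch and "
    "bought 6 more, how many apples do they have?",
)
print(prompt)
```

The prompt string is all there is to the technique. The reasoning the model writes out occupies tokens in the context window just like everything else.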

## ReAct
![](https://react-lm.github.io/files/diagram.png)

- [ReAct: Synergizing Reasoning and Acting in Language Models Blog: https://react-lm.github.io/](https://react-lm.github.io/)
- [S. Yao et al., “ReAct: Synergizing reasoning and acting in language models,” Int Conf Learn Represent, vol. abs/2210.03629, Oct. 2022. https://openreview.net/forum?id=WE_vluYUL-X](https://openreview.net/forum?id=WE_vluYUL-X)
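
The ReAct pattern interleaves Thought, Action, and Observation steps in one growing transcript. The loop below is a minimal sketch under loud assumptions: `scripted_model` is a canned stand-in for a real LLM call, `calculator` is a toy tool, and the `Action: name[argument]` syntax with a `finish[...]` action is an assumed text format modeled on the paper's examples.

```python
import operator
import re

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul, "/": operator.truediv}
ACTION_RE = re.compile(r"Action: (\w+)\[(.*)\]")


def calculator(expression):
    """Toy tool: evaluate 'a op b' for two numbers separated by spaces."""
    a, op, b = expression.split()
    return str(OPS[op](float(a), float(b)))


def scripted_model(transcript):
    """Stand-in for an LLM: returns canned text based on the transcript so far."""
    if "Observation:" not in transcript:
        return "Thought: I need to multiply.\nAction: calculate[12 * 7]"
    last = transcript.rsplit("Observation: ", 1)[1]
    return f"Thought: I have the result.\nAction: finish[{last}]"


def react(question, model, tools, max_steps=5):
    """Alternate model steps and tool calls until the model emits finish[...]."""
    transcript = f"Question: {question}"
    for _ in range(max_steps):
        step = model(transcript)
        transcript += "\n" + step
        match = ACTION_RE.search(step)
        if match is None:
            raise ValueError("model output had no Action line")
        name, arg = match.groups()
        if name == "finish":
            return arg, transcript
        # Run the tool and feed its result back in as an Observation.
        transcript += f"\nObservation: {tools[name](arg)}"
    raise RuntimeError("no answer within max_steps")


answer, transcript = react("What is 12 times 7?", scripted_model, {"calculate": calculator})
print(transcript)
```

Every Thought, Action, and Observation is appended to the same transcript, so a long tool-using session consumes the context window quickly.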