# Semantic Categories
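
The first cell below logs the model's reply as a JSON array wrapped in a `json` code fence, where each element has `description`, `text`, and `label` fields. The following is a minimal sketch of how such a reply could be turned into typed objects for further processing. The names `LabelledSegment`, `stripJsonFence`, and `parseSegments` are hypothetical and are not part of the project code imported in this notebook. The field names are inferred only from the logged output.

```typescript
// Hypothetical shape of one labelled segment, inferred from the logged output.
interface LabelledSegment {
  description: string;
  text: string;
  label: string;
}

// Built at runtime so that no literal fence marker appears in this snippet.
const FENCE = "`".repeat(3);
const FENCED_BLOCK = new RegExp(FENCE + "(?:json)?\\s*\\n([\\s\\S]*?)" + FENCE);

// Return the body of the first fenced block, or the trimmed reply if none is found.
function stripJsonFence(reply: string): string {
  const match = reply.match(FENCED_BLOCK);
  return (match ? match[1] : reply).trim();
}

// Type guard: an object whose three expected fields are all strings.
function isLabelledSegment(value: unknown): value is LabelledSegment {
  if (typeof value !== "object" || value === null) return false;
  const record = value as Record<string, unknown>;
  return ["description", "text", "label"].every(
    (key) => typeof record[key] === "string",
  );
}

// Parse a reply into segments; throws if the JSON is invalid or has the wrong shape.
function parseSegments(reply: string): LabelledSegment[] {
  const parsed: unknown = JSON.parse(stripJsonFence(reply));
  if (!Array.isArray(parsed)) {
    throw new Error("Expected a JSON array of segments");
  }
  const bad = parsed.findIndex((item) => !isLabelledSegment(item));
  if (bad !== -1) {
    throw new Error(`Segment ${bad} is missing description, text, or label`);
  }
  return parsed as LabelledSegment[];
}
```

With the notebook's own helpers, a call could look like `parseSegments(responseContent(output))`, assuming `responseContent` returns the reply as a string. The sketch deliberately does not check `label` against a fixed category list, and `JSON.parse` will throw on a reply that was cut off before the array closes.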

In [None]:
import { START, END, StateGraph, MemorySaver } from "@langchain/langgraph";
import { SystemMessage, HumanMessage } from "@langchain/core/messages";
import { JsonOutputParser } from "@langchain/core/output_parsers";
import { readFileSync } from 'node:fs';


import { EXPERIMENTS_DIR, SERVER_DATA_DIR } from '../server/src/util/fileUtils.ts';
import { getNotebookLogger } from '../server/src/Logger.ts';
import { StateInfo, responseContent } from '../server/src/agents/agent.ts';
import { modelNode } from '../server/src/agents/nodes/modelNode.ts';

// Define a new graph
const workflow = new StateGraph(StateInfo)
  .addNode("model", modelNode)
  .addEdge(START, "model")
  .addEdge("model", END);

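// Add memory: the checkpointer stores state per thread_id, so later
// invocations with the same config can see the earlier messages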
const memory = new MemorySaver();
const graph = workflow.compile({ checkpointer: memory });

const logger = getNotebookLogger();
const userUUID: string = "0";
const config = { configurable: { thread_id: userUUID } };
const lhsText = readFileSync(`${SERVER_DATA_DIR}/SHA-1/selected-text.txt`, 'utf-8');
const PROMPT = readFileSync(`${EXPERIMENTS_DIR}/annotateNodePromptCategories3.txt`, 'utf-8');

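// First turn: send the categories prompt as the system message and the
// selected SHA-1 text as the user message
var userInput = lhsText;
var output = await graph.invoke({
  messages: [
    new SystemMessage(PROMPT),
    new HumanMessage(userInput),
  ],
  logger: logger,
}, config);
logger.info(responseContent(output));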

Sending messages to LLM.
Received response from LLM.
LLM usage: 2809 input tokens, 2700 output tokens
```json
[
  {
    "description": "Section header for preprocessing steps in SHA-1",
    "text": "\\section*{5. PREPROCESSING}",
    "label": "Boilerplate"
  },
  {
    "description": "Overview of the three preprocessing steps for SHA-1",
    "text": "Preprocessing consists of three steps: padding the message, \\(M\\) (Sec. 5.1), parsing the message into",
    "label": "Definitions"
  },
  {
    "description": "Subsection header for message padding",
    "text": "\\subsection*{5.1 Padding the Message}",
    "label": "Boilerplate"
  },
  {
    "description": "Explanation of the purpose of padding in SHA-1",
    "text": "The purpose of this padding is to ensure that the padded message is a multiple of 512 or 1024 bits, de",
    "label": "Intent"
  },
  {
    "description": "Subsection header for SHA-1, SHA-224 and SHA-256 padding",
    "text": "\\subsection*{5.1.1 SHA-1, SHA-224 and SHA-2

In [3]:
userInput = "Can you rewrite the prompt that I sent earlier so that it works on any algorithm (besides SHA-1) from any FIPS specification?";
output = await graph.invoke({ messages: [ new HumanMessage(userInput) ], logger: logger}, config);
logger.info(responseContent(output));

Sending messages to LLM.
Received response from LLM.
LLM usage: 5513 input tokens, 372 output tokens
# Semantic Decomposition of Cryptographic Standards

You are an expert in semantic decomposition of cryptographic standards written in natural language. You will be provided with a section from a FIPS publication that specifies a cryptographic algorithm. This text is written in LaTeX format, and is already decomposed into multi-line segments which are separated by new-lines.

Your task is to label each segment with one of the following categories:

- Boilerplate: LaTeX preamble code that does not contain semantic content, e.g. section headers.
- Intent: Motivation for how the algorithm is designed or explanations of its purpose
- Definitions: Formal definitions of algorithm components, transformations, or parameters
- Elaborations: Additional descriptions or clarifications of existing definitions
- Assumptions: What is assumed about the inputs or use cases of the algorithm
- Requirement