<a href="https://colab.research.google.com/github/SamarthJ03/AI-Assisted-Parameter-Extraction-For-RISC-V-SPEC/blob/main/notebooks/zicntr_zihpm_validation.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [None]:
%pip install -q langchain langchain-community transformers accelerate pydantic pyyaml langchain-huggingface langchain-core langchain-text-splitters



In [None]:
import yaml
from pathlib import Path
from typing import List

from pydantic import BaseModel, Field
from langchain_core.output_parsers import PydanticOutputParser
from langchain_huggingface import ChatHuggingFace, HuggingFaceEndpoint
from langchain_core.prompts import PromptTemplate


In [None]:
from langchain_text_splitters import RecursiveCharacterTextSplitter



In [None]:
class Parameter(BaseModel):
    name: str = Field(description="Concise parameter name given in the specification or derived from the description")
    description: str = Field(description="Description derived strictly from the specification explaining what behavior or capability varies due to this implementation-defined parameter")
    type: str = Field(description="Type of values the parameter takes: integer | boolean | enum | bitfield | range | structural")
    constraints: str = Field(description="Explicit constraints on the values that the parameter can take or 'unspecified'")

class ParameterList(BaseModel):
    parameters: List[Parameter]


In [None]:
parser = PydanticOutputParser(pydantic_object=ParameterList)
format_instructions = parser.get_format_instructions()
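
For orientation, the payload that `parser.parse` has to receive is a JSON object with a single `parameters` key whose items carry the four `Parameter` fields. The sketch below spells out that shape using only the standard library. The parameter values are an illustrative reading of the Zihpm text ("up to 29 additional ... counters"), not actual model output, and `has_expected_shape` is a hypothetical helper, not part of LangChain or Pydantic.

```python
import json

REQUIRED_FIELDS = {"name", "description", "type", "constraints"}

# Illustrative payload only; it was not produced by either model.
example_output = """
{
  "parameters": [
    {
      "name": "hpm_counter_count",
      "description": "Number of implemented hpmcounter CSRs.",
      "type": "integer",
      "constraints": "up to 29"
    }
  ]
}
"""


def has_expected_shape(raw_json):
    """Return True if raw_json looks like a ParameterList payload."""
    try:
        payload = json.loads(raw_json)
    except json.JSONDecodeError:
        return False
    if not isinstance(payload, dict):
        return False
    items = payload.get("parameters")
    if not isinstance(items, list):
        return False
    # Every item must be an object holding the four string fields.
    return all(
        isinstance(item, dict)
        and REQUIRED_FIELDS.issubset(item)
        and all(isinstance(item[field], str) for field in REQUIRED_FIELDS)
        for item in items
    )


print(has_expected_shape(example_output))
print(has_expected_shape('{"parameters": "none"}'))
```

This check is only a rough stand-in for the Pydantic validation done by the parser; it is meant to make the target structure visible, not to replace `parser.parse`.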


In [None]:
# Both endpoints use identical decoding settings so their outputs are comparable.
models = {
    "Qwen/Qwen2.5-14B-Instruct": ChatHuggingFace(
        llm=HuggingFaceEndpoint(
            repo_id="Qwen/Qwen2.5-14B-Instruct",
            task="text-generation",
            max_new_tokens=1024,
            temperature=0.0,
            seed=42,
        )
    ),
    "meta-llama/Llama-3.1-8B-Instruct": ChatHuggingFace(
        llm=HuggingFaceEndpoint(
            repo_id="meta-llama/Llama-3.1-8B-Instruct",
            task="text-generation",
            max_new_tokens=1024,
            temperature=0.0,
            seed=42,
        )
    ),
}


In [None]:
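def append_yaml_entries(entries, file_path="results_ex.yaml"):
    """Append entries to the YAML list stored at file_path, creating the file if needed."""
    path = Path(file_path)
    if path.exists():
        # safe_load returns None for an empty file, so fall back to an empty list.
        data = yaml.safe_load(path.read_text()) or []
    else:
        data = []

    data.extend(entries)
    path.write_text(yaml.safe_dump(data, sort_keys=False))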





In [None]:
def parse_output(text):
    try:
        return parser.parse(text).model_dump()
    except Exception as e:
        return {
            "parse_error": True,
            "error": str(e),
            "raw_output": text
        }


In [None]:
def run_models_on_snippets(prompting_technique, prompt, snippets, models, format_instructions):
    """Run every model on every snippet and append each result entry to the YAML log."""
    all_results = []

    for snippet in snippets:
        current_full_prompt = prompt.format(
            spec_snippet=snippet.strip(),
            format_instructions=format_instructions,
        )

        entry = {
            "prompting_technique": prompting_technique.strip(),
            "prompt": current_full_prompt,
            "input": {"text": snippet.strip()},
            "models": [],
        }

        for model_name, llm in models.items():
            try:
                raw = llm.invoke(current_full_prompt)
                # Chat models return a message object; fall back to str() otherwise.
                content = getattr(raw, "content", str(raw))
                entry["models"].append({
                    "model_name": model_name,
                    "output": parse_output(content),
                })
            except Exception as e:
                print(f"Error with {model_name}: {e}")
                entry["models"].append({
                    "model_name": model_name,
                    "output": {"error": str(e)},
                })

        all_results.append(entry)
        # Persist after every snippet so a later failure does not lose earlier results.
        append_yaml_entries([entry])

    return all_results
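
Since `parse_output` and the `except` branch both store failures as dictionaries rather than raising, it can help to tally them per model after a run. A minimal sketch, assuming the result structure built by `run_models_on_snippets`; `count_parse_errors` and the `sample_results` data are hypothetical and only mimic that structure.

```python
from collections import Counter


def count_parse_errors(results):
    """Count, per model, outputs flagged with parse_error or carrying an error key."""
    failures = Counter()
    for entry in results:
        for model_entry in entry["models"]:
            output = model_entry["output"]
            if output.get("parse_error") or "error" in output:
                failures[model_entry["model_name"]] += 1
    return dict(failures)


# Hypothetical results in the shape produced by run_models_on_snippets.
sample_results = [
    {"models": [
        {"model_name": "model-a", "output": {"parameters": []}},
        {"model_name": "model-b",
         "output": {"parse_error": True, "error": "bad JSON", "raw_output": "..."}},
    ]},
    {"models": [
        {"model_name": "model-a", "output": {"error": "timeout"}},
        {"model_name": "model-b", "output": {"parameters": []}},
    ]},
]

print(count_parse_errors(sample_results))
```

Calling `count_parse_errors` on the list returned by `run_models_on_snippets` would show how often each model failed to produce output that validates against `ParameterList`.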

In [None]:
Zicntr_and_Zihpm_full_text = """"Zicntr" and "Zihpm" Extensions for Counters, Version 2.0
RISC-V ISAs provide a set of up to thirty-two 64-bit performance counters and timers that are accessible via unprivileged XLEN-bit read-only CSR registers 0xC00–0xC1F (when XLEN=32, the upper 32 bits are accessed via CSR registers 0xC80–0xC9F). These counters are divided between the "Zicntr" and "Zihpm" extensions.

"Zicntr" Extension for Base Counters and Timers
The Zicntr standard extension comprises the first three of these counters (CYCLE, TIME, and INSTRET), which have dedicated functions (cycle count, real-time clock, and instructions retired, respectively). The Zicntr extension depends on the Zicsr extension.

We recommend provision of these basic counters in implementations as they are essential for basic performance analysis, adaptive and dynamic optimization, and to allow an application to work with real-time streams. Additional counters in the separate Zihpm extension can help diagnose performance problems and these should be made accessible from user-level application code with low overhead.

Some execution environments might prohibit access to counters, for example, to impede timing side-channel attacks.

For base ISAs with XLEN≥64, CSR instructions can access the full 64-bit CSRs directly. In particular, the RDCYCLE, RDTIME, and RDINSTRET pseudoinstructions read the full 64 bits of the cycle, time, and instret counters.

The counter pseudoinstructions are mapped to the read-only csrrs rd, counter, x0 canonical form, but the other read-only CSR instruction forms (based on CSRRC/CSRRSI/CSRRCI) are also legal ways to read these CSRs.

For base ISAs with XLEN=32, the Zicntr extension enables the three 64-bit read-only counters to be accessed in 32-bit pieces. The RDCYCLE, RDTIME, and RDINSTRET pseudoinstructions provide the lower 32 bits, and the RDCYCLEH, RDTIMEH, and RDINSTRETH pseudoinstructions provide the upper 32 bits of the respective counters.

We required the counters be 64 bits wide, even when XLEN=32, as otherwise it is very difficult for software to determine if values have overflowed. For a low-end implementation, the upper 32 bits of each counter can be implemented using software counters incremented by a trap handler triggered by overflow of the lower 32 bits. The sample code given below shows how the full 64-bit width value can be safely read using the individual 32-bit width pseudoinstructions.

The RDCYCLE pseudoinstruction reads the low XLEN bits of the cycle CSR which holds a count of the number of clock cycles executed by the processor core on which the hart is running from an arbitrary start time in the past. RDCYCLEH is only present when XLEN=32 and reads bits 63-32 of the same cycle counter. The underlying 64-bit counter should never overflow in practice. The rate at which the cycle counter advances will depend on the implementation and operating environment. The execution environment should provide a means to determine the current rate (cycles/second) at which the cycle counter is incrementing.

RDCYCLE is intended to return the number of cycles executed by the processor core, not the hart. Precisely defining what is a "core" is difficult given some implementation choices (e.g., AMD Bulldozer). Precisely defining what is a "clock cycle" is also difficult given the range of implementations (including software emulations), but the intent is that RDCYCLE is used for performance monitoring along with the other performance counters. In particular, where there is one hart/core, one would expect cycle-count/instructions-retired to measure CPI for a hart.

Cores don’t have to be exposed to software at all, and an implementor might choose to pretend multiple harts on one physical core are running on separate cores with one hart/core, and provide separate cycle counters for each hart. This might make sense in a simple barrel processor (e.g., CDC 6600 peripheral processors) where inter-hart timing interactions are non-existent or minimal.

Where there is more than one hart/core and dynamic multithreading, it is not generally possible to separate out cycles per hart (especially with SMT). It might be possible to define a separate performance counter that tried to capture the number of cycles a particular hart was running, but this definition would have to be very fuzzy to cover all the possible threading implementations. For example, should we only count cycles for which any instruction was issued to execution for this hart, and/or cycles any instruction retired, or include cycles this hart was occupying machine resources but couldn’t execute due to stalls while other harts went into execution? Likely, "all of the above" would be needed to have understandable performance stats. This complexity of defining a per-hart cycle count, and also the need in any case for a total per-core cycle count when tuning multithreaded code led to just standardizing the per-core cycle counter, which also happens to work well for the common single hart/core case.

Standardizing what happens during "sleep" is not practical given that what "sleep" means is not standardized across execution environments, but if the entire core is paused (entirely clock-gated or powered-down in deep sleep), then it is not executing clock cycles, and the cycle count shouldn’t be increasing per the spec. There are many details, e.g., whether clock cycles required to reset a processor after waking up from a power-down event should be counted, and these are considered execution-environment-specific details.

Even though there is no precise definition that works for all platforms, this is still a useful facility for most platforms, and an imprecise, common, "usually correct" standard here is better than no standard. The intent of RDCYCLE was primarily performance monitoring/tuning, and the specification was written with that goal in mind.

The RDTIME pseudoinstruction reads the low XLEN bits of the "time" CSR, which counts wall-clock real time that has passed from an arbitrary start time in the past. RDTIMEH is only present when XLEN=32 and reads bits 63-32 of the same real-time counter. The underlying 64-bit counter increments by one with each tick of the real-time clock, and, for realistic real-time clock frequencies, should never overflow in practice. The execution environment should provide a means of determining the period of a counter tick (seconds/tick). The period should be constant within a small error bound. The environment should provide a means to determine the accuracy of the clock (i.e., the maximum relative error between the nominal and actual real-time clock periods).

On some simple platforms, cycle count might represent a valid implementation of RDTIME, in which case RDTIME and RDCYCLE may return the same result.

It is difficult to provide a strict mandate on clock period given the wide variety of possible implementation platforms. The maximum error bound should be set based on the requirements of the platform.

The real-time clocks of all harts must be synchronized to within one tick of the real-time clock.

As with other architectural mandates, it suffices to appear "as if" harts are synchronized to within one tick of the real-time clock, i.e., software is unable to observe that there is a greater delta between the real-time clock values observed on two harts.

The RDINSTRET pseudoinstruction reads the low XLEN bits of the instret CSR, which counts the number of instructions retired by this hart from some arbitrary start point in the past. RDINSTRETH is only present when XLEN=32 and reads bits 63-32 of the same instruction counter. The underlying 64-bit counter should never overflow in practice.

Instructions that cause synchronous exceptions, including ECALL and EBREAK, are not considered to retire and hence do not increment the instret CSR.

The following code sequence will read a valid 64-bit cycle counter value into x3:x2, even if the counter overflows its lower half between reading its upper and lower halves.

Sample code for reading the 64-bit cycle counter when XLEN=32.
    again:
        rdcycleh     x3
        rdcycle      x2
        rdcycleh     x4
        bne          x3, x4, again

"Zihpm" Extension for Hardware Performance Counters
The Zihpm extension comprises up to 29 additional unprivileged 64-bit hardware performance counters, hpmcounter3-hpmcounter31. When XLEN=32, the upper 32 bits of these performance counters are accessible via additional CSRs hpmcounter3h-hpmcounter31h. The Zihpm extension depends on the Zicsr extension.

In some applications, it is important to be able to read multiple counters at the same instant in time. When run under a multitasking environment, a user thread can suffer a context switch while attempting to read the counters. One solution is for the user thread to read the real-time counter before and after reading the other counters to determine if a context switch occurred in the middle of the sequence, in which case the reads can be retried. We considered adding output latches to allow a user thread to snapshot the counter values atomically, but this would increase the size of the user context, especially for implementations with a richer set of counters.

The implemented number and width of these additional counters, and the set of events they count, is platform-specific. Accessing an unimplemented or ill-configured counter may cause an illegal-instruction exception or may return a constant value.

The execution environment should provide a means to determine the number and width of the implemented counters, and an interface to configure the events to be counted by each counter.

For execution environments implemented on RISC-V privileged platforms, the privileged architecture manual describes privileged CSRs controlling access by lower privileged modes to these counters, and to set the events to be counted.

Alternative execution environments (e.g., user-level-only software performance models) may provide alternative mechanisms to configure the events counted by the performance counters.

It would be useful to eventually standardize event settings to count ISA-level metrics, such as the number of floating-point instructions executed for example, and possibly a few common microarchitectural metrics, such as "L1 instruction cache misses".
"""
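
The retry loop quoted in the spec text above (`rdcycleh`, `rdcycle`, `rdcycleh`, `bne`) can be modelled in Python to see why the second high-half read matters. This is a toy simulation, not RISC-V code: `SplitCounter` and `read_counter64` are hypothetical names, and the counter is assumed to advance by a fixed `step` after every half-read.

```python
MASK64 = 0xFFFFFFFFFFFFFFFF


class SplitCounter:
    """Toy 64-bit counter that can only be read as two 32-bit halves."""

    def __init__(self, start, step):
        self.value = start
        self.step = step  # assumed ticks added after every half-read

    def _tick(self):
        self.value = (self.value + self.step) & MASK64

    def read_low(self):
        low = self.value & 0xFFFFFFFF
        self._tick()
        return low

    def read_high(self):
        high = self.value >> 32
        self._tick()
        return high


def read_counter64(counter):
    """Mirror the high / low / high sequence and retry if the high half changed."""
    while True:
        high = counter.read_high()
        low = counter.read_low()
        if counter.read_high() == high:
            return (high << 32) | low


# Start right before the low half wraps, so the first attempt must be retried.
counter = SplitCounter(start=0x1_FFFF_FFFF, step=1)
print(hex(read_counter64(counter)))
```

In this run the first pass sees the high half change from 1 to 2 between the two high reads, so the loop retries instead of combining a stale high half with a wrapped low half.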

In [None]:
splitter = RecursiveCharacterTextSplitter(
    chunk_size=1200,
    chunk_overlap=200,
    separators=["\n\n", "\n", "###", ". "]
)


snippets = splitter.split_text(Zicntr_and_Zihpm_full_text)
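
To build intuition for how `chunk_size` and `chunk_overlap` interact, here is a deliberately simplified fixed-window chunker. It is not the algorithm used by `RecursiveCharacterTextSplitter`, which prefers to break at the listed separators; `sliding_window_chunks` is a hypothetical helper that only illustrates the overlap arithmetic.

```python
def sliding_window_chunks(text, chunk_size, chunk_overlap):
    """Cut text into fixed-size windows that share chunk_overlap characters."""
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be non-negative and smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reaches the end of the text
        start += step
    return chunks


print(sliding_window_chunks("abcdefghij", chunk_size=4, chunk_overlap=2))
```

In this simplified model, `chunk_size=1200` with `chunk_overlap=200` would advance the window start by 1000 characters per chunk, so consecutive chunks repeat the last 200 characters of their predecessor.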

In [None]:
snippets1 = snippets[0:6]
snippets2 = snippets[6:]

In [None]:
template14 = f"""
RISC-V is an Open Standard Instruction Set Architecture (ISA).
To put it simply, an ISA is the language that a computer's hardware speaks. It defines the set of instructions (like add, subtract, load, or store) that a processor can execute.
Architectural Parameters are the variables of a hardware design. They are the specific values or behaviors that the hardware designer must decide on and the software must account for.
In its most basic form, a parameter is a choice between, say, A or B as the value of a field, which changes the behaviour of the processor.
If something is a standard or convention that must be followed, leaving no choice, it is not a parameter.
You are a strict, pedantic expert in RISC-V.

Task:
You will be given excerpts from the RISC-V Instruction Set Manual (RISC-V ISA). You MUST extract architectural parameters from them.
You are required to AGGRESSIVELY focus on the following trigger words, as they usually imply a parameter:
may/might/should,
optional/optionally,
implementation defined/implementation specific,
can/either
However, parameters may exist even when these words are absent.
If an excerpt contains no parameters, return an empty list.

Instructions on how to format the output:
{{format_instructions}}

Strict Rules:
- NEVER output the schema definition.
- If a constraint refers to an external source (e.g. "see section 4.2"), NEVER WRITE IT; instead write "unspecified".
- Output ONLY the requested data.
- Do not add conversational text or explanations.
- The output MUST be a JSON object with a single key named "parameters".

Reasoning Process (Follow this strictly Step-by-Step):
1. FILTER: Is this a fixed rule, standard, or address mapping set by the ISA? If it MUST be followed for compliance, DISCARD it immediately.
2. CHECK WARL: Is a field labeled WARL (Write Any values, Reads Legal values)? If yes, this is ALWAYS a parameter. Extract it.
3. IDENTIFY CHOICE: Does the text allow for different 'BEHAVIOUR' or 'VALUES' across different chips? If the hardware designer has a CHOICE (e.g., between A or B), you MUST extract it.
4. VALIDATE: Once a choice is extracted, check if the spec specifies a type or constraint on its values.

Examples:

- Text: 'The funct3 field of the ADDI instruction is always bits [14:12] and must be set to 000. Any other value is reserved.'
  Reasoning: These bits are fixed by the ISA to define the opcode. A hardware designer cannot change which bits represent funct3 or what value they hold for ADDI. There is NO choice.
  Output: 'parameters: []'

- Text: 'The MXLEN parameter can be either 32 or 64 bits. In some implementations, the upper bits [63:32] of a register may be ignored when operating in RV32 mode.'
  Reasoning: The words 'either' and 'may be ignored' indicate a CHOICE. One hardware design might support 64 bits, while another only 32.
  Output:
  - name: mxlen_width
    description: The supported bit-width of the machine architecture.
    type: enum
    constraints: 32, 64
  - name: upper_bit_behavior
    description: Whether bits [63:32] of a register are ignored in RV32 mode.
    type: boolean
    constraints: unspecified

Excerpt from RISC-V (extract parameters from the following text):
{{spec_snippet}}
"""
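
The template is an f-string, yet its placeholders are written with doubled braces. The sketch below shows the two-stage substitution this relies on: the f-string turns each doubled brace into a single one, and the later `prompt.format(...)` call fills the surviving placeholders. `demo_template` and `demo_prompt` are illustrative names, and the filled-in values are made up for the demonstration.

```python
# Stage 1: the f-string collapses each doubled brace into a single brace,
# leaving ordinary str.format placeholders behind.
demo_template = f"""Format:
{{format_instructions}}

Excerpt:
{{spec_snippet}}
"""

# Stage 2: str.format fills the placeholders that survived stage 1.
# Braces inside the substituted values are left untouched.
demo_prompt = demo_template.format(
    format_instructions='Return JSON such as {"parameters": []}',
    spec_snippet="The period should be constant within a small error bound.",
)

print(demo_template)
print(demo_prompt)
```

One consequence of this design is that a literal brace meant to appear in the final prompt would need four braces in the f-string, which is worth keeping in mind before adding JSON examples to the template.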

In [None]:
run_models_on_snippets("test1", template14, snippets1, models, format_instructions)



[{'prompting_technique': 'test1',
  'prompt': '\nRISC-V is an Open Standard Instruction Set Architecture (ISA).\nTo put it simply, an ISA is the language that a computer\'s hardware speaks. It defines the set of instructions (like add, subtract, load, or store) that a processor can execute.\nArchitectural Parameters are the variables of a hardware design. They are the specific values or behaviors that the hardware designer must decide on and the software must account for.\nIn the most basic form, a parameter is a choice between say A or B as value of a field which changes the behaviour of processor.\nIf it is a standard or convention that we have to follow it giving us no choice it is not a parameter.\nYou are a strict, pedantic expert in RISC-V.\n\nTask:\nYou will be given excerpts from the RISC-V Instruction Set Manual (RISC-V ISA). You MUST extract architectural parameters from them.\nYou are required to AGGRESSIVELY focus on the following trigger words, as they usually imply a para

In [None]:
run_models_on_snippets("test2", template14, snippets2, models, format_instructions)



[{'prompting_technique': 'test2',
  'prompt': '\nRISC-V is an Open Standard Instruction Set Architecture (ISA).\nTo put it simply, an ISA is the language that a computer\'s hardware speaks. It defines the set of instructions (like add, subtract, load, or store) that a processor can execute.\nArchitectural Parameters are the variables of a hardware design. They are the specific values or behaviors that the hardware designer must decide on and the software must account for.\nIn the most basic form, a parameter is a choice between say A or B as value of a field which changes the behaviour of processor.\nIf it is a standard or convention that we have to follow it giving us no choice it is not a parameter.\nYou are a strict, pedantic expert in RISC-V.\n\nTask:\nYou will be given excerpts from the RISC-V Instruction Set Manual (RISC-V ISA). You MUST extract architectural parameters from them.\nYou are required to AGGRESSIVELY focus on the following trigger words, as they usually imply a para