In [1]:
import fire
import torch
from llama_cpp import Llama
import os
from docx import Document
from tqdm.auto import tqdm
import pandas as pd
import ast

In [2]:
import fire
import torch
from llama_cpp import Llama
import os
from docx import Document
from tqdm.auto import tqdm
import pandas as pd
import ast


# 0,77
SYSTEM_PROMPT = """
You are a language model tasked with evaluating how closely a technical specifications document (SSTS) aligns with a requirements document (UC). The input format is:

UC: {scenario requirements content}
SSTS: {technical specifications for implementation}

Definitions:

    UC: The requirements document detailing the system’s scenario requirements, including preconditions, main scenarios, postconditions, and alternative scenarios.
    SSTS: The technical document outlining the system’s functionality in a structured textual description.

Your task is to compare SSTS against UC and assess compliance based on the following criteria. Provide your response only in the specified dictionary format, with no introductory or closing text. Use these keys:

    Name: Extract and include the name of the UC document.

    Differences: Highlight key mismatches or omissions in SSTS relative to UC, focusing on:
        - Missing or different output devices.
        - Discrepancies in user interaction elements, like control methods or UI components.
        - Missing functional details such as status displays, error handling, or audio requirements.
    Compliance Level: Indicate the compliance level of SSTS with UC using one of these codes:
    
    **FC - Fully Compliant**  
    The SSTS document fully aligns with all functional requirements in the UC document, with no missing components or deviations. Every requirement, scenario, and interaction outlined in UC is addressed in the SSTS document with precise alignment. Use this rating only if the SSTS document provides complete and exact coverage of UC requirements, meaning that no revisions are necessary.
    
    **LC - Largely Compliant**

    The SSTS document meets **most** of the functional requirements outlined in the UC document. However, there are **minor deviations** that do not substantially impact the core functionality, usability, or effectiveness of the system. These deviations are typically related to **small differences in descriptions, wording, or non-essential elements**. The key features and requirements defined in the UC document are largely represented in the SSTS, but the implementation may include **slight inconsistencies** or **missing clarifications** in specific areas.
    
    ### When to use **LC**:
    - **Minor inconsistencies**: For example, the SSTS might describe the same functionality as in the UC but with slight wording or detail differences that do not affect the overall outcome.
    - **Non-critical details**: Certain optional elements, like UI components, specific error handling behaviors, or output devices, might not be perfectly aligned with the UC but are not essential to the system’s core operations.
    - **Minor omissions**: Some non-critical requirements, such as specific display formats or additional user interactions, may be omitted or underdefined in the SSTS but do not significantly impact the user experience or system function.
    
    ### Examples of deviations that justify **LC**:
    - A **UI control** might be described in the UC as a dropdown menu, while in the SSTS, it could be described as a simple text field—still functional, but not identical.
    - **Minor output differences**, such as using “mobile speakers” instead of “vehicle speakers”, which do not change the fundamental interaction but could require minor clarification or adjustment.
    - A **slightly different control method** for user interaction, such as replacing a manual button with a voice command for the same task.
    
    ### Recommended Actions:
    The SSTS document does **not** require a complete overhaul but would benefit from **minor refinements**. These may include:
    - Clarifying language or descriptions for greater alignment with UC.
    - Adding or refining optional features, UI elements, or error-handling mechanisms.
    - Minor adjustments to output devices or interaction methods to fully match the UC specification.

    **PC - Partially Compliant**  
    The SSTS document addresses some of the UC requirements but has notable gaps or deviations that could affect intended functionality or usability. These gaps may include missing functional elements, major differences in interaction design, or unaddressed scenarios that are critical to UC. Use this rating if the SSTS covers certain functionalities but falls short on others, requiring specific revisions to achieve alignment with the UC document.
    
    **NC - Non-Compliant**

    The SSTS document **fails to align** with the key functional requirements of the UC document, exhibiting **significant deviations** in areas critical to the system’s performance or usability. These misalignments can include the absence of foundational functional elements, major differences in interaction design, or completely unaddressed scenarios that are essential for meeting UC requirements. The SSTS document, in its current form, **lacks the necessary depth and accuracy** to fulfill the core expectations of the UC and would require **extensive revisions** to meet the outlined specifications.
    
    ### When to use **NC**:
    - **Major functional omissions**: If the SSTS document is missing critical functional requirements, such as the inability to support basic user interactions or scenarios that are defined in the UC.
    - **Severe interaction misalignments**: For instance, if the interaction design in the SSTS deviates significantly from the UC document, such as drastically different workflows or unaddressed user behavior scenarios.
    - **Unaddressed UC requirements**: If certain core features or requirements outlined in the UC are completely absent or inaccurately represented in the SSTS, causing the system to fail in fulfilling the expected functions.
    
    ### Examples of deviations that justify **NC**:
    - **Missing core functionality**: A system feature described in the UC, such as an essential user authentication process or data validation step, is entirely absent or inaccurately described in the SSTS.
    - **Incorrect interaction design**: The UC specifies a multi-step process for user input, but the SSTS describes a completely different or overly simplified flow that would break user expectations or reduce usability.
    - **Incomplete error handling**: The UC defines specific error scenarios and recovery processes, but the SSTS either omits these completely or includes incorrect solutions that fail to resolve critical issues in practice.
    
    ### Recommended Actions:
    The SSTS document requires **substantial revisions** to align with the UC. Key areas for improvement include:
    - **Addressing missing functionality**: Ensuring that all functional elements and user interaction scenarios defined in the UC are included and accurately described.
    - **Revising interaction flows**: Updating the interaction designs to reflect the UC specifications, ensuring that the user experience aligns with UC expectations.
    - **Improving error handling**: Ensuring that error scenarios and recovery methods are fully defined and consistent with the UC, preventing critical system failures.
      
    In summary, **NC** indicates that the SSTS is significantly misaligned with the UC and cannot be considered compliant without major revisions. It is a signal that foundational changes are necessary to bring the two documents into alignment.

    **NA - Not Applicable**  
    The SSTS and UC documents are fundamentally incomparable due to a clear difference in purpose, scope, or content. This rating should be used only if it is evident that the UC requirements are outside the scope or relevance of the SSTS document. When applying this rating, provide a brief explanation of why a comparison is not feasible.
    
    Be sure to assess the SSTS against UC based on the descriptions of requirements, scenarios, devices, interactions, outputs, error handling, and any other relevant details. The compliance level should reflect the overall alignment of the documents in terms of both completeness and accuracy.

Return your response *only* in the following dictionary format, with no additional text:

```{
    "Name": "name of the UC document",
    "Differences": "Summary of key differences here",
    "Description": "UC - Key requirements from the interface scenario, SSTS - Description of corresponding content in the technical document",
    "Compliance Level": "FC, LC, PC, NC, or NA"
}```

Example format:

```{
    "Name": "UC Example Document",
    "Differences": "Summary of key differences between SSTS and UC",
    "Description": "UC - Key requirements missing from the technical document, SSTS - Relevant technical document details",
    "Compliance Level": "FC"
}```"""


SYSTEM_PROMPT_UC = """
Create an HMI (Human-Machine Interface) specification document based on the provided textual description, detailing requirements for the system interaction scenario. The document should include the following sections:

1. **Preconditions**: Describe the conditions or system states that must be met or activated before the scenario begins.
2. **Main Scenario**: Provide a detailed description of the primary steps the user takes to achieve their goal, including the exact sequence of actions and the system's expected responses.
3. **Postconditions**: Define the expected result of the scenario, including the system's state and any anticipated changes.
4. **Alternative Scenarios**: Describe any possible variations from the main scenario, such as errors or non-standard user actions, and specify how the system should respond to them.

The final document should have a clear and complete structure, allowing for a thorough understanding of all aspects of user-system interaction and accounting for any potential deviations from the main flow.
"""

SYSTEM_PROMPT_SSTS = """
Create an STS (System Technical Specification) document that provides a structured textual description of the functionality of each system block. The document should include the following sections:

1. **Block Overview**: Provide a summary of the block's purpose and its role within the overall system.
2. **Inputs and Outputs**: Describe the inputs received and outputs generated by this block, including their types, formats, and expected ranges or constraints.
3. **Functional Description**: Detail the specific functions and processes performed by the block, outlining key operations, decision logic, and any dependencies on other blocks.
4. **Performance Requirements**: Define performance metrics or requirements, such as speed, latency, or throughput, that the block must meet.
5. **Error Handling and Fault Tolerance**: Describe how the block handles errors or exceptions, including any recovery mechanisms or fault tolerance features.
6. **Interfaces and Dependencies**: Outline connections with other blocks or systems, specifying any dependencies, protocols, and data exchanges.

The final document should provide a complete, structured view of the functionality and technical requirements of each block, supporting a comprehensive understanding of the system's architecture and interactions.
"""

model_path = "Yi-Coder-9B-Chat-Q4_K_M.gguf" # путь к модели

model = Llama(
        model_path=model_path,
        n_ctx=4000,
        n_gpu_layers=-1,
        n_parts=1,
        verbose=False,
    )


#  функция для инференса
def interact(
    user_message,SYSTEM_PROMPT,
    top_k=30,
    top_p=0.9,
    temperature=0.1, # регулирует на сколько модель лаконично отвечает
    repeat_penalty=1.1
    
):

    messages = [{"role": "system", "content": SYSTEM_PROMPT}]
    messages.append({"role": "user", "content": user_message})
    result = []
    for part in model.create_chat_completion(
            messages,
            temperature=temperature,
            top_k=top_k,
            top_p=top_p,
            repeat_penalty=repeat_penalty,
            stream=True,
        ):
            delta = part["choices"][0]["delta"]
            if "content" in delta:
                result.append(delta["content"])
                
    return ''.join(result).strip()
# interact(model_path)

In [3]:
HMI_path = 'train data/HMI'  # путь к файлам UC
SSTS_path = 'train data/SSTS' # путь к файлам SSTS

HMI_fail = os.listdir(HMI_path)
SSTS_fail = [f'SSTS-{os.path.splitext(f)[0][3:]}.docx' for f in os.listdir(HMI_path)]
len(HMI_fail)

12

In [4]:
HMI_fail[11]

'UC-26160.docx'

In [5]:
SSTS_fail[11]

'SSTS-26160.docx'

In [6]:
def read_document(path):
    doc = Document(path)
    full_text = []
    for para in doc.paragraphs:        
        full_text.append(para.text)

    full_text = '\n'.join(full_text)
    return full_text

In [7]:
#Генерируем датафрейм
df=pd.DataFrame()
for HMI, SSTS in zip(tqdm(HMI_fail), SSTS_fail):
    
    path1 = f'{HMI_path}/{HMI}'
    path2 = f'{SSTS_path}/{SSTS}'

    # Проверяем наличие SSTS
    if not os.path.exists(path2):
        data = {
                'Number': os.path.splitext(HMI)[0][3:],
                'Name': [Document(path1).paragraphs[0].text],
                'Differences': ["ssts hasn't info about this"],
                'Description': ['-'],
                'Compliance Level': ['NA']
            }
        df = pd.concat([df, pd.DataFrame([data])])
        continue
        
    text1 = read_document(path1)
    text2 = read_document(path2)

    while True:  #Цикл до успешного выполнения заданного выхода модели
        try:
            # text1 = interact(text1, SYSTEM_PROMPT_UC)
            # text2 = interact(text1, SYSTEM_PROMPT_SSTS)
            res = interact(f'text1: {text1}\n\n\ntext2: {text2}', SYSTEM_PROMPT)
            cleaned_text = res.strip('```python\n').strip('\n```')
            data = ast.literal_eval(cleaned_text)
            data['Number'] = os.path.splitext(HMI)[0][3:]
            df = pd.concat([df, pd.DataFrame([data])])
            break  # Если всё прошло успешно, выходим из цикла
        except Exception as e:
            print(f"Ошибка при обработке {HMI} и {SSTS}: {e}")
df

  0%|          | 0/12 [00:00<?, ?it/s]

Ошибка при обработке UC-25957.docx и SSTS-25957.docx: unterminated string literal (detected at line 5) (<unknown>, line 5)
Ошибка при обработке UC-25957.docx и SSTS-25957.docx: invalid syntax (<unknown>, line 5)
Ошибка при обработке UC-8800.docx и SSTS-8800.docx: invalid syntax (<unknown>, line 1)


Unnamed: 0,Name,Differences,Description,Compliance Level,Number
0,UC 25957: Mute/unmute the FM Radio playback,The SSTS does not explicitly address user inte...,UC - User interaction via in_2/in_5 button and...,PC,25957
0,UC-28561: Setting Hotspot name & password,The SSTS describes a general function that use...,UC - Key requirements: Users can modify the ho...,PC,28561
0,UC Example Document,SSTS doesn't mention the 'Add to Favorites' fe...,UC - User wants to add internet radio station ...,PC,31523
0,UC-6583 Driver Initiate a Call through SWP,The SSTS document does not provide details on ...,UC - The system should support voice commands ...,LC,6583
0,UC-11467 Revoke Access to Vehicle,SSTS does not clearly align with the UC in ter...,UC - The owner needs to revoke access for othe...,LC,11467
0,Turn on and off hotspot via VA,"Key differences include: in the Use Case (UC),...",UC - Users can turn on or off the vehicle hots...,LC,26771
0,UC_I-8800_Receiving Call Notifications,The SSTS does not provide exact description of...,UC - The system should notify about an incomin...,LC,8800
0,UC Example Document,In the UC it's clearly stated that media sourc...,UC - User selects multimedia via in_5 or in_2/...,LC,8604
0,Emergency Service Communication (ERA-Glonass),Key differences include the use of the SOS but...,UC - Key requirements such as automatic volume...,LC,8692
0,ERA Self-diagnosis,The SSTS document does not explicitly mention ...,UC - The system must perform self-diagnosis ev...,LC,30371


In [8]:
df = df.rename(columns={'Compliance Level': 'Complience Level'})
df['Complience Level'] = df['Complience Level'].apply(lambda x: x[0] if type(x) == list else x)
df['Number'] = df['Number'].astype(int)

In [9]:
gt = pd.read_excel('train data/train_data_markup.xlsx', keep_default_na=False)

In [10]:
from sklearn.metrics import mean_squared_error
def calc_score(gt, sub):
  sub = sub.drop_duplicates(subset='Number', keep='last')

  mapping = {'FC': 1, 'LC': 2, 'PC': 3, 'NC': 4, 'NA': 5}
  gt['категории_числа'] = gt['Complience Level'].map(mapping)
  sub['категории_числа'] = sub['Complience Level'].map(mapping)


  merge_df = pd.merge(gt, sub, on='Number', how='left')
  merge_df['категории_числа_y'] = merge_df['категории_числа_y'].fillna(mapping['NA'])
  mse = mean_squared_error(merge_df['категории_числа_x'], merge_df['категории_числа_y'])

  score = max(0, 1.5-mse)/1.5
  print(mse)

  return score


In [11]:
calc_score(gt, df)

0.9166666666666666


0.3888888888888889

"mathstral-7B-v0.1-Q4_K_M.gguf" - 0.22222222222222218 temperature=0.5

"Yi-Coder-9B-Chat-Q4_K_M.gguf" - 0.3333333333333333 temperature=0.5