# Get Consensus from 3 LLMs on CWE Assignment

See https://cybersecai.github.io/Vulnrichment/Vulnrichment/ for context.

To reduce API calls and token usage, and to stay within rate limits, CVE-CWE pairs are processed in batches of `batch_size` pairs.

Batching reduces:
1. the number of times the prompt instructions are sent (once per batch instead of once per CVE-CWE pair)
2. the number of calls to the API

`batch_size` must be small enough that the Rationale text for all CVE-CWE pairs in a batch fits within the LLM's output token limit.
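
The sizing arithmetic can be sketched as follows. The row count is read off the DataFrame preview later in this notebook (last index 1882); the output token limit, the per-entry token budget, and the safety margin are illustrative assumptions, not measured values, and the helper names are hypothetical.

```python
import math

def num_batches(n_rows: int, batch_size: int) -> int:
    """Number of API calls per model when rows are sent batch_size at a time."""
    return math.ceil(n_rows / batch_size)

def max_batch_size(output_token_limit: int, tokens_per_entry: int,
                   safety_margin: float = 0.8) -> int:
    """Largest batch whose combined output should fit in the output token limit.

    tokens_per_entry is an assumed budget for one JSON object including a
    Rationale; safety_margin leaves headroom for unusually long rationales.
    """
    usable_tokens = int(output_token_limit * safety_margin)
    return max(1, usable_tokens // tokens_per_entry)

# 1883 rows in batches of 10
print(num_batches(1883, 10))

# Assumed figures: 4096-token output limit, about 150 tokens per entry
print(max_batch_size(4096, 150))
```

With these assumed figures the upper bound comes out above the `batch_size` of 10 used below, which leaves room for longer rationales.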


## LLM Info

### JSON Mode info 
1. https://ai.google.dev/gemini-api/docs/json-mode?lang=python 
2. https://platform.openai.com/docs/guides/json-mode
3. https://github.com/anthropics/anthropic-cookbook/blob/main/misc/how_to_enable_json_mode.ipynb (note: "Claude doesn't have a formal 'JSON Mode' with constrained sampling.")
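
Without constrained sampling, a model may wrap its JSON in explanatory prose, so the array has to be located inside the reply text. A minimal sketch of one way to do this with the standard library's `json.JSONDecoder.raw_decode` follows; the helper name `extract_json_array` and the sample reply are hypothetical, and this is not the parsing code used later in the notebook.

```python
import json

def extract_json_array(text: str) -> list:
    """Return the first valid JSON array found in text, ignoring surrounding prose."""
    decoder = json.JSONDecoder()
    pos = text.find("[")
    while pos != -1:
        try:
            # raw_decode parses one JSON value starting at pos and ignores what follows
            value, _ = decoder.raw_decode(text, pos)
            if isinstance(value, list):
                return value
        except json.JSONDecodeError:
            pass  # this '[' did not start valid JSON; try the next one
        pos = text.find("[", pos + 1)
    raise ValueError("No JSON array found in response text")

# Hypothetical reply with prose before and after the array
reply = 'Sure, here is the result: [{"Agree": "Yes", "Confidence": 0.9}] Hope that helps.'
print(extract_json_array(reply))
```

One limitation of this approach: any bracketed text that happens to be valid JSON, such as a citation marker like `[1]`, is returned if it appears before the real array.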

### Usage / Plan
1. OpenAI
   1. https://platform.openai.com/account/usage
   2. https://platform.openai.com/docs/guides/rate-limits/usage-tiers?context=tier-one
2. Gemini
   1. https://cloud.google.com/gemini/docs/quotas
3. Claude
   1. https://support.anthropic.com/en/articles/8324991-about-claude-pro-usage
   2. https://support.anthropic.com/en/articles/8325614-how-can-i-maximize-my-claude-pro-usage
   3. https://console.anthropic.com/settings/plans
   4. https://console.anthropic.com/settings/limits
   

### References
1. https://python.langchain.com/v0.1/docs/integrations/chat/openai/
2. https://claude3.pro/claude-3-5-sonnet-with-langchain/
3. https://python.langchain.com/v0.2/docs/integrations/chat/google_generative_ai/

In [1]:
#!pip install langchain_openai langchain_google_genai langchain_anthropic langchain

In [1]:
import csv
import json
import os
from concurrent.futures import ThreadPoolExecutor, as_completed
from datetime import datetime

import pandas as pd
from dotenv import dotenv_values, load_dotenv
from tqdm import tqdm

from langchain_core.messages import HumanMessage, SystemMessage
from langchain_core.prompts import ChatPromptTemplate

from langchain_openai import ChatOpenAI
from langchain_google_genai import ChatGoogleGenerativeAI
from langchain_anthropic import ChatAnthropic

from langchain.prompts import PromptTemplate
from langchain.output_parsers import StructuredOutputParser, ResponseSchema
from langchain.schema import AIMessage

In [2]:
cwe_only_cisa_adp_input_file_path = './data_out/extracted_cwe_info_cisa_adp_only.csv' # Entries with CWEs only
llm_consensus_file_path = './data_out/llm_consensus.csv'
batch_size=10 # number of entries per batch.  
MAX_WORKERS=3 # threads


In [3]:

# Load the variables from the .env file into the process environment
load_dotenv()

# Also read the .env file into a dict so the keys can be set explicitly
config = dotenv_values(".env")

# Set the API keys expected by the LangChain clients (values come from .env)
os.environ["OPENAI_API_KEY"] = config['OPENAI_API_KEY']
os.environ["ANTHROPIC_API_KEY"] = config['ANTHROPIC_API_KEY']
os.environ["GOOGLE_API_KEY"] = config['GOOGLE_API_KEY']

# Alternative: keep the keys in plain variables instead of os.environ
#GOOGLE_API_KEY=config['GOOGLE_API_KEY']
#OPENAI_API_KEY=config['OPENAI_API_KEY']
#ANTHROPIC_API_KEY=config['ANTHROPIC_API_KEY']
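
A missing entry in `.env` would surface above as a `KeyError` on the first absent key. A small sketch of checking all three keys up front is shown below; the helper name `missing_keys` and the placeholder values are hypothetical, and the dict stands in for the mapping returned by `dotenv_values`.

```python
REQUIRED_KEYS = ("OPENAI_API_KEY", "ANTHROPIC_API_KEY", "GOOGLE_API_KEY")

def missing_keys(config: dict, required=REQUIRED_KEYS) -> list:
    """Names of required keys that are absent or empty in config."""
    return [name for name in required if not config.get(name)]

# Placeholder config: one key present, one absent, one empty
example_config = {"OPENAI_API_KEY": "sk-placeholder", "GOOGLE_API_KEY": ""}
print(missing_keys(example_config))
```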


In [4]:
# Read the CSV file
try:
    df = pd.read_csv(cwe_only_cisa_adp_input_file_path, quoting=csv.QUOTE_ALL, escapechar='\\') # safe CSV
except FileNotFoundError:
    print(f"Error: CSV file not found: {cwe_only_cisa_adp_input_file_path}")
    raise  # re-raise so the run stops here with a traceback

In [5]:
#df=df[:20] # for test purposes take the first N rows only
df

Unnamed: 0,cve_id,Container,adp_providerMetadata_orgId,adp_providerMetadata_shortName,adp_providerMetadata_dateUpdated,CVE_Description,cna_providerMetadata_orgId,cna_providerMetadata_shortName,cna_providerMetadata_dateUpdated,CWE_ID,CWE_Description
0,CVE-2021-35559,adp,134c704f-9b21-4f2e-91b3-4a467353bcc0,CISA-ADP,2024-06-25T16:05:50.566Z,"Vulnerability in the Java SE, Oracle GraalVM E...",,,,CWE-400,CWE-400 Uncontrolled Resource Consumption
1,CVE-2021-26928,adp,134c704f-9b21-4f2e-91b3-4a467353bcc0,CISA-ADP,2024-05-01T15:18:46.280Z,BIRD through 2.0.7 does not provide functional...,,,,CWE-306,CWE-306 Missing Authentication for Critical Fu...
2,CVE-2021-26918,adp,134c704f-9b21-4f2e-91b3-4a467353bcc0,CISA-ADP,2024-05-01T15:09:54.735Z,The ProBot bot through 2021-02-08 for Discord ...,,,,CWE-434,CWE-434 Unrestricted Upload of File with Dange...
3,CVE-2021-34983,adp,134c704f-9b21-4f2e-91b3-4a467353bcc0,CISA-ADP,2024-05-08T15:08:02.757Z,NETGEAR Multiple Routers httpd Missing Authent...,,,,CWE-120,CWE-120 Buffer Copy without Checking Size of I...
4,CVE-2021-33990,adp,134c704f-9b21-4f2e-91b3-4a467353bcc0,CISA-ADP,2024-07-12T15:32:38.330Z,Liferay Portal 6.2.5 allows Command=FileUpload...,,,,CWE-78,CWE-78 Improper Neutralization of Special Elem...
...,...,...,...,...,...,...,...,...,...,...,...
1879,CVE-2024-37762,adp,134c704f-9b21-4f2e-91b3-4a467353bcc0,CISA-ADP,2024-07-08T15:13:35.601Z,MachForm up to version 21 is affected by an au...,,,,CWE-434,CWE-434 Unrestricted Upload of File with Dange...
1880,CVE-2024-37634,adp,134c704f-9b21-4f2e-91b3-4a467353bcc0,CISA-ADP,2024-06-13T20:21:11.912Z,TOTOLINK A3700R V9.1.2u.6165_20211012 was disc...,,,,CWE-121,CWE-121 Stack-based Buffer Overflow
1881,CVE-2024-37535,adp,134c704f-9b21-4f2e-91b3-4a467353bcc0,CISA-ADP,2024-06-10T12:55:15.708Z,GNOME VTE before 0.76.3 allows an attacker to ...,,,,CWE-400,CWE-400 Uncontrolled Resource Consumption
1882,CVE-2024-37019,adp,134c704f-9b21-4f2e-91b3-4a467353bcc0,CISA-ADP,2024-06-04T14:49:54.923Z,Northern.tech Mender Enterprise before 3.6.4 a...,,,,CWE-287,CWE-287 Improper Authentication


In [6]:
# Set up LangChain for LLM interactions


llms = {
    "claude": ChatAnthropic(model="claude-3-sonnet-20240229", temperature=0,   timeout=None, max_retries=2),
    "gemini": ChatGoogleGenerativeAI(model="gemini-1.5-pro", temperature=0,  timeout=None, max_retries=2, generation_config={"response_mime_type": "application/json"}),
    #"gpt-4o": ChatOpenAI(model="gpt-4o", temperature=0, response_format={ "type": "json_object" })
}

# Define the prompt
# The Rationale string is last as it is most complex

# Update the prompt template
prompt_template = ChatPromptTemplate.from_messages([
    ("system", "You are a cybersecurity expert specializing in identifying Common Weakness Enumeration (CWE) IDs from CVE descriptions. Your goal is to analyze if you agree with the assigned CWE ID or not for multiple CVEs."),
    ("human", """
    Analyze the following CVEs and their assigned CWE IDs:
    
    {cve_entries}
    
    For each CVE, output a JSON object containing the following information:
    {{
        "Agree": string, // "Yes" or "No"
        "Confidence": float, // a confidence score between 0 and 1
        "Rationale": string // Only if you do not agree, explain why not
    }}
    
    Respond with a JSON array containing an object for each CVE, in the same order as provided.
    """)
])

# Update the output parser
response_schemas = [
    ResponseSchema(name="Agree", description="Whether the model agrees with the assigned CWE ID"),
    ResponseSchema(name="Confidence", description="Confidence score between 0 and 1"),
    ResponseSchema(name="Rationale", description="Rationale for disagreement")
]
#output_parser = StructuredOutputParser.from_response_schemas(response_schemas)




In [7]:
def create_batch_prompt(batch, prompt_template):
    cve_entries = []
    for _, row in batch.iterrows():
        cve_entry = f"""
        CVE ID: {row['cve_id']}
        CVE Description: {row['CVE_Description']}
        Assigned CWE ID: {row['CWE_ID']}
        """
        cve_entries.append(cve_entry)
    
    batch_prompt = prompt_template.format(cve_entries="\n\n".join(cve_entries))
    return batch_prompt

def parse_response(response_content):
    try:
        # First, try to parse the content as JSON
        parsed_responses = json.loads(response_content)
    except json.JSONDecodeError:
        # If JSON parsing fails, try to extract the JSON array from the text
        start = response_content.find('[')
        end = response_content.rfind(']')
        # find/rfind return -1 when a bracket is missing, so check before slicing
        if start != -1 and end > start:
            json_str = response_content[start:end + 1]
            parsed_responses = json.loads(json_str)
        else:
            raise ValueError("Unable to parse response content")
    
    # Ensure the parsed response is a list
    if not isinstance(parsed_responses, list):
        parsed_responses = [parsed_responses]
    
    return parsed_responses


def process_cve_batch(batch, llms, prompt_template, response_dir):
    batch_prompt = create_batch_prompt(batch, prompt_template)
    results = []
    responses = []
    
    for model_name, llm in llms.items():
        try:
            response = llm.invoke(batch_prompt)
            parsed_responses = parse_response(response.content)
            
            # Save response to file
            timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
            filename = f"{model_name}_response_{timestamp}.json"
            filepath = os.path.join(response_dir, filename)
            with open(filepath, 'w') as f:
                json.dump({
                    'model': model_name,
                    'prompt': batch_prompt,
                    'response': response.content,
                    'parsed_responses': parsed_responses
                }, f, indent=2)
            
            for i, (_, row) in enumerate(batch.iterrows()):
                if i < len(parsed_responses):
                    result = {
                        'CVE ID': row['cve_id'],
                        'CVE Description': row['CVE_Description'],
                        'Assigned CWE ID': row['CWE_ID'],
                        f'{model_name}_Agreement': parsed_responses[i]['Agree'],
                        f'{model_name}_Confidence': parsed_responses[i]['Confidence'],
                        f'{model_name}_Rationale': parsed_responses[i].get('Rationale', ''),
                        f'{model_name}_ResponseFile': filename
                    }
                else:
                    print(f"Warning: Missing response for CVE {row['cve_id']} (CWE {row['CWE_ID']}) from {model_name}")
                    result = {
                        'CVE ID': row['cve_id'],
                        'CVE Description': row['CVE_Description'],
                        'Assigned CWE ID': row['CWE_ID'],
                        f'{model_name}_Agreement': '',
                        f'{model_name}_Confidence': '',
                        f'{model_name}_Rationale': '',
                        f'{model_name}_ResponseFile': filename
                    }
                results.append(result)
            responses.append(response)  # record the raw response once per model, not once per row
        except Exception as e:
            print(f"Error processing batch with {model_name}: {str(e)}")
            for _, row in batch.iterrows():
                results.append({
                    'CVE ID': row['cve_id'],
                    'CVE Description': row['CVE_Description'],
                    'Assigned CWE ID': row['CWE_ID'],
                    f'{model_name}_Agreement': '',
                    f'{model_name}_Confidence': '',
                    f'{model_name}_Rationale': '',
                    f'{model_name}_ResponseFile': ''
                })
    
    return results, responses

def process_cves_in_batches(df, llms, prompt_template, batch_size, max_workers=MAX_WORKERS, response_dir='responses'):
    results = []
    total_batches = (len(df) + batch_size - 1) // batch_size
    
    # Create response directory if it doesn't exist
    os.makedirs(response_dir, exist_ok=True)
    
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        future_to_batch = {
            executor.submit(
                process_cve_batch, 
                df.iloc[i:i+batch_size], 
                llms, 
                prompt_template, 
                response_dir
            ): i for i in range(0, len(df), batch_size)
        }
        
        for future in tqdm(as_completed(future_to_batch), total=total_batches, desc="Processing CVE Batches"):
            batch_results, _ = future.result()
            results.extend(batch_results)
    
    return pd.DataFrame(results)

# Usage
results_df = process_cves_in_batches(df, llms, prompt_template, batch_size)

Error processing batch with gemini: Unable to parse response content
Error processing batch with gemini: Unable to parse response content
Error processing batch with gemini: Unable to parse response content
Error processing batch with gemini: Unable to parse response content
Error processing batch with gemini: Expecting ',' delimiter: line 20 column 52 (char 1321)
Error processing batch with gemini: Expecting ',' delimiter: line 30 column 148 (char 1119)
Error processing batch with gemini: Unable to parse response content
Error processing batch with gemini: Unable to parse response content
Error processing batch with gemini: Unable to parse response content
Error processing batch with gemini: Unable to parse response content
Error processing batch with gemini: Unable to parse response content
Error processing batch with gemini: Unable to parse response content
Error processing batch with gemini: Unable to parse response content
Error processing batch with gemini: Unable to parse response content
Error processing batch with gemini: Unable to parse response content
Processing CVE Batches: 100%|██████████| 189/189 [17:20<00:00,  5.51s/it]
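
Each failed batch in the output above leaves empty `Agreement`, `Confidence`, and `Rationale` values for that model. One possible way to select the affected input rows for a second pass is sketched below; the helper name `rows_to_retry` and the toy CVE IDs are hypothetical, and the column names are the ones produced by `process_cve_batch`.

```python
import pandas as pd

def rows_to_retry(results: pd.DataFrame, source: pd.DataFrame, model_name: str) -> pd.DataFrame:
    """Rows of source whose CVE has an empty Agreement value for model_name."""
    # Failed batches are recorded as '' (rows from other models hold NaN, which does not match '')
    failed_ids = results.loc[results[f"{model_name}_Agreement"] == "", "CVE ID"]
    return source[source["cve_id"].isin(failed_ids)]

# Toy data with placeholder IDs
toy_source = pd.DataFrame({
    "cve_id": ["CVE-0000-0001", "CVE-0000-0002", "CVE-0000-0003"],
    "CWE_ID": ["CWE-79", "CWE-89", "CWE-400"],
})
toy_results = pd.DataFrame({
    "CVE ID": ["CVE-0000-0001", "CVE-0000-0002", "CVE-0000-0003"],
    "gemini_Agreement": ["Yes", "", ""],
})

retry_df = rows_to_retry(toy_results, toy_source, "gemini")
print(retry_df)
```

The resulting frame could then be passed to `process_cves_in_batches` with only the failing model in `llms`, possibly with a smaller `batch_size`.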


In [15]:
results_df

Unnamed: 0,CVE ID,CVE Description,Assigned CWE ID,claude_Agreement,claude_Confidence,claude_Rationale,claude_ResponseFile,gemini_Agreement,gemini_Confidence,gemini_Rationale,gemini_ResponseFile
0,CVE-2021-47329,"In the Linux kernel, the following vulnerabili...",CWE-400,Yes,0.9,,claude_response_20240720_235323.json,,,,
1,CVE-2021-47368,"In the Linux kernel, the following vulnerabili...",CWE-400,Yes,0.9,,claude_response_20240720_235323.json,,,,
2,CVE-2021-47482,"In the Linux kernel, the following vulnerabili...",CWE-544,Yes,0.8,,claude_response_20240720_235323.json,,,,
3,CVE-2021-47371,"In the Linux kernel, the following vulnerabili...",CWE-400,Yes,0.9,,claude_response_20240720_235323.json,,,,
4,CVE-2021-47238,"In the Linux kernel, the following vulnerabili...",CWE-400,Yes,0.9,,claude_response_20240720_235323.json,,,,
...,...,...,...,...,...,...,...,...,...,...,...
3763,CVE-2024-37764,MachForm up to version 19 is affected by an au...,CWE-79,,,,,Yes,0.95,,gemini_response_20240721_001036.json
3764,CVE-2024-37622,Xinhu RockOA v2.6.3 was discovered to contain ...,CWE-79,,,,,Yes,0.95,,gemini_response_20240721_001036.json
3765,CVE-2024-37643,TRENDnet TEW-814DAP v1_(FW1.01B01) was discove...,CWE-121,,,,,Yes,0.9,,gemini_response_20240721_001036.json
3766,CVE-2024-37569,An issue was discovered on Mitel 6869i through...,CWE-77,,,,,Yes,0.95,,gemini_response_20240721_001036.json
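
Because `process_cve_batch` appends one row per model, the frame above holds each CVE-CWE pair once per model, with the other model's columns empty. A minimal sketch of collapsing these rows into one per pair and deriving a consensus label follows. The helper names `merge_model_rows` and `add_consensus`, the `Consensus` column, the "Mixed" label, and the toy CVE IDs are assumptions for illustration and are not part of the original notebook.

```python
import pandas as pd

def merge_model_rows(results: pd.DataFrame) -> pd.DataFrame:
    """Collapse per-model rows into one row per CVE-CWE pair."""
    keys = ["CVE ID", "Assigned CWE ID"]
    # Treat '' (failed batches) as missing so that first() skips it
    cleaned = results.mask(results == "")
    # GroupBy.first() keeps the first non-null value of each column per group
    return cleaned.groupby(keys, as_index=False, sort=False).first()

def add_consensus(merged: pd.DataFrame, models: list) -> pd.DataFrame:
    """Add a Consensus column: 'Yes' or 'No' if every model gave that answer, else 'Mixed'."""
    def label(row):
        answers = {row[f"{m}_Agreement"] for m in models}
        if answers == {"Yes"}:
            return "Yes"
        if answers == {"No"}:
            return "No"
        return "Mixed"  # disagreement, or at least one model has no answer
    out = merged.copy()
    out["Consensus"] = out.apply(label, axis=1)
    return out

# Toy rows shaped like the output of process_cve_batch (placeholder IDs)
toy = pd.DataFrame([
    {"CVE ID": "CVE-0000-0001", "Assigned CWE ID": "CWE-79", "claude_Agreement": "Yes", "claude_Confidence": 0.9},
    {"CVE ID": "CVE-0000-0002", "Assigned CWE ID": "CWE-89", "claude_Agreement": "No", "claude_Confidence": 0.7},
    {"CVE ID": "CVE-0000-0001", "Assigned CWE ID": "CWE-79", "gemini_Agreement": "Yes", "gemini_Confidence": 0.95},
    {"CVE ID": "CVE-0000-0002", "Assigned CWE ID": "CWE-89", "gemini_Agreement": "", "gemini_Confidence": ""},
])

merged = add_consensus(merge_model_rows(toy), ["claude", "gemini"])
print(merged[["CVE ID", "claude_Agreement", "gemini_Agreement", "Consensus"]])
```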


# Variant of process_cve_batch that does not save the raw responses to disk
def process_cve_batch(batch, llms, prompt_template):
    batch_prompt = create_batch_prompt(batch, prompt_template)
    results = []
    responses = []
    
    for model_name, llm in llms.items():
        try:
            response = llm.invoke(batch_prompt)
            parsed_responses = parse_response(response.content)
            
            for i, (_, row) in enumerate(batch.iterrows()):
                if i < len(parsed_responses):
                    result = {
                        'CVE ID': row['cve_id'],
                        'CVE Description': row['CVE_Description'],
                        'Assigned CWE ID': row['CWE_ID'],
                        f'{model_name}_Agreement': parsed_responses[i]['Agree'],
                        f'{model_name}_Confidence': parsed_responses[i]['Confidence'],
                        f'{model_name}_Rationale': parsed_responses[i].get('Rationale', '')
                    }
                else:
                    print(f"Warning: Missing response for CVE-CWE {row['cve_id'], row['CWE_ID']} from {model_name}")
                    result = {
                        'CVE ID': row['cve_id'],
                        'CVE Description': row['CVE_Description'],
                        'Assigned CWE ID': row['CWE_ID'],
                        f'{model_name}_Agreement': '',
                        f'{model_name}_Confidence': '',
                        f'{model_name}_Rationale': ''
                    }
                results.append(result)
            responses.append(response)  # record the raw response once per model, not once per row
        except Exception as e:
            print(f"Error processing batch with {model_name}: {str(e)}")
            for _, row in batch.iterrows():
                results.append({
                    'CVE ID': row['cve_id'],
                    'CVE Description': row['CVE_Description'],
                    'Assigned CWE ID': row['CWE_ID'],
                    f'{model_name}_Agreement': '',
                    f'{model_name}_Confidence': '',
                    f'{model_name}_Rationale': ''
                })
    
    return results, responses

def process_cves_in_batches(df, llms, prompt_template, batch_size, max_workers=MAX_WORKERS):
    results = []
    total_batches = (len(df) + batch_size - 1) // batch_size
    
    # Use the max_workers argument rather than the module-level constant
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        future_to_batch = {
            executor.submit(
                process_cve_batch, 
                df.iloc[i:i+batch_size], 
                llms, 
                prompt_template
            ): i for i in range(0, len(df), batch_size)
        }
        
        for future in tqdm(as_completed(future_to_batch), total=total_batches, desc="Processing CVE Batches"):
            # process_cve_batch returns (results, responses); keep only the results
            batch_results, _ = future.result()
            results.extend(batch_results)
    
    return pd.DataFrame(results)

# Usage
results_df = process_cves_in_batches(df, llms, prompt_template, batch_size)

In [9]:
#llms

In [16]:

results_df.to_csv(llm_consensus_file_path, index=False, quoting=csv.QUOTE_ALL, escapechar='\\') # safe CSV

In [19]:
results_df.claude_Agreement.value_counts()

Yes    1608
No      268
          8
Name: claude_Agreement, dtype: int64

In [21]:
results_df.gemini_Agreement.value_counts()

Yes    1270
No      459
        155
Name: gemini_Agreement, dtype: int64
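
To compare the two models on the same scale, the counts above can be turned into shares. The sketch below uses only the counts reported in the two outputs; the helper name `shares` and the "(empty)" label are hypothetical. On the frame itself, `results_df.gemini_Agreement.value_counts(normalize=True)` would give the same proportions directly.

```python
# Counts copied from the value_counts() outputs above; '' marks rows with no parsed answer
counts = {
    "claude": {"Yes": 1608, "No": 268, "": 8},
    "gemini": {"Yes": 1270, "No": 459, "": 155},
}

def shares(model_counts: dict) -> dict:
    """Convert raw counts into fractions of the model's total row count."""
    total = sum(model_counts.values())
    return {label or "(empty)": n / total for label, n in model_counts.items()}

for model, model_counts in counts.items():
    formatted = {label: f"{share:.1%}" for label, share in shares(model_counts).items()}
    print(model, formatted)
```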

In [13]:
#results_df['gpt-4o_Agreement'].value_counts()