# Get Consensus from 3 LLMs on CWE Assignment

See https://cybersecai.github.io/Vulnrichment/Vulnrichment/ for context.


## LLM Info

### JSON Mode info 
1. https://ai.google.dev/gemini-api/docs/json-mode?lang=python 
2. https://platform.openai.com/docs/guides/json-mode
3. https://github.com/anthropics/anthropic-cookbook/blob/main/misc/how_to_enable_json_mode.ipynb (note from the cookbook: "Claude doesn't have a formal 'JSON Mode' with constrained sampling.")
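
Without constrained sampling, a reply may wrap the JSON object in prose or a Markdown fence. The following is a minimal sketch of pulling the first JSON object out of such a reply using only the standard library. `extract_json_object` and the sample `reply` string are hypothetical and are not part of any of the APIs linked above.

```python
import json

def extract_json_object(text):
    """Return the first JSON object embedded in `text` as a dict.

    Tries to decode a JSON value at every '{' in turn, so leading prose
    or a Markdown fence around the object is skipped.
    """
    decoder = json.JSONDecoder()
    start = text.find("{")
    while start != -1:
        try:
            value, _end = decoder.raw_decode(text, start)
            return value
        except json.JSONDecodeError:
            # Not valid JSON from this brace; try the next one
            start = text.find("{", start + 1)
    raise ValueError("no JSON object found in response text")

# Hypothetical reply that surrounds the JSON with extra text
reply = 'Here is my answer:\n```json\n{"Agree": "No", "Confidence": 0.8, "Rationale": "Too generic"}\n```'
parsed = extract_json_object(reply)
```

This only helps when the embedded object is itself valid JSON; a reply that echoes `//` comments inside the object would still fail to parse.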

### Usage / Plan
1. OpenAI
   1. https://platform.openai.com/account/usage
   2. https://platform.openai.com/docs/guides/rate-limits/usage-tiers?context=tier-one
2. Gemini
   1. https://cloud.google.com/gemini/docs/quotas
3. Claude
   1. https://support.anthropic.com/en/articles/8324991-about-claude-pro-usage
   2. https://support.anthropic.com/en/articles/8325614-how-can-i-maximize-my-claude-pro-usage
   3. https://console.anthropic.com/settings/plans
   4. https://console.anthropic.com/settings/limits
   

### References
1. https://python.langchain.com/v0.1/docs/integrations/chat/openai/
2. https://claude3.pro/claude-3-5-sonnet-with-langchain/
3. https://python.langchain.com/v0.2/docs/integrations/chat/google_generative_ai/

In [2]:
#!pip install langchain_openai langchain_google_genai langchain_anthropic langchain

In [3]:
import csv
import json
import os

import pandas as pd
from dotenv import dotenv_values, load_dotenv
from tqdm import tqdm

from langchain_core.prompts import ChatPromptTemplate

from langchain_openai import ChatOpenAI
from langchain_google_genai import ChatGoogleGenerativeAI
from langchain_anthropic import ChatAnthropic

from langchain.output_parsers import StructuredOutputParser, ResponseSchema


In [17]:
cwe_only_cisa_adp_input_file_path = './data_out/extracted_cwe_info_cisa_adp_only.csv' # Entries with CWEs only
llm_consensus_file_path = './data_out/llm_consensus.csv'


In [5]:

# Load the variables defined in .env into the process environment
load_dotenv()

# Also read the same file as a plain dict
config = dotenv_values(".env")

# Export the API keys explicitly; a key missing from .env raises KeyError here
os.environ["OPENAI_API_KEY"] = config['OPENAI_API_KEY']
os.environ["ANTHROPIC_API_KEY"] = config['ANTHROPIC_API_KEY']
os.environ["GOOGLE_API_KEY"] = config['GOOGLE_API_KEY']

#If using env vars
#GOOGLE_API_KEY=config['GOOGLE_API_KEY']
#OPENAI_API_KEY=config['OPENAI_API_KEY']
#ANTHROPIC_API_KEY=config['ANTHROPIC_API_KEY']
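
Indexing `config[...]` directly fails with a bare `KeyError` on the first missing key. A small sketch of checking all three keys up front is shown here; `missing_keys`, `REQUIRED_KEYS`, and `example_config` are hypothetical names, and the example uses an in-memory dict in place of `dotenv_values(".env")`.

```python
REQUIRED_KEYS = ("OPENAI_API_KEY", "ANTHROPIC_API_KEY", "GOOGLE_API_KEY")

def missing_keys(config, required=REQUIRED_KEYS):
    """Return the names in `required` that are absent or empty in `config`."""
    return [name for name in required if not config.get(name)]

# In-memory stand-in for dotenv_values(".env"); the values are placeholders
example_config = {"OPENAI_API_KEY": "sk-example", "ANTHROPIC_API_KEY": ""}
absent = missing_keys(example_config)
if absent:
    print(f"Missing API keys in .env: {', '.join(absent)}")
```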


In [6]:
# Read the CSV file
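try:
    df = pd.read_csv(cwe_only_cisa_adp_input_file_path)
except FileNotFoundError:
    # Report the missing path, then re-raise so the notebook stops at this cell
    print(f"Error: CSV file not found: {cwe_only_cisa_adp_input_file_path}")
    raise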

df

Unnamed: 0,cve_id,Container,adp_providerMetadata_orgId,adp_providerMetadata_shortName,adp_providerMetadata_dateUpdated,CVE_Description,cna_providerMetadata_orgId,cna_providerMetadata_shortName,cna_providerMetadata_dateUpdated,CWE_ID,CWE_Description
0,CVE-2021-35559,adp,134c704f-9b21-4f2e-91b3-4a467353bcc0,CISA-ADP,2024-06-25T16:05:50.566Z,"Vulnerability in the Java SE, Oracle GraalVM E...",,,,CWE-400,CWE-400 Uncontrolled Resource Consumption
1,CVE-2021-26928,adp,134c704f-9b21-4f2e-91b3-4a467353bcc0,CISA-ADP,2024-05-01T15:18:46.280Z,BIRD through 2.0.7 does not provide functional...,,,,CWE-306,CWE-306 Missing Authentication for Critical Fu...
2,CVE-2021-26918,adp,134c704f-9b21-4f2e-91b3-4a467353bcc0,CISA-ADP,2024-05-01T15:09:54.735Z,The ProBot bot through 2021-02-08 for Discord ...,,,,CWE-434,CWE-434 Unrestricted Upload of File with Dange...
3,CVE-2021-34983,adp,134c704f-9b21-4f2e-91b3-4a467353bcc0,CISA-ADP,2024-05-08T15:08:02.757Z,NETGEAR Multiple Routers httpd Missing Authent...,,,,CWE-120,CWE-120 Buffer Copy without Checking Size of I...
4,CVE-2021-33990,adp,134c704f-9b21-4f2e-91b3-4a467353bcc0,CISA-ADP,2024-07-12T15:32:38.330Z,Liferay Portal 6.2.5 allows Command=FileUpload...,,,,CWE-78,CWE-78 Improper Neutralization of Special Elem...
...,...,...,...,...,...,...,...,...,...,...,...
1879,CVE-2024-37762,adp,134c704f-9b21-4f2e-91b3-4a467353bcc0,CISA-ADP,2024-07-08T15:13:35.601Z,MachForm up to version 21 is affected by an au...,,,,CWE-434,CWE-434 Unrestricted Upload of File with Dange...
1880,CVE-2024-37634,adp,134c704f-9b21-4f2e-91b3-4a467353bcc0,CISA-ADP,2024-06-13T20:21:11.912Z,TOTOLINK A3700R V9.1.2u.6165_20211012 was disc...,,,,CWE-121,CWE-121 Stack-based Buffer Overflow
1881,CVE-2024-37535,adp,134c704f-9b21-4f2e-91b3-4a467353bcc0,CISA-ADP,2024-06-10T12:55:15.708Z,GNOME VTE before 0.76.3 allows an attacker to ...,,,,CWE-400,CWE-400 Uncontrolled Resource Consumption
1882,CVE-2024-37019,adp,134c704f-9b21-4f2e-91b3-4a467353bcc0,CISA-ADP,2024-06-04T14:49:54.923Z,Northern.tech Mender Enterprise before 3.6.4 a...,,,,CWE-287,CWE-287 Improper Authentication


In [7]:
df=df[:3] # for test purposes take the first 3 rows only
df

Unnamed: 0,cve_id,Container,adp_providerMetadata_orgId,adp_providerMetadata_shortName,adp_providerMetadata_dateUpdated,CVE_Description,cna_providerMetadata_orgId,cna_providerMetadata_shortName,cna_providerMetadata_dateUpdated,CWE_ID,CWE_Description
0,CVE-2021-35559,adp,134c704f-9b21-4f2e-91b3-4a467353bcc0,CISA-ADP,2024-06-25T16:05:50.566Z,"Vulnerability in the Java SE, Oracle GraalVM E...",,,,CWE-400,CWE-400 Uncontrolled Resource Consumption
1,CVE-2021-26928,adp,134c704f-9b21-4f2e-91b3-4a467353bcc0,CISA-ADP,2024-05-01T15:18:46.280Z,BIRD through 2.0.7 does not provide functional...,,,,CWE-306,CWE-306 Missing Authentication for Critical Fu...
2,CVE-2021-26918,adp,134c704f-9b21-4f2e-91b3-4a467353bcc0,CISA-ADP,2024-05-01T15:09:54.735Z,The ProBot bot through 2021-02-08 for Discord ...,,,,CWE-434,CWE-434 Unrestricted Upload of File with Dange...


In [8]:

# Set up LangChain for LLM interactions


llms = {
    "claude": ChatAnthropic(model="claude-3-sonnet-20240229", temperature=0,   timeout=None, max_retries=2),
    "gemini": ChatGoogleGenerativeAI(model="gemini-1.5-pro", temperature=0,  timeout=None, max_retries=2, generation_config={"response_mime_type": "application/json"}),
    #"gpt-4o": ChatOpenAI(model="gpt-4o", temperature=0, response_format={ "type": "json_object" })
}

# Define the prompt
# The Rationale string is last as it is most complex

prompt_template = ChatPromptTemplate.from_messages([
    ("system", "You are a cybersecurity expert specializing in identifying Common Weakness Enumeration (CWE) IDs from CVE descriptions. Your goal is to analyze if you agree with the assigned CWE ID or not."),
    ("human", """
      CVE ID: {cve_id}
      CVE Description: {cve_description}
      Assigned CWE ID: {cwe_id}

Output a JSON object containing the following information:
{{
    "Agree": string, // "Yes" or "No"
    "Confidence": float, // a confidence score between 0 and 1
    "Rationale": string // Only if you do not Agree, provide a rationale why not
}}
""")
])


# Set up output parser
response_schemas = [
    ResponseSchema(name="Agree", description="Whether the model agrees with the assigned CWE ID"),
    ResponseSchema(name="Confidence", description="Confidence score between 0 and 1"),
    ResponseSchema(name="Rationale", description="Rationale for disagreement")

]
output_parser = StructuredOutputParser.from_response_schemas(response_schemas)
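
The parser checks that the three keys are present but not what they contain. As a hedged sketch, the parsed dict could be normalized before it is stored, so that `"yes"` and `"Yes"` count as the same vote and an out-of-range confidence is rejected. `normalize_verdict` is a hypothetical helper, not part of LangChain.

```python
def normalize_verdict(parsed):
    """Validate and normalize one parsed LLM response.

    Returns a dict with "Agree" as "Yes"/"No", "Confidence" as a float
    in [0, 1], and "Rationale" as a (possibly empty) string.
    """
    agree = str(parsed.get("Agree", "")).strip().capitalize()
    if agree not in ("Yes", "No"):
        raise ValueError(f"unexpected Agree value: {parsed.get('Agree')!r}")

    try:
        confidence = float(parsed.get("Confidence"))
    except (TypeError, ValueError):
        raise ValueError(f"Confidence is not a number: {parsed.get('Confidence')!r}")
    if not 0.0 <= confidence <= 1.0:
        raise ValueError(f"Confidence out of range: {confidence}")

    rationale = str(parsed.get("Rationale") or "").strip()
    return {"Agree": agree, "Confidence": confidence, "Rationale": rationale}

# Hypothetical parsed response with lowercase vote and string confidence
verdict = normalize_verdict({"Agree": "yes", "Confidence": "0.9"})
```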





In [9]:
#llms

{'claude': ChatAnthropic(model='claude-3-sonnet-20240229', temperature=0.0, anthropic_api_url='https://api.anthropic.com', anthropic_api_key=SecretStr('**********'), _client=<anthropic.Anthropic object at 0x7a15942f5db0>, _async_client=<anthropic.AsyncAnthropic object at 0x7a14dfb78e80>),
 'gemini': ChatGoogleGenerativeAI(model='models/gemini-1.5-pro', temperature=0.0, max_retries=2, client=<google.ai.generativelanguage_v1beta.services.generative_service.client.GenerativeServiceClient object at 0x7a14dfb79f90>, async_client=<google.ai.generativelanguage_v1beta.services.generative_service.async_client.GenerativeServiceAsyncClient object at 0x7a14dfbdcf70>, default_metadata=())}

In [10]:
from concurrent.futures import ThreadPoolExecutor, as_completed

def process_cve(row, llms, prompt_template, output_parser):
    """Ask every configured LLM to review the CWE assigned to one CVE row."""
    cve_id, cve_description, cwe_id = row['cve_id'], row['CVE_Description'], row['CWE_ID']
    # ChatPromptTemplate.format() flattens the system and human messages into one string
    prompt = prompt_template.format(cve_id=cve_id, cve_description=cve_description, cwe_id=cwe_id)
    
    result = {'CVE ID': cve_id, 'CVE Description': cve_description, 'Assigned CWE ID': cwe_id}
    
    for model_name, llm in llms.items():
        try:
            response = llm.invoke(prompt)
            parsed_response = output_parser.parse(response.content)
            result[f'{model_name}_Agreement'] = parsed_response['Agree']
            result[f'{model_name}_Confidence'] = parsed_response['Confidence']
            result[f'{model_name}_Rationale'] = parsed_response.get('Rationale', '')
        except Exception as e:
            print(f"Error processing {cve_id} with {model_name}: {str(e)}")
            result[f'{model_name}_Agreement'] = ''
            result[f'{model_name}_Confidence'] = ''
            result[f'{model_name}_Rationale'] = ''
    
    return result

def process_cves(df, llms, prompt_template, output_parser, max_workers=5):
    results = []
    total_cves = len(df)
    
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        future_to_row = {executor.submit(process_cve, row, llms, prompt_template, output_parser): row for _, row in df.iterrows()}
        
        for future in tqdm(as_completed(future_to_row), total=total_cves, desc="Processing CVEs"):
            results.append(future.result())
    
    return pd.DataFrame(results)

# Usage
results_df = process_cves(df, llms, prompt_template, output_parser)

Processing CVEs: 100%|██████████| 3/3 [00:06<00:00,  2.21s/it]
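
Note that `as_completed` yields futures in the order they finish, so the rows of the resulting DataFrame are not guaranteed to follow the order of `df`. If input order matters, one option is `executor.map`, which returns results in submission order. The sketch below demonstrates this with a stand-in function; `map_in_order` and `slow_square` are hypothetical and make no LLM calls.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def map_in_order(func, items, max_workers=5):
    """Run `func` over `items` in a thread pool, keeping input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        # executor.map yields results in the order the items were submitted
        return list(executor.map(func, items))

def slow_square(n):
    """Stand-in for a slow API call: larger inputs take longer."""
    time.sleep(0.01 * n)
    return n * n

ordered = map_in_order(slow_square, [3, 2, 1])
```

Here the first item finishes last, yet `ordered` still lists its result first.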


In [11]:
# Process each CVE entry sequentially and count failures per model
results = []
errors = {model: 0 for model in llms.keys()}
total_cves = len(df)

for _, row in tqdm(df.iterrows(), total=total_cves, desc="Processing CVEs"):
    cve_id = row['cve_id']
    cve_description = row['CVE_Description']
    cwe_id = row['CWE_ID']
    
    prompt = prompt_template.format(cve_id=cve_id, cve_description=cve_description, cwe_id=cwe_id)
    
    result = {'CVE ID': cve_id, 'CVE Description': cve_description, 'Assigned CWE ID': cwe_id}
    
    for model_name, llm in llms.items():
        try:
            response = llm.invoke(prompt)
            parsed_response = output_parser.parse(response.content)
            result[f'{model_name}_Agreement'] = parsed_response['Agree']
            result[f'{model_name}_Confidence'] = parsed_response['Confidence']
            result[f'{model_name}_Rationale'] = parsed_response.get('Rationale', '')

        except Exception as e:
            print(f"Error processing {cve_id} with {model_name}: {str(e)}")
            errors[model_name] += 1
            result[f'{model_name}_Agreement'] = ''
            result[f'{model_name}_Confidence'] = ''
            result[f'{model_name}_Rationale'] = ''    
    results.append(result)

# Create and populate the pandas DataFrame
results_df = pd.DataFrame(results)

Processing CVEs: 100%|██████████| 3/3 [00:16<00:00,  5.40s/it]


In [12]:
results

[{'CVE ID': 'CVE-2021-35559',
  'CVE Description': 'Vulnerability in the Java SE, Oracle GraalVM Enterprise Edition product of Oracle Java SE (component: Swing). Supported versions that are affected are Java SE: 7u311, 8u301, 11.0.12, 17; Oracle GraalVM Enterprise Edition: 20.3.3 and 21.2.0. Easily exploitable vulnerability allows unauthenticated attacker with network access via multiple protocols to compromise Java SE, Oracle GraalVM Enterprise Edition. Successful attacks of this vulnerability can result in unauthorized ability to cause a partial denial of service (partial DOS) of Java SE, Oracle GraalVM Enterprise Edition. Note: This vulnerability applies to Java deployments, typically in clients running sandboxed Java Web Start applications or sandboxed Java applets, that load and run untrusted code (e.g., code that comes from the internet) and rely on the Java sandbox for security. This vulnerability can also be exploited by using APIs in the specified Component, e.g., through a we

In [13]:
results_df

Unnamed: 0,CVE ID,CVE Description,Assigned CWE ID,claude_Agreement,claude_Confidence,claude_Rationale,gemini_Agreement,gemini_Confidence,gemini_Rationale
0,CVE-2021-35559,"Vulnerability in the Java SE, Oracle GraalVM E...",CWE-400,Yes,0.9,,No,0.8,The description indicates a partial Denial of ...
1,CVE-2021-26928,BIRD through 2.0.7 does not provide functional...,CWE-306,Yes,0.9,,No,0.8,While CWE-306 (Missing Authentication for Crit...
2,CVE-2021-26918,The ProBot bot through 2021-02-08 for Discord ...,CWE-434,No,0.7,The CVE description does not provide enough in...,No,0.8,While CWE-434 (Unrestricted Upload of File wit...
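
The per-model columns still need to be combined into a single consensus label. A minimal sketch of one way to do that over the `results` list of dicts follows; the `consensus` function and its label strings are assumptions for illustration, and the two example rows reuse the votes from the first and third rows of the table above.

```python
from collections import Counter

def consensus(row, models):
    """Summarize the Yes/No votes of `models` for one result row.

    Empty or malformed votes (e.g. from a failed call) are ignored.
    A single valid vote is reported as unanimous.
    """
    votes = [row.get(f"{m}_Agreement", "") for m in models]
    votes = [v for v in votes if v in ("Yes", "No")]
    if not votes:
        return "No votes"
    counts = Counter(votes)
    if len(counts) == 1:
        return f"Unanimous {votes[0]}"
    if counts["Yes"] == counts["No"]:
        return "Split"
    return "Majority Yes" if counts["Yes"] > counts["No"] else "Majority No"

models = ["claude", "gemini"]
row_0 = {"claude_Agreement": "Yes", "gemini_Agreement": "No"}   # CVE-2021-35559
row_2 = {"claude_Agreement": "No", "gemini_Agreement": "No"}    # CVE-2021-26918
labels = [consensus(row_0, models), consensus(row_2, models)]
```

With only two models every disagreement is a split; a third model would break such ties.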


In [18]:

results_df.to_csv(llm_consensus_file_path, index=False, quoting=csv.QUOTE_ALL, escapechar='\\') # safe CSV

In [19]:
results_df.claude_Agreement.value_counts()

Yes    2
No     1
Name: claude_Agreement, dtype: int64

In [20]:
results_df.gemini_Agreement.value_counts()

No    3
Name: gemini_Agreement, dtype: int64
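
The two `value_counts()` outputs show how often each model agrees with the assigned CWE, but not how often the models agree with each other. A rough sketch of a pairwise agreement rate is given below; `agreement_rate` is a hypothetical helper, and `rows` reproduces only the vote columns of the three test rows shown earlier.

```python
def agreement_rate(rows, model_a, model_b):
    """Fraction of rows where both models gave the same Yes/No vote.

    Rows where either vote is missing or malformed are skipped.
    Returns None if no row has two valid votes.
    """
    valid = ("Yes", "No")
    pairs = [
        (r.get(f"{model_a}_Agreement"), r.get(f"{model_b}_Agreement"))
        for r in rows
    ]
    pairs = [(a, b) for a, b in pairs if a in valid and b in valid]
    if not pairs:
        return None
    return sum(a == b for a, b in pairs) / len(pairs)

# Vote columns of the three test rows above
rows = [
    {"claude_Agreement": "Yes", "gemini_Agreement": "No"},
    {"claude_Agreement": "Yes", "gemini_Agreement": "No"},
    {"claude_Agreement": "No", "gemini_Agreement": "No"},
]
rate = agreement_rate(rows, "claude", "gemini")
```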

In [21]:
#results_df['gpt-4o_Agreement'].value_counts()  # column name follows the "gpt-4o" key in llms