# Exploratory Data Analysis (EDA): CVE Vulnerability Dataset

**Purpose:** Evaluate structure, quality, and readiness for BI dashboards and CVSS version comparisons.

## Contents

1. [Overview](#overview)
2. [Dataset Description](#dataset-description)
3. [Data Dictionary](#data-dictionary)
4. [Objectives](#objectives)
5. [Expected Outcomes](#expected-outcomes)
6. [Assumptions & Caveats](#assumptions--caveats)

## 1. Overview

This EDA examines a cybersecurity dataset of Common Vulnerabilities and Exposures (CVEs). Each row details a software vulnerability, including metadata, descriptions, affected products, and CVSS scores. Focus areas: data quality, nested field normalization, and CVSS comparisons (v2.0, 3.x, 4.0).

## 2. Dataset Description

**Source:** Aggregated from CVE feeds (e.g., CVEFeed.io, MITRE, NVD) via internal ETL. Sample includes identifiers, text fields, timestamps, exploitation flags, nested products, and CVSS objects.

<details>
<summary>Sample Rows (Truncated)</summary>

```
cve_id,title,description,published_date,last_modified,remotely_exploit,source,category,affected_products,cvss_scores,url,loaded_at
CVE-2025-11608,code-projects E-Banking System POST Parameter ...,A security vulnerability has been detected in ...,Oct. 11, 2025, 5:15 p.m.,Oct. 11, 2025, 5:15 p.m.,Yes !,cve@mitre.org,Injection,[],
[{"score": "7.5", "version": "CVSS 2.0", "seve...],https://cvefeed.io/vuln/detail/CVE-2025-11608,2025-10-12 17:22:01.863992+00:00

CVE-1999-0095,Sendmail Command Injection Vulnerability,The following products are affected by...,Oct. 1, 1988, 4 a.m.,April 3, 2025, 1:03 a.m.,Yes !,cve@mitre.org,,
[{"id": "1", "vendor": "Eric_allman", "product"...],[{"score": "10", "version": "CVSS 2.0", "sever...],https://cvefeed.io/vuln/detail/CVE-1999-0095,2025-10-12 17:22:01.863992+00:00
```

</details>

## 3. Data Dictionary

| Column            | Description                          | Type            | Notes                          |
|-------------------|--------------------------------------|-----------------|--------------------------------|
| `cve_id`          | Global CVE identifier                | string          | Primary key                    |
| `title`           | Short summary                        | string          | May be truncated               |
| `description`     | Detailed description                 | string          | Contains product hints         |
| `published_date`  | Initial disclosure timestamp         | datetime string | Normalize to UTC               |
| `last_modified`   | Last update timestamp                | datetime string | May differ from publish        |
| `remotely_exploit`| Remote exploitation flag             | string/bool     | Normalize values like "Yes !"  |
| `source`          | Origin/maintainer                    | string          | Email/domain format            |
| `category`        | Vulnerability type                   | string (nullable)| Map to taxonomy if missing     |
| `affected_products`| Affected vendors/products list      | JSON list (string)| Parse and normalize            |
| `cvss_scores`     | CVSS objects list (score, version, etc.) | JSON list (string)| Explode to long format         |
| `url`             | CVE detail link                      | string (URL)    | For verification               |
| `loaded_at`       | ETL load timestamp                   | datetime (UTC)  | For freshness checks           |

## 4. Objectives

1. Check data quality: missing values, inconsistencies, duplicates.
2. Normalize nested data: Parse and expand `cvss_scores`, `affected_products`.
3. Analyze CVSS: Distributions, severity trends, version differences.
4. Profile landscape: Categories, sources, vendors.
5. Prep for BI: Tidy tables with stable keys for dashboards.

## 5. Expected Outcomes

- Clean, normalized tables for analysis and BI.
- Long-format CVSS data (one row per CVE-version).
- Stats and visuals for trends.
- Notes on assumptions and limitations.

## 6. Assumptions & Caveats

- **Datetimes:** Convert heterogeneous formats to UTC.
- **Flags:** Normalize values such as `"Yes !"` to booleans.
- **CVSS:** Multiple versions per CVE; parse vectors for metrics.
- **Categories:** Infer missing ones via text if needed.

### Step 1 — Import Libraries and Configuration

In [90]:

# --- Data manipulation and analysis
import pandas as pd
import numpy as np

# --- Visualization
import matplotlib.pyplot as plt
import seaborn as sns

# --- Text processing
import re
import string

# --- Preprocessing and machine-learning utilities
from sklearn.preprocessing import StandardScaler, LabelEncoder
from sklearn.feature_extraction.text import TfidfVectorizer

# --- Date and time
from datetime import datetime, timedelta

# --- pandas display options
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', 50)
pd.set_option('display.width', 120)
pd.set_option('display.float_format', '{:.2f}'.format)

# --- Plot style
sns.set_theme(style="whitegrid")
plt.rcParams['figure.figsize'] = (10, 5)
plt.rcParams['axes.titlesize'] = 13
plt.rcParams['axes.labelsize'] = 11

from sqlalchemy import create_engine

DB_USER = "postgres"
DB_PASS = "tip_pwd"
DB_HOST = "localhost"
DB_PORT = "5432"
DB_NAME = "tip"

engine = create_engine(f"postgresql+psycopg2://{DB_USER}:{DB_PASS}@{DB_HOST}:{DB_PORT}/{DB_NAME}")
df = pd.read_sql("SELECT * FROM raw.cve_details;", engine)
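
The connection settings above are hardcoded in the notebook, including the password. A minimal sketch of one alternative is to read them from environment variables. The `TIP_DB_*` variable names and the `build_pg_url` helper are hypothetical and not part of the original pipeline; the defaults mirror the values used above.

```python
import os
from urllib.parse import quote_plus


def build_pg_url(env=os.environ):
    """Assemble a PostgreSQL connection URL from environment variables.

    The TIP_DB_* names are illustrative assumptions, not an existing convention.
    """
    user = env.get("TIP_DB_USER", "postgres")
    password = env.get("TIP_DB_PASS", "")
    host = env.get("TIP_DB_HOST", "localhost")
    port = env.get("TIP_DB_PORT", "5432")
    name = env.get("TIP_DB_NAME", "tip")
    # quote_plus escapes characters such as '@' or '/' that would break the URL
    return (
        f"postgresql+psycopg2://{quote_plus(user)}:{quote_plus(password)}"
        f"@{host}:{port}/{name}"
    )


# Example with an explicit mapping instead of the real environment
print(build_pg_url({"TIP_DB_PASS": "p@ss/word"}))
```

The resulting string can be passed to `create_engine` in place of the f-string above, which keeps credentials out of the notebook file.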

### Step 2 — Loading and Initial Inspection of the Dataset

In [91]:
query_dims = """
WITH
rows_count AS (
  SELECT COUNT(*)::bigint AS rows
  FROM raw.cve_details
),
cols_count AS (
  SELECT COUNT(*)::int AS cols
  FROM information_schema.columns
  WHERE table_schema = 'raw'
    AND table_name   = 'cve_details'
)
SELECT rows_count.rows, cols_count.cols
FROM rows_count, cols_count;
"""

dims = pd.read_sql(query_dims, engine).iloc[0]
n_rows, n_cols = int(dims["rows"]), int(dims["cols"])

print("✅ Dataset loaded successfully!")
print(f"Dataset dimensions: {n_rows} rows × {n_cols} columns\n")

✅ Dataset loaded successfully!
Dataset dimensions: 4576 rows × 12 columns



In [92]:
# Preview of the first rows
display(df.head(5))

Unnamed: 0,cve_id,title,description,published_date,last_modified,remotely_exploit,source,category,affected_products,cvss_scores,url,loaded_at
0,CVE-2025-11608,code-projects E-Banking System POST Parameter ...,A security vulnerability has been detected in ...,"Oct. 11, 2025, 5:15 p.m.","Oct. 11, 2025, 5:15 p.m.",Yes !,[email protected],Injection,[],"[{""score"": ""7.5"", ""version"": ""CVSS 2.0"", ""seve...",https://cvefeed.io/vuln/detail/CVE-2025-11608,2025-10-12 18:54:42.586544+00:00
1,CVE-1999-0095,Sendmail Command Injection Vulnerability,The following products are affected byCVE-1999...,"Oct. 1, 1988, 4 a.m.","April 3, 2025, 1:03 a.m.",Yes !,[email protected],,"[{""id"": ""1"", ""vendor"": ""Eric_allman"", ""product...","[{""score"": ""10"", ""version"": ""CVSS 2.0"", ""sever...",https://cvefeed.io/vuln/detail/CVE-1999-0095,2025-10-12 18:54:42.586544+00:00
2,CVE-1999-0082,Tenable FTP Server Command Injection Vulnerabi...,The following products are affected byCVE-1999...,"Nov. 11, 1988, 5 a.m.","April 3, 2025, 1:03 a.m.",Yes !,[email protected],,"[{""id"": ""1"", ""vendor"": ""Ftp"", ""product"": ""ftp""...","[{""score"": ""10"", ""version"": ""CVSS 2.0"", ""sever...",https://cvefeed.io/vuln/detail/CVE-1999-0082,2025-10-12 18:54:42.586544+00:00
3,CVE-1999-1471,"""BSD Passwd Buffer Overflow Root Privilege Esc...",The following products are affected byCVE-1999...,"Jan. 1, 1989, 5 a.m.","April 3, 2025, 1:03 a.m.",No,[email protected],,"[{""id"": ""1"", ""vendor"": ""Bsd"", ""product"": ""bsd""}]","[{""score"": ""7.2"", ""version"": ""CVSS 2.0"", ""seve...",https://cvefeed.io/vuln/detail/CVE-1999-1471,2025-10-12 18:54:42.586544+00:00
4,CVE-1999-1122,SunOS Restore Privilege Escalation Vulnerability,Vulnerability in restore in SunOS 4.0.3 and ea...,"July 26, 1989, 4 a.m.","April 3, 2025, 1:03 a.m.",No,[email protected],,"[{""id"": ""1"", ""vendor"": ""Sun"", ""product"": ""suno...","[{""score"": ""4.6"", ""version"": ""CVSS 2.0"", ""seve...",https://cvefeed.io/vuln/detail/CVE-1999-1122,2025-10-12 18:54:42.586544+00:00


In [93]:
# General information about the columns
print("Column type information:")
df.info()

Column type information:
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 4576 entries, 0 to 4575
Data columns (total 12 columns):
 #   Column             Non-Null Count  Dtype              
---  ------             --------------  -----              
 0   cve_id             4576 non-null   object             
 1   title              4576 non-null   object             
 2   description        4576 non-null   object             
 3   published_date     4576 non-null   object             
 4   last_modified      4576 non-null   object             
 5   remotely_exploit   4576 non-null   object             
 6   source             4576 non-null   object             
 7   category           4576 non-null   object             
 8   affected_products  4576 non-null   object             
 9   cvss_scores        4576 non-null   object             
 10  url                4576 non-null   object             
 11  loaded_at          4576 non-null   datetime64[ns, UTC]
dtypes: date

### Step 3 — Basic Cleaning and Type Conversions

#### 3.1 — Date Conversion

The columns **`published_date`** and **`last_modified`** load as *object* (string) columns and mix several textual formats (for example `Oct. 11, 2025, 5:15 p.m.` and `July 26, 1989, 4 a.m.`).  
They are converted to **datetime** by first normalizing the strings and then trying a chain of parsers.

**Post-Conversion Notes:**
- Any unparseable dates become **NaT** → flag them for review.
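
As a rough illustration of the flagging step, the sketch below parses a small hand-made series with `errors="coerce"` and marks only those rows where a non-null raw value failed to parse. The sample values and the `needs_review` column name are assumptions made for the example.

```python
import pandas as pd

# Hypothetical raw values: one valid timestamp, one garbage string, one missing value
raw = pd.Series(["2025-10-11 17:15:00", "not a date", None], name="published_date")
parsed = pd.to_datetime(raw, errors="coerce")

review = pd.DataFrame({
    "raw_value": raw,
    "parsed": parsed,
    # True only when a non-null raw string could not be parsed
    "needs_review": parsed.isna() & raw.notna(),
})
print(review)
```

Separating "missing in the source" from "present but unparseable" keeps the review list focused on genuine format problems.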

In [94]:
def normalize_date_str(s):
    """Normalize common variations before parsing:
       - map None/NaN to None
       - rewrite 'a.m.' / 'p.m.' / 'am' / 'pm' variants as 'AM' / 'PM'
       - drop the dot after a month abbreviation (e.g. 'Oct.' -> 'Oct')
       - collapse doubled dots and normalize spacing after commas
    """
    if pd.isna(s):
        return None
    s = str(s).strip()

    # Normalize AM/PM variants to 'AM' / 'PM'. The optional trailing dot is
    # matched after the word boundary so that 'p.m.' becomes 'PM', not 'PM.'
    # (a leftover dot would make every explicit format below fail).
    s = re.sub(r'\b([ap])\.?m\b\.?', lambda m: m.group(1).upper() + 'M', s, flags=re.IGNORECASE)

    # Remove dot after 3-letter month abbreviations like 'Oct.' -> 'Oct'
    # only if it's followed by space and digit (month dot used only there)
    s = re.sub(r'([A-Za-z]{3})\.(?=\s+\d)', r'\1', s)

    # Collapse double dots if any remain (conservative cleanup)
    s = s.replace('..', '.')

    # Normalize commas/spaces: ensure one space after comma
    s = re.sub(r',\s*', ', ', s)

    # Examples of resulting forms:
    # "Oct 11, 2025, 5:15 PM", "Nov 11, 1988, 5 AM", "July 26, 1989, 4 AM"
    return s

from dateutil import parser  # flexible fallback used in strategy 3

def try_parse_date(s):
    """Try several parsing strategies; return a pd.Timestamp or NaT."""
    if s is None:
        return pd.NaT

    # 1) Try common explicit formats (fast)
    formats = [
        "%b %d, %Y, %I:%M %p",   # "Oct 11, 2025, 5:15 PM"
        "%b %d, %Y, %I %p",      # "Nov 11, 1988, 5 PM" (no minutes)
        "%B %d, %Y, %I:%M %p",   # "July 26, 1989, 4:00 AM" (full month)
        "%B %d, %Y, %I %p",      # "July 26, 1989, 4 AM"
        "%Y-%m-%dT%H:%M:%S.%f",  # ISO-ish (if present)
        "%Y-%m-%d %H:%M:%S",     # fallback ISO/no-T
    ]
    for fmt in formats:
        try:
            return pd.to_datetime(s, format=fmt, errors='raise')
        except (ValueError, TypeError):
            pass

    # 2) Let pandas infer the format. The deprecated `infer_datetime_format`
    #    argument is omitted; recent pandas versions infer by default.
    try:
        return pd.to_datetime(s, errors='raise')
    except (ValueError, TypeError):
        pass

    # 3) Last fallback: direct dateutil parsing (most flexible)
    try:
        return pd.Timestamp(parser.parse(s))
    except (ValueError, TypeError, OverflowError):
        return pd.NaT

# Apply to the DataFrame, keeping the raw strings so failures can be inspected
raw_dates = df[["published_date", "last_modified"]].copy()

for col in ["published_date", "last_modified"]:
    # 1) Normalize strings, then 2) parse with the fallback chain
    parsed = df[col].apply(normalize_date_str).apply(try_parse_date)

    # 3) Assign back as datetime dtype
    df[col] = pd.to_datetime(parsed, errors='coerce')

# Quick checks
print("Dtypes:")
print(df[["published_date", "last_modified"]].dtypes)
print("\nHow many missing after parse?")
print(df["published_date"].isna().sum(), "published_date NaT")
print(df["last_modified"].isna().sum(), "last_modified NaT")

# Show the original raw strings of rows that still failed to parse
failed_pub = raw_dates.loc[df["published_date"].isna(), "published_date"].head(10)
if len(failed_pub) > 0:
    print("\nSample rows with published_date still NaT (show original raw strings for debugging):")
    print(failed_pub)



Dtypes:
published_date    datetime64[ns]
last_modified     datetime64[ns]
dtype: object

How many missing after parse?
0 published_date NaT
0 last_modified NaT


In [95]:
df[['published_date', 'last_modified']].head(5)

Unnamed: 0,published_date,last_modified
0,2025-10-11 17:15:00,2025-10-11 17:15:00
1,1988-10-01 04:00:00,2025-04-03 01:03:00
2,1988-11-11 05:00:00,2025-04-03 01:03:00
3,1989-01-01 05:00:00,2025-04-03 01:03:00
4,1989-07-26 04:00:00,2025-04-03 01:03:00
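
The parsed columns are timezone-naive (`datetime64[ns]`), whereas the data dictionary asks for UTC. A minimal sketch of attaching the timezone follows; it assumes the feed's wall-clock times are already expressed in UTC, which the source does not confirm.

```python
import pandas as pd

# Two timestamps taken from the preview above
naive = pd.Series(pd.to_datetime(["2025-10-11 17:15:00", "1988-10-01 04:00:00"]))

# Assumption: the values are already UTC wall-clock times, so we only attach
# the timezone label (tz_localize) rather than shifting the clock (tz_convert).
aware = naive.dt.tz_localize("UTC")
print(aware.dt.tz)
```

If the feed turns out to publish local times instead, the same call with the appropriate zone name followed by `.dt.tz_convert("UTC")` would be the equivalent step.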


In [96]:
# --- Normalize `loaded_at` ----------------------------------------------------
# Convert timezone-aware timestamps like '2025-10-12 17:56:02.356745+00:00'
# into simple UTC-naive format: '2025-10-12 17:56:02'

df['loaded_at'] = (
    pd.to_datetime(df['loaded_at'], utc=True, errors='coerce')  # ensure datetime & UTC
      .dt.tz_convert(None)                                      # drop timezone info
      .dt.strftime('%Y-%m-%d %H:%M:%S')                         # uniform format
)

# Quick check
print(df['loaded_at'].head(10))
print(df['loaded_at'].dtype)

0    2025-10-12 18:54:42
1    2025-10-12 18:54:42
2    2025-10-12 18:54:42
3    2025-10-12 18:54:42
4    2025-10-12 18:54:42
5    2025-10-12 18:54:42
6    2025-10-12 18:54:42
7    2025-10-12 18:54:42
8    2025-10-12 18:54:42
9    2025-10-12 18:54:42
Name: loaded_at, dtype: object
object


#### 3.2 — Dropping Unneeded Columns and Checking for Duplicates

Since the **`url`** column can be reconstructed from `cve_id` (it follows a simple pattern:  
`https://cvefeed.io/vuln/detail/{cve_id}`), we can safely drop it from the dataset.


In [97]:
df.drop(columns=["url"], inplace=True, errors='ignore')
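
Should the link be needed again (for example as a drill-through in a dashboard), it can be rebuilt from the identifier. The helper name `cve_url` is introduced here for illustration only.

```python
def cve_url(cve_id: str) -> str:
    """Rebuild the cvefeed.io detail link from a CVE identifier."""
    return f"https://cvefeed.io/vuln/detail/{cve_id}"


# Identifiers taken from the sample rows
for cve in ["CVE-2025-11608", "CVE-1999-0095"]:
    print(cve_url(cve))
```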

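Next, check for **duplicate rows** to ensure data consistency and avoid redundant analyses.  
Note that `df.duplicated()` only flags rows that are identical in every column; rows that share a `cve_id` but differ elsewhere are not detected by this check.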

In [98]:
# checking for duplicates
df.duplicated().any()

np.False_
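
Because `cve_id` is meant to be the primary key, a key-level check is a useful complement to the full-row check. The sketch below uses a small made-up frame in which two rows share an identifier but differ in another column.

```python
import pandas as pd

# Hypothetical sample: rows 0 and 2 share a cve_id but are not identical
sample = pd.DataFrame({
    "cve_id": ["CVE-1999-0095", "CVE-1999-0082", "CVE-1999-0095"],
    "title": ["A", "B", "C"],
})

# Full-row check: misses the repeated key because the titles differ
print(sample.duplicated().any())

# Key-level check: keep=False marks every row involved in a duplicate
dup_mask = sample.duplicated(subset="cve_id", keep=False)
print(sample.loc[dup_mask])
```

Applied to the real data, `df.duplicated(subset="cve_id", keep=False)` would return the rows to inspect before choosing which version to keep.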

#### 3.3 — Normalizing the `remotely_exploit` Column

This column should hold booleans (`True` or `False`). Let's check which values it actually contains.

In [99]:
# Get all unique values in a column
unique_values = df['remotely_exploit'].unique()
print(unique_values)

['Yes !' 'No']


In [101]:
# Convert 'Yes !' to True and 'No' to False, leave existing True/False as is
df['remotely_exploit'] = df['remotely_exploit'].apply(
    lambda x: True if x == 'Yes !' else (False if x == 'No' else x)
)

# Check the result
print(df['remotely_exploit'].head())

0     True
1     True
2     True
3    False
4    False
Name: remotely_exploit, dtype: bool
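
The lambda above passes unknown values through unchanged, which would silently leave the column as mixed types if a new spelling appeared in a later load. A slightly more defensive sketch maps anything unrecognized to `None` so it can be counted; `FLAG_MAP` and `normalize_flag` are hypothetical names.

```python
FLAG_MAP = {"yes": True, "no": False}


def normalize_flag(value):
    """Return True/False for known flag spellings, None for anything else."""
    if isinstance(value, bool):
        return value
    if not isinstance(value, str):
        return None
    # "Yes !" -> "yes": trim, drop a trailing "!", trim again, lowercase
    key = value.strip().rstrip("!").strip().lower()
    return FLAG_MAP.get(key)


for raw in ["Yes !", "No", "yes", True, "unknown", None]:
    print(repr(raw), "->", normalize_flag(raw))
```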


In [114]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 4622 entries, 0 to 4621
Data columns (total 52 columns):
 #   Column                     Non-Null Count  Dtype         
---  ------                     --------------  -----         
 0   cve_id                     4622 non-null   object        
 1   title                      4622 non-null   object        
 2   description                4622 non-null   object        
 3   published_date             4622 non-null   datetime64[ns]
 4   last_modified              4622 non-null   datetime64[ns]
 5   remotely_exploit           4622 non-null   bool          
 6   source                     4622 non-null   object        
 7   category                   4622 non-null   object        
 8   loaded_at                  4622 non-null   object        
 9   cvss_score                 4622 non-null   float64       
 10  cvss_version               4622 non-null   object        
 11  cvss_severity              4622 non-null   object        
 12  cvss_v

### Step 4 — CVSS Data Extraction and Transformation

#### 4.1 — ⚙️ Preparing Multi-Version CVSS Data

The **`cvss_scores`** column contains a list of CVSS score objects for each CVE entry.  
A typical schema looks like this:

```json
[
  {
    "score": "7.5",
    "version": "CVSS 2.0",
    "severity": "HIGH",
    "vector": "AV:N/AC:L/Au:N/C:P/I:P/A:P",
    "exploitability_score": "10.0",
    "impact_score": "6.4",
    "source": "[email protected]"
  },
  {
    "score": "7.3",
    "version": "CVSS 3.1",
    "severity": "HIGH",
    "vector": "CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:L/I:L/A:L",
    "exploitability_score": "3.9",
    "impact_score": "3.4",
    "source": "[email protected]"
  },
  {
    "score": "6.9",
    "version": "CVSS 4.0",
    "severity": "MEDIUM",
    "vector": "CVSS:4.0/AV:N/AC:L/AT:N/PR:N/UI:N/VC:L/VI:L/VA:L/SC:N/SI:N/SA:N/..."
  }
]
```

For CVEs that contain **multiple CVSS versions** (e.g., _2.0_, _3.1_, _4.0_),  
we will **duplicate the corresponding rows** so that **each row represents a single CVSS entry** —  
with its specific **score**, **version**, and **vector**.

This normalized structure will make it easier to **analyze**, **compare**, and **visualize** different CVSS versions later during the dashboard or analytical phase.

Before performing this operation, we first **remove all rows that do not contain any CVSS data**, that is, rows whose `cvss_scores` value is null or an empty list (`[]`).  
Such a CVE has no score, version, or vector, which means that **no vulnerability scoring information is available**,  
and other related fields are likely missing as well.

➡️ These rows are dropped so that every remaining CVE has at least one CVSS entry.

In [102]:
# 1️⃣ Count rows without CVSS scores (NaN or empty list)
missing_count = df["cvss_scores"].isna().sum() + (df["cvss_scores"].str.strip() == "[]").sum()
print(f"Number of rows without CVSS scores: {missing_count}")

# 2️⃣ Drop those rows directly from df
df.drop(df[df["cvss_scores"].isna() | (df["cvss_scores"].str.strip() == "[]")].index, inplace=True)

# 3️⃣ Verification
print(f"Remaining rows after drop: {len(df)}")


Number of rows without CVSS scores: 40
Remaining rows after drop: 4536


In [103]:
df.head(5)

Unnamed: 0,cve_id,title,description,published_date,last_modified,remotely_exploit,source,category,affected_products,cvss_scores,loaded_at
0,CVE-2025-11608,code-projects E-Banking System POST Parameter ...,A security vulnerability has been detected in ...,2025-10-11 17:15:00,2025-10-11 17:15:00,True,[email protected],Injection,[],"[{""score"": ""7.5"", ""version"": ""CVSS 2.0"", ""seve...",2025-10-12 18:54:42
1,CVE-1999-0095,Sendmail Command Injection Vulnerability,The following products are affected byCVE-1999...,1988-10-01 04:00:00,2025-04-03 01:03:00,True,[email protected],,"[{""id"": ""1"", ""vendor"": ""Eric_allman"", ""product...","[{""score"": ""10"", ""version"": ""CVSS 2.0"", ""sever...",2025-10-12 18:54:42
2,CVE-1999-0082,Tenable FTP Server Command Injection Vulnerabi...,The following products are affected byCVE-1999...,1988-11-11 05:00:00,2025-04-03 01:03:00,True,[email protected],,"[{""id"": ""1"", ""vendor"": ""Ftp"", ""product"": ""ftp""...","[{""score"": ""10"", ""version"": ""CVSS 2.0"", ""sever...",2025-10-12 18:54:42
3,CVE-1999-1471,"""BSD Passwd Buffer Overflow Root Privilege Esc...",The following products are affected byCVE-1999...,1989-01-01 05:00:00,2025-04-03 01:03:00,False,[email protected],,"[{""id"": ""1"", ""vendor"": ""Bsd"", ""product"": ""bsd""}]","[{""score"": ""7.2"", ""version"": ""CVSS 2.0"", ""seve...",2025-10-12 18:54:42
4,CVE-1999-1122,SunOS Restore Privilege Escalation Vulnerability,Vulnerability in restore in SunOS 4.0.3 and ea...,1989-07-26 04:00:00,2025-04-03 01:03:00,False,[email protected],,"[{""id"": ""1"", ""vendor"": ""Sun"", ""product"": ""suno...","[{""score"": ""4.6"", ""version"": ""CVSS 2.0"", ""seve...",2025-10-12 18:54:42


In [104]:
import pandas as pd
import json

def extract_cvss_scores(df):
    """
    Extract and normalize CVSS scores from the cvss_scores column.
    Creates one row per CVSS version for each CVE.
    
    Parameters:
    -----------
    df : pandas.DataFrame
        DataFrame containing a 'cvss_scores' column with CVSS data
        
    Returns:
    --------
    pandas.DataFrame
        Normalized DataFrame with one row per CVSS score entry
    """
    
    # List to store expanded rows
    expanded_rows = []
    
    # Iterate through each CVE record
    for idx, row in df.iterrows():
        cvss_scores = row['cvss_scores']
        
        # Handle cases where cvss_scores might be None or empty
        if not cvss_scores or cvss_scores == '[]' or pd.isna(cvss_scores):
            # Keep the row but with null CVSS data
            row_dict = row.to_dict()
            row_dict.update({
                'cvss_score': None,
                'cvss_version': None,
                'cvss_severity': None,
                'cvss_vector': None,
                'cvss_exploitability_score': None,
                'cvss_impact_score': None,
                'cvss_source': None
            })
            expanded_rows.append(row_dict)
            continue
        
        # Parse JSON if it's a string
        if isinstance(cvss_scores, str):
            try:
                cvss_scores = json.loads(cvss_scores)
            except json.JSONDecodeError:
                # If parsing fails, skip or handle gracefully
                row_dict = row.to_dict()
                row_dict.update({
                    'cvss_score': None,
                    'cvss_version': None,
                    'cvss_severity': None,
                    'cvss_vector': None,
                    'cvss_exploitability_score': None,
                    'cvss_impact_score': None,
                    'cvss_source': None
                })
                expanded_rows.append(row_dict)
                continue
        
        # For each CVSS score entry, create a new row
        for cvss_entry in cvss_scores:
            row_dict = row.to_dict()
            
            # Extract CVSS-specific fields
            row_dict['cvss_score'] = cvss_entry.get('score')
            row_dict['cvss_version'] = cvss_entry.get('version')
            row_dict['cvss_severity'] = cvss_entry.get('severity')
            row_dict['cvss_vector'] = cvss_entry.get('vector')
            row_dict['cvss_exploitability_score'] = cvss_entry.get('exploitability_score')
            row_dict['cvss_impact_score'] = cvss_entry.get('impact_score')
            row_dict['cvss_source'] = cvss_entry.get('source')
            
            expanded_rows.append(row_dict)
    
    # Create new DataFrame from expanded rows
    df_expanded = pd.DataFrame(expanded_rows)
    
    # Drop the original cvss_scores column
    if 'cvss_scores' in df_expanded.columns:
        df_expanded = df_expanded.drop('cvss_scores', axis=1)
    
    # Convert numeric columns to appropriate types
    numeric_cols = ['cvss_score', 'cvss_exploitability_score', 'cvss_impact_score']
    for col in numeric_cols:
        if col in df_expanded.columns:
            df_expanded[col] = pd.to_numeric(df_expanded[col], errors='coerce')
    
    return df_expanded


def analyze_cvss_versions(df_expanded):
    """
    Analyze the distribution of CVSS versions in the normalized dataset.
    
    Parameters:
    -----------
    df_expanded : pandas.DataFrame
        Normalized DataFrame with CVSS data
        
    Returns:
    --------
    pandas.DataFrame
        Summary statistics by CVSS version
    """
    
    version_summary = df_expanded.groupby('cvss_version').agg({
        'cve_id': 'count',
        'cvss_score': ['mean', 'median', 'min', 'max'],
        'cvss_severity': lambda x: x.value_counts().to_dict()
    }).round(2)
    
    version_summary.columns = ['_'.join(col).strip() for col in version_summary.columns]
    version_summary = version_summary.rename(columns={'cve_id_count': 'total_entries'})
    
    return version_summary


# Example usage:
# ===============

# Assuming you have your DataFrame 'df' already loaded
# Extract and normalize CVSS scores - ASSIGN THE RESULT!
df = extract_cvss_scores(df)
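
For comparison, the same one-row-per-CVSS-entry reshaping can be sketched with pandas' own `explode` and `json_normalize`. This is an alternative under simplifying assumptions (every `cvss_scores` value is a valid, non-empty JSON list), not the function used in this notebook; the two-row sample is hand-made.

```python
import json

import pandas as pd

# Hypothetical sample with the same shape as the cvss_scores column
sample = pd.DataFrame({
    "cve_id": ["CVE-A", "CVE-B"],
    "cvss_scores": [
        '[{"score": "7.5", "version": "CVSS 2.0"}, {"score": "7.3", "version": "CVSS 3.1"}]',
        '[{"score": "10", "version": "CVSS 2.0"}]',
    ],
})

# Parse each JSON string into a list of dicts, then emit one row per dict
parsed = (
    sample.assign(cvss=sample["cvss_scores"].map(json.loads))
          .explode("cvss", ignore_index=True)
)

# Turn the dicts into columns and prefix them like the function above does
fields = pd.json_normalize(parsed["cvss"].tolist()).add_prefix("cvss_")
long_df = pd.concat([parsed[["cve_id"]], fields], axis=1)
long_df["cvss_score"] = pd.to_numeric(long_df["cvss_score"], errors="coerce")
print(long_df)
```

The row-by-row function above handles malformed JSON and empty lists explicitly, which this shorter version does not.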

In [105]:
df.head(3)

Unnamed: 0,cve_id,title,description,published_date,last_modified,remotely_exploit,source,category,affected_products,loaded_at,cvss_score,cvss_version,cvss_severity,cvss_vector,cvss_exploitability_score,cvss_impact_score,cvss_source
0,CVE-2025-11608,code-projects E-Banking System POST Parameter ...,A security vulnerability has been detected in ...,2025-10-11 17:15:00,2025-10-11 17:15:00,True,[email protected],Injection,[],2025-10-12 18:54:42,7.5,CVSS 2.0,HIGH,AV:N/AC:L/Au:N/C:P/I:P/A:P,10.0,6.4,[email protected]
1,CVE-2025-11608,code-projects E-Banking System POST Parameter ...,A security vulnerability has been detected in ...,2025-10-11 17:15:00,2025-10-11 17:15:00,True,[email protected],Injection,[],2025-10-12 18:54:42,7.3,CVSS 3.1,HIGH,CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:L/I:L/A:L,3.9,3.4,[email protected]
2,CVE-2025-11608,code-projects E-Banking System POST Parameter ...,A security vulnerability has been detected in ...,2025-10-11 17:15:00,2025-10-11 17:15:00,True,[email protected],Injection,[],2025-10-12 18:54:42,6.9,CVSS 4.0,MEDIUM,CVSS:4.0/AV:N/AC:L/AT:N/PR:N/UI:N/VC:L/VI:L/VA...,,,[email protected]


We can notice one thing: for entries where the CVSS version is **4.0**, the fields **`cvss_exploitability_score`** and **`cvss_impact_score`** are empty.

In [106]:
df[df["cvss_version"] == "CVSS 4.0"][["cve_id", "cvss_version", "cvss_severity",   "cvss_vector" ,   "cvss_exploitability_score", "cvss_impact_score"]].head(5)

Unnamed: 0,cve_id,cvss_version,cvss_severity,cvss_vector,cvss_exploitability_score,cvss_impact_score
2,CVE-2025-11608,CVSS 4.0,MEDIUM,CVSS:4.0/AV:N/AC:L/AT:N/PR:N/UI:N/VC:L/VI:L/VA...,,


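## Why `cvss_exploitability_score` Is Empty for CVSS 4.0

CVSS 4.0 no longer defines a separate “Exploitability” sub-score (nor a separate “Impact” sub-score).

In earlier versions of CVSS (v2 and v3.x), the base score was computed from these two sub-scores:

|CVSS Version|Score Structure|
|---|---|
|CVSS 2.0|Base Score = round((0.6 × Impact + 0.4 × Exploitability − 1.5) × f(Impact)), where f(Impact) = 0 if Impact = 0 and 1.176 otherwise|
|CVSS 3.0 / 3.1|Base Score = Roundup(min(Impact + Exploitability, 10)) when Scope is unchanged; the sum is multiplied by 1.08 when Scope is changed|
|CVSS 4.0|No Exploitability or Impact sub-score; the score is derived from groupings of metric values (MacroVectors)|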

In CVSS 4.0, the exploitability metrics (such as Attack Vector, Attack Complexity, Privileges Required, etc.) still exist,  
but they are integrated directly into the overall formula, rather than being summarized in a separate `exploitability_score` field.
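
To make the role of the two sub-scores concrete, the sketch below recombines them into a base score for CVSS 2.0 and CVSS 3.1, following the published base equations. The function names are illustrative. The inputs are the rounded sub-scores shown for CVE-2025-11608 in the schema above, so edge cases could differ slightly from a score computed from the raw vector.

```python
import math


def roundup_v3(value: float) -> float:
    """CVSS 3.1 Roundup: smallest one-decimal number >= value (float-safe)."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score_v2(impact: float, exploitability: float) -> float:
    """CVSS 2.0 base equation from the two sub-scores."""
    f_impact = 0.0 if impact == 0 else 1.176
    return round((0.6 * impact + 0.4 * exploitability - 1.5) * f_impact, 1)


def base_score_v3(impact: float, exploitability: float,
                  scope_changed: bool = False) -> float:
    """CVSS 3.x base equation from the two sub-scores."""
    if impact <= 0:
        return 0.0
    total = impact + exploitability
    if scope_changed:
        total *= 1.08
    return roundup_v3(min(total, 10))


print(base_score_v2(6.4, 10.0))   # sub-scores of the CVSS 2.0 entry
print(base_score_v3(3.4, 3.9))    # sub-scores of the CVSS 3.1 entry (S:U)
```

Both calls reproduce the scores listed in the sample (7.5 and 7.3), which is a quick consistency check that only applies to 2.0 and 3.x rows.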

## About Missing Exploitability and Impact Scores in CVSS 4.0

For now, we will **leave the `exploitability_score` and `impact_score` fields empty** for CVSS 4.0 entries.

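However, it is technically possible to **approximate** these values using weighted metrics extracted from the CVSS vector.  
The weights below are purely illustrative (they are not part of the CVSS specification), and the `*_score` columns are assumed to be numeric encodings of the parsed vector metrics, which do not exist in the DataFrame at this point.  
An example approach could be:

```python
df["exploitability_proxy"] = (
    df["AV_score"] * 0.3 +
    df["AC_score"] * 0.25 +
    df["PR_score"] * 0.25 +
    df["UI_score"] * 0.2
)

df["impact_proxy"] = (
    df["VC_score"] * 0.4 +
    df["VI_score"] * 0.3 +
    df["VA_score"] * 0.3
)
```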

# 💡 CVSS Versions and Metric Mappings

## Overview

**CVSS Version** refers to the version of the **CVSS standard** used to assess the severity of a vulnerability.

---

## 🔍 What is CVSS?

**CVSS (Common Vulnerability Scoring System)** is a standardized scoring system used in cybersecurity to measure the severity of vulnerabilities (CVE). It is managed by the **Forum of Incident Response and Security Teams (FIRST)**.

Each CVSS version defines:
- A mathematical formula to calculate a score from **0 to 10**
- Criteria (vectors) describing how the vulnerability can be exploited
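
The 0 to 10 score is usually reported together with a qualitative severity label. The sketch below encodes the standard rating bands: the five-level scale defined for CVSS 3.x and 4.0, and the three-level scale that NVD applies to CVSS 2.0 scores. The function name is illustrative, and the version strings follow the `"CVSS x.y"` format used in this dataset.

```python
def severity_label(score: float, version: str) -> str:
    """Map a numeric CVSS score to its qualitative severity label."""
    if not 0.0 <= score <= 10.0:
        raise ValueError(f"score out of range: {score}")

    if version == "CVSS 2.0":
        # NVD ranking for v2: Low 0.0-3.9, Medium 4.0-6.9, High 7.0-10.0
        if score < 4.0:
            return "LOW"
        if score < 7.0:
            return "MEDIUM"
        return "HIGH"

    # CVSS 3.x / 4.0 qualitative severity rating scale
    if score == 0.0:
        return "NONE"
    if score < 4.0:
        return "LOW"
    if score < 7.0:
        return "MEDIUM"
    if score < 9.0:
        return "HIGH"
    return "CRITICAL"


for score, version in [(7.5, "CVSS 2.0"), (7.3, "CVSS 3.1"), (6.9, "CVSS 4.0")]:
    print(version, score, severity_label(score, version))
```

A helper like this can be used to cross-check the `cvss_severity` column against `cvss_score` and flag rows where the two disagree.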

---

## ⚙️ Main CVSS Versions

| Version | Year | Main Characteristics |
|---------|------|---------------------|
| **CVSS 2.0** | 2007 | First widely used version; less precise for real-world exploitation contexts |
| **CVSS 3.0** | 2015 | Better distinction between exploitability and impact; introduction of the "scope" concept |
| **CVSS 3.1** | 2019 | Most widely used version; clarifies metric definitions (same formula as 3.0) |
| **CVSS 4.0** | 2023 | Next generation: adds Attack Requirements (AT), replaces Scope with separate vulnerable-system and subsequent-system impact metrics, and introduces Supplemental metrics |

---

## 🧩 CVSS Metric Mappings

Below are the **abbreviations and their meanings** for each CVSS version. These mappings are essential for parsing CVSS vectors into human-readable components.

---

### 🟦 Common Metrics (CVSS 3.x & Compatible)

```python
MAPS_COMMON = {
    "AV": {    # Attack Vector
        "N": "Network",
        "A": "Adjacent",
        "L": "Local",
        "P": "Physical"
    },
    "AC": {    # Attack Complexity
        "L": "Low",
        "H": "High"
    },
    "PR": {    # Privileges Required
        "N": "None",
        "L": "Low",
        "H": "High"
    },
    "UI": {    # User Interaction
        "N": "None",
        "R": "Required"
    },
    "S": {     # Scope
        "U": "Unchanged",
        "C": "Changed"
    },
    "C": {     # Confidentiality Impact
        "N": "None",
        "L": "Low",
        "H": "High"
    },
    "I": {     # Integrity Impact
        "N": "None",
        "L": "Low",
        "H": "High"
    },
    "A": {     # Availability Impact
        "N": "None",
        "L": "Low",
        "H": "High"
    }
}
```
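
As a small worked example of how such a mapping is applied, the sketch below decodes the CVSS 3.1 vector from the sample schema into readable labels. `V3_MAPS` is a compact copy of the mapping above so that the block runs on its own, and `decode_vector` is a hypothetical helper; keeping the raw code for unknown metrics is a design choice of this sketch.

```python
import re

# Compact copy of the CVSS 3.x mapping shown above
V3_MAPS = {
    "AV": {"N": "Network", "A": "Adjacent", "L": "Local", "P": "Physical"},
    "AC": {"L": "Low", "H": "High"},
    "PR": {"N": "None", "L": "Low", "H": "High"},
    "UI": {"N": "None", "R": "Required"},
    "S": {"U": "Unchanged", "C": "Changed"},
    "C": {"N": "None", "L": "Low", "H": "High"},
    "I": {"N": "None", "L": "Low", "H": "High"},
    "A": {"N": "None", "L": "Low", "H": "High"},
}


def decode_vector(vector: str, maps: dict) -> dict:
    """Split 'METRIC:CODE' pairs and translate each code to its label."""
    body = re.sub(r"^CVSS:\d+\.\d+/", "", vector)  # drop the version prefix
    decoded = {}
    for pair in body.split("/"):
        metric, sep, code = pair.partition(":")
        if not sep:
            continue  # skip fragments without a ':' separator
        # Unknown metrics or codes fall back to the raw code
        decoded[metric] = maps.get(metric, {}).get(code, code)
    return decoded


print(decode_vector("CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:L/I:L/A:L", V3_MAPS))
```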

---

### 🟨 CVSS 2.0 Specific Metrics

```python
MAPS_V2 = {
    "AV": {    # Access Vector (CVSS 2.0 has no "Physical" value)
        "N": "Network",
        "A": "Adjacent Network",
        "L": "Local"
    },
    "AC": {    # Access Complexity
        "L": "Low",
        "M": "Medium",
        "H": "High"
    },
    "Au": {    # Authentication
        "N": "None",
        "S": "Single",
        "M": "Multiple"
    },
    "C": {     # Confidentiality Impact
        "N": "None",
        "P": "Partial",
        "C": "Complete"
    },
    "I": {     # Integrity Impact
        "N": "None",
        "P": "Partial",
        "C": "Complete"
    },
    "A": {     # Availability Impact
        "N": "None",
        "P": "Partial",
        "C": "Complete"
    }
}
```

**CVSS 2.0 Differences:**

- Uses **Au (Authentication)** instead of PR (Privileges Required)
- Impact values use **Partial/Complete** instead of Low/High
- Less granular than CVSS 3.x versions

---

### 🟥 CVSS 4.0 Additional Metrics

```python
MAPS_V40 = {
    "AT": {    # Attack Requirements
        "N": "None",
        "P": "Present"
    },
    "UI": {    # User Interaction (values differ from CVSS 3.x)
        "N": "None",
        "P": "Passive",
        "A": "Active"
    },
    # Vulnerable System Impact
    "VC": {    # Vulnerable System - Confidentiality
        "N": "None",
        "L": "Low",
        "H": "High"
    },
    "VI": {    # Vulnerable System - Integrity
        "N": "None",
        "L": "Low",
        "H": "High"
    },
    "VA": {    # Vulnerable System - Availability
        "N": "None",
        "L": "Low",
        "H": "High"
    },
    # Subsequent System Impact
    "SC": {    # Subsequent System - Confidentiality
        "N": "None",
        "L": "Low",
        "H": "High"
    },
    "SI": {    # Subsequent System - Integrity
        "N": "None",
        "L": "Low",
        "H": "High"
    },
    "SA": {    # Subsequent System - Availability
        "N": "None",
        "L": "Low",
        "H": "High"
    }
}
```

In [107]:
import pandas as pd
import re

# Mappings CVSS
MAPS_COMMON = {
    "AV": {"N": "Network", "A": "Adjacent", "L": "Local", "P": "Physical"},
    "AC": {"L": "Low", "H": "High"},
    "PR": {"N": "None", "L": "Low", "H": "High"},
    "UI": {"N": "None", "R": "Required"},
    "S": {"U": "Unchanged", "C": "Changed"},
    "C": {"N": "None", "L": "Low", "H": "High"},
    "I": {"N": "None", "L": "Low", "H": "High"},
    "A": {"N": "None", "L": "Low", "H": "High"}
}

# CVSS 2.0 codes. "AC" is not mapped in this cell, so CVSS 2.0 rows keep
# the raw Access Complexity code (e.g. "L") in cvss_metric_AC.
MAPS_V2 = {
    "AV": {"N": "Network", "A": "Adjacent Network", "L": "Local"},
    "Au": {"N": "None", "S": "Single", "M": "Multiple"},
    "C": {"N": "None", "P": "Partial", "C": "Complete"},
    "I": {"N": "None", "P": "Partial", "C": "Complete"},
    "A": {"N": "None", "P": "Partial", "C": "Complete"}
}

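MAPS_V40 = {
    "AT": {"N": "None", "P": "Present"},
    # In CVSS 4.0, User Interaction is None/Passive/Active; this entry
    # overrides the 3.x "UI" map when the two dicts are merged.
    "UI": {"N": "None", "P": "Passive", "A": "Active"},
    "VC": {"N": "None", "L": "Low", "H": "High"},
    "VI": {"N": "None", "L": "Low", "H": "High"},
    "VA": {"N": "None", "L": "Low", "H": "High"},
    "SC": {"N": "None", "L": "Low", "H": "High"},
    "SI": {"N": "None", "L": "Low", "H": "High"},
    "SA": {"N": "None", "L": "Low", "H": "High"}
}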

def parse_cvss_vector(vector_str, version):
    """
    Parse a CVSS vector string and return a dict of decoded metrics.
    """
    # Non-string input (NaN, None, lists, ...) yields no metrics
    if not isinstance(vector_str, str):
        return {}

    metrics = {}

    # Pick the mappings that apply to this CVSS version
    if version == "CVSS 2.0":
        maps = MAPS_V2
    elif version == "CVSS 4.0":
        maps = {**MAPS_COMMON, **MAPS_V40}
    else:
        # CVSS 3.0 / 3.1, and the fallback for unknown versions
        maps = MAPS_COMMON

    # Strip the version prefix (e.g. "CVSS:3.1/")
    vector_str = re.sub(r'^CVSS:\d+\.\d+/', '', vector_str)

    # Parse the metric:value pairs
    for pair in vector_str.split('/'):
        if ':' in pair:
            metric, value = pair.split(':', 1)
            metric = metric.strip()
            value = value.strip()

            # Use the decoded label when one exists; otherwise keep the raw code
            metrics[metric] = maps.get(metric, {}).get(value, value)

    return metrics

def extract_cvss_metrics(df):
    """
    Extract the CVSS metrics and add them to the DataFrame as columns.
    """
    # Work on a copy so the caller's DataFrame is left untouched
    df_result = df.copy()

    # Parse every vector
    parsed_metrics = [
        parse_cvss_vector(vector, version)
        for vector, version in zip(df_result['cvss_vector'], df_result['cvss_version'])
    ]

    # Collect the union of metric names across all rows
    all_metrics = set()
    for metrics in parsed_metrics:
        all_metrics.update(metrics.keys())

    # Add one column per metric (None where a row lacks that metric)
    for metric in sorted(all_metrics):
        column_name = f'cvss_metric_{metric}'
        df_result[column_name] = [metrics.get(metric) for metrics in parsed_metrics]

    return df_result


# Assumes `df` is the existing DataFrame with `cvss_vector` and `cvss_version` columns
df = extract_cvss_metrics(df)

In [108]:
df.head(5)

Unnamed: 0,cve_id,title,description,published_date,last_modified,remotely_exploit,source,category,affected_products,loaded_at,cvss_score,cvss_version,cvss_severity,cvss_vector,cvss_exploitability_score,cvss_impact_score,cvss_source,cvss_metric_A,cvss_metric_AC,cvss_metric_AR,cvss_metric_AT,cvss_metric_AU,cvss_metric_AV,cvss_metric_Au,cvss_metric_C,cvss_metric_CR,cvss_metric_E,cvss_metric_I,cvss_metric_IR,cvss_metric_MAC,cvss_metric_MAT,cvss_metric_MAV,cvss_metric_MPR,cvss_metric_MSA,cvss_metric_MSC,cvss_metric_MSI,cvss_metric_MUI,cvss_metric_MVA,cvss_metric_MVC,cvss_metric_MVI,cvss_metric_PR,cvss_metric_R,cvss_metric_RE,cvss_metric_S,cvss_metric_SA,cvss_metric_SC,cvss_metric_SI,cvss_metric_U,cvss_metric_UI,cvss_metric_V,cvss_metric_VA,cvss_metric_VC,cvss_metric_VI
0,CVE-2025-11608,code-projects E-Banking System POST Parameter ...,A security vulnerability has been detected in ...,2025-10-11 17:15:00,2025-10-11 17:15:00,True,[email protected],Injection,[],2025-10-12 18:54:42,7.5,CVSS 2.0,HIGH,AV:N/AC:L/Au:N/C:P/I:P/A:P,10.0,6.4,[email protected],Partial,L,,,,Network,,Partial,,,Partial,,,,,,,,,,,,,,,,,,,,,,,,,
1,CVE-2025-11608,code-projects E-Banking System POST Parameter ...,A security vulnerability has been detected in ...,2025-10-11 17:15:00,2025-10-11 17:15:00,True,[email protected],Injection,[],2025-10-12 18:54:42,7.3,CVSS 3.1,HIGH,CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:L/I:L/A:L,3.9,3.4,[email protected],Low,Low,,,,Network,,Low,,,Low,,,,,,,,,,,,,,,,Unchanged,,,,,,,,,
2,CVE-2025-11608,code-projects E-Banking System POST Parameter ...,A security vulnerability has been detected in ...,2025-10-11 17:15:00,2025-10-11 17:15:00,True,[email protected],Injection,[],2025-10-12 18:54:42,6.9,CVSS 4.0,MEDIUM,CVSS:4.0/AV:N/AC:L/AT:N/PR:N/UI:N/VC:L/VI:L/VA...,,,[email protected],,Low,X,,X,Network,,,X,P,,X,X,X,X,X,X,X,X,X,X,X,X,,X,X,X,,,,X,,X,Low,Low,Low
3,CVE-1999-0095,Sendmail Command Injection Vulnerability,The following products are affected byCVE-1999...,1988-10-01 04:00:00,2025-04-03 01:03:00,True,[email protected],,"[{""id"": ""1"", ""vendor"": ""Eric_allman"", ""product...",2025-10-12 18:54:42,10.0,CVSS 2.0,HIGH,AV:N/AC:L/Au:N/C:C/I:C/A:C,10.0,10.0,[email protected],Complete,L,,,,Network,,Complete,,,Complete,,,,,,,,,,,,,,,,,,,,,,,,,
4,CVE-1999-0082,Tenable FTP Server Command Injection Vulnerabi...,The following products are affected byCVE-1999...,1988-11-11 05:00:00,2025-04-03 01:03:00,True,[email protected],,"[{""id"": ""1"", ""vendor"": ""Ftp"", ""product"": ""ftp""...",2025-10-12 18:54:42,10.0,CVSS 2.0,HIGH,AV:N/AC:L/Au:N/C:C/I:C/A:C,10.0,10.0,[email protected],Complete,L,,,,Network,,Complete,,,Complete,,,,,,,,,,,,,,,,,,,,,,,,,
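
In the output above, the CVSS 2.0 rows show `cvss_metric_AC` as the raw code `L`, because the parser keeps any code it cannot decode. A small audit of undecoded codes can surface such gaps early. The sketch below is one possible approach and not part of the original pipeline; the helper `find_unmapped_codes` and the `demo_maps` dictionary are hypothetical, and the demo map leaves out `AC` on purpose to reproduce the effect.

```python
from collections import Counter


def find_unmapped_codes(rows, maps_by_version):
    """Count (version, metric, code) triples that have no label in the maps.

    rows: iterable of (version, vector) pairs.
    maps_by_version: {version_label: {metric: {code: label}}}.
    """
    unmapped = Counter()
    for version, vector in rows:
        maps = maps_by_version.get(version, {})
        for pair in vector.split("/"):
            metric, sep, code = pair.partition(":")
            # Skip malformed segments and the "CVSS:x.y" version prefix
            if not sep or metric == "CVSS":
                continue
            if code not in maps.get(metric, {}):
                unmapped[(version, metric, code)] += 1
    return unmapped


# Hypothetical demo map: "AC" is deliberately missing
impact = {"N": "None", "P": "Partial", "C": "Complete"}
demo_maps = {
    "CVSS 2.0": {
        "AV": {"N": "Network", "A": "Adjacent Network", "L": "Local"},
        "Au": {"N": "None", "S": "Single", "M": "Multiple"},
        "C": impact,
        "I": impact,
        "A": impact,
    }
}

# The two CVSS 2.0 vectors that appear in the rows above
demo_rows = [
    ("CVSS 2.0", "AV:N/AC:L/Au:N/C:P/I:P/A:P"),
    ("CVSS 2.0", "AV:N/AC:L/Au:N/C:C/I:C/A:C"),
]

report = find_unmapped_codes(demo_rows, demo_maps)
for (version, metric, code), count in report.most_common():
    print(version, metric, code, count)
```

With the notebook's DataFrame, the input could be `zip(df['cvss_version'], df['cvss_vector'].dropna())`-style pairs built from rows where the vector is a string, and the maps could be the `MAPS_*` dictionaries keyed by version label.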


In [109]:
# Build the dataset
# Change data types
# Store in the silver layer

The `affected_products` column is also stored as JSON. Because affected products matter for the analysis, we extract them into a separate table in the "Silver" layer of the data warehouse.

In [110]:
import json

# ===== Extract the unique products =====
products_dict = {}

for idx, row in df.iterrows():
    cve_id = row['cve_id']
    affected_products = row['affected_products']
    
    if affected_products and affected_products != '[]':
        try:
            if isinstance(affected_products, str):
                products = json.loads(affected_products)
            else:
                products = affected_products
            
            for product in products:
                # `or ''` also covers JSON nulls, which would otherwise break .strip()
                vendor = (product.get('vendor') or '').strip()
                product_name = (product.get('product') or '').strip()
                
                if vendor and product_name:
                    key = (vendor.lower(), product_name.lower())
                    
                    if key not in products_dict:
                        products_dict[key] = {
                            'vendor': vendor,
                            'product_name': product_name,
                            'cves': set()
                        }
                    products_dict[key]['cves'].add(cve_id)
                    
        except (json.JSONDecodeError, TypeError):
            continue

# Build df_products
products_data = []
product_lookup = {}

for product_id, ((vendor_lower, product_lower), data) in enumerate(products_dict.items(), start=1):
    products_data.append({
        'product_id': product_id,
        'vendor': data['vendor'],
        'product_name': data['product_name'],
        'total_cves': len(data['cves']),
        'cve_list_json': json.dumps(list(data['cves']))
    })
    product_lookup[(vendor_lower, product_lower)] = product_id

df_products = pd.DataFrame(products_data)

# ===== Enrich with the CVE publication dates =====
cve_products_for_dates = []

for idx, row in df.iterrows():
    cve_id = row['cve_id']
    published_date = row['published_date']
    affected_products = row['affected_products']
    
    if affected_products and affected_products != '[]':
        try:
            if isinstance(affected_products, str):
                products = json.loads(affected_products)
            else:
                products = affected_products
            
            for product in products:
                # `or ''` also covers JSON nulls, which would otherwise break .strip()
                vendor = (product.get('vendor') or '').strip()
                product_name = (product.get('product') or '').strip()
                
                if vendor and product_name:
                    key = (vendor.lower(), product_name.lower())
                    product_id = product_lookup.get(key)
                    
                    if product_id:
                        cve_products_for_dates.append({
                            'product_id': product_id,
                            'published_date': published_date
                        })
        except (json.JSONDecodeError, TypeError):
            continue

df_temp = pd.DataFrame(cve_products_for_dates)

# Aggregate the first and last publication dates per product
product_dates = df_temp.groupby('product_id').agg({
    'published_date': ['min', 'max']
}).reset_index()

product_dates.columns = ['product_id', 'first_cve_date', 'last_cve_date']

# Join the dates onto df_products
df_pr = df_products.merge(product_dates, on='product_id', how='left')

In [111]:
df_pr.head(10)

Unnamed: 0,product_id,vendor,product_name,total_cves,cve_list_json,first_cve_date,last_cve_date
0,1,Eric_allman,sendmail,14,"[""CVE-1999-0203"", ""CVE-1999-0095"", ""CVE-1999-0...",1988-10-01 04:00:00,2000-04-23 04:00:00
1,2,Ftp,ftp,2,"[""CVE-1999-0201"", ""CVE-1999-0082""]",1988-11-11 05:00:00,1997-01-01 05:00:00
2,3,Ftpcd,ftpcd,1,"[""CVE-1999-0082""]",1988-11-11 05:00:00,1988-11-11 05:00:00
3,4,Bsd,bsd,6,"[""CVE-1999-1098"", ""CVE-1999-1394"", ""CVE-1999-1...",1989-01-01 05:00:00,2001-10-03 04:00:00
4,5,Sun,sunos,201,"[""CVE-2000-0471"", ""CVE-1999-1467"", ""CVE-2001-0...",1989-07-26 04:00:00,2002-05-29 04:00:00
5,6,Sun,nfs,4,"[""CVE-1999-0166"", ""CVE-1999-0169"", ""CVE-1999-0...",1990-05-01 04:00:00,1997-07-01 04:00:00
6,7,Freebsd,freebsd,126,"[""CVE-1999-1187"", ""CVE-2000-0915"", ""CVE-2001-0...",1990-05-09 04:00:00,2002-03-08 05:00:00
7,8,Next,next,5,"[""CVE-1999-1193"", ""CVE-1999-1198"", ""CVE-1999-1...",1990-10-03 04:00:00,1991-10-22 04:00:00
8,9,Next,nex,1,"[""CVE-1999-1392""]",1990-10-03 04:00:00,1990-10-03 04:00:00
9,10,Digital,vms,1,"[""CVE-1999-1057""]",1990-10-25 04:00:00,1990-10-25 04:00:00
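
`cve_list_json` packs each product's CVEs into a single JSON string, which can be awkward to join on in BI tools. One possible complement is a long-format bridge table with one row per (CVE, product) pair. The following is a minimal sketch under assumptions: the function name `build_bridge_rows` is hypothetical, and the sample records are illustrative values modeled on the rows above, not the real dataset.

```python
import json


def build_bridge_rows(records):
    """Return sorted, de-duplicated (cve_id, vendor, product_name) tuples.

    records: iterable of (cve_id, affected_products) pairs, where
    affected_products is a JSON string or an already-parsed list of dicts.
    """
    rows = set()
    for cve_id, raw in records:
        if isinstance(raw, str):
            try:
                raw = json.loads(raw)
            except json.JSONDecodeError:
                continue  # skip rows whose JSON cannot be parsed
        if not isinstance(raw, list):
            continue  # covers NaN/None and other unexpected types
        for product in raw:
            if not isinstance(product, dict):
                continue
            vendor = str(product.get("vendor") or "").strip()
            name = str(product.get("product") or "").strip()
            if vendor and name:
                rows.add((cve_id, vendor, name))
    return sorted(rows)


# Illustrative records (not the real dataset)
sample = [
    ("CVE-1999-0095", '[{"vendor": "Eric_allman", "product": "sendmail"}]'),
    ("CVE-1999-0082",
     '[{"vendor": "Ftp", "product": "ftp"}, {"vendor": "Ftpcd", "product": "ftpcd"}]'),
    ("CVE-2025-11608", "[]"),
]

bridge_rows = build_bridge_rows(sample)
for row in bridge_rows:
    print(row)
```

With the notebook's DataFrame, the input could be `zip(df['cve_id'], df['affected_products'])`, and `pd.DataFrame(bridge_rows, columns=['cve_id', 'vendor', 'product_name'])` would turn the result into a table. Unlike `df_products`, this sketch does not merge vendors or products that differ only by letter case.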


We now drop the `affected_products` column from `df`, since the product data lives in its own table (`df_pr`).

In [112]:
df.drop(columns=['affected_products'], inplace=True)

In [113]:
df.head(5)

Unnamed: 0,cve_id,title,description,published_date,last_modified,remotely_exploit,source,category,loaded_at,cvss_score,cvss_version,cvss_severity,cvss_vector,cvss_exploitability_score,cvss_impact_score,cvss_source,cvss_metric_A,cvss_metric_AC,cvss_metric_AR,cvss_metric_AT,cvss_metric_AU,cvss_metric_AV,cvss_metric_Au,cvss_metric_C,cvss_metric_CR,cvss_metric_E,cvss_metric_I,cvss_metric_IR,cvss_metric_MAC,cvss_metric_MAT,cvss_metric_MAV,cvss_metric_MPR,cvss_metric_MSA,cvss_metric_MSC,cvss_metric_MSI,cvss_metric_MUI,cvss_metric_MVA,cvss_metric_MVC,cvss_metric_MVI,cvss_metric_PR,cvss_metric_R,cvss_metric_RE,cvss_metric_S,cvss_metric_SA,cvss_metric_SC,cvss_metric_SI,cvss_metric_U,cvss_metric_UI,cvss_metric_V,cvss_metric_VA,cvss_metric_VC,cvss_metric_VI
0,CVE-2025-11608,code-projects E-Banking System POST Parameter ...,A security vulnerability has been detected in ...,2025-10-11 17:15:00,2025-10-11 17:15:00,True,[email protected],Injection,2025-10-12 18:54:42,7.5,CVSS 2.0,HIGH,AV:N/AC:L/Au:N/C:P/I:P/A:P,10.0,6.4,[email protected],Partial,L,,,,Network,,Partial,,,Partial,,,,,,,,,,,,,,,,,,,,,,,,,
1,CVE-2025-11608,code-projects E-Banking System POST Parameter ...,A security vulnerability has been detected in ...,2025-10-11 17:15:00,2025-10-11 17:15:00,True,[email protected],Injection,2025-10-12 18:54:42,7.3,CVSS 3.1,HIGH,CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:L/I:L/A:L,3.9,3.4,[email protected],Low,Low,,,,Network,,Low,,,Low,,,,,,,,,,,,,,,,Unchanged,,,,,,,,,
2,CVE-2025-11608,code-projects E-Banking System POST Parameter ...,A security vulnerability has been detected in ...,2025-10-11 17:15:00,2025-10-11 17:15:00,True,[email protected],Injection,2025-10-12 18:54:42,6.9,CVSS 4.0,MEDIUM,CVSS:4.0/AV:N/AC:L/AT:N/PR:N/UI:N/VC:L/VI:L/VA...,,,[email protected],,Low,X,,X,Network,,,X,P,,X,X,X,X,X,X,X,X,X,X,X,X,,X,X,X,,,,X,,X,Low,Low,Low
3,CVE-1999-0095,Sendmail Command Injection Vulnerability,The following products are affected byCVE-1999...,1988-10-01 04:00:00,2025-04-03 01:03:00,True,[email protected],,2025-10-12 18:54:42,10.0,CVSS 2.0,HIGH,AV:N/AC:L/Au:N/C:C/I:C/A:C,10.0,10.0,[email protected],Complete,L,,,,Network,,Complete,,,Complete,,,,,,,,,,,,,,,,,,,,,,,,,
4,CVE-1999-0082,Tenable FTP Server Command Injection Vulnerabi...,The following products are affected byCVE-1999...,1988-11-11 05:00:00,2025-04-03 01:03:00,True,[email protected],,2025-10-12 18:54:42,10.0,CVSS 2.0,HIGH,AV:N/AC:L/Au:N/C:C/I:C/A:C,10.0,10.0,[email protected],Complete,L,,,,Network,,Complete,,,Complete,,,,,,,,,,,,,,,,,,,,,,,,,
