RiskBasedPrioritization/RiskBasedPrioritizationAnalysis

Overview

This repository contains the data and the notebooks (code + documentation) for the analysis that is part of the EPSS Applied Guide.

Data Sources

data_in

Data Source | Detail | ~ CVE count (K) | Directory
CISA KEV | Active Exploitation | 1 | cisa_kev
EPSS | Predictor of Exploitation | 220 | epss
Metasploit modules | Weaponized Exploit | 3 | metasploit
Nuclei templates | Weaponized Exploit | 2 | nuclei
ExploitDB | Published Exploit Code | 25 | exploitdb
NVD CVE Data | NVD CVEs | 220 | nvd
Qualys TruRisk Report | The 2023 Qualys TruRisk research report lists 190 CVEs from 2022 with QVS scores | 0.2 | qualys
Microsoft Security Response Center (MSRC) | CVEs Exploited and with Exploitability Assessment | 0.2 | msrc

Getting data

  • get_data.sh gets the data that can be downloaded automatically and used as-is.
  • Other data is manually downloaded - see instructions below.
    • MSRC
    • ExploitDB
    • GPZ
  • Larger files are gzip'd
  • A date.txt file in each data folder records the date the data was downloaded.
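
As a minimal sketch of how a notebook might consume one of these folders, the helper below reads the date.txt file and a gzip'd CSV using only the Python standard library. The function name `read_source`, the file name passed as `csv_name`, and the column names in the usage note are assumptions for illustration; they are not taken from the repository.

```python
import csv
import gzip
from pathlib import Path


def read_source(folder, csv_name):
    """Return (download_date, rows) for one data source folder.

    Hypothetical helper: the folder layout (date.txt next to a gzip'd CSV)
    follows the README, but the file names are supplied by the caller.
    """
    folder = Path(folder)
    # date.txt holds the date on which the data was downloaded.
    download_date = (folder / "date.txt").read_text(encoding="utf-8").strip()
    # Open in text mode ("rt") so csv.DictReader receives str lines.
    with gzip.open(folder / csv_name, mode="rt", encoding="utf-8", newline="") as fh:
        rows = list(csv.DictReader(fh))
    return download_date, rows
```

For example, `read_source("data_in/epss", "epss.csv.gz")` would return the download date and a list of dictionaries keyed by the CSV header, assuming a file with that (hypothetical) name exists.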

National Vulnerability Database (NVD)

Get NVD data automatically

  • A notebook or script in nvd downloads the NVD data.
  • The data is output to data_out/CVSSData.csv.gz
  • Note: The download method used will be deprecated some time after Dec 2023 per https://nvd.nist.gov/vuln/data-feeds

Google Project Zero (GPZ)

See 0day "In the Wild" GoogleSheet

  • Select "All" tab.
  • File - Download as csv

Microsoft Security Response Center (MSRC)

Qualys TruRisk Report

The CVE data was extracted from the Qualys TruRisk Report PDF using standard tools like sed. This data is static so a date.txt is not included.
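
The extraction itself was done with sed; as a rough Python equivalent (a swapped-in technique, not the method used here), a regular expression can pull CVE IDs out of text that has already been extracted from the PDF. The function name `extract_cves` and the sample IDs in the usage note are made up for illustration.

```python
import re

# CVE IDs have the form CVE-<4-digit year>-<4 or more digits>.
CVE_PATTERN = re.compile(r"CVE-\d{4}-\d{4,}")


def extract_cves(text):
    """Return the unique CVE IDs found in text, in order of first appearance."""
    # dict.fromkeys de-duplicates while preserving insertion order.
    return list(dict.fromkeys(CVE_PATTERN.findall(text)))
```

Calling `extract_cves("see CVE-2000-0001 and CVE-2000-00002, then CVE-2000-0001 again")` yields each ID once, in the order it first appears.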

ExploitDB

Other Data Sources

Other data sources to consider - these are not currently used here:

  1. https://github.com/trickest/cve for a list of CVE PoCs

Analysis

analysis

  1. enrich_cves.ipynb
    1. Take the data sources from data_in/
    2. Enrich the CVE data from NVD with the other data sources
    3. Add an "Exploit" column to indicate the source of the exploitability (used later to set colors of CVE data in plots)
    4. Store the output in data_out/nvd_cves_v3_enriched.csv.gz
  2. kev_epss_cvss.ipynb
    1. Read the enriched CVE data from data_out/CVSSData_enriched.csv.gz
    2. Read the data from CISA KEV alert reports in ./data_in/cisa_kev/
    3. Plot CISA KEV datasets showing EPSS, CVSS by source of the exploitability
    4. Write data_out/cisa_kev/csa/csa.csv.gz, which is the CISA KEV Cybersecurity Alerts (CSA) subset with EPSS and other data
  3. qualys.ipynb
    1. Read the enriched CVE data from data_out/CVSSData_enriched.csv.gz
    2. Read the data from ./data_in/qualys
    3. Plot Qualys dataset showing EPSS, CVSS by source of the exploitability
    4. Write data_out/qualys/qualys.csv.gz which is the Qualys data with EPSS and other data
  4. msrc.ipynb
    1. Read the enriched CVE data from data_out/CVSSData_enriched.csv.gz
    2. Read the data from ./data_in/msrc
    3. Plot Microsoft Exploitability Index dataset showing EPSS, CVSS by source of the exploitability
    4. Write data_out/msrc/msrc.csv.gz which is the MSEI data with EPSS and other data
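
The enrichment step in enrich_cves.ipynb can be pictured roughly as follows. This is a sketch under stated assumptions, not the notebook's code: the "cve" key, the "None" label, and the rule that the first matching source wins are all hypothetical choices made for illustration.

```python
def enrich(nvd_rows, sources):
    """Add an "Exploit" column naming the first source that lists the CVE.

    nvd_rows: list of dicts, each with a "cve" key (assumed column name).
    sources:  ordered list of (label, set_of_cve_ids); earlier entries take
              precedence (an assumed rule, e.g. active exploitation first).
    """
    enriched = []
    for row in nvd_rows:
        label = "None"  # assumed label for CVEs with no known exploit source
        for name, cves in sources:
            if row["cve"] in cves:
                label = name
                break
        # Copy the row so the input is left unmodified.
        enriched.append({**row, "Exploit": label})
    return enriched
```

With `sources = [("KEV", kev_ids), ("Metasploit", msf_ids)]`, a CVE present in both sets would be labeled "KEV" because that source is listed first; the resulting single column is what makes it straightforward to color points by exploit source in later plots.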

CISA SSVC Decision Trees

CISA SSVC Decision Tree From Scratch Example Implementation

cisa_ssvc_dt

  1. DT_from_scratch.ipynb
    1. Read the enriched CVE data from data_out/CVSSData_enriched.csv.gz
    2. Read the Decision Tree definition cisa_ssvc_dt/DT_rbp.csv
    3. Define the Decision Logic for the Decision Nodes
    4. Calculate the Decision Node Values for all CVEs
    5. Do some Exploratory Data Analysis with Venn Diagrams to understand our data
    6. Calculate the Output Decision from the Decision Node Values
    7. Plot the flow of all CVEs across the Decision Tree (a Sankey diagram)
      1. Read the Sankey Diagram template definition cisa_ssvc_dt/DT_sankey.csv
    8. Triage some CVEs
      1. Read a list of CVEs to triage cisa_ssvc_dt/triage/cves2triage.csv
      2. Get Decisions
      3. Plot
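
To illustrate the idea of computing an output decision from decision node values, the sketch below treats the tree definition as a lookup table: one row per combination of node values, with the decision in the last column. The embedded table is a hypothetical, truncated two-node example and is NOT the contents of cisa_ssvc_dt/DT_rbp.csv; the column names and the functions `load_tree` and `decide` are likewise assumptions.

```python
import csv
import io

# Hypothetical two-node table for illustration only; the real tree
# definition lives in cisa_ssvc_dt/DT_rbp.csv and may differ in every detail.
TREE_CSV = """exploitation,automatable,decision
active,yes,Act
active,no,Attend
poc,yes,Attend
poc,no,Track
none,yes,Track
none,no,Track
"""


def load_tree(text):
    """Parse a tree definition into (node_names, lookup_table)."""
    reader = csv.DictReader(io.StringIO(text))
    # Every column except "decision" is treated as a decision node.
    node_names = [name for name in reader.fieldnames if name != "decision"]
    table = {tuple(row[n] for n in node_names): row["decision"] for row in reader}
    return node_names, table


def decide(node_values, node_names, table):
    """Look up the output decision for one CVE's decision node values."""
    return table[tuple(node_values[n] for n in node_names)]


node_names, table = load_tree(TREE_CSV)
print(decide({"exploitation": "active", "automatable": "no"}, node_names, table))
```

Keeping the tree as data rather than nested if/else statements means the same table can, in principle, drive both the decisions and a plot of how CVEs flow through the nodes.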

CISA SSVC Decision Tree Analysis for Feature Importance

  1. DT_analysis.ipynb
    1. Read the Decision Tree definition cisa_ssvc_dt/DT_rbp.csv
    2. Perform Feature Importance using 2 methods
      1. Permutation Importance
      2. Drop-column Importance

See CERTCC/SSVC#309 for the suggestion to add drop column importance to CISA SSVC.
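
The two importance methods can be sketched with the standard library alone. This is an illustrative sketch, not the code in DT_analysis.ipynb: the "model" is a simple majority-vote lookup table (an assumed stand-in for whatever model the notebook fits), and all function names are hypothetical. Permutation importance shuffles one column and measures the accuracy drop of the already fitted model; drop-column importance refits the model without the column and measures the accuracy drop.

```python
import random
from collections import Counter, defaultdict


def fit(rows, labels, columns):
    """Fit a lookup 'model': the majority label per combination of column values."""
    votes = defaultdict(Counter)
    for row, label in zip(rows, labels):
        votes[tuple(row[c] for c in columns)][label] += 1
    table = {key: counts.most_common(1)[0][0] for key, counts in votes.items()}
    # Unseen combinations predict None.
    return lambda row: table.get(tuple(row[c] for c in columns))


def accuracy(predict, rows, labels):
    """Fraction of rows whose predicted label equals the true label."""
    return sum(predict(r) == y for r, y in zip(rows, labels)) / len(rows)


def permutation_importance(rows, labels, columns, seed=0):
    """Accuracy drop when one column is shuffled and the model is NOT refit."""
    rng = random.Random(seed)
    model = fit(rows, labels, columns)
    baseline = accuracy(model, rows, labels)
    scores = {}
    for col in columns:
        shuffled = [r[col] for r in rows]
        rng.shuffle(shuffled)
        permuted = [{**r, col: v} for r, v in zip(rows, shuffled)]
        scores[col] = baseline - accuracy(model, permuted, labels)
    return scores


def drop_column_importance(rows, labels, columns):
    """Accuracy drop when the model is refit without one column."""
    baseline = accuracy(fit(rows, labels, columns), rows, labels)
    scores = {}
    for col in columns:
        kept = [c for c in columns if c != col]
        scores[col] = baseline - accuracy(fit(rows, labels, kept), rows, labels)
    return scores
```

Both functions take `rows` as a list of dictionaries of decision node values and `labels` as the corresponding decisions. Drop-column importance is deterministic for a given dataset, while permutation importance depends on the shuffle, so the `seed` parameter is there to make runs repeatable.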
