# O-Cloud Data Gathering

#### Requirements

In [1]:
import subprocess

# Run the pip command and capture the output
installedpackages = subprocess.run(['pip', 'freeze'], stdout=subprocess.PIPE, text=True).stdout
# Read requirements from file
with open("./requirements.txt", 'r') as file:
    requirements = file.read()
# Split the multi-line string into a list of lines
lines = requirements.splitlines()
# Check if requirements are installed line by line 
for line in lines:
    index = installedpackages.find(line)
    if index == -1:
        # Install dependencies from requirements.txt
        %pip install -r ./requirements.txt > /dev/null
        break

In [2]:
from OCloud_Data_Gathering import *
from plotly.offline import init_notebook_mode
from stix2 import FileSystemSource
%matplotlib inline
init_notebook_mode(connected=True)

#### `pull_clone_gitrepo(directory, repo)`

This method keeps a local clone of a Git repository up to date: it clones the repository if the target directory does not exist and pulls the latest changes if it does.

##### Parameters

- `directory`: The target directory for the repository.
- `repo`: The Git repository URL.

##### Behavior

- If `directory` doesn't exist, the method clones `repo` into it using `Repo.clone_from()`.
- If `directory` exists and is a valid Git repository, the method pulls changes from the `origin` remote using `remotes.origin.pull()` on the opened repository object (not on the `repo` URL parameter).
- If `directory` exists but is not a Git repository, it is deleted and `repo` is cloned into it.

This method ensures proper management of Git repositories in the specified directory.
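
The three branches above can be sketched roughly as follows. This is a minimal illustration rather than the notebook's actual implementation: `plan_repo_action` and `sync_repo` are hypothetical names, the check for a `.git` entry is a simplified stand-in for a real repository validity check, and GitPython is imported lazily so the planning step can be tried without it installed.

```python
import shutil
from pathlib import Path


def plan_repo_action(directory):
    """Decide what to do with `directory`: 'clone', 'pull', or 'reclone'."""
    path = Path(directory)
    if not path.exists():
        return "clone"
    # Assumption: a `.git` entry marks a usable repository.
    if (path / ".git").exists():
        return "pull"
    return "reclone"


def sync_repo(directory, repo_url):
    """Apply the planned action using GitPython (hypothetical helper)."""
    from git import Repo  # GitPython; imported here so planning works without it

    action = plan_repo_action(directory)
    if action == "pull":
        Repo(directory).remotes.origin.pull()
    else:
        if action == "reclone":
            shutil.rmtree(directory)  # discard the non-repository directory
        Repo.clone_from(repo_url, directory)
    return action
```

Separating the decision from the Git calls keeps the branching logic easy to check without network access.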

#### `generate_techniques_dataframe()`

This method retrieves techniques data from the ATT&CK framework and returns it as a Pandas DataFrame.

##### Behavior

1. Downloads and parses version 4.0 of the ATT&CK Enterprise STIX data.
2. Converts the parsed data into Pandas DataFrames for techniques, related relationships, and citations.
3. Returns the Pandas DataFrame containing techniques data.

This method simplifies the retrieval and organization of techniques data from ATT&CK into a structured DataFrame format.



In [4]:
# Download CTI data from GitHub
pull_clone_gitrepo('./data', 'https://github.com/mitre/cti')
fs = FileSystemSource('./data/capec/2.1')
techniques_df = generate_techniques_dataframe()

2023-08-25 12:19:45.587 | INFO     | mitreattack.attackToExcel.attackToExcel:get_stix_data:69 - Downloading ATT&CK data from github.com/mitre/cti
parsing techniques: 100%|██████████| 244/244 [00:00<00:00, 52463.74it/s]
parsing relationships for type=technique: 100%|██████████| 4852/4852 [00:00<00:00, 145335.60it/s]


#### `get_grouped_o_cloud_technique(file, drop_duplicates: bool = False)`

This method extracts and groups O-Cloud threat data from a CSV file, allowing for further analysis.

##### Parameters

- `file`: The path to the CSV file containing O-Cloud threat data.
- `drop_duplicates`: An optional boolean parameter to drop duplicate technique entries. Default is `False`.

##### Behavior

1. Reads O-Cloud threat data from the specified CSV file, using the "Name" column as an index.
2. Optionally drops duplicate entries based on the "Technique" column if `drop_duplicates` is set to `True`.
3. Groups the data by the "Name" column and returns the grouped object.

This method facilitates efficient analysis of O-Cloud threat data by grouping it based on specified criteria.
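
As a rough illustration of the read, de-duplicate, and group steps, the sketch below uses only the standard library in place of pandas. The function name `group_techniques_by_name` and the sample rows are hypothetical; only the "Name" and "Technique" column names come from the description above, and duplicates are assumed to be resolved by keeping the first occurrence.

```python
import csv
import io
from collections import defaultdict


def group_techniques_by_name(csv_text, drop_duplicates=False):
    """Group rows by the "Name" column, optionally de-duplicating on "Technique"."""
    grouped = defaultdict(list)
    seen = set()
    for row in csv.DictReader(io.StringIO(csv_text)):
        technique = row["Technique"]
        if drop_duplicates and technique in seen:
            continue  # keep only the first row for each technique
        seen.add(technique)
        grouped[row["Name"]].append(technique)
    return dict(grouped)


# Hypothetical sample rows; the real mapping file may have more columns.
sample = "Name,Technique\nThreat A,T1498\nThreat A,T1552\nThreat B,T1498\n"
groups = group_techniques_by_name(sample, drop_duplicates=True)
print(groups)
```

Note that de-duplicating on "Technique" before grouping can remove a whole group, as happens to "Threat B" here, because its only technique already appeared under another name.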


#### `get_technique_capecs_id(grouped, techniques_df)`

This method retrieves CAPEC IDs associated with specific techniques from grouped data.

##### Parameters

- `grouped`: Grouped data containing techniques.
- `techniques_df`: DataFrame containing technique details.

##### Behavior

1. Initializes an empty list `techniques_capecs` to store technique and CAPEC ID associations.
2. Iterates through each group in the provided `grouped` data.
3. For each technique in the group, extracts associated CAPEC IDs from `techniques_df`.
4. Handles instances where CAPEC IDs might be in comma-separated format.
5. Appends tuples of technique and CAPEC IDs to the `techniques_capecs` list.
6. Returns the list of technique and CAPEC ID associations.

This method aids in obtaining CAPEC IDs linked to specific techniques for further analysis.
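
The lookup and comma-splitting steps can be sketched as below. This is an assumption-laden stand-in: `collect_capec_ids` is a hypothetical name, and a plain dictionary replaces `techniques_df`, whose actual column layout is not shown here. The CAPEC values are taken from the output printed later in this notebook.

```python
def collect_capec_ids(technique_ids, capec_by_technique):
    """Pair each technique ID with a list of its CAPEC IDs (possibly empty)."""
    techniques_capecs = []
    for technique_id in technique_ids:
        raw = capec_by_technique.get(technique_id) or ""
        # Split comma-separated values and drop surrounding whitespace.
        capec_ids = [part.strip() for part in raw.split(",") if part.strip()]
        techniques_capecs.append((technique_id, capec_ids))
    return techniques_capecs


# Hypothetical lookup table standing in for `techniques_df`.
capec_by_technique = {
    "T1195": "CAPEC-437, CAPEC-438, CAPEC-439",
    "T1068": "CAPEC-69",
}
pairs = collect_capec_ids(["T1498", "T1068", "T1195"], capec_by_technique)
print(pairs)
```

Techniques without a CAPEC entry still appear in the result with an empty list, which matches the shape of the tuples printed by the notebook.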


#### `write_ids_to_file(techniques_capecs, file)`

This method writes technique and CAPEC ID associations to a CSV file.

##### Parameters

- `techniques_capecs`: List of tuples containing technique and associated CAPEC IDs.
- `file`: Path to the CSV file to be written.

##### Behavior

1. Opens the specified CSV file in write mode.
2. Initializes a CSV writer and writes a header row with column names ('Technique ID', 'CAPEC ID').
3. Iterates through each tuple in `techniques_capecs`.
4. For tuples with non-empty CAPEC IDs, writes each combination of technique and CAPEC ID to the CSV.
5. Closes the file after writing.

This method efficiently creates a CSV file containing technique and CAPEC ID associations.
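
A minimal sketch of the writing steps, assuming one output row per technique and CAPEC ID pair. The helper name `write_pairs` is hypothetical, and an in-memory buffer stands in for the CSV file so the example runs without touching disk.

```python
import csv
import io


def write_pairs(techniques_capecs, handle):
    """Write one row per (technique, CAPEC ID) pair; skip techniques with none."""
    writer = csv.writer(handle)
    writer.writerow(["Technique ID", "CAPEC ID"])
    for technique_id, capec_ids in techniques_capecs:
        for capec_id in capec_ids:  # an empty list produces no rows
            writer.writerow([technique_id, capec_id])


buffer = io.StringIO()  # in-memory stand-in for the output file
write_pairs([("T1498", []), ("T1195", ["CAPEC-437", "CAPEC-438"])], buffer)
print(buffer.getvalue())
```

When writing to a real file, opening it with `open(path, "w", newline="")` inside a `with` block follows the `csv` module's recommendation and closes the file automatically.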


In [5]:
grouped = get_grouped_o_cloud_technique(file='./mapping/o_cloud_technique_mapping_without_subtechniques.csv', drop_duplicates=True)
techniques_capecs = get_technique_capecs_id(grouped, techniques_df)
print(techniques_capecs)
write_ids_to_file(techniques_capecs, file='./mapping/o_cloud_capecs_per_technique.csv')

[('T1498', []), ('T1552', []), ('T1609', []), ('T1204', []), ('T1068', ['CAPEC-69']), ('T1078', ['CAPEC-560']), ('T1003', ['CAPEC-567']), ('T1614', []), ('T1195', ['CAPEC-437', 'CAPEC-438', 'CAPEC-439']), ('T1525', []), ('T1610', []), ('T1612', []), ('T1040', ['CAPEC-158']), ('T1600', []), ('T1613', []), ('T1082', ['CAPEC-311']), ('T1580', []), ('T1070', ['CAPEC-93']), ('T1049', []), ('T1619', []), ('T1046', []), ('T1036', []), ('T1496', []), ('T1542', []), ('T1495', []), ('T1016', ['CAPEC-309']), ('T1611', []), ('T1538', []), ('T1530', []), ('T1499', ['CAPEC-227', 'CAPEC-131', 'CAPEC-130', 'CAPEC-125']), ('T1578', [])]


In [7]:
print_stats(techniques_capecs)

Techniques: 31
Empty Techniques: 22
CAPECs: 14


#### `find_cwe_for_capec(techniques_capecs, fs)`

This method fetches related CWEs and CVEs for given CAPEC IDs.

##### Parameters

- `techniques_capecs`: List of tuples containing technique and associated CAPEC IDs.
- `fs`: The STIX `FileSystemSource` for the CAPEC data (created earlier from `./data/capec/2.1`).

##### Behavior

1. Initializes `capec_list` and `list_of_tinfos` to store collected data.
2. Records the start time and logs the process initiation.
3. Iterates through tuples in `techniques_capecs` to retrieve associated CWEs and CVEs.
4. For each tuple with non-empty CAPEC IDs, fetches relevant CVE data.
5. Appends the collected information to `capec_list` and associates it with the technique.
6. Gathers all technique-related data in `list_of_tinfos`.
7. Records the end time, calculates runtime, and logs completion.
8. Returns a dictionary containing scan date, runtime, and related data.

This method facilitates the retrieval of CWEs and CVEs associated with specific CAPEC IDs, providing valuable threat information.
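
The overall shape of this collection loop might look like the sketch below. It is deliberately simplified: the CAPEC to CWE to CVE chain is collapsed into a single injected callable, `collect_cves` and `lookup_cves` are hypothetical names, and every dictionary key is illustrative rather than the notebook's real schema. The two CVE IDs are the ones reported for CAPEC-439 in the output of this notebook.

```python
import time
from datetime import datetime


def collect_cves(techniques_capecs, lookup_cves):
    """Build a result dict; `lookup_cves` maps a CAPEC ID to a list of CVE IDs."""
    start = time.perf_counter()
    list_of_tinfos = []
    for technique_id, capec_ids in techniques_capecs:
        if not capec_ids:
            continue  # nothing to look up for this technique
        capec_list = [
            {"capec_id": capec_id, "cves": lookup_cves(capec_id)}
            for capec_id in capec_ids
        ]
        list_of_tinfos.append({"technique": technique_id, "capecs": capec_list})
    return {
        # All key names here are illustrative, not the notebook's schema.
        "scan_date": datetime.now().isoformat(timespec="seconds"),
        "runtime_seconds": time.perf_counter() - start,
        "techniques": list_of_tinfos,
    }


# Hypothetical offline lookup in place of the real CWE/CVE queries.
fake_db = {"CAPEC-439": ["CVE-2019-13945", "CVE-2018-4251"]}
result = collect_cves(
    [("T1498", []), ("T1195", ["CAPEC-438", "CAPEC-439"])],
    lambda capec_id: fake_db.get(capec_id, []),
)
print(result["techniques"])
```

Injecting the lookup as a callable keeps the timing and bookkeeping testable offline, independent of whichever data source supplies the CVEs.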


#### `write_dict_to_file(t_cwe_cve_dict, file)`

This method writes a dictionary to a JSON file using a custom encoder.

##### Parameters

- `t_cwe_cve_dict`: The dictionary containing data to be written to the JSON file.
- `file`: The path to the JSON file to be written.

##### Behavior

1. Opens the specified JSON file in write mode using a context manager.
2. Uses the `json.dump()` function to serialize the `t_cwe_cve_dict` dictionary and write it to the file.
3. Passes the `cve_custom_encoder` class to `json.dump()` so that objects the default JSON encoder cannot serialize are still written.

This method provides a straightforward way to save a dictionary as JSON data in a file with optional custom encoding.
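
For readers unfamiliar with the `cls=` hook of `json.dump()`, here is a small sketch of the pattern. `DateAwareEncoder` and `dump_dict` are hypothetical names; what `cve_custom_encoder` actually handles is not shown in this notebook, so the date and set cases below are assumptions chosen only to illustrate the mechanism.

```python
import io
import json
from datetime import date


class DateAwareEncoder(json.JSONEncoder):
    """Hypothetical encoder: dates become ISO strings, sets become sorted lists."""

    def default(self, obj):
        if isinstance(obj, date):
            return obj.isoformat()
        if isinstance(obj, set):
            return sorted(obj)
        return super().default(obj)  # raises TypeError for unsupported types


def dump_dict(data, handle):
    """Serialize `data` as indented JSON using the custom encoder."""
    json.dump(data, handle, cls=DateAwareEncoder, indent=2)


buffer = io.StringIO()  # in-memory stand-in for the JSON file
dump_dict({"scan_date": date(2023, 8, 25), "cves": {"CVE-2018-4251"}}, buffer)
print(buffer.getvalue())
```

The encoder's `default()` method is only consulted for values the standard encoder rejects, so plain dictionaries, lists, strings, and numbers are unaffected.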


In [8]:
t_cwe_cve_dict = find_cwe_for_capec(techniques_capecs, fs)
write_dict_to_file(t_cwe_cve_dict, "./scans/t-cwe-cve-dict.json")

Start fetching CAPEC'S -> CWE'S -> CVE'S for given CAPEC-IDS...

Searching CVE's for CAPEC-69
Found: CVE-2007-4217, CVE-2008-1877, CVE-2007-5159, CVE-2008-4638, CVE-2008-0162, CVE-2008-0368, CVE-2007-3931, CVE-2020-3812, 


Searching CVE's for CAPEC-560
Found: CVE-2007-0681, CVE-2000-0944, CVE-2005-3435, CVE-2005-0408, CVE-1999-1152, CVE-2001-1291, CVE-2001-0395, CVE-2001-1339, CVE-2002-0628, CVE-1999-1324, 


Searching CVE's for CAPEC-567
Found: 


Searching CVE's for CAPEC-437
Found: 


Searching CVE's for CAPEC-438
Found: 


Searching CVE's for CAPEC-439
Found: CVE-2019-13945, CVE-2018-4251, 


Searching CVE's for CAPEC-158
Found: CVE-2009-2272, CVE-2009-1466, CVE-2009-0152, CVE-2009-1603, CVE-2009-0964, CVE-2008-6157, CVE-2008-6828, CVE-2008-1567, CVE-2008-0174, CVE-2007-5778, CVE-2002-1949, CVE-2008-4122, CVE-2008-3289, CVE-2008-4390, CVE-2007-5626, CVE-2004-1852, CVE-2008-0374, CVE-2007-4961, CVE-2007-4786, CVE-2005-3140, 


Searching CVE's for CAPEC-311
Found: 


Searching CVE's