# Quantized models

We have released two quantized versions of our base model ([fdtn-ai/Foundation-Sec-8B](https://huggingface.co/fdtn-ai/Foundation-Sec-8B)), in different sizes:

- [fdtn-ai/Foundation-Sec-8B-Q4_K_M-GGUF](https://huggingface.co/fdtn-ai/Foundation-Sec-8B-Q4_K_M-GGUF): 4-bit quantized (~4.92GB memory footprint)
- [fdtn-ai/Foundation-Sec-8B-Q8_0-GGUF](https://huggingface.co/fdtn-ai/Foundation-Sec-8B-Q8_0-GGUF): 8-bit quantized (~8.54GB memory footprint)

Both models are optimized to run on laptops without requiring a GPU.
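
As a rough sanity check on the 8-bit figure above, the size can be estimated from the Q8_0 block layout, which stores 32 int8 weights plus one 16-bit scale per block (34 bytes per 32 weights, or 8.5 bits per weight). The sketch below is an illustration only: the parameter count of about 8.03 billion is an assumption that is not stated in this notebook, and it treats every tensor as Q8_0. Q4_K_M mixes several block types, so it is not estimated here.

```python
# Rough size estimate for a Q8_0 GGUF file.
# ASSUMPTION: ~8.03e9 parameters; this number does not come from the notebook.
ASSUMED_PARAM_COUNT = 8.03e9

Q8_0_BLOCK_WEIGHTS = 32       # weights per Q8_0 block
Q8_0_BLOCK_BYTES = 32 + 2     # 32 int8 values + one fp16 scale


def q8_0_size_gb(param_count: float) -> float:
    """Approximate size in GB (10^9 bytes) if every weight were stored as Q8_0."""
    blocks = param_count / Q8_0_BLOCK_WEIGHTS
    return blocks * Q8_0_BLOCK_BYTES / 1e9


print(f"Estimated Q8_0 size: {q8_0_size_gb(ASSUMED_PARAM_COUNT):.2f} GB")
```

Under these assumptions the estimate lands close to the ~8.54GB listed for the Q8_0 model.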

## Setup

In [1]:
from llama_cpp import Llama
from huggingface_hub import hf_hub_download

# Run the following command if llama-cpp-python or huggingface_hub is not installed
# !pip install llama-cpp-python huggingface_hub

In [2]:
MODEL_NAME = "fdtn-ai/Foundation-Sec-8B-Q4_K_M-GGUF"
FILE_NAME = "foundation-sec-8b-q4_k_m.gguf"

# To use the 8-bit (Q8_0) GGUF instead, uncomment the two lines below

# MODEL_NAME = "fdtn-ai/Foundation-Sec-8B-Q8_0-GGUF"
# FILE_NAME = "foundation-sec-8b-q8_0.gguf"

In [3]:
model_file = hf_hub_download(repo_id=MODEL_NAME, filename=FILE_NAME)
model = Llama(model_path=model_file, verbose=False)

llama_context: n_ctx_per_seq (512) < n_ctx_train (131072) -- the full capacity of the model will not be utilized
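
The message above reports that the model was loaded with a context window of 512 tokens, the `llama-cpp-python` default, which is far below the 131072 tokens used in training. Longer prompts need a larger `n_ctx` passed to the `Llama` constructor. The following is a minimal sketch: the helper names and the value 4096 are illustrative choices rather than part of the original notebook, and larger windows use more memory.

```python
def llama_kwargs(model_path: str, n_ctx: int = 4096) -> dict:
    """Collect constructor arguments for llama_cpp.Llama (hypothetical helper)."""
    return {
        "model_path": model_path,
        "n_ctx": n_ctx,      # context window in tokens; 4096 is an arbitrary example
        "verbose": False,
    }


def load_model(model_path: str, n_ctx: int = 4096):
    """Load the GGUF model with a larger context window (hypothetical helper)."""
    from llama_cpp import Llama  # imported lazily so the helper above has no dependency

    return Llama(**llama_kwargs(model_path, n_ctx))
```

With the notebook's variables, this would be called as `model = load_model(model_file)`.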


## Configurations
You can adjust the model's text generation behavior by tuning its arguments. <br>
The example configuration below sets `temperature` to 0.0 (greedy decoding) so that outputs are reproducible. <br>
For a full list of arguments and detailed explanations, refer to the [llama-cpp-python API reference](https://llama-cpp-python.readthedocs.io/en/latest/api-reference/).

In [4]:
params = {
    "max_tokens": 256,
    "temperature": 0.0,
    "repeat_penalty": 1.2,
}
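
Later cells change individual arguments for a single prompt. One way to do that without altering the shared `params` dictionary is to build a per-call copy. This is a small sketch; `with_overrides` is a hypothetical helper, not part of `llama-cpp-python`.

```python
def with_overrides(base: dict, **overrides) -> dict:
    """Return a new dict with `overrides` applied, leaving `base` unchanged."""
    return {**base, **overrides}


base_params = {"max_tokens": 256, "temperature": 0.0, "repeat_penalty": 1.2}

# Per-call settings for a short answer; base_params keeps max_tokens = 256.
short_params = with_overrides(base_params, max_tokens=3)
print(short_params)
print(base_params)
```

The result can be passed to the model the same way as `params`, for example `model(prompt, **short_params)`.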

## Inference

Let’s try the same prompts used in the [Quickstart (Foundation-Sec-8B)](https://github.com/RobustIntelligence/foundation-ai-cookbook/blob/main/1_quickstarts/Quickstart_Foundation-Sec-8B.ipynb) and observe the results!

In [5]:
prompt = "MITRE ATT&CK is"

response = model(prompt, **params)

In [6]:
print(prompt + response["choices"][0]["text"])

MITRE ATT&CK is a globally accessible knowledge base of adversary tactics and techniques based on real-world observations. The ATT&CK knowledge base is used as a foundation for the development of specific threat models and methodologies in the private sector, government, and the cybersecurity product and service community.
MITRE ATT&CK has been adopted by many organizations to help them understand how adversaries operate so they can better defend their networks against attacks. It provides a common language that security professionals use when discussing threats with each other or within an organization’s IT department. The framework also helps identify gaps in existing defenses and prioritize improvements based on actual threat activity observed across multiple industries worldwide over time since its inception back in 2013 by MITRE Corporation (a non-profit research institute).
MITRE ATT&CK is a knowledge base of adversary tactics, techniques, and procedures that can be used to under
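
The output above ends mid-word, most likely because generation reached the `max_tokens` limit of 256. The completion dictionary returned by `llama-cpp-python` includes a `finish_reason` field for each choice, which can be used to detect this. The sketch below uses a hand-written dictionary in place of a real response so that it runs without a model; `was_truncated` is a hypothetical helper.

```python
def was_truncated(response: dict) -> bool:
    """True if the first choice stopped because it hit the token limit."""
    return response["choices"][0].get("finish_reason") == "length"


# Hand-written stand-in for a real completion, for illustration only.
example_response = {
    "choices": [{"text": " a knowledge base ... to under", "finish_reason": "length"}]
}

if was_truncated(example_response):
    print("Output was cut off; consider raising max_tokens.")
```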

In [7]:
prompt = """CVE-2021-44228 is a remote code execution flaw in Apache Log4j2 via unsafe JNDI lookups (“Log4Shell”). The CWE is CWE-502.

CVE-2017-0144 is a remote code execution vulnerability in Microsoft’s SMBv1 server (“EternalBlue”) due to a buffer overflow. The CWE is CWE-119.

CVE-2014-0160 is an information-disclosure bug in OpenSSL’s heartbeat extension (“Heartbleed”) causing out-of-bounds reads. The CWE is CWE-125.

CVE-2017-5638 is a remote code execution issue in Apache Struts 2’s Jakarta Multipart parser stemming from improper input validation of the Content-Type header. The CWE is CWE-20.

CVE-2019-0708 is a remote code execution vulnerability in Microsoft’s Remote Desktop Services (“BlueKeep”) triggered by a use-after-free. The CWE is CWE-416.

CVE-2015-10011 is a vulnerability about OpenDNS OpenResolve improper log output neutralization. The CWE is"""

# Set max_tokens to 3, since we only need the CWE ID.
params["max_tokens"] = 3

response = model(prompt, **params)

In [8]:
print(response["choices"][0]["text"].strip())

CWE-117
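
Limiting `max_tokens` works here because the ID fits within the token budget, but the raw text may still carry whitespace or extra characters. A more defensive option is to pull the ID out with a regular expression. This is a minimal sketch, with `extract_cwe` as a hypothetical helper name:

```python
import re
from typing import Optional

CWE_PATTERN = re.compile(r"CWE-\d+")


def extract_cwe(text: str) -> Optional[str]:
    """Return the first CWE identifier found in `text`, or None if there is none."""
    match = CWE_PATTERN.search(text)
    return match.group(0) if match else None


print(extract_cwe(" CWE-117\n"))
```

With the notebook's variables, this would be called as `extract_cwe(response["choices"][0]["text"])`.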
