# FNCS Augmented

**Goal:**  
for each *security control* in **NIST Special Publication 800-53 rev. 5** whose text contains, in square brackets, parameters to be set by the Organization that uses it, rewrite the control text in natural language with an LLM, so that the control is readable for future use.

## The data

We start by importing the three dataframes prepared for this purpose:

In [1]:
import pandas as pd

csf = pd.read_excel("FNCS.xlsx", sheet_name="FNCS")
sp = pd.read_excel("FNCS.xlsx", sheet_name="800-53rev.5")
cis = pd.read_excel("FNCS.xlsx", sheet_name="CIS CSCv8")

The first dataframe (csf) contains the NIST Cybersecurity Framework (functions, categories and subcategories), where the subcategories are the security requirements to be met.  
Its distinctive feature is that each subcategory is mapped to the related controls from NIST Special Publication 800-53 rev. 5 and from the CIS Controls v8.

In [6]:
csf.head()

Unnamed: 0,Function,Category,Subcategory,NIST SP 800-53 Rev. 5,CIS CSC v8
0,IDENTIFY (ID),"ID.AM: Asset Management - The data, personnel,...",ID.AM-1: Physical devices and systems within t...,"CM-8,PM-5",1.0
1,IDENTIFY (ID),"ID.AM: Asset Management - The data, personnel,...",ID.AM-2: Software platforms and applications w...,CM-8,2.16
2,IDENTIFY (ID),"ID.AM: Asset Management - The data, personnel,...",ID.AM-3: Organizational communication and data...,"AC-4,CA-3,CA-9,PL-8,SA-17",3.0
3,IDENTIFY (ID),"ID.AM: Asset Management - The data, personnel,...",ID.AM-4: External information systems are cata...,"AC-20,PM-5,SA-9",12.0
4,IDENTIFY (ID),"ID.AM: Asset Management - The data, personnel,...","ID.AM-5: Resources (e.g., hardware, devices, d...","CP-2,RA-2,RA-9,SA-20,SC-6",3.0


The second dataframe (sp) contains the security controls of the NIST Special Publication. Each control is identified by a code and has a name, the control text, and a discussion that explains it.

In [7]:
sp.head()

Unnamed: 0,Control Identifier,Control Name,Control,Discussion
0,AC-1,Policy and Procedures,"a. Develop, document, and disseminate to [Assi...",Access control policy and procedures address t...
1,AC-2,Account Management,a. Define and document the types of accounts a...,Examples of system account types include indiv...
2,AC-2,Account Management | Automated System Account ...,Support the management of system accounts usin...,Automated system account management includes u...
3,AC-2,Account Management | Automated Temporary and E...,Automatically [Selection: remove; disable] tem...,Management of temporary and emergency accounts...
4,AC-2,Account Management | Disable Accounts,Disable accounts within [Assignment: organizat...,"Disabling expired, inactive, or otherwise anom..."


Similarly, the third dataframe (cis) contains the CIS controls, identified by their ID and made up of the control name and its description.

In [8]:
cis.head()

Unnamed: 0,Control ID,Control,Description
0,1,Inventory and Control of Enterprise Assets,"Actively manage (inventory, track, and correct..."
1,1,Establish and Maintain Detailed Enterprise Ass...,"Establish and maintain an accurate, detailed, ..."
2,1,Address Unauthorized Assets,Ensure that a process exists to address unauth...
3,1,Utilize an Active Discovery Tool,Utilize an active discovery tool to identify a...
4,1,Use Dynamic Host Configuration Protocol (DHCP)...,Use DHCP logging on all DHCP servers or Intern...


So, for example, the first CSF subcategory:

In [12]:
csf.iloc[0,2]

'ID.AM-1: Physical devices and systems within the organization are inventoried'

Corresponds to the following NIST controls:

In [13]:
csf.iloc[0,3]

'CM-8,PM-5'

And to the following CIS control:

In [14]:
csf.iloc[0,4]

1

That is:

NIST CM-8:

In [21]:
mask = sp["Control Identifier"] == "CM-8"
sp[mask]

Unnamed: 0,Control Identifier,Control Name,Control,Discussion
263,CM-8,System Component Inventory,a. Develop and document an inventory of system...,"System components are discrete, identifiable i..."
264,CM-8,System Component Inventory | Updates During In...,Update the inventory of system components as p...,"Organizations can improve the accuracy, comple..."
265,CM-8,System Component Inventory | Automated Mainten...,"Maintain the currency, completeness, accuracy,...",Organizations maintain system inventories to t...
266,CM-8,System Component Inventory | Automated Unautho...,(a) Detect the presence of unauthorized hardwa...,Automated unauthorized component detection is ...
267,CM-8,System Component Inventory | Accountability In...,Include in the system component inventory info...,Identifying individuals who are responsible an...
268,CM-8,System Component Inventory | Assessed Configur...,Include assessed component configurations and ...,Assessed configurations and approved deviation...
269,CM-8,System Component Inventory | Centralized Repos...,Provide a centralized repository for the inven...,Organizations may implement centralized system...
270,CM-8,System Component Inventory | Automated Locatio...,Support the tracking of system components by g...,The use of automated mechanisms to track the l...
271,CM-8,System Component Inventory | Assignment of Com...,(a) Assign system components to a system; and\...,System components that are not assigned to a s...


NIST PM-5:

In [22]:
mask = sp["Control Identifier"] == "PM-5"
sp[mask]

Unnamed: 0,Control Identifier,Control Name,Control,Discussion
541,PM-5,System Inventory,Develop and update [Assignment: organization-d...,[OMB A-130] provides guidance on developing sy...
542,PM-5,System Inventory | Inventory of Personally Ide...,"Establish, maintain, and update [Assignment: o...","An inventory of systems, applications, and pro..."


CIS 1:

In [24]:
mask = cis["Control ID"] == 1
cis[mask]

Unnamed: 0,Control ID,Control,Description
0,1,Inventory and Control of Enterprise Assets,"Actively manage (inventory, track, and correct..."
1,1,Establish and Maintain Detailed Enterprise Ass...,"Establish and maintain an accurate, detailed, ..."
2,1,Address Unauthorized Assets,Ensure that a process exists to address unauth...
3,1,Utilize an Active Discovery Tool,Utilize an active discovery tool to identify a...
4,1,Use Dynamic Host Configuration Protocol (DHCP)...,Use DHCP logging on all DHCP servers or Intern...
5,1,Use a Passive Asset Discovery Tool,Use a passive discovery tool to identify asset...


Consequently, a single security requirement (subcategory) can translate into multiple specific security controls.  
One or more NIST controls may also overlap in substance with CIS controls, and vice versa.
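As a minimal sketch of this one-to-many mapping, the comma-separated identifiers stored in a csf cell can be split and used to select every matching row of the controls table. The `sp_demo` frame and the `controls_for` helper are hypothetical, introduced only for illustration; they are not part of the original notebook.

```python
import pandas as pd

# Hypothetical miniature of the sp dataframe, limited to two columns.
sp_demo = pd.DataFrame({
    "Control Identifier": ["AC-1", "AC-2", "CM-8", "PM-5"],
    "Control Name": [
        "Policy and Procedures",
        "Account Management",
        "System Component Inventory",
        "System Inventory",
    ],
})

def controls_for(mapping, controls):
    """Return the rows of `controls` whose identifier appears in `mapping`.

    `mapping` is a comma-separated string such as 'CM-8,PM-5'.
    """
    ids = [item.strip() for item in str(mapping).split(",") if item.strip()]
    return controls[controls["Control Identifier"].isin(ids)]

matched = controls_for("CM-8,PM-5", sp_demo)
print(matched)
```

With the real data, the same helper could be called as `controls_for(csf.iloc[0, 3], sp)`, which would return the base controls together with all their enhancements.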

## Analysis of a single case

Let's analyze the first subcategory:

In [4]:
csf.iloc[0,2]

'ID.AM-1: Physical devices and systems within the organization are inventoried'

In essence, it requires taking an inventory of systems and devices.  
Let's take only the first of the applicable NIST controls (broken down into control name, control text and discussion):

In [5]:
mask = sp["Control Identifier"] == "CM-8"
sp[mask].loc[263,"Control Name"]

'System Component Inventory'

In [6]:
mask = sp["Control Identifier"] == "CM-8"
print(sp[mask].loc[263, "Control"])

a. Develop and document an inventory of system components that:
1. Accurately reflects the system;
2. Includes all components within the system;
3. Does not include duplicate accounting of components or components assigned to any other system;
4. Is at the level of granularity deemed necessary for tracking and reporting; and
5. Includes the following information to achieve system component accountability: [Assignment: organization-defined information deemed necessary to achieve effective system component accountability]; and
b. Review and update the system component inventory [Assignment: organization-defined frequency].


In [7]:
mask = sp["Control Identifier"] == "CM-8"
print(sp[mask].loc[263, "Discussion"])

System components are discrete, identifiable information technology assets that include hardware, software, and firmware. Organizations may choose to implement centralized system component inventories that include components from all organizational systems. In such situations, organizations ensure that the inventories include system-specific information required for component accountability. The information necessary for effective accountability of system components includes the system name, software owners, software version numbers, hardware inventory specifications, software license information, and for networked components, the machine names and network addresses across all implemented protocols (e.g., IPv4, IPv6). Inventory specifications include date of receipt, cost, model, serial number, manufacturer, supplier information, component type,  and physical location.
Preventing duplicate accounting of system components addresses the lack of accountability that occurs when component o

This example highlights a particular difficulty with the NIST documentation: the control text includes **parameters** enclosed in square brackets, to be filled in by the Organization that wants to apply the control.
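To make the bracketed parameters concrete, here is a small sketch that lists them with a regular expression. The pattern and the `find_parameters` helper are assumptions made for illustration: they only recognize non-nested `[Assignment: ...]` and `[Selection: ...]` markers (optionally with `(one or more)`), so a parameter nested inside another one would only be matched at its innermost level.

```python
import re

# Assumed pattern: matches non-nested Assignment/Selection markers only.
PARAM_RE = re.compile(
    r"\[(Assignment|Selection)(?: \(one or more\))?:\s*([^\[\]]*)\]"
)

def find_parameters(control_text):
    """Return (operation, content) pairs for each parameter found."""
    return [
        (match.group(1), match.group(2).strip())
        for match in PARAM_RE.finditer(control_text)
    ]

with_parameter = (
    "b. Review and update the system component inventory "
    "[Assignment: organization-defined frequency]."
)
without_parameter = (
    "Automatically audit account creation, modification, enabling, "
    "disabling, and removal actions."
)

print(find_parameters(with_parameter))
# [('Assignment', 'organization-defined frequency')]
print(find_parameters(without_parameter))
# []
```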

To summarize what we have found so far:  
- a subcategory can map to one or more NIST/CIS controls
- some controls may overlap with others and so be possible "duplicates"
- some controls may contain special text enclosed in square brackets

## The task

The task is therefore to rewrite the NIST controls so that the parameters in square brackets are replaced with natural-language text.  
We could use the following prompt:

In [2]:
template = """
Analyze the security control from NIST Special Publication 800-53 rev. 5 that I am transmitting to you.
The transmitted control may or may not contain particular text - called PARAMETER - enclosed in square brackets. Regarding this particular text, consider the following explanation:
'For some controls, additional flexibility is provided by allowing organizations to define specific values for designated parameters associated with the controls. Flexibility is achieved as part of a tailoring process using assignment and selection operations embedded within the controls and enclosed by brackets. The assignment and selection operations give organizations the capability to customize controls based on organizational security and privacy requirements. In contrast to assignment operations which allow complete flexibility in the designation of parameter values, selection operations narrow the range of potential values by providing a specific list of items from which organizations choose.
Determination of the organization-defined parameters can evolve from many sources, including laws, executive orders, directives, regulations, policies, standards, guidance, and mission or business needs. Organizational risk assessments and risk tolerance are also important factors in determining the values for control parameters. Once specified by the organization, the values for the assignment and selection operations become a part of the control.'
Your task is:
- analyze the text of the control
- check for the presence of parameters
- if there are no parameters, respond with the word 'None'
- if there are parameters, rewrite them all (summarizing their content) in natural language and remove the square brackets, then return the original text of the control integrated with the parts you have rewritten.
Do not invent anything and do not add anything else such as comments, premises, conclusions, and explanations in your response, just return 'None' if there are no parameters or the rewritten control text.
Below are some examples:

Control:
a. Develop and document an inventory of system components that:
- Accurately reflects the system;
- Includes all components within the system;
- Does not include duplicate accounting of components or components assigned to any other system;
- Is at the level of granularity deemed necessary for tracking and reporting; and
- Includes the following information to achieve system component accountability: [Assignment: organization-defined information deemed necessary to achieve effective system component accountability]; and
b. Review and update the system component inventory [Assignment: organization-defined frequency].

Answer:
a. Develop and document an inventory of system components that:
- Accurately reflects the system;
- Includes all components within the system;
- Does not include duplicate accounting of components or components assigned to any other system;
- Is at the level of granularity deemed necessary for tracking and reporting; and
- Includes the information defined by the Organization and considered necessary to ensure effective system components accountability; and
b. Review and update the system component inventory according to the frequency defined by the Organization.

Control:
Include in the system component inventory information, a means for identifying by [Assignment (one or more): name, position, role], individuals responsible and accountable for administering those components.

Answer:
Include in the system component inventory information, a means for identifying, by name, position, or role, the individuals responsible and accountable for administering those components.

Control:
a. Develop, document, and disseminate to [Assignment: organization-defined personnel or roles]:
1. [Selection (one or more): organization-level; mission/business process-level; system-level] access control policy that:
(a) Addresses purpose, scope, roles, responsibilities, management commitment, coordination among organizational entities, and compliance; and
(b) Is consistent with applicable laws, executive orders, directives, regulations, policies, standards, and guidelines; and
2. Procedures to facilitate the implementation of the access control policy and the associated access controls;
b. Designate an [Assignment: organization-defined official] to manage the development, documentation, and dissemination of the access control policy and procedures; and
c. Review and update the current access control:
1. Policy [Assignment: organization-defined frequency] and following [Assignment: organization-defined events]; and
2. Procedures [Assignment: organization-defined frequency] and following [Assignment: organization-defined events].

Answer:
a. Develop, document, and disseminate to personnel or roles defined by the Organization:
1. An organization-level, mission/business process-level or system-level access control policy that:
(a) Addresses purpose, scope, roles, responsibilities, management commitment, coordination among organizational entities, and compliance; and
(b) Is consistent with applicable laws, executive orders, directives, regulations, policies, standards, and guidelines; and
2. Procedures to facilitate the implementation of the access control policy and the associated access controls;
b. Designate an official defined by the Organization to manage the development, documentation, and dissemination of the access control policy and procedures; and
c. Review and update the current access control:
1. Policy at a frequency defined by the Organization and following events defined by the Organization; and
2. Procedures at a frequency defined by the Organization and following events defined by the Organization.

Control:
Automatically audit account creation, modification, enabling, disabling, and removal actions.

Answer:
None

Control:
{control}

Answer:
"""

In the prompt I essentially tell the model to analyze the text of the security control I will pass to it, which may contain parameters enclosed in square brackets, and I provide the explanation, taken directly from the NIST SP, of what parameters are and what they are for.  
I then ask the model to rewrite in natural language (summarizing them) the parameters contained in the control, or to return a plain "None" if there are no parameters.  
Finally, I provide examples of controls and their rewrites (the *few-shot prompting* technique).
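The few-shot structure described above can also be assembled programmatically from a list of (control, answer) pairs. This is only a sketch: `INSTRUCTIONS` is a shortened stand-in for the full instructions of the template, and `EXAMPLES` and `build_prompt` are hypothetical names that do not appear in the notebook.

```python
# Shortened stand-in for the instructions in the template (assumption).
INSTRUCTIONS = (
    "Rewrite the bracketed parameters in natural language; "
    "answer 'None' if there are no parameters."
)

# Two of the example pairs used in the template.
EXAMPLES = [
    (
        "Include in the system component inventory information, a means for "
        "identifying by [Assignment (one or more): name, position, role], "
        "individuals responsible and accountable for administering those "
        "components.",
        "Include in the system component inventory information, a means for "
        "identifying, by name, position, or role, the individuals responsible "
        "and accountable for administering those components.",
    ),
    (
        "Automatically audit account creation, modification, enabling, "
        "disabling, and removal actions.",
        "None",
    ),
]

def build_prompt(control, examples=EXAMPLES, instructions=INSTRUCTIONS):
    """Assemble instructions, worked examples and the control to rewrite."""
    parts = [instructions, "Below are some examples:", ""]
    for example_control, example_answer in examples:
        parts += ["Control:", example_control, "", "Answer:", example_answer, ""]
    # The final block is left open so the model completes the answer.
    parts += ["Control:", control, "", "Answer:"]
    return "\n".join(parts)

new_control = (
    "Enforce dual authorization for [Assignment: organization-defined "
    "privileged commands and/or other organization-defined actions]."
)
prompt_text = build_prompt(new_control)
print(prompt_text)
```

Keeping the examples in a list makes it easy to add or remove a pair, including a negative example whose expected answer is "None".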

## Test

Let's see whether the prompt works, using OpenAI's GPT-4 as the LLM (through Azure) and LangChain as the framework.

In [3]:
from dotenv import find_dotenv, load_dotenv
from langchain.chat_models import AzureChatOpenAI
from langchain.prompts import ChatPromptTemplate
from langchain.schema.output_parser import StrOutputParser

# Load environment variables (e.g. the Azure OpenAI settings) from a local .env file
_ = load_dotenv(find_dotenv())

In [4]:
prompt = ChatPromptTemplate.from_template(template)
model = AzureChatOpenAI(deployment_name="gpt4_32k_dbrock", model="gpt-4-32k", temperature=0)
output_parser = StrOutputParser()

In [5]:
chain = prompt | model | output_parser

To run the test, we pass the following control, which contains a parameter:

In [36]:
print(sp.loc[15, "Control"])

Enforce dual authorization for [Assignment: organization-defined privileged commands and/or other organization-defined actions].


In [12]:
answer = chain.invoke({"control":sp.loc[15, "Control"]})
print(answer)

Enforce dual authorization for privileged commands and/or other actions as defined by the Organization.


And the following control, which contains no parameters:

In [38]:
print(sp.loc[14, "Control"])

Enforce approved authorizations for logical access to information and system resources in accordance with applicable access control policies.


In [39]:
answer = chain.invoke({"control":sp.loc[14, "Control"]})
print(answer)

None


Great! :) The model properly rewrote the control, which asks the Organization to define the privileged commands or other actions subject to dual authorization, replacing the parameter with natural text:  

from "[Assignment: organization-defined privileged commands and/or other organization-defined actions]"  

to "privileged commands and/or other actions as defined by the Organization"  

For the second control, it correctly detected the absence of parameters and returned a plain "None".

## Execution

Given the positive test result, we can apply the chain to all the controls in the sp dataframe (i.e. the NIST controls) and thus create a new column in the dataframe containing the controls rewritten as desired.  
So:
- I create a function (called rewrite) that takes a NIST control as input and, if the control contains the character "[", calls the chain to turn the control's parameters into natural language; otherwise it returns the original control
- with the apply() method I apply the rewrite function to all NIST controls
- I create a new column called "Rewritten Control" holding the rewritten controls (or the original ones where there were no parameters)
- I rename the original "Control" column to "Original Control"
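The gating logic in the first step can be tried offline with the model call replaced by a stub. The sketch below is a variant, not the notebook's function: `rewrite_control` takes the model call as an argument and, as an added assumption, falls back to the original text when the answer is "None" (which could happen if a text contained brackets that are not parameters). `stub_invoke` is a hypothetical stand-in for `chain.invoke`.

```python
def rewrite_control(text, invoke):
    """Call the model only when the control contains a '[' character."""
    if "[" not in text:
        return text
    answer = invoke(text).strip()
    # Added assumption: treat a 'None' answer as "nothing to rewrite".
    return text if answer == "None" else answer

model_calls = []

def stub_invoke(text):
    """Hypothetical stand-in for chain.invoke; knows one parameter only."""
    model_calls.append(text)
    marker = "[Assignment: organization-defined frequency]"
    if marker not in text:
        return "None"
    return text.replace(
        marker, "according to the frequency defined by the Organization"
    )

plain = "g. Monitor the use of accounts;"
parametric = (
    "b. Review and update the system component inventory "
    "[Assignment: organization-defined frequency]."
)
bracketed_only = "See [OMB A-130] for guidance."  # brackets, but no parameter

results = [rewrite_control(t, stub_invoke) for t in (plain, parametric, bracketed_only)]
for line in results:
    print(line)
print(len(model_calls), "model calls")
```

Only the two bracketed texts reach the stub, which is the point of the "[" check: controls without parameters cost no model call.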

In [6]:
def rewrite(x):
    if "[" not in x:
        return x
    else:
        return chain.invoke({"control":x})

In [7]:
sp["Rewritten Control"] = sp.Control.apply(rewrite)

Retrying langchain.chat_models.openai.ChatOpenAI.completion_with_retry.<locals>._completion_with_retry in 4.0 seconds as it raised Timeout: Request timed out: HTTPSConnectionPool(host='dbrock.openai.azure.com', port=443): Read timed out. (read timeout=600).
Retrying langchain.chat_models.openai.ChatOpenAI.completion_with_retry.<locals>._completion_with_retry in 4.0 seconds as it raised APIConnectionError: Error communicating with OpenAI: ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response')).
Retrying langchain.chat_models.openai.ChatOpenAI.completion_with_retry.<locals>._completion_with_retry in 4.0 seconds as it raised Timeout: Request timed out: HTTPSConnectionPool(host='dbrock.openai.azure.com', port=443): Read timed out. (read timeout=600).
Retrying langchain.chat_models.openai.ChatOpenAI.completion_with_retry.<locals>._completion_with_retry in 4.0 seconds as it raised Timeout: Request timed out: HTTPSConnectionPool(host='dbrock.openai.azure.c
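The timeouts logged above suggest that a long run over the whole dataframe can be interrupted. One possible way to avoid losing work, sketched here under assumptions, is to loop explicitly, keep partial results in a dictionary, and retry only the rows that failed. `rewrite_all` and `flaky_invoke` are hypothetical names; `flaky_invoke` simulates a model call that times out once.

```python
def rewrite_all(controls, invoke, done=None):
    """Rewrite each control, keeping partial results if a call fails."""
    done = {} if done is None else done
    failed = []
    for idx, text in enumerate(controls):
        if idx in done:
            continue  # already processed in a previous pass
        try:
            done[idx] = invoke(text) if "[" in text else text
        except Exception as exc:  # e.g. a timeout surfaced by the client
            failed.append((idx, repr(exc)))
    return done, failed

attempts = {"count": 0}

def flaky_invoke(text):
    """Hypothetical model call that fails on its first invocation."""
    attempts["count"] += 1
    if attempts["count"] == 1:
        raise TimeoutError("simulated timeout")
    return text.replace(
        "[Assignment: organization-defined frequency]",
        "according to the frequency defined by the Organization",
    )

controls = [
    "Automatically audit account creation, modification, enabling, "
    "disabling, and removal actions.",
    "b. Review and update the system component inventory "
    "[Assignment: organization-defined frequency].",
]

done, failed_first = rewrite_all(controls, flaky_invoke)
print("after first pass:", sorted(done), failed_first)

# Second pass: only the row that failed is sent again.
done, failed_second = rewrite_all(controls, flaky_invoke, done)
print("after second pass:", sorted(done), failed_second)
```

The `done` dictionary could also be saved to disk between passes, so that a restarted notebook resumes from where it stopped.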

In [8]:
sp = sp.rename({"Control":"Original Control"}, axis=1)

And here is the new dataframe:

In [9]:
sp

Unnamed: 0,Control Identifier,Control Name,Original Control,Discussion,Rewritten Control
0,AC-1,Policy and Procedures,"a. Develop, document, and disseminate to [Assi...",Access control policy and procedures address t...,"a. Develop, document, and disseminate to perso..."
1,AC-2,Account Management,a. Define and document the types of accounts a...,Examples of system account types include indiv...,a. Define and document the types of accounts a...
2,AC-2,Account Management | Automated System Account ...,Support the management of system accounts usin...,Automated system account management includes u...,Support the management of system accounts usin...
3,AC-2,Account Management | Automated Temporary and E...,Automatically [Selection: remove; disable] tem...,Management of temporary and emergency accounts...,Automatically remove or disable temporary and ...
4,AC-2,Account Management | Disable Accounts,Disable accounts within [Assignment: organizat...,"Disabling expired, inactive, or otherwise anom...",Disable accounts within a time period defined ...
...,...,...,...,...,...
1002,SR-11,Component Authenticity,a. Develop and implement anti-counterfeit poli...,Sources of counterfeit components include manu...,a. Develop and implement anti-counterfeit poli...
1003,SR-11,Component Authenticity | Anti-counterfeit Trai...,Train [Assignment: organization-defined person...,None.,Train personnel or roles defined by the Organi...
1004,SR-11,Component Authenticity | Configuration Control...,Maintain configuration control over the follow...,None.,Maintain configuration control over the follow...
1005,SR-11,Component Authenticity | Anti-counterfeit Scan...,Scan for counterfeit system components [Assign...,The type of component determines the type of s...,Scan for counterfeit system components at a fr...


If we pick one of the original controls that contains parameters:

In [10]:
print(sp.loc[1, "Original Control"])

a. Define and document the types of accounts allowed and specifically prohibited for use within the system;
b. Assign account managers;
c. Require [Assignment: organization-defined prerequisites and criteria] for group and role membership;
d. Specify:
1. Authorized users of the system;
2. Group and role membership; and
3. Access authorizations (i.e., privileges) and [Assignment: organization-defined attributes (as required)] for each account;
e. Require approvals by [Assignment: organization-defined personnel or roles] for requests to create accounts;
f. Create, enable, modify, disable, and remove accounts in accordance with [Assignment: organization-defined policy, procedures, prerequisites, and criteria];
g. Monitor the use of accounts;
h. Notify account managers and [Assignment: organization-defined personnel or roles] within:
1. [Assignment: organization-defined time period] when accounts are no longer required;
2. [Assignment: organization-defined time period] when users are termi

We can see that it has been rewritten in natural language, with the parameters removed:

In [11]:
print(sp.loc[1, "Rewritten Control"])

a. Define and document the types of accounts allowed and specifically prohibited for use within the system;
b. Assign account managers;
c. Require prerequisites and criteria defined by the Organization for group and role membership;
d. Specify:
1. Authorized users of the system;
2. Group and role membership; and
3. Access authorizations (i.e., privileges) and attributes as required and defined by the Organization for each account;
e. Require approvals by personnel or roles defined by the Organization for requests to create accounts;
f. Create, enable, modify, disable, and remove accounts in accordance with the policy, procedures, prerequisites, and criteria defined by the Organization;
g. Monitor the use of accounts;
h. Notify account managers and personnel or roles defined by the Organization within:
1. A time period defined by the Organization when accounts are no longer required;
2. A time period defined by the Organization when users are terminated or transferred; and
3. A time per
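A spot check on one row can be complemented by an automated pass that flags rewritten texts still carrying a parameter marker, or coming back empty. The helper names below (`leftover_markers`, `rows_to_review`) are hypothetical, and the second sample row is a made-up failure case in which the model returned the control unchanged.

```python
MARKERS = ("[Assignment", "[Selection")

def leftover_markers(rewritten):
    """Return the parameter markers still present in a rewritten control."""
    return [marker for marker in MARKERS if marker in rewritten]

def rows_to_review(pairs):
    """Map row labels to the reasons they deserve a manual look.

    `pairs` is an iterable of (row_label, rewritten_text).
    """
    flagged = {}
    for label, rewritten in pairs:
        found = leftover_markers(rewritten)
        if found or not rewritten.strip():
            flagged[label] = found or ["empty answer"]
    return flagged

sample = [
    (
        "rewritten",
        "Enforce dual authorization for privileged commands and/or other "
        "actions as defined by the Organization.",
    ),
    (
        "unchanged",  # hypothetical failure: the parameter was left in place
        "Enforce dual authorization for [Assignment: organization-defined "
        "privileged commands and/or other organization-defined actions].",
    ),
]

flagged = rows_to_review(sample)
print(flagged)
# {'unchanged': ['[Assignment']}
```

On the real dataframe, the same check could be run with `rows_to_review(sp["Rewritten Control"].items())`.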

Let's save everything to an Excel file:

In [12]:
with pd.ExcelWriter('FNCS Augmented.xlsx') as writer:
    csf.to_excel(writer, sheet_name="FNCS", index=False)
    sp.to_excel(writer, sheet_name="800-53rev.5", index=False)
    cis.to_excel(writer, sheet_name="CIS CSCv8", index=False)

*That's all, Folks!*