In [107]:
!pip -q install clean-text

In [110]:
from IPython.display import display, HTML

from cleantext import clean
import pandas as pd
import requests 
import spacy
import tqdm

In [111]:
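# Note: the small model has no static word vectors, so Doc.similarity scores are rough; en_core_web_md/lg include vectors.
nlp = spacy.load('en_core_web_sm')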

In [112]:
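asvs_url = 'https://github.com/OWASP/ASVS/releases/download/v4.0.3_release/OWASP.Application.Security.Verification.Standard.4.0.3-en.json'
response = requests.get(asvs_url, timeout=30)
response.raise_for_status()  # fail loudly on an HTTP error instead of parsing an error page
asvs_json = response.json()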

print(asvs_json['Version'])

4.0.3


In [114]:
asvs_req_nlp = []

sections = asvs_json['Requirements']

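for section in tqdm.tqdm(sections):
    # Each chapter holds sub-sections, which in turn hold the individual requirements.
    for subsection in section['Items']:
        for req in subsection['Items']:
            description = clean(req['Description'], lower=False)

            # Skip placeholders for requirements that were deleted from the standard.
            if 'DELETED' not in description:
                req['nlp'] = nlp(description)  # parsed spaCy Doc, reused for every scenario
                req['category'] = subsection['Name']
                asvs_req_nlp.append(req)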

print('\n\nSample:')
print(asvs_req_nlp[0])

100%|██████████| 14/14 [00:03<00:00,  3.80it/s]



Sample:
{'Shortcode': 'V1.1.1', 'Ordinal': 1, 'Description': 'Verify the use of a secure software development lifecycle that addresses security in all stages of development. ([C1](https://owasp.org/www-project-proactive-controls/#div-numbering))', 'L1': {'Required': False, 'Requirement': ''}, 'L2': {'Required': True, 'Requirement': ''}, 'L3': {'Required': True, 'Requirement': ''}, 'CWE': [], 'NIST': [], 'nlp': Verify the use of a secure software development lifecycle that addresses security in all stages of development. ([C1](https://owasp.org/www-project-proactive-controls/#div-numbering)), 'category': 'Secure Software Development Lifecycle'}
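
The nested `Requirements` -> `Items` -> `Items` traversal above can be pulled into a small generator, which makes the filtering step easy to check without downloading the standard or loading spaCy. This is a minimal sketch: `iter_requirements` and the toy `sample` dictionary are hypothetical, shaped after the keys the cell above reads (`Items`, `Name`, `Description`, `Shortcode`), and unlike the cell above it copies each record instead of modifying the input.

```python
from typing import Any, Dict, Iterator


def iter_requirements(asvs: Dict[str, Any]) -> Iterator[Dict[str, Any]]:
    """Yield one flat record per active requirement (hypothetical helper)."""
    for chapter in asvs['Requirements']:
        for subsection in chapter['Items']:
            for req in subsection['Items']:
                # Deleted requirements remain in the JSON as placeholders; skip them.
                if 'DELETED' in req['Description']:
                    continue
                # Copy the record so the source structure is left untouched.
                yield {**req, 'category': subsection['Name']}


# Toy data shaped like the ASVS JSON; the second entry is invented for illustration.
sample = {
    'Requirements': [{
        'Items': [{
            'Name': 'Secure Software Development Lifecycle',
            'Items': [
                {'Shortcode': 'V1.1.1',
                 'Description': 'Verify the use of a secure software development lifecycle.'},
                {'Shortcode': 'V0.0.0',
                 'Description': '[DELETED, placeholder entry]'},
            ],
        }],
    }],
}

records = list(iter_requirements(sample))
print([(r['Shortcode'], r['category']) for r in records])
```

With the real data, `nlp(record['Description'])` could then be applied to each yielded record in a separate step.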





In [77]:
# https://attack.mitre.org/tactics/enterprise/

abuse_scenarios = [
    # Reconnaissance - T1595
    "Adversaries may execute active reconnaissance scans to gather information that can be used during targeting. Active scans are those where the adversary probes victim infrastructure via network traffic, as opposed to other forms of reconnaissance that do not involve direct interaction.",

    # Resource Development - T1588
    "Adversaries may buy and/or steal capabilities that can be used during targeting. Rather than developing their own capabilities in-house, adversaries may purchase, freely download, or steal them. Activities may include the acquisition of malware, software (including licenses), exploits, certificates, and information relating to vulnerabilities. Adversaries may obtain capabilities to support their operations throughout numerous phases of the adversary lifecycle.",

    # Initial Access - T1199
    "Adversaries may breach or otherwise leverage organizations who have access to intended victims. Access through trusted third party relationship exploits an existing connection that may not be protected or receives less scrutiny than standard mechanisms of gaining access to a network.",

    # Execution - T1559
    "Adversaries may abuse inter-process communication (IPC) mechanisms for local code or command execution. IPC is typically used by processes to share data, communicate with each other, or synchronize execution. IPC is also commonly used to avoid situations such as deadlocks, which occurs when processes are stuck in a cyclic waiting pattern.",

    # Persistence - T1098
    "Adversaries may manipulate accounts to maintain access to victim systems. Account manipulation may consist of any action that preserves adversary access to a compromised account, such as modifying credentials or permission groups. These actions could also include account activity designed to subvert security policies, such as performing iterative password updates to bypass password duration policies and preserve the life of compromised credentials.",

    # Privilege Escalation - T1134
    "Adversaries may modify access tokens to operate under a different user or system security context to perform actions and bypass access controls. Windows uses access tokens to determine the ownership of a running process. A user can manipulate access tokens to make a running process appear as though it is the child of a different process or belongs to someone other than the user that started the process. When this occurs, the process also takes on the security context associated with the new token.",

    # Defense Evasion - T1006
    "Adversaries may directly access a volume to bypass file access controls and file system monitoring. Windows allows programs to have direct access to logical volumes. Programs with direct access may read and write files directly from the drive by analyzing file system data structures. This technique bypasses Windows file access controls as well as file system monitoring tools.",

    # Credential Access - T1110
    "Adversaries may use brute force techniques to gain access to accounts when passwords are unknown or when password hashes are obtained. Without knowledge of the password for an account or set of accounts, an adversary may systematically guess the password using a repetitive or iterative mechanism. Brute forcing passwords can take place via interaction with a service that will check the validity of those credentials or offline against previously acquired credential data, such as password hashes.",

    # Discovery - T1087
    "Adversaries may attempt to get a listing of accounts on a system or within an environment. This information can help adversaries determine which accounts exist to aid in follow-on behavior.",

    # Lateral Movement - T1534
    "Adversaries may use internal spearphishing to gain access to additional information or exploit other users within the same organization after they already have access to accounts or systems within the environment. Internal spearphishing is multi-staged campaign where an email account is owned either by controlling the user's device with previously installed malware or by compromising the account credentials of the user. Adversaries attempt to take advantage of a trusted internal account to increase the likelihood of tricking the target into falling for the phish attempt.",

    # Collection - T1005
    "Adversaries may search local system sources, such as file systems and configuration files or local databases, to find files of interest and sensitive data prior to Exfiltration.",

    # Command and Control - T1104
    "Adversaries may create multiple stages for command and control that are employed under different conditions or for certain functions. Use of multiple stages may obfuscate the command and control channel to make detection more difficult.",
    
    # Exfiltration - T1030
    "An adversary may exfiltrate data in fixed size chunks instead of whole files or limit packet sizes below certain thresholds. This approach may be used to avoid triggering network data transfer threshold alerts.",

    # Impact - T1499
    "Adversaries may perform Endpoint Denial of Service (DoS) attacks to degrade or block the availability of services to users. Endpoint DoS can be performed by exhausting the system resources those services are hosted on or exploiting the system to cause a persistent crash condition. Example services include websites, email services, DNS, and web-based applications. Adversaries have been observed conducting DoS attacks for political purposes and to support other malicious activities, including distraction, hacktivism, and extortion.",
]
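
Each scenario above is tied to its ATT&CK tactic and technique only through a comment, so the label is lost once the list is iterated. A possible alternative, sketched here under the assumption that the labels should be printed next to the results, keeps them as data. The `AbuseScenario` type and the `parse_label` helper are hypothetical names, and the descriptions are shortened to their first sentence.

```python
from typing import NamedTuple


class AbuseScenario(NamedTuple):
    tactic: str
    technique_id: str
    description: str


def parse_label(label: str, description: str) -> AbuseScenario:
    """Split a 'Tactic - T1234' label, as written in the comments above."""
    tactic, technique_id = label.rsplit(' - ', 1)
    return AbuseScenario(tactic.strip(), technique_id.strip(), description)


labeled_scenarios = [
    parse_label(
        'Reconnaissance - T1595',
        'Adversaries may execute active reconnaissance scans to gather '
        'information that can be used during targeting.',
    ),
    parse_label(
        'Credential Access - T1110',
        'Adversaries may use brute force techniques to gain access to accounts '
        'when passwords are unknown or when password hashes are obtained.',
    ),
]

for i, scenario in enumerate(labeled_scenarios, start=1):
    print(f'#{i} [{scenario.tactic} / {scenario.technique_id}]: {scenario.description}')
```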

In [122]:
for i, abuse in enumerate(abuse_scenarios, start=1):
    abuse_nlp = nlp(clean(abuse, lower=False))

    # Score every requirement against the current scenario.
    for req in asvs_req_nlp:
        req['nlp_similarity'] = abuse_nlp.similarity(req['nlp'])

    print(f'\n#{i}: {abuse}')

    df_req = pd.json_normalize(asvs_req_nlp)

    # Show the ten requirements most similar to the scenario.
    top10 = (
        df_req[['nlp_similarity', 'category', 'Shortcode', 'Description']]
        .sort_values('nlp_similarity', ascending=False)
        .head(10)
        .reset_index(drop=True)
    )
    display(HTML(top10.to_html()))


#1: Adversaries may execute active reconnaissance scans to gather information that can be used during targeting. Active scans are those where the adversary probes victim infrastructure via network traffic, as opposed to other forms of reconnaissance that do not involve direct interaction.


,nlp_similarity,category,Shortcode,Description
0,0.797182,Log Processing,V7.2.1,"Verify that all authentication decisions are logged, without storing sensitive session tokens or passwords. This should include requests with relevant metadata needed for security investigations."
1,0.774155,Malicious Software Architecture,V1.10.1,"Verify that a source code control system is in use, with procedures to ensure that check-ins are accompanied by issues or change tickets. The source code control system should have access control and identifiable users to allow traceability of any changes."
2,0.764322,Generic Web Service Security,V13.1.1,Verify that all application components use the same encodings and parsers to avoid parsing attacks that exploit different URI or file parsing behavior that could be used in SSRF and RFI attacks.
3,0.756764,Log Processing,V7.2.2,Verify that all access control decisions can be logged and all failed decisions are logged. This should include requests with relevant metadata needed for security investigations.
4,0.733425,Build and Deploy,V14.1.5,Verify that authorized administrators can verify the integrity of all security-relevant configurations to detect tampering.
5,0.720403,Authentication Architecture,V1.2.3,"Verify that the application uses a single vetted authentication mechanism that is known to be secure, can be extended to include strong authentication, and has sufficient logging and monitoring to detect account abuse or breaches."
6,0.713918,File Storage,V12.4.2,Verify that files obtained from untrusted sources are scanned by antivirus scanners to prevent upload and serving of known malicious content.
7,0.711574,Service Authentication,V2.10.3,"Verify that passwords are stored with sufficient protection to prevent offline recovery attacks, including local system access."
8,0.708305,File Upload,V12.1.1,Verify that the application will not accept large files that could fill up storage or cause a denial of service.
9,0.704489,File Download,V12.5.2,Verify that direct requests to uploaded files will never be executed as HTML/JavaScript content.
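
The `nlp_similarity` column holds what `Doc.similarity` returns, which by default is the cosine similarity between the two documents' vectors, each an average over its tokens. The sketch below reproduces that computation and the top-k ranking in plain Python. The three-dimensional token vectors and all function names are invented for illustration; real spaCy vectors have many more dimensions.

```python
import math
from typing import Dict, List, Sequence, Tuple

Vector = Sequence[float]


def mean_vector(token_vectors: List[Vector]) -> List[float]:
    """Average token vectors component-wise to get one document vector."""
    n = len(token_vectors)
    return [sum(component) / n for component in zip(*token_vectors)]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query: Vector, docs: Dict[str, Vector], k: int) -> List[Tuple[str, float]]:
    """Return the k documents most similar to the query, best first."""
    scored = [(name, cosine(query, vec)) for name, vec in docs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]


# Toy token vectors (hypothetical values, not spaCy output).
scenario_vec = mean_vector([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
requirement_vecs = {
    'req_a': mean_vector([[1.0, 1.0, 0.0]]),                   # same direction as the scenario
    'req_b': mean_vector([[0.0, 0.0, 1.0]]),                   # orthogonal to the scenario
    'req_c': mean_vector([[1.0, 0.0, 0.0], [1.0, 0.0, 1.0]]),  # partial overlap
}

ranking = top_k(scenario_vec, requirement_vecs, k=2)
print(ranking)
```

Because averaging discards word order and blends every token into one vector, long texts with broadly similar vocabulary can score close to each other.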



#2: Adversaries may buy and/or steal capabilities that can be used during targeting. Rather than developing their own capabilities in-house, adversaries may purchase, freely download, or steal them. Activities may include the acquisition of malware, software (including licenses), exploits, certificates, and information relating to vulnerabilities. Adversaries may obtain capabilities to support their operations throughout numerous phases of the adversary lifecycle.


,nlp_similarity,category,Shortcode,Description
0,0.741294,Log Processing,V7.2.1,"Verify that all authentication decisions are logged, without storing sensitive session tokens or passwords. This should include requests with relevant metadata needed for security investigations."
1,0.725251,Application Integrity,V10.3.3,"Verify that the application has protection from subdomain takeovers if the application relies upon DNS entries or DNS subdomains, such as expired domain names, out of date DNS pointers or CNAMEs, expired projects at public source code repos, or transient cloud APIs, serverless functions, or storage buckets (*autogen-bucket-id*.cloud.example.com) or similar. Protections can include ensuring that DNS names used by applications are regularly checked for expiry or change."
2,0.715681,Application Integrity,V10.3.2,"Verify that the application employs integrity protections, such as code signing or subresource integrity. The application must not load or execute code from untrusted sources, such as loading includes, modules, plugins, code, or libraries from untrusted sources or the Internet."
3,0.710137,"Memory, String, and Unmanaged Code",V5.4.3,"Verify that sign, range, and input validation techniques are used to prevent integer overflows."
4,0.707014,General Authenticator Security,V2.2.1,"Verify that anti-automation controls are effective at mitigating breached credential testing, brute force, and account lockout attacks. Such controls include blocking the most common breached passwords, soft lockouts, rate limiting, CAPTCHA, ever increasing delays between attempts, IP address restrictions, or risk-based restrictions such as location, first login on a device, recent attempts to unlock the account, or similar. Verify that no more than 100 failed attempts per hour is possible on a single account."
5,0.706621,Service Authentication,V2.10.4,"Verify passwords, integrations with databases and third-party systems, seeds and internal secrets, and API keys are managed securely and not included in the source code or stored within source code repositories. Such storage SHOULD resist offline attacks. The use of a secure software key store (L1), hardware TPM, or an HSM (L3) is recommended for password storage."
6,0.685239,Business Logic Security,V11.1.5,"Verify the application has business logic limits or validation to protect against likely business risks or threats, identified using threat modeling or similar methodologies."
7,0.684569,Secure Software Development Lifecycle,V1.1.2,"Verify the use of threat modeling for every design change or sprint planning to identify threats, plan for countermeasures, facilitate appropriate risk responses, and guide security testing."
8,0.68429,GraphQL,V13.4.1,"Verify that a query allow list or a combination of depth limiting and amount limiting is used to prevent GraphQL or data layer expression Denial of Service (DoS) as a result of expensive, nested queries. For more advanced scenarios, query cost analysis should be used."
9,0.682003,General Data Protection,V8.1.3,"Verify the application minimizes the number of parameters in a request, such as hidden fields, Ajax variables, cookies and header values."



#3: Adversaries may breach or otherwise leverage organizations who have access to intended victims. Access through trusted third party relationship exploits an existing connection that may not be protected or receives less scrutiny than standard mechanisms of gaining access to a network.


,nlp_similarity,category,Shortcode,Description
0,0.775049,Generic Web Service Security,V13.1.1,Verify that all application components use the same encodings and parsers to avoid parsing attacks that exploit different URI or file parsing behavior that could be used in SSRF and RFI attacks.
1,0.75893,Malicious Software Architecture,V1.10.1,"Verify that a source code control system is in use, with procedures to ensure that check-ins are accompanied by issues or change tickets. The source code control system should have access control and identifiable users to allow traceability of any changes."
2,0.747973,Log Processing,V7.2.1,"Verify that all authentication decisions are logged, without storing sensitive session tokens or passwords. This should include requests with relevant metadata needed for security investigations."
3,0.727034,File Storage,V12.4.2,Verify that files obtained from untrusted sources are scanned by antivirus scanners to prevent upload and serving of known malicious content.
4,0.725798,Unintended Security Disclosure,V14.3.3,Verify that the HTTP headers or any part of the HTTP response do not expose detailed version information of system components.
5,0.720998,Application Integrity,V10.3.1,"Verify that if the application has a client or server auto-update feature, updates should be obtained over secure channels and digitally signed. The update code must validate the digital signature of the update before installing or executing the update."
6,0.717221,GraphQL,V13.4.1,"Verify that a query allow list or a combination of depth limiting and amount limiting is used to prevent GraphQL or data layer expression Denial of Service (DoS) as a result of expensive, nested queries. For more advanced scenarios, query cost analysis should be used."
7,0.716798,Sensitive Private Data,V8.3.1,"Verify that sensitive data is sent to the server in the HTTP message body or headers, and that query string parameters from any HTTP verb do not contain sensitive data."
8,0.709394,Business Logic Security,V11.1.1,Verify that the application will only process business logic flows for the same user in sequential step order and without skipping steps.
9,0.709251,Sanitization and Sandboxing,V5.2.2,Verify that unstructured data is sanitized to enforce safety measures such as allowed characters and length.



#4: Adversaries may abuse inter-process communication (IPC) mechanisms for local code or command execution. IPC is typically used by processes to share data, communicate with each other, or synchronize execution. IPC is also commonly used to avoid situations such as deadlocks, which occurs when processes are stuck in a cyclic waiting pattern.


,nlp_similarity,category,Shortcode,Description
0,0.785494,Service Authentication,V2.10.1,"Verify that intra-service secrets do not rely on unchanging credentials such as passwords, API keys or shared accounts with privileged access."
1,0.783321,Log Processing,V7.2.1,"Verify that all authentication decisions are logged, without storing sensitive session tokens or passwords. This should include requests with relevant metadata needed for security investigations."
2,0.783042,Business Logic Security,V11.1.4,"Verify that the application has anti-automation controls to protect against excessive calls such as mass data exfiltration, business logic requests, file uploads or denial of service attacks."
3,0.767927,Application Integrity,V10.3.3,"Verify that the application has protection from subdomain takeovers if the application relies upon DNS entries or DNS subdomains, such as expired domain names, out of date DNS pointers or CNAMEs, expired projects at public source code repos, or transient cloud APIs, serverless functions, or storage buckets (*autogen-bucket-id*.cloud.example.com) or similar. Protections can include ensuring that DNS names used by applications are regularly checked for expiry or change."
4,0.767708,General Authenticator Security,V2.2.1,"Verify that anti-automation controls are effective at mitigating breached credential testing, brute force, and account lockout attacks. Such controls include blocking the most common breached passwords, soft lockouts, rate limiting, CAPTCHA, ever increasing delays between attempts, IP address restrictions, or risk-based restrictions such as location, first login on a device, recent attempts to unlock the account, or similar. Verify that no more than 100 failed attempts per hour is possible on a single account."
5,0.757278,One Time Verifier,V2.8.6,"Verify physical single-factor OTP generator can be revoked in case of theft or other loss. Ensure that revocation is immediately effective across logged in sessions, regardless of location."
6,0.747427,File Storage,V12.4.2,Verify that files obtained from untrusted sources are scanned by antivirus scanners to prevent upload and serving of known malicious content.
7,0.745349,Input and Output Architecture,V1.5.2,"Verify that serialization is not used when communicating with untrusted clients. If this is not possible, ensure that adequate integrity controls (and possibly encryption if sensitive data is sent) are enforced to prevent deserialization attacks including object injection."
8,0.741244,General Authenticator Security,V2.2.2,"Verify that the use of weak authenticators (such as SMS and email) is limited to secondary verification and transaction approval and not as a replacement for more secure authentication methods. Verify that stronger methods are offered before weak methods, users are aware of the risks, or that proper measures are in place to limit the risks of account compromise."
9,0.734195,Service Authentication,V2.10.3,"Verify that passwords are stored with sufficient protection to prevent offline recovery attacks, including local system access."



#5: Adversaries may manipulate accounts to maintain access to victim systems. Account manipulation may consist of any action that preserves adversary access to a compromised account, such as modifying credentials or permission groups. These actions could also include account activity designed to subvert security policies, such as performing iterative password updates to bypass password duration policies and preserve the life of compromised credentials.


,nlp_similarity,category,Shortcode,Description
0,0.842403,Log Processing,V7.2.1,"Verify that all authentication decisions are logged, without storing sensitive session tokens or passwords. This should include requests with relevant metadata needed for security investigations."
1,0.835857,Malicious Software Architecture,V1.10.1,"Verify that a source code control system is in use, with procedures to ensure that check-ins are accompanied by issues or change tickets. The source code control system should have access control and identifiable users to allow traceability of any changes."
2,0.810625,Generic Web Service Security,V13.1.1,Verify that all application components use the same encodings and parsers to avoid parsing attacks that exploit different URI or file parsing behavior that could be used in SSRF and RFI attacks.
3,0.790888,General Data Protection,V8.1.1,Verify the application protects sensitive data from being cached in server components such as load balancers and application caches.
4,0.767873,Business Logic Security,V11.1.5,"Verify the application has business logic limits or validation to protect against likely business risks or threats, identified using threat modeling or similar methodologies."
5,0.76665,Sanitization and Sandboxing,V5.2.2,Verify that unstructured data is sanitized to enforce safety measures such as allowed characters and length.
6,0.762875,Secure Software Development Lifecycle,V1.1.2,"Verify the use of threat modeling for every design change or sprint planning to identify threats, plan for countermeasures, facilitate appropriate risk responses, and guide security testing."
7,0.761014,Token-based Session Management,V3.5.1,Verify the application allows users to revoke OAuth tokens that form trust relationships with linked applications.
8,0.760668,Business Logic Security,V11.1.4,"Verify that the application has anti-automation controls to protect against excessive calls such as mass data exfiltration, business logic requests, file uploads or denial of service attacks."
9,0.756158,Service Authentication,V2.10.3,"Verify that passwords are stored with sufficient protection to prevent offline recovery attacks, including local system access."



#6: Adversaries may modify access tokens to operate under a different user or system security context to perform actions and bypass access controls. Windows uses access tokens to determine the ownership of a running process. A user can manipulate access tokens to make a running process appear as though it is the child of a different process or belongs to someone other than the user that started the process. When this occurs, the process also takes on the security context associated with the new token.


,nlp_similarity,category,Shortcode,Description
0,0.807031,General Authenticator Security,V2.2.7,Verify intent to authenticate by requiring the entry of an OTP token or user-initiated action such as a button press on a FIDO hardware key.
1,0.770819,SOAP Web Service,V13.3.1,"Verify that XSD schema validation takes place to ensure a properly formed XML document, followed by validation of each input field before any processing of that data takes place."
2,0.757531,Application Integrity,V10.3.1,"Verify that if the application has a client or server auto-update feature, updates should be obtained over secure channels and digitally signed. The update code must validate the digital signature of the update before installing or executing the update."
3,0.751899,Malicious Software Architecture,V1.10.1,"Verify that a source code control system is in use, with procedures to ensure that check-ins are accompanied by issues or change tickets. The source code control system should have access control and identifiable users to allow traceability of any changes."
4,0.75095,Unintended Security Disclosure,V14.3.3,Verify that the HTTP headers or any part of the HTTP response do not expose detailed version information of system components.
5,0.749123,Sanitization and Sandboxing,V5.2.3,Verify that the application sanitizes user input before passing to mail systems to protect against SMTP or IMAP injection.
6,0.736764,HTTP Security Headers,V14.4.6,Verify that a suitable Referrer-Policy header is included to avoid exposing sensitive information in the URL through the Referer header to untrusted parties.
7,0.736381,Configuration Architecture,V1.14.4,"Verify that the build pipeline contains a build step to automatically build and verify the secure deployment of the application, particularly if the application infrastructure is software defined, such as cloud environment build scripts."
8,0.730453,Business Logic Security,V11.1.1,Verify that the application will only process business logic flows for the same user in sequential step order and without skipping steps.
9,0.726068,RESTful Web Service,V13.2.6,Verify that the message headers and payload are trustworthy and not modified in transit. Requiring strong encryption for transport (TLS only) may be sufficient in many cases as it provides both confidentiality and integrity protection. Per-message digital signatures can provide additional assurance on top of the transport protections for high-security applications but bring with them additional complexity and risks to weigh against the benefits.



#7: Adversaries may directly access a volume to bypass file access controls and file system monitoring. Windows allows programs to have direct access to logical volumes. Programs with direct access may read and write files directly from the drive by analyzing file system data structures. This technique bypasses Windows file access controls as well as file system monitoring tools.


,nlp_similarity,category,Shortcode,Description
0,0.825527,Malicious Software Architecture,V1.10.1,"Verify that a source code control system is in use, with procedures to ensure that check-ins are accompanied by issues or change tickets. The source code control system should have access control and identifiable users to allow traceability of any changes."
1,0.75814,Malicious Code Search,V10.2.4,Verify that the application source code and third party libraries do not contain time bombs by searching for date and time related functions.
2,0.757196,Sanitization and Sandboxing,V5.2.3,Verify that the application sanitizes user input before passing to mail systems to protect against SMTP or IMAP injection.
3,0.756832,General Data Protection,V8.1.1,Verify the application protects sensitive data from being cached in server components such as load balancers and application caches.
4,0.748806,Generic Web Service Security,V13.1.1,Verify that all application components use the same encodings and parsers to avoid parsing attacks that exploit different URI or file parsing behavior that could be used in SSRF and RFI attacks.
5,0.741499,RESTful Web Service,V13.2.6,Verify that the message headers and payload are trustworthy and not modified in transit. Requiring strong encryption for transport (TLS only) may be sufficient in many cases as it provides both confidentiality and integrity protection. Per-message digital signatures can provide additional assurance on top of the transport protections for high-security applications but bring with them additional complexity and risks to weigh against the benefits.
6,0.740205,Secure Software Development Lifecycle,V1.1.2,"Verify the use of threat modeling for every design change or sprint planning to identify threats, plan for countermeasures, facilitate appropriate risk responses, and guide security testing."
7,0.729685,Malicious Code Search,V10.2.1,"Verify that the application source code and third party libraries do not contain unauthorized phone home or data collection capabilities. Where such functionality exists, obtain the user's permission for it to operate before collecting any data."
8,0.720847,Business Logic Security,V11.1.4,"Verify that the application has anti-automation controls to protect against excessive calls such as mass data exfiltration, business logic requests, file uploads or denial of service attacks."
9,0.716935,Token-based Session Management,V3.5.1,Verify the application allows users to revoke OAuth tokens that form trust relationships with linked applications.



#8: Adversaries may use brute force techniques to gain access to accounts when passwords are unknown or when password hashes are obtained. Without knowledge of the password for an account or set of accounts, an adversary may systematically guess the password using a repetitive or iterative mechanism. Brute forcing passwords can take place via interaction with a service that will check the validity of those credentials or offline against previously acquired credential data, such as password hashes.


,nlp_similarity,category,Shortcode,Description
0,0.818108,Log Processing,V7.2.1,"Verify that all authentication decisions are logged, without storing sensitive session tokens or passwords. This should include requests with relevant metadata needed for security investigations."
1,0.816247,Malicious Software Architecture,V1.10.1,"Verify that a source code control system is in use, with procedures to ensure that check-ins are accompanied by issues or change tickets. The source code control system should have access control and identifiable users to allow traceability of any changes."
2,0.811109,RESTful Web Service,V13.2.6,Verify that the message headers and payload are trustworthy and not modified in transit. Requiring strong encryption for transport (TLS only) may be sufficient in many cases as it provides both confidentiality and integrity protection. Per-message digital signatures can provide additional assurance on top of the transport protections for high-security applications but bring with them additional complexity and risks to weigh against the benefits.
3,0.805851,General Authenticator Security,V2.2.2,"Verify that the use of weak authenticators (such as SMS and email) is limited to secondary verification and transaction approval and not as a replacement for more secure authentication methods. Verify that stronger methods are offered before weak methods, users are aware of the risks, or that proper measures are in place to limit the risks of account compromise."
4,0.792773,Generic Web Service Security,V13.1.1,Verify that all application components use the same encodings and parsers to avoid parsing attacks that exploit different URI or file parsing behavior that could be used in SSRF and RFI attacks.
5,0.776009,Sensitive Private Data,V8.3.1,"Verify that sensitive data is sent to the server in the HTTP message body or headers, and that query string parameters from any HTTP verb do not contain sensitive data."
6,0.774353,RESTful Web Service,V13.2.1,"Verify that enabled RESTful HTTP methods are a valid choice for the user or action, such as preventing normal users using DELETE or PUT on protected API or resources."
7,0.766404,SOAP Web Service,V13.3.1,"Verify that XSD schema validation takes place to ensure a properly formed XML document, followed by validation of each input field before any processing of that data takes place."
8,0.764694,Unintended Security Disclosure,V14.3.3,Verify that the HTTP headers or any part of the HTTP response do not expose detailed version information of system components.
9,0.764157,Sensitive Private Data,V8.3.3,Verify that users are provided clear language regarding collection and use of supplied personal information and that users have provided opt-in consent for the use of that data before it is used in any way.
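
Several requirements recur near the top of the tables above regardless of the scenario. One way to quantify that, sketched here with the top three shortcodes copied by hand from the tables for scenarios #1 to #8, is to count how often each shortcode appears. The variable names are illustrative.

```python
from collections import Counter

# Top three shortcodes per scenario, copied from the tables for scenarios #1 to #8.
top3_by_scenario = {
    1: ['V7.2.1', 'V1.10.1', 'V13.1.1'],
    2: ['V7.2.1', 'V10.3.3', 'V10.3.2'],
    3: ['V13.1.1', 'V1.10.1', 'V7.2.1'],
    4: ['V2.10.1', 'V7.2.1', 'V11.1.4'],
    5: ['V7.2.1', 'V1.10.1', 'V13.1.1'],
    6: ['V2.2.7', 'V13.3.1', 'V10.3.1'],
    7: ['V1.10.1', 'V10.2.4', 'V5.2.3'],
    8: ['V7.2.1', 'V1.10.1', 'V13.2.6'],
}

# Count how many scenarios list each shortcode among their top three.
counts = Counter(code for codes in top3_by_scenario.values() for code in codes)

for shortcode, n in counts.most_common(3):
    print(f'{shortcode}: in the top three for {n} of {len(top3_by_scenario)} scenarios')
```

A requirement that ranks highly for most scenarios may reflect generic wording rather than a specific match, so such rows are worth reviewing manually.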



#9: Adversaries may attempt to get a listing of accounts on a system or within an environment. This information can help adversaries determine which accounts exist to aid in follow-on behavior.


Unnamed: 0,nlp_similarity,category,Shortcode,Description
0,0.801609,File Upload,V12.1.1,Verify that the application will not accept large files that could fill up storage or cause a denial of service.
1,0.791692,Malicious Software Architecture,V1.10.1,"Verify that a source code control system is in use, with procedures to ensure that check-ins are accompanied by issues or change tickets. The source code control system should have access control and identifiable users to allow traceability of any changes."
2,0.735983,Sensitive Private Data,V8.3.2,Verify that users have a method to remove or export their data on demand.
3,0.728566,Communications Architecture,V1.9.2,"Verify that application components verify the authenticity of each side in a communication link to prevent person-in-the-middle attacks. For example, application components should validate TLS certificates and chains."
4,0.724648,RESTful Web Service,V13.2.6,Verify that the message headers and payload are trustworthy and not modified in transit. Requiring strong encryption for transport (TLS only) may be sufficient in many cases as it provides both confidentiality and integrity protection. Per-message digital signatures can provide additional assurance on top of the transport protections for high-security applications but bring with them additional complexity and risks to weigh against the benefits.
5,0.722857,Build and Deploy,V14.1.5,Verify that authorized administrators can verify the integrity of all security-relevant configurations to detect tampering.
6,0.712498,Application Integrity,V10.3.1,"Verify that if the application has a client or server auto-update feature, updates should be obtained over secure channels and digitally signed. The update code must validate the digital signature of the update before installing or executing the update."
7,0.703471,Unintended Security Disclosure,V14.3.3,Verify that the HTTP headers or any part of the HTTP response do not expose detailed version information of system components.
8,0.69065,Secure Software Development Lifecycle,V1.1.2,"Verify the use of threat modeling for every design change or sprint planning to identify threats, plan for countermeasures, facilitate appropriate risk responses, and guide security testing."
9,0.677192,Log Processing,V7.2.1,"Verify that all authentication decisions are logged, without storing sensitive session tokens or passwords. This should include requests with relevant metadata needed for security investigations."



#10: Adversaries may use internal spearphishing to gain access to additional information or exploit other users within the same organization after they already have access to accounts or systems within the environment. Internal spearphishing is a multi-staged campaign where an email account is owned either by controlling the user's device with previously installed malware or by compromising the account credentials of the user. Adversaries attempt to take advantage of a trusted internal account to increase the likelihood of tricking the target into falling for the phish attempt.


Unnamed: 0,nlp_similarity,category,Shortcode,Description
0,0.824364,General Authenticator Security,V2.2.7,Verify intent to authenticate by requiring the entry of an OTP token or user-initiated action such as a button press on a FIDO hardware key.
1,0.810173,RESTful Web Service,V13.2.6,Verify that the message headers and payload are trustworthy and not modified in transit. Requiring strong encryption for transport (TLS only) may be sufficient in many cases as it provides both confidentiality and integrity protection. Per-message digital signatures can provide additional assurance on top of the transport protections for high-security applications but bring with them additional complexity and risks to weigh against the benefits.
2,0.792918,SOAP Web Service,V13.3.1,"Verify that XSD schema validation takes place to ensure a properly formed XML document, followed by validation of each input field before any processing of that data takes place."
3,0.792212,General Authenticator Security,V2.2.2,"Verify that the use of weak authenticators (such as SMS and email) is limited to secondary verification and transaction approval and not as a replacement for more secure authentication methods. Verify that stronger methods are offered before weak methods, users are aware of the risks, or that proper measures are in place to limit the risks of account compromise."
4,0.785892,Business Logic Security,V11.1.1,Verify that the application will only process business logic flows for the same user in sequential step order and without skipping steps.
5,0.775526,Application Integrity,V10.3.1,"Verify that if the application has a client or server auto-update feature, updates should be obtained over secure channels and digitally signed. The update code must validate the digital signature of the update before installing or executing the update."
6,0.773547,HTTP Security Headers,V14.4.6,Verify that a suitable Referrer-Policy header is included to avoid exposing sensitive information in the URL through the Referer header to untrusted parties.
7,0.765897,Sensitive Private Data,V8.3.3,Verify that users are provided clear language regarding collection and use of supplied personal information and that users have provided opt-in consent for the use of that data before it is used in any way.
8,0.762733,Malicious Software Architecture,V1.10.1,"Verify that a source code control system is in use, with procedures to ensure that check-ins are accompanied by issues or change tickets. The source code control system should have access control and identifiable users to allow traceability of any changes."
9,0.758051,File Storage,V12.4.2,Verify that files obtained from untrusted sources are scanned by antivirus scanners to prevent upload and serving of known malicious content.



#11: Adversaries may search local system sources, such as file systems and configuration files or local databases, to find files of interest and sensitive data prior to Exfiltration.


Unnamed: 0,nlp_similarity,category,Shortcode,Description
0,0.877704,Sanitization and Sandboxing,V5.2.6,"Verify that the application protects against SSRF attacks, by validating or sanitizing untrusted data or HTTP file metadata, such as filenames and URL input fields, and uses allow lists of protocols, domains, paths and ports."
1,0.817654,Application Integrity,V10.3.3,"Verify that the application has protection from subdomain takeovers if the application relies upon DNS entries or DNS subdomains, such as expired domain names, out of date DNS pointers or CNAMEs, expired projects at public source code repos, or transient cloud APIs, serverless functions, or storage buckets (*autogen-bucket-id*.cloud.example.com) or similar. Protections can include ensuring that DNS names used by applications are regularly checked for expiry or change."
2,0.811673,Application Integrity,V10.3.2,"Verify that the application employs integrity protections, such as code signing or subresource integrity. The application must not load or execute code from untrusted sources, such as loading includes, modules, plugins, code, or libraries from untrusted sources or the Internet."
3,0.796706,Malicious Code Search,V10.2.2,"Verify that the application does not ask for unnecessary or excessive permissions to privacy related features or sensors, such as contacts, cameras, microphones, or location."
4,0.792139,Business Logic Security,V11.1.5,"Verify the application has business logic limits or validation to protect against likely business risks or threats, identified using threat modeling or similar methodologies."
5,0.79012,General Data Protection,V8.1.3,"Verify the application minimizes the number of parameters in a request, such as hidden fields, Ajax variables, cookies and header values."
6,0.784318,Other Access Control Considerations,V4.3.3,"Verify the application has additional authorization (such as step up or adaptive authentication) for lower value systems, and / or segregation of duties for high value applications to enforce anti-fraud controls as per the risk of application and past fraud."
7,0.771198,Cryptographic Architecture,V1.6.2,Verify that consumers of cryptographic services protect key material and other secrets by using key vaults or API based alternatives.
8,0.763446,Token-based Session Management,V3.5.2,"Verify the application uses session tokens rather than static API secrets and keys, except with legacy implementations."
9,0.761467,Business Logic Security,V11.1.4,"Verify that the application has anti-automation controls to protect against excessive calls such as mass data exfiltration, business logic requests, file uploads or denial of service attacks."



#12: Adversaries may create multiple stages for command and control that are employed under different conditions or for certain functions. Use of multiple stages may obfuscate the command and control channel to make detection more difficult.


Unnamed: 0,nlp_similarity,category,Shortcode,Description
0,0.791528,Generic Web Service Security,V13.1.1,Verify that all application components use the same encodings and parsers to avoid parsing attacks that exploit different URI or file parsing behavior that could be used in SSRF and RFI attacks.
1,0.780302,RESTful Web Service,V13.2.6,Verify that the message headers and payload are trustworthy and not modified in transit. Requiring strong encryption for transport (TLS only) may be sufficient in many cases as it provides both confidentiality and integrity protection. Per-message digital signatures can provide additional assurance on top of the transport protections for high-security applications but bring with them additional complexity and risks to weigh against the benefits.
2,0.777876,General Authenticator Security,V2.2.2,"Verify that the use of weak authenticators (such as SMS and email) is limited to secondary verification and transaction approval and not as a replacement for more secure authentication methods. Verify that stronger methods are offered before weak methods, users are aware of the risks, or that proper measures are in place to limit the risks of account compromise."
3,0.776007,Malicious Software Architecture,V1.10.1,"Verify that a source code control system is in use, with procedures to ensure that check-ins are accompanied by issues or change tickets. The source code control system should have access control and identifiable users to allow traceability of any changes."
4,0.754401,Business Logic Security,V11.1.1,Verify that the application will only process business logic flows for the same user in sequential step order and without skipping steps.
5,0.748945,GraphQL,V13.4.1,"Verify that a query allow list or a combination of depth limiting and amount limiting is used to prevent GraphQL or data layer expression Denial of Service (DoS) as a result of expensive, nested queries. For more advanced scenarios, query cost analysis should be used."
6,0.740163,Business Logic Security,V11.1.3,Verify the application has appropriate limits for specific business actions or transactions which are correctly enforced on a per user basis.
7,0.733491,Other Access Control Considerations,V4.3.3,"Verify the application has additional authorization (such as step up or adaptive authentication) for lower value systems, and / or segregation of duties for high value applications to enforce anti-fraud controls as per the risk of application and past fraud."
8,0.726283,Log Processing,V7.2.1,"Verify that all authentication decisions are logged, without storing sensitive session tokens or passwords. This should include requests with relevant metadata needed for security investigations."
9,0.726216,Business Logic Security,V11.1.5,"Verify the application has business logic limits or validation to protect against likely business risks or threats, identified using threat modeling or similar methodologies."



#13: An adversary may exfiltrate data in fixed size chunks instead of whole files or limit packet sizes below certain thresholds. This approach may be used to avoid triggering network data transfer threshold alerts.


Unnamed: 0,nlp_similarity,category,Shortcode,Description
0,0.784739,Malicious Software Architecture,V1.10.1,"Verify that a source code control system is in use, with procedures to ensure that check-ins are accompanied by issues or change tickets. The source code control system should have access control and identifiable users to allow traceability of any changes."
1,0.784721,Log Processing,V7.2.1,"Verify that all authentication decisions are logged, without storing sensitive session tokens or passwords. This should include requests with relevant metadata needed for security investigations."
2,0.759625,Sensitive Private Data,V8.3.1,"Verify that sensitive data is sent to the server in the HTTP message body or headers, and that query string parameters from any HTTP verb do not contain sensitive data."
3,0.75873,Generic Web Service Security,V13.1.1,Verify that all application components use the same encodings and parsers to avoid parsing attacks that exploit different URI or file parsing behavior that could be used in SSRF and RFI attacks.
4,0.734436,HTTP Security Headers,V14.4.6,Verify that a suitable Referrer-Policy header is included to avoid exposing sensitive information in the URL through the Referer header to untrusted parties.
5,0.727654,RESTful Web Service,V13.2.6,Verify that the message headers and payload are trustworthy and not modified in transit. Requiring strong encryption for transport (TLS only) may be sufficient in many cases as it provides both confidentiality and integrity protection. Per-message digital signatures can provide additional assurance on top of the transport protections for high-security applications but bring with them additional complexity and risks to weigh against the benefits.
6,0.723775,Fundamental Session Management Security,V3.1.1,Verify the application never reveals session tokens in URL parameters.
7,0.718993,GraphQL,V13.4.2,Verify that GraphQL or other data layer authorization logic should be implemented at the business logic layer instead of the GraphQL layer.
8,0.718323,Business Logic Security,V11.1.1,Verify that the application will only process business logic flows for the same user in sequential step order and without skipping steps.
9,0.718264,File Storage,V12.4.2,Verify that files obtained from untrusted sources are scanned by antivirus scanners to prevent upload and serving of known malicious content.



#14: Adversaries may perform Endpoint Denial of Service (DoS) attacks to degrade or block the availability of services to users. Endpoint DoS can be performed by exhausting the system resources those services are hosted on or exploiting the system to cause a persistent crash condition. Example services include websites, email services, DNS, and web-based applications. Adversaries have been observed conducting DoS attacks for political purposes and to support other malicious activities, including distraction, hacktivism, and extortion.


Unnamed: 0,nlp_similarity,category,Shortcode,Description
0,0.819715,Log Processing,V7.2.1,"Verify that all authentication decisions are logged, without storing sensitive session tokens or passwords. This should include requests with relevant metadata needed for security investigations."
1,0.810251,Application Integrity,V10.3.3,"Verify that the application has protection from subdomain takeovers if the application relies upon DNS entries or DNS subdomains, such as expired domain names, out of date DNS pointers or CNAMEs, expired projects at public source code repos, or transient cloud APIs, serverless functions, or storage buckets (*autogen-bucket-id*.cloud.example.com) or similar. Protections can include ensuring that DNS names used by applications are regularly checked for expiry or change."
2,0.796998,Service Authentication,V2.10.4,"Verify passwords, integrations with databases and third-party systems, seeds and internal secrets, and API keys are managed securely and not included in the source code or stored within source code repositories. Such storage SHOULD resist offline attacks. The use of a secure software key store (L1), hardware TPM, or an HSM (L3) is recommended for password storage."
3,0.772408,GraphQL,V13.4.1,"Verify that a query allow list or a combination of depth limiting and amount limiting is used to prevent GraphQL or data layer expression Denial of Service (DoS) as a result of expensive, nested queries. For more advanced scenarios, query cost analysis should be used."
4,0.771291,Malicious Software Architecture,V1.10.1,"Verify that a source code control system is in use, with procedures to ensure that check-ins are accompanied by issues or change tickets. The source code control system should have access control and identifiable users to allow traceability of any changes."
5,0.756806,File Execution,V12.3.4,"Verify that the application protects against Reflective File Download (RFD) by validating or ignoring user-submitted filenames in a JSON, JSONP, or URL parameter, the response Content-Type header should be set to text/plain, and the Content-Disposition header should have a fixed filename."
6,0.751179,General Data Protection,V8.1.1,Verify the application protects sensitive data from being cached in server components such as load balancers and application caches.
7,0.750643,Unintended Security Disclosure,V14.3.2,"Verify that web or application server and application framework debug modes are disabled in production to eliminate debug features, developer consoles, and unintended security disclosures."
8,0.748784,Build and Deploy,V14.1.2,"Verify that compiler flags are configured to enable all available buffer overflow protections and warnings, including stack randomization, data execution prevention, and to break the build if an unsafe pointer, memory, format string, integer, or string operations are found."
9,0.747571,Business Logic Security,V11.1.5,"Verify the application has business logic limits or validation to protect against likely business risks or threats, identified using threat modeling or similar methodologies."
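
The top-10 tables above rank requirements by the nlp_similarity score for each abuse scenario. The scoring cell is not shown here, so the following is only a minimal sketch of that kind of ranking, assuming a cosine-style measure such as the one spaCy's Doc.similarity uses by default. The function names (cosine, top_k), the REQ-A/B/C shortcodes, and the 3-dimensional toy vectors are all hypothetical stand-ins for real document vectors.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec, candidates, k=10):
    """Return the k (score, shortcode) pairs most similar to query_vec, best first."""
    scored = [(cosine(query_vec, vec), code) for code, vec in candidates.items()]
    return sorted(scored, reverse=True)[:k]


# Hypothetical toy data: in the notebook these would be spaCy document vectors
# for each ASVS requirement and for one abuse scenario.
toy_requirements = {
    "REQ-A": [1.0, 0.0, 0.0],
    "REQ-B": [0.6, 0.8, 0.0],
    "REQ-C": [0.0, 0.0, 1.0],
}
toy_scenario = [1.0, 0.0, 0.0]

ranking = top_k(toy_scenario, toy_requirements, k=2)
for score, code in ranking:
    # Same column order as the tables above: similarity first, then shortcode.
    print(f"{score:.6f},{code}")
```

spaCy's small pipelines such as en_core_web_sm do not include static word vectors, so rankings like the ones above may be noisier than those from a pipeline that ships vectors, such as en_core_web_md.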
