## Executive Order Experiment
Here we start with the rag_experiment database, add Executive Order 14028, and ask more questions.

See rag_experiment.ipynb for more details.


In [1]:
import sys
print(sys.version)

3.10.11 (main, Apr  5 2023, 14:15:30) [GCC 7.5.0]


Log in to OpenAI with OPENAI_API_KEY

In [2]:
import os
import openai

# get API key from top-right dropdown on OpenAI website
openai.api_key = os.getenv("OPENAI_API_KEY") or "OPENAI_API_KEY"

# openai.Engine.list()  # check we have authenticated
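
The `os.getenv(...) or "OPENAI_API_KEY"` pattern above falls back to a placeholder string when the variable is unset, so a missing key only surfaces later as an authentication error. A minimal sketch of a stricter alternative follows; `require_env` is a hypothetical helper name, not part of the notebook or of any library.

```python
import os


def require_env(name):
    """Return the value of environment variable `name`, or raise if it is unset or empty."""
    value = os.getenv(name)
    if not value:
        raise RuntimeError(f"Environment variable {name} is not set")
    return value


# Hypothetical usage in the cell above:
# openai.api_key = require_env("OPENAI_API_KEY")
```

The same helper could be used for PINECONE_API_KEY and PINECONE_ENVIRONMENT.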

Connect to Pinecone with PINECONE_API_KEY and PINECONE_ENVIRONMENT

We assume the index already exists.

In [5]:
import pinecone
from tqdm import tqdm

api_key = os.getenv("PINECONE_API_KEY") or "PINECONE_API_KEY"
# find your environment next to the api key in pinecone console
env = os.getenv("PINECONE_ENVIRONMENT") or "PINECONE_ENVIRONMENT"

pinecone.init(api_key=api_key, environment=env)
# pinecone.whoami()

### Create a sample embedding so that we know the embedding length

In [6]:
embed_model = "text-embedding-ada-002"

res = openai.Embedding.create(
    input=["This is sample test that will determine the length"],
    engine=embed_model
)

embedding_length = len(res['data'][0]['embedding'])
embedding_length

1536

In [7]:
index_name = 'openai-nist-sp'

# Create the index if it doesn't exist already
if index_name not in pinecone.list_indexes():
    # if does not exist, create index
    pinecone.create_index(
        index_name,
        dimension=embedding_length,
        metric='cosine',
        metadata_config={'indexed': ['document']}
    )
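
The index is created with `metric='cosine'`, so matches are ranked by the cosine of the angle between the query vector and each stored vector. As a rough illustration of that measure (plain Python, not Pinecone's implementation; `cosine_similarity` is a name chosen here), a sketch:

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy 2-dimensional vectors; real embeddings here have 1536 dimensions
print(cosine_similarity([1.0, 0.0], [1.0, 0.0]))  # same direction
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))  # orthogonal
```

Vectors pointing in the same direction score near 1 regardless of their length, which is why cosine is a common choice for text embeddings.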

In [8]:
# connect to index
index = pinecone.Index(index_name)
# view index stats 
index_stats = index.describe_index_stats()
index_stats

{'dimension': 1536,
 'index_fullness': 0.0,
 'namespaces': {'': {'vector_count': 7}},
 'total_vector_count': 7}

In [9]:
# Read Executive Order 14028 and add it to the vector database if it is empty

entry_word_max = 1500
entries=[] # Collect our entries

if (index_stats.total_vector_count == 0):
    with open('EO14028.txt', 'r') as file:
        book = file.read()

    # Split the text on 'SECTION' and collect the pieces into blocks of at most entry_word_max words
    entry = ""
    entry_word_count = 0
    for line in book.split('SECTION'):
        line_word_count = len(line.split())
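        # Flush the current entry when it is non-empty and adding this piece would exceed the limit
        if entry_word_count > 0 and (line_word_count + entry_word_count) > entry_word_max: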
            entries.append(entry)
            entry = ""
            entry_word_count = 0
        entry += line + " "
        entry_word_count += line_word_count

    # record last entry if not empty
    if (entry_word_count > 0):
        entries.append(entry)

len(entries)


0
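
The chunking loop above can be factored into a standalone function that is easy to test without a file or an index. The following is a minimal sketch under the same assumptions (split on the literal `SECTION`, cap each entry by word count); `chunk_by_section` and its parameter names are hypothetical, not part of the notebook. Unlike the loop above, it skips empty pieces and strips whitespace around each piece.

```python
def chunk_by_section(text, delimiter="SECTION", max_words=1500):
    """Group delimiter-separated pieces into entries of at most max_words words.

    A single piece longer than max_words is kept whole as its own entry.
    The delimiter itself is dropped, as in str.split.
    """
    entries = []
    current_parts = []
    current_words = 0
    for piece in text.split(delimiter):
        piece_words = len(piece.split())
        if piece_words == 0:
            continue  # skip empty pieces, e.g. the text before the first delimiter
        # Flush the current entry if adding this piece would exceed the limit
        if current_parts and current_words + piece_words > max_words:
            entries.append(" ".join(current_parts))
            current_parts, current_words = [], 0
        current_parts.append(piece.strip())
        current_words += piece_words
    if current_parts:
        entries.append(" ".join(current_parts))
    return entries


sample = "SECTION one two three SECTION four five SECTION six"
print(chunk_by_section(sample, max_words=4))
# → ['one two three', 'four five six']
```

With `max_words=4`, the first piece fills one entry and the next two pieces share the second.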

In [10]:
# Now calculate the embeddings

if (len(entries) > 0):
    embeddings = openai.Embedding.create(input=entries, engine=embed_model)

In [11]:
# Now zip up the embeddings and insert!

if (len(entries) > 0):
    # Object is of the form (id, vector, meta_data)
    to_upsert = [('EO14028-'+str(i), embeddings['data'][i]['embedding'], {'document':'Executive Order 14028', 'text':entries[i]})
                 for i in range(len(entries))]
    index.upsert(vectors=to_upsert)
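
The cell above sends every vector in a single `upsert` call, which is fine for a handful of entries. For larger documents it is common to upsert in batches. A sketch of one way to build those batches follows; `build_upsert_batches`, the batch size of 100, and the `EO14028-` ID prefix are assumptions made for illustration, and the demo vectors are made-up two-element lists rather than real embeddings.

```python
def build_upsert_batches(entries, vectors, doc_name, id_prefix, batch_size=100):
    """Yield lists of (id, vector, metadata) tuples, batch_size at a time."""
    if len(entries) != len(vectors):
        raise ValueError("entries and vectors must have the same length")
    records = [
        (f"{id_prefix}{i}", vector, {"document": doc_name, "text": text})
        for i, (text, vector) in enumerate(zip(entries, vectors))
    ]
    for start in range(0, len(records), batch_size):
        yield records[start:start + batch_size]


# Made-up demo data; real vectors would come from the embedding response
demo_entries = ["first chunk", "second chunk", "third chunk"]
demo_vectors = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]
batches = list(build_upsert_batches(
    demo_entries, demo_vectors, "Executive Order 14028", "EO14028-", batch_size=2))
print([len(batch) for batch in batches])
# → [2, 1]
```

Against a real index, each batch would be passed to `index.upsert(vectors=batch)` in turn.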
    

In [12]:
# Define a convenience function to conduct searches
def search_with_rag(openai, index, query):
    # Create an embedding of the query
    qe = openai.Embedding.create(input=[query], engine=embed_model)
    # Search our index
    res = index.query(qe['data'][0]['embedding'], top_k=2, include_metadata=True)
    # Construct the chat question
    context_1 = (
        "This is initial context for a question below\n\n" +
        "Context: \n" +
        res['matches'][0]['metadata']['text'] + "\n\n"
    )
    context_2 = ( 
        "This is additional context for a question below\n\n" +
        "Context: \n" +
        res['matches'][1]['metadata']['text'] + "\n\n"
    )
    question = (
        "Answer the question based upon the previous context. If the answer is not clear from the source, " +
        "state 'I cannot answer based upon the context.'.\n\n" +
        "Question: " + query
    )
    # Ask GPT 3.5 WITH context
    res = openai.ChatCompletion.create(
        model='gpt-3.5-turbo',
        messages=[
            {"role": "user", "content" : context_1},
            {"role": "user", "content" : context_2},
            {"role": "user", "content" : question}
        ]
    )

    return res['choices'][0]['message']['content']
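
`search_with_rag` indexes `res['matches'][0]` and `res['matches'][1]` directly, so it would raise an `IndexError` if the index returned fewer than two matches. A sketch of building the same message list from however many context texts are available follows; `build_rag_messages` is a hypothetical helper, and the prompt wording is copied from the function above.

```python
def build_rag_messages(query, context_texts):
    """Build chat messages: one per retrieved context, then the question."""
    messages = []
    for i, text in enumerate(context_texts):
        label = "initial" if i == 0 else "additional"
        messages.append({
            "role": "user",
            "content": f"This is {label} context for a question below\n\nContext: \n{text}\n\n",
        })
    messages.append({
        "role": "user",
        "content": (
            "Answer the question based upon the previous context. "
            "If the answer is not clear from the source, "
            "state 'I cannot answer based upon the context.'.\n\n"
            f"Question: {query}"
        ),
    })
    return messages


demo_messages = build_rag_messages("Who signed the order?", ["Only one chunk was retrieved."])
print(len(demo_messages))
# → 2
```

Inside `search_with_rag`, the context texts could be collected with `[m['metadata']['text'] for m in res['matches']]` and the result passed as `messages`.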

In [27]:
search_with_rag(openai, index, "What is the purpose of Executive Order 14028?")

"The purpose of Executive Order 14028 is to improve the nation's cybersecurity and to protect Federal Government computer systems, including systems that process data and those that run vital machinery, from malicious cyber actors. The order aims to make bold changes, significant investments, and bring the full scope of Federal Government authorities and resources to protect and secure its computer systems. It is also the policy of the Administration that the prevention, detection, assessment, and remediation of cyber incidents is a top priority and essential to national and economic security."

In [28]:
search_with_rag(openai, index, "When was Executive Order 14028 released?")

'Executive Order 14028 was released on May 12, 2021.'

In [29]:
search_with_rag(openai, index, "Where can I get an official copy of the Executive Order 14028?")

'You can get an official copy of Executive Order 14028 from the official website of the White House.'

In [30]:
search_with_rag(openai, index, "When does Executive Order 14028 expire?")

'The context does not provide information about when Executive Order 14028 expires.'

In [13]:
search_with_rag(openai, index, "Who does Executive Order 14028 apply to?")

'Executive Order 14028 applies to the Federal Government and Federal Information Systems, including those operated by Federal Civilian Executive Branch Agencies and excludes National Security Systems. It also aims to encourage partnerships between the Federal Government and the private sector to improve cybersecurity.'

In [14]:
search_with_rag(openai, index, "Who is the author of Executive Order 14028?")

'The author of Executive Order 14028 is Joseph R. Biden Jr. It is mentioned at the end of the document.'

In [34]:
search_with_rag(openai, index, "What is required by Executive Order 14028?")

'Executive Order 14028 requires the Federal Government to improve its efforts to identify, deter, protect against, detect, and respond to cybersecurity threats. It also requires the Federal Government to partner with the private sector in order to foster a more secure cyber space. Additionally, the order requires Federal Information Systems to meet or exceed the standards and requirements for cybersecurity set forth in the order, and for agencies to establish requirements for logging, log retention, and log management. The order also mandates the implementation of an Endpoint Detection and Response (EDR) initiative to support proactive detection of cybersecurity incidents within Federal Government infrastructure. Finally, the order requires the adoption of National Security Systems requirements that are equivalent to or exceed the cybersecurity requirements set forth in this order.'

In [45]:
print(search_with_rag(openai, index, "What deadlines or milestones are important in Executive Order 14028?"))

There are several deadlines and milestones mentioned in Executive Order 14028, including:

- Within 30 days of the order, the Director of NIST shall solicit input from various actors to identify existing or develop new standards, tools, and best practices for software security.
- Within 180 days of the order, the Director of NIST shall publish preliminary guidelines for enhancing software supply chain security.
- Within 360 days of the order, the Director of NIST shall publish additional guidelines that include procedures for periodic review and updating of the preliminary guidelines.
- Within 90 days of publication of the preliminary guidelines, the Secretary of Commerce, acting through the Director of NIST, shall issue guidance identifying practices that enhance the security of the software supply chain.
- Within 60 days of the order, the Secretary of Commerce shall publish minimum elements for a Software Bill of Materials (SBOM).
- Within 45 days of the order, the Secretary of Comme

In [37]:
search_with_rag(openai, index, "Who should I talk to for more information?")

'The context does not provide information on who to contact for more information.'

In [38]:
search_with_rag(openai, index, "What are OMB's responsibilities?")

"OMB's responsibilities include reviewing the Federal Acquisition Regulation (FAR) and the Defense Federal Acquisition Regulation Supplement contract requirements and language for contracting with IT and OT service providers, recommending updates to such requirements and language to the FAR Council and other appropriate agencies, formulating policies for agencies to establish requirements for logging, log retention, and log management, ensuring that agencies have adequate resources to comply with the requirements identified in the section on Improving the Federal Government's Investigative and Remediation Capabilities, and working with the Secretary of Homeland Security and agency heads to ensure that agencies have adequate resources to comply with the requirements issued pursuant to the section on Improving Detection of Cybersecurity Vulnerabilities and Incidents on Federal Government Networks."

In [39]:
search_with_rag(openai, index, "Has there been an update to Executive Order 14028?")

'There is no information in the provided context that suggests whether there has been an update to Executive Order 14028.'

In [44]:
print(search_with_rag(openai, index, 
                      "Can you create a table of contents for Executive Order 14028 and summarize the main sections?"))

Yes, I can. 

Table of Contents for Executive Order 14028:

Section 1 - Policy
Section 2 - Removing Barriers to Sharing Threat Information Between Government and the Private Sector
Section 3 - Modernizing Federal Government Cybersecurity
Section 4 - Enhancing Software Supply Chain Security
Section 5 - Establishing a Cybersecurity Safety Review Board
Section 6 - Standardizing the Federal Government’s Playbook for Responding to Cybersecurity Vulnerabilities and Incidents
Section 7 - Improving Detection of Cybersecurity Vulnerabilities and Incidents on Federal Government Networks
Section 8 - Improving the Federal Government’s Investigative and Remediation Capabilities
Section 9 - National Security Systems

Main Sections:

- Policy statement on the need for bold changes and significant investments in cybersecurity to defend vital institutions
- Removing barriers to sharing threat information between government and private sector
- Modernizing federal government cybersecurity efforts
- Enha

In [46]:
print(search_with_rag(openai, index, "What are the most important points of Executive Order 14028 overall?"))

The most important points of Executive Order 14028 are the need for the Federal Government to improve its efforts to identify, deter, protect against, detect, and respond to cyber threats, the importance of private sector partnership to adapt to changing threats, the need for bold changes and significant investments to defend vital institutions, and the policy of improving detection, prevention, assessment, and remediation of cyber incidents on federal government networks. The order also emphasizes the importance of logging and retaining data, the need for adequate resources, and the adoption of National Security Systems requirements that exceed cybersecurity requirements set forth in the order.


In [47]:
print(search_with_rag(openai, index, "What are SBOMs and what are their requirements in Executive Order 14028?"))

The concept of Software Bill of Materials (SBOMs) is not explicitly mentioned in Executive Order 14028, and no specific requirements for SBOMs are given. However, the order emphasizes the importance of improving the detection of cybersecurity vulnerabilities and incidents on Federal Government networks, and enhancing the Federal Government's investigative and remediation capabilities. These goals may necessitate the collection and sharing of detailed information regarding the software and systems in use, which could include some aspects of an SBOM.
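
This answer conflicts with the deadlines answer earlier, which cites a requirement to publish minimum elements for an SBOM. One plausible cause, which this notebook does not confirm, is that the two chunks retrieved for this query (`top_k=2`) did not contain the relevant section. A sketch of a helper for inspecting what was retrieved follows; `summarize_matches` is a hypothetical name, and the matches below are made-up stand-ins shaped like the `id`, `score`, and `metadata` fields of a Pinecone query result.

```python
def summarize_matches(matches, keyword, preview_chars=80):
    """Report id, score, and whether each retrieved chunk mentions `keyword`."""
    rows = []
    for match in matches:
        text = match["metadata"]["text"]
        rows.append({
            "id": match["id"],
            "score": match["score"],
            "has_keyword": keyword.lower() in text.lower(),
            "preview": text[:preview_chars],
        })
    return rows


# Made-up matches for illustration only
fake_matches = [
    {"id": "chunk-0", "score": 0.83, "metadata": {"text": "Policy statement about cybersecurity."}},
    {"id": "chunk-1", "score": 0.81, "metadata": {"text": "Minimum elements for an SBOM are published."}},
]
for row in summarize_matches(fake_matches, "SBOM"):
    print(row["id"], row["score"], row["has_keyword"])
```

If no retrieved chunk mentions the keyword, raising `top_k` or using smaller chunks are two things to try.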


In [48]:
print(search_with_rag(openai, index, "What does Executive Order 14028 say about Cybersecurity?"))

Executive Order 14028 is titled "Improving the Nation’s Cybersecurity," and it states that the Federal Government must improve cybersecurity efforts to identify, deter, protect against, detect, and respond to malicious cyber actors. The order emphasizes that protecting the nation from malicious cyber actors requires the Federal Government to partner with the private sector and make significant investments to defend vital institutions. Incremental improvements will not give us the security we need. The order also directs federal agencies to deploy an Endpoint Detection and Response (EDR) initiative for proactive detection of cybersecurity incidents utilizing host-level visibility, attribution, and responses. Furthermore, it tasks Federal Information Systems with meeting or exceeding the standards and requirements for cybersecurity issued pursuant to this order.


In [51]:
print(search_with_rag(openai, index, "Do you see any contradictions or conflicts in the text of the document? What are they?"))

There are no obvious contradictions or conflicts in the text of the document.


In [53]:
print(search_with_rag(openai, index, "What are the vaguest parts in the text of this document?"))

As an AI language model, I cannot determine subjective elements such as what may be considered as "vague" by a reader. However, the document does contain technical terms and acronyms that may not be familiar to all readers, which could potentially make those parts difficult to understand without further explanation.


In [11]:
# When we are done, delete the index
# pinecone.delete_index(index_name)