In [4]:
import os
from langchain.llms import OpenAI
from langchain.chat_models import ChatOpenAI

from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores import Chroma
from langchain.document_loaders import TextLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.chains import RetrievalQA

In [5]:
openai_api_key = os.environ['OPENAI_API_KEY']

In [16]:
# Source document: Markdown (with YAML front matter) of the draft CISA self-attestation form
raw_text = """
---
title: "CISA Secure Software Development Attestation Form (Draft)"
linktitle: "Self-Attestation Form"
type: "article"
date: 2023-05-10T15:21:01+02:00
lastmod: 2023-05-10T15:21:01+02:00
draft: false
tags: ["Reference"]
images: []
menu:
  docs:
    parent: "software-security"
weight: 10
toc: true
---

## Attestation and Signature
On behalf of the above-specified company, I attest that [software producer] presently makes consistent use of the following practices, drawn from the secure software development
framework (SSDF), in developing the software identified in Section I:

1. The software is developed and built in secure environments. Those environments are secured by the following actions, at a minimum:
    1. Separating and protecting each environment involved in developing and building Software;
    1. Regularly logging, monitoring, and auditing trust relationships used for authorization and access:
        1. to any software development and build environments; and
        1. among components within each environment;
    1. Enforcing multi-factor authentication and conditional access across the environments relevant to developing and building software in a manner that minimized security risk;
    1. Taking consistent and reasonable steps to document as well as minimize use or inclusion of software products that create undue risk within the environments used to develop and build software;
    1. Encrypting sensitive data, such as credentials, to the extent practicable and based on risk;
    1. Implementing defensive cyber security practices, including continuous monitoring of operations and alerts and, as necessary, responding to suspected and confirmed cyber incidents;
1. The software producer has made a good-faith effort to maintain trusted source code supply chains by:
    1. Employing automated tools or comparable processes; and 
    1. Establishing a process that includes reasonable steps to address the security of third-party components and manage related vulnerabilities;
1. The software producer employs automated tools or comparable processes in a good-faith effort to maintain trusted source code supply chains;
1. The software producer maintains provenance data for internal and third-party code incorporated into the software;
1. The software producer employs automated tools or comparable processes that check for security vulnerabilities. In addition:
    1. The software producer ensures these processes operate on an ongoing basis and, at a minimum, prior to product, version, or update releases; and
    1. The software producer has a policy or process to address discovered security vulnerabilities prior to product release; and
    1. The software producer operates a vulnerability disclosure program and accepts, reviews, and addresses disclosed software vulnerabilities in a timely fashion.
    
I attest that all requirements outlined above are consistently maintained and satisfied.
I further attest the company will notify all impacted agencies if conformance to any element of this attestation is no longer valid. 

Please check the appropriate boxes below, if applicable:
* [ ] There are addendums and/or artifacts attached to this self-attestation form, the title and contents of which are delineated below the signature line.
* [ ] I attest that the referenced software has been verified by a certified FedRAMP Third Party Assessor Organization (3PAO) or other 3PAO approved by an appropriate agency official, and the Assessor used relevant NIST Guidance, which includes all elements outlined in this form, as the assessment baseline. Relevant documentation is attached.

## References

The [Draft of the Secure Software Development Self Attestation Form](https://www.cisa.gov/secure-software-attestation-form) available on cisa.gov, was released as part of a [Request For Comments](https://www.cisa.gov/secure-software-attestation-form) on April 27, 2023. Comments are due on June 26, 2023.

_Reprinted courtesy of the National Institute of Standards and Technology, U.S. Department of Commerce. Not copyrightable in the United States._
"""

In [17]:
llm = OpenAI(openai_api_key=openai_api_key, temperature=0)

In [18]:
# Split the raw text into chunks of at most 1,000 characters with no overlap
text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=0)
docs = text_splitter.create_documents([raw_text])

# Embed the full text once, only to inspect the size and shape of an embedding vector
embeddings = OpenAIEmbeddings(openai_api_key=openai_api_key)
text_embedding = embeddings.embed_query(raw_text)
print(f"Your embedding is length {len(text_embedding)}")
print(f"Here's a sample: {text_embedding[:5]}...")
print(f"You have {len(docs)} documents")
print("Preview:")
print(docs[0].page_content, "\n")
print(docs[1].page_content)

Your embedding is length 1536
Here's a sample: [0.01485704630613327, -0.004188054706901312, -0.006927063222974539, -0.01942206174135208, -0.029299091547727585]...
You have 6 documents
Preview:
---
title: "CISA Secure Software Development Attestation Form (Draft)"
linktitle: "Self-Attestation Form"
type: "article"
date: 2023-05-10T15:21:01+02:00
lastmod: 2023-05-10T15:21:01+02:00
draft: false
tags: ["Reference"]
images: []
menu:
  docs:
    parent: "software-security"
weight: 10
toc: true
---

## Attestation and Signature
On behalf of the above-specified company, I attest that [software producer] presently makes consistent use of the following practices, drawn from the secure software development
framework (SSDF), in developing the software identified in Section I: 

1. The software is developed and built in secure environments. Those environments are secured by the following actions, at a minimum:
    1. Separating and protecting each environment involved in developing and building Sof
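
The effect of `chunk_size` and `chunk_overlap` can be illustrated with a minimal fixed-window splitter. This is only a sketch: `split_fixed` is a hypothetical helper, not part of LangChain, and `RecursiveCharacterTextSplitter` itself prefers to break at separators such as paragraph and line boundaries rather than at fixed offsets, so its chunks will usually differ from these.

```python
from typing import List


def split_fixed(text: str, chunk_size: int = 1000, chunk_overlap: int = 0) -> List[str]:
    """Cut text into windows of at most chunk_size characters.

    Hypothetical helper for illustration only; it ignores separators entirely.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")
    # Each window starts (chunk_size - chunk_overlap) characters after the previous one.
    step = chunk_size - chunk_overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]


sample = "abcdefghij" * 25  # 250 characters of made-up text
chunks = split_fixed(sample, chunk_size=100, chunk_overlap=0)
print([len(c) for c in chunks])
```

With `chunk_overlap=0`, as in the cell above, joining the chunks reproduces the input exactly; a positive overlap repeats the tail of each chunk at the start of the next.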

In [19]:
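# Directory where Chroma keeps its index files (the name 'duckdb' is just a folder name here)
persist_directory = 'duckdb'
# Embed every chunk and index the resulting vectors for similarity search
docsearch = Chroma.from_documents(docs, embeddings, persist_directory=persist_directory)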
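
Conceptually, a vector store answers a query by comparing the query's embedding with the stored chunk embeddings and returning the closest ones. The sketch below shows that idea with brute-force cosine similarity on toy 2-dimensional vectors. The helper names and the default `k=4` are assumptions made for this illustration; the metric and index structure Chroma actually uses may differ.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors (0.0 if either has zero length)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec: Sequence[float],
          doc_vecs: Sequence[Sequence[float]],
          k: int = 4) -> List[Tuple[int, float]]:
    """Return (index, score) pairs for the k most similar vectors, best first."""
    scored = [(i, cosine_similarity(query_vec, v)) for i, v in enumerate(doc_vecs)]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]


# Toy stand-ins for chunk embeddings; real ones here have 1536 dimensions.
toy_docs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
results = top_k([1.0, 0.1], toy_docs, k=2)
print(results)
```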

In [20]:
qa = RetrievalQA.from_chain_type(llm=llm, chain_type="stuff", retriever=docsearch.as_retriever())
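
The `"stuff"` chain type places all retrieved chunks into a single prompt and sends that prompt to the LLM in one call. A rough sketch of the idea follows; `build_stuff_prompt` and its template wording are hypothetical and are not LangChain's actual prompt.

```python
from typing import List


def build_stuff_prompt(question: str, chunks: List[str]) -> str:
    """Concatenate retrieved chunks into one context block ahead of the question.

    Hypothetical template for illustration; the real chain uses its own prompt.
    """
    context = "\n\n".join(chunks)
    return (
        "Use the following context to answer the question.\n\n"
        f"{context}\n\n"
        f"Question: {question}\nAnswer:"
    )


prompt = build_stuff_prompt(
    "When are comments due?",
    [
        "Comments are due on June 26, 2023.",
        "The draft was released on April 27, 2023.",
    ],
)
print(prompt)
```

Because every retrieved chunk lands in one prompt, this approach only works while the combined text fits in the model's context window.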

In [24]:
qa.run("Generate an FAQ based on this document.")

Retrying langchain.llms.openai.completion_with_retry.<locals>._completion_with_retry in 4.0 seconds as it raised APIError: Request failed due to server shutdown {
  "error": {
    "message": "Request failed due to server shutdown",
    "type": "server_error",
    "param": null,
    "code": null
  }
}
 500 {'error': {'message': 'Request failed due to server shutdown', 'type': 'server_error', 'param': None, 'code': None}} {'Date': 'Tue, 27 Jun 2023 15:20:11 GMT', 'Content-Type': 'application/json', 'Content-Length': '141', 'Connection': 'keep-alive', 'access-control-allow-origin': '*', 'openai-model': 'text-davinci-003', 'openai-organization': 'slim-ai-1', 'openai-processing-ms': '5788', 'openai-version': '2020-10-01', 'strict-transport-security': 'max-age=15724800; includeSubDomains', 'x-ratelimit-limit-requests': '3000', 'x-ratelimit-limit-tokens': '250000', 'x-ratelimit-remaining-requests': '2999', 'x-ratelimit-remaining-tokens': '249744', 'x-ratelimit-reset-requests': '20ms', 'x-rate

'\n\nQ: What is the CISA Secure Software Development Attestation Form?\nA: The CISA Secure Software Development Attestation Form is a document released by the National Institute of Standards and Technology, U.S. Department of Commerce, as part of a Request For Comments on April 27, 2023. The form is designed to help software producers ensure that their software development processes are secure and compliant with relevant NIST guidance. \n\nQ: What are the requirements of the form?\nA: The form requires software producers to attest that they make consistent use of secure software development practices, as outlined in the form. This includes having a policy or process to address discovered security vulnerabilities prior to product release, and operating a vulnerability disclosure program to accept, review, and address disclosed software vulnerabilities in a timely fashion. \n\nQ: When are comments due on the form?\nA: Comments on the form are due on June 26, 2023.'
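
`qa.run` returns the FAQ as one string with embedded newlines. Assuming the model keeps the `Q:` / `A:` layout seen above (which is not guaranteed), a small parser can turn it into question/answer pairs. `parse_faq` is a hypothetical helper, and the sample string below is made up for the demonstration.

```python
import re
from typing import List, Tuple

# One "Q: ...", a newline, then "A: ..." running up to the next blank line + "Q:" or end of text.
_FAQ_PATTERN = re.compile(
    r"Q:\s*(.*?)\s*\nA:\s*(.*?)\s*(?=\n\s*\nQ:|\Z)",
    re.DOTALL,
)


def parse_faq(raw: str) -> List[Tuple[str, str]]:
    """Split a 'Q: ... / A: ...' string into (question, answer) tuples."""
    return [(q.strip(), a.strip()) for q, a in _FAQ_PATTERN.findall(raw)]


sample_output = (
    "\n\nQ: First question?\nA: First answer. "
    "\n\nQ: Second question?\nA: Second answer."
)
for question, answer in parse_faq(sample_output):
    print(f"Q: {question}\nA: {answer}\n")
```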