## Tune Prompt for Training Data
Next we generate the training data. This step is crucial because the data ultimately shapes how the model responds to queries and how it feels to interact with. Because it matters this much, we'll try several prompts and compare their output before committing to one.

## Environment Setup
This step uses the following libraries:

| Library | License |
|-|-|
| [python-dotenv](https://github.com/theskumar/python-dotenv) | BSD-3-Clause |
| [pandas](https://pandas.pydata.org/docs/getting_started/overview.html) | BSD-3-Clause |
| [pydantic](https://github.com/pydantic/pydantic) | MIT |
| [OpenAI](https://github.com/openai/openai-python) | Apache 2.0 |

In [1]:
import os, json, random
from pathlib import Path
from openai import OpenAI
from dotenv import load_dotenv
import pandas as pd
from IPython.display import display
from pydantic import BaseModel

In [2]:
load_dotenv()
OAI_KEY = os.getenv("PAT_API_KEY")
client = OpenAI(api_key=OAI_KEY)

DOCUMENT    = "FM5_0"
PDF_PATH    = Path("pdfs/raw/fm5-0.pdf")
BASE_MODEL  = Path("QuantFactory/Llama-3.2-1B-GGUF")
GGUF_FILE   = "Llama-3.2-1B.Q8_0.gguf"
CACHE_DIR   = "hf_cache"
DATA_DIR    = DOCUMENT / BASE_MODEL / "data"
MODEL_DIR   = DOCUMENT / BASE_MODEL / "lora"
CHUNKED_DATA = DATA_DIR / "chunked" / "chunked.jsonl"
QA_DATA      = DATA_DIR / "qa"      / "qa_pairs.jsonl"

N_PASSAGES    = 30
OPENAI_MODEL  = "gpt-4o-mini"
TEMPERATURE   = 0.4   # sampling temperature; defined here but not passed to the API calls in this step

Using OpenAI's Structured Outputs feature, we can pass a Pydantic class so that each response is parsed into a fixed schema. Here the schema is a simple question/answer pair.

In [3]:
class QA(BaseModel):
    question:   str
    answer:     str
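
As a rough illustration of what the schema buys us, the sketch below validates raw JSON against the same `QA` model locally, without calling the API. It assumes Pydantic v2 (`model_validate_json`, `model_json_schema`); the sample strings are made up for the example.

```python
from pydantic import BaseModel, ValidationError


class QA(BaseModel):
    question: str
    answer: str


# A well-formed payload parses into a typed object.
good = QA.model_validate_json(
    '{"question": "What is a decisive point?", '
    '"answer": "A geographic place or critical factor"}'
)
print(good.question)

# A payload missing a required field is rejected instead of passing through.
try:
    QA.model_validate_json('{"question": "What is mission command?"}')
    rejected = False
except ValidationError:
    rejected = True
print(rejected)

# The JSON schema derived from the class marks both fields as required.
schema = QA.model_json_schema()
print(schema["required"])
```

The same class is what gets passed as `text_format` in the API calls, so a response that does not fit the schema surfaces as an error rather than as malformed training data.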

Define three candidate system prompts to compare.

In [4]:
PROMPTS = {
    "v1":   """You are a data‑labeling assistant. You will be given a passage from a text that is prepended with context.
                TASK: Read the passage, then extract the context and create a question and answer from the passage.
                - Ask a factual question whose answer is present in the passage, 5‑25 words long.
                - Answer must be copied verbatim (case‑insensitive) from the passage, ≤50 words.
                - Do NOT include the passage in your output.
                - If you cannot find a suitable fact, output "FAIL" in the answer.
                """,

    "v2":   """You turn passages into Q/A.
                RULES — reply with a single question and its corresponding answer with associated metadata:
                    - question: the question (5‑25 words)
                    - answer: the answer (≤ 50 words)

                If no fact fits, return "FAIL" in answer.

                Example 1
                INPUT:
                 A decisive point is a geographic place or critical factor...
                OUTPUT:
                {"question": "What is a decisive point?", "answer": "A geographic place or critical factor"}

                Example 2
                INPUT:
                Mission command is the Army's approach to command and control...
                OUTPUT:
                {"question": "What is mission command?", "answer": "the Army's approach to command and control"}
                """,

    "v3":   """Read the passage and emit exactly ONE question/answer pair with source information:
                              "question" – a factual question (5‑25 words)
                              "answer"   – verbatim answer text from the passage (≤ 50 words)
                        """
}
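
The rules stated in these prompts (question length, answer length, verbatim answers, and the "FAIL" sentinel) can also be checked mechanically, which helps when comparing prompt versions. The following is a minimal sketch, not part of the original notebook: `check_pair` is a hypothetical helper, word counts use simple whitespace splitting, and the verbatim test is a plain case-insensitive substring match that does not normalize line breaks.

```python
def check_pair(passage: str, question: str, answer: str) -> dict:
    """Check one Q/A pair against the rules stated in the prompts."""
    q_words = len(question.split())
    a_words = len(answer.split())
    return {
        "is_fail": answer.strip().upper() == "FAIL",
        "question_len_ok": 5 <= q_words <= 25,
        "answer_len_ok": a_words <= 50,
        # Case-insensitive substring test, as prompt v1 allows.
        "answer_verbatim": answer.strip().lower() in passage.lower(),
    }


passage = "Mission command is the Army's approach to command and control."

good = check_pair(
    passage,
    "What is mission command according to the passage?",
    "the Army's approach to command and control",
)
bad = check_pair(passage, "What is it?", "An approach used by the Army")

print(good)
print(bad)
```

Applied to every row of the results, the share of rows passing each check gives a quick numeric complement to reading the outputs by eye.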

Load the chunks extracted from the PDF, draw a reproducible random sample, and run a single test call to make sure the OpenAI client is set up properly.

In [5]:
passages = []
with CHUNKED_DATA.open(encoding="utf-8") as f:
    for line in f:
        passages.append(json.loads(line)["text"])

random.seed(42)
sampled = random.sample(passages, N_PASSAGES)

In [6]:
resp = client.responses.parse(
                model=OPENAI_MODEL,
                input = [
                    {
                        "role": "system",
                        "content": PROMPTS['v2']
                    },
                    {
                        "role": "user",
                        "content": sampled[0]
                    }
                ],
                text_format=QA
)

In [7]:
print(resp.output_parsed)

question='What should be summarized in the Environmental Considerations section?' answer="The commander's scheme of environmental actions required to support the operation plan, including issues and actions for all operation phases."


Now gather all the responses for the different prompts.

In [8]:
results = []

for name, template in PROMPTS.items():
    for idx, passage in enumerate(sampled):
        try:
            resp = client.responses.parse(
                model=OPENAI_MODEL,
                input = [
                    {
                        "role": "system",
                        "content": template
                    },
                    {
                        "role": "user",
                        "content": passage
                    }
                ],
                text_format=QA
            )
            q, a = resp.output_parsed.question, resp.output_parsed.answer
        except Exception as e:
            q, a = "", ""
            print(e)
        print(f"\r{name}: {idx + 1:02d}/{N_PASSAGES}", end="")

        results.append({
            "prompt_version": name,
            "passage":        passage,
            "question":       q,
            "answer":         a
        })

v3: 30/30
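
The loop above records an empty question and answer whenever a request raises. If transient errors turn out to be common, one option is to retry a few times before giving up. The sketch below is an assumption layered on top of the notebook, not something it does: `with_retries` is a hypothetical helper, and the `flaky` function stands in for the API call so the example runs offline.

```python
import time


def with_retries(fn, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call fn(); on failure wait base_delay * 2**attempt and try again."""
    last_error = None
    for attempt in range(attempts):
        try:
            return fn()
        except Exception as exc:
            last_error = exc
            # Only wait if another attempt is still to come.
            if attempt < attempts - 1:
                sleep(base_delay * 2 ** attempt)
    raise last_error


# Stand-in for an API call that fails twice and then succeeds.
calls = {"n": 0}


def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("temporary failure")
    return "ok"


delays = []  # record the waits instead of actually sleeping
result = with_retries(flaky, attempts=3, base_delay=1.0, sleep=delays.append)
print(result, calls["n"], delays)
```

In the notebook, the callable would wrap the `client.responses.parse(...)` call, with the existing `except` branch kept as the fallback once all attempts are exhausted.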

Now I'll inspect the output to see which version I like best. Sorting by the "passage" column puts the outputs of the different prompts for the same passage next to each other; the table below is unsorted, so its first rows all come from v1.

In [9]:
pd.options.display.max_rows = 100
pd.options.display.max_colwidth = 10000

df = pd.DataFrame(results)
display(df)

Unnamed: 0,prompt_version,passage,question,answer
0,v1,"used to structure this narrative. Refer to Appendix 4 (Geospatial Engineering) to Annex G (Engineer) as required.\n- (5) (U) Environmental Considerations. Summarize the commander's scheme of environmental actions required to support the operation plan, operation order, or concept plan. Identify issues and actions that should be addressed during all phases of the operation. Refer to Appendix 5 (Environmental Considerations) to Annex G (Engineer) as required.\n- (6) (U) Engineer Reconnaissance. State the scheme of engineer reconnaissance by task and purpose for engineer tactical and technical reconnaissance including infrastructure reconnaissance requirements.\n- b. (U) Tasks to Subordinate Units. List engineering tasks to specific units that are not assigned in the base plan or order. List tasks specific to engineering and mobility, countermobility, and survivability assets only as necessary to ensure unity of effort. Specific and detailed task descriptions should be done in each respective appendix as applicable.\n- c. (U) Coordinating Instructions. List only instructions applicable to two or more subordinate units not covered in the base plan or order. Provide additional coordinating instructions for the following:\n- (1) (U) Identify and list the times or events when obstacle control measures become effective.\n- (2) (U) List supported unit information requirements focused on mobility, countermobility, and survivability that must be considered by subordinate engineer staff officers or that the supported unit requires. This includes engineer-related commander's critical information requirements and perhaps the requests for information that have already been submitted to higher.\n- (3) (U) Explain and describe the countermobility and survivability timelines.\n- 4. (U) Sustainment. Identify sustainment priorities for engineer key tasks and specify additional sustainment instructions as necessary, and, at a minimum, address engineer Class IV and V locations. 
Refer to Annex F (Sustainment) as required.\n\n[page number] [CLASSIFICATION]\n\nFigure E-7. Sample Annex G (Engineer) format (continued)\n\n## [CLASSIFICATION]\n\n## ANNEX G (ENGINEER) TO OPERATION PLAN or ORDER [number] [(code name)]-[issuing headquarters] [(classification of title)]\n\n",What should be identified during all phases of the operation related to environmental considerations?,Identify issues and actions that should be addressed during all phases of the operation.
1,v1,"an approved contingency plan exists that closely resembles the emerging scenario, that plan can be refined or adapted as necessary and executed. Contingency plans are often phased, and they have specified end states. Contingency plans seek to re-establish conditions favorable to the United States. Contingency plans have an identified military objective and termination criteria. They address military operations ranging from humanitarian assistance to large-scale combat operations.\n- 2-33. Planning for a contingency encompasses the activities associated with the development of plans for the deployment, employment, sustainment, and redeployment of forces and resources in response to potential crises identified in joint strategic planning documents. The level of planning detail of contingency plans varies based on guidance and changes in the security environment. Planning details range from level 1 to level 4 as discussed in paragraphs 2-34 through 2-37. Although listed sequentially, during a crisis they may be conducted concurrently or compressed depending on the situation and conditions. (See JP 5-0 for more information on contingency planning and associated levels of planning detail.)\n\n## Level 1-Commander's Estimate\n\n- 2-34. The commander's estimate is the commander's initial assessment in which options are provided in a concise statement that defines who, what, when, where, why, and how the course of action will be implemented (JP 5-0). The commander's estimate, at planning level 1, involves the least amount of detail and focuses on producing multiple courses of action (COAs) to address a contingency. The product for this level can be a COA briefing, command directive, commander's estimate, or a memorandum with a required force list. The commander's estimate provides the Secretary of Defense with military COAs to meet a potential contingency.\n\n## Level 2-Base Plan\n\n- 2-35. 
A base plan, at planning level 2, describes the concept of operations, major forces, schemes of support, and anticipated timelines for completing the mission. It normally does not include annexes. A base plan may contain alternatives, including flexible deterrent operations, to provide flexibility in addressing a contingency as it develops or to aid in developing the situation.\n\n## Level 3-Concept Plan\n\n- 2-36. A concept plan",What does the commander's estimate provide to the Secretary of Defense?,The commander's estimate provides the Secretary of Defense with military COAs to meet a potential contingency.
2,v1,"and integrated into the other variables, informational considerations, remembered with the mnemonic METT-TC (I).\n- 1-25. METT-TC (I) represents the mission variables leaders use to analyze and understand a situation in relationship to the unit's mission. Informational considerations are those aspects of the human, information and physical dimensions that affect how humans and automated systems derive meaning from, use, act upon, and are impacted by information (FM 3-0). Informational considerations is expressed as a parenthetical variable in that it is not an independent variable, but an important component of each variable of METT-TC that leaders pay particular attention to when developing understanding of a situation. (See Appendix A for a detailed discussion of the mission variables.)\n\n## IDENTIFY, UNDERSTAND, AND DEVELOP SOLUTIONS TO PROBLEMS\n\n- 1-26. Planning helps leaders better understand and identify problems and develop solutions to solve or manage those problems. A problem is an issue or obstacle that makes it difficult to achieve a desired goal or objective. In a broad sense, a problem exists when an individual becomes aware of a significant difference between what is currently observed or occurring in the environment, and what is desired. In the context of operations, an operational problem is the issue or set of issues that impede commanders from achieving their desired end state. (See paragraph 1-53 for further discussion on identification of problems and problem solving.) Identification of the actual problem to solve is critical to successful planning. Misidentification of the problem often leads to an ineffective plan and operational approach and time critical to subordinates for development of their plans.\n- 1-27. Throughout operations, Army leaders face various problems, requiring unique and creative solutions. Not all problems require the same level of planning. 
Leaders often identify simple problems immediately and quickly decide on a solution-sometimes on the spot. However, planning is critical when a problem is a set of interrelated issues, and the solution to each issue affects the others. For unfamiliar situations, planning offers ways to solve the complete set of problems. In general, the more complex a situation is, the more important and involved the planning effort becomes.\n\n- 1-28. Just as planning is only part of the operations",What does METT-TC (I) represent in the context of mission analysis?,METT-TC (I) represents the mission variables leaders use to analyze and understand a situation in relationship to the unit's mission.
3,v1,"annex uses the five-paragraph attachment format.\n- E-58. Commanders and staffs use Annex P (Host-Nation Support) to describe how sustainment operations support the concept of operations described in the base plan or order. The G-4 or S-4 is the staff officer responsible for Annex P (Host-Nation Support).\n- E-59. Host-nation support is the civil and military assistance provided by the host nation to the forces located in or transiting through that host-nation's territory. Efficient use of available host-nation support can greatly aid forces and augment the deployed sustainment structure. (See figure E-15 on pages 307 through 311 for the Annex P format.)\n\n## [CLASSIFICATION]\n\nPlace the classification at the top and bottom of every page of the attachments. Place the classification marking at the front of each paragraph and subparagraph in parentheses. Refer to AR 380-5 and DODM 5200.01V2 for classification and release marking instructions.\n\nCopy ## of ## copies Issuing headquarters Place of issue Date-time group of signature\n\nMessage reference number\n\nInclude heading if attachment is distributed separately from the base order or higher-level attachment.\n\nANNEX P (HOST-NATION SUPPORT) TO OPERATION PLAN or ORDER [number] [(code name)]-[issuing headquarters] [(classification of title)]\n\n- (U) References: List documents essential to understanding the attachment.\n- a. List maps and charts first. Map entries include series number, country, sheet names or numbers, edition, and scale.\n- b. List other references in subparagraphs labeled as shown.\n- c. Doctrinal references for host-nation support include FM 3-16, FM 5-0, and FM 6-0.\n- (U) Time Zone Used Throughout the Order: Write the time zone established in the base plan or order.\n- (U) Task Organization: Describe the organization of forces (including attachments and detachments to and from the issuing headquarters) and their command and support relationships. 
State when each attachment or detachment is effective (for example, on order, on commitment of the reserve). Refer to Annex A (Task Organization) if long or complicated.\n- 1. (U) Situation. Include information affecting host-nation support",What is the staff officer responsible for Annex P (Host-Nation Support)?,The G-4 or S-4 is the staff officer responsible for Annex P (Host-Nation Support).
4,v1,"the assessment is also largely subjective. Planners may also consider time and space when developing force ratios to more accurately assess where and when engagements could occur and how to determine appropriate force ratios.\n\n5-84. Planners first compare tangible factors of friendly strengths against enemy weaknesses, and vice versa, for combat maneuver units and functional and multifunctional units, as necessary. From this objective analysis, planners produce force ratios which highlight advantages or vulnerabilities for each force that may be exploited or may need additional considerations or protection. These comparisons provide planners insight into effective force employment recommendations. (Refer to FM 3-0 for more information on the dynamics of combat power.)\n\n5-85. After computing force ratios, the staff analyzes the intangible aspects of combat power. A technique for this analysis is conducting a subjective comparison of friendly strengths and enemy weaknesses for each dynamic of combat power. The resulting analysis can be beneficial by either reinforcing or offsetting the advantages and vulnerabilities identified by the objective analysis. Often, the intangible factors are more\n\nimportant than the number of tanks or tubes of artillery. This can lead to planner's effectively identifying decision points for the effective employment of forces. (See table 5-4 for an example of a relative combat power assessment.)\n\nTable 5-4. Example relative combat power assessment\n\n5-86. In troop-to-task analysis for stability and defense support of civil authorities, staffs determine relative combat power by comparing available resources to specified or implied stability or defense support of civil authorities tasks. This analysis provides insights into available options and needed resources.\n\n## Generate Options\n\n5-87. Based on the commander's guidance and the results of the initial relative combat power assessment, the staff generates options. 
A good COA can defeat feasible enemy COAs and ensure the unit remains flexible to execute branches or sequels. In an unconstrained planning environment, planners aim to develop several possible COAs. Depending on available time, commanders may limit the planning options, consistent with the commander's guidance. Options focus on enemy COAs arranged in order of their probable adoption.\n\n5-88. Brainstorming can be used for generating options. It requires extra time, imagination, and creativity, but it produces the widest range of options. The staff (and members of organizations outside the headquarters) remains unbiased and open-minded when developing proposed options.\n\n- 5-89. In developing",What factors do planners compare to assess combat strengths and weaknesses?,"Planners first compare tangible factors of friendly strengths against enemy weaknesses, and vice versa."
5,v1,"commanders execute it because something planned has or has not been successful. In planning priorities, commanders plan a be-prepared mission after any on-order mission.\n- 5-38. Once staff members have identified specified and implied tasks, they ensure understanding of each task's requirements and purpose. The staff then identifies one or two essential tasks. An essential task is a specified or implied task that must be executed to accomplish the mission. Essential tasks are always included in the unit's mission statement.\n\n## Review Available Assets and Identify Resource Shortfalls\n\n5-39. The commander and staff analyze the current task organization, command and support relationships, and status (including current capabilities and limitations) of all units, specifically identifying changes. This analysis also includes capabilities of civilian and military organizations (including joint and multinational) that operate within their unit's assigned area or are otherwise designated to support. They consider relationships among specified, implied, and essential tasks and available assets. Staff officers use the capabilities and resources recorded in their running estimates as a starting point for their analysis. From this analysis, staffs conduct an initial assessment to determine if they have the resources needed to complete all tasks. Staffs may also conduct a preliminary relative combat power assessment to give commanders a rough comparison of friendly and enemy maneuver units. If obvious shortages are identified in any area, they may request from higher headquarters any additional resources or units believed necessary for mission success. Staffs also identify any deviations from the normal task organization and provide them to commanders to understand and consider when developing the planning guidance. A more detailed analysis of available assets and relative combat power occurs during COA development.\n\n## Determine Constraints\n\n5-40. 
The commander and staff identify any constraints placed on their command. A constraint is a restriction placed on the command by a higher command. A constraint dictates an action or inaction, thus restricting the freedom of action of a subordinate commander. Constraints are found in paragraph 3 of the OPLAN or OPORD. Annexes to the order may also include constraints. The operation overlay, for example, may contain a restrictive fire line or a no-fire area. Constraints may also be issued verbally, in WARNORDs, or in policy memoranda. Staff officers may use relevant constraints recorded in their running estimates as a starting point to their analysis.",What must be executed to accomplish the mission according to the passage?,An essential task is a specified or implied task that must be executed to accomplish the mission.
6,v1,"and facilitates continuous information sharing. Internally, this interaction allows the staff to receive guidance from the commander and resolve issues as they arise. Additionally, it provides a structure and framework for the staff to work collectively and produce a coordinated plan. Externally, the MDMP facilitates information sharing among headquarters. As decisions, information, and staff products become available, the higher headquarters sends them to subordinates in WARNORDs. WARNORDs facilitate parallel planning by providing critical information to allow subordinates to start necessary planning and preparation activities.\n\n## Role of the Commander\n\n- 5-7. Commanders are the most important participants in the MDMP. Through their presence and actions, commanders actively drive this portion of the operations process. More than simply decision makers, commanders use their experience, knowledge, and judgment to guide staff planning efforts. During the MDMP, commanders focus their activities on understanding and visualizing the OE and describing their commander's visualization. While unable to devote all their time to the MDMP, commanders follow the status of the planning effort, participate during critical points of the process, provide guidance, and make decisions based on the detailed work of their staff. The commander issues guidance throughout the MDMP including, but not limited, to the following:\n- · Upon receipt of or in anticipation of a mission (initial planning guidance).\n- · Following mission analysis (planning guidance for COA development, the desired end state and commander's intent).\n- · Following COA development (revised planning guidance for COA analysis).\n- · During COA approval (revised planning guidance to complete the plan).\n- · Upon receipt of new information that invalidates assumptions or changes understanding of the OE.\n\n5-8. Commanders use their experience and judgment to add depth and clarity to their planning guidance. 
They ensure staffs understand the broad outline of their visualization while allowing the latitude necessary to explore different options. This guidance provides the basis for a detailed concept of operations without dictating the specifics of the final plan. As with their intent, the commander may modify planning guidance based on staff and subordinate input and changing conditions.\n\n5-9. Figure 5-1 lists several interactions between the commander and staff to discuss, assess, and approve or",What do WARNORDs facilitate in the planning process?,WARNORDs facilitate parallel planning by providing critical information to allow subordinates to start necessary planning and preparation activities.
7,v1,"blank.\n\n## Chapter 3 Army Problem Solving\n\nThis chapter describes a systematic approach to solving problems. The chapter begins by discussing problem solving as related to decision making. This chapter establishes the base logic for all other problem-solving planning processes. The chapter concludes by discussing the seven-step process used in Army problem solving.\n\n## PROBLEM SOLVING AND DECISION MAKING\n\n- 3-1. The ability to recognize and effectively solve problems is an essential skill for leaders. A problem is an issue or obstacle that makes it difficult to achieve a desired goal, objective, or end state. Army problem solving is a form of decision making. It is a systematic approach to defining a problem, developing possible solutions to solve the problem, arriving at the best solution, and implementing it. The object of problem solving is not just to solve near-term problems, but to also do so in a way that forms the basis for long-term success.\n- 3-2. Not all problems require lengthy analysis to solve. For simple problems, leaders often make decisions quickly-sometimes on the spot. However, for complicated problems involving a variety of factors, a systematic problem-solving approach is essential. How much analysis is required to effectively solve a problem depends on the problem's complexity, the leader's experience, and amount of time available.\n- 3-3. Army problem solving supports a single leader working alone or a group of leaders working together. Commanders normally direct their staffs or subordinate leaders to work together to recommend solutions to problems. In formal situations, they present their recommendations as staff studies, decision papers, and decision briefings. At lower echelons, recommendations are normally presented orally. (See FM 6-0 for more information on staff studies, decision papers, and decision briefings.)\n- 3-4. Problem solving is an art and science. 
It is a structured analytic process designed to ensure that all critical factors relevant to the problem are considered, and the relationships between variables are anticipated and accounted for in the solution. This ensures that the desired objective or end state is achieved in the most effective and efficient manner.\n- 3-5. The art of problem solving involves subjective analysis of variables that, in many cases, cannot be easily measured. Leadership and morale, for example, are difficult to measure, but they",What is the main goal of Army problem solving?,"The object of problem solving is not just to solve near-term problems, but to also do so in a way that forms the basis for long-term success."
8,v1,"the base plan or order. Document coordination and reach back support requests in accordance with space coordinating authority guidance such as 'Space Coordinating Plans' and other directives for the area of responsibility; include unique equipment sustainment and technical points of contact.\n- 4. (U) Sustainment. Identify priorities of sustainment for space operations key tasks and specify additional instructions as required. Refer to Annex F (Sustainment) as required.\n- a. (U) Logistics. Identify unique sustainment requirements, procedures, and guidance to support space operations teams and operations. Specify procedures for specialized technical logistic support from external organizations as necessary. Use subparagraphs to identify priorities and specific instructions for space operations logistic support. Refer to Annex F (Sustainment) and Annex P (Host-Nation Support) as required.\n- b. (U) Personnel. Use subparagraphs to identify priorities and specific instructions for human resources support, financial management, legal support, and religious support. Refer to Annex F (Sustainment) as required.\n- c. (U) Health System Support. Identify availability, priorities, and instructions for medical care. Refer to Annex F (Sustainment) as required.\n\n## 5. (U) Command and Signal.\n\n- a. (U) Command.\n- (1) (U) Location of the Commander and Key Leaders. State the location of the commander and key space leaders such as the space coordinating authority, director of space forces, Combined Space Operations Center, cyber electromagnetic warfare officers, and other key reachback leaders.\n- (2) (U) Succession of Command. State the succession of command if not covered in the unit's standard operating procedures.\n- (3) (U) Command Posts. Describe the employment of space-related command and control and functional chains including their location and contact information. 
Describe the employment of command posts, including their locations and when operational and non-operational. State the primary controlling command post for specific tasks or phases of the operation (for example, 'The division tactical command post will control the air assault').\n- (4) (U) Liaison Requirements. State the space liaison requirements not covered in the unit's standard operating procedures, such as air component coordination element or multinational space officers.\n\n## [page number]\n\n[CLASSIFICATION]\n\nFigure E-",What should be identified regarding sustainment priorities for space operations?,Identify priorities of sustainment for space operations key tasks and specify additional instructions as required.
9,v1,"Planning Guidance\n\n2-15. Prepared by DOD and approved by the President, the CPG fulfills the statutory requirement in Title 10, USC, Section 113 and provides written policy guidance on the preparation and review of campaign and contingency plans (including prioritization) to the CJCS and CCDRs for contingency planning. Contingency\n\nplans are branches of combatant command campaign plans. The CPG focuses the guidance given in the NSS and NDS and is the principal source document for the JSCP. (See paragraphs 2-31 through 2-33 for more information on contingency planning.)\n\n## Joint Strategic Campaign Plan\n\n- 2-16. The JSCP fulfills the CJCS's statutory responsibilities in Title 10, USC, Section 153, to assist the President and Secretary of Defense in providing for strategic direction to the joint force and implementing the strategic guidance in the NSS, NDS, NMS, and CPG. The JSCP provides this guidance to CCDRs, Service chiefs, combat support agencies and applicable DOD agencies for preparation of plans based on current military capabilities, strategic guidance and contingency planning guidance identified to the CJCS in the CPG.\n- 2-17. In addition to communicating to the CCMDs' specific planning guidance, the JSCP operationalizes the strategic vision described in the NMS and nests with the strategic direction delineated by the NSS, NDS, and GFMIG. The JSCP also provides integrated planning guidance and direction for planners to fulfill the CJCS's role as the global integrator.\n\n## Global Campaign Plans\n\n- 2-18. A global campaign plan is the primary means by which the CJCS or designated CCDRs arrange for unity of effort through which they guide the planning, integration, and coordination of joint operations across CCMD areas of responsibility (AORs) and functional responsibilities. Global campaign plans address the most pressing transregional and multifunctional strategic challenges across all domains. 
Each global campaign plan has an assigned coordinating authority which is a CCDR with the primary responsibility for a global campaign plan. Contingency plans to a global campaign plan are called integrated contingency plans.\n\n## Functional Campaign Plans\n\n- 2-19. Functional campaign plans address functional threats or challenges that are not geographically constrained and",What does the CPG provide written policy guidance on?,the preparation and review of campaign and contingency plans (including prioritization) to the CJCS and CCDRs for contingency planning.
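
To line up the three prompt versions for the same passage, the results can be sorted or pivoted. This is a small sketch on toy rows (the `p1`/`a1-v1` values are placeholders, not real output); it assumes each passage appears at most once per prompt version, since `DataFrame.pivot` raises on duplicate index/column pairs.

```python
import pandas as pd

# Toy rows standing in for `results`; the real ones come from the API loop.
rows = [
    {"prompt_version": "v1", "passage": "p1", "question": "q1-v1", "answer": "a1-v1"},
    {"prompt_version": "v1", "passage": "p2", "question": "q2-v1", "answer": "a2-v1"},
    {"prompt_version": "v2", "passage": "p1", "question": "q1-v2", "answer": "a1-v2"},
    {"prompt_version": "v2", "passage": "p2", "question": "q2-v2", "answer": "a2-v2"},
    {"prompt_version": "v3", "passage": "p1", "question": "q1-v3", "answer": "a1-v3"},
    {"prompt_version": "v3", "passage": "p2", "question": "q2-v3", "answer": "a2-v3"},
]
df = pd.DataFrame(rows)

# Long view: rows for the same passage become adjacent.
by_passage = df.sort_values(["passage", "prompt_version"]).reset_index(drop=True)

# Wide view: one row per passage, one column per prompt version.
wide = df.pivot(index="passage", columns="prompt_version", values="answer")
print(wide)
```

The wide view is convenient for reading answers across versions; the same call with `values="question"` does the same for the questions.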


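Pick a prompt version (v2 here), generate a Q/A pair for every passage, and save the results as JSONL. Passages whose request fails are stored with an empty question and answer.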

In [10]:
template = PROMPTS["v2"]

In [13]:
training_data = []
for idx, passage in enumerate(passages):
    try:
        resp = client.responses.parse(
            model=OPENAI_MODEL,
            input = [
                {
                    "role": "system",
                    "content": template
                },
                {
                    "role": "user",
                    "content": passage
                }
            ],
            text_format=QA
        )
        q, a = resp.output_parsed.question, resp.output_parsed.answer
    except Exception as e:
        q, a = "", ""
        print(e)
    print(f"\r{idx + 1:02d}/{len(passages)}", end="")

    training_data.append({
        "passage":        passage,
        "question":       q,
        "answer":         a
    })

462/462

In [14]:
QA_DATA.parent.mkdir(parents=True, exist_ok=True)  # make sure the output directory exists
with open(QA_DATA, "w", encoding="utf-8") as f:
    for d in training_data:
        f.write(json.dumps(d, ensure_ascii=False) + "\n")
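
Because failed requests are stored as empty strings and the prompt asks the model to answer "FAIL" when no fact fits, the saved file may contain rows that should not be used for training. Below is a minimal sketch of a filtering pass, under the assumption that such rows should simply be dropped; `is_usable` is a hypothetical helper, the rows are invented, and a temporary directory is used so the example does not touch the real `QA_DATA` file.

```python
import json
import tempfile
from pathlib import Path


def is_usable(row: dict) -> bool:
    """Keep rows with a non-empty question and an answer other than FAIL."""
    question = row.get("question", "").strip()
    answer = row.get("answer", "").strip()
    return bool(question) and bool(answer) and answer.upper() != "FAIL"


# Invented rows: one good pair, one failed request, one FAIL answer.
training_data = [
    {"passage": "p1", "question": "What is a constraint?", "answer": "a restriction"},
    {"passage": "p2", "question": "", "answer": ""},
    {"passage": "p3", "question": "What is stated here?", "answer": "FAIL"},
]
clean = [row for row in training_data if is_usable(row)]

out_path = Path(tempfile.mkdtemp()) / "qa" / "qa_pairs.clean.jsonl"
out_path.parent.mkdir(parents=True, exist_ok=True)
with out_path.open("w", encoding="utf-8") as f:
    for row in clean:
        f.write(json.dumps(row, ensure_ascii=False) + "\n")

# Read the file back to confirm the JSONL round-trips.
with out_path.open(encoding="utf-8") as f:
    reloaded = [json.loads(line) for line in f]
print(len(training_data), len(clean), len(reloaded))
```

Counting how many rows are dropped is also a cheap sanity check on the chosen prompt: a high share of "FAIL" answers suggests the passages or the prompt need another look.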