# 1-Generate Observations using LangChain

- **Goal:** Use LLMs to generate textual observations for the following domains: financial, health, policy, weather, sports, and miscellaneous. 

- **Code Structure:** 

    1. Base template: This is included in every domain's input.
    2. Domain template: Varies by domain or is specific to a single domain.

- **Run Notebook:**

    1. See README.md for installation and initial setup.
    2. Choose models (from `text_generation_models.py`) to generate data.
    3. Here in Jupyter Notebook, click the `Run All` button in the top menu bar.
    4. Reach out if you run into any problems.

In [1]:
import os
import sys

import pandas as pd

from tqdm import tqdm
from langchain_core.prompts import PipelinePromptTemplate, PromptTemplate

# Get the current working directory of the notebook
notebook_dir = os.getcwd()
# Add the parent directory to the system path
sys.path.append(os.path.join(notebook_dir, '../'))

from log_files import LogData
from data_processing import DataProcessing
from text_generation_models import TextGenerationModelFactory

## Base Templates for Domain Observations

- `{observation_properties}`: The variables (properties) that make up each observation.
- `{observation_requirements}`: Constraints on how generated observations should be limited or expressed.
- `{observation_templates}`: Sentence templates that give the LLM the proper structure/syntax.
- `{observation_examples}`: Examples that match the templates.
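
As a rough illustration of how these four placeholders fit together, the sketch below composes a final prompt with plain `str.format` rather than LangChain. The shortened section strings and the `compose_prompt` helper are hypothetical stand-ins for illustration, not the notebook's actual templates or LangChain's implementation.

```python
# Hypothetical, shortened stand-ins for the four section templates.
SECTION_TEMPLATES = {
    "observation_properties": "Properties for the {observation_domain} domain.",
    "observation_requirements": "Requirements for {observation_domain} observations.",
    "observation_templates": "Templates for {observation_domain} observations.",
    "observation_examples": "Examples:{domain_examples}",
}

# Same layout as the full template: four sections separated by blank lines.
FULL_TEMPLATE = (
    "{observation_properties}\n\n"
    "{observation_requirements}\n\n"
    "{observation_templates}\n\n"
    "{observation_examples}\n"
)


def compose_prompt(section_templates, full_template, **values):
    """Format each section first, then slot the results into the full template."""
    sections = {
        name: template.format(**values)  # extra keyword arguments are ignored
        for name, template in section_templates.items()
    }
    return full_template.format(**sections)


prompt = compose_prompt(
    SECTION_TEMPLATES,
    FULL_TEMPLATE,
    observation_domain="weather",
    domain_examples="\n   - It rained.",
)
print(prompt)
```

Formatting the sections before the final template keeps each section reusable across domains, which is the same idea the base/domain split in this notebook relies on.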

In [2]:
full_observation_template = """{observation_properties}

{observation_requirements}

{observation_templates}

{observation_examples}
"""
full_observation_prompt = PromptTemplate.from_template(full_observation_template)

In [3]:
observation_properties_template = """An observation <o> = (<o_s>, <o_t>, <o_d>, <o_a>), where it consists of the following four properties:

    1. <o_s>, any source entity in the {observation_domain} domain.
        - Can be a person (with a name) or a {observation_domain} person such as a {observation_domain} reporter, {observation_domain} analyst, {observation_domain} expert, {observation_domain} top executive, {observation_domain} senior level person, civilian, etc.
        - Can only be an organization that is associated with the {observation_domain} observation.
    2. <o_t>, any target entity in the {observation_domain} domain.
        - Can be a person (with a name) or a {observation_domain} person such as a {observation_domain} reporter, {observation_domain} analyst, {observation_domain} expert, {observation_domain} top executive, {observation_domain} senior level person, etc.
        - Can only be an organization that is associated with the {observation_domain} observation.
    3. <o_d>, date or time range when <o> is expected to come to fruition or when one should observe the <o>.
        - Forecast can range from a second to anytime in the future.
        - Answers the questions: "How far to go out from today?" or "Where to stop?".
    4. <o_a>, {observation_domain} observation output.
        - Characteristics of domain-specific outputs such as various quantifiable metrics relevant to the {observation_domain} domain.
        - Some examples are {observation_domain_output}.
"""
observation_properties_prompt = PromptTemplate.from_template(observation_properties_template)

In [4]:
observation_requirements_template = """Requirements to use for each observation:

    - Should be based on real-world {observation_domain} data and not hallucinate.
    - Must be a simple sentence (observation) (and NOT compounding using "and" or "or").
    - Should diversify all four properties of the observation (<o>) as in change and not use same for <o_s>, <o_t>, <o_d>, <o_a>.
    - The observation should be unique and not repeated.
    - Do not number the observations.
    - In front of every observation, put the template number in the format of "T1:", "T2:", etc. and do not number them like "1.", "2.", etc. Should have template number and generated prediction matching.
    - Must not generate, "template 1:..., template 2:..., etc" or anything similar and don't generate "T1:", "T2:", etc by itself.    
    - Must not generate, "Here are {observations_N} unique observations based on the provided templates" or anything similar.
    - Change how the current date (<o_d>) is written in the observation, with examples of (1) Wednesday, August 21, 2024; (2) Wed, August 21, 2024; (3) 08/21/2024; (4) 08/21/2024; (5) 21/08/2024; (6) 21 August 2024; (7) 2024/08/21; (8) 2024-08-21; (9) August 21, 2024; (10) Aug 21, 2024; (11) 21 August 2024, (12) 21 Aug 2024, Q3 of 2027, 2029 of Q3, etc.
    - Do not use any of the examples in the prompt.
    - Do not put template number on line by itself. Always pair with an observation.
    - Disregard brackets: "<>"
    - Do not use a person name or entity name more than once, as in don't use the name Joe as both the <o_s> and <o_t>, unless like Mr. Sachs and Goldman Sachs or Mr. Sam Walton and Sam's Club, etc.
    - The source entity (<o_s>) is rarely the same as the target entity (<o_t>) and if same, the <o_s> is making an observation on itself in the <o_t>.
    - Should vary the slope of rose/increased/as much as, fell/decreased/as little as, changed, stayed stable, high/low chance/probability/degree of, etc.
    - Must be past tense as in already occurred and not future tense.
    - Must not use will, would, be going to, should, etc. in the observation.
    - Do not include "{observation_domain} template 1:	"
    - Should vary the past tense observation verbs such as observed, saw, noted, etc.
"""
observation_requirements_prompt = PromptTemplate.from_template(observation_requirements_template)
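
The date-format requirement above asks for the same date to be written in several styles. As a small sketch of what that variety looks like, the block below renders one date with a handful of `strftime` patterns. The `DATE_FORMATS` labels and the `render_date_variants` helper are names invented for this illustration, and the month and weekday names assume Python's default (English) locale.

```python
from datetime import date

# Illustrative format strings; the dictionary keys are this sketch's own labels.
DATE_FORMATS = {
    "weekday_long": "%A, %B %d, %Y",   # weekday, full month name, day, year
    "us_numeric": "%m/%d/%Y",          # month/day/year
    "day_first_numeric": "%d/%m/%Y",   # day/month/year
    "day_month_year": "%d %B %Y",      # day, full month name, year
    "iso": "%Y-%m-%d",                 # year-month-day
    "month_abbrev": "%b %d, %Y",       # abbreviated month name, day, year
}


def render_date_variants(day):
    """Return the same date rendered once per format string."""
    return {label: day.strftime(fmt) for label, fmt in DATE_FORMATS.items()}


variants = render_date_variants(date(2024, 8, 21))
for label, text in variants.items():
    print(f"{label}: {text}")
```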

In [5]:
observation_templates_template = """Here are some {observation_domain} templates:

    - {observation_domain} template 1: <o_s> observed that the <o_a> at <o_t> had remained stable in <o_d>.
    - {observation_domain} template 2: On <o_d>, <o_s> monitored the <o_a> at <o_t> changed.
    - {observation_domain} template 3: <o_s> noted on <o_d>, the <o_t> <o_a> fell.
    - {observation_domain} template 4: According to <o_s>, the <o_a> at <o_t> rose <o_d>.
    - {observation_domain} template 5: In <o_d>, <o_s> envisioned that <o_t> <o_a> decreased.
    - {observation_domain} template 6: <o_t> <o_a> increased <o_d>, according to <o_s>.  
"""
observation_templates_prompt = PromptTemplate.from_template(observation_templates_template)

In [6]:
observation_examples_template = """Here are some examples of {observation_domain} observations:
{domain_examples}

With the above (observation with four properties, requirements, templates, and examples), generate a unique set of {observations_N} observation per template following the examples. Think from the perspective of an {observation_domain} analyst, expert, top executive, or senior level person and even a college student, professional, research advisor, etc.
"""
observation_examples_prompt = PromptTemplate.from_template(observation_examples_template)

In [7]:
observation_input_prompts = [
    ("observation_properties", observation_properties_prompt),
    ("observation_requirements", observation_requirements_prompt),
    ("observation_templates", observation_templates_prompt),
    ("observation_examples", observation_examples_prompt),
]
pipeline_prompt = PipelinePromptTemplate(
    final_prompt=full_observation_prompt, pipeline_prompts=observation_input_prompts
)


## Specific Templates for Domain Observations

- For now, this notebook generates 1 observation per template. The plan is to try 3 next and then increase in multiples of 3.

- With 1 observation per template:
    - 1 observation per template x 6 templates per domain = 6 observations per domain
    - 6 observations per domain x 4 domains = 24 observations per model
    - 24 observations per model x 2 models = 48 observations across all models
    - 48 observations across all models x 2 batches = 96 observations across all batches
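
The arithmetic in the list above can be written as a single product. The sketch below uses the figures from that list and reads the factor of 6 as the six templates defined earlier in this notebook (an interpretation). The `total_observations` helper is a hypothetical name for illustration only.

```python
def total_observations(per_template, templates, domains, models, batches):
    """Multiply the five factors to get the total number of observations."""
    return per_template * templates * domains * models * batches


# Figures from the list above: 1 per template, 6 templates, 4 domains,
# 2 models, 2 batches.
current_total = total_observations(
    per_template=1, templates=6, domains=4, models=2, batches=2
)

# Same setup with 3 observations per template, the next planned step.
next_total = total_observations(
    per_template=3, templates=6, domains=4, models=2, batches=2
)

print(current_total, next_total)
```

Because every factor multiplies the others, changing the number of domains, models, or batches scales the total in the same way as changing the per-template count.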

In [8]:
examples_per_template = 1
generate_N_observations_per_template = 1 * examples_per_template

### Template for Financial Observations

In [9]:
financial_outputs = """stock price, net profit, revenue, operating cash flow, research and development expenses, operating income, gross profit."""
financial_requirements = """- Should be based on real-world financial earnings reports.
   - Suppose the time when <o> was made is during any earning season.
   - Include stocks from all sectors such as consumer staples, energy, finance, health care, industrials, materials, media, real estate, retail, technology, utilities, defense, etc.
   - Include the US Dollar sign ($) before or USD after the amount of the financial output."""
financial_examples = """
   - financial examples for template 1:
      1. Joseph, the young entrepreneur, observed that the revenue at FUBU (his parents' clothing line) had increased for Q3 2028.
      2. BJ monitored the operating cash flow at UF's School of Engineering and saw it decrease in 05/2021.
      3. A new investor noticed the ETFs in his portfolio grew exponentially from Apr 7, 1997 to Apr 7, 2009.
   - financial examples for template 2:
      1. On March 15, 2025 to March 16, 2026, Goldman Sachs observed that the interest rates at the Federal Reserve rose.
      2. On April 2, 2000, Fidelity noted that the valuation of the market value at Tesla fell.
      3. On 1/23/2012, Chase analysts recorded that their stock prices increased.
   - financial examples for template 3:
      1. Charles Schwab observed that on 3/2/2035, the NASDAQ composite index climbed moderately.
      2. BlackRock documented that on April 22, 2028, the value of Bitcoin rose sharply.
      3. Morgan Stanley reported that on the 3rd of May, 2025, their stock price declined.
   - financial examples for template 4:
      1. According to Chase Bank, the returns at emerging market equities went down in May 2035.
      2. According to Ryan, the revenue at Meta Platforms dropped in Q2 2055.
      3. According to Apple, its trading volume increased on 1/2/2027.
   - financial examples for template 5:
      1. In Q1 2012, Wells Fargo noted that U.S. Treasury yields remained stable.
      2. On 5/5/2002, Bob detected that the inflation rate at Wells Fargo remained stable.
      3. In June 1997, Rob recorded that the stocks he held remained stable.
   - financial examples for template 6:
      1. Apple stock price decreased in Quarter 3 of 2046, according to Roger.
      2. The NASDAQ index rose on 7/5/2060, according to Bank of America.
      3. The stock price increased in July 1999, according to my records.
"""
financial_input_dict = {
    "observation_domain": "financial",
    "observation_domain_output": financial_outputs,
    "domain_requirements": financial_requirements,
    "domain_examples": financial_examples,
    "observations_N": generate_N_observations_per_template
}
financial_prompt_output = pipeline_prompt.format(**financial_input_dict)
print(financial_prompt_output)


An observation <o> = (<o_s>, <o_t>, <o_d>, <o_a>), where it consists of the following four properties:

    1. <o_s>, any source entity in the financial domain.
        - Can be a person (with a name) or a financial person such as a financial reporter, financial analyst, financial expert, financial top executive, financial senior level person, etc, civilian.
        - Can only be an organization that is associated with the financial observation.
    2. <o_t>, any target entity in the financial domain.
        - Can be a person (with a name) or a financial person such as a financial reporter, financial analyst, financial expert, financial top executive, financial senior level person, etc).
        - Can only be an organization that is associated with the financial observation.
    3. <o_d>, date or time range when <o> is expected to come to fruition or when one should observe the <o>.
        - Forecast can range from a second to anytime in the future.
        - Answers the questions: "
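
Since each domain passes a dictionary of inputs into the pipeline, it can help to compare the dictionary keys against the placeholders the templates actually reference. The sketch below does this with the standard-library `string.Formatter`. The `placeholders`, `unused_keys`, and `missing_keys` helpers and the shortened template strings are hypothetical; only the placeholder names and dictionary keys are copied from this notebook. Judging only from the templates shown here, none of them contains a `{domain_requirements}` placeholder, so that entry appears to be accepted but never rendered into the prompt.

```python
from string import Formatter


def placeholders(template):
    """Return the set of named fields referenced in a format-style template."""
    return {field for _, field, _, _ in Formatter().parse(template) if field}


def unused_keys(templates, inputs):
    """Keys that are supplied but not referenced by any template."""
    referenced = set().union(*(placeholders(t) for t in templates))
    return set(inputs) - referenced


def missing_keys(templates, inputs):
    """Placeholders that are referenced but have no supplied value."""
    referenced = set().union(*(placeholders(t) for t in templates))
    return referenced - set(inputs)


# Shortened stand-ins that keep the same placeholder names as the notebook.
templates = [
    "Any entity in the {observation_domain} domain, e.g. {observation_domain_output}.",
    "Use real-world {observation_domain} data; never say 'Here are {observations_N}'.",
    "Here are some {observation_domain} templates:",
    "Examples of {observation_domain} observations:{domain_examples} "
    "Generate {observations_N} per template.",
]

# Same keys as the per-domain input dictionaries; the values are placeholders.
inputs = {
    "observation_domain": "financial",
    "observation_domain_output": "stock price, revenue",
    "domain_requirements": "- Should be based on real-world reports.",
    "domain_examples": "\n   - example",
    "observations_N": 1,
}

print("unused:", unused_keys(templates, inputs))
print("missing:", missing_keys(templates, inputs))
```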

### Template for Health Observations

In [10]:
health_outputs = """obesity rates, prevalence of chronic illnesses, average physical activity levels, nutritional intake, etc."""
health_requirements = """- Should be based on real-world health reports.
   - Suppose the time when <o> was made is during any season such as flu season, allergy season, pandemic, epidemic, etc.
   - Include reports from all health organizations, researchers, doctors, physical therapists, physician assistants, nurse practitioners, fitness experts, etc."""
health_examples = """
   - health examples for template 1:
      1. Florida caught that the patients' blood glucose at all hospitals in Florida improved from Q1 2021 to Q3 2021.
      2. Nurse John observed that the heart rate in patients at Alaska's General Hospital had stabilized from 2023 January to 2023 Dec.
      3. I noted the number of visits by my patients in Piscataway, NJ decreased from start of week to end of week.
   - health examples for template 2:
      1. On 3/14/2017, the CDC observed that vaccination adherence at urban elementary schools changed.
      2. On September 8, 2034, Sam reported that the calcium intake at prenatal clinics in the Midwest increased.
      3. On the 12th of November, 2020, I recorded that hypertension rates at the state level decreased.
   - health examples for template 3:
      1. The NIH reported that on 6/22/2042, public participation in mental health workshops declined.
      2. Alex identified that on October 19, 2074, the diabetes prevalence at regional hospitals decreased.
      3. I noted that on January 3, 2025, the daily step count of individuals significantly changed.
   - health examples for template 4:
      1. According to the study conducted at UF, the hydration levels at Florida middle schools dropped in Spring 2035.
      2. According to Joe, the fiber intake at the university cafeterias fell on 10/14/2055.
      3. According to my study of wellness habits, his average sleep duration decreased in Q4, 2029.
   - health examples for template 5:
      1. On 5/2/2028, Dr. Maria Thompson tracked that national cholesterol averages declined.
      2. In July 2051, Professor James Liu recorded that aerobic capacity among children increased.
      3. In Q1 2027, Dr. Aisha Reynolds observed that her protein consumption remained stable.
   - health examples for template 6:
      1. Physical activity levels among seniors fell on 8/7/2015, according to Dr. Elena Morales’ study.
      2. Nutritional awareness in South America became more evident on the 3rd of September 2016, according to me.
      3. Sarah's flu vaccination participation rose in early Q4 2024, according to Sarah’s records.
"""
health_input_dict = {
    "observation_domain": "health",
    "observation_domain_output": health_outputs,
    "domain_requirements": health_requirements,
    "domain_examples": health_examples,
    "observations_N": generate_N_observations_per_template
}
health_prompt_output = pipeline_prompt.format(**health_input_dict)

### Template for Policy Observations

In [11]:
policy_outputs = """election outcomes, economic reforms, legislative impacts."""
policy_requirements = """- Should be based on real-world policy reports.
    - Suppose the time when <o> was made is during an election cycle or non-election cycles.
    - Include policies & laws, from all sectors such as consumer staples, energy, finance, health care, industrials, materials, media, real estate, retail, technology, utilities, defense, etc."""
policy_examples = """
   - policy examples for template 1:
      1. Local journalist, Aaron, identified economic reforms in Thomson, GA rose in Jan 2033.
      2. Policy analyst, Michael (Ph.D), remarked that the home tax in Austin, TX had increased on 7/9/28.
      3. Policy maker Sarah noted that company employment rates in her city San Francisco had risen from Q1 2025 to Q3 2025.
   - policy examples for template 2:
      1. On 4/5/2030, the Brookings Institution noted that lobbying intensity at swing districts stayed stable.
      2. On March 3, 2021, the International Monetary Fund observed that trade policy compliance at Southeast Asian nations rose.
      3. On the 18th of July, 2026, policy analyst Rachel Kim reported that tax incentives in her clean energy firms decreased.
   - policy examples for template 3:
      1. Representative Angela Brooks observed that on October 15, 2027, the infrastructure funding distribution remained stable.
      2. Economist Dr. Henry Zhao recorded that on 6/4/2023, the property tax rate in urban zones increased.
      3. Senator Michael Greene noted that on November 3, 2022, his campaign donations in rural counties declined.
   - policy examples for template 4:
      1. According to state representative Alicia Ramirez, civic participation at state-level agencies increased in early 2018.
      2. According to Thomas Nguyen, regulatory support in the transportation sector fell in Q1 2034.
      3. According to policy advisor Natalie Chen, the job creation rate at her nonprofit coalition remained stable in October 2026.
   - policy examples for template 5:
      1. On 3/2/2024, Senator Jordan Ellis observed that educational grant spending stayed stable.
      2. In March 2026, economist Dr. Priya Nandakumar reported that food assistance claims in urban counties increased.
      3. In Q4 of 2022, policy strategist Kevin Adler noted that his green tech subsidy approvals declined.
   - policy examples for template 6:
      1. Renewable energy investments inflated in Q3 2023, according to Dr. Elena Foster.
      2. Housing subsidies decreased in December 2051, according to Senator Marcus Lee.
      3. My advocacy involvement in education reform stayed the same on 9/5/2020, noted by me.
"""
policy_input_dict = {
    "observation_domain": "policy",
    "observation_domain_output": policy_outputs,
    "domain_requirements": policy_requirements,
    "domain_examples": policy_examples,
    "observations_N": generate_N_observations_per_template
}
policy_prompt_output = pipeline_prompt.format(**policy_input_dict)

### Template for Weather Observations

In [12]:
weather_outputs = """temperature, precipitation, wind speed, humidity, etc."""
weather_requirements = """- Should be based on real-world weather reports.
    - Suppose the time when <o> was made is during any season and any location (ie: Florida known for hurricanes, California known for wildfires, etc).
    - Include reports from all meteorologists, weather organizations, or any type of weather entity."""
weather_examples = """
   - weather examples for template 1:
      1. The street cleaner monitored the snow in Minnesota increase from 12/8/9 to 2/8/10.
      2. Jade, a farmer, caught that the rainfall in Kansas had decreased at midnight.
      3. I identified the wind speed in North Dakota picked up drastically today.
   - weather examples for template 2:
      1. On 1/1/2024, Meteorologist Lisa Park reported that the temperature at San Diego rose.
      2. On 2023 Aug 15, Dr. Mark Williams noted that the air pressure at Dallas became lower.
      3. On October 3, 2025, Chicago’s meteorological team recorded that the wind gusts in the suburbs remained stable.
   - weather examples for template 3:
      1. Anna Lee, Ph.D., observed that in May 2024, the dew point at Denver decreased.
      2. Meteorologist John Roberts noted that on August 12, 2022, the wind chill in New York increased.
      3. I recorded that on 6/9/2023, the humidity at Coral Gables stayed the same.
   - weather examples for template 4:
      1. According to Me, the heat index at Philadelphia rose in 07/2023.
      2. According to Meteorologist Jake Wilson, the rainfall levels at Portland stayed consistent on March 18, 2022.
      3. According to Dylan, the wind speed at his cabin dropped on 11/6/2025.
   - weather examples for template 5:
      1. On December 3, 2023, Meteorologist Claire Thompson noted that the cloud coverage in Buffalo increased.
      2. On 4/28/2022, I observed that the humidity at Tampa remained stable.
      3. During the month of September, the Los Angeles Weather Bureau recorded that the precipitation levels in Pasadena fell.
   - weather examples for template 6:
      1. Temperature in Las Vegas rose in July 2023, according to Meteorologist Nina Patel.
      2. Humidity in Austin declined on August 3, 2024, according to Dr. Kevin Morales.
      3. Wind speed in Key West remained stable on 10/9/2023, according to the Florida Weather Bureau.
"""
weather_input_dict = {
    "observation_domain": "weather",
    "observation_domain_output": weather_outputs,
    "domain_requirements": weather_requirements,
    "domain_examples": weather_examples,
    "observations_N": generate_N_observations_per_template
}
weather_prompt_output = pipeline_prompt.format(**weather_input_dict)

### Template for Sports Observations

In [13]:
sport_outputs = """score, touchdown, goal, points, win, lose, etc."""
sport_requirements = """- Should be based on real-world sports.
    - Suppose the time when <o> was made is during any season of sports.
    - Include reports from all sports professionals, coaches, or any type of sport entity."""
sport_examples = """
    - sport examples for template 1:
        1. Coach Lisa Martinez observed that the assist rate at the New York Knicks dropped in March 2022.
        2. Analyst Mark Johnson noted that the batting average at the Boston Red Sox remained stable in July 2023.
        3. Ryan recorded that the win percentage he had in tennis improved on 5/10/2019.
    - sport examples for template 2:
        1. On May 18, 2023, Coach Maria Lopez observed that the pass completion rate at the Denver Broncos increased.
        2. On 11/5/2021, Analyst David Kim noted that the home run count at the New York Yankees rose sharply.
        3. On August 12, 2024, Detravious recorded that the serve accuracy he had in volleyball declined.
    - sport examples for template 3:
        1. Coach Elena Ruiz observed that on 9/22/2022, the foul count at FC Barcelona increased.
        2. Analyst Marcus Lee noted that on June 10, 2021, the shot accuracy at the LA Lakers dropped.
        3. George Jr. recorded that on November 2, 2023, the save percentage he had in hockey stayed consistent.
    - sport examples for template 4:
        1. According to Coach Sarah Nguyen, the three-point percentage at the Houston Rockets decreased in February 2024.
        2. According to Analyst Trevor Simmons, the rushing yards at the Buffalo Bills increased on 12/18/2023.
        3. According to Real Madrid staff, the win ratio at Real Madrid improved in April 2022.
    - sport examples for template 5:
        1. In January 2021, Coach Miguel Torres observed that the tackle success rate at Juventus stayed stable.
        2. In March 2030, Analyst Fiona Bennett recorded that the win rate at the Los Angeles Clippers decreased slightly.
        3. In July 2023, Calvin noted that the goals per match he had in soccer increased steadily.
    - sport examples for template 6:
        1. The corner kick count at Liverpool FC surged in March 2016, according to Me.
        2. The win percentage at the San Francisco 49ers dropped slightly in October 2023, according to Analyst Priya Sharma.
        3. The shot accuracy on Arnold's basketball team remained steady in 11/2022, according to Arnold.
"""
sport_input_dict = {
    "observation_domain": "sport",
    "observation_domain_output": sport_outputs,
    "domain_requirements": sport_requirements,
    "domain_examples": sport_examples,
    "observations_N": generate_N_observations_per_template
}
sport_prompt_output = pipeline_prompt.format(**sport_input_dict)

### Template for Miscellaneous Observations

In [14]:
miscellaneous_outputs = """These outputs will take in any random output relating to any real-world situation."""
miscellaneous_requirements = """These outputs will take in any random output relating to any real-world situation.
    - Suppose the time when <o> was made is during any season or part of the year.
    - Include any type of entity."""
miscellaneous_examples = """
    - miscellaneous examples for template 1:
        1. Professor Laura White observed that attendance rates at Midtown University declined in April 2002.
        2. Chef Alberto Reyes noted that the sourness of the lemon tart at his kitchen increased in summer 2020.
        3. Gaming expert Jordan Lee recorded that the probability of drawing a queen during poker night rose in March 2023.
    - miscellaneous examples for template 2:
        1. On 4/12/2018, Professor Maria Jackson observed that class participation at Highschool rose steadily.
        2. On November 22, 2023, Chef Gabrielle Moreau recorded that the burger flavor at Burger King increased.
        3. On August 6, 2005, Oliver Cheng noted that the chance of rolling an odd number stayed stable.
    - miscellaneous examples for template 3:
        1. Dr. Sarah McDonald observed that on Dec 15, 2022, library usage at Jefferson High increased.
        2. Coach Tony Roberts noted that on 6/30/2021, the sprint times for the Track Club improved.
        3. Sarah recorded that on February 12, 2023, the odds of drawing an ace from the deck dropped.
    - miscellaneous examples for template 4:
        1. According to Coach Andre Collins, the turnover rate for the Basketball Club increased on March 3, 2019.
        2. According to Chef Emily Gonzalez, the creaminess of the cheesecake at The Velvet Crumb dropped in late Winter 2020.
        3. According to game analyst Tom Spencer, the frequency of triple rolls in Settlers of Catan increased in 2023.
    - miscellaneous examples for template 5:
        1. In Summer 2021, Professor Kim Tansley observed that dropout rates at Lakefield College remained consistent.
        2. In 3/2017, Chef Michael Harris reported that customer satisfaction at Chipotle improved slightly.
        3. In May of 2004, I recorded that the randomness of coin flips during the simulation remained unchanged.
    - miscellaneous examples for template 6:
        1. Lecture attendance improved in Spring 2023, according to Professor B.T.
        2. The texture of the sourdough crust changed in September 2022, according to Chef Veronica Miller.
        3. FD's probability of rolling a critical hit remained stable on 2/17/2024, according to FD.
"""
miscellaneous_input_dict = {
    "observation_domain": "miscellaneous",
    "observation_domain_output": miscellaneous_outputs,
    "domain_requirements": miscellaneous_requirements,
    "domain_examples": miscellaneous_examples,
    "observations_N": generate_N_observations_per_template
}
miscellaneous_prompt_output = pipeline_prompt.format(**miscellaneous_input_dict)

## Generate Observations

### Text Generation Models

In [15]:
tgmf = TextGenerationModelFactory()

# Groq Cloud (https://console.groq.com/docs/overview)
gemma_29b_generation_model = tgmf.create_instance('gemma2-9b-it') 
llama_318b_instant_generation_model = tgmf.create_instance('llama-3.1-8b-instant') 
llama_3370b_versatile_generation_model = tgmf.create_instance('llama-3.3-70b-versatile')  
# llama_guard_4_12b_generation_model = tgmf.create_instance('meta-llama/llama-guard-4-12b')  

text_generation_models_groqcloud = [gemma_29b_generation_model, llama_318b_instant_generation_model]

# NaviGator (https://api.ai.it.ufl.edu/ui/)
# llama_3170b_generation_model = tgmf.create_instance('llama-3.1-70b-instruct')  
# llama_3370b_generation_model = tgmf.create_instance('llama-3.3-70b-instruct')  
# mixtral_87b_instruct_generation_model = tgmf.create_instance('mixtral-8x7b-instruct') 
# llama_318b_generation_model = tgmf.create_instance('llama-3.1-8b-instruct')  
# mistral_7b_generation_model = tgmf.create_instance('mistral-7b-instruct')  
# mistral_small_31_generation_model = tgmf.create_instance('mistral-small-3.1')

# text_generation_models_navigator = [llama_3170b_generation_model, llama_3370b_generation_model, 
#                                     mixtral_87b_instruct_generation_model, llama_318b_generation_model,
#                                     mistral_7b_generation_model, mistral_small_31_generation_model
#                                     ]

In [18]:
N_batches = 1
# observation_domains = ["finance", "health", "policy", "weather", "sport", "miscellaneous"]
observation_domains = ["sport"]

# observation_prompt_outputs = {
#     "finance": financial_prompt_output,
#     "health": health_prompt_output,
#     "policy": policy_prompt_output,
#     "weather": weather_prompt_output,
#     "sport": sport_prompt_output,
#     "miscellaneous": miscellaneous_prompt_output,
# }
observation_prompt_outputs = {
    "sport": sport_prompt_output
}
prediction_label = 0
save_log = "data"
batched_observations_df = tgmf.batch_generate_data(N_batches=N_batches, 
                                text_generation_models=text_generation_models_groqcloud, 
                                domains=observation_domains,
                                prompt_outputs=observation_prompt_outputs,
                                sentence_label=prediction_label,
                                save_path=save_log)

  0%|          | 0/1 [00:00<?, ?it/s]

sport --- gemma2-9b-it --- GROQ_CLOUD
generates:
T1:  Analyst Emily Chen observed that the turnover rate at the Miami Heat remained stable in September 2023. 

T2: On 08/21/2024, Coach David Rodriguez monitored the  free throw percentage at the Chicago Bulls changed. 

T3:  Analyst Kevin Lee noted on 10/15/2022, the  field goal percentage at the  Dallas Cowboys fell.

T4: According to Coach Sophia Jackson, the points per game at the  Toronto Raptors rose in June 2025. 

T5: In 03/2026, Analyst Michael Garcia envisioned that the  tackles per game at the  Los Angeles Rams decreased. 

T6: The  assists per game at the  Golden State Warriors increased in December 2024, according to Coach Jessica Lee. 




sport --- llama-3.1-8b-instant --- GROQ_CLOUD


100%|██████████| 1/1 [00:01<00:00,  1.47s/it]

generates:
T1: Coach Thompson observed that the free throw percentage at the Chicago Bulls dropped in February 2024.
T2: On August 25, 2022, Analyst Patel noted that the home run count at the Los Angeles Dodgers rose sharply.
T3: George observed that on 10/15/2023, the turnover rate at the Miami Heat stayed consistent.
T4: According to Manager Lee, the assist ratio at the Toronto Raptors improved in November 2021.
T5: In April 2023, Coach Chen envisioned that the field goal percentage at the Golden State Warriors decreased slightly.
T6: The save percentage at the Boston Bruins surged in December 2020, according to Analyst Brooks.

Start logging batch
log_directory: /Users/detraviousjamaribrinkley/Documents/Development/research_labs/uf_ds/predictions/data/observation_logs
Save CSV: /Users/detraviousjamaribrinkley/Documents/Development/research_labs/uf_ds/predictions/data/observation_logs/batch_18-observation/batch_18-from_df.csv

CSV to Log
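
The generated text above follows the requested `T1:`, `T2:`, ... prefix format, so it can be split into (template number, sentence) pairs with a regular expression. The sketch below is an independent illustration: `parse_observations` and `uses_future_wording` are hypothetical helpers, not the parsing that `batch_generate_data` performs (its internals are not shown in this notebook). The wording check covers only the words listed in the requirements (will, would, going to, should).

```python
import re

# "T<number>:" followed by the observation sentence.
LINE_PATTERN = re.compile(r"^T(\d+):\s*(.+?)\s*$")
# Words the requirements forbid because they signal future tense.
FUTURE_PATTERN = re.compile(r"\b(will|would|should|going to)\b", re.IGNORECASE)


def parse_observations(raw_text):
    """Return (template_number, sentence) pairs for every 'Tn: ...' line."""
    parsed = []
    for line in raw_text.splitlines():
        match = LINE_PATTERN.match(line.strip())
        if match:
            parsed.append((int(match.group(1)), match.group(2)))
    return parsed


def uses_future_wording(sentence):
    """True if the sentence contains one of the forbidden future-tense words."""
    return bool(FUTURE_PATTERN.search(sentence))


# Two lines copied from the gemma2-9b-it output above, including blank lines.
sample = """T1:  Analyst Emily Chen observed that the turnover rate at the Miami Heat remained stable in September 2023.

T2: On 08/21/2024, Coach David Rodriguez monitored the  free throw percentage at the Chicago Bulls changed.
"""

parsed = parse_observations(sample)
for number, sentence in parsed:
    print(number, uses_future_wording(sentence), sentence)
```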





In [None]:
pd.set_option('display.max_colwidth', 800)
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)
batched_observations_df