## Simple Predetermined

In [None]:
import pandas as pd
import random
from openai import OpenAI

df = pd.read_csv(r"D:\LLMTables\LLMTablesQA\Question Generation\TestTables_5\sportset_coldtemp_30_13.csv")
client = OpenAI()
row = df.sample(1, random_state=random.randint(1, 1000), ignore_index=True)
def generate_qa_pairs(row, num_pairs=1):
    
    # Temporal
    prompt = f"""
    Objective:
    Create complex, logic-driven questions based on a dataset row that represents a real-world scenario. Each question should require multi-step reasoning, utilizing both structured and unstructured elements of the row. The questions should not directly reveal answers, instead encouraging logical deductions and practical reasoning.
    Row Data: {row.to_dict(orient='records')[0]}
    Guidelines for Question Generation:
    1. Inter-Cell Logic: Questions must connect multiple data points within the row, encouraging indirect reasoning between structured and unstructured components.
    2. Structured + Unstructured: Utilize both types of data (e.g., scores, descriptions) when formulating questions.
    3. Avoid Direct Retrieval: Do not allow answers to be directly extracted from the row; ensure questions require reasoning.
    4. Multi-hop Reasoning: Answers should require multiple logical operations or connections to arrive at.
    5. Unique, Deterministic Answer: Ensure only one valid answer can be derived from the data, based on logical connections.
    6. Simplified Language: Keep the language of questions concise and clear, avoiding unnecessary complexity.
    7. No Word Overlap: Avoid using the same words or phrases from the row in the questions to maintain a level of difficulty.
    8. Single Subtle Hint: Include one subtle clue to guide the reasoning, but ensure further deductions are needed for the correct answer.
    9. Semantic Diversity: Aim for semantic diversity in the questions, avoiding repetition of themes or concepts.
    
    Example Questions + why they are good questions:
    1. In which month did the NY Knicks team play their games, considering it was during the final quarter of the year?
    Reason: There are just enough clues to answer the question. You can identify the team as the New York Knicks and find their games at the end of the year.
    2. Where outside the USA has the Boston team played?
    Reason: "Outside USA" is a good marker for indirect clues which can confuse the person answering the question.
    3. In which city did a team achieve a victory with a margin of 17 points on the last Tuesday of October?
    Reason: There are two markers, point margin and day, which help the model filter the information without giving too much away.
    4. Where was a game held that featured a standout performance of 41 points and nearly full attendance?

    Take inspiration from these questions, but do not make a question so vague that multiple answers exist in the table for it.
    Please note that the question should be an independent entity in terms of understanding it. This means it should not require the row on which it was created to understand the question itself. 
    Generate {num_pairs} such questions. Ensure that the answer is given in maximum 3-5 words.

    Answer format:
    Q: <Generated Question>
    A: <Generated Answer>
    """

    # Define your messages
    messages = [
        {"role": "system", "content": "You are an expert in generating complex questions from tabular data. Your task is to create questions that require analyzing inter-cell relationships within the provided table and can be answered directly or indirectly using the table's content. The questions should leverage both structured and unstructured components of the table data."},
        {"role": "user", "content": prompt}
    ]

    # Make the API call
    completion = client.chat.completions.create(
        model="gpt-4o-mini",  # or the model you are using
        messages=messages,
        temperature=0.5
    )

    # Print the response
    print(completion.choices[0].message.content)

print(row)

generate_qa_pairs(row, 3)

   day     month  year   dayname  season              stadium       city  \
0   10  December  2016  Saturday    2016  Quicken Loans Arena  Cleveland   

  state  attendance  capacity  game_id  \
0  Ohio           0     19400     2596   

                                             summary  
0  LeBron James led the way for the Cavs as he we...  
Q1: What is the name of the venue where a game was played on a day with zero attendees, despite a player scoring 44 points and achieving their 73rd 40-point game?
A1: Quicken Loans Arena

Q2: In which city did a team break another team's three-game winning streak, with a player contributing 10 assists and nearly achieving a triple-double?
A2: Cleveland

Q3: Which state hosted a game where the home team maintained their strong home record and achieved their fourth consecutive win during the winter season?
A3: Ohio
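
Because the cell above draws `random_state` from `random.randint(1, 1000)` without recording it, the sampled row cannot be reproduced later. A minimal sketch of one way to keep the draw repeatable, using only the standard library: the helper name `pick_row_index` and its parameters are hypothetical and not part of the notebook.

```python
import random


def pick_row_index(n_rows, seed=None):
    """Pick one row position and return it together with the seed used.

    If no seed is given, one is drawn from the same 1..1000 range the
    notebook uses, so it can be logged and reused to get the same row.
    """
    if seed is None:
        seed = random.randint(1, 1000)
    rng = random.Random(seed)  # private generator, leaves global state alone
    return seed, rng.randrange(n_rows)


# Example with a 30-row table (the row count here is an assumption):
seed, idx = pick_row_index(30)
print(f"seed={seed}, row index={idx}")
# With pandas, the same position could then be selected as df.iloc[[idx]].
```

Storing the seed next to each generated question would make it possible to regenerate the exact row that a question came from.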


- World Knowledge reference
    Q: Where did a team secure a victory in a winter season game with a 9 point score difference?
    Q: Where outside the USA has the Boston team played?

- Team relationships
    Q: Which rookie was moved to the bench in favor of Louis Williams in the backcourt in an overfilled stadium?
    Q: During which game of Kobe Bryant’s farewell tour did he score 21 points?
    Q: Who filled in for Jonas Valanciunas due to injury and scored a season-high 15 points?

- General info
    Q: In which stadium did attendance exceed capacity during a December game?
    Q: Which team narrowly defeated the Lakers to break a two-game losing streak?

### Unstructured Predetermined

In [40]:
df = pd.read_csv(r"D:\LLMTables\LLMTablesQA\Question Generation\TestTables_5\sportset_coldtemp_30_13.csv")
client = OpenAI()
# row = """
# day,month,year,dayname,season,stadium,city,state,attendance,capacity,game_id,summary
# 7,December,2015,Monday,2015,Air Canada Centre,Toronto,Ontario,20200,19800,2100,"The Raptors ( 13 - 9 ) escaped the Lakers ( 3 - 18 ) with a 102 - 93 victory at Air Canada Centre on Monday . Los Angeles trailed by just a point after the third quarter , but Toronto was able to pull away with five minutes remaining in the game to end a two - game losing streak . The Raptors got plenty of contributions in the fourth quarter , but the guy who carried them all game was Terrence Ross , who scored a season - high 22 points ( 8 - 12 FG , 4 - 6 3Pt , 2 - 2 FT ) over 39 minutes while also grabbing six rebounds . Kobe Bryant had a decent night shooting the ball as he continues his farewell tour , scoring 21 points on 8 - of - 16 shooting with eight boards and four assists . Larry Nance Jr. replaced Julius Randle in the starting lineup , but failed to outperform him . Randle scored 15 points ( 6 - 13 FG , 3 - 3 FT ) with 11 rebounds for his fifth double - double in six games . The other youngster moved to the bench was rookie D'Angelo Russell , who was replaced by Louis Williams alongside Jordan Clarkson in the backcourt . Russell played 21 minutes off the bench , scoring nine points on 4 - of - 12 shooting from the field . Nick Young was held out of the lineup again by coach Byron Scott . After suffering two close losses to the Nuggets and Warriors - - each at home - - the Raptors got back on track , albeit barely , against the Lakers . With center Jonas Valanciunas sidelined with a fractured hand , Bismack Biyombo has seen a bump in playing time , and finally made it count Monday . Biyombo finished with a season - high 15 points ( 4 - 8 FG , 7 - 11 FT ) with 13 rebounds and two blocks over 29 minutes . Kyle Lowry was his usual self , scoring a game - high 27 points 9 - 19 FG , 5 - 11 3Pt , 4 - 4 FT ) with seven rebounds and six assists . 
# The Raptors will have their hands full Wednesday when the Spurs come to town ; the Lakers play the sixth game of an eight - game road trip against the T-Wolves ."
# """
row = df.sample(1, random_state=random.randint(1, 1000), ignore_index=True)
def generate_qa_pairs(row, num_pairs=1):
    prompt = f"""
    Row Data: {row.to_dict(orient='records')[0]}

    Objective:
    Create simple, generalizable questions that can be answered uniquely from the given row when evaluated against the full table. The questions should:
    - Be based on a unique property or a combination of properties from the row. Make sure that only one answer exists in the table for the question
    - Avoid excessive specificity or anchoring to exact values, instead using general descriptions or time references (e.g., "final Friday of November").
    - Be concise and intuitive, with a natural language structure.
    - Ensure the question requires reasoning or cross-referencing across columns for a unique answer.
    - Make sure that the answer is not too vague and is grounded to the row it is being generated by in some way.

    Question Generation Guidelines:
    1. Inter-Cell Logic: Questions must connect multiple data points within the row, encouraging indirect reasoning between structured and unstructured components.
    2. Structured + Unstructured: Utilize both types of data (e.g., scores, descriptions) when formulating questions.
    3. Avoid Direct Retrieval: Do not allow answers to be directly extracted from the row; ensure questions require reasoning.
    4. Multi-hop Reasoning: Answers should require multiple logical operations or connections to arrive at.
    5. Unique, Deterministic Answer: Ensure only one valid answer can be derived from the data, based on logical connections.
    6. Simplified Language: Keep the language of questions concise and clear, avoiding unnecessary complexity.
    7. No Word Overlap: Avoid using the same words or phrases from the row in the questions to maintain a level of difficulty.
    8. Single Subtle Hint: Include one subtle clue to guide the reasoning, but ensure further deductions are needed for the correct answer.
    9. Semantic Diversity: Aim for semantic diversity in the questions, avoiding repetition of themes or concepts.
 

    Examples:
    Q: Which team was preparing to host the Spurs following a close win over the Lakers?
    Q: Which team was preparing to host the Rockets after rebounding from a loss to Brooklyn with a win over the Knicks?
    Q: Which team outscored their opponent by 24 points in the first half at Madison Square Garden?
    Q: Which team was set to face the Pelicans in New Orleans following a loss to the Cavaliers?
    Q: Which team was heading into the All-Star break on a four-game winning streak, with a game against the Pistons up next?

    **Generate {num_pairs} questions in the following format**:
    Q: <Generated Question>
    A: <Generated Answer>
    """
    
    # Define your messages
    messages = [
        {"role": "system", "content": "You are an expert in creating evaluation questions from tabular data. Your task is to design generalized, simple questions using unique row properties."},
        {"role": "user", "content": prompt}
    ]

    # Make the API call
    completion = client.chat.completions.create(
        model="gpt-4o",  # or the model you are using
        messages=messages,
        temperature=0.3  # Slightly above 0 to allow some creative variation
    )

    # Print the response
    print(completion.choices[0].message.content)

print(row)

generate_qa_pairs(row, 10)


   day    month  year  dayname  season           stadium     city    state  \
0    1  January  2019  Tuesday    2018  Scotiabank Arena  Toronto  Ontario   

   attendance  capacity  game_id  \
0       19800     19800     5800   

                                             summary  
0  The Toronto Raptors defeated the Utah Jazz , 1...  
Q1: Which team managed to secure a victory at a venue in Ontario after a strong third-quarter performance?
A1: Toronto Raptors

Q2: Which team overcame a halftime deficit to win a game where a player scored 45 points?
A2: Toronto Raptors

Q3: Which team played a game on the first day of a month, where their bench player contributed 14 points?
A3: Toronto Raptors

Q4: Which team played a game in a Canadian city and had a player scoring 28 points with 10 rebounds?
A4: Toronto Raptors

Q5: Which team won a game despite losing the final quarter by four points?
A5: Toronto Raptors

Q6: Which team had a player who scored 45 points in a game held on a Tuesday

In [42]:
# Prompt taken from the end-to-end QA setup in the Chain-of-Table paper
answer_prompt = f"""
Here is the table to answer this question. Answer the question as an entity. Return an explanation on how you reached the answer.
{string}
Question: What is the name of the venue where a game was played on a day with zero attendees, despite a player scoring 44 points and achieving their 73rd 40-point game?
The answer is: 
"""

# Define your messages
messages = [
    {"role": "system", "content": "You are an expert in answering questions from tabular data."},
    {"role": "user", "content": answer_prompt}
]

# Make the API call
completion = client.chat.completions.create(
    model="gpt-4o-mini",  # or the model you are using
    messages=messages,
    temperature = 0.0
)

# Print the response
print(completion.choices[0].message.content)

The answer is: Quicken Loans Arena

**Explanation:**
To arrive at this answer, I searched through the provided table for a game where the attendance was recorded as zero. I found that on December 10, 2016, at Quicken Loans Arena, LeBron James scored 44 points, marking the 73rd 40-point game of his career. The table explicitly states that the attendance for this game was zero, which matches the criteria of the question.


In [22]:
import pandas as pd
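# Raw string so the backslashes in the Windows path are not treated as escape sequences
df = pd.read_csv(r"D:\LLMTables\LLMTablesQA\Question Generation\TestTables_5\sportset_coldtemp_30_13.csv")

# Build the pipe-delimited table string that the answering prompts interpolate as {string}
string = '/*\n'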
string += 'col : ' + ' | '.join(df.columns) + '\n'
for row_id, row in df.iterrows():
    string += f'row {row_id} : '
    for column_id, header in enumerate(df.columns):
        string += str(row[header])
        if column_id != len(df.columns) - 1:
            string += ' | '
    string += '\n'
string += '*/\n'
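
The loop above can also be written with `str.join`, which avoids the manual check for the last column. This is a sketch under the assumption that the table is available as a list of column names plus an iterable of row tuples; the function name `to_pipe_format` is hypothetical, and the row ids are simple positions, which matches the loop above only when the DataFrame has its default integer index.

```python
def to_pipe_format(columns, rows):
    """Serialize a table into the '/* col : ... row i : ... */' layout."""
    lines = ["/*", "col : " + " | ".join(str(c) for c in columns)]
    for row_id, row in enumerate(rows):
        # One line per row, cells separated by ' | '
        lines.append(f"row {row_id} : " + " | ".join(str(v) for v in row))
    lines.append("*/")
    return "\n".join(lines) + "\n"


table_text = to_pipe_format(
    ["city", "state"],
    [("Cleveland", "Ohio"), ("Toronto", "Ontario")],
)
print(table_text)
```

With pandas, the same helper could be called as `to_pipe_format(df.columns, df.itertuples(index=False, name=None))`.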

In [10]:
# Prompt taken from the end-to-end QA setup in the Chain-of-Table paper
answer_prompt = f"""
Here is the table to answer this question. Answer the question as an entity. Return an explanation on how you reached the answer.
{string}
Question: In which city did a team with a home advantage struggle to maintain possession, resulting in numerous turnovers?
The answer is: 
"""

# Define your messages
messages = [
    {"role": "system", "content": "You are an expert in answering questions from tabular data."},
    {"role": "user", "content": answer_prompt}
]

# Make the API call
completion = client.chat.completions.create(
    model="gpt-4o-mini",  # or the model you are using
    messages=messages,
    temperature = 0.0
)

# Print the response
print(completion.choices[0].message.content)

RateLimitError: Error code: 429 - {'error': {'message': 'Rate limit reached for gpt-4o-mini in organization org-fRla5InIDQaDKnPLyFBX0VHH on requests per day (RPD): Limit 200, Used 200, Requested 1. Please try again in 7m12s. Visit https://platform.openai.com/account/rate-limits to learn more.', 'type': 'requests', 'param': None, 'code': 'rate_limit_exceeded'}}
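
A rate-limit error like the one above is commonly handled by retrying with exponential backoff. The sketch below is a generic wrapper, not part of the notebook: `call_with_backoff` and all of its parameters are hypothetical names. With the v1 `openai` client, `retry_on=(openai.RateLimitError,)` would be the natural choice. Note that the limit logged here is a requests-per-day quota with a suggested wait of about seven minutes, so delays of a few seconds may not be enough to clear it.

```python
import time


def call_with_backoff(fn, retry_on=(Exception,), max_attempts=5,
                      base_delay=1.0, sleep=time.sleep):
    """Call fn(), retrying on the given exceptions with doubling delays."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except retry_on:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the original error
            sleep(base_delay * 2 ** attempt)  # 1s, 2s, 4s, ...


# Demonstration with a stand-in function that fails twice, then succeeds.
calls = {"n": 0}

def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("simulated rate limit")
    return "ok"

delays = []
result = call_with_backoff(flaky, retry_on=(RuntimeError,), sleep=delays.append)
print(result, delays)
```

Passing `sleep` as a parameter keeps the demonstration instant; in real use the default `time.sleep` would apply the delays.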

### Temporal Predetermined

#### Single Row for experimentation

In [114]:
import json  # needed for json.loads below

df = pd.read_csv(r"D:\LLMTables\LLMTablesQA\Question Generation\TestTables_5\sportset_coldtemp_30_13.csv")
client = OpenAI()
# row = """
# day,month,year,dayname,season,stadium,city,state,attendance,capacity,game_id,summary
# 7,December,2015,Monday,2015,Air Canada Centre,Toronto,Ontario,20200,19800,2100,"The Raptors ( 13 - 9 ) escaped the Lakers ( 3 - 18 ) with a 102 - 93 victory at Air Canada Centre on Monday . Los Angeles trailed by just a point after the third quarter , but Toronto was able to pull away with five minutes remaining in the game to end a two - game losing streak . The Raptors got plenty of contributions in the fourth quarter , but the guy who carried them all game was Terrence Ross , who scored a season - high 22 points ( 8 - 12 FG , 4 - 6 3Pt , 2 - 2 FT ) over 39 minutes while also grabbing six rebounds . Kobe Bryant had a decent night shooting the ball as he continues his farewell tour , scoring 21 points on 8 - of - 16 shooting with eight boards and four assists . Larry Nance Jr. replaced Julius Randle in the starting lineup , but failed to outperform him . Randle scored 15 points ( 6 - 13 FG , 3 - 3 FT ) with 11 rebounds for his fifth double - double in six games . The other youngster moved to the bench was rookie D'Angelo Russell , who was replaced by Louis Williams alongside Jordan Clarkson in the backcourt . Russell played 21 minutes off the bench , scoring nine points on 4 - of - 12 shooting from the field . Nick Young was held out of the lineup again by coach Byron Scott . After suffering two close losses to the Nuggets and Warriors - - each at home - - the Raptors got back on track , albeit barely , against the Lakers . With center Jonas Valanciunas sidelined with a fractured hand , Bismack Biyombo has seen a bump in playing time , and finally made it count Monday . Biyombo finished with a season - high 15 points ( 4 - 8 FG , 7 - 11 FT ) with 13 rebounds and two blocks over 29 minutes . Kyle Lowry was his usual self , scoring a game - high 27 points 9 - 19 FG , 5 - 11 3Pt , 4 - 4 FT ) with seven rebounds and six assists . 
# The Raptors will have their hands full Wednesday when the Spurs come to town ; the Lakers play the sixth game of an eight - game road trip against the T-Wolves ."
# """
row = df.sample(1, random_state=random.randint(1, 1000), ignore_index=True)
def generate_qa_pairs(row, num_pairs=1):
    prompt = f"""
    Row Data: {row.to_dict(orient='records')[0]}

    Objective:
        Create complex, logic-driven, temporal questions based on a dataset row that represents a real-world scenario. Each question should require multi-step reasoning, utilizing both structured and unstructured elements of the row. The questions should not directly reveal answers, instead encouraging logical deductions and practical reasoning.
        Row Data: {row.to_dict(orient='records')[0]}
        Guidelines for Question Generation:
        - Self-Contained: The question should make sense on its own without needing to refer to data points from the table.
        - Use Both Structured and Unstructured Data: Include both numbers (scores, stats) and descriptions in the question.
        - Require Reasoning: The answer should not be directly available from the table; it should require logical thinking.
        - Multi-Step Reasoning: Answers should involve multiple logical connections or steps to arrive at.
        - Unique Answer: There should be only one valid answer based on the data.
        - Clear and Simple Language: Keep the questions straightforward and easy to understand.
        - No Word Overlap: Avoid repeating exact phrases from the row in the question.
        - Subtle Hint: Include one subtle clue, but further reasoning should still be needed for the correct answer.
        - Time-Based: The question should involve elements of time, comparing events or situations at different times.
        - Focus on Season Progression: Questions should relate to different stages of the season (early, mid, late) or the timing of events.    
    
    Example Questions:
        1. In which time zone did the 76ers play at the end of November?
        Reason: The question gives away very little (only the team name and a rough time period) and encourages looking into the time zone of the game.
        2. In which season did the Minnesota Timberwolves win a home game?
        Reason: With minimal clues such as the team name and a home game, it refers to the NBA season and encourages the use of inherent knowledge.
        3. How many days' rest do the Cleveland Cavaliers have before their next game?
        Reason: The summary mentions when the Cleveland Cavaliers play next, so this encourages simple numerical reasoning over time elements.
        4. Does the game between the Knicks and the 76ers take place in the regular NBA season?
        Reason: By looking at the row containing the game between the two teams, you have to tell whether it falls in the regular NBA season or the off-season.
        5. What was the Wizards' win-loss record after their last game of the 2015 season?
        Reason: This ensures that the model goes through the various Wizards games, filtering for the 2015 season, to count the wins and losses.
              
      Please make sure that the questions do not rely on the table for understanding. Meaning they should make sense in isolation as well. Generate {num_pairs} such questions. Ensure that the answer is given in maximum 3-5 words.
        MAKE SURE THE QUESTIONS ARE INDIRECT AND COMPLEX AND UTILIZE VARIOUS ASPECTS OF THE TABLE.
        Please give answer in JSON format:
        Q: <Generated Question>
        A: <Generated Answer>
    """
    
    # Define your messages
    messages = [
        {"role": "system", "content": "You are an expert in creating evaluation questions from tabular data. Your task is to design generalized, simple questions using unique row properties. Please give valid JSON output."},
        {"role": "user", "content": prompt}
    ]

    chat_completion, *_ = client.chat.completions.create(
            model="gpt-4o", 
            messages=messages,
            response_format={"type": "json_object"},
            temperature=0.6
    ).choices
    content = chat_completion.message.content
    reply = json.loads(content)
    return reply

print(row)

generate_qa_pairs(row, 5)


   day     month  year dayname  season          stadium      city     state  \
0    5  December  2014  Friday    2014  Barclays Center  Brooklyn  New York   

   attendance  capacity  game_id  \
0       16100     17700      665   

                                             summary  
0  The Atlanta Hawks ( 12 - 6 ) defeated the Broo...  


{'questions': [{'Q': 'How many games in total did the Atlanta Hawks win in the regular season before facing the Brooklyn Nets on December 5th, 2014?',
   'A': '12'},
  {'Q': 'Considering their previous game against the San Antonio Spurs, how many consecutive games have the Brooklyn Nets failed to achieve a .500 win ratio by December 5th, 2014?',
   'A': '22 days'},
  {'Q': 'What is the percentage difference in the free throw shooting efficiency between the Atlanta Hawks and the Brooklyn Nets during their game on December 5th, 2014?',
   'A': '0%'},
  {'Q': 'How many days before playing the Nuggets did the Atlanta Hawks extend their winning streak to five games in December 2014?',
   'A': '2 days'},
  {'Q': 'On what day of the week did the Brooklyn Nets fail to improve their win-loss record to .500 for the first time since November 13th, 2014?',
   'A': 'Friday'}]}
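
The prompt above asks for JSON but does not pin down its shape, so the `{'questions': [{'Q': ..., 'A': ...}]}` layout seen in this output is not guaranteed on every call. A minimal sketch of a defensive parser, assuming that layout: the function name `parse_json_questions` and the output keys `question` and `answer` are hypothetical.

```python
import json


def parse_json_questions(content):
    """Extract question/answer pairs from a JSON reply, skipping bad items."""
    data = json.loads(content)
    # Assumption: pairs live under a top-level "questions" list.
    items = data.get("questions", []) if isinstance(data, dict) else []
    pairs = []
    for item in items:
        if isinstance(item, dict) and "Q" in item and "A" in item:
            pairs.append({
                "question": str(item["Q"]).strip(),
                "answer": str(item["A"]).strip(),
            })
    return pairs


reply = '{"questions": [{"Q": "How many?", "A": "12"}, {"Q": "no answer"}]}'
print(parse_json_questions(reply))
```

Items that lack either key are dropped rather than raising, so one malformed entry does not discard the whole reply.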

##### Answering Check

In [108]:
# Prompt taken from the end-to-end QA setup in the Chain-of-Table paper
answer_prompt = f"""
Here is the table to answer this question. Answer the question as an entity. 
{string}
Question: When did the Brooklyn Nets last play before facing the Philadelphia 76ers in the playoffs?
The answer is: 
"""

# Define your messages
messages = [
    {"role": "system", "content": "You are an expert in answering questions from tabular data."},
    {"role": "user", "content": answer_prompt}
]

# Make the API call
completion = client.chat.completions.create(
    model="gpt-4o-mini",  # or the model you are using
    messages=messages,
    temperature = 0.0
)

# Print the response
print(completion.choices[0].message.content)

April 3, 2019


### Temporal end-to-end generation pipeline

In [109]:
import pandas as pd
import random
import os
import json
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def read_csv_files(folder_path):
    csv_files = []
    for filename in os.listdir(folder_path):
        if filename.endswith(".csv"):
            file_path = os.path.join(folder_path, filename)
            csv_files.append(file_path)
    return csv_files

def sample_row(df):
    return df.sample(1, random_state=random.randint(1, 1000), ignore_index=True)

def generate_qa_pairs(row, num_pairs=1):
    prompt = f"""
    Objective:
        Create complex, logic-driven, temporal questions based on a dataset row that represents a real-world scenario. Each question should require multi-step reasoning, utilizing both structured and unstructured elements of the row. The questions should not directly reveal answers, instead encouraging logical deductions and practical reasoning.
        Row Data: {row.to_dict(orient='records')[0]}
        Guidelines for Question Generation:
        1. Inter-Cell Logic: Questions must connect multiple data points within the row, encouraging indirect reasoning between structured and unstructured components.
        2. Structured + Unstructured: Utilize both types of data (e.g., scores, descriptions) when formulating questions.
        3. Avoid Direct Retrieval: Do not allow answers to be directly extracted from the row; ensure questions require reasoning.
        4. Multi-hop Reasoning: Answers should require multiple logical operations or connections to arrive at.
        5. Unique, Deterministic Answer: Ensure only one valid answer can be derived from the data, based on logical connections.
        6. Simplified Language: Keep the language of questions concise and clear, avoiding unnecessary complexity.
        7. No Word Overlap: Avoid using the same words or phrases from the row in the questions to maintain a level of difficulty.
        8. Single Subtle Hint: Include one subtle clue to guide the reasoning, but ensure further deductions are needed for the correct answer.
        9. Semantic Diversity: Aim for semantic diversity in the questions, avoiding repetition of themes or concepts.
        10. Time-Based Nature: The question should be temporal in nature; that is, it should use various elements of time to create a logically difficult question.
        11. Encourage Comparative Time Reasoning: Emphasize questions that compare events or details across different time points within the row. This could involve past events, future scenarios, or comparisons within the same game.
        12. Focus on Season Progression and Event Timing: Emphasize questions that relate to the progression within the season, such as “early,” “mid,” or “late” season positioning.
    
    Example Questions:
        1. In which time zone did the 76ers play at the end of November?
        Reason: The question gives away very little (only the team name and a rough time period) and encourages looking into the time zone of the game.
        2. In which season did the Minnesota Timberwolves win a home game?
        Reason: With minimal clues such as the team name and a home game, it refers to the NBA season and encourages the use of inherent knowledge.
        3. How many days' rest do the Cleveland Cavaliers have before their next game?
        Reason: The summary mentions when the Cleveland Cavaliers play next, so this encourages simple numerical reasoning over time elements.
        4. Does the game between the Knicks and the 76ers take place in the regular NBA season?
        Reason: By looking at the row containing the game between the two teams, you have to tell whether it falls in the regular NBA season or the off-season.
        5. Which team faced a significant disadvantage due to a key player's injury while playing in February?
        Reason: Combines a key information in summary with the month in which the game was played
        6. Which team won against a division leader with a notable performance in December?
        7. What was the result of a matchup where one team scored 99 points and the event took place on a January weekday in California?
       
        Generate {num_pairs} such questions. Ensure that the answer is given in maximum 3-5 words.

        Answer format:
        Q: <Generated Question>
        A: <Generated Answer>
    """
    
    messages = [
        {"role": "system", "content": "You are an expert in generating complex questions from tabular data. Your task is to create questions that require analyzing inter-cell relationships within the provided table and can be answered directly or indirectly using the table's content. The questions should leverage both structured and unstructured components of the table data."},
        {"role": "user", "content": prompt}
    ]

    completion = client.chat.completions.create(
        model="gpt-4o-mini", 
        messages=messages,
        temperature=0.5
    )
    return completion.choices[0].message.content  

def process_csv_file(file_path, table_id, num_questions=5):
    df = pd.read_csv(file_path)
    qa_list = []
    for i in range(num_questions):
        row = sample_row(df) 
        qa_output = generate_qa_pairs(row, num_pairs=1) 
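        # Drop blank lines so the first two entries are the "Q: ..." and "A: ..." lines
        lines = [line.strip() for line in qa_output.splitlines() if line.strip()]
        question, answer = lines[0], lines[1]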
        question_id = f"{table_id}_{i + 1}"
        qa_entry = {
            "question_id": question_id,
            "question": question,
            "answer": answer,
            "row": row.to_dict(orient='records')[0], 
            "table_id": table_id,
            "table_absolute_path": file_path,
            "question_type": "Predetermined"
        }
        qa_list.append(qa_entry)
    return qa_list

def process_all_csvs(folder_path, num_questions=5):
    output_data = {"results": []} 
    csv_files = read_csv_files(folder_path)
    
    for table_id, file_path in enumerate(csv_files):
        qa_entries = process_csv_file(file_path, table_id, num_questions)
        output_data["results"].extend(qa_entries)
    
    return output_data

def save_to_json(output_data, output_json_path):
    with open(output_json_path, "w") as json_file:
        json.dump(output_data, json_file, indent=4)
    print(f"Output saved to {output_json_path}")

def main_pipeline(folder_path, output_json_path, num_questions=5):
    output_data = process_all_csvs(folder_path, num_questions)
    save_to_json(output_data, output_json_path)

if __name__ == "__main__":
    folder_path = r"D:\LLMTables\LLMTablesQA\Question Generation\TestTables_5"  
    output_json_path = r"D:\LLMTables\LLMTablesQA\Question Generation\Predetermined-SingleRow\temporal_output_new.json" 
    num_questions_per_csv = 5
    main_pipeline(folder_path, output_json_path, num_questions=num_questions_per_csv)

Output saved to D:\LLMTables\LLMTablesQA\Question Generation\Predetermined-SingleRow\temporal_output_new.json
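
The pipeline above takes the first two lines of the model reply as the question and the answer, which depends on the model following the `Q:` / `A:` format exactly. A regex-based parser is one possible alternative; this is a sketch only, with `parse_qa_pairs` and `QA_PATTERN` as hypothetical names. It accepts both `Q:` and numbered `Q1:` prefixes, tolerates blank lines between pairs, and strips the prefixes from the stored text.

```python
import re

# A "Q:" line followed by an "A:" line; the optional digits cover "Q1:" / "A1:".
QA_PATTERN = re.compile(
    r"^\s*Q\d*\s*:\s*(?P<q>.+?)\s*\n\s*A\d*\s*:\s*(?P<a>.+?)\s*$",
    re.MULTILINE,
)


def parse_qa_pairs(text):
    """Return a list of (question, answer) tuples found in the model output."""
    return [(m.group("q"), m.group("a")) for m in QA_PATTERN.finditer(text)]


sample = (
    "Q1: Where was the game played?\n"
    "A1: Quicken Loans Arena\n"
    "\n"
    "Q2: Which city hosted it?\n"
    "A2: Cleveland\n"
)
print(parse_qa_pairs(sample))
```

A reply that yields no pairs can then be detected with an empty-list check and retried, instead of failing with an `IndexError`.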


#### Hard evaluation pipeline (exact matching of answers)

In [110]:
import pandas as pd
import json
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def convert_to_pipe_format(path_to_csv):
    df = pd.read_csv(path_to_csv)
    string = '/*\n'
    col_list = df.columns.values.tolist()
    string += 'col : ' + ' | '.join(df.columns) + '\n'
    for row_id, row in df.iterrows():
        string += f'row {row_id} : '
        for column_id, header in enumerate(df.columns):
            string += str(row[header])
            if column_id != len(df.columns) - 1:
                string += ' | '
        string += '\n'
    string += '*/\n'
    string += f'columns:{col_list}\n'
    return string

def generate_short_answer(table, question):
    answer_prompt = f"""
    Here is the table to answer this question. Answer the question in 3-4 words max.
    {table}
    Question: {question}
    The answer is: 
    """
    messages = [
        {"role": "system", "content": "You are an expert in answering questions from tabular data."},
        {"role": "user", "content": answer_prompt}
    ]
    completion = client.chat.completions.create(
        model="gpt-4o-mini", 
        temperature=0,
        messages=messages
    )
    generated_answer = completion.choices[0].message.content.strip()
    return generated_answer

def evaluate_qa_pair(qa_pair, correct_answers_list):
    table_path = qa_pair['table_absolute_path']
    table = convert_to_pipe_format(table_path)
    question = qa_pair['question']
    generated_answer = generate_short_answer(table, question)
    correct_answer = qa_pair['answer'].replace("A: ", "").strip()
    if generated_answer.lower() == correct_answer.lower():
        correct_answers_list.append(qa_pair)
        return True
    else:
        print("incorrect answer")
        print(question)
        print("actual: " + correct_answer)
        print("generated: " + generated_answer)
        return False
    
def process_evaluation(json_data):
    total_questions = len(json_data)
    print("total questions: " + str(total_questions))
    correct_answers = 0
    incorrect_answers = []

    for qa_pair in json_data:
        if evaluate_qa_pair(qa_pair, correct_answers_list=[]):
            correct_answers += 1
        else:
            incorrect_answers.append(qa_pair)
    accuracy = (correct_answers / total_questions) * 100
    return correct_answers, accuracy, incorrect_answers

def save_incorrect_answers(incorrect_answers, output_path):
    with open(output_path, "w") as json_file:
        json.dump(incorrect_answers, json_file, indent=4)
    print(f"Incorrectly answered questions saved to {output_path}")

def evaluation_pipeline(input_json_path, incorrect_output_json_path):
    with open(input_json_path, "r") as file:
        json_data = json.load(file)
    correct_answers, accuracy, incorrect_answers = process_evaluation(json_data['results'])
    print(f"Total Correct Answers: {correct_answers}")
    print(f"Accuracy: {accuracy:.2f}%")
    save_incorrect_answers(incorrect_answers, incorrect_output_json_path)

if __name__ == "__main__":
    input_json_path = r"D:\LLMTables\LLMTablesQA\Question Generation\Predetermined-SingleRow\temporal_output_new.json"  
    incorrect_output_json_path = r"D:\LLMTables\Question Generation\incorrect_answers.json"  
    evaluation_pipeline(input_json_path, incorrect_output_json_path)

total questions: 25
incorrect answer
Q: What was the attendance at a game where one team was in last place in their division, played in February?
actual: 16,600
generated: 16600
incorrect answer
Q: Which team managed to secure a narrow victory despite trailing in the final minutes of a game played in early December?
actual: Memphis Grizzlies
generated: Utah Jazz
incorrect answer
Q: Which team improved their record in February after a close victory at home?
actual: Boston Celtics
generated: Chicago Bulls
incorrect answer
Q: How did the Celtics perform in the third quarter of their game against Minnesota in March 2017?
actual: Outscored Minnesota 27-17
generated: Outscored Minnesota 27-17.
incorrect answer
Q: What was the attendance at the game following Detroit's loss to New York?
actual: 18,400 fans
generated: 13600
incorrect answer
Q: Which team struggled early in January but aimed for improvement later that month?
actual: Chicago Bulls
generated: Indiana Pacers
incorrect answer
Q: Wh

FileNotFoundError: [Errno 2] No such file or directory: 'D:\\LLMTables\\Question Generation\\incorrect_answers.json'
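
The `FileNotFoundError` above comes from `open(output_path, "w")`, which does not create missing parent folders. A minimal sketch of a writer that creates the folder first is shown below. The name `save_incorrect_answers_safe` is hypothetical and not part of the original pipeline. It would not help if the path itself is mistyped; it only ensures the target folder exists.

```python
import json
import os


def save_incorrect_answers_safe(incorrect_answers, output_path):
    """Hypothetical variant of save_incorrect_answers that creates the parent folder."""
    parent = os.path.dirname(output_path)
    if parent:
        # exist_ok=True makes this a no-op when the folder is already there
        os.makedirs(parent, exist_ok=True)
    with open(output_path, "w") as json_file:
        json.dump(incorrect_answers, json_file, indent=4)
    print(f"Incorrectly answered questions saved to {output_path}")
```

Swapping this in for `save_incorrect_answers` inside `evaluation_pipeline` would leave the rest of the cell unchanged.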

##### Testing with GPT-4o

In [None]:
import pandas as pd
import random
import os
import json
import openai  

def read_csv_files(folder_path):
    csv_files = []
    for filename in os.listdir(folder_path):
        if filename.endswith(".csv"):
            file_path = os.path.join(folder_path, filename)
            csv_files.append(file_path)
    return csv_files

def sample_row(df):
    return df.sample(1, random_state=random.randint(1, 1000), ignore_index=True)

def generate_qa_pairs(row, num_pairs=1):
    prompt = f"""
    Row Data: {row.to_dict(orient='records')[0]}

    Objective:
        Create complex, logic-driven, temporal questions based on a dataset row that represents a real-world scenario. Each question should require multi-step reasoning, utilizing both structured and unstructured elements of the row. The questions should not directly reveal answers, instead encouraging logical deductions and practical reasoning.
        Row Data: {row.to_dict(orient='records')[0]}
        Guidelines for Question Generation:
        - Self-Contained: The question should make sense on its own without needing to refer to data points from the table.
        - Use Both Structured and Unstructured Data: Include both numbers (scores, stats) and descriptions in the question.
        - Require Reasoning: The answer should not be directly available from the table; it should require logical thinking.
        - Multi-Step Reasoning: Answers should involve multiple logical connections or steps to arrive at.
        - Unique Answer: There should be only one valid answer based on the data.
        - Clear and Simple Language: Keep the questions straightforward and easy to understand.
        - No Word Overlap: Avoid repeating exact phrases from the row in the question.
        - Subtle Hint: Include one subtle clue, but further reasoning should still be needed for the correct answer.
        - Time-Based: The question should involve elements of time, comparing events or situations at different times.
        - Focus on Season Progression: Questions should relate to different stages of the season (early, mid, late) or the timing of events.    
    
    Example Questions:
        1. In which time zone did the 76ers play at the end of November?
        Reason: The question gives away little (only the team name and a rough time period) and pushes the solver to work out the time zone of the game.
        2. In which season did the Minnesota Timberwolves win a home game?
        Reason: With minimal clues such as the team name and the home game, it refers to the NBA season and encourages use of inherent knowledge.
        3. How many days of rest do the Cleveland Cavaliers have before their next game?
        Reason: The summary mentions when the Cleveland Cavaliers play next, so this encourages simple numerical reasoning over time elements.
        4. Does the game between the Knicks and the 76ers take place in the regular NBA season?
        Reason: By looking at the row for the game between the two teams, you have to tell whether it falls in the regular NBA season or the off-season.
        5. What was the Wizards' win-loss record after their last game of the 2015 season?
        Reason: This ensures that the model goes through the various Wizards games and counts the wins and losses after filtering for the 2015 season.
              
      Please make sure that the questions do not rely on the table for understanding. Meaning they should make sense in isolation as well. Generate {num_pairs} such questions. Ensure that the answer is given in maximum 3-5 words.
        MAKE SURE THE QUESTIONS ARE INDIRECT AND COMPLEX AND UTILIZE VARIOUS ASPECTS OF THE TABLE.
        Please give the answer as a JSON object with exactly these keys:
        {{"Q": "<Generated Question>", "A": "<Generated Answer>"}}
    """
    
    # Define your messages
    messages = [
        {"role": "system", "content": "You are an expert in creating evaluation questions from tabular data. Your task is to design generalized, simple questions using unique row properties. Please give valid output JSON."},
        {"role": "user", "content": prompt}
    ]

    chat_completion, *_ = client.chat.completions.create(
            model="gpt-4o", 
            messages=messages,
            response_format={"type": "json_object"},
            temperature=0.6
    ).choices
    content = chat_completion.message.content
    reply = json.loads(content)
    return reply

def process_csv_file(file_path, table_id, num_questions=5):
    df = pd.read_csv(file_path)
    qa_list = []
    for i in range(num_questions):
        row = sample_row(df) 
        qa_output = generate_qa_pairs(row, num_pairs=1) 
        print(qa_output)
        # generate_qa_pairs returns a dict parsed by json.loads, so read the keys directly
        question = str(qa_output["Q"]).strip()
        answer = str(qa_output["A"]).strip()
        question_id = f"{table_id}_{i + 1}"
        qa_entry = {
            "question_id": question_id,
            "question": question,
            "answer": answer,
            "row": row.to_dict(orient='records')[0], 
            "table_id": table_id,
            "table_absolute_path": file_path,
            "question_type": "Predetermined"
        }
        qa_list.append(qa_entry)
    return qa_list

def process_all_csvs(folder_path, num_questions=5):
    output_data = {"results": []} 
    csv_files = read_csv_files(folder_path)
    
    for table_id, file_path in enumerate(csv_files):
        qa_entries = process_csv_file(file_path, table_id, num_questions)
        output_data["results"].extend(qa_entries)
    
    return output_data

def save_to_json(output_data, output_json_path):
    with open(output_json_path, "w") as json_file:
        json.dump(output_data, json_file, indent=4)
    print(f"Output saved to {output_json_path}")

def main_pipeline(folder_path, output_json_path, num_questions=5):
    output_data = process_all_csvs(folder_path, num_questions)
    save_to_json(output_data, output_json_path)

if __name__ == "__main__":
    folder_path = r"D:\LLMTables\LLMTablesQA\Question Generation\TestTables_5"  
    output_json_path = r"D:\LLMTables\LLMTablesQA\Question Generation\Predetermined-SingleRow\gpt_4o_temporal_output_new.json" 
    num_questions_per_csv = 5
    main_pipeline(folder_path, output_json_path, num_questions=num_questions_per_csv)

Q: Considering the timing of the Pelicans' win and the injuries mentioned, which player's absence is expected to impact the Magic's early season performance the most?

A: Victor Oladipo
Q: Considering the Grizzlies' recent performances, which team did they face at the end of their homestand, and what day of the week did this occur?

A: Orlando Magic, Monday
Q: Considering the Grizzlies' defensive strategy on October 28, 2017, how did their performance against the Rockets compare to their earlier season opener against the same team?

A: Similar defensive dominance
Q: How many days after Kemba Walker's wrist injury did the Hornets face the Minnesota Timberwolves at home?

A: Two days later
Q: Which team, known for strong defense but weak three-point shooting, hosted a game against a rival during the mid-season of 2015?

A: Memphis Grizzlies
Q: During which month did a team previously competing for the top conference spot suffer a downturn, yet manage a key victory against a strong oppone
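
Because `generate_qa_pairs` returns the object parsed by `json.loads` rather than raw text, the question and answer have to be read from its keys. A minimal sketch of a defensive extractor follows. The helper name `extract_qa` and the accepted key spellings (`Q`/`A`, `question`/`answer`) are assumptions: the JSON response mode only guarantees valid JSON, not specific key names.

```python
import json


def extract_qa(content):
    """Hypothetical helper: pull (question, answer) out of a JSON reply string."""
    reply = json.loads(content)
    if not isinstance(reply, dict):
        raise ValueError("Expected a JSON object at the top level")
    # Compare keys case-insensitively so "Q", "q" and "Question" all work
    lowered = {str(key).strip().lower(): value for key, value in reply.items()}
    question = lowered.get("q", lowered.get("question"))
    answer = lowered.get("a", lowered.get("answer"))
    if question is None or answer is None:
        raise ValueError(f"Unexpected reply keys: {sorted(reply)}")
    return str(question).strip(), str(answer).strip()
```

Raising on an unexpected shape lets the caller skip or retry one bad reply instead of storing a malformed entry.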

In [None]:
import pandas as pd
import json
import openai

def convert_to_pipe_format(path_to_csv):
    df = pd.read_csv(path_to_csv)
    string = '/*\n'
    col_list = df.columns.values.tolist()
    string += 'col : ' + ' | '.join(df.columns) + '\n'
    for row_id, row in df.iterrows():
        string += f'row {row_id} : '
        for column_id, header in enumerate(df.columns):
            string += str(row[header])
            if column_id != len(df.columns) - 1:
                string += ' | '
        string += '\n'
    string += '*/\n'
    string += f'columns:{col_list}\n'
    return string

def generate_short_answer(table, question):
    answer_prompt = f"""
    Here is the table to answer this question. Answer the question in 3-4 words max.
    {table}
    Question: {question}
    The answer is: 
    """
    messages = [
        {"role": "system", "content": "You are an expert in answering questions from tabular data."},
        {"role": "user", "content": answer_prompt}
    ]
    completion = client.chat.completions.create(
        model="gpt-4o-mini", 
        temperature=0,
        messages=messages
    )
    generated_answer = completion.choices[0].message.content.strip()
    return generated_answer

def evaluate_qa_pair(qa_pair, correct_answers_list):
    table_path = qa_pair['table_absolute_path']
    table = convert_to_pipe_format(table_path)
    question = qa_pair['question']
    generated_answer = generate_short_answer(table, question)
    correct_answer = qa_pair['answer'].replace("A: ", "").strip()
    if generated_answer.lower() == correct_answer.lower():
        correct_answers_list.append(qa_pair)
        return True
    else:
        print("incorrect answer")
        print(question)
        print("actual: " + correct_answer)
        print("generated: " + generated_answer)
        return False
    
def process_evaluation(json_data):
    total_questions = len(json_data)
    print("total questions: " + str(total_questions))
    correct_answers = 0
    incorrect_answers = []

    for qa_pair in json_data:
        if evaluate_qa_pair(qa_pair, correct_answers_list=[]):
            correct_answers += 1
        else:
            incorrect_answers.append(qa_pair)
    accuracy = (correct_answers / total_questions) * 100
    return correct_answers, accuracy, incorrect_answers

def save_incorrect_answers(incorrect_answers, output_path):
    with open(output_path, "w") as json_file:
        json.dump(incorrect_answers, json_file, indent=4)
    print(f"Incorrectly answered questions saved to {output_path}")

def evaluation_pipeline(input_json_path, incorrect_output_json_path):
    with open(input_json_path, "r") as file:
        json_data = json.load(file)
    correct_answers, accuracy, incorrect_answers = process_evaluation(json_data['results'])
    print(f"Total Correct Answers: {correct_answers}")
    print(f"Accuracy: {accuracy:.2f}%")
    save_incorrect_answers(incorrect_answers, incorrect_output_json_path)

if __name__ == "__main__":
    input_json_path = r"D:\LLMTables\LLMTablesQA\Question Generation\Predetermined-SingleRow\gpt_4o_temporal_output_new.json"  
    incorrect_output_json_path = r"D:\LLMTables\LLMTablesQA\Question Generation\Predetermined-SingleRow\incorrect_answers.json"  
    evaluation_pipeline(input_json_path, incorrect_output_json_path)
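
The exact comparison in `evaluate_qa_pair` counts answers such as `16,600` vs `16600`, or `Outscored Minnesota 27-17` vs `Outscored Minnesota 27-17.`, as incorrect even though they agree. A minimal sketch of a light normalization step is shown below. The helpers `normalize_answer` and `answers_match` are hypothetical additions, and the rules (drop thousands separators, drop trailing punctuation, collapse whitespace) are assumptions chosen only to cover those formatting differences.

```python
import re


def normalize_answer(text):
    """Hypothetical helper: canonicalize an answer string before comparison."""
    text = str(text).strip().lower()
    # Remove commas that sit between digits, e.g. "16,600" -> "16600"
    text = re.sub(r"(?<=\d),(?=\d)", "", text)
    # Drop trailing sentence punctuation such as a final period
    text = text.rstrip(".!?")
    # Collapse runs of whitespace into single spaces
    return re.sub(r"\s+", " ", text).strip()


def answers_match(generated, expected):
    """Return True when both answers are equal after normalization."""
    return normalize_answer(generated) == normalize_answer(expected)
```

This stays a strict comparison: paraphrases such as `41 wins` vs `41 games` still count as mismatches.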

## Final temporal pipeline
Note: switch the model to gpt-4o-mini, since its generated questions were better. See the attached JSON outputs from both models for reference.

In [None]:
import pandas as pd
import random
import os
import json
import openai  

def read_csv_files(folder_path):
    csv_files = []
    for filename in os.listdir(folder_path):
        if filename.endswith(".csv"):
            file_path = os.path.join(folder_path, filename)
            csv_files.append(file_path)
    return csv_files

def sample_row(df):
    return df.sample(1, random_state=random.randint(1, 1000), ignore_index=True)


def generate_qa_pairs(row, num_pairs=1):
    prompt = f"""
    Row Data: {row.to_dict(orient='records')[0]}

    Objective:
        Create complex, logic-driven, simple language, temporal questions based on a dataset row that represents a real-world scenario. Each question should require multi-step reasoning, utilizing both structured and unstructured elements of the row. The questions should not directly reveal answers, instead encouraging logical deductions and practical reasoning.
        Row Data: {row.to_dict(orient='records')[0]}
        Guidelines for Question Generation:
        - Make the question as natural and human sounding as possible. The language should be kept simple and human.
        - Self-Contained: The question should make sense on its own without needing to refer to data points from the table.
        - Use Both Structured and Unstructured Data: Include both numbers (scores, stats) and descriptions in the question.
        - Require Reasoning: The answer should not be directly available from the table; it should require logical thinking.
        - Multi-Step Reasoning: Answers should involve multiple logical connections or steps to arrive at.
        - Unique Answer: There should be only one valid answer based on the data.
        - Clear and Simple Language: Keep the questions straightforward and easy to understand.
        - No Word Overlap: Avoid repeating exact phrases from the row in the question.
        - Subtle Hint: Include one subtle clue, but further reasoning should still be needed for the correct answer.
        - Time-Based: The question should involve elements of time, comparing events or situations at different times.
        - Focus on Season Progression: Questions should relate to different stages of the season (early, mid, late) or the timing of events.    
    
    Example Questions:
        1. In which time zone did the 76ers play at the end of November?
        Reason: The question gives away little (only the team name and a rough time period) and pushes the solver to work out the time zone of the game.
        2. In which season did the Minnesota Timberwolves win a home game?
        Reason: With minimal clues such as the team name and the home game, it refers to the NBA season and encourages use of inherent knowledge.
        3. How many days of rest do the Cleveland Cavaliers have before their next game?
        Reason: The summary mentions when the Cleveland Cavaliers play next, so this encourages simple numerical reasoning over time elements.
        4. Does the game between the Knicks and the 76ers take place in the regular NBA season?
        Reason: By looking at the row for the game between the two teams, you have to tell whether it falls in the regular NBA season or the off-season.
        5. What was the Wizards' win-loss record after their last game of the 2015 season?
        Reason: This ensures that the model goes through the various Wizards games and counts the wins and losses after filtering for the 2015 season.
              
      Please make sure that the questions do not rely on the table for understanding. Meaning they should make sense in isolation as well. Generate {num_pairs} such questions. Ensure that the answer is given in maximum 3-5 words.
        MAKE SURE THE QUESTIONS ARE INDIRECT AND COMPLEX AND UTILIZE VARIOUS ASPECTS OF THE TABLE.
        Please give the answer as a JSON object with exactly these keys:
        {{"Q": "<Generated Question>", "A": "<Generated Answer>"}}
    """
    
    messages = [
        {"role": "system", "content": "You are an expert in creating evaluation questions from tabular data."},
        {"role": "user", "content": prompt}
    ]

    chat_completion, *_ = client.chat.completions.create(
        model="gpt-4o", 
        messages=messages,
        response_format={"type": "json_object"},
        temperature=0.6
    ).choices

    content = chat_completion.message.content
    reply = json.loads(content)
    return reply


def process_csv_file(file_path, table_id, num_questions=5, existing_data=None):
    df = pd.read_csv(file_path)
    qa_list = existing_data["questions"] if existing_data else []

    for i in range(num_questions):
        row = sample_row(df) 
        qa_output = generate_qa_pairs(row, num_pairs=1) 

        question = qa_output["Q"]
        answer = qa_output["A"]
        question_id = f"{table_id}_{i + 1}"

        existing_entry = next((item for item in qa_list if item["question_id"] == question_id), None)
        if existing_entry:
            existing_entry["question"] = question
            existing_entry["answer"] = answer
            existing_entry["row"] = row.to_dict(orient='records')[0]
        else:
            qa_entry = {
                "question_id": question_id,
                "question": question,
                "answer": answer,
                "row": row.to_dict(orient='records')[0],
                "table_id": table_id,
                "table_absolute_path": file_path,
                "question_type": "Predetermined"
            }
            qa_list.append(qa_entry)

    return {"questions": qa_list}

def process_all_csvs(folder_path, num_questions=5, existing_data=None):
    output_data = {"questions": existing_data["questions"] if existing_data else []}
    csv_files = read_csv_files(folder_path)
    
    for table_id, file_path in enumerate(csv_files):
        table_data = {
            "questions": [item for item in output_data["questions"] if item["table_id"] == table_id]
        }
        updated_table_data = process_csv_file(file_path, table_id, num_questions, table_data)
        output_data["questions"].extend(updated_table_data["questions"])

    output_data["questions"] = list({v["question_id"]: v for v in output_data["questions"]}.values())
    
    return output_data


def save_to_json(output_data, output_json_path):
    with open(output_json_path, "w") as json_file:
        json.dump(output_data, json_file, indent=4)
    print(f"Output saved to {output_json_path}")

def load_existing_json(file_path):
    if os.path.exists(file_path):
        with open(file_path, "r") as json_file:
            return json.load(json_file)
    return None

def main_pipeline(folder_path, output_json_path, num_questions=5):
    existing_data = load_existing_json(output_json_path)
    output_data = process_all_csvs(folder_path, num_questions, existing_data)
    save_to_json(output_data, output_json_path)

if __name__ == "__main__":
    folder_path = r"D:\LLMTables\LLMTablesQA\Question Generation\TestTables_5"  
    output_json_path = r"D:\LLMTables\LLMTablesQA\Question Generation\Predetermined-SingleRow\gpt_4o_temporal_output_new.json" 
    num_questions_per_csv = 5
    main_pipeline(folder_path, output_json_path, num_questions=num_questions_per_csv)


Output saved to D:\LLMTables\LLMTablesQA\Question Generation\Predetermined-SingleRow\gpt_4o_temporal_output_new.json
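
`sample_row` draws one row per call with a fresh random seed, so the same row can be picked more than once for a table. A minimal sketch of drawing distinct rows in a single call is shown below. The helper `sample_distinct_rows` and its `seed` parameter are hypothetical, and whether repeated rows matter for this dataset is an assumption.

```python
import pandas as pd


def sample_distinct_rows(df, num_questions, seed=None):
    """Hypothetical helper: return up to num_questions distinct single-row DataFrames."""
    # Never ask for more rows than the table has
    n = min(num_questions, len(df))
    # DataFrame.sample draws without replacement by default
    sampled = df.sample(n=n, random_state=seed)
    # Keep each row as a one-row DataFrame so row.to_dict(orient='records')[0] still works
    return [sampled.iloc[[i]].reset_index(drop=True) for i in range(n)]
```

In `process_csv_file`, the loop could then iterate over `sample_distinct_rows(df, num_questions)` instead of calling `sample_row(df)` on each pass.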


In [119]:
import pandas as pd
import json
import openai

def convert_to_pipe_format(path_to_csv):
    df = pd.read_csv(path_to_csv)
    string = '/*\n'
    col_list = df.columns.values.tolist()
    string += 'col : ' + ' | '.join(df.columns) + '\n'
    for row_id, row in df.iterrows():
        string += f'row {row_id} : '
        for column_id, header in enumerate(df.columns):
            string += str(row[header])
            if column_id != len(df.columns) - 1:
                string += ' | '
        string += '\n'
    string += '*/\n'
    string += f'columns:{col_list}\n'
    return string

def generate_short_answer(table, question):
    answer_prompt = f"""
    Here is the table to answer this question. Answer the question in 3-4 words max.
    {table}
    Question: {question}
    The answer is: 
    """
    messages = [
        {"role": "system", "content": "You are an expert in answering questions from tabular data."},
        {"role": "user", "content": answer_prompt}
    ]
    completion = client.chat.completions.create(
        model="gpt-4o-mini", 
        temperature=0,
        messages=messages
    )
    generated_answer = completion.choices[0].message.content.strip()
    return generated_answer

def evaluate_qa_pair(qa_pair, correct_answers_list):
    table_path = qa_pair['table_absolute_path']
    table = convert_to_pipe_format(table_path)
    question = qa_pair['question']
    generated_answer = generate_short_answer(table, question)
    correct_answer = qa_pair['answer'].replace("A: ", "").strip()
    if generated_answer.lower() == correct_answer.lower():
        correct_answers_list.append(qa_pair)
        return True
    else:
        print("incorrect answer")
        print(question)
        print("actual: " + correct_answer)
        print("generated: " + generated_answer)
        return False
    
def process_evaluation(json_data):
    total_questions = len(json_data)
    print("total questions: " + str(total_questions))
    correct_answers = 0
    incorrect_answers = []

    for qa_pair in json_data:
        if evaluate_qa_pair(qa_pair, correct_answers_list=[]):
            correct_answers += 1
        else:
            incorrect_answers.append(qa_pair)
    accuracy = (correct_answers / total_questions) * 100
    return correct_answers, accuracy, incorrect_answers

def save_incorrect_answers(incorrect_answers, output_path):
    with open(output_path, "w") as json_file:
        json.dump(incorrect_answers, json_file, indent=4)
    print(f"Incorrectly answered questions saved to {output_path}")

def evaluation_pipeline(input_json_path, incorrect_output_json_path):
    with open(input_json_path, "r") as file:
        json_data = json.load(file)
    correct_answers, accuracy, incorrect_answers = process_evaluation(json_data['questions'])
    print(f"Total Correct Answers: {correct_answers}")
    print(f"Accuracy: {accuracy:.2f}%")
    save_incorrect_answers(incorrect_answers, incorrect_output_json_path)

if __name__ == "__main__":
    input_json_path = r"D:\LLMTables\LLMTablesQA\Question Generation\Predetermined-SingleRow\gpt_4o_temporal_output_new.json"  
    incorrect_output_json_path = r"D:\LLMTables\LLMTablesQA\Question Generation\Predetermined-SingleRow\incorrect_answers.json"  
    evaluation_pipeline(input_json_path, incorrect_output_json_path)

total questions: 25
incorrect answer
Considering the Miami Heat's victory on April 5, 2017, how many days after this win do they play their next game against the Raptors?
actual: Two days later
generated: Two days later.
incorrect answer
Considering Eric Bledsoe's performance trend on Wednesdays, how might his free throw effectiveness impact his points compared to field goals over his last four games?
actual: Relies on free throws.
generated: Free throw reliance.
incorrect answer
Considering the Grizzlies' performance against the 76ers and their upcoming match, how might their confidence level compare at the start of the season versus after this game?
actual: Higher after this game
generated: Increased confidence level.
incorrect answer
If a team has won four consecutive games, and their latest victory was on a Monday in March, how many games have they won in total this season if they had 37 wins before this streak?
actual: 41 wins
generated: 41 games
incorrect answer
Considering the B