<a href="https://colab.research.google.com/github/rlahiff/rlahiff-introduction-to-github/blob/main/Copy_of_FactCheckerBot.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Task
Build a fact checker bot using Python that utilizes OpenAI GPT for fact-checking and saves the output to a Google Sheet.

## Set up the environment

### Subtask:
Install necessary libraries (OpenAI, google-api-python-client, google-auth-httplib2, google-auth-oauthlib). Authenticate with Google Sheets API.


**Reasoning**:
Install the required libraries using pip.



In [None]:
%pip install openai google-api-python-client google-auth-httplib2 google-auth-oauthlib



**Reasoning**:
Authenticate with the Google Sheets API using the downloaded credentials.



In [None]:
import os.path

from google.auth.transport.requests import Request
from google.oauth2.credentials import Credentials
from google_auth_oauthlib.flow import InstalledAppFlow
from googleapiclient.discovery import build
from googleapiclient.errors import HttpError

# If modifying these scopes, delete the file token.json.
SCOPES = ["https://www.googleapis.com/auth/spreadsheets"]

def authenticate_google_sheets():
  """Authenticates with the Google Sheets API and returns a service object.
  Returns None if building the service raises an HttpError.
  """
  creds = None
  # The file token.json stores the user's access and refresh tokens, and is
  # created automatically when the authorization flow completes for the first
  # time.
  if os.path.exists("token.json"):
    creds = Credentials.from_authorized_user_file("token.json", SCOPES)
  # If there are no (valid) credentials available, let the user log in.
  if not creds or not creds.valid:
    if creds and creds.expired and creds.refresh_token:
      creds.refresh(Request())
    else:
      flow = InstalledAppFlow.from_client_secrets_file(
          "credentials.json", SCOPES
      )
      creds = flow.run_local_server(port=0)
    # Save the credentials for the next run
    with open("token.json", "w") as token:
      token.write(creds.to_json())

  try:
    service = build("sheets", "v4", credentials=creds)
    print("Successfully authenticated with Google Sheets API.")
    return service
  except HttpError as err:
    print(err)
    return None

# Call the authentication function
sheets_service = authenticate_google_sheets()

FileNotFoundError: [Errno 2] No such file or directory: 'credentials.json'
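
The `FileNotFoundError` above is raised when the authentication cell tries to open `credentials.json`, the OAuth client file that has to be present in the working directory. A minimal sketch of a pre-flight check that could run before `authenticate_google_sheets()`; the helper name `missing_files` is hypothetical and not part of any Google library.

```python
from pathlib import Path


def missing_files(required, base_dir="."):
    """Return the names in `required` that do not exist under `base_dir`."""
    base = Path(base_dir)
    return [name for name in required if not (base / name).is_file()]


# credentials.json is the file name used by the authentication cell above.
missing = missing_files(["credentials.json"])
if missing:
    print(f"Upload these files before authenticating: {', '.join(missing)}")
else:
    print("credentials.json found; authentication can proceed.")
```

Running this first turns the traceback into a short, actionable message.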

## Get input

### Subtask:
Define how the bot will receive the facts to be checked (e.g., from a user input, a list, or another source).


**Reasoning**:
Create a list variable to store the sample facts.



In [None]:
facts_to_check = [
    "The capital of France is Berlin.",
    "The sun is a planet.",
    "Water boils at 100 degrees Celsius at standard atmospheric pressure.",
    "Birds can fly.",
    "Humans have gills."
]

print(facts_to_check)

['The capital of France is Berlin.', 'The sun is a planet.', 'Water boils at 100 degrees Celsius at standard atmospheric pressure.', 'Birds can fly.', 'Humans have gills.']
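
The subtask mentions user input or another source as alternatives to a hard-coded list. A small sketch of one such alternative, assuming the facts arrive as raw text lines (for example, read from a file or pasted by a user); `normalize_facts` is a hypothetical helper that drops blank lines and exact duplicates while keeping the original order.

```python
def normalize_facts(lines):
    """Strip whitespace, skip blank lines, and drop exact duplicates in order."""
    seen = set()
    facts = []
    for line in lines:
        fact = line.strip()
        if fact and fact not in seen:
            seen.add(fact)
            facts.append(fact)
    return facts


raw_lines = [
    "The capital of France is Berlin.\n",
    "\n",
    "  Birds can fly.  ",
    "Birds can fly.",
]
print(normalize_facts(raw_lines))
```

The result can be assigned to `facts_to_check` so the rest of the notebook runs unchanged.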


## Fact checking with OpenAI GPT

### Subtask:
Write code to send the fact to the OpenAI API and get the fact-checked response.


**Reasoning**:
Define a function to fact-check a single fact using the OpenAI API and then iterate through the list of facts, calling this function and storing the results.



In [None]:
import openai
import os

# Set your OpenAI API key
openai.api_key = os.environ.get("OPENAI_API_KEY")

def fact_check_with_openai(fact):
    """Fact checks a given fact using the OpenAI API."""
    try:
        response = openai.ChatCompletion.create(
            model="gpt-3.5-turbo",
            messages=[
                {"role": "system", "content": "You are a helpful fact-checker."},
                {"role": "user", "content": f"Fact check this: '{fact}'. Provide a concise response stating if it's true or false and a brief explanation."}
            ],
            max_tokens=100
        )
        return response.choices[0].message['content'].strip()
    except Exception as e:
        print(f"An error occurred while fact-checking '{fact}': {e}")
        return "Error during fact-checking."

fact_checked_results = []
for fact in facts_to_check:
    checked_response = fact_check_with_openai(fact)
    fact_checked_results.append({"original_fact": fact, "checked_response": checked_response})

print(fact_checked_results)

An error occurred while fact-checking 'The capital of France is Berlin.': 

You tried to access openai.ChatCompletion, but this is no longer supported in openai>=1.0.0 - see the README at https://github.com/openai/openai-python for the API.

You can run `openai migrate` to automatically upgrade your codebase to use the 1.0.0 interface. 

Alternatively, you can pin your installation to the old version, e.g. `pip install openai==0.28`

A detailed migration guide is available here: https://github.com/openai/openai-python/discussions/742

An error occurred while fact-checking 'The sun is a planet.': 

You tried to access openai.ChatCompletion, but this is no longer supported in openai>=1.0.0 - see the README at https://github.com/openai/openai-python for the API.

You can run `openai migrate` to automatically upgrade your codebase to use the 1.0.0 interface. 

Alternatively, you can pin your installation to the old version, e.g. `pip install openai==0.28`

A detailed migration guide is ava

**Reasoning**:
The previous code failed because the OpenAI API usage is outdated. Update the code to use the new OpenAI library syntax.



In [None]:
from openai import OpenAI
import os

# Set your OpenAI API key
openai_client = OpenAI(api_key=os.environ.get("OPENAI_API_KEY"))

def fact_check_with_openai(fact):
    """Fact checks a given fact using the OpenAI API."""
    try:
        response = openai_client.chat.completions.create(
            model="gpt-3.5-turbo",
            messages=[
                {"role": "system", "content": "You are a helpful fact-checker."},
                {"role": "user", "content": f"Fact check this: '{fact}'. Provide a concise response stating if it's true or false and a brief explanation."}
            ],
            max_tokens=100
        )
        return response.choices[0].message.content.strip()
    except Exception as e:
        print(f"An error occurred while fact-checking '{fact}': {e}")
        return "Error during fact-checking."

fact_checked_results = []
for fact in facts_to_check:
    checked_response = fact_check_with_openai(fact)
    fact_checked_results.append({"original_fact": fact, "checked_response": checked_response})

print(fact_checked_results)

OpenAIError: The api_key client option must be set either by passing api_key to the client or by setting the OPENAI_API_KEY environment variable
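
The `OpenAIError` above means `OPENAI_API_KEY` was not set when the client was created. One way to handle this in a notebook, sketched under the assumption that prompting the user is acceptable: read the variable if it is present, otherwise ask for the key with `getpass` so it is not echoed or stored in a cell. `resolve_api_key` is a hypothetical helper; its `env` and `prompt` parameters exist only to make it easy to test.

```python
import os
from getpass import getpass


def resolve_api_key(env=os.environ, prompt=getpass):
    """Return the OpenAI key from `env`, prompting for it only when absent."""
    key = env.get("OPENAI_API_KEY", "").strip()
    if not key:
        key = prompt("OpenAI API key: ").strip()
    if not key:
        raise ValueError("No OpenAI API key was provided.")
    return key
```

The client could then be created with `OpenAI(api_key=resolve_api_key())`.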

## Fact checking with OpenAI GPT - retry 1

### Subtask:
Retry fact checking with OpenAI GPT after fixing the missing API key issue.


**Reasoning**:
Set the OPENAI_API_KEY environment variable and then iterate through the facts_to_check list, calling the fact_check_with_openai function for each fact and appending the results to the fact_checked_results list as dictionaries. Finally, print the fact_checked_results list.



In [None]:
import os
from openai import OpenAI

# Set your OpenAI API key - replace 'your-openai-api-key' with your actual key
os.environ["OPENAI_API_KEY"] = "your-openai-api-key"

# Create the client using the key stored in the environment variable
openai_client = OpenAI(api_key=os.environ.get("OPENAI_API_KEY"))

def fact_check_with_openai(fact):
    """Fact checks a given fact using the OpenAI API."""
    try:
        response = openai_client.chat.completions.create(
            model="gpt-3.5-turbo",
            messages=[
                {"role": "system", "content": "You are a helpful fact-checker."},
                {"role": "user", "content": f"Fact check this: '{fact}'. Provide a concise response stating if it's true or false and a brief explanation."}
            ],
            max_tokens=100
        )
        return response.choices[0].message.content.strip()
    except Exception as e:
        print(f"An error occurred while fact-checking '{fact}': {e}")
        return "Error during fact-checking."

fact_checked_results = []
for fact in facts_to_check:
    checked_response = fact_check_with_openai(fact)
    fact_checked_results.append({"original_fact": fact, "checked_response": checked_response})

print(fact_checked_results)

An error occurred while fact-checking 'The capital of France is Berlin.': Error code: 401 - {'error': {'message': 'Incorrect API key provided: your-ope*******-key. You can find your API key at https://platform.openai.com/account/api-keys.', 'type': 'invalid_request_error', 'param': None, 'code': 'invalid_api_key'}}
An error occurred while fact-checking 'The sun is a planet.': Error code: 401 - {'error': {'message': 'Incorrect API key provided: your-ope*******-key. You can find your API key at https://platform.openai.com/account/api-keys.', 'type': 'invalid_request_error', 'param': None, 'code': 'invalid_api_key'}}
An error occurred while fact-checking 'Water boils at 100 degrees Celsius at standard atmospheric pressure.': Error code: 401 - {'error': {'message': 'Incorrect API key provided: your-ope*******-key. You can find your API key at https://platform.openai.com/account/api-keys.', 'type': 'invalid_request_error', 'param': None, 'code': 'invalid_api_key'}}
An error occurred while f
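
Because `fact_check_with_openai` catches every exception and returns the string `"Error during fact-checking."`, the loop repeats the same 401 error for each fact. A sketch of a loop that stops at the first such failure, assuming that sentinel string is the only failure signal available; `check_all` and its `checker` parameter are hypothetical names.

```python
# String returned by fact_check_with_openai when a call fails.
ERROR_SENTINEL = "Error during fact-checking."


def check_all(facts, checker, stop_on_error=True):
    """Run `checker` on each fact, optionally stopping at the first failure."""
    results = []
    for fact in facts:
        response = checker(fact)
        results.append({"original_fact": fact, "checked_response": response})
        if stop_on_error and response == ERROR_SENTINEL:
            print(f"Stopping after a failure on: {fact!r}")
            break
    return results


# Stand-in checker so the sketch runs without network access.
def always_fails(fact):
    return ERROR_SENTINEL


print(len(check_all(["fact one", "fact two", "fact three"], always_fails)))
```

In the notebook, the call would be `check_all(facts_to_check, fact_check_with_openai)`.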

## Process the output

### Subtask:
Parse the response from OpenAI to extract the fact-checked information.


**Reasoning**:
Iterate through the fact_checked_results list and parse the 'checked_response' string to extract the truth value and explanation. Update the dictionary with this extracted information.



In [None]:
parsed_results = []
for item in fact_checked_results:
    checked_response = item.get('checked_response', '')
    truth_value = "Unknown"
    explanation = checked_response

    if "True" in checked_response:
        truth_value = "True"
        explanation = checked_response.replace("True: ", "", 1).strip()
    elif "False" in checked_response:
        truth_value = "False"
        explanation = checked_response.replace("False: ", "", 1).strip()
    elif "Error" in checked_response:
        truth_value = "Error"
        explanation = checked_response.strip()

    parsed_results.append({
        "original_fact": item.get('original_fact', ''),
        "truth_value": truth_value,
        "explanation": explanation
    })

print(parsed_results)

[{'original_fact': 'The capital of France is Berlin.', 'truth_value': 'Error', 'explanation': 'Error during fact-checking.'}, {'original_fact': 'The sun is a planet.', 'truth_value': 'Error', 'explanation': 'Error during fact-checking.'}, {'original_fact': 'Water boils at 100 degrees Celsius at standard atmospheric pressure.', 'truth_value': 'Error', 'explanation': 'Error during fact-checking.'}, {'original_fact': 'Birds can fly.', 'truth_value': 'Error', 'explanation': 'Error during fact-checking.'}, {'original_fact': 'Humans have gills.', 'truth_value': 'Error', 'explanation': 'Error during fact-checking.'}]
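
The substring checks above label a response `True` whenever the capitalized word appears anywhere in it, so a reply such as "False: the True capital is Paris." would be misclassified. A sketch of a stricter variant, assuming the model begins its reply with the verdict word (the prompt asks for this only loosely, so unmatched replies fall back to `Unknown`); `parse_verdict` is a hypothetical helper.

```python
import re

# Verdict word at the very start, followed by optional punctuation or spaces.
VERDICT_RE = re.compile(r"^\s*(true|false|error)\b[\s:.,\-]*", re.IGNORECASE)


def parse_verdict(response):
    """Split a reply into (truth_value, explanation) using a leading verdict word."""
    match = VERDICT_RE.match(response)
    if not match:
        return "Unknown", response.strip()
    truth_value = match.group(1).capitalize()
    if truth_value == "Error":
        # Keep the full error text as the explanation.
        return truth_value, response.strip()
    return truth_value, response[match.end():].strip()


print(parse_verdict("False: the True capital is Paris."))
```

Replies that do not start with a verdict word stay `Unknown` and can be reviewed by hand.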


## Save to Google Sheets

### Subtask:
Write code to append the original fact and the fact-checked output to a Google Sheet.


**Reasoning**:
Write code to append the original fact and the fact-checked output to a Google Sheet using the authenticated sheets service.



In [None]:
# Define the Spreadsheet ID and the range
# Replace 'YOUR_SPREADSHEET_ID' with your actual Spreadsheet ID
SPREADSHEET_ID = 'YOUR_SPREADSHEET_ID'
RANGE_NAME = 'Sheet1!A:C' # Assuming columns A, B, C for Original Fact, Truth Value, Explanation

# Prepare the data to be written to the sheet
values_to_write = []
for result in parsed_results:
    values_to_write.append([
        result.get('original_fact', ''),
        result.get('truth_value', ''),
        result.get('explanation', '')
    ])

# Write the data to the Google Sheet
try:
    if sheets_service:
        body = {
            'values': values_to_write
        }
        result = sheets_service.spreadsheets().values().append(
            spreadsheetId=SPREADSHEET_ID,
            range=RANGE_NAME,
            valueInputOption='USER_ENTERED',
            body=body
        ).execute()
        print(f"{result.get('updates').get('updatedCells')} cells appended.")
    else:
        print("Google Sheets service not available. Authentication failed in a previous step.")
except HttpError as error:
    print(f"An error occurred: {error}")

NameError: name 'sheets_service' is not defined

**Reasoning**:
The previous code failed because `sheets_service` was never defined: the earlier authentication cell raised `FileNotFoundError` before the assignment completed. Re-execute the authentication code to make `sheets_service` available.



In [None]:
import os.path

from google.auth.transport.requests import Request
from google.oauth2.credentials import Credentials
from google_auth_oauthlib.flow import InstalledAppFlow
from googleapiclient.discovery import build
from googleapiclient.errors import HttpError

# If modifying these scopes, delete the file token.json.
SCOPES = ["https://www.googleapis.com/auth/spreadsheets"]

def authenticate_google_sheets():
  """Authenticates with the Google Sheets API and returns a service object.
  Returns None if building the service raises an HttpError.
  """
  creds = None
  # The file token.json stores the user's access and refresh tokens, and is
  # created automatically when the authorization flow completes for the first
  # time.
  if os.path.exists("token.json"):
    creds = Credentials.from_authorized_user_file("token.json", SCOPES)
  # If there are no (valid) credentials available, let the user log in.
  if not creds or not creds.valid:
    if creds and creds.expired and creds.refresh_token:
      creds.refresh(Request())
    else:
      flow = InstalledAppFlow.from_client_secrets_file(
          "credentials.json", SCOPES
      )
      creds = flow.run_local_server(port=0)
    # Save the credentials for the next run
    with open("token.json", "w") as token:
      token.write(creds.to_json())

  try:
    service = build("sheets", "v4", credentials=creds)
    print("Successfully authenticated with Google Sheets API.")
    return service
  except HttpError as err:
    print(err)
    return None

# Call the authentication function
sheets_service = authenticate_google_sheets()

FileNotFoundError: [Errno 2] No such file or directory: 'credentials.json'

## Save to Google Sheets - retry 1

### Subtask:
Retry saving to Google Sheets after addressing the authentication issue.


**Reasoning**:
Define the spreadsheet ID and range, prepare the data, and attempt to append the data to the Google Sheet using the previously obtained `sheets_service`. Handle potential exceptions during the writing process.



In [None]:
# Define the spreadsheet ID and the range
# Replace 'YOUR_SPREADSHEET_ID' with the actual ID of your Google Sheet
SPREADSHEET_ID = 'YOUR_SPREADSHEET_ID'
RANGE_NAME = 'Sheet1!A:C'  # Assuming you want to write to columns A, B, and C of Sheet1

# Prepare the data to be written
# Each inner list represents a row: [original_fact, truth_value, explanation]
values_to_write = []
for item in parsed_results:
    values_to_write.append([
        item.get('original_fact', ''),
        item.get('truth_value', ''),
        item.get('explanation', '')
    ])

if sheets_service:
    try:
        body = {
            'values': values_to_write
        }
        result = sheets_service.spreadsheets().values().append(
            spreadsheetId=SPREADSHEET_ID,
            range=RANGE_NAME,
            valueInputOption='USER_ENTERED',
            insertDataOption='INSERT_ROWS',
            body=body
        ).execute()
        print(f"{result.get('updates').get('updatedCells')} cells appended.")
    except HttpError as error:
        print(f"An error occurred: {error}")
    except Exception as e:
        print(f"An unexpected error occurred: {e}")
else:
    print("Google Sheets service not available. Authentication failed in a previous step.")


NameError: name 'sheets_service' is not defined
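
The `NameError` shows that the `if sheets_service:` guard cannot help when the name was never assigned at all. A sketch of a check that could run before the append call, assuming the placeholder `'YOUR_SPREADSHEET_ID'` from the cell above; `write_blockers` is a hypothetical helper that takes the notebook namespace as a plain dict.

```python
def write_blockers(namespace, placeholder_id="YOUR_SPREADSHEET_ID"):
    """Return human-readable reasons why appending to the sheet cannot work yet."""
    problems = []
    if namespace.get("sheets_service") is None:
        problems.append("sheets_service is missing; authentication must succeed first.")
    if namespace.get("SPREADSHEET_ID") in (None, "", placeholder_id):
        problems.append("SPREADSHEET_ID is still a placeholder.")
    if not namespace.get("parsed_results"):
        problems.append("parsed_results is empty or undefined.")
    return problems


# In the notebook this would be called as write_blockers(globals()).
for problem in write_blockers({"SPREADSHEET_ID": "YOUR_SPREADSHEET_ID"}):
    print(problem)
```

An empty list means the write can be attempted.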

**Reasoning**:
The previous attempt failed because `sheets_service` was not defined. Re-authenticate with the Google Sheets API to get the `sheets_service` object, then attempt to save the data to the sheet again in the same cell. Authentication still requires `credentials.json` to be present in the working directory.



In [None]:
import os.path

from google.auth.transport.requests import Request
from google.oauth2.credentials import Credentials
from google_auth_oauthlib.flow import InstalledAppFlow
from googleapiclient.discovery import build
from googleapiclient.errors import HttpError

# If modifying these scopes, delete the file token.json.
SCOPES = ["https://www.googleapis.com/auth/spreadsheets"]

def authenticate_google_sheets():
  """Authenticates with the Google Sheets API and returns a service object.
  Returns None if building the service raises an HttpError.
  """
  creds = None
  # The file token.json stores the user's access and refresh tokens, and is
  # created automatically when the authorization flow completes for the first
  # time.
  if os.path.exists("token.json"):
    creds = Credentials.from_authorized_user_file("token.json", SCOPES)
  # If there are no (valid) credentials available, let the user log in.
  if not creds or not creds.valid:
    if creds and creds.expired and creds.refresh_token:
      creds.refresh(Request())
    else:
      flow = InstalledAppFlow.from_client_secrets_file(
          "credentials.json", SCOPES
      )
      creds = flow.run_local_server(port=0)
    # Save the credentials for the next run
    with open("token.json", "w") as token:
      token.write(creds.to_json())

  try:
    service = build("sheets", "v4", credentials=creds)
    print("Successfully authenticated with Google Sheets API.")
    return service
  except HttpError as err:
    print(err)
    return None

# Call the authentication function
sheets_service = authenticate_google_sheets()

# Define the spreadsheet ID and the range
# Replace 'YOUR_SPREADSHEET_ID' with the actual ID of your Google Sheet
SPREADSHEET_ID = 'YOUR_SPREADSHEET_ID'
RANGE_NAME = 'Sheet1!A:C'  # Assuming you want to write to columns A, B, and C of Sheet1

# Prepare the data to be written
# Each inner list represents a row: [original_fact, truth_value, explanation]
values_to_write = []
# Assuming 'parsed_results' is available from a previous step
for item in parsed_results:
    values_to_write.append([
        item.get('original_fact', ''),
        item.get('truth_value', ''),
        item.get('explanation', '')
    ])

if sheets_service:
    try:
        body = {
            'values': values_to_write
        }
        result = sheets_service.spreadsheets().values().append(
            spreadsheetId=SPREADSHEET_ID,
            range=RANGE_NAME,
            valueInputOption='USER_ENTERED',
            insertDataOption='INSERT_ROWS',
            body=body
        ).execute()
        print(f"{result.get('updates').get('updatedCells')} cells appended.")
    except HttpError as error:
        print(f"An error occurred: {error}")
    except Exception as e:
        print(f"An unexpected error occurred: {e}")
else:
    print("Google Sheets service not available. Authentication failed.")

FileNotFoundError: [Errno 2] No such file or directory: 'credentials.json'

## Summary:

### Data Analysis Key Findings

*   The initial setup and authentication with the Google Sheets API failed because the `credentials.json` file was not found.
*   Fact-checking with the OpenAI API failed three times: first because of an outdated method call (`openai.ChatCompletion.create`), then because the `OPENAI_API_KEY` environment variable was missing, and finally because the placeholder key was rejected with a 401 `invalid_api_key` error.
*   Despite the OpenAI API errors, the parsing code ran and produced structured output, but every fact was labeled `Error` because no real fact-check responses were available.
*   Saving the processed data to Google Sheets failed repeatedly due to the persistent issue of the missing `credentials.json` file, which prevented successful authentication with the Google Sheets API.

### Insights or Next Steps

*   To proceed, the user needs to provide the `credentials.json` file from their Google Cloud Console to enable Google Sheets API authentication.
*   Ensure the `OPENAI_API_KEY` environment variable is correctly set with a valid API key to allow the OpenAI API calls to succeed.
