The following script uses OpenAI's file search tool in a simple terminal-based chat application.

To run this application, complete the following steps:

1. Install the required packages

In [None]:
pip install openai rich

2. Set your OpenAI API key as an environment variable, or enter it when prompted.

Windows set-up
- Run the following in the Command Prompt, replacing <yourkey> with your API key. setx only affects new sessions, so open a new Command Prompt window afterwards:

In [None]:
setx OPENAI_API_KEY "<yourkey>"

3. Run the script: 

In [None]:
python conversation.py
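
Before running the script, it can help to confirm that the key is actually visible to Python. The following is a minimal sketch only: the helper name `api_key_source` is hypothetical and not part of conversation.py. It mirrors the script's behaviour of prompting for the key when OPENAI_API_KEY is unset or empty.

```python
import os


def api_key_source(environ=os.environ):
    """Report where the chat script would take the API key from.

    Hypothetical helper: returns "environment" when OPENAI_API_KEY is set
    and non-empty, otherwise "prompt" (the script would ask for the key).
    """
    return "environment" if environ.get("OPENAI_API_KEY") else "prompt"


if __name__ == "__main__":
    print(f"API key will be read from: {api_key_source()}")
```

If this reports "prompt" straight after running setx, the most likely cause is that the terminal was opened before the variable was set.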

To help the chat application provide up-to-date, reliable results, the chatbot uses vector stores, managed through the Storage dashboard of the OpenAI API platform.


(INSERT IMAGE OF VECTOR STORE EXAMPLE)

In this example, two websites, both specifically recommended by POCD, have been scraped, converted into separate .txt files, and stored in our vector store.

These websites are titled:
    - 'Describing the severity of a hearing loss' - Aussie Deaf Kids
    - 'Describing hearing loss' - Aussie Deaf Kids


The converted text files can be found in the 'output_txt_files' folder of this repository.
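
The source does not say how the pages were scraped, so the following is only a sketch of one possible approach, using Python's standard-library HTML parser. The names `html_to_text` and `save_page_as_txt` are hypothetical; only the 'output_txt_files' folder name comes from this repository.

```python
from html.parser import HTMLParser
from pathlib import Path


class _TextExtractor(HTMLParser):
    """Collect visible text, skipping the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and not self._skip_depth:
            self.chunks.append(text)


def html_to_text(html):
    """Return the visible text of an HTML string, one chunk per line."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.chunks)


def save_page_as_txt(html, name, out_dir="output_txt_files"):
    """Write the extracted text to <out_dir>/<name>.txt and return the path."""
    folder = Path(out_dir)
    folder.mkdir(parents=True, exist_ok=True)
    path = folder / f"{name}.txt"
    path.write_text(html_to_text(html), encoding="utf-8")
    return path
```

Plain text keeps the files small and avoids feeding navigation markup or scripts into the vector store. A real scraper would also need to fetch the pages and respect the site's terms of use.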

To use this vector store, its ID (vs_6801d393c83c819184cad189cc621a23) has been specified in our application. This ID is specific to my (Dylan's) OpenAI dashboard environment, so other users will likely need to substitute the ID of a vector store from their own account.
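
Because the ID is tied to one dashboard, one option is to let it be overridden without editing the script. The sketch below assumes a hypothetical `VECTOR_STORE_ID` environment variable and a hypothetical helper `resolve_vector_store_id`; neither is part of the original application. The "vs_" prefix check is inferred only from the ID shown above.

```python
import os

# Default taken from this document; it only works in the author's environment.
DEFAULT_VECTOR_STORE_ID = "vs_6801d393c83c819184cad189cc621a23"


def resolve_vector_store_id(environ=os.environ):
    """Return the vector store ID, preferring an environment override.

    Assumption: IDs start with "vs_", as the one in this document does.
    """
    store_id = environ.get("VECTOR_STORE_ID", "").strip() or DEFAULT_VECTOR_STORE_ID
    if not store_id.startswith("vs_"):
        raise ValueError(f"Unexpected vector store ID format: {store_id!r}")
    return store_id
```

The script could then set `VECTOR_STORE_ID = resolve_vector_store_id()` in place of the hard-coded constant.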

Each part of the application is explained below.

Importing packages:

In [None]:
import os
import openai
from rich.console import Console
from rich.markdown import Markdown

Initialising the OpenAI client:

In [None]:
openai_api_key = os.environ.get("OPENAI_API_KEY")
if not openai_api_key:
    openai_api_key = input("Please enter your OpenAI API key: ")
    os.environ["OPENAI_API_KEY"] = openai_api_key

client = openai.OpenAI(api_key=openai_api_key)
console = Console()

Vector store ID:

In [None]:
VECTOR_STORE_ID = "vs_6801d393c83c819184cad189cc621a23"

Helper function and main chat loop:

In [None]:
def print_markdown(text):
    """Print text as markdown."""
    console.print(Markdown(text))

def chat_with_file_search():
    """Main chat loop with file search capabilities."""
    console.print("[bold green]File Search Enabled Chat[/bold green]")
    console.print("[italic]Type 'exit' to quit the chat[/italic]")
    console.print()
    
    previous_response_id = None
    
    while True:
        # Get user input
        user_input = input("\n[You]: ")
        
        if user_input.lower() == 'exit':
            console.print("[bold green]Thank you for chatting![/bold green]")
            break
        
        try:
            # Create a response with file search tool
            response = client.responses.create(
                model="gpt-4o-mini",
                input=user_input,
                previous_response_id=previous_response_id,
                tools=[{
                    "type": "file_search",
                    "vector_store_ids": [VECTOR_STORE_ID]
                }],
                include=["file_search_call.results"]
            )
            
            # Save the response ID for conversation continuity
            previous_response_id = response.id
            
            # Print the assistant's response
            console.print("\n[Assistant]:", style="bold blue")
            print_markdown(response.output_text)
            
            # If file search was used, print the retrieved results as citations.
            # File search calls are items in response.output with type
            # "file_search_call"; their results are populated because of the
            # include=["file_search_call.results"] argument above.
            search_calls = [item for item in response.output if item.type == "file_search_call"]
            if any(call.results for call in search_calls):
                console.print("\n[Citations]:", style="bold yellow")
                for call in search_calls:
                    for i, result in enumerate(call.results or [], 1):
                        console.print(f"[{i}] File: {result.filename}")
                        # markup=False stops square brackets in the excerpt being read as Rich tags
                        console.print(f"    Excerpt: {(result.text or '')[:100]}...", markup=False)
        
        except Exception as e:
            console.print(f"\n[bold red]Error: {str(e)}[/bold red]")
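
The citation step can also be factored into a small helper that can be exercised without calling the API. This is a sketch under stated assumptions: it assumes that file search calls appear in `response.output` as items of type "file_search_call", each with a `results` list whose entries carry `filename` and `text`. The helper name `collect_citations` is hypothetical, and the stand-in response below uses placeholder values, not real API output.

```python
from types import SimpleNamespace


def collect_citations(response, excerpt_length=100):
    """Return (filename, excerpt) pairs from file search results.

    Assumes the response shape described above; missing fields are
    tolerated so the chat loop does not crash on an unexpected response.
    """
    citations = []
    for item in getattr(response, "output", None) or []:
        if getattr(item, "type", None) != "file_search_call":
            continue
        for result in getattr(item, "results", None) or []:
            filename = getattr(result, "filename", None) or "(unknown file)"
            text = getattr(result, "text", None) or ""
            citations.append((filename, text[:excerpt_length]))
    return citations


# Stand-in object with placeholder values, used only to show the shape.
fake_response = SimpleNamespace(output=[
    SimpleNamespace(type="file_search_call", results=[
        SimpleNamespace(filename="example.txt", text="Placeholder excerpt text."),
    ]),
    SimpleNamespace(type="message"),
])

for number, (filename, excerpt) in enumerate(collect_citations(fake_response), 1):
    print(f"[{number}] File: {filename}")
    print(f"    Excerpt: {excerpt}...")
```

Keeping the extraction separate from the printing makes it easier to check the field names against the installed SDK version before relying on them in the chat loop.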

Verifying the API key:

In [None]:
if __name__ == "__main__":
    # Check if the API key is valid
    try:
        client.models.list()
        chat_with_file_search()
    except Exception as e:
        console.print(f"[bold red]Failed to initialize: {str(e)}[/bold red]")
        console.print("[bold yellow]Make sure your API key is correct and has access to the file search feature.[/bold yellow]")

The complete script is shown below, followed by two example prompts entered during a chat session in a terminal-based environment:

In [1]:
import os
import openai
from rich.console import Console
from rich.markdown import Markdown

# Initialize the OpenAI client
openai_api_key = os.environ.get("OPENAI_API_KEY")
if not openai_api_key:
    openai_api_key = input("Please enter your OpenAI API key: ")
    os.environ["OPENAI_API_KEY"] = openai_api_key

client = openai.OpenAI(api_key=openai_api_key)
console = Console()

# Vector store ID (from the provided documentation)
VECTOR_STORE_ID = "vs_6801d393c83c819184cad189cc621a23"

def print_markdown(text):
    """Print text as markdown."""
    console.print(Markdown(text))

def chat_with_file_search():
    """Main chat loop with file search capabilities."""
    console.print("[bold green]File Search Enabled Chat[/bold green]")
    console.print("[italic]Type 'exit' to quit the chat[/italic]")
    console.print()
    
    previous_response_id = None
    
    while True:
        # Get user input
        user_input = input("\n[You]: ")
        
        if user_input.lower() == 'exit':
            console.print("[bold green]Thank you for chatting![/bold green]")
            break
        
        try:
            # Create a response with file search tool
            response = client.responses.create(
                model="gpt-4o-mini",
                input=user_input,
                previous_response_id=previous_response_id,
                tools=[{
                    "type": "file_search",
                    "vector_store_ids": [VECTOR_STORE_ID]
                }],
                include=["file_search_call.results"]
            )
            
            # Save the response ID for conversation continuity
            previous_response_id = response.id
            
            # Print the assistant's response
            console.print("\n[Assistant]:", style="bold blue")
            print_markdown(response.output_text)
            
            # If file search was used, print the retrieved results as citations.
            # File search calls are items in response.output with type
            # "file_search_call"; their results are populated because of the
            # include=["file_search_call.results"] argument above.
            search_calls = [item for item in response.output if item.type == "file_search_call"]
            if any(call.results for call in search_calls):
                console.print("\n[Citations]:", style="bold yellow")
                for call in search_calls:
                    for i, result in enumerate(call.results or [], 1):
                        console.print(f"[{i}] File: {result.filename}")
                        # markup=False stops square brackets in the excerpt being read as Rich tags
                        console.print(f"    Excerpt: {(result.text or '')[:100]}...", markup=False)
        
        except Exception as e:
            console.print(f"\n[bold red]Error: {str(e)}[/bold red]")

if __name__ == "__main__":
    # Check if the API key is valid
    try:
        client.models.list()
        chat_with_file_search()
    except Exception as e:
        console.print(f"[bold red]Failed to initialize: {str(e)}[/bold red]")
        console.print("[bold yellow]Make sure your API key is correct and has access to the file search feature.[/bold yellow]")

Can you describe how hearing loss is measured?

what are some positive news today?

After concluding the chat session, total usage amounted to 87,923 tokens, at a cost of approximately $0.03, broken down into the following spend categories:

    - LLM utilisation (GPT-4o mini), input          = $0.007
    - LLM utilisation (GPT-4o mini), output         = $0.001
    - LLM utilisation (GPT-4o mini), cached input   = $0.003
    - File search tool calls                        = $0.015
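
As a quick check of the figures above, the categories can be summed directly. This sketch uses only the amounts listed in this document; the dictionary keys are illustrative labels, and `Decimal` is used to avoid floating-point rounding noise.

```python
from decimal import Decimal

# Spend per category in US dollars, as listed above.
spend = {
    "input": Decimal("0.007"),
    "output": Decimal("0.001"),
    "cached_input": Decimal("0.003"),
    "file_search_tool_calls": Decimal("0.015"),
}

total = sum(spend.values(), Decimal("0"))
rounded_total = total.quantize(Decimal("0.01"))
file_search_share = spend["file_search_tool_calls"] / total

print(f"Total: ${total} (about ${rounded_total})")
print(f"File search share of spend: {file_search_share:.0%}")
```

The categories sum to $0.026, which rounds to the reported $0.03, and the file search tool calls account for more than half of the spend in this session.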
