**Prompt Engineering: OpenAI API Key Integration**

In [1]:
# Install version 0.28.0 of the OpenAI Python library (if not already installed)
!pip install openai==0.28.0

Collecting openai==0.28.0
  Downloading openai-0.28.0-py3-none-any.whl.metadata (13 kB)
Downloading openai-0.28.0-py3-none-any.whl (76 kB)
Installing collected packages: openai
  Attempting uninstall: openai
    Found existing installation: openai 1.57.4
    Uninstalling openai-1.57.4:
      Successfully uninstalled openai-1.57.4
Successfully installed openai-0.28.0


In [2]:
# Import the OpenAI library
import openai

# Assign your OpenAI API key
from google.colab import userdata
openai.api_key = userdata.get('openai_api_key')

# Select the large language model (LLM) to use by uncommenting one of the options below
language_model = "o1"                 # model supports at most 100,000 completion tokens
# language_model = "o1-mini"            # model supports at most 65,536 completion tokens
# language_model = "o1-preview"         # model supports at most 32,768 completion tokens

# language_model = "gpt-3.5-turbo"      # model supports at most 4,096 completion tokens
# language_model = "gpt-3.5-turbo-16k"  # model supports at most 16,384 completion tokens

# language_model = "gpt-4-turbo"        # model supports at most 4,096 completion tokens
# language_model = "gpt-4"              # model supports at most 8,192 completion tokens
# language_model = "gpt-4o"             # model supports at most 16,384 completion tokens
# language_model = "gpt-4-32k"          # model supports at most 32,768 completion tokens
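
The completion-token limits noted in the comments above can also be kept in a lookup table, so that a requested output size is capped before a call is made. This is a minimal sketch, not part of the original notebook: `COMPLETION_TOKEN_LIMITS` and `clamp_completion_tokens` are hypothetical names, and the values are simply copied from the comments in the cell above.

```python
# Completion-token ceilings copied from the comments in the model-selection cell.
# (Hypothetical table; update it if the limits change.)
COMPLETION_TOKEN_LIMITS = {
    "o1": 100_000,
    "o1-mini": 65_536,
    "o1-preview": 32_768,
    "gpt-3.5-turbo": 4_096,
    "gpt-3.5-turbo-16k": 16_384,
    "gpt-4-turbo": 4_096,
    "gpt-4": 8_192,
    "gpt-4o": 16_384,
    "gpt-4-32k": 32_768,
}


def clamp_completion_tokens(model, requested):
    """Return `requested`, capped at the limit recorded for `model`."""
    if model not in COMPLETION_TOKEN_LIMITS:
        raise ValueError(f"No token limit recorded for model {model!r}")
    return min(requested, COMPLETION_TOKEN_LIMITS[model])
```

For example, `clamp_completion_tokens("gpt-4", 15000)` yields the recorded limit for `gpt-4` rather than the oversized request.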

In [3]:
# Define a function to interact with OpenAI's Chat API (LLM supported: o1)
def ask_openai_o1(prompt, model=language_model):
    try:
        response = openai.ChatCompletion.create(
            model=model,
            messages=[
                {"role": "system", "content": "You are a helpful assistant."},
                {"role": "user", "content": prompt},
            ],
            max_completion_tokens=60000  # parameter name compatible with the o1 model
        )
        return response.choices[0].message['content'].strip()
    except Exception as e:
        return f"An error occurred: {e}"

In [4]:
# Define a function to interact with OpenAI's Chat API (LLMs supported: o1-mini, o1-preview)
def ask_openai_o1_series(prompt, model=language_model):
    try:
        response = openai.ChatCompletion.create(
            model=model,
            messages=[
                {
                    "role": "user",
                    "content": (
                        "You are a helpful assistant. "  # Incorporate 'system' role content here
                        + prompt
                    )
                }
            ],
            max_completion_tokens=32000  # parameter name compatible with o1-mini and o1-preview
        )
        return response.choices[0].message['content'].strip()
    except Exception as e:
        return f"An error occurred: {e}"

In [5]:
# Define a function to interact with OpenAI's Chat API (LLMs supported: 3.5+, 4+ and 4o)
def ask_openai_3_and_4_series(prompt, model=language_model):
    try:
        response = openai.ChatCompletion.create(
            model=model,
            messages=[
                {"role": "system", "content": "You are a helpful assistant."},
                {"role": "user", "content": prompt},
            ],
            max_tokens=4096,  # parameter name compatible with 3.5+, 4+ and 4o; 4,096 fits every model listed above (15,000 exceeds most of their limits)
            temperature=0.5
        )
        return response.choices[0].message['content'].strip()
    except Exception as e:
        return f"An error occurred: {e}"
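
The three functions above differ only in how they shape the request: whether a `system` message is sent, and whether the output cap is named `max_completion_tokens` or `max_tokens`. As a rough sketch of how that choice could be made from the model name alone, the hypothetical helper below (`build_chat_request` is not part of the original notebook) builds the keyword arguments without calling the API, mirroring the three cases exactly as the notebook defines them.

```python
SYSTEM_PROMPT = "You are a helpful assistant."


def build_chat_request(prompt, model, max_output_tokens):
    """Build keyword arguments for a chat request (hypothetical helper).

    Mirrors the three notebook functions:
      - "o1": system + user messages, max_completion_tokens
      - "o1-mini" / "o1-preview": system text folded into the user message,
        max_completion_tokens
      - anything else (3.5+, 4+, 4o): system + user messages, max_tokens
        and temperature
    """
    if model in ("o1-mini", "o1-preview"):
        # Fold the system text into the user message, as in ask_openai_o1_series.
        messages = [{"role": "user", "content": SYSTEM_PROMPT + " " + prompt}]
        return {
            "model": model,
            "messages": messages,
            "max_completion_tokens": max_output_tokens,
        }

    messages = [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": prompt},
    ]
    if model == "o1":
        return {
            "model": model,
            "messages": messages,
            "max_completion_tokens": max_output_tokens,
        }
    return {
        "model": model,
        "messages": messages,
        "max_tokens": max_output_tokens,
        "temperature": 0.5,
    }
```

With the 0.28.0 library installed earlier, the result could then be passed along as `openai.ChatCompletion.create(**build_chat_request(prompt, language_model, 4096))`, which keeps a single call site for all the models listed.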

In [6]:
# Simple prompt engineering

"""
Objective: Develop a Python-based solution that automates the process of fetching all Excel files with the same structure
(e.g., identical column names and formats) from a specified folder path. The solution should concatenate these files into a single DataFrame,
preserving the structure and ensuring no data is lost.
"""

prompt = """

Task:
Create a Python script that takes an input folder path containing Excel files with an identical structure and concatenates the data from all the files into one comprehensive DataFrame.
________________________________________
Inputs:
  •	Input Folder Path: The folder containing the Excel files to be processed.
  •	Excel File Structure: All files have the same column names and formats.
Expected Output:
  •	A single consolidated DataFrame containing the combined data from all the Excel files.
________________________________________

Persona:
You are a Python developer specializing in data automation. Your expertise lies in handling structured datasets, writing efficient scripts, and ensuring robustness and scalability.
________________________________________

Format:
Provide the Python script with detailed comments and instructions for execution. Ensure that the script:
  •	Reads all Excel files from the specified folder.
  •	Verifies that all files have the same structure.
  •	Concatenates the files into one DataFrame.
________________________________________
Important Notes:
1.	File Structure Validation:
  •	Ensure all Excel files in the folder have the same column structure; otherwise, concatenation may fail.
2.	Error Handling:
  •	The script must include error messages for missing files, empty folders, and other issues to aid debugging.
3.	Scalability:
  •	Optimize for handling multiple files and large datasets, with options for saving results in different formats.

"""

result = ask_openai_o1(prompt)
print("Response from OpenAI:\n\n", result)

Response from OpenAI:

 #!/usr/bin/env python3
"""
Script Name: excel_concatenator.py

Description:
    This script reads all Excel files from a specified folder, verifies that they have the same
    column structure, and concatenates them into a single pandas DataFrame. The script includes
    error handling for issues such as:
        - Invalid folder paths
        - Empty folders
        - Mismatched file structures
        - Large dataset handling

Usage:
    1. Ensure that Python 3.x is installed on your system.
    2. Install required packages: pandas and openpyxl (for .xlsx files).
       Example: pip install pandas openpyxl
    3. Run the script from a terminal or command prompt:
       python excel_concatenator.py /path/to/excel/folder
    4. The script will print out status messages and either:
       - Display a sample of the combined DataFrame.
       - Display an error message if an issue occurs.
       - Optionally save the DataFrame to a CSV or Excel file, if desired (co