In [15]:
import asyncio
import os
from openai import AsyncOpenAI
import pandas as pd
from jinja2 import Template

In [16]:
from dotenv import load_dotenv
load_dotenv()
key = os.getenv("OPENAI_API_KEY")

In [80]:
queries_list = ["Executive recruitment for an Operations Director in the Maghreb to lead production and manage regional operations.",
                "Executive search for a Regional Director of Development in the Pacific Islands to support economic growth initiatives.",
                "Searching for a Regional General Manager in the Lesser Antilles to oversee business operations and growth strategies.",
                "Generate a precise list of individuals currently serving as President, CEO, or COO in companies similar to Jawbone within the wearable technology sector. These companies focus on products such as fitness trackers, smartwatches, health monitoring devices, or other wearable electronics but do not necessarily have to manufacture them. The individuals should be based in the United States or Europe and hold an MBA degree. Exclude individuals from companies not focused on wearable technology. Precision is important",]

In [166]:
USE_CASE_GENERATION_SYSTEM = """You are an AI assistant tasked with analyzing instruction statements and converting them into clear, comprehensive use cases. You should break down complex requirements into understandable components and provide practical examples. Focus on clarity, precision, and actionable insights."""
USE_CASE_GENERATION_USER = """
Please analyze this instruction statement and create a detailed use case that explains:

1. The system's primary objective
2. The rules and constraints for processing the information
3. The expected outcomes

Write the use case in an "As a system..." format and include specific examples demonstrating correct implementation.

Instruction Statement:
{{INSTRUCTION}}
"""

In [184]:
QUERY_GENERATION_PROMPT_SYSTEM = """You are an AI assistant tasked with generating recruitment queries based on a given use case. Your goal is to create a set of queries that reflect the original, unprocessed state of job titles and related information as mentioned in the use case, before any normalization or standardization occurs.""" 
QUERY_GENERATION_PROMPT_USER = """
Here is the use case you should base your queries on:

{{USE_CASE}}

Analyze the use case carefully, paying attention to:
1. The types of job titles mentioned
2. Any variations or non-standard forms of job titles
3. Additional information included with job titles (e.g., company names, locations)
4. Different seniority levels or specializations mentioned

Generate 10 diverse recruitment queries that represent the "before" state of the data, as it would appear prior to being processed by the job title normalization system described in the use case. Your queries should:

1. Include the non-standardized or varied forms of job titles
2. Incorporate any additional information (like company names or locations) that the system is meant to remove
3. Reflect different aspects mentioned in the use case (e.g., different seniority levels, specializations)
4. Mimic realistic search queries a recruiter might use before data normalization

Remember, the goal is to create queries that showcase the type of data the normalization system is designed to process, not the output of the system.

Provide your response as a JSON list of strings, without any additional explanation or HTML tags.

Each query should be on a new line and enclosed in double quotes, with a comma at the end (except for the last query). Ensure that the queries are diverse and cover different aspects mentioned in the use case."""

In [121]:
NER_MANAGEMENT_LEVEL_TITLE_SYSTEM_PROMPT_ORIGINAL = """
You are an intelligent assistant dedicated to extracting management levels and job titles from user queries. Before doing so, you must understand what a functional area is.
"""

NER_MANAGEMENT_LEVEL_TITLE_USER_PROMPT_ORIGINAL = """

Instructions:
1. Management Levels: Only return management levels that match the predefined set: ["Partners", "Founder or Co-founder", "Board of Directors", "CSuite/Chiefs", "Executive VP or Sr. VP", "General Manager", "Manager", "Senior Partner", "Junior Partner", "VP", "Director", "Senior (All Senior-Level Individual Contributors)", "Mid (All Mid-Level Individual Contributors)", "Junior (All Junior-Level Individual Contributors)"]. MANAGEMENT LEVELS CAN ONLY COME FROM THIS PREDEFINED SET, NOTHING ELSE.
2. Job Titles: Normalize the job title after extracting it from the text. For example, convert "ceo" to "Chief Executive Officer" and always include both the full title and its abbreviation (confirmed ones), e.g., "VP of Engineering" and "Vice President of Engineering", or "Chief Innovation Officer" and "CINO". ENSURE LOGICAL and EXACT job titles such as 'Architect', NOT 'Architect who is skilled in VR'. Job titles MUST BE CONCISE AND TO THE POINT and shouldn't include company names or region names. Do not otherwise change the title when normalizing.
3. Response Format: Your response must be a dictionary with two keys: "management_levels" and "titles". Each key should have a list of management levels and titles respectively.
4. Identify the key phrases in a prompt. A key phrase is a title and its function, IF THE function is mentioned. If the function is mentioned, the key phrase is classified as a "Job Title". For example, "CEOs working in Automotive Industry and VP of Engineering of Microsoft" has "CEOs" and "VP of Engineering" as key phrases. In this case "VP" ALONE CANNOT be considered a key phrase. ONLY CONSIDER THE KEY PHRASES MENTIONED, DO NOT ASSUME. Past and current designations don't matter.
5. Check whether the KEY PHRASE should be classified as a Job Title or a Management Level. IT SHOULD NEVER BE CLASSIFIED INTO BOTH. This is ESSENTIAL.
6. If a key phrase is classified as title, don't include it in the management levels. For example, if "VP of Engineering" is classified as title then don't include "VP" in management levels. Industry or company names will not be included in job titles.
7. If a key phrase is classified as a management level, don't include it in the title. For example, if "Vice Presidents" is classified as management level then don't include "Vice Presidents" in titles. A job title will ONLY be a title and its business function. No other DETAIL should be added. A Management Level cannot COME FROM WITHIN A JOB TITLE.
8. Remember: One instance of a key phrase should be considered for either management level or job title, not both. Each will fall either into management levels or job title but WILL NEVER FALL INTO BOTH. A KEY PHRASE CANNOT BE IN MANAGEMENT LEVELS AND TITLE, BOTH. A Management Level cannot COME FROM WITHIN A JOB TITLE.
9. If the word 'executive' is mentioned, specific considerations should be taken into account.

Take a deep breath and understand.
Query: "Give me VPs working in Microsoft": # VPs is the KEY PHRASE. It will be evaluated as a whole key phrase ONLY. It cannot be classified into a management level and a title both.
Management Level Focus: In this query, "VPs" should refer to individuals at the management level of Vice President within Microsoft. This means you are asking for a list of people who occupy the VP rank across various departments or divisions within the company. "VPs" covers the complete domain of 'VP' in the management levels. The emphasis is on their standing in the organizational hierarchy, regardless of their specific job titles. A management level will only be selected if the key phrase covers its complete domain in the predefined set.
Job Title Focus: If you were asking about "VPs" in terms of job titles, you'd be interested in individuals whose specific title is "VP" of a certain business function, such as "VP of Marketing" (Marketing is a business function) or "VP of Engineering" (Engineering is a business function). If a function is clearly mentioned then it would be JOB TITLE. "VP of Microsoft" does NOT have a function (Microsoft is an organization) neither does "Automotive VPs" (Automotive can ONLY be an industry). Identify the BUSINESS functions accurately. Then they CANNOT come under management levels.
Output: {"management_levels": ["VP"], "titles": []}

Query: "The CFOs working in google or facebook": #CFOs is a KEY PHRASE. It will be evaluated as a whole key phrase ONLY. It cannot be classified into a management level and a title both.
Management Level Focus: In this query, "CFOs" does not cover the complete 'C-Suite' domain. ONLY IF THE COMPLETE DOMAIN IS COVERED will the key phrase be a management level. One job title, even if it is at the top of the hierarchy, does not cover the complete domain. If a user wants all 'executives' without any business function specified, then three management levels are covered, namely "CSuite/Chiefs", "Executive VP or Sr. VP" and "VP", so all MUST be returned. However, if the word 'executive' is mentioned in relation to a business function, only titles specific to that function should be included. For example, if 'Marketing Executives' is mentioned, titles such as 'CMO', 'Chief Marketing Officer', 'Senior VP of Marketing', 'Senior Vice President of Marketing', 'VP of Marketing', and 'Vice President of Marketing' should be included. The word 'executive' or 'executives' would, thus, NEVER be included as either a job title or a management level.
Job Title Focus: As a CFO would only be a chief in finance, the CFO being discussed here comes under job title, not management level.
Output: {"management_levels": [], "titles": ["CFO", "Chief Financial Officer"]} # Job titles MUST BE CONCISE and TO THE POINT, mentioning ONLY the TITLE and the BUSINESS FUNCTION if the business function is given. No added details, such as company name or group.

If terms like 'leader', 'expert', 'specialist', or similar are mentioned, extract a maximum of 2-3 relevant job titles associated with those terms based on the context, focusing on the most appropriate leadership or expertise roles.

For each management level and title, also tell why you put it there. If a business function can be clearly identified, the key phrase will be a JOB title. ALWAYS make the necessary changes when the word 'executive' or 'leader' or 'expert', etc., is mentioned in the user query and get LOGICAL titles. Management level of 'Manager' will not be chosen when a specific type of 'manager' (senior managers, project manager, etc.) are asked for. ONLY identify and consider complete key phrases EXPLICITLY MENTIONED IN THE PROMPT, and each key phrase will either be in management level or title, NEVER consider THE SAME KEY PHRASE for BOTH. Evaluate key phrases separately.
Always return a JSON object in your output.

User Query: {{QUERY}}
Let's think step by step about each key phrase."""
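
The rules in the prompt above (levels only from the predefined set, both keys present as lists, no key phrase under both keys) can also be checked mechanically on each model response. The following is a minimal sketch, not part of the original notebook: `validate_extraction` and `ALLOWED_MANAGEMENT_LEVELS` are hypothetical names, and the set is one reading of the list given in instruction 1.

```python
# Hypothetical helper, not part of the original notebook.
# The set below is an assumed reading of the predefined list in the prompt.
ALLOWED_MANAGEMENT_LEVELS = {
    "Partners",
    "Founder or Co-founder",
    "Board of Directors",
    "CSuite/Chiefs",
    "Executive VP or Sr. VP",
    "General Manager",
    "Manager",
    "Senior Partner",
    "Junior Partner",
    "VP",
    "Director",
    "Senior (All Senior-Level Individual Contributors)",
    "Mid (All Mid-Level Individual Contributors)",
    "Junior (All Junior-Level Individual Contributors)",
}


def validate_extraction(result):
    """Return a list of rule violations for one parsed response (empty if none)."""
    if not isinstance(result, dict):
        return ["response must be a dictionary"]

    # Both keys must be present and hold lists.
    problems = []
    for key in ("management_levels", "titles"):
        if not isinstance(result.get(key), list):
            problems.append(f"'{key}' must be a list")
    if problems:
        return problems

    # Management levels may only come from the predefined set.
    for level in result["management_levels"]:
        if level not in ALLOWED_MANAGEMENT_LEVELS:
            problems.append(f"unknown management level: {level!r}")

    # The same key phrase must not appear under both keys.
    titles_lower = {str(title).lower() for title in result["titles"]}
    for level in result["management_levels"]:
        if str(level).lower() in titles_lower:
            problems.append(f"{level!r} appears as both management level and title")

    return problems
```

Applied to the two example outputs in the prompt, the function should return an empty list; a response with a level such as "Executive" would be flagged.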


In [145]:

# NER_MANAGEMENT_LEVEL_TITLE_SYSTEM_PROMPT = """You are an AI assistant tasked with extracting management levels and job titles from a given query. Your goal is to analyze the query, identify relevant key phrases, and categorize them appropriately as either management levels or job titles.
# """

# NER_MANAGEMENT_LEVEL_TITLE_USER_PROMPT = """
# Here is the query you need to analyze:
# <query>
# {{QUERY}}
# </query>

# Follow these steps to extract management levels and titles:

# 1. Management Levels: Only return management levels that match the predefined set: ["Partners", "Founder or Co-founder", "Board of Directors", "CSuite/Chiefs", "Executive VP or Sr. VP", "General Manager", "Manager", "Senior Partner", "Junior Partner", "VP", "Director", "Senior (All Senior-Level Individual Contributors)", "Mid (All Mid-Level Individual Contributors)", "Junior (All Junior-Level Individual Contributors)"]. MANAGEMENT LEVELS CAN ONLY COME FROM THIS PREDEFINED SET, NOTHING ELSE.
# 2. Job Titles: Normalize the job title after extracting it from the text. For example, convert "ceo" to "Chief Executive Officer" and always include both the full title and its abbreviation (confirmed ones), e.g., "VP of Engineering" and "Vice President of Engineering", or "Chief Innovation Officer" and "CINO". ENSURE LOGICAL and EXACT job titles such as 'Architect', NOT 'Architect who is skilled in VR'. Job titles MUST BE CONCISE AND TO THE POINT and shouldn't include company names or region names. Do not otherwise change the title when normalizing.
# 3. Response Format: Your response must be a dictionary with two keys: "management_levels" and "titles". Each key should have a list of management levels and titles respectively.
# 4. Identify the key phrases in a prompt. A key phrase is a title and its function, IF THE function is mentioned. If the function is mentioned, the key phrase is classified as a "Job Title". For example, "CEOs working in Automotive Industry and VP of Engineering of Microsoft" has "CEOs" and "VP of Engineering" as key phrases. In this case "VP" ALONE CANNOT be considered a key phrase. ONLY CONSIDER THE KEY PHRASES MENTIONED, DO NOT ASSUME. Past and current designations don't matter.
# 5. Check whether the KEY PHRASE should be classified as a Job Title or a Management Level. IT SHOULD NEVER BE CLASSIFIED INTO BOTH. This is ESSENTIAL.
# 6. If a key phrase is classified as title, don't include it in the management levels. For example, if "VP of Engineering" is classified as title then don't include "VP" in management levels. Industry or company names will not be included in job titles.
# 7. If a key phrase is classified as a management level, don't include it in the title. For example, if "Vice Presidents" is classified as management level then don't include "Vice Presidents" in titles. A job title will ONLY be a title and its business function. No other DETAIL should be added. A Management Level cannot COME FROM WITHIN A JOB TITLE.
# 8. Remember: One instance of a key phrase should be considered for either management level or job title, not both. Each will fall either into management levels or job title but WILL NEVER FALL INTO BOTH. A KEY PHRASE CANNOT BE IN MANAGEMENT LEVELS AND TITLE, BOTH. A Management Level cannot COME FROM WITHIN A JOB TITLE.
# 9. If the word 'executive' is mentioned, specific considerations should be taken into account.

# Take a deep breath and understand.
# Query: "Give me VPs working in Microsoft": # VPs is the KEY PHRASE. It will be evaluated as a whole key phrase ONLY. It cannot be classified into a management level and a title both.
# Management Level Focus: In this query, "VPs" should refer to individuals at the management level of Vice President within Microsoft. This means you are asking for a list of people who occupy the VP rank across various departments or divisions within the company. "VPs" covers the complete domain of 'VP' in the management levels. The emphasis is on their standing in the organizational hierarchy, regardless of their specific job titles. A management level will only be selected if the key phrase covers its complete domain in the predefined set.
# Job Title Focus: If you were asking about "VPs" in terms of job titles, you'd be interested in individuals whose specific title is "VP" of a certain business function, such as "VP of Marketing" (Marketing is a business function) or "VP of Engineering" (Engineering is a business function). If a function is clearly mentioned then it would be JOB TITLE. "VP of Microsoft" does NOT have a function (Microsoft is an organization) neither does "Automotive VPs" (Automotive can ONLY be an industry). Identify the BUSINESS functions accurately. Then they CANNOT come under management levels.
# Output: {"management_levels": ["VP"], "titles": []}

# Query: "The CFOs working in google or facebook": #CFOs is a KEY PHRASE. It will be evaluated as a whole key phrase ONLY. It cannot be classified into a management level and a title both.
# Management Level Focus: In this query, "CFOs" does not cover the complete 'C-Suite' domain. ONLY IF THE COMPLETE DOMAIN IS COVERED will the key phrase be a management level. One job title, even if it is at the top of the hierarchy, does not cover the complete domain. If a user wants all 'executives' without any business function specified, then three management levels are covered, namely "CSuite/Chiefs", "Executive VP or Sr. VP" and "VP", so all MUST be returned. However, if the word 'executive' is mentioned in relation to a business function, only titles specific to that function should be included. For example, if 'Marketing Executives' is mentioned, titles such as 'CMO', 'Chief Marketing Officer', 'Senior VP of Marketing', 'Senior Vice President of Marketing', 'VP of Marketing', and 'Vice President of Marketing' should be included. The word 'executive' or 'executives' would, thus, NEVER be included as either a job title or a management level.
# Job Title Focus: As a CFO would only be a chief in finance, the CFO being discussed here comes under job title, not management level.
# Output: {"management_levels": [], "titles": ["CFO", "Chief Financial Officer"]} # Job titles MUST BE CONCISE and TO THE POINT, mentioning ONLY the TITLE and the BUSINESS FUNCTION if the business function is given. No added details, such as company name or group.

# If terms like 'leader', 'expert', 'specialist', or similar are mentioned, extract a maximum of 2-3 relevant job titles associated with those terms based on the context, focusing on the most appropriate leadership or expertise roles.

# For each management level and title, also tell why you put it there. If a business function can be clearly identified, the key phrase will be a JOB title. ALWAYS make the necessary changes when the word 'executive' or 'leader' or 'expert', etc., is mentioned in the user query and get LOGICAL titles. Management level of 'Manager' will not be chosen when a specific type of 'manager' (senior managers, project manager, etc.) are asked for. ONLY identify and consider complete key phrases EXPLICITLY MENTIONED IN THE PROMPT, and each key phrase will either be in management level or title, NEVER consider THE SAME KEY PHRASE for BOTH. Evaluate key phrases separately.

# After analyzing the query, generate two outputs:

# 1. A reasoning paragraph that explains your thought process step-by-step. Include:
#    - Identification of key phrases
#    - Evaluation of each key phrase (management level or job title)
#    - Reasoning behind your classifications
#    - Any special considerations (e.g., handling of 'executive' or 'leader' terms)
#    - Explanation of domain coverage for management levels
#    - Keep this moderate, not too long and not too concise
#    - While writing the reasoning, refrain from using I and addressing yourself.

# 2. A JSON object with two keys: "management_levels" and "titles". Each key should have a list of extracted management levels and titles respectively.   

# Present your output in the following format:


# <rationale>
# Your step-by-step reasoning and rationale paragraph goes here.
# </rationale>

# <json_output>
# {
#   "management_levels": [...],
#   "titles": [...]
# }
# </json_output>


# Remember to adhere strictly to the guidelines provided, especially regarding the classification of key phrases and the handling of special terms like 'executive'.
# """


In [271]:
async def chatgpt_response_a(query, temperature=0.7, model="gpt-4o-mini", **kwargs):
    """
    Run the management-level/title extraction prompt on a single query.

    Args:
        query (str): The user query to analyze.
        temperature (float, optional): Sampling temperature. Defaults to 0.7.
        model (str, optional): The model to use. Defaults to "gpt-4o-mini".
        **kwargs: Extra parameters forwarded to the chat completions call.

    Returns:
        dict: The chat completion response converted to plain nested dicts.
    """
    # user_message = Template(NER_MANAGEMENT_LEVEL_TITLE_USER_PROMPT).render({"QUERY" : query})
    user_message = Template(NER_MANAGEMENT_LEVEL_TITLE_USER_PROMPT_ORIGINAL).render({"QUERY": query})
    messages = [
        # {"role": "system", "content": NER_MANAGEMENT_LEVEL_TITLE_SYSTEM_PROMPT},
        {"role": "system", "content": NER_MANAGEMENT_LEVEL_TITLE_SYSTEM_PROMPT_ORIGINAL},
        {"role": "user", "content": f"User Query: {user_message}"},
    ]
    openai_object = {
        "model": model,
        "messages": messages,
        "temperature": temperature,
    }
    # Caller-supplied keyword arguments override the defaults above.
    openai_object.update(kwargs)

    aclient = AsyncOpenAI(api_key=key)
    response = await aclient.chat.completions.create(**openai_object)
    # The SDK returns a pydantic model; model_dump() yields nested dicts,
    # so callers can index response["choices"][0]["message"]["content"].
    return response.model_dump()



async def generate_queries_a(query, temperature=0.7, model="gpt-4o-mini", **kwargs):
    """
    Generate sample recruitment queries from a use case description.

    Args:
        query (str): The use case text rendered into QUERY_GENERATION_PROMPT_USER.
        temperature (float, optional): Sampling temperature. Defaults to 0.7.
        model (str, optional): The model to use. Defaults to "gpt-4o-mini".
        **kwargs: Extra parameters forwarded to the chat completions call.

    Returns:
        dict: The chat completion response converted to plain nested dicts.
    """
    user_message = Template(QUERY_GENERATION_PROMPT_USER).render({"USE_CASE": query})
    messages = [
        {"role": "system", "content": QUERY_GENERATION_PROMPT_SYSTEM},
        {"role": "user", "content": f"User Query: {user_message}"},
    ]
    openai_object = {
        "model": model,
        "messages": messages,
        "temperature": temperature,
    }
    # Caller-supplied keyword arguments override the defaults above.
    openai_object.update(kwargs)

    aclient = AsyncOpenAI(api_key=key)
    response = await aclient.chat.completions.create(**openai_object)
    # The SDK returns a pydantic model; model_dump() yields nested dicts,
    # so callers can index response["choices"][0]["message"]["content"].
    return response.model_dump()




async def generate_usecase_a(query, temperature=0.7, model="gpt-4o-mini", **kwargs):
    """
    Turn a single instruction statement into a detailed use case.

    Args:
        query (str): The instruction text rendered into USE_CASE_GENERATION_USER.
        temperature (float, optional): Sampling temperature. Defaults to 0.7.
        model (str, optional): The model to use. Defaults to "gpt-4o-mini".
        **kwargs: Extra parameters forwarded to the chat completions call.

    Returns:
        dict: The chat completion response converted to plain nested dicts.
    """
    user_message = Template(USE_CASE_GENERATION_USER).render({"INSTRUCTION": query})
    messages = [
        {"role": "system", "content": USE_CASE_GENERATION_SYSTEM},
        {"role": "user", "content": f"User Query: {user_message}"},
    ]
    openai_object = {
        "model": model,
        "messages": messages,
        "temperature": temperature,
    }
    # Caller-supplied keyword arguments override the defaults above.
    openai_object.update(kwargs)

    aclient = AsyncOpenAI(api_key=key)
    response = await aclient.chat.completions.create(**openai_object)
    # The SDK returns a pydantic model; model_dump() yields nested dicts,
    # so callers can index response["choices"][0]["message"]["content"].
    return response.model_dump()

In [257]:
instruction1 = """Job Titles: Normalize the job title after extracting it from the text. For example, convert "ceo" to "Chief Executive Officer" and always include both the full title and its abbreviation (confirmed ones), e.g., "VP of Engineering" and "Vice President of Engineering", or "Chief Innovation Officer" and "CINO". ENSURE LOGICAL and EXACT job titles such as 'Architect', NOT 'Architect who is skilled in VR'. Job titles MUST BE CONCISE AND TO THE POINT and shouldn't include company names or region names. Do not otherwise change the title when normalizing."""
instruction2 = """Identify the key phrases in a prompt. A key phrase is a title and its function, IF THE function is mentioned. If the function is mentioned, the key phrase is classified as a "Job Title". For example, "CEOs working in Automotive Industry and VP of Engineering of Microsoft" has "CEOs" and "VP of Engineering" as key phrases. In this case "VP" ALONE CANNOT be considered a key phrase. ONLY CONSIDER THE KEY PHRASES MENTIONED, DO NOT ASSUME. Past and current designations don't matter."""
instruction3 = """Check whether the KEY PHRASE should be classified as a Job Title or a Management Level. IT SHOULD NEVER BE CLASSIFIED INTO BOTH. This is ESSENTIAL.
If a key phrase is classified as title, don't include it in the management levels. For example, if "VP of Engineering" is classified as title then don't include "VP" in management levels. Industry or company names will not be included in job titles.
If a key phrase is classified as a management level, don't include it in the title. For example, if "Vice Presidents" is classified as management level then don't include "Vice Presidents" in titles. A job title will ONLY be a title and its business function. No other DETAIL should be added. A Management Level cannot COME FROM WITHIN A JOB TITLE.
"""
instruction4 = """Management level of 'Manager' will not be chosen when a specific type of 'manager' (senior managers, project manager, etc.) is asked for."""
instruction5 = """If terms like 'leader', 'expert', 'specialist', or similar are mentioned, extract a maximum of 2-3 relevant job titles. ALWAYS make the necessary changes when the word 'executive' or 'leader' or 'expert', etc., is mentioned in the user query and get LOGICAL titles."""
instruction6 = """Queries where both a management level and a title are present, such as `Find me a Director for Engineering who has previously worked as a VP of a tech firm`."""
# instruction7 = """Queries where """



In [258]:
use_case = await generate_usecase_a(instruction6)
use_case = use_case['choices'][0]['message']['content']

In [259]:
print(use_case)

**Use Case Title:** Querying Management Level and Title for Recruitment

**1. The system's primary objective:**
As a system, the primary objective is to facilitate recruitment by efficiently processing queries that specify both management level and job title. This will enable users to find candidates who meet specific criteria, ensuring that the recruitment process is targeted and effective.

**2. The rules and constraints for processing the information:**
- The query must include both a management level (e.g., Director, VP) and a job title (e.g., Engineering).
- The system should only return candidates who have previously held the specified job title and management level.
- The search should be limited to candidates who have worked in relevant sectors (e.g., technology firms).
- The system must handle variations in title formats (e.g., 'Vice President' vs. 'VP') and synonyms (e.g., 'Engineering' vs. 'Tech').
- The query should ignore candidates that do not meet both criteria, ensuring

In [260]:
queries = await generate_queries_a(use_case)

In [261]:
instruction5

"If terms like 'leader', 'expert', 'specialist', or similar are mentioned, extract a maximum of 2-3 relevant job titles. ALWAYS make the necessary changes when the word 'executive' or 'leader' or 'expert', etc., is mentioned in the user query and get LOGICAL titles."

In [262]:
print(queries['choices'][0]['message']['content'])

[
  "Looking for a VP of Engineering at a major tech company who has held a Director role in the past.",
  "Can you find me a Director of Technology who used to be a Vice President at a software firm?",
  "Searching for a senior manager in Engineering, previously a VP, from any tech-related organization.",
  "I need a Chief Technology Officer who has experience as an Engineering Director in a startup environment.",
  "Find candidates who have been a VP of Tech and currently hold a Director position in an IT company.",
  "Looking for a head of Engineering who was a Vice President in a digital solutions firm.",
  "Can you locate a Director of Engineering who has experience as a VP in a telecommunications company?",
  "Searching for a VP of Product who has previously been a Director at a technology startup.",
  "Looking for a Director of Engineering who has held a Vice President title in a high-tech industry.",
  "Find me a Director of Development with past experience as a VP in a softwar

In [147]:
import json
import re

def parse_response(text):
    """
    Parse a text response containing rationale and JSON data.
    
    Args:
        text (str): Input text containing rationale and JSON data
        
    Returns:
        tuple: (rationale, parsed_json)
            - rationale (str): The explanatory text before the JSON
            - parsed_json (dict): The parsed JSON data
    """
    # Find the JSON part using regex
    # Look for content between ```json and ``` markers
    json_match = re.search(r'```json\s*(.*?)\s*```', text, re.DOTALL)
    
    if not json_match:
        raise ValueError("No JSON content found in the specified format")
    
    # Extract the JSON string and parse it
    json_str = json_match.group(1).strip()
    parsed_json = json.loads(json_str)
    
    # Get the rationale (everything before the JSON)
    rationale = text[:json_match.start()].strip()
    
    return rationale, parsed_json
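
`parse_response` only recognizes JSON inside a fenced `json` block, whereas the commented-out prompt above asks for `<rationale>` and `<json_output>` tags instead. The following is a minimal sketch of a more tolerant parser that accepts either layout and falls back to the outermost braces. `extract_json_block` is a hypothetical name, and the fallback assumes the response contains a single JSON object.

```python
import json
import re


def extract_json_block(text):
    """Return (rationale, parsed_json) from a fenced, tagged, or bare JSON reply."""
    fence = "`" * 3
    patterns = [
        # Fenced block, the layout parse_response expects.
        re.escape(fence) + r"json\s*(.*?)\s*" + re.escape(fence),
        # Tag layout requested by the commented-out prompt.
        r"<json_output>\s*(.*?)\s*</json_output>",
    ]
    for pattern in patterns:
        match = re.search(pattern, text, re.DOTALL)
        if match:
            parsed = json.loads(match.group(1))
            rationale = text[: match.start()]
            break
    else:
        # Last resort (assumption): one JSON object spanning the first "{" to the last "}".
        start, end = text.find("{"), text.rfind("}")
        if start == -1 or end <= start:
            raise ValueError("No JSON content found")
        parsed = json.loads(text[start : end + 1])
        rationale = text[:start]

    # Drop <rationale> tags if present and trim surrounding whitespace.
    rationale = re.sub(r"</?rationale>", "", rationale).strip()
    return rationale, parsed
```

The return order matches `parse_response`, so the loop that collects `results` could call either function without other changes.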

In [148]:
results = []

In [149]:
for query in queries_list:
    result = await chatgpt_response_a(query)
    print('-'*47)
    print("Query:", query)
    rationale, response_json = parse_response(result['choices'][0]['message']['content'])
    print("Rationale: ", rationale)
    print("Response JSON: ", response_json)
    results.append({
        "query" : query,
        "rationale" : rationale,
        "response_json" : response_json
    })

-----------------------------------------------
Query: Executive recruitment for an Operations Director in the Maghreb to lead production and manage regional operations.
Rationale:  To analyze the user query, let's identify the key phrases and determine their classifications:

1. **"Executive recruitment"**: This phrase suggests a focus on hiring for high-level positions, but it doesn't specify a title or management level directly. Since "executive" is mentioned, we will not classify it as a management level or a job title.

2. **"Operations Director"**: This is a clear job title that specifies a function (Operations). The title "Operations Director" falls under the category of job titles because it indicates a specific role and responsibility within an organization.

3. **"Maghreb"**: This refers to a geographic region and does not contribute to the classification of management levels or job titles.

4. **"lead production"** and **"manage regional operations"**: These phrases describe

In [92]:
# Creating Batch Job for 50 queries to see the current performance:
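
If "batch job" here simply means running many queries through the same prompt (rather than the OpenAI Batch API, which is a separate mechanism), the calls can be issued concurrently with `asyncio.gather` and a semaphore. The following is a minimal sketch under that assumption; `run_batch` and `max_concurrency` are hypothetical names, and `worker` stands for any coroutine function that takes a query, such as `chatgpt_response_a`.

```python
import asyncio


async def run_batch(queries, worker, max_concurrency=5):
    """Run worker(query) for every query, at most max_concurrency at a time."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def guarded(query):
        async with semaphore:
            try:
                return {"query": query, "result": await worker(query), "error": None}
            except Exception as exc:
                # Record the failure and keep going so one bad request
                # does not abort the whole batch.
                return {"query": query, "result": None, "error": repr(exc)}

    # gather returns results in the same order as the input queries.
    return await asyncio.gather(*(guarded(q) for q in queries))
```

In the notebook this could be called as `batch = await run_batch(queries_list, chatgpt_response_a)`; the concurrency limit is a guess and would need tuning against the account's rate limits.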

In [123]:
import pandas as pd

In [272]:
query = """'Generate a precise list of individuals currently serving as President, CEO, or COO in companies similar to Jawbone within the wearable technology sector. These companies focus on products such as fitness trackers, smartwatches, health monitoring devices, or other wearable electronics but do not necessarily have to manufacture them. The individuals should be based in the United States or Europe and hold an MBA degree. Exclude individuals from companies not focused on wearable technology. Precision is important'"""

In [273]:
result = await chatgpt_response_a(query)

In [274]:
result = result['choices'][0]['message']['content']

In [275]:
print(result)

In the provided user query, the key phrases to evaluate are "President," "CEO," and "COO."

1. **President**: This title refers to a specific job function and does not encompass a complete management level as per the predefined set. Therefore, it should be classified as a job title.
  
2. **CEO**: This abbreviation stands for Chief Executive Officer, which is a specific job function. Thus, it is classified as a job title.

3. **COO**: This abbreviation stands for Chief Operating Officer, which is also a specific job function. Hence, it is classified as a job title.

Each of these key phrases represents a specific title related to a function within an organization and does not reflect a broader management level category that includes multiple roles.

Now, let's summarize the classifications:

- **Management Levels**: There are no management levels to extract from this query as all key phrases are specific job titles.
- **Titles**: The titles derived from the key phrases are:
  - "Presid

In [231]:
df = pd.read_excel("AI Search Test Suite .xlsx")

In [232]:
df.columns

Index(['Unnamed: 0', 'Results', 'Unnamed: 2', 'Unnamed: 3', 'Unnamed: 4',
       'Comments ', 'Unnamed: 6', 'Unnamed: 7', 'Unnamed: 8', 'Question'],
      dtype='object')

In [233]:
df = df.rename(columns={'Comments ': 'Comments'})
# Rows with a reviewer comment are treated as issues; the rest are non-issues.
has_comment = df['Comments'].notna()
issues_df = df[has_comment]
non_issues_df = df[~has_comment]

In [234]:
len(df), len(issues_df), len(non_issues_df), len(non_issues_df) + len(issues_df)

(362, 146, 216, 362)

In [244]:
issues_queries = []
non_issues_queries = []

In [245]:
# Column "Unnamed: 0" holds the query text; cast to str as the row-wise loop did.
issues_queries = issues_df["Unnamed: 0"].astype(str).tolist()
non_issues_queries = non_issues_df["Unnamed: 0"].astype(str).tolist()


In [246]:
len(issues_queries), len(non_issues_queries), len(issues_queries) + len(non_issues_queries), len(df)

(146, 216, 362, 362)

In [248]:
import json

with open("queries-data.json") as f:
    json_data = json.load(f)

In [251]:
# json_data["issues_queries"] = issues_queries
json_data["non_issues_queries"] = non_issues_queries

In [254]:
with open("queries_dataset.json", "w") as f:
    json.dump(json_data, f, indent=2)

In [247]:
{"issues_queries" : issues_queries,
 "non_issues_queries" : non_issues_queries}

["Find senior software engineers or data scientists with at least 10 years of experience, who have worked at Google, Microsoft, or Facebook in the past, but are currently employed at a technology company like Tesla or Apple. They should have skills in Python, AI, or cloud computing, hold a master's degree in computer science from universities like Stanford or MIT, and be located in New York, San Francisco, or London. The candidates should have experience in both the finance and healthcare industries.",
 'Find me data analysts who are experts in Excel, Tableau, SQL, data visualization, and statistical analysis, based in London, UK, and not from the telecommunications or vehicle transportation industry.',
 "Find someone who is currently a Senior Software Engineer at Facebook, previously worked at Google, has 5-10 years of experience in software development, holds a Bachelor's degree in Computer Science from Stanford University, and is based in San Francisco",
 'Looking for presidents who

In [110]:
df.head()

Unnamed: 0.1,Unnamed: 0,Results,Unnamed: 2,Unnamed: 3,Unnamed: 4,Comments,Unnamed: 6,Unnamed: 7,Unnamed: 8,Question
0,Show customer service representatives.,Job title(current),,,,,,,,
1,Find sales executives.,Job title(current),,,,,,,,
2,Show network engineers.,Job title(current),,,,,,,,
3,Show HR managers.,Job title(current),,,,,,,,
4,Find marketing specialists.,Job title(current),,,,,,,,


In [115]:
df.columns

Index(['Unnamed: 0', 'Results', 'Unnamed: 2', 'Unnamed: 3', 'Unnamed: 4',
       'Comments', 'Unnamed: 6', 'Unnamed: 7', 'Unnamed: 8', 'Question'],
      dtype='object')

In [135]:
query = "Executive recruitment for an Operations Director in the Maghreb to lead production and manage regional operations."

In [136]:
result = await chatgpt_response_a(query)
print("Query:", query)
print(result['choices'][0]['message']['content'])

Query: Executive recruitment for an Operations Director in the Maghreb to lead production and manage regional operations.
In the user query, we have the following key phrases to evaluate:

1. **"Executive recruitment"** - This phrase indicates a focus on recruitment for executive roles but does not specify a job title or management level. It does not fit into either category.
   
2. **"Operations Director"** - This clearly specifies a job title and its function, which is to lead operations. The title "Operations Director" will be normalized as "Director of Operations" (as "Director" is included in the predefined set). Hence, it will be classified under titles.

3. **"lead production"** - This phrase describes a function but does not correspond to a specific job title. It does not include any key phrase that qualifies as a management level or a job title.

4. **"manage regional operations"** - This further emphasizes the responsibilities associated with the role but does not identify a 

In [137]:
result = await chatgpt_response_a(query)
print("Query:", query)
print(result['choices'][0]['message']['content'])

Query: Executive recruitment for an Operations Director in the Maghreb to lead production and manage regional operations.
In the user query, the key phrases we need to consider are:

1. "Executive recruitment"
2. "Operations Director"

Now, let's evaluate these key phrases based on the guidelines provided:

1. **"Executive recruitment"**: 
   - This phrase indicates a focus on recruiting for high-level positions but does not specify a job title or management level within the context of the query.
   - Therefore, it will not be classified as either a management level or a job title.

2. **"Operations Director"**: 
   - This phrase clearly identifies a specific job title. The role of "Director" corresponds to a management level, but since it is part of the job title "Operations Director," it cannot be classified separately as a management level.
   - The job title is normalized to "Operations Director."

Based on these evaluations, we can conclude:

- Management Levels: There are no mana
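
The rationale in In [136] normalizes the title to "Director of Operations", while the rationale in In [137] keeps "Operations Director" for the same query. A minimal sketch for tallying parsed outputs across repeated runs, so this kind of drift can be counted rather than spotted by eye. `tally_outputs` is a hypothetical helper, and the sample dicts in the usage note are illustrative, not recorded results.

```python
import json
from collections import Counter


def tally_outputs(parsed_outputs):
    # Serialize each dict with sorted keys so that equal dicts
    # collapse to the same Counter key regardless of key order.
    return Counter(json.dumps(p, sort_keys=True) for p in parsed_outputs)
```

Feeding it the parsed JSON from several runs of one query, for example `tally_outputs([run_1_json, run_2_json, run_3_json])`, gives one count per distinct output; more than one key means the model is not answering consistently.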

In [139]:
parsed_response = parse_response(result['choices'][0]['message']['content'])

In [143]:
print(parsed_response[0])

In the user query, the key phrases we need to consider are:

1. "Executive recruitment"
2. "Operations Director"

Now, let's evaluate these key phrases based on the guidelines provided:

1. **"Executive recruitment"**: 
   - This phrase indicates a focus on recruiting for high-level positions but does not specify a job title or management level within the context of the query.
   - Therefore, it will not be classified as either a management level or a job title.

2. **"Operations Director"**: 
   - This phrase clearly identifies a specific job title. The role of "Director" corresponds to a management level, but since it is part of the job title "Operations Director," it cannot be classified separately as a management level.
   - The job title is normalized to "Operations Director."

Based on these evaluations, we can conclude:

- Management Levels: There are no management levels explicitly mentioned since "Executive" does not refer to a specific management level in this context, and "D

In [144]:
print(parsed_response[1])

{'management_levels': [], 'titles': ['Operations Director']}
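
The two printed parts above suggest that `parse_response` returns a `(rationale, json)` pair. Its definition is not shown in this section, so the following is only a sketch under the assumption that the model reply is free-text reasoning followed by a single JSON object. `parse_response_sketch` is a hypothetical name, not the notebook's function.

```python
import json


def parse_response_sketch(content):
    """Split a model reply into (rationale_text, parsed_json_or_None)."""
    end = content.rfind("}")
    if end == -1:
        # No closing brace at all: treat the whole reply as rationale.
        return content.strip(), None

    # Try each "{" from left to right until the span up to the
    # final "}" parses as JSON; stray braces in the prose are skipped.
    start = content.find("{")
    while start != -1 and start < end:
        try:
            parsed = json.loads(content[start:end + 1])
        except json.JSONDecodeError:
            start = content.find("{", start + 1)
            continue
        return content[:start].strip(), parsed

    return content.strip(), None
```

Returning `None` for the JSON part, instead of raising, keeps a long loop over many queries running when a single reply is malformed; the caller can filter those rows afterwards.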
