In [11]:
from dotenv import load_dotenv
import os 

_ = load_dotenv()


In [2]:
NER_MANAGEMENT_LEVEL_TITLE_SYSTEM_PROMPT_ORIGINAL = """
You are an intelligent assistant dedicated to extracting management levels and job titles from user queries. Before doing so, you must understand what a functional area is.
"""

NER_MANAGEMENT_LEVEL_TITLE_USER_PROMPT_ORIGINAL = """

Instructions:
1. Management Levels: Only return management levels that match the predefined set: ["Partners", "Founder or Co-founder", "Board of Directors", "CSuite/Chiefs", "Executive VP or Sr. VP", "General Manager", "Manager", "Senior Partner", "Junior Partner", "VP", "Director", "Senior (All Senior-Level Individual Contributors)", "Mid (All Mid-Level Individual Contributors)", "Junior (All Junior-Level Individual Contributors)"]. MANAGEMENT LEVELS CAN ONLY COME FROM THIS PREDEFINED SET, NOTHING ELSE.
2. Job Titles: Normalize the job title after extracting it from the text. For example, convert "ceo" to "Chief Executive Officer" and always include both the full title and its abbreviation (confirmed ones), e.g., "VP of Engineering" and "Vice President of Engineering." or "Chief Innovation Officer" and "CINO". ENSURE LOGICAL and EXACT job titles such as 'Architect' NOT 'Architect who is skilled in VR'. Job titles MUST BE CONCISE AND TO THE POINT and shouldn't include company names or region names. Do not change the title for normalization.
3. Response Format: Your response must be a dictionary with two keys: "management_levels" and "titles". Each key should have a list of management levels and titles respectively.
4. Identify the key phrases in a prompt. A key phrase is a title and its function, IF THE function is mentioned. If the function is mentioned, the key phrase will be classified as a "Job Title". For example, "CEOs working in Automotive Industry and VP of Engineering of Microsoft" has "CEOs" and "VP of Engineering" as key phrases. In this case "VP" ALONE CANNOT be considered a key phrase. ONLY CONSIDER THE KEY PHRASES MENTIONED, DO NOT ASSUME. Past and current designations don't matter.
5. Check whether the KEY PHRASE should be classified as a Job Title or a Management Level. IT SHOULD NEVER BE CLASSIFIED INTO BOTH. This is ESSENTIAL.
6. If a key phrase is classified as a title, don't include it in the management levels. For example, if "VP of Engineering" is classified as a title then don't include "VP" in management levels. Industry or company names will not be included in job titles.
7. If a key phrase is classified as a management level, don't include it in the titles. For example, if "Vice Presidents" is classified as a management level then don't include "Vice Presidents" in titles. A job title will ONLY be a title and its business function. No other DETAIL should be added. A Management Level cannot COME FROM WITHIN A JOB TITLE.
8. Remember: One instance of a key phrase should be considered for either management level or job title, not both. Each will fall either into management levels or job titles but WILL NEVER FALL INTO BOTH. A Management Level cannot COME FROM WITHIN A JOB TITLE.
9. If the word 'executive' is mentioned, handle it according to the 'executive' guidance given in the examples below.

Take a deep breath and understand.
Query: "Give me VPs working in Microsoft": # VPs is the KEY PHRASE. It will be evaluated as a whole key phrase ONLY. It cannot be classified into a management level and a title both.
Management Level Focus: In this query, "VPs" should refer to individuals at the management level of Vice President within Microsoft. This means you are asking for a list of people who occupy the VP rank across various departments or divisions within the company. "VPs" covers the complete domain of 'VP' in the management levels. The emphasis is on their standing in the organizational hierarchy, regardless of their specific job titles. A management level will only be selected if the key phrase covers its complete domain in the predefined set.
Job Title Focus: If you were asking about "VPs" in terms of job titles, you'd be interested in individuals whose specific title is "VP" of a certain business function, such as "VP of Marketing" (Marketing is a business function) or "VP of Engineering" (Engineering is a business function). If a function is clearly mentioned then it would be a JOB TITLE. "VP of Microsoft" does NOT have a function (Microsoft is an organization), and neither does "Automotive VPs" (Automotive can ONLY be an industry). Identify the BUSINESS functions accurately. Key phrases with a business function CANNOT come under management levels.
Output: {"management_levels": ["VP"], "titles": []}

Query: "The CFOs working in google or facebook": #CFOs is a KEY PHRASE. It will be evaluated as a whole key phrase ONLY. It cannot be classified into a management level and a title both.
Management Level Focus: In this query, "CFOs" does not cover the complete 'CSuite/Chiefs' domain. ONLY IF THE COMPLETE DOMAIN IS COVERED will the key phrase be a management level. One job title, even if it is at the top of the hierarchy, does not cover the complete domain. If a user wants all 'executives' without any business function specified, then three management levels are covered, namely "CSuite/Chiefs", "Executive VP or Sr. VP" and "VP", so all three MUST be returned. However, if the word 'executive' is mentioned in relation to a business function, only titles specific to that function should be included. For example, if 'Marketing Executives' is mentioned, titles such as 'CMO', 'Chief Marketing Officer', 'Senior VP of Marketing', 'Senior Vice President of Marketing', 'VP of Marketing', and 'Vice President of Marketing' should be included. The word 'executive' or 'executives' itself would, thus, NEVER be included as either a job title or a management level.
Job Title Focus: As a CFO would only be a chief in finance, the CFO being discussed here comes under job title, not management level.
Output: {"management_levels": [], "titles": ["CFO", "Chief Financial Officer"]} # Job titles MUST BE CONCISE and TO THE POINT, mentioning ONLY the TITLE and the BUSINESS FUNCTION if the business function is given. No added details, such as company name or group.

If terms like 'leader', 'expert', 'specialist', or similar are mentioned, extract a maximum of 2-3 relevant job titles associated with those terms based on the context, focusing on the most appropriate leadership or expertise roles.

For each management level and title, also tell why you put it there. If a business function can be clearly identified, the key phrase will be a JOB title. ALWAYS make the necessary changes when the word 'executive' or 'leader' or 'expert', etc., is mentioned in the user query and get LOGICAL titles. Management level of 'Manager' will not be chosen when a specific type of 'manager' (senior managers, project manager, etc.) are asked for. ONLY identify and consider complete key phrases EXPLICITLY MENTIONED IN THE PROMPT, and each key phrase will either be in management level or title, NEVER consider THE SAME KEY PHRASE for BOTH. Evaluate key phrases separately.
Always return a JSON object in your output.

User Query: {{QUERY}}
Let's think step by step about each key phrase."""
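
The prompt above constrains the output in ways that can be checked mechanically: both keys must be lists, management levels must come from the predefined set, and a key phrase must not appear in both lists. A minimal sketch of such a check follows. The names `ALLOWED_MANAGEMENT_LEVELS` and `validate_extraction` are hypothetical (they are not part of the notebook), and the set assumes the level names are meant without the trailing commas that appear inside the quotes in the prompt.

```python
# Hypothetical helper, not part of the original notebook.
# Assumption: level names are the quoted strings from the prompt, minus stray commas.
ALLOWED_MANAGEMENT_LEVELS = {
    "Partners", "Founder or Co-founder", "Board of Directors", "CSuite/Chiefs",
    "Executive VP or Sr. VP", "General Manager", "Manager", "Senior Partner",
    "Junior Partner", "VP", "Director",
    "Senior (All Senior-Level Individual Contributors)",
    "Mid (All Mid-Level Individual Contributors)",
    "Junior (All Junior-Level Individual Contributors)",
}


def validate_extraction(result):
    """Return a list of rule violations; an empty list means the result passed."""
    problems = []
    # Both keys must be present and must hold lists.
    for key in ("management_levels", "titles"):
        if not isinstance(result.get(key), list):
            problems.append(f"missing or non-list key: {key}")
    if problems:
        return problems
    # Management levels may only come from the predefined set.
    for level in result["management_levels"]:
        if level not in ALLOWED_MANAGEMENT_LEVELS:
            problems.append(f"unknown management level: {level}")
    # The same phrase must not be classified as both a level and a title.
    levels = {level.lower() for level in result["management_levels"]}
    titles = {title.lower() for title in result["titles"]}
    for phrase in sorted(levels & titles):
        problems.append(f"phrase in both lists: {phrase}")
    return problems


print(validate_extraction({"management_levels": ["VP"], "titles": []}))
```

The overlap check compares case-insensitively, which is a design choice for this sketch; it only catches exact duplicates and would not detect a level hidden inside a longer title such as "VP of Engineering".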


In [3]:

NER_MANAGEMENT_LEVEL_TITLE_SYSTEM_PROMPT = """You are an AI assistant tasked with extracting management levels and job titles from a given query. Your goal is to analyze the query, identify relevant key phrases, and categorize them appropriately as either management levels or job titles.
"""

NER_MANAGEMENT_LEVEL_TITLE_USER_PROMPT = """
Here is the query you need to analyze:
<query>
{{QUERY}}
</query>

Follow these steps to extract management levels and titles:

1. Management Levels: Only return management levels that match the predefined set: ["Partners", "Founder or Co-founder", "Board of Directors", "CSuite/Chiefs", "Executive VP or Sr. VP", "General Manager", "Manager", "Senior Partner", "Junior Partner", "VP", "Director", "Senior (All Senior-Level Individual Contributors)", "Mid (All Mid-Level Individual Contributors)", "Junior (All Junior-Level Individual Contributors)"]. MANAGEMENT LEVELS CAN ONLY COME FROM THIS PREDEFINED SET, NOTHING ELSE.
2. Job Titles: Normalize the job title after extracting it from the text. For example, convert "ceo" to "Chief Executive Officer" and always include both the full title and its abbreviation (confirmed ones), e.g., "VP of Engineering" and "Vice President of Engineering." or "Chief Innovation Officer" and "CINO". ENSURE LOGICAL and EXACT job titles such as 'Architect' NOT 'Architect who is skilled in VR'. Job titles MUST BE CONCISE AND TO THE POINT and shouldn't include company names or region names. Do not change the title for normalization.
3. Response Format: Your response must be a dictionary with two keys: "management_levels" and "titles". Each key should have a list of management levels and titles respectively.
4. Identify the key phrases in a prompt. A key phrase is a title and its function, IF THE function is mentioned. If the function is mentioned, the key phrase will be classified as a "Job Title". For example, "CEOs working in Automotive Industry and VP of Engineering of Microsoft" has "CEOs" and "VP of Engineering" as key phrases. In this case "VP" ALONE CANNOT be considered a key phrase. ONLY CONSIDER THE KEY PHRASES MENTIONED, DO NOT ASSUME. Past and current designations don't matter.
5. Check whether the KEY PHRASE should be classified as a Job Title or a Management Level. IT SHOULD NEVER BE CLASSIFIED INTO BOTH. This is ESSENTIAL.
6. If a key phrase is classified as a title, don't include it in the management levels. For example, if "VP of Engineering" is classified as a title then don't include "VP" in management levels. Industry or company names will not be included in job titles.
7. If a key phrase is classified as a management level, don't include it in the titles. For example, if "Vice Presidents" is classified as a management level then don't include "Vice Presidents" in titles. A job title will ONLY be a title and its business function. No other DETAIL should be added. A Management Level cannot COME FROM WITHIN A JOB TITLE.
8. Remember: One instance of a key phrase should be considered for either management level or job title, not both. Each will fall either into management levels or job titles but WILL NEVER FALL INTO BOTH. A Management Level cannot COME FROM WITHIN A JOB TITLE.
9. If the word 'executive' is mentioned, handle it according to the 'executive' guidance given in the examples below.

Take a deep breath and understand.
Query: "Give me VPs working in Microsoft": # VPs is the KEY PHRASE. It will be evaluated as a whole key phrase ONLY. It cannot be classified into a management level and a title both.
Management Level Focus: In this query, "VPs" should refer to individuals at the management level of Vice President within Microsoft. This means you are asking for a list of people who occupy the VP rank across various departments or divisions within the company. "VPs" covers the complete domain of 'VP' in the management levels. The emphasis is on their standing in the organizational hierarchy, regardless of their specific job titles. A management level will only be selected if the key phrase covers its complete domain in the predefined set.
Job Title Focus: If you were asking about "VPs" in terms of job titles, you'd be interested in individuals whose specific title is "VP" of a certain business function, such as "VP of Marketing" (Marketing is a business function) or "VP of Engineering" (Engineering is a business function). If a function is clearly mentioned then it would be a JOB TITLE. "VP of Microsoft" does NOT have a function (Microsoft is an organization), and neither does "Automotive VPs" (Automotive can ONLY be an industry). Identify the BUSINESS functions accurately. Key phrases with a business function CANNOT come under management levels.
Output: {"management_levels": ["VP"], "titles": []}

Query: "The CFOs working in google or facebook": #CFOs is a KEY PHRASE. It will be evaluated as a whole key phrase ONLY. It cannot be classified into a management level and a title both.
Management Level Focus: In this query, "CFOs" does not cover the complete 'CSuite/Chiefs' domain. ONLY IF THE COMPLETE DOMAIN IS COVERED will the key phrase be a management level. One job title, even if it is at the top of the hierarchy, does not cover the complete domain. If a user wants all 'executives' without any business function specified, then three management levels are covered, namely "CSuite/Chiefs", "Executive VP or Sr. VP" and "VP", so all three MUST be returned. However, if the word 'executive' is mentioned in relation to a business function, only titles specific to that function should be included. For example, if 'Marketing Executives' is mentioned, titles such as 'CMO', 'Chief Marketing Officer', 'Senior VP of Marketing', 'Senior Vice President of Marketing', 'VP of Marketing', and 'Vice President of Marketing' should be included. The word 'executive' or 'executives' itself would, thus, NEVER be included as either a job title or a management level.
Job Title Focus: As a CFO would only be a chief in finance, the CFO being discussed here comes under job title, not management level.
Output: {"management_levels": [], "titles": ["CFO", "Chief Financial Officer"]} # Job titles MUST BE CONCISE and TO THE POINT, mentioning ONLY the TITLE and the BUSINESS FUNCTION if the business function is given. No added details, such as company name or group.

If terms like 'leader', 'expert', 'specialist', or similar are mentioned, extract a maximum of 2-3 relevant job titles associated with those terms based on the context, focusing on the most appropriate leadership or expertise roles.

For each management level and title, also tell why you put it there. If a business function can be clearly identified, the key phrase will be a JOB title. ALWAYS make the necessary changes when the word 'executive' or 'leader' or 'expert', etc., is mentioned in the user query and get LOGICAL titles. Management level of 'Manager' will not be chosen when a specific type of 'manager' (senior managers, project manager, etc.) are asked for. ONLY identify and consider complete key phrases EXPLICITLY MENTIONED IN THE PROMPT, and each key phrase will either be in management level or title, NEVER consider THE SAME KEY PHRASE for BOTH. Evaluate key phrases separately.

After analyzing the query, generate two outputs:

1. A reasoning paragraph that explains your thought process step-by-step. Include:
   - Identification of key phrases
   - Evaluation of each key phrase (management level or job title)
   - Reasoning behind your classifications
   - Any special considerations (e.g., handling of 'executive' or 'leader' terms)
   - Explanation of domain coverage for management levels
   - Keep this moderate, not too long and not too concise
   - While writing the reasoning, refrain from using I and addressing yourself.

2. A JSON object with two keys: "management_levels" and "titles". Each key should have a list of extracted management levels and titles respectively.   

Present your output in the following format:


<rationale>
Your step-by-step reasoning and rationale paragraph goes here.
</rationale>

<json_output>
{
  "management_levels": [...],
  "titles": [...]
}
</json_output>


Remember to adhere strictly to the guidelines provided, especially regarding the classification of key phrases and the handling of special terms like 'executive'.
"""
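
Because this version of the prompt asks for a `<rationale>` block followed by a `<json_output>` block, the reply has to be split before the JSON can be loaded. The following is a minimal sketch of that step, assuming the model follows the requested tags exactly; `parse_tagged_response` is a hypothetical helper name, not something defined elsewhere in this notebook.

```python
import json
import re


def parse_tagged_response(text):
    """Split a reply into (rationale, parsed_json) using the tags the prompt requests."""
    rationale_match = re.search(r"<rationale>(.*?)</rationale>", text, re.DOTALL)
    json_match = re.search(r"<json_output>(.*?)</json_output>", text, re.DOTALL)
    if json_match is None:
        raise ValueError("no <json_output> block found in the reply")
    # A missing rationale is tolerated; a missing or malformed JSON block is not.
    rationale = rationale_match.group(1).strip() if rationale_match else ""
    return rationale, json.loads(json_match.group(1))


# Hand-written example reply in the requested format (not real model output).
example_reply = """<rationale>
The key phrase names a finance function, so it is a job title.
</rationale>

<json_output>
{"management_levels": [], "titles": ["CFO"]}
</json_output>"""

rationale, extracted = parse_tagged_response(example_reply)
print(extracted["titles"])
```

If the model wraps the JSON in extra text or code fences inside the tags, `json.loads` will raise `json.JSONDecodeError`, so callers may want to catch that and retry.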


In [6]:
import json

with open('../finetuning-code/queries_dataset.json') as f:
    queries = json.load(f)['use_case_1']

In [7]:
queries

['Looking for a ceo for our startup in Silicon Valley.',
 'Hiring a vp of sales with experience in tech companies.',
 'Need a cto at XYZ Corp who is graduated in CS from harvard.',
 'Searching for a COO based in Los Angeles with prior experience in business strategy.',
 'Seeking a vp of tech with Agile certification for our team.',
 'Looking for an architect specializing in commercial buildings.',
 'We need a ciso for our security firm with a strong background in encryption and networking domains.',
 'Hiring a cfo for financial department in our firm.',
 'Looking for a cmo at ABC Agency who has prior proven experience in marketing for large tech companies',
 'Searching a CSO to lead our strategy team with expertise in project planning and educational background from stanford business school.']

In [None]:
from together import Together

client = Together()

response = client.chat.completions.create(
    model="meta-llama/Meta-Llama-3.1-8B-Instruct-Turbo",
    messages=[],
    max_tokens=None,  # Python uses None, not null
    temperature=0.7,
    top_p=0.7,
    top_k=50,
    repetition_penalty=1,
    stop=["<|eot_id|>","<|eom_id|>"],
    stream=True
)
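for token in response:
    # Some chunks may carry no choices or an empty delta; skip those.
    if getattr(token, 'choices', None) and token.choices[0].delta.content:
        print(token.choices[0].delta.content, end='', flush=True)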

In [31]:
from jinja2 import Template
from together import Together

async def together_ai_response(query, temperature=0.7, model="deepseek-ai/DeepSeek-V3"):

    """
    Send the management level / job title extraction prompt to Together AI and stream the reply.

    Args:
        query (str): user query to analyze; rendered into the user prompt template
        temperature (float, optional): sampling temperature. Defaults to 0.7.
        model (str, optional): Together model name. Defaults to "deepseek-ai/DeepSeek-V3".

    Returns:
        str: the generated text, accumulated from the streamed chunks
    """
    # user_message = Template(NER_MANAGEMENT_LEVEL_TITLE_USER_PROMPT).render({"QUERY" : query})
    user_message = Template(NER_MANAGEMENT_LEVEL_TITLE_USER_PROMPT_ORIGINAL).render({"QUERY" : query})
    messages = [
            # {"role": "system", "content": NER_MANAGEMENT_LEVEL_TITLE_SYSTEM_PROMPT},
            {"role": "system", "content": NER_MANAGEMENT_LEVEL_TITLE_SYSTEM_PROMPT_ORIGINAL},
            # The rendered template already contains the query, so no extra prefix is added.
            {"role": "user", "content": user_message},
    ]

    client = Together()

    # response = client.chat.completions.create(
    # model="meta-llama/Llama-3.3-70B-Instruct-Turbo",
    # messages=messages,
    # max_tokens=1024,
    # temperature=0.7,
    # top_p=0.7,
    # top_k=50,
    # repetition_penalty=1,
    # stop=["<|eot_id|>","<|eom_id|>"],
    # stream=True
    # )

    response = client.chat.completions.create(
        model=model,
        messages=messages,
        max_tokens=1024,
        temperature=temperature,
        top_p=0.7,
        top_k=50,
        repetition_penalty=1,
        stop=["<｜end▁of▁sentence｜>"],
        stream=True,
    )

    # The streamed response is a generator, so collect the text while printing it;
    # returning the generator itself would hand back an already exhausted object.
    parts = []
    for token in response:
        if getattr(token, 'choices', None) and token.choices[0].delta.content:
            parts.append(token.choices[0].delta.content)
            print(parts[-1], end='', flush=True)
    return "".join(parts)

In [32]:
query = queries[7]
print(query)

Hiring a cfo for financial department in our firm.


In [33]:
result = await together_ai_response(query)

ValidationError: 1 validation error for ChatCompletionChunk
choices.0.finish_reason
  Input should be 'length', 'stop', 'eos', 'tool_calls' or 'error' [type=enum, input_value='', input_type=str]
    For further information visit https://errors.pydantic.dev/2.10/v/enum

In [37]:
from together import Together

client = Together()

response = client.chat.completions.create(
    model="deepseek-ai/DeepSeek-V3",
    messages=[
        {
                "role": "user",
                "content": "hello"
        },
        {
                "role": "assistant",
                "content": "Hello! How can I assist you today? 😊"
        }
],
    max_tokens=1024,
    temperature=0.7,
    top_p=0.7,
    top_k=50,
    repetition_penalty=1,
    stop=["<｜end▁of▁sentence｜>"],
    stream=True
)


In [42]:
print(response)

<generator object ChatCompletions.create.<locals>.<genexpr> at 0x1203b51c0>
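
As the output above shows, a streamed call returns a generator, so printing it reveals nothing until the chunks are consumed. A minimal sketch of collecting the text is given below. It assumes each chunk exposes `choices[0].delta.content`, as in the loops earlier in this notebook, and uses stand-in objects instead of a live API call; `collect_stream_text` and `fake_chunks` are hypothetical names.

```python
from types import SimpleNamespace


def collect_stream_text(chunks):
    """Join the text deltas of a streamed chat completion into one string."""
    parts = []
    for chunk in chunks:
        choices = getattr(chunk, "choices", None)
        if not choices:
            continue  # chunk without choices carries no text
        content = choices[0].delta.content
        if content:  # skip None or empty deltas
            parts.append(content)
    return "".join(parts)


# Stand-in chunks shaped like the streamed objects above (not real API output).
fake_chunks = [
    SimpleNamespace(choices=[SimpleNamespace(delta=SimpleNamespace(content="Hel"))]),
    SimpleNamespace(choices=[SimpleNamespace(delta=SimpleNamespace(content=None))]),
    SimpleNamespace(choices=[]),
    SimpleNamespace(choices=[SimpleNamespace(delta=SimpleNamespace(content="lo"))]),
]
print(collect_stream_text(fake_chunks))  # prints "Hello"
```

With a real streamed response, the same function can be called once on the returned generator; a second call would yield an empty string because the generator is already consumed.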
