<a href="https://colab.research.google.com/github/hasse-h/python-NLP/blob/master/child_psych_question_classifier.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## **Instructions** 🤔


To start, select `Runtime -> Run all` from the menu above.

Once the code has run, enter a query in the query cell.

Then select the query cell and choose `Runtime -> Run after`.

If the model encounters an error, select `Runtime -> Run all` to restart.


### **Model operating principles**
The model first decides whether the question is *closed* or *open-ended*, and then classifies it into one of that type's subcategories.

Closed questions can be *option posing*, *multiple choice* or *too complicated*.

Open-ended questions can be *directives*, *invitations* or *facilitators*.
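
The two-level scheme above can be written down as a small lookup table. This is only an illustrative sketch: the names `TAXONOMY` and `is_valid_label` are assumptions made for this example and are not used elsewhere in the notebook.

```python
# Two-level taxonomy as described above.
# The dictionary layout and helper name are illustrative assumptions.
TAXONOMY = {
    "Closed": ("Option Posing", "Multiple Choice", "Too Complicated"),
    "Open-Ended": ("Directive", "Invitation", "Facilitator"),
}


def is_valid_label(basic_type: str, sub_type: str) -> bool:
    """Return True if sub_type is a legal subcategory of basic_type."""
    return sub_type in TAXONOMY.get(basic_type, ())


print(is_valid_label("Open-Ended", "Facilitator"))  # True
print(is_valid_label("Closed", "Invitation"))       # False
```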

## Set-up:

In [169]:
# @title
!pip install langchain_openai



In [170]:
# @title
from langchain_core.pydantic_v1 import BaseModel, Field, constr

In [171]:
# @title
from langchain_core.prompts import PromptTemplate

In [172]:
# @title
from langchain_core.output_parsers import JsonOutputParser

In [173]:
# @title
!pip install openai



In [174]:
# @title
import os
from getpass import getpass
# Prompt for the key at run time; never commit an API key to a notebook.
os.environ["OPENAI_API_KEY"] = getpass("OpenAI API key: ")

In [175]:
# @title
from langchain_openai import ChatOpenAI

## **Enter a query here:**

In [377]:
query_to_a_child = "hi!"

## Main logic:

In [378]:
# @title
llm = ChatOpenAI()

In [379]:
# @title
llm.model_name = 'gpt-4-turbo'

In [380]:
# @title
closed = """questions that presuppose yes or no as an answer; they usually start with modal verbs or phrases such as Can you..., Could you..., Would you..., Do you..., Have you..., etc."""

In [381]:
option_posing = """questions that ask for confirmation or denial of a presented fact (for example, it was last week, wasn’t it? / was it last week?)"""

In [382]:
multiple_choice = """question has a list of options that one is asked to choose from. A list of two or more options that have an “or” in between (such as was it a car, house or boat? Is it tall or short?)"""

In [383]:
too_complicated = """more than one sentence per question, or something else than option posing or multiple choice"""

In [384]:
open_ended = """questions that do not allow yes or no as an answer, either statements or typically beginning with "What...", "How...", “Tell me more…”, or "Tell me about..."."""

In [385]:
directive = """questions that direct the answerer, where only filler words or discourse markers come before the question word (for example: “So, who was there with you?”, “ok, and what is it that he was doing?”)."""

In [386]:
invitation = """questions that ask the answerer to tell more, such as tell me more about that, tell me more about x, tell me more about it, tell me about x, then what happened? What happened then? what happened after that/before that/first, last/next, and then? """

In [387]:
facilitator = """short utterances that encourage the answerer to continue talking without actually asking anything. These are things like: go on, alright, ok, I see, I understand, etc. Also anything that does not fall into the other categories."""

In [388]:
# @title
class QuestionClassificationBasic(BaseModel):
    basic_type: constr(regex='^(Closed|Open-Ended)$') = Field(description="Is the text open or closed-ended? Choose 'Closed' or 'Open-Ended'.")
    justification: str = Field(description="Justify your choice in 10 to 15 words")

In [389]:
class QuestionClassificationOpen(BaseModel):
    sub_type: constr(regex='^(Directive|Invitation|Facilitator)$') = Field(description="Is the text a directive, invitation or facilitator? Choose between 'Directive', 'Invitation' or 'Facilitator'.")
    justification: str = Field(description="Justify your choice in 10 to 15 words")

In [390]:
class QuestionClassificationClosed(BaseModel):
    sub_type: constr(regex='^(Option Posing|Multiple Choice|Too Complicated)$') = Field(description="Is the text option posing, multiple choice or too complicated? Choose between 'Option Posing', 'Multiple Choice' or 'Too Complicated'.")
    justification: str = Field(description="Justify your choice in 10 to 15 words")
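
The `constr(regex=...)` fields above describe which labels are allowed. As far as I know, `JsonOutputParser` uses the pydantic class mainly to build the format instructions and returns a plain `dict`, so the regex may not actually be enforced on the model's reply. Under that assumption, a minimal check with the standard library might look like the sketch below. The names `check_label`, `BASIC_PATTERN` and `OPEN_PATTERN` are hypothetical helpers introduced for this example only.

```python
import re

# Same patterns as in the constr(regex=...) fields above.
BASIC_PATTERN = re.compile(r"^(Closed|Open-Ended)$")
OPEN_PATTERN = re.compile(r"^(Directive|Invitation|Facilitator)$")


def check_label(result: dict, key: str, pattern: re.Pattern) -> str:
    """Return result[key] if it matches the pattern, otherwise raise ValueError."""
    value = result.get(key)
    if not isinstance(value, str) or not pattern.fullmatch(value):
        raise ValueError(f"Unexpected {key!r}: {value!r}")
    return value


# A parsed reply shaped like the ones this notebook prints.
reply = {"basic_type": "Open-Ended", "justification": "Greeting, not a yes/no question."}
print(check_label(reply, "basic_type", BASIC_PATTERN))
```

Running such a check right after each `invoke` call would surface a misspelled or lower-cased label early, instead of silently falling through to the wrong branch.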

In [391]:
# @title
basic_output_parser = JsonOutputParser(pydantic_object=QuestionClassificationBasic)

In [392]:
# @title
prompt_basic = PromptTemplate(
    template=
     """You are a world-leading forensic psychologist.
     Your task is to classify whether an input text is Closed or Open-Ended.
     It does not have to be a question; it can be anything, even one word such as 'ok' or 'yes', so you must
     be prepared to classify any text accordingly. Never refuse to classify.
     These two categories are defined as follows:\n"""

     # Label each definition so the model can tell which category it describes.
     f"Closed: {closed}\n"

     f"Open-Ended: {open_ended}\n"

    """Think through your decision carefully, step by step, focusing on features of the text.
    \n{format_instructions}\n{query}\n""",
    input_variables=["query"],
    partial_variables={"format_instructions": basic_output_parser.get_format_instructions()},
)
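
The template above is assembled from adjacent string literals: the f-string pieces are filled in immediately with the category definitions, while `{format_instructions}` and `{query}` sit in plain strings and survive as placeholders for later. The sketch below shows the same mechanism with the standard library only. Here `str.format` stands in for the substitution that `PromptTemplate` performs later (an assumption made to keep the example self-contained), and `definition` is a shortened stand-in for the real text.

```python
definition = "questions that presuppose yes or no as an answer"  # shortened stand-in

# Adjacent literals inside parentheses are concatenated into one string.
template = (
    "Classify the text as Closed or Open-Ended.\n"
    f"Closed: {definition}\n"            # filled in now, when this line runs
    "{format_instructions}\n{query}\n"   # plain string: placeholders are kept
)

print(template)

# Later substitution; str.format is used here in place of PromptTemplate.
filled = template.format(
    format_instructions="Reply in JSON.",
    query="Did you see him?",
)
print(filled)
```

One consequence of this design: if a definition ever contained literal curly braces, they would end up in the template and be read as placeholders, so such text would need escaping.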

In [393]:
basic_chain = prompt_basic | llm | basic_output_parser

## Model intermediate classification


In [394]:
basic_type_classifier = basic_chain.invoke({"query": query_to_a_child})
print(basic_type_classifier)

{'basic_type': 'Open-Ended', 'justification': "The text 'hi!' is a greeting not requiring a yes or no."}


## More logic:

In [395]:
question_classification =  QuestionClassificationOpen if basic_type_classifier['basic_type'] == 'Open-Ended' else QuestionClassificationClosed

In [396]:
final_output_parser = JsonOutputParser(pydantic_object=question_classification)

In [397]:
# @title
prompt_open = PromptTemplate(
    template=
     """You are a world-leading forensic psychologist.
     Your task is to classify whether an input text is
     a 'Directive', 'Invitation' or 'Facilitator'.
     The input text can be anything, even one word such as 'ok' or 'yes', so you must
     be prepared to classify any text. Never refuse to classify.
     These three categories are defined as follows:\n"""

    # Label each definition so the model can tell which category it describes.
    f"Directive: {directive}\n"

    f"Invitation: {invitation}\n"

    f"Facilitator: {facilitator}\n"

    """Think through your decision carefully, step by step, focusing on features of the text.
    \n{format_instructions}\n{query}\n""",
    input_variables=["query"],
    partial_variables={"format_instructions": final_output_parser.get_format_instructions()},
)

In [398]:
# @title
prompt_closed = PromptTemplate(
    template=
     """You are a world-leading forensic psychologist.
     Your task is to classify whether an input text is
     'Option Posing', 'Multiple Choice' or 'Too Complicated'.
     The input text can be anything, even one word such as 'ok' or 'yes', so you must
     be prepared to classify any text. Never refuse to classify.
     These three categories are defined as follows:\n"""

    # Label each definition so the model can tell which category it describes.
    f"Option Posing: {option_posing}\n"

    f"Multiple Choice: {multiple_choice}\n"

    f"Too Complicated: {too_complicated}\n"

    """Think through your decision carefully, step by step, focusing on features of the text.
    \n{format_instructions}\n{query}\n""",
    input_variables=["query"],
    partial_variables={"format_instructions": final_output_parser.get_format_instructions()},
)

In [399]:
branch = prompt_open if basic_type_classifier['basic_type'] == 'Open-Ended' else prompt_closed

In [400]:
final_chain = branch | llm | final_output_parser

## Model final classification

In [401]:
final_classifier = final_chain.invoke({"query": query_to_a_child})
print(final_classifier)

{'sub_type': 'Facilitator', 'justification': "The text 'hi!' is a short greeting, encouraging interaction."}
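
The overall flow of this notebook (classify the basic type first, then pick the matching sub-classifier) can be sketched as one function. This is a minimal sketch and not the notebook's own code: `classify_two_stage` and the `fake_*` functions are hypothetical stand-ins, and the fakes use trivial rules instead of a language model so that the example runs without an API key.

```python
from typing import Callable, Dict

Classifier = Callable[[str], Dict[str, str]]


def classify_two_stage(
    query: str,
    basic: Classifier,
    open_sub: Classifier,
    closed_sub: Classifier,
) -> Dict[str, str]:
    """Run the basic classifier, then the sub-classifier for the chosen branch."""
    first = basic(query)
    second_stage = open_sub if first["basic_type"] == "Open-Ended" else closed_sub
    second = second_stage(query)
    return {"basic_type": first["basic_type"], "sub_type": second["sub_type"]}


# Hypothetical rule-based stand-ins for the LLM chains.
def fake_basic(q: str) -> Dict[str, str]:
    closed_starts = ("did ", "was ", "do ", "can ")
    is_closed = q.lower().startswith(closed_starts)
    return {"basic_type": "Closed" if is_closed else "Open-Ended"}


def fake_open(q: str) -> Dict[str, str]:
    return {"sub_type": "Invitation" if "tell me" in q.lower() else "Facilitator"}


def fake_closed(q: str) -> Dict[str, str]:
    return {"sub_type": "Multiple Choice" if " or " in q.lower() else "Option Posing"}


print(classify_two_stage("Was it a car or a boat?", fake_basic, fake_open, fake_closed))
print(classify_two_stage("Tell me more about that.", fake_basic, fake_open, fake_closed))
```

In the notebook itself, each stand-in would presumably be replaced by a small wrapper around the corresponding chain's `invoke({"query": ...})` call.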
