In [1]:
!pip install openai
!pip install tiktoken

Collecting tiktoken
  Downloading tiktoken-0.9.0-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (6.7 kB)
Downloading tiktoken-0.9.0-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.2 MB)
Installing collected packages: tiktoken
Successfully installed tiktoken-0.9.0


In [6]:
import openai
from getpass import getpass
key = getpass('Enter Deepseek API key: ')


Enter API key: ··········


In [7]:
from openai import OpenAI

client = OpenAI(api_key=key, base_url="https://api.deepseek.com")

In [9]:
import urllib.request
import tiktoken
from bs4 import BeautifulSoup
choice = input("Please provide the material to build the interview from. \nEnter '1' to input a URL or '2' to input plain text: ")
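if choice == '1':
  # Note: this encoding is loaded but not used anywhere else in the notebook.
  encoding = tiktoken.get_encoding("cl100k_base")
  url = input("Enter the URL: ")
  # Download the page and keep only its visible text.
  response = urllib.request.urlopen(url).read()
  soup = BeautifulSoup(response, features="html.parser")
  input_text = soup.get_text()
elif choice == '2':
  input_text = input("Enter the plain text: ")
else:
  # Fail early so input_text is never left undefined.
  raise ValueError("Please enter '1' or '2'.")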

Please provide the material to build the interview from. 
Enter '1' to input a URL or '2' to input plain text: 1
Enter the URL: https://www.citizensadvice.org.uk/housing/repairs-and-housing/repairs-and-housing-conditions/whos-responsible-for-repairs/check-if-your-landlord-has-to-do-repairs/
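
Text extracted from a web page with get_text() tends to contain long runs of blank lines and stray indentation. A minimal sketch of a clean-up step that could be applied to input_text before it is sent to the model is shown below. The helper name tidy_page_text and the sample string are made up for illustration and are not part of the original notebook.

```python
import re


def tidy_page_text(raw_text: str) -> str:
    """Collapse whitespace left over from HTML-to-text extraction."""
    # Collapse runs of spaces/tabs inside each line, then trim the line.
    lines = [re.sub(r"[ \t]+", " ", line).strip() for line in raw_text.splitlines()]
    # Keep only lines that still contain text.
    return "\n".join(line for line in lines if line)


# Made-up sample resembling raw get_text() output.
sample = "  Repairs \n\n\n\t Your landlord   must fix\n   \n the roof  "
print(tidy_page_text(sample))
```

In the notebook this would be used as input_text = tidy_page_text(input_text) right after the extraction step.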


In [10]:
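def generate_with_reasoner(input_text, prompt):
  """Send the system prompt and the input text to deepseek-reasoner and return the raw response."""
  response = client.chat.completions.create(
      model="deepseek-reasoner",
      messages=[
          {"role": "system", "content": prompt},
          {"role": "user", "content": input_text},
      ],
      # DeepSeek's documentation lists temperature and the penalty parameters as
      # unsupported by deepseek-reasoner: they are accepted but have no effect.
      temperature=0,
      frequency_penalty=0,
      presence_penalty=0,
  )
  return response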

Prompt for identifying conditions and constructing Docassemble questions/events. Output will be saved as skeleton.yml.

In [11]:
prompt_1 = """You are to assume the role of a legal expert building an expert system. The input text describes some
            criteria for a legal conclusion being either true or false. You will output a JSON array that
            represents a structure for the expert system. The array contains the events (conclusions) and
            questions.

            To generate questions, you must identify all conditions that can affect the conclusion. Some will
            have a direct effect, and some may have an overriding effect. A condition with an overriding effect
            makes another condition unnecessary. When identifying explicit conditions, carefully consider
            whether they are independent or dependent on other information.

            There should only be two events - one which occurs when the conclusion is true and
            one which occurs when the conclusion is false. For each event:

            1. Define the "event" name.
            2. Define the event "question" (the written conclusion that the user will see).

            There must be enough questions to accurately determine the conclusion. The first question should
            act as an initial screen, so it should have the "mandatory" tag and a "Continue" button.
            For all other questions:

            1. Define a clear "question" for the user.
            2. Define the question type: either "yesno" or "date".
            3. If the input text includes necessary context or information, include this in a subquestion (e.g. question: "Is your home unfit
               for inhabitation?" subquestion: "Homes are unfit for inhabitation if they are...").
            4. Define the field name which will store the user's response (boolean or date value).

            ***Rules for questions***
            These rules must hold true at all times.

            - The question type must be either "yesno" or "date". No other types are permitted.
            - If any condition in the input text involves a specific date, time period, or deadline, you must define a date field — not a yes/no.
            - Ensure the array contains only the minimum set of questions needed to reach a correct and fair decision.
            - Include a "subquestion" field if the material contains information necessary to answer the question. Do not use "subquestion" to explain legal rules, timeframes, or exclusions.
            - Never include a "subquestion" that describes eligibility cutoffs, such as “Required after [date]” or “Does not apply if...”. These should be handled with additional questions or evaluated
              behind the scenes when the system is built.
            - If a legal condition requires checking exceptions (e.g. unless, except), create an explicit question to capture that.
            - Questions must be easy for a user to understand and answer. If a sentence in the input text is too complex, break it into multiple questions.


            ***Example Input***
            "Check if your family members can get pre-settled status or settled status. If you’re an EU, EEA or Swiss citizen living in the UK,
            some of your family can also apply to come and live in the UK. They can apply for pre-settled or settled status from the EU Settlement
            Scheme if both:
            - you have pre-settled or settled status
            - your relationship with your family member started by 31 December 2020 - unless you’re a Swiss citizen
            If your family member is a child who was born after 31 December 2020, you can also apply for them to come and live in the UK.
            If you came to the UK on a visa after 31 December 2020, you can't use the EU Settlement Scheme to bring your family to the UK.
            You’ll need to check if your visa allows you to bring your family to the UK. The EEA includes EU countries and also Iceland, Liechtenstein
            and Norway."


            ***Example Output***
            [
              {
                "mandatory": true,
                "question": "This interview will determine whether your family member is able to apply for pre-settled or settled status from the EU Settlement Scheme.",
                "buttons": [
                  {
                    "Continue": "continue"
                  }
                ]
              },
              {
                "event": "can_apply_event",
                "question": "You meet the basic requirements for your family to apply for pre-settled or settled status from the EU Settlement Scheme.",
                "buttons": [
                  {
                    "Exit": "exit"
                  }
                ]
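              },
              {
                "event": "cannot_apply_event",
                "question": "You do not meet the basic requirements for your family to apply for pre-settled or settled status from the EU Settlement Scheme.",
                "buttons": [
                  {
                    "Exit": "exit"
                  }
                ]
              },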
              {
                "question": "Are you an EU citizen?",
                "yesno": "is_eu_citizen"
              },
              {
                "question": "Do you have settled status?",
                "yesno": "is_settled"
              },
              {
                "question": "Do you have pre-settled status?",
                "yesno": "is_pre_settled"
              },
              {
                "question": "What date did your relationship with your family member start?",
                "subquestion": "If they are a blood relative, this will be their date of birth",
                "fields": [
                  {
                    "Date": "date_of_relation_start",
                    "datatype": "date"
                  }
                ]
              },
              {
                "question": "Are you Swiss?",
                "yesno": "is_swiss"
              },
              {
                "question": "What is your family member's date of birth?",
                "fields": [
                  {
                    "Date": "date_of_birth",
                    "datatype": "date"
                  }
                ]
              },
              {
                "question": "Did you come to the UK on a visa?",
                "yesno": "arrived_on_visa"
              },
              {
                "question": "What date did you arrive on this visa?",
                "fields": [
                  {
                    "Date": "date_of_arrival",
                    "datatype": "date"
                  }
                ]
              }
            ]

         """

In [12]:
response = generate_with_reasoner(input_text, prompt_1)
reasoning_content = response.choices[0].message.reasoning_content
output = response.choices[0].message.content

In [13]:
import json

# Drop the first and last lines, assuming the model wrapped the JSON in a Markdown code fence.
lines = output.splitlines()
cleaned_json = "\n".join(lines[1:-1])

json_response = json.loads(cleaned_json)
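
Dropping the first and last lines only works when the model actually wraps its answer in a code fence. If it returns bare JSON, the opening and closing brackets are lost and json.loads fails. A minimal sketch of a more forgiving approach, which removes the fence lines only when they are present, is shown below. The helper name strip_code_fence and the sample strings are assumptions for illustration and are not part of the original notebook.

```python
import json

# Built from single backticks so this example can itself sit inside a code fence.
FENCE = "`" * 3


def strip_code_fence(text: str) -> str:
    """Return text without a surrounding Markdown code fence, if one is present."""
    lines = text.strip().splitlines()
    # An opening fence may carry a language tag, e.g. a fence followed by "json".
    if lines and lines[0].lstrip().startswith(FENCE):
        lines = lines[1:]
    # A closing fence is a line holding only the fence marker.
    if lines and lines[-1].strip() == FENCE:
        lines = lines[:-1]
    return "\n".join(lines)


# Made-up samples: the same payload with and without a fence.
bare = '[{"question": "Are you Swiss?", "yesno": "is_swiss"}]'
fenced = FENCE + "json\n" + bare + "\n" + FENCE

print(json.loads(strip_code_fence(fenced)) == json.loads(strip_code_fence(bare)))
```

The same helper could be reused for the YAML output of the second prompt, since it does not depend on the fenced content being JSON.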

In [14]:
import yaml

with open('skeleton.yml', 'w') as outfile:
    yaml.dump_all(json_response, outfile, explicit_start=True, allow_unicode=True, default_flow_style=False)
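
The rules stated in prompt_1 (exactly two events, one mandatory opening screen, and only "yesno" or "date" questions) are not guaranteed to be followed by the model. A minimal sketch of a check that could be run on json_response is shown below. The function name check_skeleton and the sample blocks are hypothetical, and the check covers only the rules listed here.

```python
def check_skeleton(blocks):
    """Return a list of problems found in a parsed skeleton (empty if none)."""
    problems = []

    # Rule: exactly two events (conclusion true / conclusion false).
    events = [b for b in blocks if "event" in b]
    if len(events) != 2:
        problems.append(f"expected 2 events, found {len(events)}")

    # Rule: exactly one mandatory opening screen.
    screens = [b for b in blocks if b.get("mandatory")]
    if len(screens) != 1:
        problems.append(f"expected 1 mandatory screen, found {len(screens)}")

    # Rule: every other block is a yes/no question or a date question.
    for b in blocks:
        if "event" in b or b.get("mandatory"):
            continue
        is_yesno = isinstance(b.get("yesno"), str)
        fields = b.get("fields") or []
        is_date = bool(fields) and all(f.get("datatype") == "date" for f in fields)
        if not (is_yesno or is_date):
            problems.append(f"not a yesno or date question: {b.get('question')!r}")
    return problems


# Made-up sample blocks in the shape of the example output in prompt_1.
good = [
    {"mandatory": True, "question": "Intro", "buttons": [{"Continue": "continue"}]},
    {"event": "can_apply_event", "question": "You can apply."},
    {"event": "cannot_apply_event", "question": "You cannot apply."},
    {"question": "Are you Swiss?", "yesno": "is_swiss"},
    {"question": "Date of birth?", "fields": [{"Date": "date_of_birth", "datatype": "date"}]},
]
bad = good[:2] + [{"question": "Your name?", "fields": [{"Name": "name", "datatype": "text"}]}]

print(check_skeleton(good))
print(check_skeleton(bad))
```

Running check_skeleton(json_response) before the second prompt would surface a malformed skeleton early, rather than after the interview has been generated.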

Prompt for adapting skeleton.yml to include interview logic, based on relationships between conditions. Output will be saved as interview.yml.

In [22]:
prompt_2 = """You are assuming the role of a legal expert developing an expert system. Your task is to complete a docassemble interview that reaches a legal conclusion.

            You will be provided with:
            - A skeleton interview
            - A legal text, which is the sole source of truth

            You must edit the interview by adding code blocks that control the flow based on user responses. Use only the legal text to determine the logic and construct the necessary questions.

            Code Block Requirements:

            - The first code block must be the only block marked "mandatory: True".
              It must contain the main logic that evaluates to a boolean value (e.g., "can_apply = True" or "False").
              This block must trigger an event depending on the outcome of the evaluation ("can_apply_event" or "cannot_apply_event").

            - Subsequent blocks may define boolean variables based on conditions (e.g. date comparisons or grouped expressions).
              These blocks should come after the main logic and must only be evaluated if needed.
              Do not give booleans default values at the start of a code block, as this may cause the code block to be terminated
              without completing.

            - Use docassemble's "as_datetime()" function, which returns a "DADateTime" object, for date handling.
              Example: "cutoff = as_datetime('12/31/2020')"
              Do not use "datetime" directly. Use "today()" instead of "datetime.date.today()".
              To add time to a date variable, use ".plus(months=...)".

            - When writing logic, use a single, cohesive boolean expression with proper logical operators and parentheses.
              Do not use multiple "if" statements.
              Do not reference field names unless they are defined in a corresponding question block.
              If a required question is missing, add it, based strictly on the legal source material.

            Rules for Questions (Must Always Be Followed):

            - The question type must be "yesno" or "date" — no other types are allowed.
            - If a legal condition involves a specific date, deadline, or time period, you must use a "date" field.
            - Include only the minimum set of questions required to reach a fair and accurate legal conclusion.
            - If the user would need extra context to answer a question (e.g. legal definitions, factual criteria), include a "subquestion" field.
              - Do not use "subquestion" to explain legal rules, timeframes, or exceptions.
            - If the legal text includes exceptions (e.g. "unless", "except"), create a separate yes/no question to handle that logic.
            - Questions must be phrased simply and clearly for users.
              If a legal sentence is too complex, break it into multiple questions.

            You must output the full interview in valid YAML format.

            Example:

            ```yaml
            ---
            mandatory: True
            code: |
              if (
                is_eu_citizen and
                (is_settled or is_pre_settled) and
                (is_swiss or relation_started_by or was_born_after) and
                not (arrived_on_visa and arrived_after)
              ):
                can_apply = True
              else:
                can_apply = False

              if can_apply:
                can_apply_event
              else:
                cannot_apply_event
            ---
            code: |
              relation_cutoff = as_datetime('12/31/2020')
              if date_of_relation_start <= relation_cutoff:
                relation_started_by = True
              else:
                relation_started_by = False
            ---
            code: |
              birth_cutoff = as_datetime('12/31/2020')
              if date_of_birth > birth_cutoff:
                was_born_after = True
              else:
                was_born_after = False
            ---
            code: |
              arrival_cutoff = as_datetime('12/31/2020')
              if date_of_arrival > arrival_cutoff:
                arrived_after = True
              else:
                arrived_after = False
            ```

        """
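
The example logic in prompt_2 can be sanity-checked outside docassemble by restating it as a plain Python function. The sketch below is only an offline test aid: it uses the standard datetime module instead of as_datetime(), takes every answer as an argument rather than asking questions on demand, and the function name can_apply and the sample dates are made up for illustration.

```python
from datetime import date

# Cutoff used by all three date comparisons in the example.
CUTOFF = date(2020, 12, 31)


def can_apply(is_eu_citizen, is_settled, is_pre_settled, is_swiss,
              date_of_relation_start, date_of_birth,
              arrived_on_visa, date_of_arrival=None):
    """Evaluate the example rule from prompt_2 as one boolean expression."""
    relation_started_by = date_of_relation_start <= CUTOFF
    was_born_after = date_of_birth > CUTOFF
    # The arrival date only matters when the person arrived on a visa.
    arrived_after = (
        arrived_on_visa and date_of_arrival is not None and date_of_arrival > CUTOFF
    )
    return (
        is_eu_citizen
        and (is_settled or is_pre_settled)
        and (is_swiss or relation_started_by or was_born_after)
        and not (arrived_on_visa and arrived_after)
    )


# Made-up case: settled EU citizen, relationship began in 2015, no visa.
print(can_apply(True, True, False, False, date(2015, 6, 1), date(1990, 1, 1), False))
# Same person, but arrived on a visa in 2021.
print(can_apply(True, True, False, False, date(2015, 6, 1), date(1990, 1, 1), True, date(2021, 3, 1)))
```

A handful of such cases makes it easier to spot a misplaced parenthesis or operator in generated logic before loading the interview into docassemble.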

In [23]:
with open('skeleton.yml', 'r') as f:
    # safe_load_all is enough here: the file holds only plain dicts, lists and strings.
    skeleton_docs = list(yaml.safe_load_all(f))
    skeleton_str = '\n---\n'.join([yaml.dump(doc, allow_unicode=True) for doc in skeleton_docs])


In [24]:
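# Label the two parts so the model can tell the legal text from the skeleton.
response = generate_with_reasoner(
    "Legal text:\n" + input_text + "\n\nSkeleton interview:\n" + skeleton_str,
    prompt_2,
)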
reasoning_content = response.choices[0].message.reasoning_content
output = response.choices[0].message.content

In [25]:
import yaml

# Drop the first and last lines, assuming the model wrapped the YAML in a Markdown code fence.
lines = output.splitlines()
cleaned_yaml = "\n".join(lines[1:-1])

with open('interview.yml', 'w') as outfile:
    outfile.write(cleaned_yaml)