# Rule Creation from Ad Hoc Format
This notebook takes rules written in the custom ad hoc format and produces the corresponding XML representation of each rule. The current evaluation sheet is at https://docs.google.com/spreadsheets/d/1IcKvUz15310M0p4P7Wbm06veaH5qeT_mo5FEoVjj8Lo/edit?pli=1#gid=58372770

### Util Functions

In [1]:
import backoff
import logging
import openai
from typing import List


def setup_logger(name: str):
    # Create a logger instance
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)

    # Create a formatter for the log messages
    formatter = logging.Formatter('%(asctime)s [%(levelname)s] %(message)s')

    # Create a console handler for the log messages
    console_handler = logging.StreamHandler()
    console_handler.setLevel(logging.INFO)
    console_handler.setFormatter(formatter)

    # Attach the console handler once, so re-running this cell does not duplicate log lines
    if not logger.handlers:
        logger.addHandler(console_handler)
    return logger


logger = setup_logger("notebook_logger")


@backoff.on_exception(backoff.expo, openai.error.OpenAIError, logger=logger)
def call_gpt_with_backoff(messages: List, model: str = "gpt-4", temperature: float = 0.7, max_length: int = 256) -> str:
    """
    Call the chat model through `call_gpt`, retrying with exponential backoff on any OpenAI error.
    """
    return call_gpt(model=model, messages=messages, temperature=temperature, max_length=max_length)


def call_gpt(model: str, messages: List, temperature: float = 0.7, max_length: int = 256) -> str:
    """
    Send `messages` to the chat completion endpoint and return the text of the first choice.
    """
    response = openai.ChatCompletion.create(
        model=model,
        messages=messages,
        temperature=temperature,
        max_tokens=max_length,
        frequency_penalty=0.0,
        top_p=1
    )
    return response['choices'][0]['message']['content']


def generate_simple_message(system_prompt: str, user_prompt: str) -> List[dict]:
    return [
        {
            'role': 'system',
            'content': system_prompt
        },
        {
            'role': 'user',
            'content': user_prompt
        }
    ]

In [2]:
import getpass
openai.api_key = getpass.getpass("Enter your api key: ")  # getpass keeps the key out of the saved cell output

### Current Prompt

In [3]:
SYSTEM_PROMPT = """You are a system that takes in ad hoc rule syntax and some other info to then translate the rule into full xml rules. Here are some examples:


Ad Hoc:
( and is ) not without ( consequence consequences )
Rule Number:
30120
Correction:
$0 significant @ $0 weighty @ $0 consequential 
Category:
Conciseness
Explanation:
Would using fewer words help tighten the sentence?
Test Sentence:
The event is not without consequence. 
Corrected Test Sentence:
The event is significant.

XML Rule:
<rule id="{new_rule_id}" name="BRIEFCATCH_CONCISENESS_30120">
    <pattern>
        <token regexp="yes">and|is</token>
        <token>not</token>
        <token>without</token>
        <token regexp="yes">consequence|consequences</token>
    </pattern>
    <message>Would using fewer words help tighten the sentence?</message>
    <suggestion><match no="1"/> significant</suggestion>
    <suggestion><match no="1"/> weighty</suggestion>
    <suggestion><match no="1"/> consequential</suggestion>
    <short>{"ruleGroup":null,"ruleGroupIdx":0,"isConsistency":false,"isStyle":true,"correctionCount":3,"priority":"4.145","WORD":true,"OUTLOOK":true}</short>
    <example correction="is significant|is weighty|is consequential">The event <marker>is not without consequence</marker>.</example>
</rule>

###

Ad Hoc:
( CT(be) and ) a bit ( JJ.*? more !much !of )
Rule Number:
30115
Correction:
$0 $3
Category:
Conciseness
Explanation:
Would cutting <i>a bit</i> help tighten the sentence?
Test Sentence:
The book does this and a bit more. 
Corrected Test Sentence:
The book does this and more. 

XML Rule:
<rule id="{new_rule_id}" name="BRIEFCATCH_CONCISENESS_30115">
    <pattern>
        <or>
            <token inflected="yes">be</token>
            <token>and</token>
        </or>
        <token>a</token>
        <token>bit</token>
        <token postag="JJ.*" postag_regexp="yes">
            <exception regexp="yes">more|much|of</exception>
        </token>
    </pattern>
    <message>Would cutting *a bit* help tighten the sentence?</message>
    <suggestion><match no="1"/> <match no="4"/></suggestion>
    <short>{"ruleGroup":null,"ruleGroupIdx":0,"isConsistency":false,"isStyle":true,"correctionCount":1,"priority":"4.174","WORD":true,"OUTLOOK":true}</short>
    <example correction="was challenging">The project <marker>was a bit challenging</marker>.</example>
</rule>

###

Ad Hoc:
( RX(.*?) !abortion !anywhere !cases !chart !everywhere !used !violation ) except where ( RX(.*?) !noted !otherwise !permitted !specifically !such )
Rule Number:
30116
Correction:
$0 unless $3
Category:
Fresh Language
Explanation:
Would direct language such as <i>unless<i> convey your point just as effectively?<linebreak/><linebreak/><b>Example</b> from Justice Sotomayor: “[I]t contends that no aged-out child may retain her priority date <b>unless</b> her petition is also eligible for automatic conversion.”<linebreak/><linebreak/><b>Example</b> from Office of Legal Counsel: “The 2019 Opinion reasoned that Congress lacks constitutional authority to compel the Executive Branch . . . even when a statute vests the committee with a right to the information, <b>unless</b> the information would serve a legitimate legislative purpose.”<linebreak/><linebreak/><b>Example</b> from Morgan Chu: “During this arbitration, [Defendant] stopped paying royalties and refused to pay anything <b>unless</b> ordered to do so.”<linebreak/><linebreak/><b>Example</b> from Paul Clement: “The bottom line is that there is no preemption <b>unless</b> state law conflicts with some identifiable federal statute.”<linebreak/><linebreak/><b>Example</b> from Andy Pincus: “The law does not permit a claim for defamation <b>unless</b> the allegedly false statement has caused actual harm.”<linebreak/><linebreak/><b>Example</b> from Microsoft’s Standard Contract: “Licenses granted on a subscription basis expire at the end of the applicable subscription period set forth in the Order, <b>unless</b> renewed.”
Test Sentence:
Italic type is used for examples except where they are presented in lists.
Corrected Test Sentence:
Italic type is used for examples unless they are presented in lists.

XML Rule:
<rule id="{new_rule_id}" name="BRIEFCATCH_DIRECT_LANGUAGE_30116">
    <pattern>
        <token>
            <exception regexp="yes">abortion|anywhere|cases|chart|everywhere|used|violation</exception>
        </token>
        <token>except</token>
        <token>where</token>
        <token>
            <exception regexp="yes">noted|otherwise|permitted|specifically|such</exception>
        </token>
    </pattern>
    <message>Would direct language such as *unless* convey your point just as effectively?|**Example** from Justice Sotomayor: “[I]t contends that no aged-out child may retain her priority date **unless** her petition is also eligible for automatic conversion.”|**Example** from Office of Legal Counsel: “The 2019 Opinion reasoned that Congress lacks constitutional authority to compel the Executive Branch . . . even when a statute vests the committee with a right to the information, **unless** the information would serve a legitimate legislative purpose.”|**Example** from Morgan Chu: “During this arbitration, [Defendant] stopped paying royalties and refused to pay anything **unless** ordered to do so.”|**Example** from Paul Clement: “The bottom line is that there is no preemption **unless** state law conflicts with some identifiable federal statute.”|**Example** from Andy Pincus: “The law does not permit a claim for defamation **unless** the allegedly false statement has caused actual harm.”|**Example** from Microsoft's Standard Contract: “Licenses granted on a subscription basis expire at the end of the applicable subscription period set forth in the Order, **unless** renewed.”</message>
    <suggestion><match no="1"/> unless <match no="4"/></suggestion>
    <short>{"ruleGroup":null,"ruleGroupIdx":0,"isConsistency":false,"isStyle":true,"correctionCount":1,"priority":"4.282","WORD":true,"OUTLOOK":true}</short>
    <example correction="examples unless they">Italic type is used for <marker>examples except where they</marker> are presented in lists.</example>
</rule>

###

Ad Hoc:
SENT_START in that case , ( however though ) , ( i he if in it she there this )
Rule Number:
30136
Correction:
But $7 @ Then $6 $7 @ But then $7
Category:
Flow
Explanation:
Could shortening your opening transition add punch and help lighten the style?<linebreak/><linebreak/><b>Example</b> from Chief Justice Roberts: “<b>But</b> that argument . . . confuses mootness with the merits.”
Test Sentence:
In that case, however, this subtitle should tell you.
Corrected Test Sentence:
But this subtitle should tell you.

XML Rule:
<rule id="{new_rule_id}" name="BRIEFCATCH_FLOW_30136">
    <pattern>
        <token postag="SENT_START"/>
        <marker>
            <token>in</token>
            <token>that</token>
            <token>case</token>
            <token>,</token>
            <token regexp="yes">however|though</token>
            <token>,</token>
            <token regexp="yes">he|i|if|in|it|she|there|this</token>
        </marker>
    </pattern>
    <message>Could shortening your opening transition add punch and help lighten the style?|**Example** from Chief Justice Roberts: “**But** that argument . . . confuses mootness with the merits.”</message>
    <suggestion>But <match no="8"/></suggestion>
    <suggestion>Then<match no="7"/> <match no="8"/></suggestion>
    <suggestion>But then <match no="8"/></suggestion>
    <short>{"ruleGroup":null,"ruleGroupIdx":0,"isConsistency":false,"isStyle":true,"correctionCount":3,"priority":"8.252","WORD":true,"OUTLOOK":true}</short>
    <example correction="But this|Then, this|But then this"><marker>In that case, however, this</marker> subtitle should tell you.</example>
</rule>


###

Ad Hoc:
the aim of DT ( NN NN:U NN:UN !analysis !council !game !present !project !research !study !work ) is to VB
Rule Number:
30117
Correction:
$3 $4 seeks $6 $7
Category:
Conciseness
Explanation:
Would using fewer words help tighten the sentence?<linebreak/><linebreak/><b>Example</b> from Justice Stevens: “The holder class action that respondent <b>tried to plead</b> . . . is distinguishable from a typical Rule 10b–5 class action in only one respect[.]”<linebreak/><linebreak/><b>Example</b> from Eric Holder: “Now, more than a decade later, [Plaintiffs] <b>seek to hold</b> . . . .”<linebreak/><linebreak/><b>Example</b> from Deanne Maynard: “The industry <b>sought to</b> justify that time period by arguing that patents did not adequately protect investment in biologics[.]”
Test Sentence:
The aim of this book is to give general advice.
Corrected Test Sentence:
This book seeks to give general advice.

XML Rule:
<rule id="BRIEFCATCH_281025868524827903719537260966583393237" name="BRIEFCATCH_CONCISENESS_30117">
	<pattern>
		<token>the</token>
		<token>aim</token>
		<token>of</token>
		<token postag="DT"/>
		<token postag="NN|NN:U|NN:UN" postag_regexp="yes">
			<exception regexp="yes">analysis|council|game|present|project|research|study|work</exception>
		</token>
		<token>is</token>
		<token>to</token>
		<token postag="VB"/>
	</pattern>
	<message>Would using fewer words help tighten the sentence?|**Example** from Justice Stevens: “The holder class action that respondent **tried to plead** . . . is distinguishable from a typical Rule 10b-5 class action in only one respect[.]”|**Example** from Eric Holder: “Now, more than a decade later, [Plaintiffs] **seek to hold** . . . .”|**Example** from Deanne Maynard: “The industry **sought to** justify that time period by arguing that patents did not adequately protect investment in biologics[.]”</message>
	<suggestion><match no="4"/> <match no="5"/> seeks <match no="7"/> <match no="8"/></suggestion>
	<short>{"ruleGroup":null,"ruleGroupIdx":0,"isConsistency":false,"isStyle":true,"correctionCount":1,"priority":"8.328","WORD":true,"OUTLOOK":true}</short>
	<example correction="This book seeks to give"><marker>The aim of this book is to give</marker> general advice.</example>
</rule>

###

Ad Hoc:
( RX(.*?) !bed !conform !doctor !him !house !office !out !place !room !them !you ) as ( closely fast quickly simply soon ) as possible
Rule Number:
30118
Correction:
$0 $2
Category:
Conciseness
Explanation:
Would using fewer words help tighten the sentence?
Test Sentence:
Say it as simply as possible. 
Corrected Test Sentence:
Say it simply.

XML Rule:
<rule id="BRIEFCATCH_85326884043711870554689824506910775620" name="BRIEFCATCH_CONCISENESS_30118">
	<pattern>
		<token>
			<exception regexp="yes">bed|conform|doctor|him|house|office|out|place|room|them|you</exception>
		</token>
		<token>as</token>
		<token regexp="yes">closely|fast|quickly|simply|soon</token>
		<token>as</token>
		<token>possible</token>
	</pattern>
	<message>Would using fewer words help tighten the sentence?</message>
	<suggestion><match no="1"/> <match no="3"/></suggestion>
	<short>{"ruleGroup":null,"ruleGroupIdx":0,"isConsistency":false,"isStyle":true,"correctionCount":1,"priority":"5.262","WORD":true,"OUTLOOK":true}</short>
	<example correction="it simply">Say <marker>it as simply as possible</marker>.</example>
</rule>

###

Here are some abbreviations and their meanings that will be helpful in creating these rules:
I.             Part of Speech Tags 
CC Coordinating conjunction: for, and, nor, but, or, yet, so			
CD Cardinal number: one, two, twenty-four			
DT Determiner: a, an, all, many, much, any, some, this			
EX Existential there: there (no other words)			
FW Foreign word: infinitum, ipso			
IN Preposition/subordinate conjunction: except, inside, across, on, through, beyond, with, without			
JJ Adjective: beautiful, large, inspectable			
JJR Adjective, comparative: larger, quicker			
JJS Adjective, superlative: largest, quickest			
LS List item marker: not used by LanguageTool			
MD Modal: should, can, need, must, will, would			
NN Noun, singular count noun: bicycle, earthquake, zipper			
NNS Noun, plural: bicycles, earthquakes, zippers			
NN:U Nouns that are always uncountable #new tag - deviation from Penn, examples: admiration, Afrikaans			
NN:UN Nouns that might be used in the plural form and with an indefinite article, depending on their meaning #new tag - deviation from Penn, examples: establishment, wax, afternoon			
NNP Proper noun, singular: Denver, DORAN, Alexandra			
NNPS Proper noun, plural: Buddhists, Englishmen			
ORD Ordinal number: first, second, twenty-third, hundredth #New tag (experimental) since LT 4.9. Specified in disambiguation.xml. Examples: first, second, third, twenty-fourth, seventy-sixth			
PCT Punctuation mark: (`.,;:…!?`) #new tag - deviation from Penn			
PDT Predeterminer: all, sure, such, this, many, half, both, quite			
POS Possessive ending: s (as in: Peter's)			
PRP Personal pronoun: everyone, I, he, it, myself			
PRP$ Possessive pronoun: its, our, their, mine, my, her, his, your			
RB Adverb and negation: easily, sunnily, suddenly, specifically, not			
RBR Adverb, comparative: better, faster, quicker			
RBS Adverb, superlative: best, fastest, quickest			
RB_SENT Adverbial phrase including a comma that starts a sentence. #New tag (experimental) since LT 4.8. Specified in disambiguation.xml. Examples: However, Whenever possible, First of all, On the other hand,			
RP Particle: in, into, at, off, over, by, for, under			
SENT_END: LanguageTool tags the last token of a sentence as both SENT_END and a regular part-of-speech tag.			
SENT_START: LanguageTool tags the first token of a sentence as both SENT_START and a regular part-of-speech tag.			
SYM Symbol: rarely used by LanguageTool (e.g. for 'DD/MM/YYYY')			
TO to: to (no other words)			
UH Interjection: aargh, ahem, attention, congrats, help			
VB Verb, base form: eat, jump, believe, be, have			
VBD Verb, past tense: ate, jumped, believed			
VBG Verb, gerund/present participle: eating, jumping, believing			
VBN Verb, past participle: eaten, jumped, believed			
VBP Verb, non-3rd ps. sing. present: eat, jump, believe, am (as in 'I am'), are			
VBZ Verb, 3rd ps. sing. present: eats, jumps, believes, is, has			
WDT wh-determiner: that, whatever, what, whichever, which (no other words)			
WP wh-pronoun: that, whatever, what, whatsoever, whomsoever, whosoever, who, whom, whoever, whomever, which (no other words)			
WP$ Possessive wh-pronoun: whose (no other words)			
WRB wh-adverb: however, how, wherever, where, when, why			
II.             Regular Expressions Used in Rules			
RX(.*?) A token that can be any word, punctuation mark, or symbol.			
RX([a-zA-Z]*) A token that can be any word.			
RX([a-zA-Z]+) A token that can be any word.			
III. Rules			
Rules consist of a number of tokens, some are required and some are optional.			
In the corrections, the first token is referred to as $0, the second $1, and so forth.			
If at least one word or tag or regular expression appears inside parentheses/brackets, the entire string, including the parentheses/brackets, is considered a single token.			
If at least two words or tags or regular expressions appear inside parentheses/brackets and if there is no “~” symbol at the end of the string, then any one of those words or tags or regular expressions is a required token in the string.			
If at least one word or tag or regular expression appears inside parentheses/brackets and if there is a “~” symbol at the end of the string, then any one of those words or tags or regular expressions is an optional token in the string.			
When a word or Part of Speech tag is preceded by “!”, that word or tag is excluded from the token. For example, "( CT(be) !been )" would include "be", "is", "am", "are", and "was", "were", and "being", but it would not include "been". Thus, the rule "( CT(be) !been ) happy" would flag "He was happy" but not "He had been happy".			
“SKIP” is always followed by a cardinal number. The number tells you how many tokens can come between the preceding token and the next one. The string “dog SKIP4 cat”, for example, would flag “The dog likes the cat” but would not flag “The dog likes the neighbor’s old cat,” nor would it flag “The cat likes the dog”.			
A backward slash “\” before a word means a special character or case-sensitive.			
“CT” refers to the infinitive form of a verb that can be conjugated. “CT(read)”, for example, could be “reads”, “read”, “reading”, etc.			
IV.          Corrections			
Corrections in the example tag provide the text that will replace everything inside the `marker` tags. Make sure when creating these, the corrected sentence would make sense when substituting in the correction. This would include no overlapping or duplicated words. However, and this is very important, if a word does not match the pattern for the rule, do not include it in the correction or within the marker tags.
Sometimes a rule has more than one possible correction. In that case, multiple alternative corrections are separated by the “@” symbol.			


Important Notes:
- Always set the rule id to `{new_rule_id}`
- Only return the rule XML, do not introduce it or wrap it with back ticks.
- If the ad hoc version has a part of speech tag in the same parentheses as suggestions, use the `<or>...</or>` tag with the part of speech tag as one token and the other options as a regexp token. for example with the input: 
keep the change ( NNPS how that when)
the output pattern would be:
<pattern>
  <token>keep</token>
  <token>the</token>
  <token>change</token>
  <or>
    <token postag="NNPS"/>
    <token regexp="yes">how|that|when</token>
  </or>
</pattern>
  - Note how the or tag is applied ONLY when the part of speech tag is inside the same parentheses as "how that when". Do not use the or tag if a part of speech tag is separate from other options. If using the or tag, make sure to use the regexp field and include multiple options in one token separated by `|`.
- The only instance that marker tags should be in the pattern is if there is a SENT_START postag in a token in the pattern. In this case, all tokens that succeed the SENT_START token need to be nested within marker tags, so that the SENT_START token is applied correctly.
- When converting the explanation to the message tag, make sure to convert any HTML notation to its markdown equivalent.
- The exception tags are only used for words that are marked with `!`. If you see you need to make an exception tag, make a note of this in your thoughts to determine which group of options needs to be exceptions and which are regular regexp.

Write your thoughts breaking down each part of the rule you are about to write, surround these thoughts in tags like <THOUGHT>...</THOUGHT>. Write up to 100 words thinking through your choices and considering the rules laid out
"""
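
Two of the conversions the prompt asks for are mechanical, so they can be sketched in plain Python as a sanity check on model output: turning an ad hoc correction string into `<suggestion>` elements, and turning the HTML-style explanation into the markdown-style message. This is a minimal sketch inferred from the worked examples in the prompt, not the project's actual converter. The function names `correction_to_suggestions` and `explanation_to_message` are hypothetical, and the sketch assumes that `$N` always maps to `<match no="N+1"/>` and that a run of `<linebreak/>` tags maps to a single `|`.

```python
import re
from typing import List


def correction_to_suggestions(correction: str) -> List[str]:
    """Build <suggestion> elements from an ad hoc correction string.

    Alternatives are separated by "@". Ad hoc references are 0-indexed
    ($0 is the first token); <match no="..."/> is 1-indexed.
    """
    suggestions = []
    for alternative in correction.split("@"):
        alternative = alternative.strip()
        if not alternative:
            continue
        # Replace every $N with <match no="N+1"/>
        body = re.sub(
            r"\$(\d+)",
            lambda m: f'<match no="{int(m.group(1)) + 1}"/>',
            alternative,
        )
        suggestions.append(f"<suggestion>{body}</suggestion>")
    return suggestions


def explanation_to_message(explanation: str) -> str:
    """Convert the HTML-style explanation into the markdown-style message."""
    # A run of <linebreak/> tags becomes one "|" separator
    text = re.sub(r"(?:<linebreak\s*/>)+", "|", explanation)
    # Opening and closing italic tags both become "*"
    text = re.sub(r"</?i>", "*", text)
    # Opening and closing bold tags both become "**"
    text = re.sub(r"</?b>", "**", text)
    return text


for suggestion in correction_to_suggestions("$0 significant @ $0 weighty @ $0 consequential"):
    print(suggestion)

print(explanation_to_message("Would cutting <i>a bit</i> help tighten the sentence?"))
# prints: Would cutting *a bit* help tighten the sentence?
```

The sketch does not handle spacing around punctuation tokens (rule 30136 writes `Then<match no="7"/>` with no space because token 7 is a comma), and it leaves quotation marks and dashes untouched.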

### Test Dataset
Some example inputs and their expected outputs
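
When reading the `Ad Hoc:` line of an input, it helps to see how the pattern splits into tokens, because the `$N` references in the correction count those tokens. The sketch below follows the rule from section III of the prompt that a parenthesized group counts as a single token. The helper name `split_ad_hoc_tokens` is hypothetical, and the sketch assumes parentheses are balanced and does not handle backslash-escaped characters.

```python
from typing import List


def split_ad_hoc_tokens(pattern: str) -> List[str]:
    """Split an ad hoc pattern into tokens.

    Whitespace separates tokens only outside parentheses, so a group such
    as "( and is )" stays whole, as do CT(be) and RX(.*?).
    """
    tokens: List[str] = []
    current: List[str] = []
    depth = 0  # current parenthesis nesting level
    for ch in pattern.strip():
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        if ch.isspace() and depth == 0:
            # Token boundary: flush whatever has been collected
            if current:
                tokens.append("".join(current))
                current = []
        else:
            current.append(ch)
    if current:
        tokens.append("".join(current))
    return tokens


# $0 is index 0 of the returned list, $1 is index 1, and so on
for index, token in enumerate(split_ad_hoc_tokens("( CT(be) and ) a bit ( JJ.*? more !much !of )")):
    print(f"${index}: {token}")
```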

In [4]:
example_one = """Ad Hoc:
SENT_START keep in mind ( NNP how that the what when )
Rule Number:
30119
Correction:
Remember $4 @ Recall $4
Category:
Conciseness
Explanation:
Would using fewer words help tighten the sentence?
Test Sentence:
Keep in mind George Orwell’s six rules. 
Corrected Test Sentence:
Remember George Orwell’s six rules.

XML Rule:"""

In [5]:
example_one_messages = generate_simple_message(SYSTEM_PROMPT, example_one)
example_one_response = call_gpt_with_backoff(messages=example_one_messages, model="gpt-4-1106-preview", temperature=0, max_length=1480)
print(example_one_response)

<THOUGHT>
The ad hoc rule starts with SENT_START, which means the first token in the pattern should be tagged with SENT_START and the subsequent tokens should be nested within marker tags. The phrase "keep in mind" is followed by a set of options that include a part of speech tag (NNP) and other words. This requires the use of an <or> tag with the part of speech as one token and the other options as a regexp token. The correction offers two alternatives, separated by the "@" symbol. The explanation will be converted to markdown format for the message tag. The exception tags are not needed here as there are no words marked with "!".
</THOUGHT>
<rule id="{new_rule_id}" name="BRIEFCATCH_CONCISENESS_30119">
    <pattern>
        <token postag="SENT_START"/>
        <marker>
            <token>keep</token>
            <token>in</token>
            <token>mind</token>
            <or>
                <token postag="NNP"/>
                <token regexp="yes">how|that|the|what|when</token>
   

Expected output:
```
<rule id="{new_rule_id}" name="BRIEFCATCH_CONCISENESS_30119">
        <pattern>
                <token postag="SENT_START"/>
                <marker>
                        <token>keep</token>
                        <token>in</token>
                        <token>mind</token>
                        <or>
                                <token postag="NNP"/>
                                <token regexp="yes">how|that|the|what|when</token>
                        </or>
                </marker>
        </pattern>
        <message>Would using fewer words help tighten the sentence?</message>
        <suggestion>Remember <match no="5"/></suggestion>
        <suggestion>Recall <match no="5"/></suggestion>
        <short>{"ruleGroup":null,"ruleGroupIdx":0,"isConsistency":false,"isStyle":true,"correctionCount":2,"priority":"5.128","WORD":true,"OUTLOOK":true}</short>
        <example correction="Remember George|Recall George"><marker>Keep in mind George</marker> Orwell`s six rules.</example>
</rule>
```
---
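
Comparing a response with its expected output by eye gets tedious, since the response starts with a `<THOUGHT>` block and the two texts use different indentation. The sketch below shows one possible way to automate the comparison: drop the thoughts, strip indentation, and test for equality. The helper names are hypothetical, and strict line-by-line equality is an assumption about how a match should be judged, not the method used in the evaluation sheet.

```python
import re


def strip_thoughts(response: str) -> str:
    """Remove every <THOUGHT>...</THOUGHT> block from a model response."""
    return re.sub(r"<THOUGHT>.*?</THOUGHT>", "", response, flags=re.DOTALL).strip()


def normalize_xml(xml_text: str) -> str:
    """Strip indentation and blank lines so tabs vs. spaces do not matter."""
    lines = (line.strip() for line in xml_text.splitlines())
    return "\n".join(line for line in lines if line)


def matches_expected(response: str, expected: str) -> bool:
    """Return True when the rule XML in `response` equals `expected` line by line."""
    return normalize_xml(strip_thoughts(response)) == normalize_xml(expected)


response = '<THOUGHT>\nnotes\n</THOUGHT>\n<rule id="x">\n    <pattern/>\n</rule>'
expected = '<rule id="x">\n\t<pattern/>\n</rule>\n'
print(matches_expected(response, expected))  # prints: True
```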

In [6]:
example_two = """Ad Hoc:
( and is ) not without ( consequence consequences )
Rule Number:
30120
Correction:
$0 significant @ $0 weighty @ $0 consequential 
Category:
Conciseness
Explanation:
Would using fewer words help tighten the sentence?
Test Sentence:
The event is not without consequence. 
Corrected Test Sentence:
The event is significant.

XML Rule:"""

In [7]:
example_two_messages = generate_simple_message(SYSTEM_PROMPT, example_two)
example_two_response = call_gpt_with_backoff(messages=example_two_messages, model="gpt-4-1106-preview", temperature=0, max_length=1480)
print(example_two_response)

<THOUGHT>
The pattern consists of two groups of tokens with options, which means we'll use the <or>...</or> tag for each group. The first group "( and is )" will be translated into an <or> tag containing "and" and "is" as separate tokens. The second group "( consequence consequences )" will be translated into a single token with a regexp attribute to match either "consequence" or "consequences". The phrase "not without" will be two separate tokens. Since there's no SENT_START, we won't use marker tags in the pattern. The correction offers three alternatives, so we'll use the "@" symbol to separate them. The message will be converted from HTML to markdown.
</THOUGHT>
<rule id="{new_rule_id}" name="BRIEFCATCH_CONCISENESS_30120">
    <pattern>
        <or>
            <token>and</token>
            <token>is</token>
        </or>
        <token>not</token>
        <token>without</token>
        <token regexp="yes">consequence|consequences</token>
    </pattern>
    <message>Would using fe

Expected output:
```
<rule id="{new_rule_id}" name="BRIEFCATCH_CONCISENESS_30120">
        <pattern>
                <token regexp="yes">and|is</token>
                <token>not</token>
                <token>without</token>
                <token regexp="yes">consequence|consequences</token>
        </pattern>
        <message>Would using fewer words help tighten the sentence?</message>
        <suggestion><match no="1"/> significant</suggestion>
        <suggestion><match no="1"/> weighty</suggestion>
        <suggestion><match no="1"/> consequential</suggestion>
        <short>{"ruleGroup":null,"ruleGroupIdx":0,"isConsistency":false,"isStyle":true,"correctionCount":3,"priority":"4.145","WORD":true,"OUTLOOK":true}</short>
        <example correction="is significant|is weighty|is consequential">The event <marker>is not without consequence</marker>.</example>
</rule>
```
---

In [8]:
example_three = """Ad Hoc:
CT(be) ( fairly quite rather somewhat ) ( afraid available clear difficult easy essential good important likely necessary possible ready similar sure true wrong )
Rule Number:
30122
Correction:
$0 $2
Category:
Conciseness
Explanation:
Would cutting this implied modifier help strengthen the sentence?
Test Sentence:
It is quite easy to rewrite an article. 
Corrected Test Sentence:
It is easy to rewrite an article.

XML Rule:"""

In [9]:
example_three_messages = generate_simple_message(SYSTEM_PROMPT, example_three)
example_three_response = call_gpt_with_backoff(messages=example_three_messages, model="gpt-4-1106-preview", temperature=0, max_length=1480)
print(example_three_response)

<THOUGHT>
The ad hoc rule specifies a conjugated form of "be" (CT(be)), followed by a group of adverbs that can modify an adjective, and then a list of adjectives. The correction suggests removing the adverb to make the sentence more concise. The pattern will include the conjugated "be" followed by one of the adverbs and then one of the adjectives. The message will explain the rationale for the correction, and the example will show the correction applied. The exception tags are not needed here because there are no words marked with `!`.
</THOUGHT>

<rule id="{new_rule_id}" name="BRIEFCATCH_CONCISENESS_30122">
    <pattern>
        <token inflected="yes">be</token>
        <token regexp="yes">fairly|quite|rather|somewhat</token>
        <token regexp="yes">afraid|available|clear|difficult|easy|essential|good|important|likely|necessary|possible|ready|similar|sure|true|wrong</token>
    </pattern>
    <message>Would cutting this implied modifier help strengthen the sentence?</message>
   

Expected output:
```
<rule id="{new_rule_id}" name="BRIEFCATCH_CONCISENESS_30122">
        <pattern>
                <token inflected="yes">be</token>
                <token regexp="yes">fairly|quite|rather|somewhat</token>
                <token regexp="yes">afraid|available|clear|difficult|easy|essential|good|important|likely|necessary|possible|ready|similar|sure|true|wrong</token>
        </pattern>
        <message>Would cutting this implied modifier help strengthen the sentence?</message>
        <suggestion><match no="1"/> <match no="3"/></suggestion>
        <short>{"ruleGroup":null,"ruleGroupIdx":0,"isConsistency":false,"isStyle":true,"correctionCount":1,"priority":"3.249","WORD":true,"OUTLOOK":true}</short>
        <example correction="is easy">It <marker>is quite easy</marker> to rewrite an article.</example>
</rule>
```
---

In [10]:
example_four = """Ad Hoc:
really ( V.*? !did !do !have !know !think !want !wanted )
Rule Number:
30123
Correction:
$1 
Category:
Conciseness
Explanation:
Would cutting this implied modifier help strengthen the sentence?
Test Sentence:
They may also wonder whether these two people really exist. 
Corrected Test Sentence:
They may also wonder whether these two people exist.

XML Rule:"""

In [11]:
example_four_messages = generate_simple_message(SYSTEM_PROMPT, example_four)
example_four_response = call_gpt_with_backoff(messages=example_four_messages, model="gpt-4-1106-preview", temperature=0, max_length=1480)
print(example_four_response)

<THOUGHT>
The ad hoc rule suggests that the word "really" is unnecessary when used before certain verbs, except for a list of verbs where it should not be removed (did, do, have, know, think, want, wanted). The pattern will include "really" followed by a verb token that excludes the listed verbs. The correction suggests removing "really" and keeping the verb, which is referred to as $1 in the correction. The explanation will be converted to markdown for the message tag. The test sentence and corrected test sentence provide context for the example tag.
</THOUGHT>

<rule id="{new_rule_id}" name="BRIEFCATCH_CONCISENESS_30123">
    <pattern>
        <token>really</token>
        <token postag="V.*" postag_regexp="yes">
            <exception regexp="yes">did|do|have|know|think|want|wanted</exception>
        </token>
    </pattern>
    <message>Would cutting this implied modifier help strengthen the sentence?</message>
    <suggestion><match no="2"/></suggestion>
    <short>{"ruleGroup":nu

Expected output:
```
<rule id="{new_rule_id}" name="BRIEFCATCH_CONCISENESS_30123">
        <pattern>
                <token>really</token>
                <token postag="V.*" postag_regexp="yes">
                        <exception regexp="yes">did|do|have|know|think|want|wanted</exception>
                </token>
        </pattern>
        <message>Would cutting this implied modifier help strengthen the sentence?</message>
        <suggestion><match no="2"/></suggestion>
        <short>{"ruleGroup":null,"ruleGroupIdx":0,"isConsistency":false,"isStyle":true,"correctionCount":1,"priority":"2.159","WORD":true,"OUTLOOK":true}</short>
        <example correction="exist">They may also wonder whether these two people <marker>really exist</marker>.</example>
</rule>
```

---

In [12]:
example_five = """Ad Hoc:
(CT(be) and but they i he they have ) not ( generally typically usually ) ( RX(.*?) !accepted !considered !known )
Rule Number:
30124
Correction:
$0 rarely $3 @ $0 seldom $3 
Category:
Conciseness
Explanation:
Would using fewer words help tighten the sentence?
Test Sentence:
They are not generally definite articles.
Corrected Test Sentence:
They are rarely definite articles.

XML Rule:"""

In [13]:
example_five_messages = generate_simple_message(SYSTEM_PROMPT, example_five)
example_five_response = call_gpt_with_backoff(messages=example_five_messages, model="gpt-4-1106-preview", temperature=0, max_length=1480)
print(example_five_response)

<THOUGHT>First, I need to identify the pattern components from the ad hoc rule. The "CT(be)" indicates conjugated forms of "be," and "but they i he they have" are alternatives that should be included in an <or> tag. The "not" is a required token. The adverbs "generally typically usually" are alternatives that should be included in a regexp token. The "RX(.*?) !accepted !considered !known" indicates any token except for the listed exceptions. The correction suggests replacing "not generally/typically/usually" with "rarely" or "seldom," so I'll use two suggestions. The explanation will be converted to markdown for the message tag.</THOUGHT>

<rule id="{new_rule_id}" name="BRIEFCATCH_CONCISENESS_30124">
    <pattern>
        <or>
            <token inflected="yes">be</token>
            <token>and</token>
            <token>but</token>
            <token>they</token>
            <token>i</token>
            <token>he</token>
            <token>have</token>
        </or>
        <token>not

Expected output:
```
<rule id="{new_rule_id}" name="BRIEFCATCH_CONCISENESS_30124">
        <pattern>
                <or>
                        <token inflected="yes">be</token>
                        <token regexp="yes">and|but|they|i|he|they|have</token>
                </or>
                <token>not</token>
                <token regexp="yes">generally|typically|usually</token>
                <token>
                        <exception regexp="yes">accepted|considered|known</exception>
                </token>
        </pattern>
        <message>Would using fewer words help tighten the sentence?</message>
        <suggestion><match no="1"/> rarely <match no="4"/></suggestion>
        <suggestion><match no="1"/> seldom <match no="4"/></suggestion>
        <short>{"ruleGroup":null,"ruleGroupIdx":0,"isConsistency":false,"isStyle":true,"correctionCount":2,"priority":"4.174","WORD":true,"OUTLOOK":true}</short>
        <example correction="are rarely definite|are seldom definite">They <marker>are not generally definite</marker> articles.</example>
</rule>
```
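The expected output above suggests how the ad hoc `Correction` field maps to `<suggestion>` elements: `@` separates alternatives, and a zero-based `$N` reference becomes a one-based `<match no="N+1"/>`. The sketch below illustrates that mapping only. The helper name `correction_to_suggestions` is hypothetical, and the rule is inferred from the examples in this notebook rather than from a specification. It does not handle the `$3-$0` re-conjugation syntax or spacing around punctuation tokens (for example `Then<match no="7"/>`).

```python
import re
from typing import List


def correction_to_suggestions(correction: str) -> List[str]:
    """Turn an ad hoc correction string into <suggestion> elements.

    Assumption (inferred from the examples): "@" separates alternatives and
    "$N" is a zero-based token index, so "$N" becomes <match no="N+1"/>.
    """
    suggestions = []
    for alternative in correction.split("@"):
        alternative = alternative.strip()
        if not alternative:
            continue
        # Shift each zero-based "$N" reference to a one-based match number.
        body = re.sub(
            r"\$(\d+)",
            lambda m: f'<match no="{int(m.group(1)) + 1}"/>',
            alternative,
        )
        suggestions.append(f"<suggestion>{body}</suggestion>")
    return suggestions


print("\n".join(correction_to_suggestions("$0 rarely $3 @ $0 seldom $3 ")))
```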
---

In [14]:
example_six = """Ad Hoc:
CT(do) not ( generally typically usually ) ( VB !give !include !take )
Rule Number:
30125
Correction:
rarely $3-$0 @ seldom $3-$0 
Category:
Conciseness
Explanation:
Would using fewer words help tighten the sentence?
Test Sentence:
They do not generally require a definite article.
Corrected Test Sentence:
They rarely require a definite article.

XML Rule:"""

In [15]:
example_six_messages = generate_simple_message(SYSTEM_PROMPT, example_six)
example_six_response = call_gpt_with_backoff(messages=example_six_messages, model="gpt-4-1106-preview", temperature=0, max_length=1480)
print(example_six_response)

<THOUGHT>
The ad hoc rule specifies a conjugatable verb "do" followed by "not" and a choice of adverbs "generally," "typically," or "usually," and then a verb that is not "give," "include," or "take." The correction suggests replacing the phrase with "rarely" or "seldom" followed by the verb and the conjugated form of "do." The pattern will use the <or> tag to include the adverbs as a single token with a regexp attribute. The verb will be a separate token with an exception for the words "give," "include," and "take." The correction uses a hyphen to connect the adverb to the conjugated form of "do," which is represented by $0. The message will convert the HTML <i> tags to markdown italics.
</THOUGHT>
<rule id="{new_rule_id}" name="BRIEFCATCH_CONCISENESS_30125">
    <pattern>
        <token inflected="yes">do</token>
        <token>not</token>
        <or>
            <token regexp="yes">generally|typically|usually</token>
        </or>
        <token postag="VB" postag_regexp="yes">
   

Expected output:
```
<rule id="{new_rule_id}" name="BRIEFCATCH_CONCISENESS_30125">
    <pattern>
        <marker>
            <token inflected="yes">do</token>
            <token>not</token>
            <token regexp="yes">generally|typically|usually</token>
            <token postag="VB" postag_regexp="yes">
                <exception regexp="yes">give|include|take</exception>
            </token>
        </marker>
    </pattern>
    <message>Would using fewer words help tighten the sentence?</message>
    <suggestion>rarely <match no="4"/></suggestion>
    <suggestion>seldom <match no="4"/></suggestion>
    <short>{"ruleGroup":null,"ruleGroupIdx":0,"isConsistency":false,"isStyle":true,"correctionCount":2,"priority":"5.262","WORD":true,"OUTLOOK":true}</short>
    <example correction="rarely require|seldom require">They <marker>do not generally require</marker> a definite article.</example>
</rule>
```
---

In [16]:
example_seven = """Ad Hoc:
( RX(.*?) !closed !him !prohibited !time !times !used ) except when ( he i it otherwise the there they we you )
Rule Number:
30132
Correction:
$0 unless $3
Category:
Fresh Language
Explanation:
Would direct language such as <i>unless</i> convey your point just as effectively?<linebreak/><linebreak/><b>Example</b> from Justice Sotomayor: “[I]t contends that no aged-out child may retain her priority date <b>unless</b> her petition is also eligible for automatic conversion.”<linebreak/><linebreak/><b>Example</b> from Office of Legal Counsel: “The 2019 Opinion reasoned that Congress lacks constitutional authority to compel the Executive Branch . . . even when a statute vests the committee with a right to the information, <b>unless</b> the information would serve a legitimate legislative purpose.”<linebreak/><linebreak/><b>Example</b> from Morgan Chu: “During this arbitration, [Defendant] stopped paying royalties and refused to pay anything <b>unless</b> ordered to do so.”<linebreak/><linebreak/><b>Example</b> from Paul Clement: “The bottom line is that there is no preemption <b>unless</b> state law conflicts with some identifiable federal statute.”<linebreak/><linebreak/><b>Example</b> from Andy Pincus: “The law does not permit a claim for defamation <b>unless</b> the allegedly false statement has caused actual harm.”<linebreak/><linebreak/><b>Example</b> from Microsoft’s Standard Contract: “Licenses granted on a subscription basis expire at the end of the applicable subscription period set forth in the Order, <b>unless</b> renewed.”
Test Sentence:
Omit except when it is part of a name.
Corrected Test Sentence:
Omit unless it is part of a name.

XML Rule:"""

In [17]:
example_seven_messages = generate_simple_message(SYSTEM_PROMPT, example_seven)
example_seven_response = call_gpt_with_backoff(messages=example_seven_messages, model="gpt-4-1106-preview", temperature=0, max_length=1480)
print(example_seven_response)

<THOUGHT>The pattern consists of a token that can be any word except the ones listed with "!", followed by the words "except when", and then a list of pronouns and the word "otherwise". The exception tag is needed for the first token to exclude the words listed with "!". The second part of the pattern is straightforward, with a list of pronouns and "otherwise" that do not require exceptions or regular expressions. The correction suggests replacing "except when" with "unless", so the suggestion tag will include the first token and the third token from the pattern. The message will include the explanation with HTML tags converted to markdown. The examples provided in the explanation will be included in the message as well.</THOUGHT>

<rule id="{new_rule_id}" name="BRIEFCATCH_FRESH_LANGUAGE_30132">
    <pattern>
        <token>
            <exception regexp="yes">closed|him|prohibited|time|times|used</exception>
        </token>
        <token>except</token>
        <token>when</token>
  

Expected output:
```
<rule id="{new_rule_id}" name="BRIEFCATCH_DIRECT_LANGUAGE_30132">
        <pattern>
                <token>
                        <exception regexp="yes">closed|him|prohibited|time|times|used</exception>
                </token>
                <token>except</token>
                <token>when</token>
                <token regexp="yes">he|i|it|otherwise|the|there|they|we|you</token>
        </pattern>
        <message>Would direct language such as *unless* convey your point just as effectively?|**Example** from Justice Sotomayor: “[I]t contends that no aged-out child may retain her priority date **unless** her petition is also eligible for automatic conversion.”|**Example** from Office of Legal Counsel: “The 2019 Opinion reasoned that Congress lacks constitutional authority to compel the Executive Branch . . . even when a statute vests the committee with a right to the information, **unless** the information would serve a legitimate legislative purpose.”|**Example** from Morgan Chu: “During this arbitration, [Defendant] stopped paying royalties and refused to pay anything **unless** ordered to do so.”|**Example** from Paul Clement: “The bottom line is that there is no preemption **unless** state law conflicts with some identifiable federal statute.”|**Example** from Andy Pincus: “The law does not permit a claim for defamation **unless** the allegedly false statement has caused actual harm.”|**Example** from Microsoft's Standard Contract: “Licenses granted on a subscription basis expire at the end of the applicable subscription period set forth in the Order, **unless** renewed.”</message>
        <suggestion><match no="1"/> unless <match no="4"/></suggestion>
        <short>{"ruleGroup":null,"ruleGroupIdx":0,"isConsistency":false,"isStyle":true,"correctionCount":1,"priority":"4.225","WORD":true,"OUTLOOK":true}</short>
        <example correction="Omit unless it"><marker>Omit except when it</marker> is part of a name.</example>
</rule>
```
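Comparing the `Explanation` input with the expected `<message>` shows a consistent conversion: `<i>` becomes `*`, `<b>` becomes `**`, and each run of `<linebreak/>` tags collapses to a single `|`. A minimal sketch of that conversion follows. The function name `explanation_to_message` is hypothetical, and the mapping is an assumption drawn only from the examples in this notebook.

```python
import re


def explanation_to_message(explanation: str) -> str:
    """Convert an ad hoc explanation into the message format seen above.

    Assumed mapping, inferred from the examples only:
    <i>..</i> -> *..*, <b>..</b> -> **..**, runs of <linebreak/> -> "|".
    """
    # Collapse one or more consecutive <linebreak/> tags into a single pipe.
    message = re.sub(r"(?:<linebreak/>\s*)+", "|", explanation)
    # Opening and closing tags both map to the same markdown delimiter.
    message = re.sub(r"</?b>", "**", message)
    message = re.sub(r"</?i>", "*", message)
    return message.strip()


print(explanation_to_message(
    "Would direct language such as <i>unless</i> convey your point just as "
    "effectively?<linebreak/><linebreak/><b>Example</b> from Paul Clement: ..."
))
```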
---

In [18]:
example_eight = """Ad Hoc:
SENT_START in that case , ( however though ) , ( i he if in it she there this )
Rule Number:
30136
Correction:
But $7 @ Then $6 $7 @ But then $7
Category:
Flow
Explanation:
Could shortening your opening transition add punch and help lighten the style?<linebreak/><linebreak/><b>Example</b> from Chief Justice Roberts: “<b>But</b> that argument . . . confuses mootness with the merits.”
Test Sentence:
In that case, however, this subtitle should tell you.
Corrected Test Sentence:
But this subtitle should tell you.

XML Rule:"""

In [19]:
example_eight_messages = generate_simple_message(SYSTEM_PROMPT, example_eight)
example_eight_response = call_gpt_with_backoff(messages=example_eight_messages, model="gpt-4-1106-preview", temperature=0, max_length=1480)
print(example_eight_response)

<THOUGHT>
The pattern starts with SENT_START which means the first token of a sentence. Following this, we have a fixed phrase "in that case," which should be matched exactly. The next token is a choice between "however" and "though," so we'll use a regular expression with the "regexp" attribute set to "yes." The final token is a choice between several pronouns and the word "in," which will also be a regular expression. Since SENT_START is used, we need to nest all subsequent tokens within <marker> tags. The correction offers three alternatives, so we'll use the "@" symbol to separate them. The message will include the explanation and the example from Chief Justice Roberts, with HTML tags converted to markdown.
</THOUGHT>

<rule id="{new_rule_id}" name="BRIEFCATCH_FLOW_30136">
    <pattern>
        <token postag="SENT_START"/>
        <marker>
            <token>in</token>
            <token>that</token>
            <token>case</token>
            <token>,</token>
            <token re

Expected output:
```
<rule id="{new_rule_id}" name="BRIEFCATCH_FLOW_30136">
        <pattern>
                <token postag="SENT_START"/>
                <marker>
                        <token>in</token>
                        <token>that</token>
                        <token>case</token>
                        <token>,</token>
                        <token regexp="yes">however|though</token>
                        <token>,</token>
                        <token regexp="yes">he|i|if|in|it|she|there|this</token>
                </marker>
        </pattern>
        <message>Could shortening your opening transition add punch and help lighten the style?|**Example** from Chief Justice Roberts: “**But** that argument . . . confuses mootness with the merits.”</message>
        <suggestion>But <match no="8"/></suggestion>
        <suggestion>Then<match no="7"/> <match no="8"/></suggestion>
        <suggestion>But then <match no="8"/></suggestion>
        <short>{"ruleGroup":null,"ruleGroupIdx":0,"isConsistency":false,"isStyle":true,"correctionCount":3,"priority":"8.252","WORD":true,"OUTLOOK":true}</short>
        <example correction="But this|Then, this|But then this"><marker>In that case, however, this</marker> subtitle should tell you.</example>
</rule>
```
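Several responses in this notebook stop partway through a tag, so it can help to check whether a response contains a complete, well-formed `<rule>` element before comparing it with the expected output. The sketch below is one way to do that with the standard library. `extract_rule_xml` is a hypothetical helper, not part of the notebook's utilities, and it assumes the response contains at most one `<rule>` element after the `<THOUGHT>` block.

```python
import re
import xml.etree.ElementTree as ET
from typing import Optional


def extract_rule_xml(response: str) -> Optional[str]:
    """Return the <rule>...</rule> block from a model response, or None.

    None means no complete, well-formed <rule> element was found, which is
    what a response cut off mid-tag looks like.
    """
    match = re.search(r"<rule\b.*?</rule>", response, flags=re.DOTALL)
    if match is None:
        return None
    rule_xml = match.group(0)
    try:
        # Parsing is used only as a well-formedness check.
        ET.fromstring(rule_xml)
    except ET.ParseError:
        return None
    return rule_xml


truncated = '<rule id="x" name="y">\n  <pattern>\n    <token>not'
complete = (
    "<THOUGHT>...</THOUGHT>\n"
    '<rule id="x" name="y"><pattern><token>not</token></pattern></rule>'
)
print(extract_rule_xml(truncated))             # None
print(extract_rule_xml(complete) is not None)  # True
```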
---

In [20]:
example_nine = """Ad Hoc:
( RX(.*?) !for !in !on !that !through !to !with ) the use of the ( RX(.*?) !band !land !phrase !verb !word !words )
Rule Number:
30156
Correction:
$0 using $4 $5
Category:
Conciseness
Explanation:
Would using fewer words and cutting the <i>of</i> phrase help tighten the sentence?
Test Sentence:
But the use of the dictionary is wrong.
Corrected Test Sentence:
But using the dictionary is wrong.

XML Rule:"""

In [21]:
example_nine_messages = generate_simple_message(SYSTEM_PROMPT, example_nine)
example_nine_response = call_gpt_with_backoff(messages=example_nine_messages, model="gpt-4-1106-preview", temperature=0, max_length=1480)
print(example_nine_response)

<THOUGHT>
The pattern consists of two parts with exceptions. The first part is a wildcard token that excludes specific prepositions and conjunctions. The second part is another wildcard token that excludes certain nouns related to language elements. The correction suggests replacing "the use of the" with "using". The message will include the explanation with HTML tags converted to markdown. The example will show the correction applied within the sentence, ensuring that the corrected part is within the marker tags and that the correction makes sense in context.
</THOUGHT>

<rule id="{new_rule_id}" name="BRIEFCATCH_CONCISENESS_30156">
    <pattern>
        <token>
            <exception regexp="yes">for|in|on|that|through|to|with</exception>
        </token>
        <token>the</token>
        <token>use</token>
        <token>of</token>
        <token>
            <exception regexp="yes">band|land|phrase|verb|word|words</exception>
        </token>
        <token>
            <exception 

Expected output:
```
<rule id="{new_rule_id}" name="BRIEFCATCH_PUNCHINESS_30156">
        <pattern>
                <token>
                        <exception regexp="yes">for|in|on|that|through|to|with</exception>
                </token>
                <token>the</token>
                <token>use</token>
                <token>of</token>
                <token>the</token>
                <token>
                        <exception regexp="yes">band|land|phrase|verb|word|words</exception>
                </token>
        </pattern>
        <message>Would using fewer words and cutting the *of* phrase help tighten the sentence?</message>
        <suggestion><match no="1"/> using <match no="5"/> <match no="6"/></suggestion>
        <short>{"ruleGroup":null,"ruleGroupIdx":0,"isConsistency":false,"isStyle":true,"correctionCount":1,"priority":"6.286","WORD":true,"OUTLOOK":true}</short>
        <example correction="But using the dictionary"><marker>But the use of the dictionary</marker> is wrong.</example>
</rule>
```
---

In [22]:
example_ten = """Ad Hoc:
( CT(be) and ) a bit ( JJ.*? more !much !of )
Rule Number:
30115
Correction:
$0 $3
Category:
Conciseness
Explanation:
Would cutting <i>a bit</i> help tighten the sentence?
Test Sentence:
The book does this and a bit more. 
Corrected Test Sentence:
The book does this and more. 

XML Rule:"""

In [23]:
example_ten_messages = generate_simple_message(SYSTEM_PROMPT, example_ten)
example_ten_response = call_gpt_with_backoff(messages=example_ten_messages, model="gpt-4-1106-preview", temperature=0, max_length=1480)
print(example_ten_response)

<THOUGHT>
The ad hoc rule specifies a pattern that includes either a conjugated form of "be" or the word "and", followed by the phrase "a bit" and an adjective that is not "much" or "of". The correction suggests removing "a bit" to make the sentence more concise. The explanation will be converted to markdown format for the message tag. The pattern will use an <or> tag to include both "be" with inflection and "and" as options. The adjective will use a postag with a regular expression and exceptions.
</THOUGHT>

<rule id="{new_rule_id}" name="BRIEFCATCH_CONCISENESS_30115">
    <pattern>
        <or>
            <token inflected="yes">be</token>
            <token>and</token>
        </or>
        <token>a</token>
        <token>bit</token>
        <token postag="JJ.*" postag_regexp="yes">
            <exception regexp="yes">more|much|of</exception>
        </token>
    </pattern>
    <message>Would cutting *a bit* help tighten the sentence?</message>
    <suggestion><match no="1"/> <matc

Expected output:
```
<rule id="BRIEFCATCH_164054315699492609263729987293589324728" name="BRIEFCATCH_CONCISENESS_30115">
    <pattern>
        <or>
                <token inflected="yes">be</token>
                <token>and</token>
        </or>
        <token>a</token>
        <token>bit</token>
        <token postag="JJ.*" postag_regexp="yes">
                <exception regexp="yes">more|much|of</exception>
        </token>
    </pattern>
    <message>Would cutting *a bit* help tighten the sentence?</message>
    <suggestion><match no="1"/> <match no="4"/></suggestion>
    <short>{"ruleGroup":null,"ruleGroupIdx":0,"isConsistency":false,"isStyle":true,"correctionCount":1,"priority":"4.174","WORD":true,"OUTLOOK":true}</short>
    <example correction="and more">The book does this <marker>and a bit more</marker>.</example>
</rule>
```
---

In [24]:
example_eleven = """Ad Hoc:
a ( sudden ~ ) surprise move
Rule Number:
3240
Correction:
a surprise @ a move @ surprising @ unexpected
Category:
Fresh Language
Explanation:
<b>A surprise move</b> is a cliché. Could direct language convey your point just as effectively?
Test Sentence:
She made a sudden surprise move. 
Corrected Test Sentence:
She made a surprise.

XML Rule:"""

In [25]:
example_eleven_messages = generate_simple_message(SYSTEM_PROMPT, example_eleven)
example_eleven_response = call_gpt_with_backoff(messages=example_eleven_messages, model="gpt-4-1106-preview", temperature=0, max_length=1480)
print(example_eleven_response)

<THOUGHT>
The ad hoc rule indicates that the word "sudden" is optional, as denoted by the tilde (~). The phrase "surprise move" is the main target for correction. The correction offers four alternatives, which means we need to provide multiple suggestions. The category is "Fresh Language," and the explanation suggests avoiding the cliché "a surprise move." The test sentence includes the optional word "sudden," so the corrected sentence should work without it. We need to ensure that the pattern accounts for the optional nature of "sudden" and that the suggestions reflect the alternatives provided. The message will be adapted from the explanation, using markdown instead of HTML for the bold text.
</THOUGHT>

<rule id="{new_rule_id}" name="BRIEFCATCH_FRESH_LANGUAGE_3240">
    <pattern>
        <token>a</token>
        <token regexp="yes">sudden</token>?
        <token>surprise</token>
        <token>move</token>
    </pattern>
    <message>A *surprise move* is a cliché. Could direct langu

Expected output:
```
<rule id="BRIEFCATCH_4496626169111403644393793089759868674587" name="BRIEFCATCH_FRESH_LANGUAGE_3240">
    <pattern>
        <token>a</token>
        <token min="0">sudden</token>
        <token>surprise</token>
        <token>move</token>
    </pattern>
    <message>**A surprise move** is a cliché. Could direct language convey your point just as effectively?</message>
    <suggestion>a surprise</suggestion>
    <suggestion>a move</suggestion>
    <suggestion>surprising</suggestion>
    <suggestion>unexpected</suggestion>
    <short>{"ruleGroup":null,"ruleGroupIdx":0,"isConsistency":false,"isStyle":true,"correctionCount":4,"priority":"5.0","WORD":true,"OUTLOOK":true}</short>
    <example correction="a surprise|a move|surprising|unexpected">She made <marker>a sudden surprise move</marker>.</example>
</rule>
```
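The expected outputs in this notebook suggest three recurring shapes for a parenthesised ad hoc group: a list of literal alternatives becomes a `regexp="yes"` token, a group containing `~` becomes an optional token with `min="0"`, and an `RX(.*?)` wildcard with `!word` entries becomes a token holding an `<exception>`. The sketch below covers only those three shapes. `group_to_token` is a hypothetical helper, the rules are inferred from the examples rather than from a specification of the ad hoc format, and groups containing `CT(...)` or part-of-speech tags are deliberately rejected.

```python
def group_to_token(group: str) -> str:
    """Render one parenthesised ad hoc group as a LanguageTool <token>.

    Handles only literal alternatives, "~" (optional), and an "RX(.*?)"
    wildcard with "!word" exceptions. Anything else raises ValueError.
    """
    text = group.strip()
    if text.startswith("(") and text.endswith(")"):
        text = text[1:-1]
    parts = text.split()

    optional = "~" in parts
    exceptions = [p[1:] for p in parts if p.startswith("!")]
    words = [
        p for p in parts
        if p not in ("~", "RX(.*?)") and not p.startswith("!")
    ]
    # Mixed or function-like groups (e.g. CT(be)) are out of scope here.
    if any("(" in w for w in words) or (words and exceptions):
        raise ValueError(f"unsupported group shape: {group!r}")
    if not words and not exceptions:
        raise ValueError(f"empty group: {group!r}")

    attrs = ' min="0"' if optional else ""
    if exceptions:
        joined = "|".join(exceptions)
        return f'<token{attrs}><exception regexp="yes">{joined}</exception></token>'
    if len(words) > 1:
        attrs += ' regexp="yes"'
    return f"<token{attrs}>{'|'.join(words)}</token>"


print(group_to_token("( sudden ~ )"))
print(group_to_token("( generally typically usually )"))
print(group_to_token("( RX(.*?) !accepted !considered !known )"))
```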
---

In [26]:
example_twelve = """Ad Hoc:
CT(do) not ( generally typically usually ) ( VB !give !include !take )
Rule Number:
30125
Correction:
rarely $3-$0 @ seldom $3-$0
Category:
Conciseness
Explanation:
Would using fewer words help tighten the sentence?
Test Sentence:
They do not generally require a definite article.
Corrected Test Sentence:
They rarely require a definite article.

XML Rule:"""

In [27]:
example_twelve_messages = generate_simple_message(SYSTEM_PROMPT, example_twelve)
example_twelve_response = call_gpt_with_backoff(messages=example_twelve_messages, model="gpt-4-1106-preview", temperature=0, max_length=1480)
print(example_twelve_response)

<THOUGHT>
First, we need to identify the pattern of the rule. The "CT(do)" indicates a conjugatable form of "do", which can be "do", "does", "did", etc. The "not" is a fixed token. The next token is a choice between adverbs "generally", "typically", "usually", which we will capture using a regexp token. The last token is a verb that is not "give", "include", or "take", so we will use a postag for "VB" with exceptions. The correction suggests replacing the phrase with "rarely" or "seldom" followed by the verb and the conjugated form of "do" with a hyphen in between, which is indicated by "$3-$0". The explanation will be converted to a message in markdown format. The corrected sentence shows that the verb should be in the base form, which matches the "VB" postag.
</THOUGHT>
<rule id="{new_rule_id}" name="BRIEFCATCH_CONCISENESS_30125">
    <pattern>
        <token postag="VB" postag_regexp="yes" inflected="yes"/>
        <token>not</token>
        <token regexp="yes">generally|typically|u

Expected output:
```
<rule id="BRIEFCATCH_263762808715424542820983160320978225970" name="BRIEFCATCH_CONCISENESS_30125">
    <pattern>
        <token inflected="yes">do</token>
        <token>not</token>
        <token regexp="yes">generally|typically|usually</token>
        <token postag="VB">
                <exception regexp="yes">give|include|take</exception>
        </token>
    </pattern>
    <filter class="org.languagetool.rules.en.AdvancedSynthesizerFilter" args="lemmaFrom:4 lemmaSelect:V.* postagFrom:1 postagSelect:V.*"/>
    <message>Would using fewer words help tighten the sentence?</message>
    <suggestion>rarely {suggestion}</suggestion>
    <suggestion>seldom {suggestion}</suggestion>
    <short>{"ruleGroup":null,"ruleGroupIdx":0,"isConsistency":false,"isStyle":true,"correctionCount":2,"priority":"4.174","WORD":true,"OUTLOOK":true}</short>
    <example correction="rarely require|seldom require">They <marker>do not generally require</marker> a definite article.</example>
</rule>
</rule>
```
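The model outputs and expected outputs differ in indentation and in the `id` attribute (`{new_rule_id}` versus a generated `BRIEFCATCH_...` value), so a plain string comparison would report a mismatch even when the rules agree. One possible way to score a response is to compare the parsed trees while ignoring indentation-only text and the `id` attribute. This is a sketch under those assumptions, not the evaluation method used for the sheet. `rules_match` and `_normalize` are hypothetical names, and both inputs must already be well-formed XML.

```python
import xml.etree.ElementTree as ET


def _clean(text):
    """Treat missing or whitespace-only text as empty; keep other text as is."""
    return "" if text is None or not text.strip() else text


def _normalize(element: ET.Element) -> tuple:
    # Ignore the rule id, which differs between generated and expected rules.
    attributes = tuple(
        sorted((k, v) for k, v in element.attrib.items() if k != "id")
    )
    children = tuple(_normalize(child) for child in element)
    return (
        element.tag,
        attributes,
        _clean(element.text),
        _clean(element.tail),
        children,
    )


def rules_match(actual_xml: str, expected_xml: str) -> bool:
    """Compare two rules structurally; raises ET.ParseError on malformed XML."""
    return _normalize(ET.fromstring(actual_xml)) == _normalize(
        ET.fromstring(expected_xml)
    )


actual = '<rule id="{new_rule_id}" name="R"><pattern><token>not</token></pattern></rule>'
expected = (
    '<rule id="BRIEFCATCH_1" name="R">\n'
    "    <pattern>\n"
    "        <token>not</token>\n"
    "    </pattern>\n"
    "</rule>"
)
print(rules_match(actual, expected))  # True
```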
---

In [28]:
example_thirteen = """Ad Hoc:
SENT_START in that case , ( however though ) , ( i he if in it she there this )
Rule Number:
30136
Correction:
But $7 @ Then $6 $7 @ But then $7
Category:
Flow
Explanation:
Could shortening your opening transition add punch and help lighten the style?<linebreak/><linebreak/><b>Example</b> from Chief Justice Roberts: “<b>But</b> that argument . . . confuses mootness with the merits.”
Test Sentence:
In that case, however, this subtitle should tell you.
Corrected Test Sentence:
But this subtitle should tell you.

XML Rule:"""

In [29]:
example_thirteen_messages = generate_simple_message(SYSTEM_PROMPT, example_thirteen)
example_thirteen_response = call_gpt_with_backoff(messages=example_thirteen_messages, model="gpt-4-1106-preview", temperature=0, max_length=1480)
print(example_thirteen_response)

<THOUGHT>The ad hoc rule starts with SENT_START which indicates the beginning of a sentence. The pattern includes a fixed phrase "in that case," followed by a choice between "however" and "though," and then another choice between several pronouns and the word "if." Since SENT_START is present, all tokens following it must be nested within marker tags. The correction offers three alternatives, using "But," "Then," and "But then" followed by the last token from the pattern, which is a pronoun or "if." The explanation includes an example and needs to be converted to markdown for the message tag. The corrected sentence shows that the phrase "in that case, however," is replaced by "But," which aligns with the first correction option.</THOUGHT>

<rule id="{new_rule_id}" name="BRIEFCATCH_FLOW_30136">
    <pattern>
        <token postag="SENT_START"/>
        <marker>
            <token>in</token>
            <token>that</token>
            <token>case</token>
            <token>,</token>
    

Expected output:
```
<rule id="BRIEFCATCH_145346392105646606287940325719406917958" name="BRIEFCATCH_FLOW_30136">
    <pattern>
        <token postag="SENT_START"/>
        <marker>
            <token>in</token>
            <token>that</token>
            <token>case</token>
            <token>,</token>
            <token regexp="yes">however|though</token>
            <token>,</token>
            <token regexp="yes">he|i|if|in|it|she|there|this</token>
        </marker>
    </pattern>
    <message>Could shortening your opening transition add punch and help lighten the style?|**Example** from Chief Justice Roberts: “**But** that argument . . . confuses mootness with the merits.”</message>
    <suggestion>But <match no="8"/></suggestion>
    <suggestion>Then<match no="7"/> <match no="8"/></suggestion>
    <suggestion>But then <match no="8"/></suggestion>
    <short>{"ruleGroup":null,"ruleGroupIdx":0,"isConsistency":false,"isStyle":true,"correctionCount":3,"priority":"8.252","WORD":true,"OUTLOOK":true}</short>
    <example correction="But this|Then, this|But then this"><marker>In that case, however, this</marker> subtitle should tell you.</example>
</rule>
```
---