## TinyDB

In [1]:
from tinydb import TinyDB, Query

In [9]:
db = TinyDB('db.json')

In [10]:
# Query() builds the object used to write field conditions.
User = Query()
db.search(User.name == 'John')

In [None]:
db.search((User.name == 'John') & (User.age <= 30))

In [None]:
db.search((User.name == 'John') | (User.name == 'Bob'))

In [None]:
db.search(User.age.map(lambda x: x + x) == 44)

In [None]:
table = db.table('name')

table.all()

In [None]:
db.insert({'int': 1, 'char': 'a'})
db.insert({'int': 1, 'char': 'b'})

## Pydantic

In [4]:
from pydantic import Field, BaseModel
from typing import List

class User(BaseModel):
    id: int
    name: str = Field(..., alias="full_name")

class Users(BaseModel):
    users: List[User]

data = {"id": 1, "full_name": "John Doe"}
user = User(**data)
print(user.name)

data = {"users": [{"id": 1, "full_name": "John Doe"},
                  {"id": 2, "full_name": "Jane Doe"},
                  {"id": 3, "name": "Jim Doe"}]}
users = Users(**data)

print(users.users[0].name)
print(users.users[1].name)

John Doe


ValidationError: 1 validation error for Users
users.2.full_name
  Field required [type=missing, input_value={'id': 3, 'name': 'Jim Doe'}, input_type=dict]
    For further information visit https://errors.pydantic.dev/2.8/v/missing
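
The error above comes from the third record, which uses `name` where the model expects the alias `full_name`. One possible way to accept both spellings, sketched here under the assumption of Pydantic v2, is to set `populate_by_name=True` in the model configuration. The `names` variable is introduced only for this illustration.

```python
from typing import List

from pydantic import BaseModel, ConfigDict, Field


class User(BaseModel):
    # Accept either the alias ("full_name") or the field name ("name") as input.
    model_config = ConfigDict(populate_by_name=True)

    id: int
    name: str = Field(..., alias="full_name")


class Users(BaseModel):
    users: List[User]


data = {"users": [{"id": 1, "full_name": "John Doe"},
                  {"id": 2, "full_name": "Jane Doe"},
                  {"id": 3, "name": "Jim Doe"}]}
users = Users(**data)

names = [u.name for u in users.users]
print(names)
```

Without that setting, only the alias is accepted during validation, which is why the original cell fails on `users.2.full_name`.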

## o1

In [None]:
assistant_msg = """
You are tasked with extracting elements from a given legal document. Please follow these steps carefully and ensure all instructions are adhered to:

# Steps

1. **Summarize the document** to understand its purpose and use it to verify if all important terms, facts, fact types, and rules are identified in subsequent steps.

2. **Identify elements**:
   - **About the elements**:
     - **Fact**: A specific instance or statement that describes an event or condition without any directive element. Facts often involve relationships between terms or entities. Example: "John works for X Inc."
     - **Fact Type**: A general, abstract template that describes potential relationships between terms or entities, serving as a model for generating specific facts. Example: "Person works for Company."
     - **Operative Rule**: A statement that governs or constrains some aspect of the business, specifying what must be done or what is not allowed. Rules enforce compliance, limit possibilities, or prescribe specific behaviors in response to business situations. Operative rules (otherwise known as normative rules or prescriptive rules) state what must or must not happen in particular circumstances. Operative rules can be contravened: required information may be omitted, inappropriate information supplied, or an attempt may be made to perform a process that is prohibited. Example: "A customer must provide identification before opening an account."
     - **Term**: A word or a group of words that represents a specific concept, entity, or subject in a particular context.
     - Terms, Fact, Fact Type, and Operative Rule are statements that should allow only full compliance or full contravention; partial compliance is not possible. The presence of "or" or "and" often suggests the need to separate a statement into two.
   - **For each fact, fact type, or rule**:
     - **Extract the statement**: Identify the exact statement or phrase from the document representing the fact, fact type, or rule.
     - **Give a unique title to the statement**.
     - **Extract and classify Terms**:
       - **Extract all the terms involved in the statement**. Record the level of confidence in the extraction, ranging from 0 to 1, and provide a brief reason for the confidence score.
       - **Classify each term** as either **Common Noun** or **Proper Noun**.
       - If a Term contains nouns separated by "and," ",", or "or," split it into two or more terms. For example, "Principal office and place of business" should be split into "Principal office" and "Place of business".
     - **Extract Verb Symbols**: Identify verbs, verb phrases, or prepositions that connect the terms in the statement. Record the level of confidence in the extraction, ranging from 0 to 1, and provide a brief reason for the confidence score.
     - **Classification**: Classify the statement as either a **Fact**, **Fact Type**, or **Rule**.
     - **Confidence**: Record the level of confidence in the classification, ranging from 0 to 1.
     - **Reason**: Provide a brief reason for the classification score.
     - **Source**: Note the specific paragraph or section of the document where the statement is found (e.g., "(a)(1)", "(b)").

3. **Provide JSON Output**:
   - Format your answer as per the output example below.
   - **All values are optional**: Include as much information as is available based on the document.
   - **Do not include any additional text or explanation outside the JSON structure**.

**Output Example**:

```json
{
  "section": "§ 123.4-5",
  "elements": [
    {
      "id": 1,
      "title": "some title",
      "statement": "A person serves a non-resident investment adviser by furnishing the Commission with process, pleadings, or papers.",
      "terms": [
        {
          "term": "Person",
          "classification": "Common Noun",
          "confidence": 0.9,
          "reason": "The term is ...",
          "extract_confidence": 0.9,
          "extract_reason": "The term is ..."
        },
        {
          "term": "Non-resident investment adviser",
          "classification": "Common Noun",
          "confidence": 0.8,
          "reason": "The term is ...",
          "extract_confidence": 0.8,
          "extract_reason": "The term is ..."
        },
        ...
      ],
      "verb_symbols": ["serves", "by furnishing", "with"],
      "verb_symbols_extracted_confidence": [0.9, 0.8, 0.7],
      "verb_symbols_extracted_reason": ["The verb is ...", "The verb is ...", "The verb is ..."],
      "classification": "Fact Type",
      "confidence": 0.8,
      "reason": "The statement is ...",
      "sources": ["(a)"]
    },
    ...
  ]
}
```

# Notes
1. Level of Granularity: Extract and analyze every potential statement from the document;
2. Contextual Interpretation: Extract explicitly stated facts, fact types, and rules;
3. Scope of Terms: Classify every noun phrase as a term, even if it is peripheral to the main statement;
4. Verb Symbols: Include prepositions and auxiliary phrases;
5. Classification Nuances: Record a confidence level between 0 and 1;
6. Section Handling: Strictly tie every element to its section (e.g., § 275.0-7(a)(1));
7. Order of Presentation: Follow the sequence of the document strictly;
8. Edge Cases: Classify edge cases with a lower confidence level;
9. Output Preferences: Add a timestamp and processing notes;
10. Formatting Precision: Ensure the JSON adheres strictly to a specific schema (e.g., for use in a system).
"""

user_msg = """
# Document


§ 275.0-2 General procedures for serving non-residents.
(a) General procedures for serving process, pleadings, or other papers on non-resident investment advisers, general partners and managing agents.  Under Forms ADV and ADV-NR [17 CFR 279.1 and 279.4], a person may serve process, pleadings, or other papers on a non-resident investment adviser, or on a non-resident general partner or non-resident managing agent of an investment adviser by serving any or all of its appointed agents:
  (1) A person may serve a non-resident investment adviser, non-resident general partner, or non-resident managing agent by furnishing the Commission with one copy of the process, pleadings, or papers, for each named party, and one additional copy for the Commission's records.
  (2) If process, pleadings, or other papers are served on the Commission as described in this section, the Secretary of the Commission (Secretary) will promptly forward a copy to each named party by registered or certified mail at that party's last address filed with the Commission.
  (3) If the Secretary certifies that the Commission was served with process, pleadings, or other papers pursuant to paragraph (a)(1) of this section and forwarded these documents to a named party pursuant to paragraph (a)(2) of this section, this certification constitutes evidence of service upon that party.
(b) Definitions.  For purposes of this section:
  (1) Managing agent  means any person, including a trustee, who directs or manages, or who participates in directing or managing, the affairs of any unincorporated organization or association other than a partnership.
  (2) Non-resident  means:
    (i) An individual who resides in any place not subject to the jurisdiction of the United States;
    (ii) A corporation that is incorporated in or that has its principal office and place of business in any place not subject to the jurisdiction of the United States; and
    (iii) A partnership or other unincorporated organization or association that has its principal office and place of business in any place not subject to the jurisdiction of the United States.
  (3) Principal office and place of business  has the same meaning as in § 275.203A-3(c) of this chapter.


"""

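The output example in the prompt above could be mirrored as Pydantic models, so that a parsed reply is checked against the expected shape. This is only a sketch: the class names `Term`, `Element`, and `Extraction` are hypothetical, every field is optional because the prompt says all values are optional, and the 0 to 1 bounds on the confidence fields follow the prompt's wording. The `sample` dict is trimmed from the prompt's own example.

```python
from typing import List, Optional

from pydantic import BaseModel, Field


class Term(BaseModel):
    term: Optional[str] = None
    classification: Optional[str] = None
    confidence: Optional[float] = Field(default=None, ge=0, le=1)
    reason: Optional[str] = None
    extract_confidence: Optional[float] = Field(default=None, ge=0, le=1)
    extract_reason: Optional[str] = None


class Element(BaseModel):
    id: Optional[int] = None
    title: Optional[str] = None
    statement: Optional[str] = None
    terms: List[Term] = Field(default_factory=list)
    verb_symbols: List[str] = Field(default_factory=list)
    verb_symbols_extracted_confidence: List[float] = Field(default_factory=list)
    verb_symbols_extracted_reason: List[str] = Field(default_factory=list)
    classification: Optional[str] = None
    confidence: Optional[float] = Field(default=None, ge=0, le=1)
    reason: Optional[str] = None
    sources: List[str] = Field(default_factory=list)


class Extraction(BaseModel):
    # timestamp and processing_notes are requested by note 9 of the prompt.
    timestamp: Optional[str] = None
    processing_notes: Optional[str] = None
    section: Optional[str] = None
    elements: List[Element] = Field(default_factory=list)


# Trimmed version of the prompt's output example.
sample = {
    "section": "§ 123.4-5",
    "elements": [{
        "id": 1,
        "title": "some title",
        "terms": [{"term": "Person", "classification": "Common Noun",
                   "confidence": 0.9}],
        "verb_symbols": ["serves", "by furnishing", "with"],
        "classification": "Fact Type",
        "confidence": 0.8,
        "sources": ["(a)"],
    }],
}

extraction = Extraction.model_validate(sample)
print(extraction.elements[0].terms[0].term)
```

A confidence value outside the 0 to 1 range would raise a `ValidationError` instead of passing through silently.
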
In [None]:
from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
    model="o1-preview",
    messages=[
        # The document to analyze is sent as the user turn.
        {"role": "user", "content": [{"type": "text", "text": user_msg}]},
        # The extraction instructions are sent as an assistant turn.
        {
            "role": "assistant",
            "content": [{"type": "text", "text": assistant_msg}],
        },
    ],
)

BadRequestError: Error code: 400 - {'error': {'message': "Invalid value: 'json'. Supported values are: 'json_object', 'json_schema', and 'text'.", 'type': 'invalid_request_error', 'param': 'response_format.type', 'code': 'invalid_value'}}

In [19]:
response.choices[0].message.content

'```json\n{\n  "timestamp": "2024-12-05 20:37:38",\n  "processing_notes": "Processing document: § 275.0-2",\n  "section": "§ 275.0-2",\n  "elements": [\n    {\n      "id": 1,\n      "title": "Procedure for serving process on non-resident investment advisers via appointed agents",\n      "statement": "Under Forms ADV and ADV-NR [17 CFR 279.1 and 279.4], a person may serve process, pleadings, or other papers on a non-resident investment adviser, or on a non-resident general partner or non-resident managing agent of an investment adviser by serving any or all of its appointed agents:",\n      "terms": [\n        {\n          "term": "Form ADV",\n          "classification": "Proper Noun",\n          "confidence": 1.0,\n          "reason": "Form ADV is a specific form referenced in the regulation.",\n          "extract_confidence": 1.0,\n          "extract_reason": "Form ADV is explicitly mentioned."\n        },\n        {\n          "term": "Form ADV-NR",\n          "classification": "Prop
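
The reply above is a string wrapped in a Markdown `json` fence, so it needs unwrapping before it can be parsed. A minimal sketch using only the standard library follows. `parse_fenced_json` is a hypothetical helper, the `sample` string is made up because the real reply is truncated above, and the helper assumes the reply holds a single fenced block or bare JSON.

```python
import json
import re

# Three backticks, built programmatically to keep this block easy to embed.
FENCE = "`" * 3

# Matches an optional "json" language tag and captures the fenced payload.
FENCED_BLOCK = re.compile(FENCE + r"(?:json)?\s*(.*?)\s*" + FENCE, re.DOTALL)


def parse_fenced_json(text: str) -> dict:
    """Parse JSON that may be wrapped in a Markdown code fence."""
    match = FENCED_BLOCK.search(text)
    payload = match.group(1) if match else text
    return json.loads(payload)


# Hypothetical reply in the same shape as the model output.
sample = FENCE + 'json\n{"section": "§ 275.0-2", "elements": []}\n' + FENCE

parsed = parse_fenced_json(sample)
print(parsed["section"])
```

In the notebook, the same helper could be applied to `response.choices[0].message.content`.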