Merge branch 'main' into jxnl-tutorial-live
jxnl committed Dec 22, 2023
2 parents feb2a53 + 8eef6da commit 2ea3ce2
Showing 11 changed files with 694 additions and 43 deletions.
4 changes: 3 additions & 1 deletion .gitignore
@@ -1,3 +1,4 @@
.DS_Store
# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
@@ -6,6 +7,7 @@ __pycache__/
# C extensions
*.so


# Distribution / packaging
.Python
build/
@@ -166,4 +168,4 @@ tutorials/results.csv
tutorials/results.jsonl
tutorials/results.jsonlines
tutorials/schema.json
wandb/settings
wandb/settings
37 changes: 18 additions & 19 deletions README.md
@@ -1,6 +1,6 @@
# Welcome to Instructor - Your Gateway to Structured Outputs with OpenAI

_Structured extraction in Python, powered by OpenAI's function calling api, designed for simplicity, transparency, and control._
_Structured extraction in Python, powered by OpenAI's function calling API, designed for simplicity, transparency, and control._

---

@@ -18,7 +18,7 @@ Dive into the world of Python-based structured extraction, empowered by OpenAI's

## Get Started in Moments

Installing Instructor is a breeze. Just run `pip install instructor` in your terminal and you're on your way to a smoother data handling experience.
Installing Instructor is a breeze. Simply run `pip install instructor` in your terminal and you're on your way to a smoother data handling experience!

## How Instructor Enhances Your Workflow

@@ -28,13 +28,11 @@ Our `instructor.patch` for the `OpenAI` class introduces three key enhancements:
- **Max Retries:** Set your desired number of retry attempts for requests.
- **Validation Context:** Provide a context object for enhanced validator access.
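
For example, all three can be combined in a single call. The following is a minimal sketch; the model name, messages, and context payload are illustrative:

```py
import instructor
from openai import OpenAI
from pydantic import BaseModel

# patch() augments create() with response_model, max_retries,
# and validation_context
client = instructor.patch(OpenAI())


class UserDetail(BaseModel):
    name: str
    age: int


user = client.chat.completions.create(
    model="gpt-3.5-turbo",
    response_model=UserDetail,  # the schema to extract into
    max_retries=2,  # reask the model on validation errors
    validation_context={"source": "illustrative"},  # visible to validators
    messages=[{"role": "user", "content": "Extract Jason is 25 years old"}],
)
```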

!!! note "Using Validators"

Learn more about validators checkout our blog post [Good llm validation is just good validation](https://jxnl.github.io/instructor/blog/2023/10/23/good-llm-validation-is-just-good-validation/)

With Instructor, your code becomes more efficient and readable. Here’s a quick peek:
### Using Validators
To learn more about validators, check out our blog post [Good LLM validation is just good validation](https://jxnl.github.io/instructor/blog/2023/10/23/good-llm-validation-is-just-good-validation/)

## Usage
With Instructor, your code becomes more efficient and readable. Here’s a quick peek:

```py hl_lines="5 13"
import instructor
@@ -61,7 +59,7 @@ assert user.name == "Jason"
assert user.age == 25
```

### "Using `openai<1.0.0`"
### Using `openai<1.0.0`

If you're using `openai<1.0.0`, then make sure you `pip install instructor<0.3.0`,
where you can patch a global client like so:
@@ -78,9 +76,9 @@ user = openai.ChatCompletion.create(
)
```

### "Using async clients"
### Using async clients

For async clients you must use apatch vs patch like so:
For async clients you must use `apatch` vs. `patch`, as shown:

```py
import instructor
@@ -106,7 +104,7 @@ assert isinstance(model, UserExtract)
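
A fuller sketch of the async flow, assuming `apatch` wraps `AsyncOpenAI` the same way `patch` wraps `OpenAI` (the model name and prompt are illustrative):

```py
import asyncio

import instructor
from openai import AsyncOpenAI
from pydantic import BaseModel

# apatch is the async counterpart to patch
aclient = instructor.apatch(AsyncOpenAI())


class UserExtract(BaseModel):
    name: str
    age: int


async def extract() -> UserExtract:
    return await aclient.chat.completions.create(
        model="gpt-3.5-turbo",
        response_model=UserExtract,
        messages=[{"role": "user", "content": "Extract jason is 25 years old"}],
    )


model = asyncio.run(extract())
assert isinstance(model, UserExtract)
```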

### Step 1: Patch the client

First, import the required libraries and apply the patch function to the OpenAI module. This exposes new functionality with the response_model parameter.
First, import the required libraries and apply the `patch` function to the OpenAI module. This exposes new functionality with the `response_model` parameter.

```python
import instructor
@@ -132,8 +130,7 @@ class UserDetail(BaseModel):

### Step 3: Extract

Use the `client.chat.completions.create` method to send a prompt and extract the data into the Pydantic object. The response_model parameter specifies the Pydantic model to use for extraction. Its helpful to annotate the variable with the type of the response model.
which will help your IDE provide autocomplete and spell check.
Use the `client.chat.completions.create` method to send a prompt and extract the data into the Pydantic object. The `response_model` parameter specifies the Pydantic model to use for extraction. It is helpful to annotate the variable with the type of the response model, which will help your IDE provide autocomplete and spell checking.

```python
user: UserDetail = client.chat.completions.create(
@@ -150,7 +147,9 @@ assert user.age == 25

## Pydantic Validation

Validation can also be plugged into the same Pydantic model. Here, if the answer attribute contains content that violates the rule "don't say objectionable things," Pydantic will raise a validation error.
Validation can also be plugged into the same Pydantic model.

In this example, if the `answer` attribute contains content that violates the rule "Do not say objectionable things", Pydantic will raise a validation error.

```python hl_lines="9 15"
from pydantic import BaseModel, ValidationError, BeforeValidator
@@ -173,15 +172,15 @@ except ValidationError as e:
print(e)
```
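
A self-contained version of this sketch, assuming `llm_validator` is imported from `instructor` (the question and answer strings are illustrative):

```python
from typing_extensions import Annotated
from pydantic import BaseModel, ValidationError, BeforeValidator

from instructor import llm_validator


class QuestionAnswer(BaseModel):
    question: str
    # the natural-language rule is enforced by an LLM at validation time
    answer: Annotated[
        str, BeforeValidator(llm_validator("don't say objectionable things"))
    ]


try:
    qa = QuestionAnswer(
        question="What is the meaning of life?",
        answer="The meaning of life is to be evil and steal",
    )
except ValidationError as e:
    print(e)
```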

Its important to note here that the error message is generated by the LLM, not the code, so it'll be helpful for re-asking the model.
It is important to note here that the **error message is generated by the LLM**, not the code. Thus, it is helpful for re-asking the model.

```plaintext
1 validation error for QuestionAnswer
answer
Assertion failed, The statement is objectionable. (type=assertion_error)
```

## Reask on validation error
## Re-ask on validation error

Here, the `UserDetails` model is passed as the `response_model`, and `max_retries` is set to 2.

@@ -219,15 +218,15 @@ assert model.name == "JASON"
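
A minimal sketch of the pattern, ending with the same assertion as the collapsed example above (the validator and prompt are illustrative):

```python
import instructor
from openai import OpenAI
from pydantic import BaseModel, field_validator

client = instructor.patch(OpenAI())


class UserDetails(BaseModel):
    name: str
    age: int

    @field_validator("name")
    @classmethod
    def name_must_be_uppercase(cls, v: str) -> str:
        # this error message is sent back to the model on each retry
        assert v.isupper(), "Name must be in uppercase."
        return v


model = client.chat.completions.create(
    model="gpt-3.5-turbo",
    response_model=UserDetails,
    max_retries=2,
    messages=[{"role": "user", "content": "Extract jason is 25 years old"}],
)
assert model.name == "JASON"
```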

## [Evals](https://github.com/jxnl/instructor/tree/main/tests/openai/evals)

We invite you to contribute evals in pytest as a way to monitor the quality of the openai models and the instructor library. To get started check out the [jxnl/instructor/tests/evals](https://github.com/jxnl/instructor/tree/main/tests/openai/evals) and contribute your own evals in the form of pytest tests. These evals will be run once a week and the results will be posted.
We invite you to contribute evals in `pytest` as a way to monitor the quality of the OpenAI models and the `instructor` library. To get started, check out the [jxnl/instructor/tests/evals](https://github.com/jxnl/instructor/tree/main/tests/openai/evals) and contribute your own evals in the form of pytest tests. These evals will be run once a week and the results will be posted.

## Contributing

If you want to help out checkout some of the issues marked as `good-first-issue` or `help-wanted`. Found [here](https://github.com/jxnl/instructor/labels/good%20first%20issue). They could be anything from code improvements, a guest blog post, or a new cook book.
If you want to help, check out some of the issues marked as `good-first-issue` or `help-wanted` found [here](https://github.com/jxnl/instructor/labels/good%20first%20issue). They could be anything from code improvements, a guest blog post, or a new cookbook.

## CLI

We also provide some added CLI functionality for easy convinience
We also provide some added CLI functionality for easy convenience:

- `instructor jobs` : This helps with the creation of fine-tuning jobs with OpenAI. Simply use `instructor jobs create-from-file --help` to get started creating your first fine-tuned GPT-3.5 model

7 changes: 3 additions & 4 deletions docs/concepts/fields.md
@@ -1,4 +1,4 @@
The `pydantic.Field` function is used to customize and add metadata to fields of models. To learn more check out the pydantic [documentation](https://docs.pydantic.dev/latest/concepts/fields/) as this is a near replica of that documentation that is relevant to prompting.
The `pydantic.Field` function is used to customize and add metadata to fields of models. To learn more, check out the Pydantic [documentation](https://docs.pydantic.dev/latest/concepts/fields/), as this is a near replica of the parts of that documentation relevant to prompting.

## Default values

@@ -88,15 +88,14 @@ print(date_range.model_dump_json())

## Customizing JSON Schema

There are fields that exclusively to customise the generated JSON Schema:
There are some fields that are exclusively used to customize the generated JSON Schema:

- `title`: The title of the field.
- `description`: The description of the field.
- `examples`: The examples of the field.
- `json_schema_extra`: Extra JSON Schema properties to be added to the field.

These all work as great opportunities to add more information to the JSON Schema as part
of your prompt engineering.
These all work as great opportunities to add more information to the JSON schema as part of your prompt engineering.

Here's an example:

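The following is a minimal sketch; the field and its metadata are illustrative:

```python
from pydantic import BaseModel, Field


class User(BaseModel):
    age: int = Field(
        title="Age",
        description="The age of the user, in whole years",
        examples=[25],
        json_schema_extra={"units": "years"},
    )
```
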
2 changes: 1 addition & 1 deletion docs/concepts/lists.md
@@ -20,7 +20,7 @@ Defining a task and creating a list of classes is a common enough pattern that w

## Extracting Tasks using Iterable

By using `Iterable` you get a very convient class with prompts and names automatically defined:
By using `Iterable` you get a very convenient class with prompts and names automatically defined:

```python
import instructor
```
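
A minimal sketch of the pattern (the model name and prompt are illustrative):

```python
from typing import Iterable

import instructor
from openai import OpenAI
from pydantic import BaseModel

client = instructor.patch(OpenAI())


class User(BaseModel):
    name: str
    age: int


# Iterable[User] tells instructor to extract multiple User objects
users = client.chat.completions.create(
    model="gpt-3.5-turbo",
    response_model=Iterable[User],
    messages=[{"role": "user", "content": "Jason is 10 and John is 30"}],
)
for user in users:
    print(user)
```
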
4 changes: 3 additions & 1 deletion docs/concepts/maybe.md
@@ -1,6 +1,8 @@
# Handling Missing Data

The `Maybe` pattern is a concept in functional programming used for error handling. Instead of raising exceptions or returning `None`, you can use a `Maybe` type to encapsulate both the result and potential errors. This pattern is particularly useful when making llm calls, as providing language models with an escape hatch can effectively reduce hallucinations.
The `Maybe` pattern is a concept in functional programming used for error handling. Instead of raising exceptions or returning `None`, you can use a `Maybe` type to encapsulate both the result and potential errors.

This pattern is particularly useful when making LLM calls, as providing language models with an escape hatch can effectively reduce hallucinations.

## Defining the Model

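A minimal sketch of such a wrapper (the exact fields in the library's version may differ):

```python
from typing import Optional

from pydantic import BaseModel, Field


class UserDetail(BaseModel):
    age: int
    name: str


class MaybeUser(BaseModel):
    # the escape hatch: when the data is missing, result stays None
    # and the model can explain itself in message instead
    result: Optional[UserDetail] = Field(default=None)
    error: bool = Field(default=False)
    message: Optional[str] = Field(default=None)

    def __bool__(self) -> bool:
        return self.result is not None
```
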
15 changes: 9 additions & 6 deletions docs/concepts/models.md
@@ -1,8 +1,11 @@
# Response Model

Defining llm output schemas in Pydantic is done via `pydantic.BaseModel`. To learn more about models in pydantic checkout their [documentation](https://docs.pydantic.dev/latest/concepts/models/).
Defining LLM output schemas in Pydantic is done via `pydantic.BaseModel`. To learn more about models in Pydantic, check out their [documentation](https://docs.pydantic.dev/latest/concepts/models/).

After defining a pydantic model, we can use it as as the `response_model` in your client `create` calls to openai. The job of the `response_model` is to define the schema and prompts for the language model and validate the response from the API and return a pydantic model instance.
After defining a Pydantic model, we can use it as the `response_model` in your client `create` calls to OpenAI or any other supported model. The job of the `response_model` parameter is to:

- Define the schema and prompts for the language model
- Validate the response from the API
- Return a Pydantic model instance
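
For example, the docstring and field descriptions of the response model all become part of the schema the language model sees. A minimal sketch (the names and descriptions are illustrative):

```python
from pydantic import BaseModel, Field


class UserDetail(BaseModel):
    """Details extracted about a single user mentioned in the text."""

    name: str = Field(..., description="The user's full name")
    age: int = Field(..., description="Age in years")
```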

## Prompting

@@ -24,7 +27,7 @@ Here all docstrings, types, and field annotations will be used to generate the p

## Optional Values

If we use `Optional` and `default` they will be considered not required when sent to the language model
If we use `Optional` and `default`, they will be considered not required when sent to the language model.

```python
class User(BaseModel):
@@ -35,7 +38,7 @@ class User(BaseModel):

## Dynamic model creation

There are some occasions where it is desirable to create a model using runtime information to specify the fields. For this Pydantic provides the create_model function to allow models to be created on the fly:
There are some occasions where it is desirable to create a model using runtime information to specify the fields. For this, Pydantic provides the `create_model` function to allow models to be created on the fly:

```python
from pydantic import BaseModel, create_model
@@ -94,7 +97,7 @@ print(BarModel.model_fields.keys())
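
A minimal, self-contained sketch of `create_model` usage (the field names and defaults are illustrative):

```python
from pydantic import create_model

# fields are given as name=(type, default); Ellipsis (...) marks required
BarModel = create_model(
    "BarModel",
    apple=(str, "russet"),
    banana=(str, ...),
)
print(BarModel.model_fields.keys())
# dict_keys(['apple', 'banana'])
```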

## Structural Pattern Matching

Pydantic supports structural pattern matching for models, as introduced by PEP 636 in Python 3.10.
Pydantic supports structural pattern matching for models, as introduced by [PEP 636](https://peps.python.org/pep-0636/) in Python 3.10.

```python
from pydantic import BaseModel
@@ -119,7 +122,7 @@ match a:
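
A minimal sketch of matching on a model (the `Pet` class is illustrative):

```python
from pydantic import BaseModel


class Pet(BaseModel):
    name: str
    species: str


a = Pet(name="Bones", species="dog")

match a:
    # keyword patterns match against the model's attributes
    case Pet(species="dog", name=name):
        print(f"{name} is a dog")
    case Pet(species="cat"):
        print("a cat")
    case _:
        print("something else")
```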

## Adding Behavior

We can add methods to our pydantic models just as any plain python class. We might want to do this to add some custom logic to our models.
We can add methods to our Pydantic models, just as with any plain Python class. We might want to do this to add some custom logic to our models.

```python
from pydantic import BaseModel
```
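
A minimal sketch (the method is illustrative):

```python
from pydantic import BaseModel


class User(BaseModel):
    name: str
    age: int

    def greet(self) -> str:
        # plain Python behavior alongside validated fields
        return f"Hello, {self.name}!"


print(User(name="Jason", age=25).greet())
```
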
20 changes: 9 additions & 11 deletions docs/concepts/reask_validation.md
@@ -1,16 +1,16 @@
# Validation and Reasking

Instead of framing "self-critique" or "self-reflection" in AI as new concepts, we can view them as validation errors with clear error messages that the systen can use to self correct.
Instead of framing "self-critique" or "self-reflection" in AI as new concepts, we can view them as validation errors with clear error messages that the system can use to self-correct.

## Pydantic

Pydantic offers a customizable and expressive validation framework for Python. Instructor leverages Pydantic's validation framework to provide a uniform developer experience for both code-based and LLM-based validation, as well as a reasking mechanism for correcting LLM outputs based on validation errors. To learn more, check out the [Pydantic docs](https://docs.pydantic.dev/latest/concepts/validators/) on validators.

!!! note "Good llm validation is just good validation"

If you want to see some more examples on validators checkout our blog post [Good llm validation is just good validation](https://jxnl.github.io/instructor/blog/2023/10/23/good-llm-validation-is-just-good-validation/)
If you want to see some more examples of validators, check out our blog post [Good LLM validation is just good validation](https://jxnl.github.io/instructor/blog/2023/10/23/good-llm-validation-is-just-good-validation/)

### Code-Based Validation Example
### Code-based Validation Example

First define a Pydantic model with a validator using the `Annotated` type from `typing_extensions`.
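
A minimal sketch of such a validator, assuming `AfterValidator` (the rule itself is illustrative):

```python
from typing_extensions import Annotated

from pydantic import AfterValidator, BaseModel, ValidationError


def name_must_be_uppercase(v: str) -> str:
    assert v.isupper(), "Name must be in uppercase."
    return v


class UserDetail(BaseModel):
    age: int
    name: Annotated[str, AfterValidator(name_must_be_uppercase)]


try:
    UserDetail(age=25, name="jason")
except ValidationError as e:
    print(e)
```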

@@ -80,7 +80,7 @@ except ValidationError as e:

#### Output for LLM-Based Validation

Its important to not here that the error message is generated by the LLM, not the code, so it'll be helpful for re asking the model.
It is important to note here that the error message is generated by the LLM, not the code, so it'll be helpful for re-asking the model.

```plaintext
1 validation error for QuestionAnswer
@@ -92,14 +92,13 @@ answer

Validators are a great tool for ensuring some property of the outputs. When you use the `patch()` method with the `openai` client, you can use the `max_retries` parameter to set the number of times you can reask the model to correct the output.

Its a great layer of defense against bad outputs of two forms.

It is a great layer of defense against bad outputs of two forms:

1. Pydantic Validation Errors (code- or LLM-based)
2. JSON Decoding Errors (when the model returns a bad response)

### Step 1: Define the Response Model with Validators

Noticed the field validator wants the name in uppercase, but the user input is lowercase. The validator will raise a `ValueError` if the name is not in uppercase.
Notice that the field validator wants the name in uppercase, but the user input is lowercase. The validator will raise a `ValueError` if the name is not in uppercase.

```python hl_lines="11-16"
import instructor
@@ -156,11 +155,10 @@ except (ValidationError, JSONDecodeError) as e:

## Advanced Validation Techniques

The docs are currently incomplete, but we have a few advanced validation techniques that we're working on documenting better, for a example of model level validation, and using a validation context check out our example on [verifying citations](../examples/exact_citations.md) which covers

The docs are currently incomplete, but we have a few advanced validation techniques that we're working on documenting better, such as model-level validation and using a validation context. Check out our example on [verifying citations](../examples/exact_citations.md), which covers:

1. Validating the entire object with all attributes rather than one attribute at a time
2. Using some 'context' to validate the object, in this case we use the `context` to check if the citation existed in the original text.
2. Using some 'context' to validate the object: In this case, we use the `context` to check if the citation existed in the original text.

## Takeaways

By integrating these advanced validation techniques, we not only improve the quality and reliability of LLM-generated content but also pave the way for more autonomous and effective systems.
By integrating these advanced validation techniques, we not only improve the quality and reliability of LLM-generated content, but also pave the way for more autonomous and effective systems.