Implement Pyright for Type-Checking #630

max-muoto · 2024-04-29T01:58:55Z

Fixes: #623

This PR implements Pyright for type-checking, with the strictest settings enabled, fully replacing MyPy, which is partially broken at the moment.

Realize this has gotten pretty large, just let me know if you would like me to break out any individual changes into separate PRs (Ruff config tweak, CI, etc) and can do that.

MyPy vs Pyright

Pyrightis the default type-checker for VSCode from Microsoft, and exists as its own CLI tool that you can utilize for type-checking your codebase. It's the only Python type-checker that's fully supports the latest Python typing spec. Outside of that, it has several key differentiating factors from MyPy, which are touched on here.

Some main things that I think are worth calling out:

Inferred Returned Types/Type-Checking for Unannotated Code

Return-types are always inferred in Pyright.

def foo():
    return "bar"

result = foo()
reveal_type(result) # Revealed type is `Literal[bar]`

With this, Pyright will always type-check external libraries, using whatever internal type hints they provided, if stubs aren't given to take precedence. Unlike MyPy, type-checking is performed at all times unless a library is specifically stubbed out to always return Any.

Unknown vs Any

When a type isn't provided, MyPy will put an Any in its place. Pyright on the other hand forces you to explicitly use Any, inferring Unknown is no generic param is provided. Some examples of the implications of this are discussed in this issue.

Narrowing/TypeGuards

Pyright supports a variety of type-guards that MyPy doesn't. See here.

Support for New Features

Pyright already has support for 3.13 features that we can start to utilize through type_extensions, for example generic defaults and support for the deprecated decorator. On the other MyPy still does not have full support for all 3.12 features, see here.

Less Bugs

Generally from my experience Pyright has far fewer bugs (34 GitHub issues) than MyPy (2.6k GitHub issues), and MyPy bugs are cross-checked against Pyright to ensure they aren't present.

Editor Support

Assuming you have Pylance installed in VSCode or any other editor, you'll get type-checking right as you modify code that will simply pull from the config in our pyproject.toml.

This PR

What exactly is this PR doing? It enables Pyright, with its strictest settings, with the exception of allowing deprecated methods (we're still using classmethod, which is deprecated), and allowing for redundant isinstance checks. The later is mostly around the fact that we can't ensure that users are doing static type-checking, so "redundant" runtime type-checking still serves a purpose.

I'll go over some of the general fixes to maybe try and help answer questions that might come from this PR.

Typed Dicts

You'll see the replacement of some inline dict types with TypeDicts, this mostly around the fact that kwargs unpacking is not type-safety without them.

Let's take this example:

additional_params: Dict[str, Union[str, Dict[str, Union[int, float, str]]]] = {}
if hyperparameters_dict:
    additional_params["hyperparameters"] = hyperparameters_dict
if validation_file:
    additional_params["validation_file"] = validation_file
if model_suffix:
    additional_params["suffix"] = model_suffix
job = client.fine_tuning.jobs.create(
    training_file=file_id,
    model=model,
    **additional_params,
)

Pyright doesn't consider this type-safe, since it doesn't know the types of the specific fields being passed into create, since the value is simply one of int, float, str, regardless of the key. To make this type-safe, we can create a typed dictionary defining the fields and their respective types.

Consistent Type Syntax

I enabled the Ruff rule UP006 to ensure we're using consistent type syntax. As of 3.9, explicitly importing Dict, List, etc is deprecated, as you can simply use the primitives as generics. This rule is auto-fix for the most part.

Typer/CLI Fixes

Removed unnecessary typing ignores by avoiding explicitly passing in ..., as it is the default, and corrected types that should have been Optional.

Pydantic

Had to add a variety of typing ignores toimports for create_model, due to missing generics. Made a PR fixing this, which was pushed, so once we upgrade to their next release, we can remove these.

CI / Pre-Commit

Pyright needs our extra dependencies to properly type-check everything, so tweaking the CI to simply do an install with Poetry, instead of the requirements.txt. I additionally updated the pre-commit to run Pyright locally.

🚀 This description was created by Ellipsis for commit `55ac2b9`

Summary:

This PR implements Pyright for type-checking, updates various files for type safety, uses new Python 3.9 typing syntax, and handles kwargs updates.

Key points:

Implements Pyright for type-checking, replacing MyPy.
Adds a new GitHub workflow for Pyright.
Updates various files to ensure type safety with Pyright.
Uses new typing syntax introduced in Python 3.9.
Updates handling of kwargs in several places.
Adds type ignores for some imports from pydantic.create_model due to missing generics.
Updates requirements.txt and pyproject.toml files to include Pyright.

Generated with ❤️ by ellipsis.dev

max-muoto · 2024-04-29T03:16:38Z

@jxnl Still have some cleaning up here to do, but was wondering if you might be able to run the actions for me? Hoping to test out the new Pyright check.

Edit: Looks all good now.

ellipsis-dev

❌ Changes requested.

Reviewed the entire pull request up to 55ac2b9
Looked at 2250 lines of code in 28 files
Took 1 minute and 59 seconds to review

More info

Skipped 1 files when reviewing.
Skipped posting 0 additional comments because they didn't meet confidence threshold of 85%.

Workflow ID: wflow_Z9oX11KrzKUD2C0d

Want Ellipsis to fix these issues? Tag @ellipsis-dev in a comment (or add @ellipsis-agent to your repo for autocomplete). We'll respond in a few minutes. Learn more here. You can customize Ellipsis with review rules, user-specific overrides, quiet mode, and more. See docs.

ellipsis-dev · 2024-04-29T03:17:44Z

.github/workflows/pyright.yml

+    - name: Set up Python
+      uses: actions/setup-python@v4
+      with:
+        python-version: ${{ matrix.python-version }}


The python-version environment variable is referenced (${{ matrix.python-version }}) but not defined in the matrix. This could lead to an error during the workflow execution.

ellipsis-dev

❌ Changes requested.

Reviewed the entire pull request up to 55ac2b9
Looked at 2250 lines of code in 28 files
Took 1 minute and 46 seconds to review

More info

Skipped 1 files when reviewing.
Skipped posting 0 additional comments because they didn't meet confidence threshold of 85%.

Workflow ID: wflow_Rb4O94LyTuxvclWN

Want Ellipsis to fix these issues? Tag @ellipsis-dev in a comment (or add @ellipsis-agent to your repo for autocomplete). We'll respond in a few minutes. Learn more here. You can customize Ellipsis with review rules, user-specific overrides, quiet mode, and more. See docs.

ellipsis-dev · 2024-04-29T03:19:25Z

.github/workflows/pyright.yml

+    - name: Set up Python
+      uses: actions/setup-python@v4
+      with:
+        python-version: ${{ matrix.python-version }}


The python-version variable is used but not defined in the matrix. This will cause the workflow to fail. Define python-version in the matrix or correct the variable reference.

max-muoto · 2024-04-29T03:46:57Z

instructor/client.py

@@ -60,16 +57,38 @@ def completions(self) -> Self:
    def messages(self) -> Self:
        return self

+    @overload
+    def create(


This usage of overloading might seem funky. But it's the only way we can get accurate type-checking here without refactoring AsyncInstructor, as the core issue here is thatthe subclass is doing an incompatible method override, since adding the async keyword implicitly adds Awaitable to the return-type.

Thus, create should return Union[T, Awaitable[T]]: However, doing this without overloads will lead to an inaccurate unioned type whenever calling create on Instructor. Overloading like this, we can specify that for Instructor, it will return the non-awaited version, whileour subclass will return the awaited version.

The main alternatives would be 1) refactor AsyncInstructor so it simply inherits from a common base class that Instructor also inherits from. 2) add typing ignores.

Happy to go with either or, depending on what you prefer, or leave this as is, and refactor it in a follow-up @jxnl. Let me know what you think.

max-muoto · 2024-04-29T03:55:40Z

instructor/cli/files.py

@@ -1,6 +1,8 @@
+# type: ignore - stub mismatched


We're really not following the stubs at all here, so not too sure if this was just from when we were on a previous version of the OpenAI client? Mainly, that we treat the pydantic models they return as dictionaries, but maybe there's something I'm missing. Thought it would be easiest to ignore this for now, and go back to it later.

max-muoto · 2024-04-29T03:56:07Z

instructor/cli/usage.py

-    ModelNames,
-    Union[Dict[str, float], float],
-] = {
+MODEL_COSTS = {


Annotation is inferred properly now.

instructor/client_anthropic.py

instructor/distil.py

instructor/dsl/maybe.py

examples/avail/run_mixtral.py

tests/test_function_calls.py

jxnl · 2024-04-29T13:41:12Z

quick look over 3.10 tests and it looks like theres some subclass issues, that point to list[int] vs List[int]. Maybe i'm mistaken

jxnl · 2024-04-29T13:42:54Z

the base tests (without openai calls) should be passing on main

max-muoto · 2024-04-29T15:00:11Z

the base tests (without openai calls) should be passing on main

Not able to run all of the tests locally, but should be good now!

max-muoto · 2024-04-29T18:44:02Z

Thanks for the quick review @jxnl I'll try and follow-up with some additional clean-up in regards to the overloads, and try and improve some of the Ruff rules regarding our typing config.

max-muoto changed the title ~~Initial typing fixes~~ Implement Pyright for Type-Checking Apr 29, 2024

max-muoto marked this pull request as ready for review April 29, 2024 03:15

max-muoto marked this pull request as draft April 29, 2024 03:16

max-muoto marked this pull request as ready for review April 29, 2024 03:17

ellipsis-dev bot reviewed Apr 29, 2024

View reviewed changes