Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve public API type annotations & fix unit test type errors #248

Open
wants to merge 9 commits into
base: main
Choose a base branch
from

Conversation

h4l
Copy link

@h4l h4l commented Mar 25, 2025

Hey, thanks for the Python cloudevents library, I've been using it (via functions_framework). In working with it I've noticed a few ways the types of your API could be tweaked to make things smoother for users. Also, while looking into the code, I noticed there were quite a few type errors in the test modules when running mypy, and mypy wasn't being run in CI to enforce the types. I figured I could help with that while offering some tweaks to the public API types.

Changes

This PR:

  • Fixes the existing type errors in tests
    • There's more that could be done to increase type coverage in tests (e.g. annotating more fixture arguments) but I've mostly fixed the type errors that were actively being reported
  • Improves typing of some public APIs:
    • @overloads for from_binary and from_structured to correctly type the return when the event_type arg is None
    • Support the chained/self-returning Set*() methods on cloudevents.sdk.event.v1.Event (previously you couldn't chain more than one call without type errors, as the return type did not have Set* methods)
    • Support passing non-dict Header classes to the from_http() conversion functions. They were requiring real dict types, so passing a headers object from an http request object would be a type error.
    • Functions taking dicts as inputs (such as from_dict()) are now typed to take Mapping rather than the exact dict type, to allow passing dict-like objects, as well as read-only dicts.
  • Runs mypy in CI via tox, both for the cloudevents module, and the two samples
  • Checks the types of the two samples directories:
    • The above headers change fixes a type error in one sample
    • Also fixed a trivial mypy complaint due to not annotating -> None: function returns.
  • Fixes an unrelated CI test failure on Python 3.8, due to Python 3.8 now being out of support, and the sanic test dependency having dropped support for 3.8 in its latest release

One line description for the changelog

  • Improve public API type annotations, fix type errors in tests/examples and run type checks in CI

  • Tests pass

  • Appropriate changes to README are included in PR

num_cloudevents = 30
for i in range(num_cloudevents):
event = {
raw_event = {
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The issue here is that the second loop assigns the event var with a different type, causing a type error. Using a different name avoids this.

@@ -333,7 +334,7 @@ def test_valid_structured_events(specversion):
events_queue = []
num_cloudevents = 30
for i in range(num_cloudevents):
event = {
raw_event = {
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same issue as another comment — event var was being assigned twice with different types.

@@ -77,7 +77,7 @@ def test_object_event_v1():
_, structured_body = m.ToRequest(event)
assert isinstance(structured_body, bytes)
structured_obj = json.loads(structured_body)
error_msg = f"Body was {structured_body}, obj is {structured_obj}"
error_msg = f"Body was {structured_body!r}, obj is {structured_obj}"
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lint error warning that bytes would be formatted as a quoted repr, unlike str, so it demands explicit repr (!r) to avoid ambiguity.

@@ -25,7 +25,7 @@
@pytest.mark.parametrize("event_class", [v03.Event, v1.Event])
def test_binary_converter_upstream(event_class):
m = marshaller.NewHTTPMarshaller([binary.NewBinaryHTTPCloudEventConverter()])
event = m.FromRequest(event_class(), data.headers[event_class], None, lambda x: x)
event = m.FromRequest(event_class(), data.headers[event_class], b"", lambda x: x)
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using None is a type error, but in practice the value is a placeholder and never read. Using empty bytes is the correct type and acts as a similar placeholder value.

import typing

from cloudevents.sdk.event import base, opt

if typing.TYPE_CHECKING:
from typing_extensions import Self
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Self is not in stdlib typing module in older Python versions, so needs a typing-only import.

h4l added 9 commits March 25, 2025 17:45
kafka.conversion.from_binary() and from_structured() return
AnyCloudEvent type var according to their event_type argument, but when
event_type is None, type checkers cannot infer the return type. We now
use an overload to declare that the return type is http.CloudEvent when
event_type is None.

Previously users had to explicitly annotate this type when calling
without event_type. This happens quite a lot in this repo's
test_kafka_conversions.py — this fixes quite a few type errors like:

> error: Need type annotation for "result"  [var-annotated]

Signed-off-by: Hal Blackburn <hwtb2@cam.ac.uk>
The v1.Event self-returning Set*() methods like SetData() were returning
BaseEvent, which doesn't declare the same Set* methods. As a result,
chaining more than one Set* method would make the return type unknown.

This was causing type errors in test_event_pipeline.py.

The Set*() methods now return the Self type.

Signed-off-by: Hal Blackburn <hwtb2@cam.ac.uk>
mypy was failing with lots of type errors in test modules. I've not
annotated all fixtures, mostly fixed existing type errors.

Signed-off-by: Hal Blackburn <hwtb2@cam.ac.uk>
from_http() conversion function was requiring its headers argument to
be a typing.Dict, which makes it incompatible with headers types of http
libraries, which support features like multiple values per key.
typing.Mapping and even _typeshed.SupportsItems do not cover these
types. For example,
samples/http-image-cloudevents/image_sample_server.py was failing to
type check where it calls `from_http(request.headers, ...)`.

To support these kind of headers types in from_http(), we now define our
own SupportsDuplicateItems protocol, which is broader than
_typeshed.SupportsItems.

I've only applied this to from_http(), as typing.Mapping is OK for most
other methods that accept dict-like objects, and using this more lenient
interface everywhere would impose restrictions on our implementation,
even though it might be more flexible for users.

Signed-off-by: Hal Blackburn <hwtb2@cam.ac.uk>
Tox now runs mypy on cloudevents itself, and the samples.

Signed-off-by: Hal Blackburn <hwtb2@cam.ac.uk>
Signed-off-by: Hal Blackburn <hwtb2@cam.ac.uk>
Signed-off-by: Hal Blackburn <hwtb2@cam.ac.uk>
Mapping imposes less restrictions on callers, because it's read-only and
allows non-dict types to be passed without copying them as dict(), or
passing dict-like values and ignoring the resulting type error.

Signed-off-by: Hal Blackburn <hwtb2@cam.ac.uk>
Tests were failing because the sanic dependency dropped support for
py3.8 in its current release. sanic is now pinned to the last compatible
version for py3.8 only.

Signed-off-by: Hal Blackburn <hwtb2@cam.ac.uk>
@h4l h4l force-pushed the fix-type-errors branch from c1235ea to dccd3cf Compare March 25, 2025 17:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant