Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Build: refactor TaskData #9389

Merged
merged 4 commits into from
Jul 6, 2022
Merged

Build: refactor TaskData #9389

merged 4 commits into from
Jul 6, 2022

Conversation

stsewd
Copy link
Member

@stsewd stsewd commented Jun 29, 2022

Currently this class holds any values,
defining its attributes dynamically,
this makes it hard to know what attributes
exactly does this object hold,
and others are initialized in several parts of
the codebase.

So, I'm defining all the attributes in a form of a dataclass,
with slots=True (so no other attributes can be added by mistake).

I'm also checking for possible uninitialized data in the task handlers.

Note: Dataclasses require type annotations,
this doesn't mean we are using type hints or enforcing them in our codebase.

Closes #9364

@stsewd stsewd requested a review from a team as a code owner June 29, 2022 18:20
@stsewd stsewd requested a review from humitos June 29, 2022 18:20
build_pk: int = None
build_commit: str = None

start_time: timezone.datetime = None
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we could also default this to now

Currently this class holds any values,
defining its attributes dynamically,
this makes it hard to know what attributes
exactly does this object hold,
and others are initialized in several parts of
the codebase.

So, I'm defining all the attributes in a form of a dataclass,
with `slots=True` (so no other attributes can be added by mistake).

I'm also checking for possible uninitialized data in the task handlers.

Note: Dataclasses require type annotations,
this doesn't mean we are using type hints or enforcing them in our codebase.

Closes #9364
Copy link
Member

@humitos humitos left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! I'm not a big fan of data classes and type annotations, but this is a good example where it makes things easier and clearer 💯


# Dictionary returned from the API.
build: dict = field(default_factory=dict)
# If HTML, PDF, ePub, etc formats were build.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
# If HTML, PDF, ePub, etc formats were build.
# If HTML, PDF, ePub, etc formats were built.

@@ -119,10 +142,10 @@ def before_start(self, task_id, args, kwargs):

# Comes from the signature of the task and it's the only required
# argument
version_id, = args
self.data.version_pk = args[0]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The old syntax is a little weird, but it lets clear that args only has one item. Not a big deal, tho.

I think black changes it to:

(self.data.version_pk,) = args

to make it a little clearer.

build: dict = field(default_factory=dict)
# If HTML, PDF, ePub, etc formats were build.
outcomes: dict = field(default_factory=lambda: defaultdict(lambda: False))
# Build data for analytics.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
# Build data for analytics.
# Build data for analytics (telemetry).

Comment on lines 493 to 495
version_type = self.data.version and self.data.version.type
send_external_build_status(
version_type=self.data.version.type,
version_type=version_type,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMHO, this usage is a little tricky,

>>> True and "external"
'external'
>>> False and "external"
False
>>> True and "branch"
'branch'
>>>

We will be sending two different types: str and bool (False when we don't have a version). This works only because the function checks explicitly for version_type == EXTERNAL.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea, I don't really follow what the code is doing here. The and in the above logic makes me think we're passing a bool. I agree this could be clearer by just using an if statement.

Copy link
Member Author

@stsewd stsewd Jul 6, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We will be sending two different types: str and bool

It will be sending None or string, but I'm changing it to an if

Copy link
Member

@ericholscher ericholscher left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a great refactor, but there's definitely some context from this PR that should be in the code, since it's a different pattern than we've normally used before.

readthedocs/projects/tasks/builds.py Show resolved Hide resolved
readthedocs/projects/tasks/builds.py Show resolved Hide resolved
.. note::

This handler is called even if the task has failed,
so some attributes from the `self.data` object may not be defined.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume this is the bug that we're actually fixing here? 👍

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes

Comment on lines 493 to 495
version_type = self.data.version and self.data.version.type
send_external_build_status(
version_type=self.data.version.type,
version_type=version_type,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea, I don't really follow what the code is doing here. The and in the above logic makes me think we're passing a bool. I agree this could be clearer by just using an if statement.

readthedocs/projects/tasks/builds.py Show resolved Hide resolved
@stsewd stsewd enabled auto-merge (squash) July 6, 2022 23:12
@stsewd stsewd merged commit dc62d61 into main Jul 6, 2022
@stsewd stsewd deleted the refactor-task-data branch July 6, 2022 23:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

AttributeError: 'TaskData' object has no attribute 'version'
3 participants