Makes article properties over writable by parser #330

Open
wants to merge 3 commits into
base: master

Conversation

MaxDall
Collaborator

@MaxDall MaxDall commented Dec 10, 2023

This PR lets a parser overwrite properties, such as lang, that are defined on the Article class. It does so by replacing the former property logic with a custom one. The new implementation also caches its values, so it behaves like cached_property, except that the value can be overwritten.

For example, to overwrite the lang attribute of Article, add the following to the parser in question:

from fundus.parser import ..., overwrite_attribute

    ...

    @overwrite_attribute
    def lang(self) -> str:
        return "This is a new language value"

closes #328
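
As a rough illustration of what "cached like cached_property, but overwritable" can mean in plain Python, here is a minimal sketch. It is not the `_CachedAttribute` class from this PR: the class name `OverwritableCachedAttribute`, the simplified `Article`, and the `detections` counter are hypothetical and exist only for the example. It relies on the standard rule that a descriptor without `__set__` is shadowed by an entry in the instance dict.

```python
from typing import Any, Callable, Optional


class OverwritableCachedAttribute:
    """Hypothetical stand-in; not the PR's actual _CachedAttribute."""

    def __init__(self, func: Callable[[Any], Any]) -> None:
        self.func = func
        self.name = func.__name__

    def __set_name__(self, owner: type, name: str) -> None:
        self.name = name

    def __get__(self, instance: Any, owner: Optional[type] = None) -> Any:
        if instance is None:
            return self
        # Compute once and store the result in the instance dict. Because this
        # class defines no __set__, the instance dict entry takes precedence
        # over the descriptor on every later lookup.
        value = self.func(instance)
        instance.__dict__[self.name] = value
        return value


class Article:
    """Simplified, hypothetical article class for illustration only."""

    def __init__(self, plaintext: str) -> None:
        self.plaintext = plaintext
        self.detections = 0  # counts how often lang is actually computed

    @OverwritableCachedAttribute
    def lang(self) -> str:
        self.detections += 1
        return "en"  # placeholder for real language detection


cached = Article("Hello world")
first, second = cached.lang, cached.lang  # second access hits the cache

overwritten = Article("Hallo Welt")
overwritten.lang = "de"  # an externally supplied value replaces the default
```

In this sketch the overwritten article never runs the default computation at all, which is the behaviour a parser-supplied value would need.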

src/fundus/parser/base_parser.py (outdated)
-    def __init__(self, func: Callable[[object], Any], priority: Optional[int], validate: bool):
-        self.validate = validate
+    def __init__(self, func: Callable[[object], Any], priority: Optional[int], validate: bool, overwrite: bool = False):
+        self.validate = validate if not overwrite else False
Collaborator

Just writing the true path first for clarity.

Suggested change
-        self.validate = validate if not overwrite else False
+        self.validate = False if overwrite else validate

@@ -137,7 +145,7 @@ def validated(self) -> "AttributeCollection":

     @property
     def unvalidated(self) -> "AttributeCollection":
-        return AttributeCollection(*[attr for attr in self.functions if not attr.validate])
+        return AttributeCollection(*[attr for attr in self.functions if not attr.validate and not attr.overwrite])
Collaborator

Doesn't the following line already modify the validate attribute, such that (un)validated attributes are identified correctly?

self.validate = validate if not overwrite else False

Suggested change
-        return AttributeCollection(*[attr for attr in self.functions if not attr.validate and not attr.overwrite])
+        return AttributeCollection(*[attr for attr in self.functions if not attr.validate])
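
To make the question concrete, here is a small sketch that compares the two filters. `FakeAttribute` is a hypothetical stand-in for the attribute wrapper in base_parser.py, and the attribute names (`title`, `free_access`, `lang`) are made up for the example. The only thing taken from the diff is the rule that `overwrite=True` forces `validate` to `False`.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class FakeAttribute:
    """Hypothetical stand-in for the attribute wrapper in base_parser.py."""

    name: str
    validate: bool
    overwrite: bool = False

    def __post_init__(self) -> None:
        # Mirrors the constructor line from the diff.
        self.validate = False if self.overwrite else self.validate


functions: List[FakeAttribute] = [
    FakeAttribute("title", validate=True),
    FakeAttribute("free_access", validate=False),
    FakeAttribute("lang", validate=True, overwrite=True),
]

# Filter as written in the PR.
with_overwrite_check = [a.name for a in functions if not a.validate and not a.overwrite]
# Filter from the suggested change.
without_overwrite_check = [a.name for a in functions if not a.validate]
# The validated side is unaffected by either variant.
validated = [a.name for a in functions if a.validate]
```

Under these assumptions the two variants differ only in whether overwriting attributes show up in the unvalidated collection, so the answer depends on whether that is intended.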

import functools


class _CachedAttribute(object):
Collaborator

In general, I was a bit confused by the features added in this PR: first the overwrite_attribute decorator and second the property caching. I'll spare you my earlier confused comments, but at first I thought these were separate features and that the custom cached attribute was only meant to improve general performance. Nevertheless, here is how I understand it now:

The original problem outlined in #328 is caused by the @property decorator on lang, which we want to overwrite in the parsers. We cannot keep using standard properties here because they define no setter, which prevents them from being overwritten in subclasses by methods carrying the @attribute decorator. Did I understand this correctly? Including the origin of the error, or at least an error traceback, in the issue or the PR description would have been helpful.
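
If that reading is right, the failure mode can be reproduced in isolation. The sketch below is not the traceback from #328 and does not use the real Article class; the simplified `Article` and the `setattr` call are assumptions standing in for whatever code assigns parser results to the article. It only shows the standard Python behaviour of a property without a setter.

```python
class Article:
    """Hypothetical, simplified stand-in for the real Article class."""

    def __init__(self, plaintext: str) -> None:
        self.plaintext = plaintext

    @property
    def lang(self) -> str:
        return "en"  # placeholder for real language detection


article = Article("Hallo Welt")

try:
    # Roughly what assigning a parser-provided value would amount to.
    setattr(article, "lang", "de")
    error = None
except AttributeError as exc:
    # A property without a setter rejects assignment.
    error = exc
```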

Comment on lines +79 to +81
+    def __init__(self, func: Callable[[object], Any], priority: Optional[int], validate: bool, overwrite: bool = False):
+        self.validate = validate if not overwrite else False
+        self.overwrite = overwrite
Collaborator

Could you clarify why an overwritten attribute cannot also be a validated attribute? If we allow overwritten attributes to be validated, they are the same as the regular attributes.

import functools


class _CachedAttribute(object):
Collaborator

I'm not sure this whole attribute-caching mechanism via a custom property is necessary. Some alternatives come to mind:

  1. We could define a setter for the relevant properties. But this would be bad, since we don't want lang etc. to be openly mutable.
  2. From a design perspective, why are lang and plaintext computed in the article and not in the parser? Maybe we can move both properties from the Article to the BaseParser as attributes. We would then need to include lang and plaintext as fields in the Article dataclass, so that they are populated later, like the other attributes. This would also make it explicit that these are attributes one may overwrite from the BaseParser, while Article defines the skeleton of attributes as dataclass fields.
  3. What would be wrong with the regular cached_attribute? I tried to replace it with the custom one and it seemed to work fine, even when overwriting it in a parser.
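
For the third alternative, assuming "the regular cached_attribute" refers to `functools.cached_property` from the standard library, the following sketch shows why overwriting can work with it. The simplified `Article` is hypothetical. `cached_property` defines no `__set__`, so a plain assignment writes to the instance dict and shadows the computed value.

```python
from functools import cached_property


class Article:
    """Hypothetical, simplified stand-in for the real Article class."""

    def __init__(self, plaintext: str) -> None:
        self.plaintext = plaintext

    @cached_property
    def lang(self) -> str:
        return "en"  # placeholder for real language detection


computed = Article("Hello world")
computed_lang = computed.lang  # computed on first access, then cached

overwritten = Article("Hallo Welt")
overwritten.lang = "de"  # allowed: stored in the instance dict
```

Whether this carries over unchanged to the real Article depends on how that class is declared; for example, a frozen dataclass or a class using `__slots__` would need extra care.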

Development

Successfully merging this pull request may close these issues.

Allowing overwrite of lang attribute
2 participants