Support S7 extension of S4 classes and vice-versa#659
Conversation
87b33e2 to
5e5d36d
Compare
5e5d36d to
3650492
Compare
3650492 to
eb1092b
Compare
hadley
left a comment
There was a problem hiding this comment.
I assume you know infinitely more about S4 than me 😄 so I focussed on docs and style.
Also need an update to NEWS.md?
| old-class object, not as an S4-bit object, so `isS4(child)` is `FALSE`. | ||
| This avoids advertising full S4 object invariants that the object does not | ||
| satisfy. | ||
| * The S3 class vector is not scalar. S4 code that needs one primary class name |
There was a problem hiding this comment.
How common is this use of scalar class()?
There was a problem hiding this comment.
Pretty common, unfortunately. I would guess most uses are in show() methods.
| }) | ||
|
|
||
| it("can convert_up() an S4-derived S7 object to an S4 object ", { | ||
| on.exit(S4_remove_classes(c("ParentS4", "ChildS7"))) |
There was a problem hiding this comment.
Might be worth considering extening local_S4_class() to handle more cases so we could skip the explicit cleanup steps (which are easy to forget)?
There was a problem hiding this comment.
local_S4_class() is convenient and robust, but it obfuscates the code, in my opinion. I think readers expect to see setClass(), setOldClass(), and setClassUnion(). I'd prefer something like S4_remove_after(setClass(...)).
How about we defer this to a future change after merging this branch and discussing? Since the changes are so extensive, rebasing is becoming a chore. It might be better to iterate over multiple MRs.
t-kalinowski
left a comment
There was a problem hiding this comment.
@lawremi I took a cursory look and didn’t see any obvious issues. Could you please tag me again once you’ve had a chance to resolve Hadley’s comments? I’ll take a more thorough look then. That way I can avoid duplicating effort or leaving comments that might become stale.
7a09607 to
238130f
Compare
|
I think all of the major comments have been resolved, so @t-kalinowski can please give it a look. |
a2cb084 to
c88c241
Compare
|
Since @t-kalinowski had the good idea of using AI to help review this branch, I asked an agent to generate this review guide that walks a reviewer through the diff. The links are to code, not diffs, so I'm not sure how helpful they are, but they do point to cohesive bits of logic. Please let me know if this helps and/or if you'd like to schedule a call for me to walk you through this. Review Guide for PR #659: S4/S7 InheritancePR: #659 This PR is large because the feature crosses object representation, S4 class What This PR EnablesThe goal is two-way S4/S7 inheritance interop:
Suggested Review Order1. Start With The Public ContractRead these first, before implementation details:
Questions to answer at this level:
2. Review The Core S4 BridgeMost of the feature lives in
Key invariants:
3. Review Object Representation And Class LookupThese are the representation-level changes reviewers should check carefully:
Key invariants:
4. Review Property Access Across The BoundaryProperty access is intentionally not perfectly encapsulated once S4 code is
Known compromise:
5. Review ValidationThe validation path is split between S7 and S4:
Key questions:
6. Review Dispatch And Method Registration LastThe dispatch-related change is smaller and mostly separable:
Key invariant:
Tests Worth ReadingIt is probably worth skimming some of the tests. Most of the behavioral spec is in Recommended test slices:
Lower-Risk / Mechanical FilesThese are useful to skim after the core review:
Known Caveats To ConfirmThese are intentional and documented:
|
hadley
left a comment
There was a problem hiding this comment.
IMO it's worth doing a little more style polishing but really I it's fine to merge as is. It doesn't need to be perfect, it just needs to advance S4 support and if we get it in quickly we'll avoid a bunch of merge conflicts.
| on.exit(S4_remove_classes(c("ParentS4", "ChildS7"))) | ||
| setClass("ParentS4", slots = list(x = "numeric")) | ||
|
|
…cing to an S3 class, and use methods::extends() to capture the entire class vector, not just the first class
…@() behavior, which needs to change in base R.
…S4_register() can now communicate any S7 class structure through the S4class= argument.
…t initialize() except uses the S7 property setting path.
… parent object construction in constructor.
…presentation along the old class chain
…ic is internal and an element of the signature has an S4 ancestor. This is needed because internal generics will favor S4 methods on S4 objects, so there is potential for inheriting overrides.
…ary S4 classes instead of old classes, because an old class implies an S3 instance of the object can exist, ie, there can be an S3Part().
… it to get stripped during S4 upcasting, breaking the S7 object
…in principle, there should only be one ::S4Slots shim for each S4 derivative of an S7 class
Co-authored-by: Hadley Wickham <h.wickham@gmail.com>
Co-authored-by: Hadley Wickham <h.wickham@gmail.com>
…S4 to inherit from S7; there's just two types of classes: the stub old class and the reified old class. The former for when only dispatch is needed, the latter for when inheritance (in either direction) is needed.
…f S4_remove_classes().
…ontains=TRUE used to do), which ensures that classes are always able to be used as S4 superclasses. Introduce a separate helper S4_contains() that just ensures the S7 class is a suitable parent of an S4 class.
8da17d8 to
616de46
Compare
|
@t-kalinowski is it OK if we merge this and then address concerns from #720? There are some important things to fix there, but I'd rather do it on top of this. |
|
Tomasz is out on vacation so I’ll answer on his behalf 😀 yes it’s fine. I’ve also had conversations with him about my sense that codex tends to make code more technically correct at the expense of readability/understandability. |
Fixes #456.
First iteration on enabling an S7 class to extend an S4 class and vice-versa.
The concrete, fail-fast goal is to enable SummarizedExperiment, a complex, central package in Bioconductor, to be rewritten in S7, based on S4 classes defined lower in the stack, like those from S4Vectors and IRanges, while not breaking compatibility with dependent packages, like SingleCellExperiment, via S4 shims extending the S7 classes. SingleCellExperiment is complex and deeply dependent on SummarizedExperiment, so if it continues working, it is a very good indicator.
What has been achieved outside of these improvements to S7:
class1()helper to be explicit when it wants the scalar name of the object's class and to ensure that it is robust to S3-style class vectors. Also improved thesetValidity2()helper so that validity functions are enclosed in the S4Vectors namespace, enabling S4 class objects to remain identical across lazy loading. This is important since S4 class objects end up in the S7 class object, which is carried by S7 objects, and we want them to pass identity checks across packages.For S7 classes extending S4 classes, the current approach is to instantiate S7 objects, meaning that the S4 bit is not set. The justification is that there are S4 object invariants that an S7 object cannot satisfy, and vice versa. The big one being the return value of
class(). Instead we have an S7 shim that does a reasonable job of mimicking an S4 object, including slot access, validity checking, and initialization, but notclass()compatibility. See the evolving man page insert for more details.When an S7 class is made to inherit from an S4 class,
S4_register()is automatically called on the S7 class. This branch extendsS4_register()to declare the S7 properties andS7_classattribute as S4 slots, which is helpful for making the S4 class more self-describing but critical in case the S7 class needs to be later extended by S4.To extend an S7 class using
setClass(), the S7 class first needs to be registered viaS4_register()andS4_contains()should be called to obtain a suitable value for thecontains=argument.S4_contains()is an abstraction that currently just requires the S7 properties to be static, ie not using getters and setters, since S4 code usingslot()would bypass those. Any instances of an S7-derived S4 class are proper S4 objects, with the bit set and a scalar class vector.Besides adding and changing hundreds of lines of logic around S4 classes, significant changes to support this in S7 include:
NULLslot sentinels. Until now, setting an S7 property toNULLremoved the attribute, and S4 does not like it when its slots disappear from the attributes.prop.cneed to be robust to scalar class vectors, using@.S3Classinstead.dimnames<-()generic is an example.@<-.S7_object()needs to support slots defined on S4 children that are not S7 properties, because S4 classes that extendS7_objectwill dispatch to the method, even when the S4 bit is set.Resolving the two incompatible
class()return values, scalar vs. non-scalar, is a larger issue. My proposal is that we exportmethods:::.class1()asmethods::class1()and encourage all S4 code to useclass1()to make the expectation of a scalar return value explicit. Practically, we could update the core of Bioconductor to adhere to that policy and thus enable new Bioconductor packages to extend the infrastructure with S7.