[WIP] New UI for listing object uses, including in RichText and StreamField. #4702
Nested StreamBlock bug now fixed in 7ec418a.
7ec418a to 19952a6
wagtail/core/collectors.py
Outdated
if not remaining_path:
    # End of path reached; return model instances found within `value`
    if isinstance(current_block, RichTextBlock) and value is not None:
I doubt anyone will actually do it, but what if someone writes a block type returning a `RichText` object? It will not be handled by this line, while checking for the value will handle such cases. This is why I originally checked the data types instead of the block types.
Also, is it possible to end up with a `None` value in a `RichTextBlock`, apart from a manual database rewrite?
(Making this review in the context of 19952a6)
Good point! We need to check against the block definitions for the non-leaf cases, so I was following the same pattern here for consistency... but checking the value type would indeed be more flexible. Will update...
I guess a `RichTextBlock` can't (currently) be `None`, but I think it's good to be cautious here.
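A minimal sketch of the value-type check discussed above, assuming a stand-in `RichText` class with a `source` attribute so the example runs without Wagtail installed; `rich_text_source` is a hypothetical helper name, not code from this PR:

```python
class RichText:
    """Stand-in for Wagtail's RichText value wrapper (assumption for this sketch)."""

    def __init__(self, source):
        self.source = source


def rich_text_source(value):
    """Return the raw rich text if `value` is a RichText, whatever block produced it.

    Checking the value's type rather than the block's type also covers custom
    block types that return RichText, and it rejects None without an extra test.
    """
    if isinstance(value, RichText):
        return value.source
    return None
```

With this shape, a custom block that returns a `RichText` value is handled the same way as `RichTextBlock`, and a `None` value simply yields nothing to search.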
Looks great to me 😄 Later we’ll probably have to add a hook to allow users to add their own introspection rules, but that’s for another day!
wagtail/core/collectors.py
Outdated
else:
    use = cls(obj, parent=parent, on_delete=on_delete)
    if use in exclude:
        if use not in originals:
@BertrandBordage If multiple objects are being passed to `get_all_uses`, it's possible for one object to be in the `originals` list and also part of a `CASCADE` chain for another of those objects, isn't it? If so, we can't just skip over this logic for objects in `originals`.
...in fact, what's the purpose of this line at all? As far as I can see, the only reason why we might want to give `originals` special treatment is that they exist in the `exclude` list as plain instances rather than `Use` records, and it may or may not be a good thing to populate `Use.parent` with a plain model instance rather than a `Use`.
Given that the `parent` attribute of `Use` doesn't appear to be used, tested or documented anywhere, I'm inclined to remove it completely.
@gasman, to answer your questions (sorry in advance 🙁):
- `originals` and `exclude` always contain `Use` instances, not `Model` instances.
- `Use.parent` is used in `Use.__init__`. Each `Use` instance uses the `parent` attribute of its ancestors to define its depth. I agree that it could be removed, incrementing `parent.depth` instead. However, I still think it’s a good thing to keep a `parent` attribute in a tree structure.
- That whole `if use in exclude: if use not in originals: […]` has a single purpose: work around the messy nested structure returned by Django’s `NestedObjects`. It can return something like this for an object `o1`: `[o1, [o5], o2, [o4], o3, [o1]]` (note that I’m simplifying because the example doesn’t contain multitable inheritance, which is correctly handled by this method). In that case:
  - `o2`, `o3` and `o5` are directly related to `o1`. Yes, it doesn’t seem to make sense because `o5` is nested while `o2` and `o3` are at the same level as `o1`. This is the main cause for the complexity of this method. `[o5]` is considered as a list of children of `o1` because it’s following `o1`. `o2` and `o3` are at a different level because they also contain related objects. I know, this is an awful tree structure, but that’s what we have to deal with.
  - `o4` is related to `o2`. We want to list it, but unlike `o5` we have to remember that `o4` is contained in `o2`, whereas `o5` was considered at the root of our uses tree structure. So this means we had to remember the previous item, `o2`, as the parent for the list of children, and therefore we store it in `main_use`. Before that, we search for `other_use`, the first instantiation of `o2` in case there is any other. This is to avoid using extra RAM because millions of uses can be found for a single deletion…
  - `o1` is related to `o3`, but we don’t want to list `o1` since we already know we want to delete it. Since `o1` is already in `exclude` from `get_all_uses`, it will not be shown. To link `o3` to `o1`, we defined the variable `main_use` containing the use preceding the list `[o1]`.
- It’s not possible for one object from `originals` to be in the `CASCADE` chain of another object. `NestedObjects` can only return an object once as a non-leaf node. For example, you couldn’t get `[o1, [o2, [o1, [o3]]]]`. There are two reasons for that: first, obviously, to avoid infinite recursion, and second, because of the SQL optimisations done by `NestedObjects` to avoid triggering too many requests (even though it’s far from being optimised ☹️).
TLDR:
Without `if use not in originals:` (wrong because it shouldn’t show a chevron because nothing is contained)
With `if use not in originals:` (right solution)
Note that we can also get this kind of structure: `[o1, [o2, [o4], o3, [o1], o5]]`. It depends on the case and I couldn’t make sense of it; it’s either a Django bug or an inconsistency due to some internal way of running the requests. From what I remember and what I’ve tested right now, it seems to depend on the relation type: `OneToOneField` and `GenericForeignKey` seem to give the first structure while `ForeignKey` and `ManyToManyField` give the second. Maybe I misunderstood, but anyway the difference in structure is irrelevant to us: we want to show all these relations in the same way.
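A minimal sketch of the flattening described in these comments, assuming plain hashable values stand in for model instances. `flatten_uses` and `depths` are hypothetical names for illustration only, not the code in `collectors.py`; multi-table inheritance, `on_delete` handling and `Use` objects are left out.

```python
def flatten_uses(nested_list, originals, depths=None, depth=0):
    """Return (obj, depth) rows for everything related to `originals`.

    A sub-list is treated as the children of the item just before it.
    Originals are never listed, and their children are shown at root level.
    """
    if depths is None:
        # Originals get depth -1 so that their children land at depth 0.
        depths = {obj: -1 for obj in originals}
    rows = []
    previous = None
    for item in nested_list:
        if isinstance(item, list):
            # Children hang off the first recorded occurrence of `previous`.
            child_depth = depth if previous is None else depths[previous] + 1
            rows.extend(flatten_uses(item, originals, depths, child_depth))
        else:
            previous = item
            if item not in depths:  # skip originals and already listed objects
                depths[item] = depth
                rows.append((item, depth))
    return rows


first = flatten_uses(['o1', ['o5'], 'o2', ['o4'], 'o3', ['o1']], {'o1'})
second = flatten_uses(['o1', ['o2', ['o4'], 'o3', ['o1'], 'o5']], {'o1'})
print(first)   # [('o5', 0), ('o2', 0), ('o4', 1), ('o3', 0)]
print(second)  # [('o2', 0), ('o4', 1), ('o3', 0), ('o5', 0)]
```

Both input shapes give the same set of rows, differing only in order, which is the property the comments above ask for: the two structures should be displayed in the same way.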
That's fine, but can this go in code comments please? :-)
Also, could we perhaps remove the `Use(foo) == foo` behaviour from `Use.__eq__`? I noticed that the tests are relying on that behaviour, but I have no idea if any other code is - and it's yet another thing that's making this PR extremely difficult to review, because every time I see a comparison between `Use` objects, I have to stop and ask myself if it's doing the obvious thing or whether there's some secret type-hacking going on.
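A minimal sketch of a stricter equality, assuming a simplified `Use` that only wraps an `obj` and an optional `parent`. The real class in this PR carries more state (such as `on_delete`), so this is an illustration of the idea rather than the actual implementation:

```python
class Use:
    """Simplified stand-in for the PR's Use class (assumption for this sketch)."""

    def __init__(self, obj, parent=None):
        self.obj = obj
        self.parent = parent

    def __eq__(self, other):
        # Only compare Use with Use; anything else is "not equal".
        if not isinstance(other, Use):
            return NotImplemented
        return self.obj == other.obj

    def __hash__(self):
        # Keep hashing consistent with __eq__ so sets and dicts behave.
        return hash(self.obj)
```

With this version `Use('a') == Use('a')` holds, `Use('a') == 'a'` is false, and membership tests against a set of `Use` instances no longer silently match plain objects.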
Yeah, no problem with removing the ability to compare `Use` and `Model` instances.
Well, you see the problem with putting all that as code comments x)
Tricky to do such complex explanations while keeping the code readable. Do you want me to add a docstring+a few inline comments in this method? I can make a commit in my old branch that you could cherry-pick.
Yes please!
wagtail/core/collectors.py
Outdated
if exclude is None:
    exclude = set()
main_use = None
for i, obj in enumerate(nested_list):
`i` isn't being used, so this can be simplified to `for obj in nested_list`
Indeed. Obviously some vestigial code.
Going to have to postpone this for another release :-( The logic around cascading deletions, wrangling the output of |
wagtail/core/collectors.py
Outdated
if self.is_root:
    return html
return format_html('{}<i class="icon icon-arrow-right"></i> {}',
                   (' ' * self.depth * 8), html)
Oh, this part got lost somehow, maybe during a merge: the space should be a non-breaking space, I’m sure I originally put it. @gasman please include it again, but with an explicit unicode code: `'\u00A0'`.
EDIT: It’s one of my merges that lost it 😦
4b17017 to a80ba37
Rebased with latest fixes cherry-picked.
# When ``use`` is in ``originals``, we don’t want
# to mark it as parent of the next item because we want
# to display the next item as root, otherwise the first
# level of deleted objects would be shown nested.
@BertrandBordage OK, let me see if I've got this right: this logic comes into play when `nested_list` is a structure such as:
[a, [b], b, [c]]
where:
- `b` is one of the items from `originals`
- `b` appears in the child list of an earlier item
- the top-level occurrence of `b` in the list is immediately followed by a sub-list
In this case, we do not set `parent_use` to `b` - it gets left at its existing value, which is `a`, and so `c` ends up being assigned as a child of `a`. Is that what you intended to happen? Is this actually a legitimate way that NestedObjects might represent `c` being a child of `a`, and are we relying on the fact that if `c` was meant to be a child of `b`, the list would look like `[a, [b, [c]], b, [c]]` (in which case the second `c` would be skipped along with the second `b`)? Ouch...
@gasman No, `c` will not be marked as a child of `a` because `a` is also in `originals`. So for `c`, `parent_use` is `None`. This makes sense, because when you want to see what’s related to `a` and `b`, you only want `c` to be listed as a root node. You do not want to display `a` or `b`.
In other words, the algorithm answers:
What is related to a & b?
with
c
`c` is correctly meant to be a child of `b`, and this algorithm takes it into account: if `b` was not in `originals`, then `b` would be a root node and `c` would be a child of `b`.
In other words, the algorithm answers:
What is related to a?
with
b
- c
You might want to tell me “I didn’t mention `a` was in `originals`”. But that’s necessarily the case: `NestedObjects` returns a structure where the order is important. The first root nodes are always from `originals`, which is why this iteration logic of remembering the previous root node works.
@BertrandBordage Ah, I missed the detail that `exclude` always contains the originals whenever we call this - lines 565-566 (which are never used?) led me to understand that the initial state was the empty set. Could we make `exclude` a required argument, or at least change those lines to
if exclude is None:
    exclude = set(originals)
so that it's not misleading?
We can indeed add it. I kept it consistent with `from_flat_iterable`, but that was probably not the best decision.
# to mark it as parent of the next item because we want
# to display the next item as root, otherwise the first
# level of deleted objects would be shown nested.
if use in originals:
@BertrandBordage Would it be valid to set `parent_use = None` at this point? I think this is what's hurting my head so much about this code 😀 : the fact that `parent_use` is being left at its previous value in this case, which might be something left over from way earlier in the iteration. I suspect you're relying on an unspoken rule about the behaviour of `NestedObjects` which ensures that `parent_use` is only ever `None` at this point: i.e. it will never give you a situation like
originals = exclude = {a, b}
nested_list = [a, c, b, [d]]
where a non-original (`c`) appears at the top level before all the originals have been output. If so, setting `parent_use = None` would make the intention clearer, and save us from having to figure out whether that rule is really true.
Yes, I’m indeed relying on that order rule, as I mentioned in my previous message.
You’re totally right, we should explicitly add another `parent_use = None` here to make it extra clear.
I tried too hard to write an optimal algorithm…
Sorry btw for the headache
a80ba37 to dc94947
dc94947 to b7d1ec2
b7d1ec2 to 5c824e4
5c824e4 to bb84a96
@solarissmoke has fixed the query counts in bb84a96, but combining this PR with a couple of other patches (#5266, #5286, #5353, none of which change the test fixtures AFAICS) for a custom build has caused them to change again. Not sure what's causing the variation, but that'll definitely need to be dealt with before this can be merged.
….com/gasman/wagtail - wagtail#4702 Conflicts: wagtail/images/tests/test_admin_views.py
See wagtail#4702 (comment) - they can't be relied on to remain stable when combined with other PRs
Alright, I will get back to this cursed PR this summer ^^"
Manage this branch in Squash. Test this branch here: https://bertrandbordage-new-uses-ui-re-g14rq.squash.io
@BertrandBordage do you think there will be any progress soon? One of my clients wants this feature and I'm wondering if I should invest any time in making it work or wait until this PR is finished and shipped in a future version.
I think this should be customizable somehow. For example, `is_shown_in_uses()` and `get_edit_url()` can't be easily set on third-party models; also, at work we'd like to `PROTECT` instances found in StreamFields and RichTextFields.
I think we could convert `get_all_uses()` to a class, set `ModelRichTextCollector`, `StreamFieldCollector` and `Use` as its attributes and make this new class overridable via a setting.
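One way this suggestion could look, as a rough sketch only: `UsesFinder` and `load_finder` are hypothetical names, and the dotted path would come from some project setting that does not exist in this PR. The collector and `Use` attributes are omitted to keep the example self-contained.

```python
from importlib import import_module


class UsesFinder:
    """Hypothetical class-based replacement for a module-level get_all_uses()."""

    def is_shown_in_uses(self, obj):
        # Fall back to "shown" for third-party models that lack the method.
        method = getattr(obj, 'is_shown_in_uses', None)
        return method() if callable(method) else True

    def get_edit_url(self, obj):
        # Subclasses can override this to supply URLs for third-party models.
        method = getattr(obj, 'get_edit_url', None)
        return method() if callable(method) else None


def load_finder(dotted_path):
    """Instantiate the class named by a dotted path such as 'myapp.uses.MyFinder'."""
    module_path, _, class_name = dotted_path.rpartition('.')
    return getattr(import_module(module_path), class_name)()
```

A project could then subclass the finder to override `get_edit_url()` for models it does not control, without patching those models.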
{% block titletag %}{% trans "Delete image" %}{% endblock %}

{% block content %}
{% trans "Delete image" as del_str %}
{% include "wagtailadmin/shared/header.html" with title=del_str icon="image" %}
{% include "wagtailadmin/shared/header.html" with title=del_str subtitle=image icon="image" %}
I think this change should be moved to another PR.
<input type="submit" value="{% trans 'Yes, delete' %}" class="button serious" />
</form>
{% if uses.are_protected %}
<p>{% trans 'Impossible to delete: this object is referenced by other objects through protected relations.' %}</p>
Might make sense to move this line into a shared template. Also might make sense to make it more prominent with `class="help-block help-critical"`.
for field in self.fields:
    filters |= Q(**{field.attname + '__regex': pattern})
I think we should generate the pattern for each field separately, considering the features it uses, i.e. not search for `Document` in `RichTextField(features=('bold', 'italic'))`.
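A rough sketch of the per-field idea, using Python's `re` module rather than a database regex. The `FEATURE_MARKERS` mapping and the `reference_pattern` helper are assumptions made for illustration: they presume that page links, document links and images are stored as `<a linktype="…" id="…">` and `<embed embedtype="…" id="…">`, and that the features producing them are named `link`, `document-link` and `image`.

```python
import re

# Assumed mapping: feature name -> (tag, attribute, value) marking a reference.
FEATURE_MARKERS = {
    'link': ('a', 'linktype', 'page'),
    'document-link': ('a', 'linktype', 'document'),
    'image': ('embed', 'embedtype', 'image'),
}


def reference_pattern(features, kind, object_id):
    """Return a compiled regex matching `kind` references to `object_id`.

    Returns None when no enabled feature can produce that kind of reference,
    so the field can be skipped entirely.
    """
    for feature in features:
        marker = FEATURE_MARKERS.get(feature)
        if marker and marker[2] == kind:
            tag, attr, value = marker
            # Lookaheads make the match independent of attribute order.
            return re.compile(
                r'<%s\b(?=[^>]*\s%s="%s")(?=[^>]*\sid="%d")[^>]*>'
                % (tag, attr, re.escape(value), object_id)
            )
    return None
```

For a field declared with `features=('bold', 'italic')`, `reference_pattern(..., 'document', 42)` returns `None`, so no query needs to be built for it. A real implementation would also have to produce a pattern the database backend accepts, which this sketch does not attempt.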
{% csrf_token %}
<input type="submit" value="{% trans 'Yes, delete' %}" class="button serious" />
<a href="{% if next %}{{ next }}{% else %}{% url 'wagtailsnippets:list' model_opts.app_label model_opts.model_name %}{% endif %}" class="button button-secondary">{% trans "No, don't delete" %}</a>
</form>
`{% include 'wagtailadmin/shared/uses.html' %}` should be added here.
Now rebased at #6129 - will keep both PRs open for now so that @sir-sigurd's feedback (thanks!) doesn't end up getting lost in transmission... Copying my comment from there:
Right now I think the main thing standing in the way of merging this is that it really needs refactoring to be less "monolithic". As it stands, the logic in collectors.py is touching on the internal details of many areas of Wagtail (rich text, streamfields, foreign key on_delete behaviour...) which is going to make it hard to maintain - for example, if we extend rich text to support other kinds of object references beyond linktype and embedtype (see #4223) then it's not immediately clear what needs to change in collectors.py.
I think the key to that will be making the concept of "a collector" into a well-defined internal API, so that whenever we have a field type that can contain object references (which currently includes ForeignKey, RichTextField and StreamField), there's a corresponding Collector class that implements a few standard methods for querying those references. After the previous review on #4702 (which teased out a lot of the internal workings of collectors.py and added docstrings) we hopefully have a fairly good idea of what those methods will be - something along the lines of:
Given how rich text and streamfields interact in RichTextBlock, it may also be necessary to add some lower-level methods to this API, such as "return a database-specific regexp for matching an object reference". Hopefully splitting the logic up in this way will also help to make our tests more tightly-scoped, and pin down what's causing the changing query counts.
yield obj, found_obj


class StreamFieldCollector:
@gasman Do you think we could simplify this if we could switch stream fields to be based on JSONField?
Possibly, although then we still have to deal with RichTextBlocks, which means we'll end up mixing two techniques (doing regexp matches on items within a JSON field, rather than just one big regexp).
An alternative mechanism for reference-counting has now been implemented in #9279. Rebuilding the deletion confirmation page from this PR on top of that mechanism should be a relatively small task, at which point this can be considered resolved.
Superseded by #10072. Instead of showing the references on the delete confirmation view, we're reusing the existing usage view and adding the on delete information there if you access it through a link on the delete confirmation view.
Rebase of #4481, with the following fixes applied in response to the earlier code review there:
- `LinkHandler` and `EmbedHandler` now deal purely with database-representation / front-end-HTML semantics, and hallo.js-specific legacy code remains elsewhere)
- `ModelRichTextCollector.find_object` is split into a separate `find_all_objects` method instead of overloading
- `StreamFieldCollector.find_values` -> `find_objects`
Still to do:
- `StreamFieldCollector.find_objects` when recursing over nested StreamBlocks: https://github.com/wagtail/wagtail/pull/4481/files#r190932257
- `StreamFieldCollector` onward, and view-level code)