Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

FIX: Ignore max excerpt length for div excerpts too #13058

Merged

Conversation

AndrewPrigorshnev
Copy link
Contributor

We support two types of custom excerpts. It can be <div class="excerpt"> or <span class="excerpt">. We also ignore max excerpt length for custom excerpts. But we forgot to process div when ignoring max length.

@AndrewPrigorshnev AndrewPrigorshnev marked this pull request as ready for review May 13, 2021 15:58
lib/excerpt_parser.rb Outdated Show resolved Hide resolved
it "ignores max excerpt length if a div excerpt is specified" do
two_hundred = "123456789 " * 20 + "."
text = two_hundred + "<div class='excerpt'>#{two_hundred}</div>" + two_hundred
expect(PrettyText.excerpt(text, 100)).to eq(two_hundred)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is it right to ignore? seems a bit odd. I would just truncate at 100

Copy link
Contributor Author

@AndrewPrigorshnev AndrewPrigorshnev May 20, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was implemented like this when the feature was introduced. So it's just documenting existent implementation.

I had the same doubts, but eventually, I feel that yes, it's a good design decision.

We always truncate automatic excerpts. But when it comes to custom excerpts it's better to give a user control over this. Probably the user decided to use a custom excerpt just to show that last word or even sentence that was truncated by automatic excerpt extraction.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having read this, I think we're using one variable length for two different purposes.

When you create an instance of the ExcerptParser, the length parameter is the maximum length of the excerpt to be created.

However, when the parser detects the user added a custom excerpt, it will change the value of that length parameter to ensure it's properly picked up even if it's at the end of the post.

While I like the idea of giving users the ability to provide a custom excerpt anywhere in the post, I do not like that it's not limited in any sort of ways. That's a recipe for abuse 😱

What I reckon we should do here @AndrewPrigorshnev is when we detect a custom excerpt, we should change the html with the content of the custom excerpt instead of changing the length of the excerpt. That way we get the best of both worlds: users can add a custom excerpt anywhere in their post using a div or a span, and it's limited according to the site setting.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

However, when the parser detects the user added a custom excerpt, it will change the value of that length parameter to ensure it's properly picked up even if it's at the end of the post.

Not only for that, but to override max excerpt length from settings too. It was intentionally implemented in this way. Here is the post from the same topic, and it was committed back then in 2014 with the message "FEATURE: Allow manual excerpt to be specified anywhere in the post and override max excerpt length".

I think this behavior makes sense. The main reasoning behind this change was to give users more control over excerpts:

"My major issue is that there is zero mod control here, you have an excerpt and can not control how it looks or when it stops."

Let's say my site has a limit for automatic excerpts in 100 chars, and I want to use 200 chars length excerpt for some important pinned topic. I may use a custom excerpt, and I wouldn't want it to be truncated to 100 chars. I wouldn't want to increase the limit for the whole site just because of this only topic, either.

Speaking of abuse, probably someone can start using tags <div class="excerpt">a looong message</div> to introduce clutter on the site. But in the same way, he can just start using four-letter words in his posts. Both problems are approachable by using standard moderation tools (flags, suspension, and so on). Maybe there could be another way to abuse it that I don't see? I think the worst thing that could happen here is that someone includes the whole post in the excerpt.

We definitely can change this behavior. But it's quite bigger than the change this PR is introducing. This would mean changing behavior that was intentionally introduced before and, in my opinion, makes sense. It would make this power user feature less powerful. Do we really want it?

Copy link
Contributor Author

@AndrewPrigorshnev AndrewPrigorshnev May 20, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would make this power user feature less powerful.

Also if someone out there uses long custom excerpts let's say for categories, their excerpts will get truncated after this fix. And the only thing they will be able to do about it is to increase the excerpts limit for the whole site.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's an excellent idea 👍

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can let custom excerpts bypass max excerpt length but always truncate all excerpts by 1000 chars.

🤔 not following, are you saying

We no longer let custom excerpts bypass max excerpt length we truncate at 1000 chars.

Agree with @ZogStriP we need an upper limit otherwise people can sadly grief us and use this escape hatch to break rendering on front page.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤔 not following, are you saying

We no longer let custom excerpts bypass max excerpt length we truncate at 1000 chars.

Nope, not exactly. An example:

Let's say topic excerpt maxlength set to 220 characters:
Screenshot 2021-05-21 at 13 02 01
Then our algorithm for automatic excerpt extraction will take maximum first 220 characters of a post. The rest will be truncated.

If a user decides to use a custom excerpt (using <div class="excerpt"> or <span class="excerpt">) that excerpt would bypass this restriction. If he wants to add a few more words above the topic excerpt maxlength limit it would work. If he wants to use a custom excerpt of 500 characters for some important pinned topic it would work too.

But to prevent misuse and abuse we will truncate all excerpts by 1000 chars anyway:

  • It's already implemented for automatic excerpts, users just can't set topic excerpt maxlength to a value bigger than 1000:
    Screenshot 2021-05-21 at 13 15 44
  • And we will add this protection for custom excerpts too. So custom excerpts would bypass topic excerpt maxlength setting but would be truncated on 1000'th character anyway.

Again I'd like us to add this protection in a separate PR or it would be really hard to follow. We can firstly merge this PR since it's just a bug fix, we just forgot to process div in one place and then we can merge another PR with misuse protection.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SamSaffron posted:

It is a bit confusing but sure sounds fine to merge this as is

Custom excerpts are such a giant edge case, as long as we have some level of protection (as you described) I think we are ok

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fine, @SamSaffron thank you for bringing it up! And @ZogStriP thank you for this nice idea:

What I reckon we should do here @AndrewPrigorshnev is when we detect a custom excerpt, we should change the html with the content of the custom excerpt instead of changing the length of the excerpt.

it'll help a lot with implementing this.

Co-authored-by: Jarek Radosz <jradosz@gmail.com>
@AndrewPrigorshnev AndrewPrigorshnev changed the title FIX: Ignore max excerpt length for div excepts too FIX: Ignore max excerpt length for div excerpts too May 20, 2021
@AndrewPrigorshnev AndrewPrigorshnev merged commit c62efc0 into master May 24, 2021
@AndrewPrigorshnev AndrewPrigorshnev deleted the fix/ignore-max-excerpt-length-for-div-excerpt-too branch May 24, 2021 09:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

5 participants