-
-
Notifications
You must be signed in to change notification settings - Fork 7.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add truncate template function #2882
Conversation
if err != nil { | ||
return "", errors.New("text to truncate must be a string") | ||
} | ||
text, err = cast.ToStringE(textParam) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use :=
and remove var text string
above to delay the allocation.
} | ||
|
||
var count int | ||
words := strings.Fields(text) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd bet that something like this would be more efficient. Doesn't use strings.Fields
that would generate more allocs.
This commit adds a truncate template function for safely truncating text without breaking words. The truncate function is HTML aware, so if the input text is a template.HTML it will be truncated without leaving broken or unclosed HTML tags. {{ "this is a very long text" | truncate 10 " ..." }} {{ "With [Markdown](/markdown) inside." | markdownify | truncate 10 }}
Thanks @moorereason I've added both performance optimizations. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@biilmann a couple of comments/requests:
- Pull the implementation and test (not the "smoke test") out into its own files, name it template_func_truncate.go and template_func_truncate_test.go. The files they live in now have gotten a bit on the long side.
- The text truncate variant (I have not checked the HTML code path) assumes that every character is 1 byte, which fail pretty fast.
- Hugo is Big in Japan ... And Japanese and the other CJK languages are inherently space-less. Needs test cases to confirm that this works.
- The test line coverage looks ... average. I don't care about obvious error paths, but the other conditionals should be covered (or removed if not relevant).
All in all, it looks fine. It doesn't look particularly fast nor memory effective, but I guess a faster version would be much more complex and hard to read.
@bep, Can we stop creating |
@moorereason it should be pretty obvious if you add those missing test cases. As to naming, I just follow a naming convention already there. Let us take that discussion somewhere else. |
Spot on with the unicode comments. I've made some changes that fixes the issues with unicode slicing of the texts and that should handle languages with no spaces. Take a look and let me know if you spot any other issues. The two code paths between text and HTML truncation are much more similar now, and I'll want to refactor a bit to have just one path before this is ready to merge... |
Add test cases for some edge cases and japanese characters
Yea, this looks more like a |
Here is a failing test case for you: {2, template.HTML("<p>P1</p><p>P2</p>"), nil, template.HTML("<p>P1 …</p>"), false}, |
Avoid having two separate code branches for truncating text and HTML
Good catch - fixed that edge case and got rid of the duplication between the text and HTML truncation. |
|
||
if isHTML { | ||
// Make sure we keep tag of HTML tags | ||
slice := string(text[i:]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
text
is a string now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
There is a flaw in the tag-closing logic. This test case fails: {3, template.HTML(strings.Repeat("<p>P</p>", 20)), nil, template.HTML("<p>P</p><p>P</p><p>P …</p>"), false},
|
@@ -662,6 +662,15 @@ e.g. | |||
* `{{slicestr "BatMan" 3}}` → "Man" | |||
* `{{slicestr "BatMan" 0 3}}` → "Bat" | |||
|
|||
### truncate | |||
|
|||
Truncate a text to a max length without cutting words or HTML tags in half. Since go templates are HTML aware, truncate will handle normal strings vs HTML strings intelligently. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would say:
Truncate a text to a max length without cutting words or leaving unclosed HTML tags. Since Go templates are HTML-aware, truncate
will handle normal strings vs HTML strings intelligently. It's important to note that if you have a raw string that contains HTML tags that you want treated as HTML, you will need to convert the string to HTML using the safeHTML
template function before sending the value to truncate
; otherwise, the HTML tags will be escaped by truncate
.
`{{ "<em>Keep my HTML</em>" | safeHTML | truncate 10 }}` → `<em>Keep my …</em>`
Rewrote the tag closing logic and updated the docs. Added a few more edge case test cases as well around tag closing. |
This pull request has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs. |
This commit adds a truncate template function for safely truncating text without
breaking words. The truncate function is HTML aware, so if the input text is a
template.HTML it will be truncated without leaving broken or unclosed HTML tags.
The HTML truncation is based on Django's truncatehtml template helper, so while I would normally be really cautious about doing anything regexp based with HTML, this is a very battle tested implementation. I've also tested it on a corpus of 2600+ blog posts from a large publication.