Add truncate template function #2882

biilmann · 2017-01-03T22:29:29Z

This commit adds a truncate template function for safely truncating text without
breaking words. The truncate function is HTML aware, so if the input text is a
template.HTML it will be truncated without leaving broken or unclosed HTML tags.

{{ "this is a very long text" | truncate 10 " ..." }}
{{ "With [Markdown](/markdown) inside." | markdownify | truncate 10 }}

The HTML truncation is based on Django's truncatehtml template helper, so while I would normally be really cautious about doing anything regexp-based with HTML, this is a very battle-tested implementation. I've also tested it on a corpus of 2600+ blog posts from a large publication.
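
For readers who want a feel for word-safe truncation of plain text, here is a minimal sketch. It is not the code from this PR: `truncateWords` is a hypothetical helper, and it assumes the limit is counted in runes and that backing up to the previous whitespace is the desired behavior.

```go
package main

import (
	"fmt"
	"unicode"
)

// truncateWords is a hypothetical helper (not Hugo's implementation).
// It cuts text to at most max runes, backs up to the previous whitespace
// so that no word is split, and appends ellipsis only when it cuts.
func truncateWords(text string, max int, ellipsis string) string {
	runes := []rune(text)
	if max < 0 || len(runes) <= max {
		return text
	}
	cut := max
	// If the cut lands inside a word, back up to the preceding whitespace.
	for cut > 0 && !unicode.IsSpace(runes[cut]) {
		cut--
	}
	if cut == 0 {
		// No usable whitespace before the limit (e.g. CJK text): hard cut.
		cut = max
	}
	// Drop whitespace left dangling at the end of the kept text.
	for cut > 0 && unicode.IsSpace(runes[cut-1]) {
		cut--
	}
	return string(runes[:cut]) + ellipsis
}

func main() {
	fmt.Println(truncateWords("this is a very long text", 10, " ...")) // this is a ...
	fmt.Println(truncateWords("日本語のテキスト", 3, "…"))                     // 日本語…
}
```

Working on a `[]rune` copy keeps the example short at the cost of an extra allocation; a production version would likely scan the string in place.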

moorereason · 2017-01-03T23:00:25Z

tpl/template_funcs.go

+	if err != nil {
+		return "", errors.New("text to truncate must be a string")
+	}
+	text, err = cast.ToStringE(textParam)


Use `:=` and remove `var text string` above to delay the allocation.

moorereason · 2017-01-03T23:02:46Z

tpl/template_funcs.go

+	}
+
+	var count int
+	words := strings.Fields(text)


I'd bet that something like this would be more efficient. It avoids strings.Fields, which would generate more allocations.

biilmann · 2017-01-04T01:23:03Z

Thanks @moorereason, I've added both performance optimizations.

bep

@biilmann a couple of comments/requests:

* Pull the implementation and tests (not the "smoke test") out into their own files, named `template_func_truncate.go` and `template_func_truncate_test.go`. The files they live in now have gotten a bit long.
* The text truncate variant (I have not checked the HTML code path) assumes that every character is 1 byte, which fails pretty fast.
* Hugo is Big in Japan ... and Japanese and the other CJK languages are inherently space-less. This needs test cases to confirm that it works.
* The test line coverage looks ... average. I don't care about obvious error paths, but the other conditionals should be covered (or removed if not relevant).

All in all, it looks fine. It looks neither particularly fast nor memory-efficient, but I guess a faster version would be much more complex and harder to read.
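
The byte-versus-character and space-less-language concerns above can be checked with the standard library alone. The sketch below is illustrative only; `describe` is a hypothetical helper and the sample string is an assumption, not a test case from the PR.

```go
package main

import (
	"fmt"
	"strings"
	"unicode/utf8"
)

// describe is a hypothetical helper that reports how many bytes, runes and
// whitespace-separated fields a string contains.
func describe(s string) (byteLen, runeLen, fieldCount int) {
	return len(s), utf8.RuneCountInString(s), len(strings.Fields(s))
}

func main() {
	s := "日本語のテキスト"
	b, r, f := describe(s)
	fmt.Println(b, r, f) // 24 8 1

	// Slicing by byte index can split a multi-byte character.
	fmt.Println(utf8.ValidString(s[:4])) // false
}
```

Each of the eight characters occupies three bytes in UTF-8, and with no spaces the whole sentence counts as a single "word", so both byte-based slicing and word counting via `strings.Fields` misbehave on this input.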

moorereason · 2017-01-04T21:43:12Z

@bep,
Can you point out where it assumes each character is a byte? Ranging on a string iterates the runes.
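
As a small illustration of the point about ranging: a `range` loop over a string does decode runes, but the index it yields is the byte offset of each rune, not a character count. `byteOffsets` below is a hypothetical helper written for this note, not code from the PR.

```go
package main

import "fmt"

// byteOffsets returns the index value a range loop reports for each rune
// of s, i.e. the byte offset at which that rune starts.
func byteOffsets(s string) []int {
	var offsets []int
	for i := range s {
		offsets = append(offsets, i)
	}
	return offsets
}

func main() {
	fmt.Println(byteOffsets("abc")) // [0 1 2]
	fmt.Println(byteOffsets("日本語")) // [0 3 6]
}
```

Comparing that index against a length limit expressed in characters therefore only works for single-byte text.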

Can we stop creating tpl/template_something.go already? We're in the tpl directory. Feels like the file names are stuttering. I'd rather see tpl/func_truncate.go. I want to rename most of the files in the directory, but I'm afraid I'd break everyone's PRs. 😀

bep · 2017-01-04T21:49:34Z

@moorereason it should be pretty obvious if you add those missing test cases. As to naming, I just follow a naming convention already there. Let us take that discussion somewhere else.

biilmann · 2017-01-05T08:45:44Z

Spot on with the Unicode comments. I've made some changes that fix the issues with Unicode slicing of the texts and that should handle languages with no spaces.

Take a look and let me know if you spot any other issues.

The two code paths between text and HTML truncation are much more similar now, and I'll want to refactor a bit to have just one path before this is ready to merge...

bep · 2017-01-05T09:58:12Z

Yea, this looks more like a truncate func people would want to steal for their CMS project ... I will have a closer look later.

bep · 2017-01-05T10:14:26Z

Here is a failing test case for you:

{2, template.HTML("<p>P1</p><p>P2</p>"), nil, template.HTML("<p>P1 …</p>"), false},

biilmann · 2017-01-05T19:10:39Z

Good catch - fixed that edge case and got rid of the duplication between the text and HTML truncation.

moorereason · 2017-01-05T19:40:25Z

tpl/template_func_truncate.go

+
+		if isHTML {
+			// Make sure we keep tag of HTML tags
+			slice := string(text[i:])


`text` is a string now.

bep · 2017-01-05T21:52:01Z

There is a flaw in the tag-closing logic. This test case fails:

{3, template.HTML(strings.Repeat("<p>P</p>", 20)), nil, template.HTML("<p>P</p><p>P</p><p>P …</p>"), false},
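
One common way to get tag closing right is to track open tags on a stack while scanning the kept fragment, then emit closers in reverse order. The sketch below only illustrates that idea; it is not the logic in this PR. `closeOpenTags` and `tagRe` are hypothetical names, and the regexp assumes simple, well-formed tags (void elements written without a trailing slash, such as `<br>`, would be treated as open).

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// tagRe matches simple tags such as <p>, <a href="x">, </p> or <br/>.
var tagRe = regexp.MustCompile(`<(/?)([a-zA-Z][a-zA-Z0-9]*)[^>]*?(/?)>`)

// closeOpenTags is a hypothetical helper: it appends closing tags for every
// element that is still open at the end of an already truncated fragment.
func closeOpenTags(fragment string) string {
	var open []string // stack of currently open tag names
	for _, m := range tagRe.FindAllStringSubmatch(fragment, -1) {
		closing, name, selfClosing := m[1] == "/", strings.ToLower(m[2]), m[3] == "/"
		switch {
		case selfClosing:
			// <br/> and similar tags never need a closer.
		case closing:
			// Remove the most recent matching opener, if there is one.
			for i := len(open) - 1; i >= 0; i-- {
				if open[i] == name {
					open = append(open[:i], open[i+1:]...)
					break
				}
			}
		default:
			open = append(open, name)
		}
	}
	var b strings.Builder
	b.WriteString(fragment)
	// Close the innermost element first.
	for i := len(open) - 1; i >= 0; i-- {
		b.WriteString("</" + open[i] + ">")
	}
	return b.String()
}

func main() {
	fmt.Println(closeOpenTags("<p>P</p><p>P</p><p>P …")) // <p>P</p><p>P</p><p>P …</p>
	fmt.Println(closeOpenTags("<div><em>Hi<br/>"))       // <div><em>Hi<br/></em></div>
}
```

The first call mirrors the shape of the failing case above: already closed paragraphs must not produce extra closers, and only the last open `<p>` is closed.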

moorereason · 2017-01-05T22:57:29Z

docs/content/templates/functions.md

@@ -662,6 +662,15 @@ e.g.
 * `{{slicestr "BatMan" 3}}` → "Man"
 * `{{slicestr "BatMan" 0 3}}` → "Bat"

+### truncate
+
+Truncate a text to a max length without cutting words or HTML tags in half. Since go templates are HTML aware, truncate will handle normal strings vs HTML strings intelligently.


I would say:

Truncate a text to a max length without cutting words or leaving unclosed HTML tags. Since Go templates are HTML-aware, truncate will handle normal strings vs HTML strings intelligently. It's important to note that if you have a raw string that contains HTML tags that you want treated as HTML, you will need to convert the string to HTML using the safeHTML template function before sending the value to truncate; otherwise, the HTML tags will be escaped by truncate.

`{{ "<em>Keep my HTML</em>" | safeHTML | truncate 10 }}` → `<em>Keep my …</em>`

biilmann · 2017-01-06T07:07:44Z

Rewrote the tag-closing logic and updated the docs. Also added a few more edge-case tests around tag closing.

github-actions · 2022-02-18T10:46:36Z

This pull request has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

moorereason reviewed Jan 3, 2017

biilmann force-pushed the truncate branch from f7d6bd2 to a6caf55 Compare January 4, 2017 01:22

bep reviewed Jan 4, 2017

biilmann force-pushed the truncate branch from fc4cdc2 to c1c7c4e Compare January 5, 2017 08:44

Make truncate work with unicode

ee61aab

Add test cases for some edge cases and japanese characters

biilmann force-pushed the truncate branch from c1c7c4e to ee61aab Compare January 5, 2017 08:47

Handle self closing tags

0585b61

biilmann added 2 commits January 5, 2017 10:35

Fix truncation edge case

a7ec4f3

Just 1 code branch for handling truncation

9496e95

Avoid having two separate code branches for truncating text and HTML

moorereason reviewed Jan 5, 2017

Get rid of unecessary string cast

5f30c36

moorereason reviewed Jan 5, 2017

Rewrite tag closing code for truncate

fdb12e6

bep merged commit 2989c38 into gohugoio:master Jan 6, 2017

mattstratton mentioned this pull request Feb 27, 2017

Improve use of truncate devopsdays/devopsdays-theme#351

Closed

github-actions bot locked as resolved and limited conversation to collaborators Feb 18, 2022
