Decode HTML entities in excerpts #2257

mr-vinn · 2020-09-17T20:37:17Z

Decode HTML entities in the text content before returning an excerpt. Fixes #2114.

mr-vinn · 2020-09-17T21:43:23Z

I think the other getExcerpt() functions need some attention too; I've marked this PR as a draft until I can review the other classes.

ssddanbrown · 2020-09-17T22:40:21Z

Thanks for starting a pull request. I think the HTML entities should probably be decoded before the text is stored in the DB, that way you won't have to alter every point that the text is used.

mr-vinn · 2020-09-17T23:05:34Z

I think the HTML entities should probably be decoded before the text is stored in the DB

I wasn't sure if anything else was relying on the stored content being encoded, but if you think that's safe then I'm all for it. I'll replace this commit with a change that stores decoded text in the database.

Decode HTML entities in page text before saving it to the database.

mr-vinn · 2020-09-18T14:35:42Z

The new commit saves decoded text in the database for pages. This does affect search functionality, but I think the existing behavior is a bug. Currently you have to use the HTML entity reference in the search string to get the expected results:

This commit fixes the search behavior along with the excerpt rendering. However, the PR will need a database migration to decode existing text content that's already been saved—I'll work on that next.

ssddanbrown · 2020-09-19T11:23:32Z

Hi @mr-vinn,
I agree, That current search behaviour is just due to the existing issue and therefore not correct in itself.

I'd say don't worry about decoding existing content in a migration. This issue is mainly cosmetic and I'd prefer to not have a fairly heavy operation in the migrations to just solve a cosmetic issue. I always try to minimise risk in the migrations where possible.

If a way to update all existing plain text content is really warranted I'd prefer a command to be added which a user could optionally run but I don't think it is.

mr-vinn · 2020-09-19T12:49:08Z

I'd say don't worry about decoding existing content in a migration.

That works for me; in that case this PR is ready to merge.

ssddanbrown · 2020-09-19T14:14:30Z

Thanks again @mr-vinn, Testing added to prevent regression & merged to be in the next release.

mr-vinn changed the title ~~Decode HTML entities~~ Decode HTML entities in excerpts Sep 17, 2020

mr-vinn changed the title ~~Decode HTML entities in excerpts~~ [WIP] Decode HTML entities in excerpts Sep 17, 2020

mr-vinn marked this pull request as draft September 17, 2020 21:41

Decode HTML entities

311a12b

Decode HTML entities in page text before saving it to the database.

mr-vinn force-pushed the preview-entities branch from e2c609e to 311a12b Compare September 18, 2020 14:28

mr-vinn marked this pull request as ready for review September 19, 2020 12:49

mr-vinn changed the title ~~[WIP] Decode HTML entities in excerpts~~ Decode HTML entities in excerpts Sep 19, 2020

ssddanbrown added this to the v0.30.0 milestone Sep 19, 2020

ssddanbrown merged commit 44f3508 into BookStackApp:master Sep 19, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Decode HTML entities in excerpts #2257

Decode HTML entities in excerpts #2257

mr-vinn commented Sep 17, 2020

mr-vinn commented Sep 17, 2020

ssddanbrown commented Sep 17, 2020

mr-vinn commented Sep 17, 2020

mr-vinn commented Sep 18, 2020

ssddanbrown commented Sep 19, 2020

mr-vinn commented Sep 19, 2020

ssddanbrown commented Sep 19, 2020

Decode HTML entities in excerpts #2257

Decode HTML entities in excerpts #2257

Conversation

mr-vinn commented Sep 17, 2020

mr-vinn commented Sep 17, 2020

ssddanbrown commented Sep 17, 2020

mr-vinn commented Sep 17, 2020

mr-vinn commented Sep 18, 2020

ssddanbrown commented Sep 19, 2020

mr-vinn commented Sep 19, 2020

ssddanbrown commented Sep 19, 2020