Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

--minify breaks on JSON since 59.0 #6472

Closed
regisphilibert opened this issue Nov 1, 2019 · 26 comments
Closed

--minify breaks on JSON since 59.0 #6472

regisphilibert opened this issue Nov 1, 2019 · 26 comments

Comments

@regisphilibert
Copy link
Member

Since 59.0 hugo throws errors when trying to minify JSON files.

Building sites … ERROR 2019/11/01 16:57:44 parse error:1:1: unexpected character
    1: {"output":{"data":{"created":"2010-02-11T12:46:02Z","draf...
       ^
ERROR 2019/11/01 16:57:44 parse error:1:1: unexpected character
    1: {"output":{"data":{"created":"2010-03-08T16:25:25Z","draf...
       ^

It appears the minify lib has been updated to v2.5.2 with Hugo 59.0 so wondered if that could trigger the issue... b401858

I'll try and share a repo later.

Thanks!

@bep bep added this to the v0.59.2 milestone Nov 2, 2019
@bep
Copy link
Member

bep commented Nov 2, 2019

/cc @anthonyfok

@anthonyfok
Copy link
Member

Hi Regis and Bjørn Erik,

Thank you for bringing this to my attention. Unfortunately, partly due to my unfamiliarity with the JSON output feature, or perhaps the JSON test cases that I have on hand are too simplistic, that I am unable to reproduce the error that you are seeing, so please do share a test repo when you have time. Many thanks!

bep added a commit that referenced this issue Nov 3, 2019
@bep
Copy link
Member

bep commented Nov 3, 2019

Yea, I added a quick test for this myself, and it doesn't look like "JSON is broken" is true in the general sense (which would have surprised me), so there must be something "special" about the JSON file used.

@regisphilibert
Copy link
Member Author

Will try and look into it next week. Thanks a lot.

@regisphilibert
Copy link
Member Author

here you go @anthonyfok : https://github.com/theNewDynamic/gohugo-6472

Thanks a lot for your patience.

@anthonyfok
Copy link
Member

anthonyfok commented Nov 8, 2019

  • Before: parse v2.3.5 and minify v2.3.7 (Hugo ≤ v0.58.3), no error.
  • After: parse v2.3.9 and minify 2.5.2 (Hugo ≥ v0.59.0), error: unexpected character.

The error is reported from parse in commit tdewolff/parse@7763141 (between v2.3.5 and v2.3.6): "Improve error messages for NULL and unknown values"

So, according to @regisphilibert's MWE (minimal working example) https://github.com/theNewDynamic/gohugo-6472:

Error: Error building site: failed to render pages: parse error:1:1: unexpected character
    1: <!DOCTYPE html><html><head><title>http://example.org/arti...

or fuller report after I tweaked tdewolff/parse code:

Error: Error building site: failed to render pages: parse error:1:1: unexpected character '<'
    1: <!DOCTYPE html><html><head><title>http://example.org/article/</title><link rel="canonical" href="http://example.org/article/"/><meta name="robots" content="noindex"><meta charset="utf-8" /><meta http-equiv="refresh" content="0; url=http://example.org/article/" /></head></html>

It would appear that somehow Hugo is trying to ask Minify to process HTML as if it were JSON, and before commit tdewolff/parse@7763141, such error was silently ignored by tdewolff/parse. So...

But then, it does not seem to explain the error message that @regisphilibert's originally encountered:

Building sites … ERROR 2019/11/01 16:57:44 parse error:1:1: unexpected character
    1: {"output":{"data":{"created":"2010-02-11T12:46:02Z","draf...
       ^
ERROR 2019/11/01 16:57:44 parse error:1:1: unexpected character
    1: {"output":{"data":{"created":"2010-03-08T16:25:25Z","draf...
       ^

As { the "open curly bracket" is certainly a valid starting character for a JSON file, so, calling it an "unexpected character" hints at a bug in tdewolff/parse... ?

So:

  • Hypothesis 1: Newer version of parse uncovers a previously hidden bug in Hugo
  • Hypothesis 2: A bug in the newer versions of parse and/or minify
  • Hypothesis 3: Both of the above.

@regisphilibert, I know you are very busy, but, if possible, could you please also try to produce a MWE in order to get the error you see initially, i.e. "unexpected character" on seemingly valid JSON code?

@tdewolff, please help take a look and see if you can shed some light on this. Thank you!

(Sorry, I am really slow at trying to understand the code, and I have exceeded my quota for today, hence this "progress report" and plea for help.)

@regisphilibert
Copy link
Member Author

@anthonyfok I just updated the example repo so it compiles with what you requested. Not sure what did it, but it seems to be assigning JSON on RegularPage as well as lists.

@anthonyfok
Copy link
Member

Note to self: In the example that @regisphilibert provided, Hugo apparently calls Minify() from func Transformer() in minifier/minifier.go:

    return min.Minify(m.m, ft.To(), ft.From(), params)

I was going to try to decipher what m.m (mediatype) reads, but ran out of time.

@anthonyfok
Copy link
Member

@anthonyfok I just updated the example repo so it compiles with what you requested. Not sure what did it, but it seems to be assigning JSON on RegularPage as well as lists.

Wow, that was really quick! Thank you so much @regisphilibert!

Error: Error building site: failed to render pages: parse error:1:1: unexpected character '<'
    1: {"content":"Be With by Forrest Gander wins  2019 Pulitzer Prize for PoetryNew Directions  overjoyed by  news that our poet, translator,  dear friend Forrest Gander—whom we first published  1998—has won  year\u0026rsquo;s Pulitzer Prize!Read \u0026ldquo;Epitaph\u0026rdquo; 
       ^

Hmm... strange, the ^ points at {, but the unexpected character seems to be < still. Intriguing!

That '<' I got from modifying tdewolff/parse/json/parse.go from:

    p.err = parse.NewErrorLexer("unexpected character", p.r)

to

    p.err = parse.NewErrorLexer("unexpected character"+string(c), p.r)

@tdewolff
Copy link

What I think is happening is that passing HTML is causing the error (ie. the < character), but that the error display message is displaying the "first character is wrong" on the wrong file. This hypothesis means a bug in Hugo for putting HTML through the JSON parser, and a bug in the error reporting of parse.

When the parser finds an error, it registers the i th position it occured in the file. The error recovery process then parses the file again to find the line number and column number, as well as the context (surrounding text). When the file buffer is changed in between, the error recovery process reads the i th position from the wrong file. This is probably a subtle bug in parse, but might be caused by how Hugo uses it...let me investigate better next week when I have some time.

@bep bep modified the milestones: v0.59.2, v0.60, v0.61 Nov 25, 2019
@tdewolff
Copy link

tdewolff commented Nov 27, 2019

I have fixed this problem partially in tdewolff/parse@2.3.14 (with tdewolff/minify@2.6.1). What remains is that Hugo parses a generated HTML document by a JSON parser.

This can be confirmed by adding:

start := ft.From().Bytes()[:0]
if len(ft.From().Bytes()) > 30 {
    start = ft.From().Bytes()[:30]
}
fmt.Printf("Transform: %s %s\n", mediatype, string(start))

before line 54 in hugo/minifiers/minifiers.go. It outputs:

Transform: application/json <!DOCTYPE html><html><head><ti

Stack trace:

goroutine 121 [running]:
runtime/debug.Stack(0x3b, 0x0, 0x0)
	/usr/lib/go/src/runtime/debug/stack.go:24 +0x9d
runtime/debug.PrintStack()
	/usr/lib/go/src/runtime/debug/stack.go:16 +0x22
github.com/gohugoio/hugo/minifiers.Client.Transformer.func1(0x1e5e580, 0xc00149be80, 0xc001691260, 0x115)
	/home/taco/go/src/github.com/gohugoio/hugo/minifiers/minifiers.go:60 +0x249
github.com/gohugoio/hugo/transform.(*Chain).Apply(0xc00163f298, 0x1e4d940, 0xc001ea3110, 0x1e4d920, 0xc001accfc0, 0x0, 0x0)
	/home/taco/go/src/github.com/gohugoio/hugo/transform/chain.go:105 +0x25e
github.com/gohugoio/hugo/publisher.DestinationPublisher.Publish(0x1e94f20, 0xc00057d4a0, 0x1, 0xc00057a2c0, 0x1e4d920, 0xc001accfc0, 0x1aa051e, 0x4, 0x1ac7621, 0xb, ...)
	/home/taco/go/src/github.com/gohugoio/hugo/publisher/publisher.go:100 +0x373
github.com/gohugoio/hugo/hugolib.(*Site).publishDestAlias(0xc00023b500, 0xc001471e00, 0xc000cf6580, 0x1a, 0xc001471ee0, 0x1b, 0x1aa051e, 0x4, 0x1ac7621, 0xb, ...)
	/home/taco/go/src/github.com/gohugoio/hugo/hugolib/alias.go:124 +0x3d4
github.com/gohugoio/hugo/hugolib.(*Site).writeDestAlias(...)
	/home/taco/go/src/github.com/gohugoio/hugo/hugolib/alias.go:97
github.com/gohugoio/hugo/hugolib.(*Site).renderPaginator(0xc00023b500, 0xc000ff0090, 0xc0004e4900, 0x24, 0x30, 0x13, 0xc000ff0090)
	/home/taco/go/src/github.com/gohugoio/hugo/hugolib/site_render.go:181 +0x35e
github.com/gohugoio/hugo/hugolib.pageRenderer(0xc00147cc20, 0xc00023b500, 0xc00179e120, 0xc00100b020, 0xc0015be9f0)
	/home/taco/go/src/github.com/gohugoio/hugo/hugolib/site_render.go:157 +0x675
created by github.com/gohugoio/hugo/hugolib.(*Site).renderPages
	/home/taco/go/src/github.com/gohugoio/hugo/hugolib/site_render.go:73 +0x160

Edit:
Looks like Page(/article/_index.md) with output format JSON (and thus mediatype application/json) is send through the pages channel in hugolib/site_render.go:127 while the content is clearly HTML. I'm not very familiar with the code base of Hugo, so I don't know where this channel item is coming from. @anthonyfok any ideas?

@bep bep modified the milestones: v0.61, v0.62 Nov 28, 2019
@bep bep modified the milestones: v0.62, v0.63 Dec 11, 2019
@mlake
Copy link

mlake commented Dec 14, 2019

FWIW, I upped from 0.59.1 to 0.61.0 and now --minify is producing malformed AMP HTML Pages.. will turn off until it's sorted out..

@bep bep added this to the v0.75 milestone Jul 13, 2020
@bep bep modified the milestones: v0.75, v0.76 Sep 14, 2020
@bep bep modified the milestones: v0.76, v0.77 Oct 6, 2020
@bep bep modified the milestones: v0.77, v0.78 Oct 30, 2020
@moorereason
Copy link
Contributor

moorereason commented Nov 30, 2020

I'm unable to reproduce this in v0.79.0. Anyone confirm?

Edit: I'm testing the example repo from @regisphilibert

@bep bep modified the milestones: v0.78, v0.83 Apr 23, 2021
@bep bep modified the milestones: v0.83, v0.84 May 3, 2021
@bep bep modified the milestones: v0.84, v0.85 Jun 18, 2021
@bep bep modified the milestones: v0.85, v0.86 Jul 5, 2021
@bep bep modified the milestones: v0.86, v0.87, v0.88 Jul 26, 2021
@bep bep modified the milestones: v0.88, v0.89 Sep 2, 2021
yukihane added a commit to yukihane/yukihane.github.io that referenced this issue Oct 15, 2021
yukihane added a commit to yukihane/yukihane.github.io that referenced this issue Oct 15, 2021
yukihane added a commit to yukihane/yukihane.github.io that referenced this issue Oct 15, 2021
@jmooring
Copy link
Member

@moorereason

I'm unable to reproduce this in v0.79.0. Anyone confirm?

I cannot reproduce the problem with v0.88.1. Closing.

@github-actions
Copy link

This issue has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Jan 15, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

8 participants