New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use Builder
instead of String
#18
Conversation
This is awesome @parsonsmatt — I am completely sold on this being the right way to go and am entirely concerned about not breaking anything. What are the risks as you see it? |
Well, after doing some benchmarking and profiling, I'm unconvinced this is the right approach. It is a noticeable improvement. But So I want to do a bit more benchmarking before I commit this in |
By all means! My main concern remains avoiding breakage. |
Do you mean like API break? Or breaking change in behvaior? There isn't a test suite. I suppose I could |
I mean breaking changes in behaviour and I am concerned because we have no test coverage. So I really have to rely on good old fashioned engineering judgement — yours! If you think the risks are reasonable then I am good with that. |
This reverts commit 03a4346.
@parsonsmatt do you need anything from me? (All good on my side — just checking I am not blocking anything.) |
No, I just want to ensure this is actually a performance boon on Haddock. I haven't been able to verify that it speeds anything up or that it's correct yet. |
all cool — just ping me if you need anything |
where | ||
close = if empty then " />" else ">" | ||
|
||
shownAttrs = concat [' ':showPair attr | attr <- attrs ] | ||
shownAttrs = foldMap (\attr -> charUtf8 ' ' <> showPair attr) attrs |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In my equivalence testing, the attributes are showing up in reverse order. Weird. The difference should be pretty small.
concat [ k x | x <- xs ]
is equal to concat $ map k xs
, which is concatMap k xs
, and concatMap = foldMap
when m ~ [b]
, as it is here, so the real diff is
-foldMap (\attr -> ' ' : showPair attr) attrs
+foldMap (\attr -> charUtf8 ' ' <> showPair attr) attrs
So the rendering isn't the issue.
addAttrs (html@(HtmlTag { markupAttrs = attrs }) ) | ||
= html { markupAttrs = attrs ++ attr } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So here we have attrs ++ attr
, with attr
being the new thing. So:
(anchor ! [href "asdf"]) ! [theclass "woop"]
should evaluate to:
(anchor ! [href "asdf"]) ! [theclass "asdf"]
(Html { attrs = [href "asdf"] }) ! [theclass "asdf"]
(Html { attrs = [href "asdf"] ++ [theclass "asdf"] })
Text/XHtml/Internals.hs
Outdated
case html of | ||
HtmlTag { markupAttrs = attrs, .. } -> | ||
HtmlTag | ||
{ markupAttrs = attrs . (++ attr) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But here, we have:
(anchor ! [href "asdf"]) ! [theclass "qwer"]
==>
(HtmlTag { markupAttrs = id } ! [href "asdf"]) ! [theclass "qwer"]
==>
HtmlTag { markupAttrs = id . (++ [href "asdf"]) } ! [theclass "qwer"]
==>
HtmlTag { markupAttrs = id . (++ [href "asdf"]) . (++ [theclass "qwer"]) }
When we go to render, we apply []
to this. GHCI makes that:
λ> id . (++ ["asdf"]) . (++ ["qwer"]) $ []
["qwer","asdf"]
Which is - hm, a bit surprising!
id . (++ xs) . (++ ys) $ []
-- delete id
(++ xs) . (++ ys) $ []
-- replace `.` with `$`
(++ xs) $ (++ ys) []
-- inline operator section
(++ xs) $ [] ++ ys
-- inline operator
([] ++ ys) ++ xs
The dlist
package defines fromList
as UnsafeDList . (++)
and append
as UnsafeDList $ unsafeApplyDList xs . unsafeApplyDList ys
.
fromList xs = UnsafeDList . (++) $ xs
fromList xs = UnsafeDList $ (++) xs
fromList xs = UnsafeDList (xs ++)
So, yeah, I converted attr
incorrectly here.
Verified that there's no diff for Final performance improvement: Comparing
Pretty pleased with these results. |
@cdornan I believe this is ready for merge and release |
Unfortunately, this API is unpleasant with Because many functions accept |
OK, this is really frustrating! I can't reproduce the performance findings. In fact, when I try now, the Going to close this out for now until I can figure out what's going on. |
@parsonsmatt i am so sorry! And sorry I could not be of more help and thanks for looking into this. I hope you have a good break. |
I'm thinking it may be an issue with GHC 9.4.2, actually? GHC 9.4.3 shows a big speedup. But the work codebase uses GHC 9.4.2, and I can't reproduce the performance benefits with 9.4.2. |
That could explain it. Feel free to reopen once your confidence is restored. |
Fixes #16
This PR uses
Data.ByteString.Builder
instead ofString
for HTML fragmentsAlso uses difference lists to avoid the
++
penaltyThere's no benchmark on the library, so I can't provide direct results. However, comparing this to my CPS WriterT branch on
haddock
, I'm seeing some nice improvements, particularly for such a local change:With just the CPS WriterT
Using a
Builder
variant ofxhtml
Surprisingly, overall allocations are about the same. I'd have expected the allocations to be much less, since the overhead of a
Builder
is smaller than aString
.The reduced memory in use is stable across
haddock
runs. I'm not entirely sure why that's the case, given that allocations are basically unchanged.