Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make for-the-badge letter spacing more predictable, and rewrite layout logic #5754

Merged
merged 16 commits into from
May 26, 2021

Conversation

paulmelnikow
Copy link
Member

@paulmelnikow paulmelnikow commented Oct 21, 2020

This is a rewrite of the for-the-badge layout logic. The previous logic was very difficult to understand or modify, and also tended to produce inconsistent letter spacing which was hard to match during rasterization.

The goal was to produce consistent letter spacing, such that replacing textLength with letter-spacing produces a visually identical badge. It's not 100% perfect but it's very close. I suspect it is much more consistent than before. The letter spacing is 1.25px (or in 10x terms, 12.5px).

In the case of a message-only badge, I've removed the extra rect which we used to render behind the logo.

In this rewrite I haven't gone for consistency with the other badge-generation code. Rather, I've tried to produce something which is correct and as clean as possible, and can be used as a model to refactor the others.

Compare here: https://shields-staging-pr-5754.herokuapp.com/dev/styles/

This is a follow-on to #5696

@paulmelnikow paulmelnikow added the npm-package Badge generation and badge templates label Oct 21, 2020
@shields-cd shields-cd temporarily deployed to shields-staging-pr-5754 October 21, 2020 17:36 Inactive
@calebcartwright
Copy link
Member

And the raster proxy complexity faded away...

Haven't had a chance to test this yet but excited to do so 😄

@shields-ci
Copy link

shields-ci commented Oct 21, 2020

Messages
📖 ✨ Thanks for your contribution to Shields, @paulmelnikow!

Generated by 🚫 dangerJS against 7e8663a

@paulmelnikow paulmelnikow temporarily deployed to shields-staging-pr-5754 October 21, 2020 17:40 Inactive
@@ -576,78 +576,103 @@ function forTheBadge({
links,
logo,
logoWidth,
logoPadding,
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

logoPadding is not a user-specified parameter, but a value computed here:

logoPadding: logo && label.length ? 3 : 0,

I think we probably ought to push this logic into the badge generation code as I've done here in for-the-badge.

Copy link
Member

@calebcartwright calebcartwright left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems straightforward to me (and a bit easier to grok). Left a few inline questions/comments below

badge-maker/lib/badge-renderers.js Show resolved Hide resolved
badge-maker/lib/badge-renderers.js Show resolved Hide resolved
badge-maker/lib/badge-renderers.js Outdated Show resolved Hide resolved
@paulmelnikow paulmelnikow temporarily deployed to shields-staging-pr-5754 October 22, 2020 01:48 Inactive
Copy link
Member

@calebcartwright calebcartwright left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@paulmelnikow
Copy link
Member Author

I'll leave this open for a couple days in case @chris48s wants to review.

Copy link
Member

@chris48s chris48s left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll leave this open for a couple days in case @chris48s wants to review.

"wants" is a strong word 😉
I've not really looked at the code in much detail, but there is one thing that immediately jumped out at me. The images on staging look right. I see if I can dig into it more at the weekend

Comment on lines 26 to 29
if (!logo && (logoPosition !== undefined || logoWidth !== undefined)) {
throw Error('`logoPosition` and `logoWidth` require `logo`')
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like we've been here before. I can't remember what param it was (probably a different one), but we added some "if you're specifying this param, it has to actually make sense" logic before, loads of stuff broke and we ended up rolling it back the next day. Does that ring a bell.

Copy link
Member Author

@paulmelnikow paulmelnikow Oct 22, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"wants" is a strong word 😉

😝

Well, thanks for reading anyway!

Hmm, no… it doesn't quite ring a bell. Unless it was something in the endpoint badge? I wonder if we added a test. I can do a little digging in the issues.

The reason I added this is so I could remove what appeared to be logic guarding against this inside the render functions.

I think, probably, that if there's a good reason for the server to accept params that don't make sense, that it should strip them out before invoking makeBadge().

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't remember for the life of me what it is now, but we did something vaguely sensible-souning like like

  • ?style= has to be an actual style
  • if you're going to set ?logoColor you have to set ?logo=
  • ?color= has to be an actual colour

(it might have been none of those). It seemed completely valid at the time, then we deployed it and realised loads of people had badges out there with spurious values in the URLs and had to roll it back.

Spent about an hour trawling through old issues and PRs to try and track it down - can't find it :(

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess the question is what happens when you pass such a URL to the badge server, and does it even get as far as these renderers? How about I move this change to its own draft PR and we can add some tests and investigate. As the code stands now, I don't think it is critical anymore.

Copy link
Member

@chris48s chris48s left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I compare the source of
https://shields-staging-pr-5754.herokuapp.com/badge/build-passing-brightgreen
to the source of
https://shields-staging-pr-5754.herokuapp.com/badge/build-passing-brightgreen?style=for-the-badge
The first one is compressed, but the second is rendering with whitespace. I can't quite put my finger on why as by the looks of it, they should all be getting processed by stripXmlWhitespace() in makeBadge(). Any idea why that is happening?


I've had this thought before and its not necessarily something we need to do right now in this PR as it applies equally to the other code in badge-renderers.js too, but one of the things I wanted to check in this PR was that everything that needs to be escaped is definitely escaped. I'm conscious in attempting to read this is that because we call escapeXml() in various different places its hard to trace each variable through the function. I get that's necessary because sometimes we can escape straight away, whereas sometimes we want to (for example) do the text measurements first and then escape, etc..

Sometimes we are doing something like

const [leftLink, rightLink] = links.map(escapeXml)

early on, whereas sometimes we're doing it directly in the template string right at the end.

I wonder if it would make sense to adopt a form of hungarian notation just within badge-renderers.js where we have to remember to manually call escapeXml() before its safe to use certain variables. I'm thinking of the style described in the section "The Real Solution" of this blog post https://www.joelonsoftware.com/2005/05/11/making-wrong-code-look-wrong/ (from 15 years ago!)

If we say user-supplied vars are always usVar, and we always assign them to sVar when we escape them or something (and then document that convention), this would hopefully be much more obvious?

badge-maker/lib/badge-renderers.js Outdated Show resolved Hide resolved
badge-maker/lib/badge-renderers.js Outdated Show resolved Hide resolved
badge-maker/lib/badge-renderers.js Show resolved Hide resolved
@paulmelnikow
Copy link
Member Author

If I compare the source of
https://shields-staging-pr-5754.herokuapp.com/badge/build-passing-brightgreen
to the source of
https://shields-staging-pr-5754.herokuapp.com/badge/build-passing-brightgreen?style=for-the-badge
The first one is compressed, but the second is rendering with whitespace. I can't quite put my finger on why as by the looks of it, they should all be getting processed by stripXmlWhitespace() in makeBadge(). Any idea why that is happening?

Ah! It looks like stripXmlWhitespace() removes whitespace between elements, but not whitespace between attributes.

@paulmelnikow
Copy link
Member Author

I've had this thought before and its not necessarily something we need to do right now in this PR as it applies equally to the other code in badge-renderers.js too, but one of the things I wanted to check in this PR was that everything that needs to be escaped is definitely escaped. I'm conscious in attempting to read this is that because we call escapeXml() in various different places its hard to trace each variable through the function. I get that's necessary because sometimes we can escape straight away, whereas sometimes we want to (for example) do the text measurements first and then escape, etc..

Sometimes we are doing something like

const [leftLink, rightLink] = links.map(escapeXml)

early on, whereas sometimes we're doing it directly in the template string right at the end.

I wonder if it would make sense to adopt a form of hungarian notation just within badge-renderers.js where we have to remember to manually call escapeXml() before its safe to use certain variables. I'm thinking of the style described in the section "The Real Solution" of this blog post https://www.joelonsoftware.com/2005/05/11/making-wrong-code-look-wrong/ (from 15 years ago!)

If we say user-supplied vars are always usVar, and we always assign them to sVar when we escape them or something (and then document that convention), this would hopefully be much more obvious?

Yea, I thought about this too. The approach I tried to take here was to do all the 10x conversions and the escaping as late as possible, just as the XML was being assembled. The only reason not to put it right into the template literals is when we escape the same text twice. I do think the general gnarliness of this code makes it seem more bug prone than it will as it cleans up, so I'm sort of optimistic that if we do the escaping as late as possible, and either put it right into the XML or into a variable called escapedLabel, that it'll be obvious when things are right and when they're wrong. Though I'd also be open to aliasing label, message, etc. to something like unsafeLabel on the way in.

@shields-cd shields-cd temporarily deployed to shields-staging-pr-5754 October 29, 2020 20:08 Inactive
@chris48s
Copy link
Member

It looks like stripXmlWhitespace() removes whitespace between elements, but not whitespace between attributes

Right, yep should have spotted that. Following on from your suggestion in #3637 (comment) (see point 4 in #4459 for more info) we dropped the dependency on SVGO. In exchange for removing that library, we have assumed responsibility for manually optimising our outputs. As you note, stripXmlWhitespace() is crude, so the way this is dealt with elsewhere in this module is just.. write each XML element on one line, so we can either:

  • continue to just write each XML element on one line
  • make our own to minification approach more sophisticated
  • take on another dependency in the to do it for us package i.e: in
    "dependencies": {
    "anafanafo": "2.0.0",
    "css-color-converter": "^2.0.0"
    },
  • another (slightly long and rambling) idea which I shall post shortly

@chris48s
Copy link
Member

Yea, I thought about this too. The approach I tried to take here was to do all the 10x conversions and the escaping as late as possible, just as the XML was being assembled.

OK. As I say, we don't need to do it now. If the way we're going to try and clarify it for now is to escape as late as possible (I think in this case, that probably make sense) are we able to move the escaping of leftLink and rightLink into the template strings then?

@chris48s
Copy link
Member

I'm not really sure how to reconcile this review. I kind of drafted the inline comments and then thought about it some more and now I'm actually thinking about this whole module in a different direction so on the whole this review probably comes across like its written by two different people or something. I think I'm just going to paste-dump everything I've got in the interests of moving this on and then we can try to make sense of the soup of comments.

I'd prefer to keep this refactor isolated to for-the-badge, with the idea that it can be used as a model to refactor the others.

This is going to get a bit long-winded, but if you're thinking of refactoring the whole module, I have a couple of thoughts on this:

  1. Broadly speaking, the flat, flat-square and plastic badges are very slight variations on the exact same format/logic. These three lend themselves to sharing as much implementation between them as possible. The social and FTB styles are sufficiently different that they are best thought of as just being two different bespoke things (hence the really different style those two are written). It may be that this style provides a template for social, but I suspect it may not map as well to the flat, flat-square and plastic styles (which are probably starting from a better place anyway tbh).

  2. This isn't the only problem, but I think looking back over this module, my biggest regret is probably the poor separation of concerns between calculating values and assembling XML strings. Broadly, this iteration gives us a slightly different configuration of code that has roughly the same underlying issue. Having spent a bit of time chewing this over, I reckon a really big win in this respect would be to factor out a simple renderXmlElement() helper function which deals with turning objects into strings. Something like:

function renderXmlElement({
  element, // string
  content, // string
  attrs,   // object
}) {
  let attrs_str = ''
  if (attrs) {
    attrs_str = Object.entries(attrs).map(([k,v]) => `${k}="${v}"`).join(' ')
  }
  if (content) {
    return `<${element} ${attrs_str}>${content}</${element}>`
  }
  return `<${element} ${attrs_str} />`
}

That's a rough implementation. Maybe it needs a couple more guard conditions, maybe content should actually be an array of strings.. you get the basic idea though. That would allow us to write code like:

const rect = renderXmlElement({
  element: 'rect',
  attrs: {width: 20, height: 15, fill: '#FFF'}
})

const text = renderXmlElement({
  element: 'text',
  content: 'hello'
})

const labelText = renderXmlElement({
  element: 'a',
  content: `${rect}${text}`,
  attrs: {target: '_blank', 'xlink:href': 'https://shields.io/' }
})

(intentionally simplified, but again.. you get where I'm going).

This would then abstract out all the string templating so the badge rendering functions are just calculating values and composing function calls, which might simplify things somewhat. This would actually help towards solving two of the things I've mentioned in this review: First we can write the code however we want.

renderXmlElement({
  element: 'rect',
  attrs: {
    width: 20,
    height: 15,
    fill: '#FFF'
  }
})

and

renderXmlElement({element: 'rect', attrs: {width: 20, height: 15, fill: '#FFF'}})

render the exact same output string because we have seperated presentation and logic, whereas

`<rect width="20" height="15" fill="#FFF" />`

is a different string in the rendered output from

`<rect
  width="20"
  height="15"
  fill="#FFF"
/>`

That solves the formatting problem without making stripXmlWhitespace() more complex or taking on another dependency, and we can lay our code out however we want.

I think the next place I'd go with that is to see if we can help us simplify the escaping too (but we need to avoid accidentally double-escaping stuff). This may still be a bit half-baked, but I think if we change the line where we assemble the attributes to

- attrs_str = Object.entries(attrs).map(([k,v]) => `${k}="${v}"`).join(' ')
+ attrs_str = Object.entries(attrs).map(([k,v]) => `${k}="${escapeXml(v)}"`).join(' ')

(i.e: we assume the attribute values are always unsafe), we could just completely delegate attribute escaping to renderXmlElement() and we can just pass it unsafe vars. This is also sticking with your "escape as late as possible" principle. content is more tricky because content may be an XML element rather than a value, so maybe we would need to be able to pass a boolean telling us whether the content is safe, or wrap it in a object/type to tell renderXmlElement() whether content is a string or XML element. e.g:

if (content.type == 'xml') {
  return `<${element} ${attrs_str}>${content.text}</${element}>`
}
return `<${element} ${attrs_str}>${escapeXml(content.text)}</${element}>`

I feel like if we can completely delegate calling escapeXml() into renderXmlElement(), do it in an intuitive way and say "we always make XML strings with renderXmlElement(), never manual templating" then this masively helps to make code safe by default and make wrong code look wrong without adopting a wierd naming convention.

What do you think to this as a direction, or element to incorporate in your refactoring?


Just to bring this back to the original point.. (which I've strayed from a bit). The core point of this PR is to fix a problem with text width calculation and we don't necessarily need to scope creep and block fixing that bug on a massive refactor of the whole badge-renderers.js module (especially as that bugfix is in turn a blocker to switching out the raster proxy), but seeing as that's the direction you seem to be heading in, hopefully some of that is useful. Maybe it would be worth giving a basic renderXmlElement() implementation a go just on the FTB code in this PR to solve the whitespace issue though?

@shields-cd shields-cd temporarily deployed to shields-staging-pr-5754 November 30, 2020 02:48 Inactive
@chris48s
Copy link
Member

@paulmelnikow I am planning to try and revisit this PR over the weekend and see if we can get it finished

@shields-cd shields-cd temporarily deployed to shields-staging-pr-5754 March 19, 2021 23:25 Inactive
@calebcartwright
Copy link
Member

By the way, https://squint-test.herokuapp.com is still up and configured to fetch the svgs from the review app associated with this PR so if needed you can use that to check the impact of any additional changes here on the raster proxy (note that the current set of changes have resolved all the prior issues on the raster side, recently discovered emoji issues aside)

https://squint-test.herokuapp.com/badge/no-emoji-blue

@chris48s
Copy link
Member

By the way, https://squint-test.herokuapp.com is still up and configured to fetch the svgs from the review app associated with this PR

Nice - thanks 👍

@chris48s chris48s temporarily deployed to shields-staging-pr-5754 March 20, 2021 14:26 Inactive
@chris48s chris48s temporarily deployed to shields-staging-pr-5754 March 20, 2021 14:30 Inactive
Copy link
Member

@chris48s chris48s left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I built this job up a bit too much in my mind. Once I got into it, there wasn't really as much to do here as I feared. I've stopped pulling the thread with For The Badge in this PR but I think this is going to be a good structure to apply to the other badge renderers moving forward.

/**
* Representation of an XML element
*/
class XmlElement {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This XmlElement class is the real meat of what I've done here. It is a slight evolution of Chris-from-the-past's suggestion in #5754 (comment) . The big advantage of this is it allows us to just construct an XmlElement using unescaped/unsafe data, pass it around, and then when we call .render() at the end, that will deal with all the escaping for us. It also decouples the string representation of the object from how we lay out the code, solving the whitespace issues.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm still horribly unfamiliar with most of the package code so largely deferring to others. However, did want to ask whether we'd anticipate any additional memory load from the class-based approach, particularly given what we saw recently in production with the got experiment.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will be a slightly heavier implementation than pure string concatenation, but the thing I'm chasing here is developer experience rather than performance. Making it easy for us to "fall into the pit of success" when it comes to escaping has value. Every abstraction has a cost.

I think the difference we will see here will be a lot less than the overhead associated with a full featured HTTP library but it is conjecture and this code is very much on the critical path. I think to quantify it, we need to throw a real workload at it tbh. One thing I did to is A-B it against master using the benchmark script and this seems to add somewhere around 0.02 of a millisecond to a FTB render, which I am not worried about.

One variable here is that although I would like to apply this approach to the other badge styles this PR only changes the rendering of the FTB style. I don't really know what proportion of the traffic we serve is on FTB vs other styles but anecdotally I suspect it is very small so I suspect we wouldn't really start to see a big difference until we apply the approach to the flat style. I'd be surprised if we gather much useful data just from deploying this change.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair enough. Since we've shifted over to Heroku it's much easier to be able to canary deploy these types of changes and quickly revert back out if needed, which makes a huge difference.

This is yet another example of where having the ability to do some load/stress testing against a prod-like env would be useful, so may be worth circling back to this at some point down the road once some other related items (runtime costs/sponsorship, etc.) have concluded.

Comment on lines +68 to +71
if (el instanceof XmlElement) {
return el.render()
} else {
return escapeXml(el)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This recursive approach means content can be a mixed array containing strings and other XmlElement objects and we won't double-escape when nesting elements.

${hasLabel ? renderLabelText() : ''}
${renderMessageText()}
</g>`
[backgroundGroup.render(), foregroundGroup.render()].join('')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've drawn a line here and I'm just passing strings up to renderBadge(), but I think as we work through the other renderers it will ultimately make sense to hand off XmlElement objects here and make one final grand call to .render() on a XmlElement({tag: 'svg'...}) object right at the end so the job of calling escapeXml() is 100% delegated to XmlElement.render().

const labelTextX = ((labelWidth + totalLogoWidth) / 2) * 10
const labelTextLength = (labelWidth - (24 + totalLogoWidth)) * 10
const escapedLabel = escapeXml(label)
const logoElement = new XmlElement({
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've ended up leaving the logo rendering inline as we are now making an object not a string, but we can come back to this and extract it later.

Copy link
Member

@calebcartwright calebcartwright left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still feeling somewhat lost in the depth of the conversation but figured I'd take a look at 6571fbf

Only had one minor question/comment left inline but seems to me to implement the proposal from #5754 (comment) in a readable way 👍

I don't really have much to offer around the discussions on the different approaches, and I'm admittedly pretty heavily biased towards a "anything that fixes the FTB style badges" solution so can't help there.

One thing I did find coming back to this after a few months is that the code flow is now a bit easier to follow, but I felt some cognitive complexity was added around visualizing what the end state svgs would look like compared to the approaches that leaned more heavily on the larger template literals.

I think that's just the nature of the tradeoffs with this context but figured I'd share anyway as a perspective from someone largely unfamiliar with this area of the code base

Comment on lines 60 to 64
let xmlStr = `<${this.tag}`
if (attrsStr === '') {
xmlStr += '>'
} else {
xmlStr += ` ${attrsStr}>`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

admittedly this may be another comment in the arena of micro-optimization, but would it be easier to just always insert the extra space (the else path), or if we want to avoid having a >, add an inline ternary on the end of the initial xmlStr assignment? I got a bit lost in the preceding conversation so while I understand and agree the useless space is indeed undesired, I wasn't clear whether we've already got cases of that today.

Just wondering if it'd be beneficial to be able to avoid the string concatenation since this will be on a hot path with recursion

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suspect it doesn't matter

While conceptually this is recursive, for a badge the deepest trees we create are like 5 nodes deep so.. meh

I've pushed a version with fewer template statements, fewer concatenations and zero mutable variables. I won't make any claims about performance, but having revisited it, I reckon this is at least easier to read, so that's something.

Copy link
Member Author

@paulmelnikow paulmelnikow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is looking really nice @chris48s! Thanks for picking this up! (And sorry I've been so swamped! I had a few minutes before our ops meeting and thought 👍🏼-ing this was probably the most impactful thing I could do.)

This approach of generating the nested elements using your XmlElement class leads to very readable code, and feels like a fine approach around which to consolidate the rest of the badge-generation code.

In the future we could consider other alternatives to the XmlElement approach, such as ReactDOMServer or various substitutes. (Does preact have something like that?) That could mean writing the badges in JSX, not sure if that would make them more readable or just the same. Honestly it's a very similar approach, just a different syntax.

Can't approve since I opened this but here is my 👍🏼

badge-maker/lib/xml.js Outdated Show resolved Hide resolved
Copy link
Member

@calebcartwright calebcartwright left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds like everyone is good with this so happy to give it the 👍

Love the improved readability and the dramatically better results these changes allow the raster server to produce 🙏

@chris48s chris48s merged commit cdd68fe into master May 26, 2021
@chris48s chris48s deleted the for-the-badge branch May 26, 2021 19:55
@chris48s
Copy link
Member

Thanks 👍 I'll deploy this shortly.

Next steps:

  • This unblocks the work on moving the raster server off vercel and on to heroku. Are you good to pick that up again (time permitting) now? Assuming you can remember anything about what we were doing there.. 😆
  • Once we've had this deployed for a couple of days I'll do a badge-maker release to NPM with these changes
  • As a slightly wider project we can migrate the remaining styles to use XmlElement but I don't think there's a big rush there

@calebcartwright
Copy link
Member

Are you good to pick that up again (time permitting) now? Assuming you can remember anything about what we were doing there.. laughing

Sure! There will likely be a bit of reacclimating required 😆, but the other good news is that the changes upstream here also greatly reduce the original complextity that was needed there before (to work around things here)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
npm-package Badge generation and badge templates
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants