@jared-hughes jared-hughes commented Nov 22, 2025

This PR only adds an extra entry in the code editor frontend that shows the total and selected stroke counts. The hope is that this will shape discussion moving forward.

Characters that have semantic meaning in the language count for only 1 stroke, so golfing is more straightforward than dealing with UTF-8:
image
High codepoints count for as many strokes as they have bytes, so packing is prevented:
image

See lang-allowed-strokes.tsx for the set of allowed characters for each of the languages (APL, Uiua, 05AB1E, BQN, and Vyxal).

Strokes scoring has the following properties:

  • chars ≤ strokes ≤ bytes
  • If you only use code points under 128, then bytes = strokes = chars.
  • If you only use code points under 128 and characters in the allowed-strokes set, then strokes = chars.
  • If you try to pack into high Unicode code points, then strokes > chars (so this wards against char packing).
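
The properties above follow from the scoring rule itself. Here is a minimal sketch of that rule, not the code in this PR: `allowedStrokes` is a made-up three-glyph stand-in for one language's entry in lang-allowed-strokes.tsx, and `utf8Length` and `score` are hypothetical helper names. It assumes chars are counted per code point.

```typescript
// Hypothetical stand-in for one language's allowed-strokes set.
// The real sets live in lang-allowed-strokes.tsx and are much larger.
const allowedStrokes: ReadonlySet<string> = new Set(['⍳', '⍴', '⌽']);

// Number of bytes UTF-8 uses to encode a single code point.
function utf8Length(codePoint: number): number {
    if (codePoint < 0x80) return 1;
    if (codePoint < 0x800) return 2;
    if (codePoint < 0x10000) return 3;
    return 4;
}

interface Score { chars: number; strokes: number; bytes: number }

function score(code: string, allowed: ReadonlySet<string>): Score {
    let chars = 0, strokes = 0, bytes = 0;
    // Array.from splits a string by code point, not by UTF-16 code unit.
    for (const ch of Array.from(code)) {
        const len = utf8Length(ch.codePointAt(0)!);
        chars += 1;
        bytes += len;
        // Allowed characters cost one stroke; everything else costs its UTF-8 length.
        strokes += allowed.has(ch) ? 1 : len;
    }
    return { chars, strokes, bytes };
}
```

With this sketch, `score('abc', allowedStrokes)` gives 3 chars, 3 strokes, 3 bytes; `score('⍳5', allowedStrokes)` gives 2 chars, 2 strokes, 4 bytes; and `score('𝔸', allowedStrokes)` gives 1 char, 4 strokes, 4 bytes. Since each character adds 1 to chars, between 1 and its UTF-8 length to strokes, and its UTF-8 length to bytes, chars ≤ strokes ≤ bytes holds character by character.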

A backend implementation would have more considerations like:

  • how this affects leaderboards.
  • considering how solutions are stored, both localStorage and in the database.
  • do we remove bytes and/or chars scoring in favor of strokes for these languages?

One potential concern people may have with stroke scoring is it allows more than 256 characters in some cases. In particular, 05AB1E and Vyxal both have 160 characters in their allowed-strokes set. Adding to the 127 non-null bytes under 128, this gives 287 characters that count as a single stroke.
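
To make the 287 figure concrete, here is a small sketch of the count. The set below is synthetic (160 arbitrary code points starting at U+00A1), not the real 05AB1E or Vyxal set, and `singleStrokeCount` is a hypothetical helper. It assumes none of the 160 allowed characters is itself below 128.

```typescript
// Synthetic stand-in: 160 distinct non-ASCII code points (U+00A1..U+0140).
// The real 05AB1E and Vyxal sets are different characters of the same size.
const syntheticAllowed = new Set<string>();
for (let i = 0; i < 160; i++) {
    syntheticAllowed.add(String.fromCodePoint(0xa1 + i));
}

// Count the characters that cost exactly one stroke:
// the 127 non-null code points under 128, plus every allowed character at or above 128.
function singleStrokeCount(allowed: ReadonlySet<string>): number {
    let count = 127;
    allowed.forEach((ch) => {
        if (ch.codePointAt(0)! >= 128) count += 1;
    });
    return count;
}

const total = singleStrokeCount(syntheticAllowed);
const overSingleByte = total - 256; // how far past a one-byte alphabet this goes
```

Under those assumptions `total` is 287, which is 31 more symbols than a single byte can distinguish.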

This is much less egregious than the 'chars' scoring which has 1114110 chars that count as a single char, but it still means stroke scoring can't play on even ground with other languages using byte scoring. I don't think that matters. These golf languages aren't playing on even ground no matter what, and other languages don't play on even ground with each other anyways. Handicapping with a constant factor like 2x is silly, as is handicapping them with whatever factor UTF-8 works out to on average.

Some history: comparison to SBCS and Bytes/Chars

Bytes doesn't work well for these languages that heavily use multi-byte characters because it means code length doesn't always match visual length, since some multi-byte characters are longer than others. Also, certain languages such as APL can use 1:1 packing to encode each multi-byte character with one single-byte character.

Chars doesn't work well for many languages because they allow for 2:1 or 3:1 packers where large codepoints can be used in single chars to unpack into several bytes of regular code.
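
For readers who haven't seen a packer, here is a toy 2:1 example. It is only an illustration, not a packer anyone actually uses on the site: `pack2`, `unpack2`, and `utf8Bytes` are hypothetical names, the input is assumed to be printable ASCII, and the cost of the decoder itself is ignored.

```typescript
// Pack two 7-bit ASCII characters into one code point (a toy 2:1 packer).
function pack2(source: string): string {
    let out = '';
    for (let i = 0; i < source.length; i += 2) {
        const hi = source.charCodeAt(i);
        // Pad an odd-length source with a trailing space.
        const lo = i + 1 < source.length ? source.charCodeAt(i + 1) : 0x20;
        // For printable ASCII this lands in U+1000..U+3FFF, below the surrogate range.
        out += String.fromCodePoint((hi << 7) | lo);
    }
    return out;
}

// Reverse of pack2: split each code point back into its two ASCII characters.
function unpack2(packed: string): string {
    let out = '';
    for (const ch of Array.from(packed)) {
        const cp = ch.codePointAt(0)!;
        out += String.fromCharCode(cp >> 7, cp & 0x7f);
    }
    return out;
}

// UTF-8 length of a string, summed per code point.
function utf8Bytes(text: string): number {
    let total = 0;
    for (const ch of Array.from(text)) {
        const cp = ch.codePointAt(0)!;
        total += cp < 0x80 ? 1 : cp < 0x800 ? 2 : cp < 0x10000 ? 3 : 4;
    }
    return total;
}
```

Packing the 10-character source `print(42);` yields 5 chars, so chars scoring halves. Each packed code point takes 3 UTF-8 bytes, so as long as none of them is in the language's allowed-strokes set, the packed form costs 15 strokes against 10 for the original, and packing stops paying off.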

Using an SBCS (single-byte character set) is the traditional approach to this, in part motivated by CGSE's strict scoring that everything must be scored in bytes reversible from a file (I'm not sure if this policy has loosened in recent years). For example, APL has Adám's APL SBCS, and the golflangs 05AB1E and Vyxal explicitly have their own code pages: 05AB1E's codepage and [Vyxal's codepage](https://github.com/Vyxal/Vyxal/blob/version-3/documentation/codepage.md).

However, SBCS introduces some implementation difficulties, such as what to do with characters not in the code page. Since code.golf has many Unicode-based holes (like emojify), it would be a bummer to entirely ban characters not in a code page.

APL on code.golf is a gist by ovs-code that goes into more detail about this.

language to count for only one stroke, while all other characters
count as one stroke for every UTF-8 byte in their encoding.
Reference
<a href="https://github.com/code-golf/code-golf/pull/2513">PR#2513</a>
Contributor Author

Would we rather link to an issue over a PR?

@github-actions github-actions bot added the conflicts Pull Request has conflicts. label Nov 25, 2025
@github-actions github-actions bot removed the conflicts Pull Request has conflicts. label Dec 1, 2025
@JRaspass JRaspass merged commit f3baa2a into code-golf:master Dec 1, 2025
4 checks passed
@SirBogman

This looks smart, thanks for doing this @jared-hughes.
