Naming discussion: count_{en,de,trans}codable #16

h-vetinari · 2021-07-14T22:59:21Z

First off, thanks a lot for the awesome work you're doing! 🙃

I have been loosely following @ThePhD's stream (flood?) of papers & blog posts, and was wondering about a naming discussion they had posed on twitter a while ago.

Naming is hard, though worth it - the right name substantially reduces cognitive overhead, especially when accumulated over a long period of time. Since I was intrigued by the riddle of squaring that circle, I wrote to @ThePhD with a suggestion, and they suggested opening an issue here.

One of the key points about the count_* functions that doesn't exactly jump out from the documentation (at least to me) is that the actual work of encoding/... (especially around memory allocation etc., AFAIU) is not being done. Quoth @ThePhD (in private communication):

I'm trying to avoid the participle because count_decoded sounds like "count after the work is already done", and that work expressly isn't being done. That's why I took the adjective-y version of decodable (and friends). Or, at least, that was the reasoning....! Not sure if it's good reasoning 😅
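
To make "counting without doing the work" concrete, here is a minimal Python sketch. It is not ztd.text's API: the function name `infer_encoded_size_utf8` is hypothetical (borrowed from the naming proposed in this issue), and the sketch assumes the input contains no lone surrogates. It walks the code points and adds up how many UTF-8 code units each one would need, without ever building the encoded output.

```python
def infer_encoded_size_utf8(text: str) -> int:
    """Return the number of UTF-8 code units (bytes) that encoding `text`
    would produce, without allocating or writing any encoded output.

    Hypothetical helper for illustration only; assumes no lone surrogates.
    """
    total = 0
    for ch in text:  # iterating a str yields one code point at a time
        cp = ord(ch)
        if cp < 0x80:
            total += 1  # ASCII range: one code unit
        elif cp < 0x800:
            total += 2  # two-byte sequence
        elif cp < 0x10000:
            total += 3  # three-byte sequence (rest of the BMP)
        else:
            total += 4  # supplementary planes: four-byte sequence
    return total
```

For well-formed input the result should agree with `len(text.encode("utf-8"))`, but only a running total is kept, which is the property the naming is trying to convey.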

Since I wasn't warming up to count_encodable et al. (it's not intrinsically obvious from the name whether things are being counted before or after the operation. The "-able" suffix makes it seem like it could be the former - "these code points are encodable [into code units], now count them"), I wanted to come up with something that more intrinsically reflects the absence of real work™.

My suggestion boils down to changing the verb from count to infer, freeing up the participle to rejoin the party unburdened of assumed labour. IOW:
infer_encoded_{size,length,count}

count could even make a comeback as a noun, but whether the choice is size, length or count is (from my POV) not that important.

I have no beef with the validate_* functions (whose expressiveness I applaud), so, to sum up my proposed answer to the initial riddle from the tweet (using my preferred noun size, if only due to its lack thereof, typographically speaking):

encode
decode
transcode
validate_encodable_as
validate_decodable_as
validate_transcodable_as
infer_encoded_size
infer_decoded_size
infer_transcoded_size

Thanks for reading!


ThePhD · 2021-07-20T06:05:03Z

I was hoping somebody'd chime in, even after I posted it on Twitter, but nobody's really saying much. infer_* hasn't hit me just right yet. I'm not particularly married to the current ones either, though, so I'll probably just let this one sit and hope others can chime in.

If you've got friends, let them know.

Spammed · 2021-07-22T06:48:39Z

I am not a native speaker, and do not know exactly what is meant, but:

You count things that you have.
One can infer from something that one has to something that one will have.
One can conclude democratically what one wants to have.

In this sense I would understand the difference between 'count' and 'infer'. If this were meant, 'infer' would indeed be more accurate in my humble opinion. (And another crisply short 5-letter word, of which your language is so wonderfully rich).

Spammed · 2021-07-22T18:59:12Z

another 5-letter verb is 'deduc'...

h-vetinari · 2021-07-22T21:44:41Z

I think deduce could be an option indeed, even though it has a letter more. ;)

Spammed · 2021-08-01T08:31:41Z

or predict_ ?

h-vetinari · 2021-08-01T10:12:32Z

Some more discussion on twitter: https://twitter.com/__phantomderp/status/1421710921110589443

In particular, count_as_encoded came up, which I think is pretty good too (certainly much better than count_encodable, IMO). It would also imbue the two non-fundamental variants (validate/count) with an as (though unfortunately with different meanings & positions).

"count as [if]" can be interpreted as not actually doing the work (see OP), but I still think "infer" does that better - and if "count" is an important marker (cf. STL), it could still be kept as a noun: infer_encoded_count

ThePhD · 2021-08-08T05:37:51Z

Okay. We're going to go with one of infer_x_count or count_x_as. Vote here by reacting to this reply with

👀 for infer_x_count; or,
🚀 for count_as_x

Also, twitter poll: https://twitter.com/__phantomderp/status/1424243292015841281

h-vetinari · 2021-08-08T08:16:52Z

I think you meant count_as_x here 🙃

Spammed · 2021-08-08T08:22:06Z

One argument against 'infer' on twitter was that it implies it could go wrong.
Yes, exactly! If that's true and that can happen, then I'm very much in favor of 'infer_x_count()'.

h-vetinari · 2021-08-08T08:25:24Z

Regarding the uncertainty that people seem to associate with "infer", this is IMO a recent change in meaning because "infer"/"inference" is used in many scientific fields where the answers can never be known with certainty.

The dictionary leaves not much room for failure though (there's various definitions on that site, but they all paint the same picture):

infer

(ɪnˈfɜː)
vb (when tr, may take a clause as object), -fers, -ferring or -ferred

to conclude (a state of affairs, supposition, etc) by reasoning from evidence; deduce
[definitions of transitive version omitted]

h-vetinari · 2021-08-08T08:34:36Z

🚀 for count_x_as

I think you meant count_as_x here 🙃

I think that little mix up is a good illustration of the risks of having the as in different positions (middle vs. end) in the count/validate functions.

Still, my position is actually not as decisive as it might seem - count_as_encoded is a good name too (and has other advantages).

Spammed · 2021-08-08T08:34:54Z

OK, but is what is done trivial (in the sense of 'count_as_dozen(24) -> 2')
or something more complicated (in the sense of get_screenwidth_of('Hello World!'))?

ThePhD · 2021-08-08T20:38:37Z

What is done is trivial. But, infer might help here since the operation takes an error_handler (or two) and will also account for that as it goes through and counts (for example, it will anticipate exactly the # of replacement characters).
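
As an illustration of a count that "accounts for the error handler", here is a minimal Python sketch. It is not ztd.text's API: `infer_encoded_size_ascii`, `replace_handler` and `skip_handler` are hypothetical names, and the handler protocol (a callable returning how many code units it would emit) is an assumption made purely for this example. The target encoding is ASCII so that unencodable input is easy to produce.

```python
from typing import Callable

# Hypothetical handler protocol: given the offending character, return how
# many code units the handler WOULD write if encoding were really performed.
ErrorHandler = Callable[[str], int]


def replace_handler(ch: str) -> int:
    """Would substitute a single '?' for the unencodable character."""
    return 1


def skip_handler(ch: str) -> int:
    """Would drop the unencodable character and write nothing."""
    return 0


def infer_encoded_size_ascii(text: str,
                             error_handler: ErrorHandler = replace_handler) -> int:
    """Return how many ASCII code units encoding `text` would produce,
    including whatever the error handler would emit, without encoding."""
    total = 0
    for ch in text:
        if ord(ch) < 0x80:
            total += 1                  # encodable: exactly one code unit
        else:
            total += error_handler(ch)  # ask the handler what it would add
    return total
```

With `replace_handler` the count includes one unit per replacement character; with `skip_handler` those characters contribute nothing. In both cases no output is produced, only the size it would have.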

ThePhD · 2021-08-13T19:14:36Z

This has now been implemented with the voted-on changes. Should appear in documentation in a bit.

ThePhD self-assigned this Jul 18, 2021

ThePhD added the question label Jul 18, 2021

ThePhD closed this as completed in 0b8af05 Aug 13, 2021

ThePhD added documentation enhancement thank you!! labels Aug 13, 2021
