Naming discussion: count_{en,de,trans}codable #16

Closed
h-vetinari opened this issue Jul 14, 2021 · 14 comments

@h-vetinari
Contributor

First off, thanks a lot for the awesome work you're doing! 🙃

I have been loosely following @ThePhD's stream (flood?) of papers & blog posts, and was wondering about a naming discussion they had posed on twitter a while ago.

Naming is hard, though worth it - the right name substantially reduces cognitive overhead, especially when accumulated over a long period of time. Since I was intrigued by the riddle of squaring that circle, I had written to @ThePhD with a suggestion, and they suggested opening an issue here.

One of the key points about the count_* functions that doesn't exactly jump out from the documentation (at least to me) is that the actual work of encoding/... (especially around memory allocation etc., AFAIU) is not being done. Quoth @ThePhD (in private communication):

I'm trying to avoid the participle because count_decoded sounds like "count after the work is already done", and that work expressly isn't being done. That's why I took the adjective-y version of decodable (and friends). Or, at least, that was the reasoning....! Not sure if it's good reasoning 😅

Since I wasn't warming up to count_encodable et al. (it's not intrinsically obvious from the name whether things are being counted before or after the operation; the "-able" suffix makes it seem like it could be the former: "these code points are encodable [into code units], now count them"), I wanted to come up with something that more intrinsically reflects the absence of real work™.
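
To make the "no real work" point concrete, here is a minimal sketch in Python. It is not the library's code, and `utf8_size_of`, `infer_encoded_size` and this `encode` are hypothetical names used only for illustration. The sizing function inspects each code point and adds up how many UTF-8 code units it would need, without ever allocating or writing the encoded output:

```python
def utf8_size_of(code_point: int) -> int:
    """Number of UTF-8 code units (bytes) needed for one Unicode scalar value."""
    if code_point < 0x80:
        return 1
    if code_point < 0x800:
        return 2
    if code_point < 0x10000:
        return 3
    return 4


def infer_encoded_size(text: str) -> int:
    """Hypothetical name: size the UTF-8 output without building it."""
    return sum(utf8_size_of(ord(ch)) for ch in text)


def encode(text: str) -> bytes:
    """Does the real work: allocates and fills the output buffer."""
    return text.encode("utf-8")
```

Under these assumptions, `infer_encoded_size(s)` and `len(encode(s))` agree for any well-formed string `s`, but only the second one pays for the output buffer.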

My suggestion boils down to changing the verb from count to infer, freeing up the participle to rejoin the party unburdened of assumed labour. IOW:
infer_encoded_{size,length,count}

count could even make a comeback as a noun, but whether the choice is size, length or count is (from my POV) not that important.

I have no beef with the validate_* functions (whose expressiveness I applaud), so, to sum up my proposed answer to the initial riddle from the tweet (using my preferred noun size, if only due to its lack thereof, typographically speaking):

encode
decode
transcode
validate_encodable_as
validate_decodable_as
validate_transcodable_as
infer_encoded_size
infer_decoded_size
infer_transcoded_size

Thanks for reading!

@ThePhD ThePhD self-assigned this Jul 18, 2021
@ThePhD ThePhD added the question Further information is requested label Jul 18, 2021
@ThePhD
Contributor

ThePhD commented Jul 20, 2021

I was hoping somebody'd chime in, even after I posted it on Twitter, but nobody's really saying much. infer_* hasn't hit me just right yet. I'm not particularly married to the current ones either, though, so I'll probably just let this one sit and hope others can chime in.

If you've got friends, let them know.

@Spammed

Spammed commented Jul 22, 2021

I am not a native speaker, and do not know exactly what is meant, but:

You count things that you have.
One can infer from something that one has to something that one will have.
One can conclude democratically what one wants to have.

In this sense I would understand the difference between 'count' and 'infer'. If this were meant, 'infer' would indeed be more accurate in my humble opinion. (And another crisply short 5-letter word, of which your language is so wonderfully rich).

@Spammed

Spammed commented Jul 22, 2021

another 5-letter verb is 'deduc'...

@h-vetinari
Contributor Author

I think deduce could be an option indeed, even though it has a letter more. ;)

@Spammed

Spammed commented Aug 1, 2021

or predict_ ?

@h-vetinari
Contributor Author

Some more discussion on twitter: https://twitter.com/__phantomderp/status/1421710921110589443

In particular, count_as_encoded came up, which I think is pretty good too (certainly much better than count_encodable, IMO). It would also imbue the two non-fundamental variants (validate/count) with an as (though unfortunately with different meanings & positions).

"count as [if]" can be interpreted as not actually doing the work (see OP), but I still think "infer" does that better - and if "count" is an important marker (cf. STL), it could still be kept as a noun: infer_encoded_count

@ThePhD
Contributor

ThePhD commented Aug 8, 2021

Okay. We're going to go with one of infer_x_count or count_x_as. Vote here by reacting to this reply with

👀 for infer_x_count; or,
🚀 for count_as_x

Also, twitter poll: https://twitter.com/__phantomderp/status/1424243292015841281

@h-vetinari
Contributor Author

I think you meant count_as_x here 🙃

@Spammed

Spammed commented Aug 8, 2021

One argument against 'infer' on twitter was that it implies it could go wrong.
Yes, exactly! If that's true and that can happen, then I'm very much in favor of 'infer_x_count()'.

@h-vetinari
Contributor Author

h-vetinari commented Aug 8, 2021

Regarding the uncertainty that people seem to associate with "infer", this is IMO a recent change in meaning because "infer"/"inference" is used in many scientific fields where the answers can never be known with certainty.

The dictionary does not leave much room for failure though (there are various definitions on that site, but they all paint the same picture):

infer

(ɪnˈfɜː)
vb (when tr, may take a clause as object) , -fers, -ferring or -ferred

  1. to conclude (a state of affairs, supposition, etc) by reasoning from evidence; deduce
  2. [definitions of transitive version omitted]

@h-vetinari
Contributor Author

🚀 for count_x_as

I think you meant count_as_x here 🙃

I think that little mix up is a good illustration of the risks of having the as in different positions (middle vs. end) in the count/validate functions.

Still, my position is actually not as decisive as it might seem - count_as_encoded is a good name too (and has other advantages).

@Spammed

Spammed commented Aug 8, 2021

OK, but is what is done trivial (in the sense of 'count_as_dozen(24) -> 2')
or something more complicated (in the sense of get_screenwidth_of('Hello World!'))?

@ThePhD
Contributor

ThePhD commented Aug 8, 2021

What is done is trivial. But, infer might help here since the operation takes an error_handler (or two) and will also account for that as it goes through and counts (for example, it will anticipate exactly the # of replacement characters).
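
A rough sketch of what "accounting for the error handler" could look like, written in Python rather than against the real API. Every name here (`replacement_handler`, `infer_encoded_size`, `encode`, the handler signature) is an assumption made for illustration, not the library's interface. The sizing pass runs the same handler as the encoding pass, so each ill-formed code point is counted as the replacement character it would be turned into:

```python
REPLACEMENT = 0xFFFD  # U+FFFD REPLACEMENT CHARACTER, 3 bytes in UTF-8


def is_scalar_value(cp: int) -> bool:
    """True for code points that UTF-8 can represent (no surrogates)."""
    return 0 <= cp <= 0x10FFFF and not (0xD800 <= cp <= 0xDFFF)


def utf8_size_of(cp: int) -> int:
    """Number of UTF-8 code units needed for one scalar value."""
    if cp < 0x80:
        return 1
    if cp < 0x800:
        return 2
    if cp < 0x10000:
        return 3
    return 4


def replacement_handler(bad_code_point: int) -> int:
    """Hypothetical handler: substitute U+FFFD for anything ill-formed."""
    return REPLACEMENT


def infer_encoded_size(code_points, error_handler=replacement_handler) -> int:
    """Size the output, including what the handler would insert."""
    total = 0
    for cp in code_points:
        if not is_scalar_value(cp):
            cp = error_handler(cp)  # assumed to return a valid scalar value
        total += utf8_size_of(cp)
    return total


def encode(code_points, error_handler=replacement_handler) -> bytes:
    """The real work: build the bytes, applying the same handler."""
    out = bytearray()
    for cp in code_points:
        if not is_scalar_value(cp):
            cp = error_handler(cp)
        out += chr(cp).encode("utf-8")
    return bytes(out)
```

With an input such as `[0x48, 0xD800, 0x20AC, 0x110000]`, both the lone surrogate and the out-of-range value are counted as one U+FFFD each, so the inferred size matches the length of what `encode` actually produces.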

@ThePhD ThePhD closed this as completed in 0b8af05 Aug 13, 2021
@ThePhD
Contributor

ThePhD commented Aug 13, 2021

This has now been implemented with the voted-on changes. Should appear in documentation in a bit.

@ThePhD ThePhD added documentation Improvements or additions to documentation enhancement New feature or request thank you!! We very much appreciate your time and effort! labels Aug 13, 2021
3 participants