Chapel standard module style guide (collecting input) #6698

mppf · 2017-07-14T09:53:53Z

This issue is meant to collect:

suggestions for style to follow in the Chapel standard and internal modules, especially in user-facing APIs
known API names that "stick out" for some reason or other
other known API issues

Having a style guide and adjusting the user-facing Chapel APIs to adhere to it addresses a language stability concern. There are two reasons for that:

The style guide will change rarely, once it exists. Or at least rarely go back on a decision in it.
APIs that do not match the style guide will need to change before we could claim language stability.

The style guide would simply represent some best practices for API design in the context of Chapel modules. Of course there will be situations in which there is a good reason to make an exception to it.

Please add any requests for the style guide or API issues that "stick out" in the comments on this issue.

mppf · 2017-07-14T10:02:27Z

Michael's input for the style guide

Single Namespace

Style guide should note many of these decisions are there to aid developers in working with the fact that Chapel is single-namespace language.

CamelCase

Generally, prefer CamelCase or camelCase to separating_with_underscores.

Style of Identifier Names

Modules should be CamelCase, starting with an uppercase letter
Types should be CamelCase or lowercase. User-defined types should generally start with an uppercase letter and be CamelCase (e.g. RandomStream,Timer). Internal types can be lowercase (e.g. int, domain,bigint)
** ? Alternative option? Value types start with lower case, class types start with uppercase?
Types should be CamelCase for by-reference types (classes) and camelCase or camelcase for by-value types (records).
Methods should be camelCase and start with a lowercase letter
Variables should be camelCase or CamelCase.
** Q: Should constant variables or params have a different rule?

Method Names and Part-of-Speech

One of the nice things about a well-designed method names is that it's easy to mentally translate from a method call into a sentence. For example, myNode.accept(someThing) can be translated into "myNode, please accept someThing". In order for this to work, method names that represent an action should generally be a verb.

Field names can be nouns. Some methods are sortof accessors or similar to them. In that case I don't have a problem with using a noun. I think that student.height() fits into that category.

In the case of replicand, I don't have a problem with A.replicand because it's arguably accessing a property of A. What is some evidence that it's a "property" rather than an "action"?

it's accessing something fundamentally stored in the data structure (A replicated array might be viewed as consisting of replicands...)
It returns something that makes sense as the LHS of an assignment statement

In other words, if I was writing complete English sentences in my programming, A.replicand would be a noun phrase that I would presumably then want to do something else with using a verb. A.replicand.split() for an arbitrary example, has reasonable code-to-English properties.

Module Naming vs Type Naming

When creating a module that mainly exists to define a single type, how should one choose the type name and the module name?

I think we should use the convention of making the module name a more specific/spelled out version. Examples:

class Block is in module BlockDist
class BigInt can be in module BigInteger
class Barrier can be in module TaskBarrier

I think that this is a useful and reasonable convention because the module name seems to me to be more likely to be visible as introductory material:

at the top of a file, in a 'use' statement
in the module documentation pages

And, at the same time, I would expect the module name to appear fewer times in user code than the class name. Thus it is less troublesome for the module name to be longer.

(It is true that some users might opt to write something like BlockDist.Block everywhere; but such users are opting in to verbosity and there is an obvious way to reduce it).

mppf · 2017-07-14T10:03:18Z

Michael's list of things that "stick out"

domain.member should be named something verb-y like domain.contains
array vector operations like push_back contain an underscore but should be camelCase
push_back is available on List but is not camelCase

ben-albrecht · 2017-07-14T17:28:02Z

Some patterns that might benefit from a stylistic recommendation

Should methods on an object vs. function taking an object follow a general pattern?

sort(A);
// vs.
A.sorted();

Should parentheses vs parentheses-less functions & methods follow a general pattern?

A.dims()
// vs.
A.shape

lydia-duncan · 2017-07-14T17:36:12Z

I think the method versus function question falls under general programming practices (regardless of language with objects).

My recommendation for parentheses vs. parentheses-less methods is:

is it simple?
is it practically a field, but you don't want to store the value?

nspark · 2017-07-20T18:23:14Z

I have sometimes wanted consistency between using primary or secondary methods on classes (or records, or unions). For example,

// bar is a primary method
record Quux {
  ...
  proc foo(...) {...}
}

vs

// bar is a secondary method
record Quux { ... }
proc Quux.bar(...) {...}

Stylistically, my preference is for:

Primary methods for methods on a class defined in the same module; e.g., ZMQ and Socket.close
Secondary methods for methods on a class defined in another module; e.g., Regexp and string.search

However, secondary methods on classes defined in the same module are common; e.g., IO and file.check.

buddha314 · 2017-09-05T22:19:33Z

In Java and Python, there is disambiguation on the package level, so scipy.spatial.distance.cosine differentiates from sklearn.metrics.pairwise.cosine_similarity at the distance vs pairwise level. How would that play in Chapel?

LouisJenkinsCS · 2018-06-14T18:29:53Z

@mppf

For "Single Namespace", does that mean no nested modules under a parent module? Whether or not this is a 'single' namespace or not depends on how you define it. For example...

use parentModule.childModule;

Edit:

The intention is to create something like this

mppf · 2018-06-14T18:38:45Z

@LouisJenkinsCS - single namespace really means that attention should be paid to the names of top-level modules as they need to be reasonably unique. So in your example, literally parentModule is probably not a reasonable top-level module name, and neither is Test or Project but something more project-name-y like Hypergraphs or SuperIO is good. Nested modules don't have this issue and are reasonable to use and can have less unique names.

ben-albrecht · 2018-12-06T20:29:48Z

known API names that "stick out" for some reason or other

I'd expect channel.isclosed() to be channel.isClosed()

ben-albrecht · 2019-03-11T19:52:34Z

known API names that "stick out" for some reason or other

A few others:

subprocess.exit_status
subprocess.send_signal
Atomic memory_order_*
Several cases in DateTime
- e.g. DateTime.fromtimestamp
Lots of cases in Math
- e.g. half_pi
Most of Random.PCGRandomLib
Lots of cases in Buffers
- e.g., buffer_range

Many modules that exist as a thin wrapper around some C library will use naming schemes from the original C library, e.g. FFTW or Curl. It seems reasonable to leave these as is.

bradcray · 2019-05-16T23:51:47Z

single namespace really means that attention should be paid to the names of top-level modules as they need to be reasonably unique.

By this definition, what's an example of a popular non-single-namespace language? I worry (as I think Louis did above) whenever someone uses this term out loud that what everyone hears (myself included) is that all Chapel identifiers are thrown into a single namespace with no hierarchy, scoping, etc. What I think the term has traditionally been intended to be used for w.r.t. Chapel is to say that we don't treat identifiers of one kind differently than identifiers of another kind (i.e., there isn't a collection of function names distinct from the collection of variable names).

mppf · 2019-05-17T12:43:15Z

By this definition, what's an example of a popular non-single-namespace language?

Certain interpreted languages manage "imports" by file path. In these languages, you can have multiple "top level modules" with the same name as long as you point to one or the other with the import. I'm not sure I have a great example at the ready but one example is that in Python you can manipulate the Python path just before (and also maybe after) doing an import.

Other examples (for interpreted languages) of reading code from a file:

source in bash
load in Ruby

What I think the term has traditionally been intended to be used for w.r.t. Chapel is to say that we don't treat identifiers of one kind differently than identifiers of another kind (i.e., there isn't a collection of function names distinct from the collection of variable names).

I agree that this is the proper definition of "single namespace". Originally, I used the term because we need a different identifier for a module name, a class, and a function. Anyway, this is a good clarification in this discussion.

mppf · 2019-05-23T13:55:34Z

Regarding the names of types:

Earlier I said this:

Types should be CamelCase or lowercase. User-defined types should generally start with an uppercase letter and be CamelCase (e.g. RandomStream,Timer). Internal types can be lowercase (e.g. int, domain,bigint)
** ? Alternative option? Value types start with lower case, class types start with uppercase?

@bradcray responded in #13030 (comment) :

In #6698, I think Michael suggested that a potential deciding point might be that value-based data types could be camelCase and reference-/class-based data types could be PascalCase, and I think that is an intriguing notion that has resonated pretty well with me over the past 12 hours or so.

I find the "User-defined" vs "Internal" types to be rather confusing (RandomStream and Timer come with the compiler, so aren't they thus "internal"?). I would prefer a rule that would allow Chapel programmers to emulate the style they see in the standard library, since that's probably what they will do in any case. So, if we choose "internal types are lower case" then I'd prefer that to apply only to a small number language-specified types that the compiler knows about (e.g. int, domain). No matter what, if we choose the "internal types are lower case" naming convention, we will need to be clearer about what counts as an "internal type".

I'd also consider having a rule that all types are UpperCamelCase, including Int, but this would involve changing a lot of code. But at least there wouldn't be a confusing distinction there.

Today we have domain, bigint, codepointIndex, string, Error, Timer, RandomStream, LinkedList, Barrier. The UpperCase ones include a mix of classes and records. And it's hard to argue that Error is not an "internal type". So it seems that breaking changes are ahead of us if we seek a consistent style.

BryantLam · 2019-05-24T04:30:45Z

@bradc: By this definition, what's an example of a popular non-single-namespace language? ... What I think the term has traditionally been intended to be used for w.r.t. Chapel is to say that we don't treat identifiers of one kind differently than identifiers of another kind (i.e., there isn't a collection of function names distinct from the collection of variable names).

Rust actually makes a distinction between identifiers on modules and structs, etc. and identifiers on variables and functions because the former identifiers are unambiguously referenced with :: separators.

mod thing {
    pub fn doit() {
        println!("fn thing::doit()");
    }
}

fn thing() {
    println!("fn thing()");
}

// This global variable, however, does not work; `thing` is already used by `fn thing()`.
// static thing: &str = "variable";

fn main() {
    thing();
    thing::doit();
}

This code behaves as expected.

BryantLam · 2019-05-24T04:53:17Z

One source of (minor) confusion that I have coming from a C/C++ background is the specification of a function's arguments versus the cast operator. proc is used to denote a function definition, but C/C++ doesn't have a keyword for function.

Just a very minor confusion, but it does affect my ability to quickly scan through a piece of code.

proc doSomeIoOperation(x: string) { ... }
//                      ^------- type specification

proc main() {
    var x: string = readFromFile();
//       ^------- type specification

    doSomeIoOperation(x: string);
//                     ^------- cast operator

    proc doOtherIoOperation(x: string) { ... }
//  ^--- denotes procedure   ^------- type specification
}

Whenever I do a cast, I've taken to the habit of binding to the RHS side instead.

var a = 42;
var b = a :real;
writeln(b.type :string);

bradcray · 2019-05-24T16:04:36Z

Building on what Michael said above and (I think?) agreeing with him:

Regarding the names of types:

I definitely don't like the idea of making all types InitialCaps and don't see much value in it other than "it's a simple rule". I'd be more likely to support making all types initialLower, but wouldn't consider myself a champion of that approach.
I agree that there's a bit of a fuzzy line between internal vs. non-internal types that makes it difficult to distinguish between them. I also do't think that it's that useful a distinction... if I create a really kick-ass type in a library that ends up becoming almost as if it's part of the language, what about it suggests that a user would want it to be syntactically distinguished?
After a few days thinking about it now, I continue to really like the "value types use initialLower and class types use InitialUpper" convention. It jibes with my instincts for what I'd want to call things like list and timer vs BaseArr and C for example, and provides a useful cue as to whether the type can be subclassed.

So if I ran the world, I'd say "records use initialLower and classes use InitialUpper," and then fall back to "all types use initialLower" if that wasn't palatable. I'd be curious if anyone strongly disagreed with this stance.

gbtitus · 2019-05-29T15:30:07Z

So Greg, I think you're saying you're not in favor of having a style convention for the standard modules at all, is that right?

Definitely not. My objection is specific to the proposal that value types be lowerCamelCase and class types UpperCamelCase. I am fine with, for example, a convention that type names are lowerCamelCase and module names are UpperCamelCase, because there the minor lexical difference is augmented by the syntactic cues due to usage. I would even be fine with a convention that type names were lowerCamelCase but class type names had a _class suffix, because such a large lexical difference seems reasonably proportionate to the semantic difference. Note that this latter is merely an example which I hope helps others understand my position, definitely not a proposal for consideration.

Follow-up: What if the convention for standard libraries was checked by the compiler?

👍 Yes, for sure if we have a convention we should have a checker for it.

bradcray · 2019-05-29T16:23:26Z

because there the minor lexical difference is augmented by the syntactic cues due to usage.

This seems like splitting hairs to me. In some cases (particularly use M, there are syntactic cues, but in others, not. Is M.foo() a fully-qualified reference to a module-scope symbol (e.g., function or array) within M, a type method being called on a type named M, or a normal method being called on a variable named M?

if we have a convention we should have a checker for it.

If we have a checker for the convention, then does that eliminate the chances of willful or inadvertent surprise and put us back onto stable ground if we were to adopt a class vs. record naming convention?

gbtitus · 2019-05-29T17:11:13Z

If we [check the convention], then does that eliminate the chances of willful or inadvertent surprise ...

Sure, that sounds reasonable.

dmk42 · 2019-05-30T18:03:17Z

Michael reminded me to kick in my 2¢.

To me, type inference makes a language easy to write but difficult to read. It takes me a long time to trace back through the code to find out what something is. Visual cues help to reduce that problem. For that reason, I prefer that classes begin with an upper case letter unless they are internal-enough types like string, and most other things begin with a lower case letter.
[Edit] Yup, I know string isn't a class.

It is probably well known that to me, camel case is an abomination.

mppf · 2019-06-04T15:03:00Z

It seems the idea of records/by value types use lowerCamelCase and classes/by reference types use UpperCamelCase has enough support provided that there is compiler enforcement of it within the standard/internal modules.

BryantLam · 2019-06-05T19:16:17Z

It seems the idea of records/by value types use lowerCamelCase and classes/by reference types use UpperCamelCase has enough support provided that there is compiler enforcement of it within the standard/internal modules.

I'm for differentiating cases that should be differentiated, but I imagine this might be a (mild) concern when users begin to refactor codes.

What's a generic type t? Is it t or T? There is no by-valueness if it can accept both classes and record.
What if you wrote your code originally using by-value semantics and then later decided to turn it into classes? My own opinion is that a production library API would not rely on the record/class semantics for copies and would just implement its own clone() routine and possibly go so far as to turn the default copy operators into compiler errors. I would prefer if there was a way for library developers to always get the behavior they intend, but am concerned that it would be too inflexible to major code refactors.

As such, I think I'd advocate for types to always be UpperCamelCase. ... (Which conflicts with module naming conventions, but that's a different issue.) But this is just my opinion which could be wrong for large-scale production codes.

Some of these naming-convention distinctions would be resolved by a decent IDE.

e-kayrakli · 2019-12-06T22:25:45Z

There were some discussions regarding factory functions that should be part of the style guide.

Main discussion is under #14291
There is a related discussion for the Random module in general: #14589

ben-albrecht · 2020-09-22T15:04:41Z

Should there be naming conventions for in-place vs. out-of-place operations?

The proposal to deprecate Random.permutation() (noun) for Random.permute() (verb) in #16400 made me wonder about this.

mppf · 2021-03-23T14:05:56Z

These documents have some interesting discussion of part-of-speech naming (noun vs verb etc):
https://developer.apple.com/library/archive/documentation/Cocoa/Conceptual/CodingGuidelines/Articles/NamingMethods.html
https://swift.org/documentation/api-design-guidelines/

@bradcray

Add a draft for a standard module style guide This PR adds a draft standard module style guide as an aid for the effort of stabilizing the standard modules. Based in large part upon #6698. Discussed with lots of people but final review is by @bradcray - thanks!

mppf · 2021-05-22T11:27:34Z

See also Property queries: paren-less or paren-ful? #17128

lydia-duncan · 2021-06-22T16:36:35Z

Over on the review of #17936, Andy asked if we had a specific policy on the capitalization of initialisms and acronyms:

Should all letters in an initialism\acronym be capitalized or just the first? So if you had a snakecase function: render_html_dom(), should it be camelized as:

renderHtmlDom() or renderHTMLDOM().

Even when you don't have to initialisms back-to-back I still prefer the "only capitalize the first letter style". So renderHtmlPage() over renderHTMLPage()

bradcray · 2021-06-22T17:17:10Z

I always wrestle with this question, though the variant I wrestle with is if I choose the renderHTML... style, I'm sometimes tempted to make the letter that follows lowercase (so, renderHTMLpage()). I realize that's not particularly defensible. But it generally makes me prefer the renderHtmlPage() style because it avoids the question.

That said, I feel like I'd want to see more cases in practice to decide whether I'd feel different in different situations (i.e., whether I think this should be a hard and fast rule for standard modules, or whether we'd want to let authors choose based on taste).

mppf · 2021-06-23T18:56:46Z

One way would be for the acronym to be uppercase unless another word follows it; so renderHTML but renderHtmlDom. In any case, I wouldn't want to see renderHTMLDOM.

lydia-duncan · 2022-02-18T23:45:49Z

Over in #19012 (comment) Damian brought up a potential rule for operator-like names - that they be treated as a single word for the purpose of camelCasing. So for instance, divceil or addmul as opposed to divCeil or addMul. Thoughts?

lydia-duncan · 2022-07-21T18:29:54Z

Over on #14291, we solidified our discussion on factory functions to the point where it can be included:

Factory functions should use create as their prefix
Generally, they have been standalone functions. We are leaving the possibility of type methods open, but don't have many instances of that being the right choice in practice
There are some factory functions that are allowed to differ. open and spawn are the main examples. We do not view this as contradicting the general policy and may accept new exceptions on a case-by-case basis.

bradcray · 2022-07-21T19:32:20Z

of that being the right choice in practice

I might phrase this instead as "being sufficiently better/clearer in practice to garner a lot of support." That said, both David's seem to be more on the "prefer type method" side of the equation, so we may want to make sure we're not squashing voices by having louder ones assert something that a silent majority doesn't agree with.

There are some factory functions that are allowed to differ. open and spawn are the main examples

I wouldn't even call these factory functions. I'd describe this caset as "Some routines exist to do things and the result of doing that thing happens to create a new object." For example, I wouldn't say that numBits(type t) is a factory function for ints or that gethostname() is a factory function for strings, they're just routines that happens to generate an int or string as their result. So similarly, I don't think of open as being a factory function for files so much as a way to request that a file on the FS be opened, which gives you back a file object representing the result (where I'm stealing that idea from Michael who expressed it the last time we discussed this).

mppf added area: Libraries / Modules stat: Work in Progress type: Feature Request labels Jul 14, 2017

nspark mentioned this issue Jul 28, 2017

Adjusting error handling policy to avoid implicit halt #6566

Closed

ben-albrecht mentioned this issue Sep 5, 2017

SvmTrain pgrew/mbb#10

Merged

buddha314 mentioned this issue Sep 20, 2017

Construct a public Chapel Style Guide #7417

Open

mppf mentioned this issue Mar 1, 2018

array.pop_back() does not return its element #8601

Closed

This was referenced Oct 23, 2018

Establish how modules are adopted in standard library #10712

Open

rename string.isEmptyString()? #11437

Closed

ben-albrecht mentioned this issue Mar 4, 2019

Deprecate List module, introduce LinkedLists #12454

Merged

2 tasks

ben-albrecht mentioned this issue Mar 13, 2019

Rename module BlockCycDist to BlockCyclicDist #12590

Open

This was referenced Mar 26, 2019

Rename module LinkedLists to LinkedList or List #12711

Open

Refactor stable modules into std #12712

Open

bradcray mentioned this issue May 16, 2019

First crack at new List API, intended to replace "array-as-vec" operations on arrays #13030

Closed

bradcray mentioned this issue Jun 7, 2019

Local Module Search Paths #12923

Closed

mppf mentioned this issue Aug 23, 2019

Calling functions in a module without any use #13523

Closed

ben-albrecht mentioned this issue Oct 10, 2019

Add approximate equality function #13898

Closed

mppf mentioned this issue Oct 17, 2019

Standard library factory functions are not consistent #14291

Closed

e-kayrakli mentioned this issue Dec 13, 2019

IO symbols don't follow established conventions #14646

Closed

e-kayrakli mentioned this issue Jul 31, 2020

Add module OrderedSet #16124

Merged

bradcray mentioned this issue Aug 11, 2020

Underscore nomenclature #16219

Closed

ben-albrecht mentioned this issue Sep 22, 2020

permutation method in Random module overwrites the input array with permuted indices of the domain #16400

Closed

mppf mentioned this issue May 11, 2021

Add a draft for a standard module style guide #17675

Merged

This was referenced Jan 14, 2022

Math module - rename INFINITY, NAN #19016

Closed

Math module - rename isnan #19019

Closed

daviditen mentioned this issue Jul 18, 2022

Naming convention for names containing acronyms in standard modules #20240

Closed

lydia-duncan mentioned this issue Jul 6, 2023

Math module - potential renaming for params #19000

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Chapel standard module style guide (collecting input) #6698

Chapel standard module style guide (collecting input) #6698

mppf commented Jul 14, 2017

mppf commented Jul 14, 2017 •

edited

Loading

mppf commented Jul 14, 2017 •

edited

Loading

ben-albrecht commented Jul 14, 2017

lydia-duncan commented Jul 14, 2017

nspark commented Jul 20, 2017

buddha314 commented Sep 5, 2017

LouisJenkinsCS commented Jun 14, 2018 •

edited

Loading

mppf commented Jun 14, 2018

ben-albrecht commented Dec 6, 2018

ben-albrecht commented Mar 11, 2019 •

edited

Loading

bradcray commented May 16, 2019

mppf commented May 17, 2019

mppf commented May 23, 2019

BryantLam commented May 24, 2019 •

edited

Loading

BryantLam commented May 24, 2019

bradcray commented May 24, 2019

gbtitus commented May 29, 2019

bradcray commented May 29, 2019

gbtitus commented May 29, 2019

dmk42 commented May 30, 2019 •

edited

Loading

mppf commented Jun 4, 2019

BryantLam commented Jun 5, 2019 •

edited

Loading

e-kayrakli commented Dec 6, 2019

ben-albrecht commented Sep 22, 2020 •

edited

Loading

mppf commented Mar 23, 2021

mppf commented May 22, 2021

lydia-duncan commented Jun 22, 2021

bradcray commented Jun 22, 2021

mppf commented Jun 23, 2021

lydia-duncan commented Feb 18, 2022

lydia-duncan commented Jul 21, 2022

bradcray commented Jul 21, 2022

Chapel standard module style guide (collecting input) #6698

Chapel standard module style guide (collecting input) #6698

Comments

mppf commented Jul 14, 2017

mppf commented Jul 14, 2017 • edited Loading

Michael's input for the style guide

Single Namespace

CamelCase

Style of Identifier Names

Method Names and Part-of-Speech

Module Naming vs Type Naming

mppf commented Jul 14, 2017 • edited Loading

Michael's list of things that "stick out"

ben-albrecht commented Jul 14, 2017

Some patterns that might benefit from a stylistic recommendation

lydia-duncan commented Jul 14, 2017

nspark commented Jul 20, 2017

buddha314 commented Sep 5, 2017

LouisJenkinsCS commented Jun 14, 2018 • edited Loading

mppf commented Jun 14, 2018

ben-albrecht commented Dec 6, 2018

ben-albrecht commented Mar 11, 2019 • edited Loading

bradcray commented May 16, 2019

mppf commented May 17, 2019

mppf commented May 23, 2019

BryantLam commented May 24, 2019 • edited Loading

BryantLam commented May 24, 2019

bradcray commented May 24, 2019

gbtitus commented May 29, 2019

bradcray commented May 29, 2019

gbtitus commented May 29, 2019

dmk42 commented May 30, 2019 • edited Loading

mppf commented Jun 4, 2019

BryantLam commented Jun 5, 2019 • edited Loading

e-kayrakli commented Dec 6, 2019

ben-albrecht commented Sep 22, 2020 • edited Loading

mppf commented Mar 23, 2021

mppf commented May 22, 2021

lydia-duncan commented Jun 22, 2021

bradcray commented Jun 22, 2021

mppf commented Jun 23, 2021

lydia-duncan commented Feb 18, 2022

lydia-duncan commented Jul 21, 2022

bradcray commented Jul 21, 2022

mppf commented Jul 14, 2017 •

edited

Loading

mppf commented Jul 14, 2017 •

edited

Loading

LouisJenkinsCS commented Jun 14, 2018 •

edited

Loading

ben-albrecht commented Mar 11, 2019 •

edited

Loading

BryantLam commented May 24, 2019 •

edited

Loading

dmk42 commented May 30, 2019 •

edited

Loading

BryantLam commented Jun 5, 2019 •

edited

Loading

ben-albrecht commented Sep 22, 2020 •

edited

Loading