This issue was moved to a discussion.

[SUGGESTION] Typed Expressions; generalized constructors, UDLs, unnamed variables and functions #463

Closed
msadeqhe opened this issue May 21, 2023 · 54 comments

Comments

@msadeqhe

msadeqhe commented May 21, 2023

This is an alternative solution to this issue, for those who don't like the ...TYPE notation for object construction because of its UDL syntax. In this suggestion, object construction would use the familiar notation EXPR:TYPE (aka Typed Expression), which is similar to how Cpp2 programmers already write declarations.

Consider EXPR:TYPE as syntactic sugar to (: TYPE = EXPR). For example:

Abc: type;
Xyz: type;
func: (u: int) -> Abc;

a: int = 2;
b: = 2:int;       // (: int = 2)
c: = a:Abc;       // (: Abc = a)
d: = ():Abc;      // Default Constructor
e: = (a + b):Abc; // (: Abc = a + b)
r: = func(a):Abc; // (: Abc = func(a))
s: = func(2):Abc.start():Xyz.value; // Function Chaining
t: = a++:Abc;     // (: Abc = a++)
u: = a:Abc++;     // (: Abc = a)++
x: = 2:int:Abc;   // (: Abc = (: int = 2)) Constructor Chaining
y: = (2:int + 4:int):Abc; // (: Abc = (: int = 2) + (: int = 4))
z: = ("text", 2:int):Abc; // (: Abc = ("text", (: int = 2)))

Specifically, if x:int appears at the start of a statement or as a function parameter, it is a declaration; otherwise it is a Typed Expression.

Abc: type;

// `x: Abc` is a parameter declaration.
func: (x: Abc) = {}

// `a: int` is a declaration.
a: int = 2;

// `a:Abc` is a Typed Expression, it calls the constructor of `Abc`.
m: = a:Abc;
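One way to read the proposed desugaring is through a small Cpp1 model. The sketch below is neither Cpp2 nor part of cppfront: `typed` is a hypothetical helper standing in for `(args):T`, and `Abc` is given made-up constructors so that the lines can be checked at compile time.

```cpp
#include <utility>

// Hypothetical type with made-up constructors, only for illustration.
struct Abc {
    int value;
    constexpr Abc() : value(0) {}
    constexpr explicit Abc(int v) : value(v) {}
};

// Hypothetical helper: models `(args):T` as direct construction of T from args.
template <class T, class... Args>
constexpr T typed(Args&&... args) {
    return T(std::forward<Args>(args)...);
}

constexpr int a = 2;
constexpr auto b = typed<int>(2);              // b: = 2:int;
constexpr auto c = typed<Abc>(a);              // c: = a:Abc;
constexpr auto d = typed<Abc>();               // d: = ():Abc;
constexpr auto e = typed<Abc>(a + b);          // e: = (a + b):Abc;
constexpr auto x = typed<Abc>(typed<int>(2));  // x: = 2:int:Abc;

static_assert(e.value == 4, "(a + b):Abc constructs Abc from 4");
```

In this reading a typed expression is only a different spelling of a constructor call, so chaining such as `2:int:Abc` is ordinary nesting.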

EXPR:TYPE is similar to ...TYPE suggestion, except with the following advantages:

  • It's familiar and similar to declaration syntax in which types are specified in the language.
  • It's easier to parse, but ...TYPE would complicate the grammar especially for working within function chaining.
  • It doesn't need parentheses for simple expressions with unary postfix operators (e.g. a++:Abc). It depends on operator precedence, and it's left to right.

And this notation has the following disadvantages:

  • It requires an extra :. BTW it's opinion based.

To clarify: the : within SOMETHING:TYPE is for object construction (as an expression) or declaration (as a statement), whereas the :: within SOMETHING::TYPE is the scope resolution operator for qualified names. They can be combined, as in 10++ : my::Type. Also, after the object is constructed, we can use operator dot, operator(), operator[] and so on to access its members, e.g. 10:Type.call() or 10:Type[0].

Will your feature suggestion eliminate X% of security vulnerabilities of a given kind in current C++ code?

No.

Will your feature suggestion automate or eliminate X% of current C++ guidance literature?

Yes.

  1. It unifies constructors with UDLs. They are semantically the same. Both of them create a new object.
    1. It's useful in generic programming.
    2. It reduces concept count.
      • Novice programmers don't need to learn a distinct concept about UDLs.
      • All types benefit from UDL like syntax. It's not needed to declare UDL for them.
      • It eliminates the need of understanding and learning built-in prefixes and suffixes for literals.
    3. The syntax of calling constructors will be expressive and readable.
  2. It distinguishes constructors from regular function calls. They are semantically different.
    • Constructors:
      • EXPR:TYPE, parentheses are not necessary when EXPR has operators with higher precedence.
      • ():TYPE, it calls the default constructor
      • (args...):TYPE
    • Regular Function Calls:
      • FUNCTION(), it calls a function without arguments
      • FUNCTION(args...)
      • obj.FUNCTION()
      • obj.FUNCTION(args...)
  3. They can be chained together, whereas it's not possible with UDLs in Cpp1.
    • Only one UDL can be applied to a literal in Cpp1.
  4. Constructors already can be templated, but UDLs cannot be templated.
    • UDL templates are not supported in Cpp1.
  5. It removes built-in literal prefixes and suffixes. They are inconsistent and redundant.
    1. They are visually inconsistent.
      • Some of them are prefix.
      • Some of them are suffix.
    2. Their behaviours are inconsistent when the constant of literal exceeds the type as described in this comment.
  6. The name to construct a literal and to declare a variable will be consistently the same.
    • It's not needed to declare a new name for literal suffixes.
    • The name of types are like a suffix that will construct an object.
  7. They can be applied to literals with qualified name (if they are within namespaces) unlike UDLs which need using statement before they can be applied to literals.
    • That's why UDLs in Cpp1 have to be prefixed with _, thus they will be distinguished from UDLs which are declared in the Cpp1 standard library.
  8. Unlike TYPE(args) it doesn't work with UFCS intentionally. UFCS should not work on constructors as described in this comment. Compare:
    // `TYPE(args)` with UFCS on it.
    x: = 10.Type(10, 20);
    
    // `(args):TYPE`
    y: = (10, 10, 20):Type;
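Point 7 can be illustrated in today's C++. This is a minimal sketch with hypothetical names (`lib`, `Metres`, `_m`): the literal form of a UDL needs a using-directive before it can be written, while a constructor call can use the qualified type name directly, which is what a typed expression such as `5:lib::Metres` would rely on.

```cpp
namespace lib {
    // Hypothetical quantity type, only for illustration.
    struct Metres {
        unsigned long long value;
        constexpr explicit Metres(unsigned long long v) : value(v) {}
    };
    constexpr bool operator==(Metres a, Metres b) { return a.value == b.value; }

    namespace literals {
        // Cpp1 UDL: user-defined suffixes must start with `_`.
        constexpr Metres operator""_m(unsigned long long v) { return Metres(v); }
    }
}

// The literal form cannot be namespace-qualified, so the suffix
// has to be brought into scope before `5_m` can be written.
constexpr lib::Metres via_udl() {
    using namespace lib::literals;
    return 5_m;
}

// A constructor call can simply name the qualified type.
constexpr lib::Metres via_ctor() { return lib::Metres(5); }

static_assert(via_udl() == via_ctor(), "both spell the same object");
```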

Describe alternatives you've considered.

These are alternative solutions:

Thanks.

@msadeqhe
Author

msadeqhe commented May 21, 2023

EXPR:TYPE is similar to ...TYPE suggestion, except with the following advantages:

  • It's familiar and similar to declaration syntax in which types are specified in the language.
  • It's easier to parse, but ...TYPE would complicate the grammar especially for working within function chaining.
  • It doesn't need parentheses for simple expressions with unary postfix operators (e.g. a++:Abc). It depends on operator precedence.

And this notation has the following disadvantages:

  • It requires an extra :. BTW it's opinion based.

Also, this suggestion won't add any new syntax to the language: it reuses the existing SOMETHING : TYPE syntax, but in expressions. Currently in Cpp2 we use the : TYPE = ... expression to create unnamed variables, and the two won't conflict, because unnamed variables have nothing on the left side of : and have an extra =.

@msadeqhe
Author

msadeqhe commented May 21, 2023

In a nutshell:

  • ID : TYPE would be a declaration if it's at the start of a statement or function parameter.
    • We use ID : TYPE to specify the type of a declaration.
  • EXPR : TYPE would be a typed expression if it's not at the start of a statement or function parameter.
    • We use EXPR : TYPE to specify the type of an expression.
    • EXPR : TYPE calls the constructor of TYPE with the result of EXPR as the argument.
    • (ARGS) : TYPE calls the constructor with ARGS.
    • () : TYPE calls the default constructor.
  • : TYPE = SOMETHING would be an unnamed variable.

@AbhinavK00

AbhinavK00 commented May 21, 2023 via email

@msadeqhe
Author

msadeqhe commented May 21, 2023

Thanks. Yes, it would be context-free, because the behaviour of SOMETHING : TYPE depends only on its placement:

// It's a declaration.
something: Type;

// It's a typed expression.
call(something: Type);

// It's a parameter declaration.
call: (something: Type) = {}

@msadeqhe
Author

msadeqhe commented May 21, 2023

Briefly:

// `A` is a declaration.
A: Type = 0;

// `B` is a declaration.
// `X` is a typed expression.
B: Type = X: Type;

@JohelEGP
Contributor

A typed expression can subsume a UDL better than UFCS.

Let's consider the type std::chrono::year:

date := 1970y/January/1; // UDL after `using namespace std::chrono_literals;`
date := 1970.y()/January/1; // UFCS after `using namespace ufcs_literals;` (see next code block).
date := 1970:y/January/1; // Direct construction after `using namespace typed_expr_literals;` (see next code block).
date := 1970:year/January/1; // This suggestion after `using std::chrono::year;`
date := 1970:std::chrono::year/January/1; // Ugly, but possible.

As you can glean from the comments,
a typed expression uses direct construction,
whereas UFCS implies an extra object in-between, the function parameter.

Just switch from having a _literals namespace with Cpp1 UDLs to type aliases.

typed_expr_literals: namespace = {
y: type == std::chrono::year; // Direct construction.
}
ufcs_literals: namespace = {
y: (int i) -> _ = std::chrono::year(i); // Indirect construction through `i`.
}
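The difference between the two namespaces above can be modelled in Cpp1. This is a sketch under assumptions: `Year` is a hypothetical stand-in for std::chrono::year (to avoid depending on C++20 chrono), and the constructor call syntax stands in for the proposed `1970:y`.

```cpp
#include <type_traits>

// Hypothetical stand-in for std::chrono::year.
struct Year {
    int value;
    constexpr explicit Year(int v) : value(v) {}
};

namespace typed_expr_literals {
    using y = Year;  // alias: `1970:y` would construct Year directly
}
namespace ufcs_literals {
    // function: the argument is first copied into the parameter `i`
    constexpr Year y(int i) { return Year(i); }
}

// The alias names the type itself and can be namespace-qualified.
static_assert(std::is_same<typed_expr_literals::y, Year>::value, "");

constexpr Year direct   = typed_expr_literals::y(1970);  // models 1970:y
constexpr Year indirect = ufcs_literals::y(1970);        // models 1970.y()
static_assert(direct.value == indirect.value, "same result, different route");
```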

For an alias template (e.g., to replace the UDLs 1s and 1.0s), compiler support is still in the works.
[Two images were attached here.]

@JohelEGP
Contributor

JohelEGP commented May 21, 2023

From our experience at https://github.com/mpusz/units, which has quantity references (e.g., 1 * m for 1 m),
one point in favor of UDLs over some of their would-be replacements (e.g., UFCS)
is that bringing a UDL into scope doesn't take up the suffix's identifier (because its actual name is operator""𝘴𝘺𝘮𝘣𝘰𝘭).
By contrast, something like using namespace unit_literals; that brings in variables
introduces conflicts, especially when formulas are involved.
From Quantity References vs Unit-specific Aliases:

  1. Shadowing issues

    • Quantity References

      References occupy a pool of many short identifiers which sometimes shadow the variables,
      function arguments, or even template parameters provided by the user or other libraries. This
      results in warnings being generated by some compilers. The most restrictive here is MSVC which
      for example emits a warning of shadowing N template parameter for an array size provided
      in a header file with Newton unit included via namespace declaration in the main() program
      function (see experimental_angle <https://github.com/mpusz/units/blob/master/example/references/experimental_angle.cpp>).
      In other cases user is forced to rename its local identifiers to not collide with predefined
      references (see capacitor_time_curve <https://github.com/mpusz/units/blob/master/example/references/capacitor_time_curve.cpp>).

    • Unit-specific Aliases

      As aliases are defined in terms of types rather than variables no major shadowing issues were found
      so far. In case of identifiers ambiguity it was always possible to disambiguate with more
      namespaces prefixed in front of the alias.

As seen from "Unit-specific Aliases", a typed expression isn't affected by shadowing.
But UFCS, which works on functions (whose names behave like variables here), possibly is.

See also UDLs vs Quantity References for many major pain points against UDLs.
One that applies to UDL and UFCS, but not typed expression:

  • 2. UDLs cannot be disambiguated with a namespace name

@JohelEGP
Contributor

JohelEGP commented May 21, 2023

I expanded on the quote above.
A typed expression isn't actually affected, but UFCS probably is.

Allow me to expand the summary's table:

| Feature                                       | Aliases | References | Typed expression | UDLs          |
|-----------------------------------------------|---------|------------|------------------|---------------|
| Literals and variables support                | Yes     | Yes        | Yes              | Literals only |
| Preserves user provided representation type   | Yes     | Yes        | Yes              | No            |
| Explicit control over the representation type | Yes     | No         | Yes              | No            |
| Possibility to resolve ambiguity              | Yes     | Yes        | Yes              | No            |
| Readability                                   | Good    | Medium     | Good             | Good          |
| Hard to resolve shadowing issues              | No      | Yes        | No               | No            |
| Operators precedence issue                    | No      | Yes        | No               | No            |
| Controlled verbosity                          | Yes     | No         | Yes              | No            |
| Easy composition for derived units            | No      | Yes        | Yes              | No            |
| Simplified quantity casting                   | Yes     | No         | Yes              | No            |
| Implementation and standardization effort     | Medium  | Lowest     | Medium           | Highest       |
| Compile-time performance                      | Fastest | Medium     | Fastest          | Slowest       |

As you can see, a typed expression is the best in almost all aspects.
For details on the "feature", please see the linked documentation.

@JohelEGP
Contributor

JohelEGP commented May 21, 2023

@mpusz You may be interested in looking at this. In particular, the 3 comments above.

@msadeqhe
Author

@JohelEGP Thanks for explaining about indirect construction of UDL and UFCS, I wasn't aware of it.

From our experience at https://github.com/mpusz/units, which has quantity references (e.g., 1 * m for 1 m), ...

I changed the example to have typed expressions:

// simple numeric operations
static_assert(10:km / 2 == 5:km);

// unit conversions
static_assert(1:h == 3'600:s);
static_assert(1:km + 1:m == 1'001:m);

_s: = 1:s;
kmph:type == decltype(1:km / _s);

// dimension conversions
static_assert(1:km / 1:s == 1'000:m / _s);
static_assert(2:kmph * 2:h == 4:km);
static_assert(2:km / 2:kmph == 1:h);
static_assert(2:m * 3:m == 6:m2);
static_assert(10:km / 5:km == 2);
static_assert(1'000 / 1:s == 1:kHz);

For this to work, I think that unit types may have an extra template parameter to indicate the prefix. For example metre<1'000> is equal to kilometre.

@JohelEGP
Contributor

Don't worry. The library has taken care of all that.
In fact, it already has those alias templates!
So if typed expressions made it to Cpp2, that example would compile right away (not on Clang yet).

@JohelEGP
Contributor

From commit 0982b8e:

Note that : continues to be pronounced "is a"... e.g., f: () -> int is pronounced as "f is a function returning int," v: vector<int> as "v is a vector<int>", this: Shape as "this object is a Shape."

That works well for named declarations.
What about expressions with :?

Commit 1090a31 also enabled : std::vector = (5,1).
Let's consider it together with 42:seconds.

: std::vector = (5,1) can be pronounced as "the vector $(5, 1)$", and
42:seconds and 42:s can be pronounced as "42 seconds".

@msadeqhe
Author

msadeqhe commented May 22, 2023

Good point. So it would be like to pronounce:

  • 42 : second as 42 is a second.
    • the value of 42 is a second type.
  • (a + b) : price as (a + b) is a price.
    • the result of (a + b) is a price type.
  • ("name", 30) : Person as ("name", 30) is a Person.
    • the argument group ("name", 30) is a Person type.

In general:

  • something : name as something is a name.
    • the expression something is a name type.
  • : name = something as it is a name assigned from something.
    • it is the unnamed variable.

@msadeqhe
Author

msadeqhe commented May 22, 2023

Assignment to Typed Expression:

// `A` is a declaration.
A: Type = 0;

// `B` is a declaration.
// `A` is a typed expression.
B: Type = A: Type;

// It's equal to:
//      = Type::operator=(out this, B).operator=(something)
C: Type = B: Type = something;
//  (2 + 2): Type = something;
// x++*.f(): Type = something;

// It's equal to:
//      = Type::operator=(out this, something)
D: Type = : Type = something;

It can be safe to disallow assignment in case 3, because it's an rvalue:

// ERROR `B: Type` is rvalue.
C: Type = B: Type = something;
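Cpp1 already has a way to express "no assignment to an rvalue" for class types, which may be a useful reference point for the rule above. The sketch below is an assumption about how such a rule could be modelled, not something cppfront does: `Type` is hypothetical, and its assignment operator is `&`-qualified so that it only accepts lvalues.

```cpp
#include <type_traits>

// Hypothetical type, only for illustration.
struct Type {
    int value = 0;
    constexpr Type() = default;
    constexpr explicit Type(int v) : value(v) {}
    // The `&` qualifier restricts this assignment to lvalues,
    // so a temporary cannot be assigned to.
    Type& operator=(int v) & { value = v; return *this; }
};

// Lvalue (a named variable such as `B`): assignment is allowed.
static_assert(std::is_assignable<Type&, int>::value, "");
// Rvalue (what the typed expression `B: Type` would produce): assignment is rejected.
static_assert(!std::is_assignable<Type, int>::value, "");
```

The constructor is `explicit` on purpose: otherwise the implicitly generated copy assignment could still accept an `int` on a temporary through an implicit conversion.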

I'm going to categorize them.

They all have a similar syntax but semantically they are different in this way:

  1. Variable Declarations
    They are statements.
    • something: Type;
    • something: Type = value;
  2. Unnamed Variables
    They are expressions.
    • (: Type = value)
  3. Typed Expressions
    They are expressions.
    • (something: Type)
    • (something: Type /*unary postfix operators*/)
    • For example:
      • (something: Type = value)
      • (something: Type++)
      • (something: Type.member...)

Unnamed Variables are a special Typed Expression.

So we can think of Unnamed Variables as a special kind of Typed Expression. This generalized syntax covers both of them:

(something: Type = value)

Unnamed Variables don't have the something part, therefore the variable won't be initialized with something, instead it will be initialized with value, because value is the first assignment, and the first assignment is initialization.

After that, they would be categorized in this way:

  1. Variable Declarations
    They are statements.
    • something: Type;
    • something: Type = value;
  2. Typed Expressions
    They are expressions.
    • (something: Type)
    • (something: Type /*unary postfix operators*/)
    • For example:
      • (something: Type = value)
      • (something: Type++)
      • (something: Type.member...)
    • (: Type = value) aka Unnamed Variables
    • (: Type) is invalid, because it's an uninitialized variable which is immediately used.

This categorization will reduce concept count.

@msadeqhe
Author

msadeqhe commented May 22, 2023

I'm trying to find a general rule to reduce concept count.

Also Unnamed Functions could be somehow a special Typed Expression if Cpp2 would support issue suggestion #391 titled "Statement-expressions, result vs return". In general it would be possible to have a block statement after assignment operator. Let's look at the syntax of typed expression with function types and assignment in this case:

(something: (args) -> Type = { /*statements*/ })

Unnamed Functions don't need the something part, because the something part is for passing arguments to the function (see next comment). Without the something part, the expression is a function object; with it, the function would be called immediately. Either way, an assignment is needed to define the function body, because the body cannot come before the parameter declarations.

After that, they would be categorized in this way:

  1. Variable/Function Declarations
    They are statements.
    • something: Type;
    • something: Type = value;
    • (NEW) something: Type = { /*statements*/ }
  2. Typed Expressions
    They are expressions.
    • (something: Type)
    • (something: Type /*unary postfix operators*/)
    • For example:
      • (something: Type = value)
      • (NEW) (something: Type = { /*statements*/ })
      • (something: Type++)
      • (something: Type.member...)
    • (: Type = value) aka Unnamed Variables
    • (NEW) (: (args) -> Type = { /*statements*/ }) aka Unnamed Functions
    • (: Type) is invalid, because it's an uninitialized variable which is immediately used.

I have to clarify about the syntax (described above) of typed expressions:

  • The type is not restricted. It can be either:
    • a variable type (e.g. Type).
    • or a function type (e.g. (args) -> Type).
  • But something cannot be a statement (e.g. { /*statements*/ }).
    • Because Typed Expressions can only be applied to expressions.
  • If the type of typed expression is a function type:
    • If it has something, the function would be immediately called (see next comment).
    • It must have an assignment. Because function body must be after parameter declarations.

So this example won't be allowed:

// WRONG! This typed expression applied to a statement block.
{ /*statements*/ }:(args) -> Type

@msadeqhe

This comment was marked as outdated.

@JohelEGP

This comment was marked as resolved.

@msadeqhe
Author

msadeqhe commented May 22, 2023

Yes, you're right. I will correct this paragraph from my comment.

Unnamed Functions don't have the something part, therefore the function won't be defined with something, instead it will be defined with { /*statements*/ }, because { /*statements*/ } is the first assignment, and the first assignment is definition:

// This function has no definition yet.
func: (args) -> Type;

// The first assignment is definition.
func = { /*statements*/ }

Because the something part is for passing arguments to the function.

Edit

Thanks @JohelEGP I've removed that misleading information from my comment.

@msadeqhe
Author

If they have something, in this case the function would be called immediately with arguments:

(something: (args) -> Type = { /*statements*/ })

For example:

// It immediately calls the function with arg=1.
1: (arg: int) -> int = { return arg + 2; }

// It immediately calls the function with a=1, b=2.
(1, 2): (a: int, b: int) -> int = { return a + b; }

I gave up on this idea. Unnamed functions shouldn't be immediately called in this way, because it's inconsistent with how unnamed variables work.
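For comparison, the closest Cpp1 counterpart of the abandoned form is an immediately invoked lambda, where the arguments come after the body rather than before the signature. The following sketch (C++17, for constexpr lambdas) only models the two examples above; the variable names are made up.

```cpp
// Models `(1, 2): (a: int, b: int) -> int = { return a + b; }`:
// an unnamed function that is called as soon as it is defined.
constexpr int sum = [](int a, int b) -> int { return a + b; }(1, 2);

// Models `1: (arg: int) -> int = { return arg + 2; }`.
constexpr int plus_two = [](int arg) -> int { return arg + 2; }(1);

static_assert(sum == 3, "a=1, b=2");
static_assert(plus_two == 3, "arg=1");
```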

@msadeqhe msadeqhe changed the title [SUGGESTION] Typed Expressions to replace both UDLs and constructors! [SUGGESTION] Typed Expressions; generalized constructors, UDLs, unnamed variables and functions May 23, 2023
@mpusz

mpusz commented May 23, 2023

@JohelEGP you provided "Yes" in the table for "Easy composition for derived units". How is it possible with typed expressions?

@JohelEGP
Contributor

As a replacement to UDLs,
typed expressions build on type aliases/alias templates,
so they should be the same.
At the time,
I probably thought that the alias used for example in the
"Composition for unnamed derived units" bullet
could be defined using decltype and expressions.

@mpusz

mpusz commented May 24, 2023

Easy composition for derived units and quantities does not mean that you can decltype some result to define the type, but the fact that you do not have to define it at all. We do not want to end up with hundreds of different variations of types for units of a single derived quantity. For example, consider how many predefined types for units of angular momentum besides kilogram_metre_sq_per_second would be needed to make everyone happy. That is not easy to compose (and standardize) at all.

@JohelEGP
Contributor

I understand.
V2 makes that row redundant, right?
It has no unit downcasting, and unit composition is transformed
from kilogram_metre_sq_per_second to ~derived_unit<square<kilogram, metre>, per<second>>, and
from kilometre_per_hour to ~derived_unit<kilo<metre>, per<hour>>.
So the bullet under the title "Composition for unnamed derived units",
which for "Quantity References" says "References have only to be defined for named units."
would be true for aliases, too.

@msadeqhe
Author

Somehow if multiplication of units could be modeled as template template parameters, we would have:

// Type aliases
kg: <T> type == com<kg_type, T>;
m2: <T> type == com<m2_type, T>;

// 10:kg:m2 is com<m2_type, com<kg_type, int>>
a: = 10:kg:m2;

But there isn't any notation better than operator/ for division:

1:N:m == 1:kg:m2 / 1:s2

1:J / 1:mol:K == 1:m2:kg / 1:s2:mol:K

@mpusz

mpusz commented May 24, 2023

I do not think V2 makes it redundant. The V2 provides a solution that gathers the best features from all the options we had before. In V2 we have units that "have only to be defined for named units", and the "unnamed" derived units are obtained by applying unit equations on the predefined ones. In V2 user never types derived_unit<kilogram, square<metre>, per<second>> but does kg * m2 / s (or si::kilogram * square<si::metre> / si::second) to get it. You can put those easily to a quantity type as well quantity<kg * m2 / s>. That is the power of composition where you have to predefine only a few named units to be able to obtain "infinite" number of derived unnamed ones.

Unit-specific aliases in V1 point to quantity types rather than units, so we can't obtain a derived unit or quantity_spec type by equations. I think this is also true for typed expressions.

@JohelEGP
Contributor

1:N:m

It doesn't seem possible for that,
or the equivalent m(N(1)),
to mean 1 Newton metre.

@msadeqhe
Author

msadeqhe commented May 24, 2023

What if its type is:

m: <T> type == Comp<m_type, T>;
N: <T> type == Comp<N_type, T>;

1:N:m is Comp<m_type, Comp<N_type, int>>
1:N:m:kg is Comp<kg_type, Comp<m_type, Comp<N_type, int>>>

And Comp is the underlying type of all units. Each derived unit is a composition of two units, but each base unit is a composition of itself and int.

@JohelEGP
Contributor

Sorry, I was too brief in my reply.

That certainly works.
But it would be another library altogether.

One of the points of mp-units is

  1. The best possible user experience
  • compiler errors
  • debugging

-- https://mpusz.github.io/units/introduction.html#approach

The nesting required to make this work is suboptimal.
Another thing is that the types of 1:N:m and 1:m:N
would be different just due to the placement of the units.
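The nesting and the order dependence can be seen in a Cpp1 model of the `Comp` idea from the previous comment. This is only a sketch: the shape of `Comp`, its `inner` member and the alias templates are assumptions, and constructor calls stand in for the chained typed expressions.

```cpp
#include <type_traits>

struct N_type {};
struct m_type {};

// Hypothetical `Comp`: a unit tag wrapped around an inner value.
template <class Unit, class T>
struct Comp {
    T inner;
    constexpr explicit Comp(T v) : inner(v) {}
};

template <class T> using N = Comp<N_type, T>;
template <class T> using m = Comp<m_type, T>;

// 1:N:m  is modelled as  m(N(1))
constexpr auto newton_metre = m<N<int>>(N<int>(1));
// 1:m:N  is modelled as  N(m(1))
constexpr auto metre_newton = N<m<int>>(m<int>(1));

// The resulting type nests one Comp per unit in the chain...
static_assert(std::is_same<decltype(newton_metre),
                           const Comp<m_type, Comp<N_type, int>>>::value, "");
// ...and swapping the units yields a different type.
static_assert(!std::is_same<decltype(newton_metre),
                            decltype(metre_newton)>::value, "");
```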

I tried to make it work without disrupting the design.
What I found out is that it doesn't seem possible
for m(N(0)) to result in a type like
quantity<derived_unit<metre, Newton>, int>: https://cpp2.godbolt.org/z/YbEh4aPj9.

@JohelEGP
Contributor

Another point against alias chaining is the extra construction per type.
For example,
0:uₙ:uₙ₋₁:…:u₀ and
u₀(…(uₙ₋₁(uₙ(0)))…)
perform $n-1$ extra quantity constructions
vs.
0 * (uₙ * uₙ₋₁ * … * u₀),
which computes the final quantity from the rhs once
and the outer * performs the quantity construction once
(plus a parameter construction).

I have to say that the readability and composability of your example is superb:

1:N:m == 1:kg:m2 / 1:s2

1:J / 1:mol:K == 1:m2:kg / 1:s2:mol:K

@JohelEGP
Contributor

JohelEGP commented May 25, 2023

A typed expression can subsume a UDL better than UFCS.

This still stands.

Here are some examples of chained typed expressions that work well from #284:

20'percent'bottle'water
5'000'gram'apple
1'h'worker

1:N:m (1 Newton metre)

It's unfortunate that to make chaining work for units
one has to add an explicit constructor that doesn't actually make sense by itself.
What does it mean to construct a quantity of metres from Newtons?

I've left the table of #463 (comment) untouched,
despite typed expressions as a substitute for unit UDLs building on aliases.

I'm thinking that rather than aliasing the existing class template quantity,
the aliases intended to be used in a typed expression could be their own entity.
Then we have a clean slate to work around whatever issues we can
and integrate them better into the existing design.

Here's my attempt: https://cpp2.godbolt.org/z/3ddev6MMe https://cpp2.godbolt.org/z/abMWoj6eG.

@msadeqhe
Author

msadeqhe commented May 27, 2023

<> for grouping types

Currently we use parentheses to group expressions: (1 + 2) * 3. I suggest using angle brackets <> to group types and making them syntactic sugar for decltype in this way:

2:<N*m> == 2:decltype(2:N * 2:m)
2:<int> == 2:decltype(2:int)

It would make type composition easy. For example:

2:<N*m> == 2:<kg*m2/s2>
1:<J/mol/K> == 1:<m2*kg/s2/mol/K>

Also nested grouping with <> is possible (just like () within expressions):

1:<J/<mol*K>> == 1:<m2*kg/<s2*mol*K>>

It doesn't conflict with <> for template parameters, because it only works for typed expressions, in the same way that parentheses delimit function parameters in declarations but mean grouping within expressions.

If there is an identifier before <>, it would be a template type with template arguments:

2:<Type<int>*int> == 2:decltype(2:Type<int> * 2:int)

That is just like how if there is an identifier before (), it would be a function call with function arguments.

@msadeqhe
Author

msadeqhe commented May 27, 2023

Additionally, to use variable or function names within <>, we may use decltype within <> like this:

a: Type = 0;

2:<decltype(a)*int> == 2:decltype(2:decltype(a) * 2:int)

Unary operators and other combinations are possible, but we use <> for grouping instead of (). For example:

2:<<Abc + Xyz>++ * <Abc + Xyz>++> == 2:decltype((2:Abc + 2:Xyz)++ * (2:Abc + 2:Xyz)++)
2:<Abc < Xyz> == 2:decltype(2:Abc < 2:Xyz)
2:<Abc > Xyz> == 2:decltype(2:Abc > 2:Xyz)

< and > within <> will be parsed similar to how they work within template arguments.

@mpusz

mpusz commented May 27, 2023

2:<N*m> == 2:decltype(2:N * 2:m)

This does not work in a generic sense. Even though it is perfectly fine for int as a representation type, it will not work for a linear algebra vector type as multiplying those is generally undefined. You either have a dot or vector product, but both end up with a different type than the inputs.

Probably you mean something like:

2:<N*m> == 2:decltype(N * m)

which may work.
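The distinction can be sketched in Cpp1 with unit tag objects. Everything here is hypothetical (`unit_tag`, `product`, `quantity`, `vec3`) and only loosely inspired by the V2 design described earlier in the thread: `decltype(N * m)` multiplies the unit tags alone, so the representation type never needs an operator*.

```cpp
#include <type_traits>

struct unit_tag {};
struct newton : unit_tag {};
struct metre  : unit_tag {};
template <class A, class B> struct product : unit_tag {};

// Multiplying two unit tags yields a tag type; no representation values are involved.
template <class A, class B,
          class = typename std::enable_if<std::is_base_of<unit_tag, A>::value &&
                                          std::is_base_of<unit_tag, B>::value>::type>
constexpr product<A, B> operator*(A, B) { return {}; }

constexpr newton N{};
constexpr metre  m{};

template <class Unit, class Rep>
struct quantity { Rep value; };

// A representation type with no operator*, e.g. a linear-algebra vector.
struct vec3 { double x, y, z; };

// Models `<N*m>` as `decltype(N * m)`: only the unit tags are multiplied.
using newton_metre = decltype(N * m);
static_assert(std::is_same<newton_metre, product<newton, metre>>::value, "");

constexpr quantity<newton_metre, int>  torque_int{2};
constexpr quantity<newton_metre, vec3> torque_vec{{1.0, 0.0, 0.0}};
```

With `decltype(2:N * 2:m)` instead, the second declaration would require `vec3 * vec3` to exist, which is the problem pointed out above.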

@msadeqhe
Author

Thanks. Yes, I mean that. In fact, I was thinking about allowing unnamed uninitialized variables within decltype:

// `:N` and `:m` are unnamed uninitialized variables.
2:<N*m> == 2:decltype(:N * :m)

: is not an operator like * or /, so it doesn't denote a mathematical operation here; expr:Type creates an instance of Type from expr.

@msadeqhe
Author

msadeqhe commented May 27, 2023

<> for grouping types

Currently we use parentheses to group expressions: (1 + 2) * 3. I suggest using angle brackets <> to group types and making them syntactic sugar for decltype in this way:

2:<N*m> == 2:decltype(2:N * 2:m)
2:<int> == 2:decltype(2:int)

It's also syntactically possible to use () instead of <> without any conflict. That's based on the rule that the type of a typed expression cannot be a function type (except for unnamed functions, where the left side of : doesn't exist), because a function body must come after its signature. So we would have:

a: = 120:A;

// `(A)` is not a function signature, because it's a typed expression.
a: = 120:(A);

// But `(A)` is a function signature, because it's a declaration.
a: (A) = ...

Examples:

2:int == 2:(int)

2:(N*m) == 2:(kg*m2/s2)
1:(J/mol/K) == 1:(m2*kg/s2/mol/K)

1:(J/(mol*K)) == 1:(m2*kg/(s2*mol*K))

a: Type = 0;

2:(decltype(a)*int)

2:((Abc + Xyz)++ * (Abc + Xyz)++)
2:(Abc < Xyz)
2:(Abc > Xyz)

Although () is more readable than <>, <> always means the same thing, in contrast to ().

@JohelEGP
Contributor

JohelEGP commented May 27, 2023

<> for grouping types

[...]

It would make type composition easy. For example:

2:<N*m> == 2:<kg*m2/s2>

That certainly works in favor of unit libraries.
But I worry about the feature not being more generally useful.
Can we think of more use cases?

It can also work for C++ standard library range piping when the pipes don't have input:

algo: <R> (r: R) requires std::ranges::range<R> = {
  rng: namespace == std::ranges;
  return r:<rng::filter_view|rng::join_view>;
}

Of course, the standard syntax r | filter | join is more general.

Also it's syntactically possible to use () instead of <> without any conflict

I was going to suggest that for the inner <>s, e.g., 1:<J/<mol*K>> -> 1:<J/(mol*K)>.
Because identifiers within the <> after the colon in a typed expressions are already types.
Parentheses would also work after the colon,
but I worry we might be overloading them too much in Cpp2.

Thanks. Yes, I mean that. In fact, I was thinking about allowing unnamed uninitialized variables within decltype:

// `:N` and `:m` are unnamed uninitialized variables.
2:<N*m> == 2:decltype(:N * :m)

That'd be a good shorthand for

decltype(std::declval<decltype(2:N)>() * std::declval<decltype(2:m)>())

@msadeqhe
Author

msadeqhe commented May 28, 2023

You're right, () is already working in expressions and that is reasonable to use it for inner <>s.

For a more general use case, it seems <> can be used within declarations without any conflict with template parameters:

// When there is one <>, it's for type composition.
variable1: <A*B++> = /*expression*/;

// That's because the following is already an error in Cpp2 if `T` is a template parameter:
// ERROR! `T` is not a declared type! Also `T` cannot be a template parameter.
variable2: <T> = /*expression*/;

// Instead, it has to be declared like the following (already works):
variable3: <T> T = /*expression*/;

// When there are two <>, always:
// - The first <> is for template parameters.
// - The second <> is for type composition.
variable4: <T> <A*B/T> = /*expression*/;

// The <> before () is always for template parameters.
function1: <T> (a: <A*T>, b: <B*T>) -> <A*B> = { /*statements*/ }

// OK: The type of template parameter `v` is `<A*B>`.
function2: <v: <A*B>> () = { /*statements*/ }

// OK: The type of template parameter `v` is template parameter `std::vector<T>`.
function3: <v: <T> std::vector<T>> () = { /*statements*/ }

So <> would somehow complement decltype:

function1: (a: A, b: B) -> decltype(a*b) = { /*statements*/ }

function2: (a: A, b: B) -> <A*B> = { /*statements*/ }

function3: (a: A, b: B) -> <A*decltype(b)> = { /*statements*/ }

Also, Cpp2 can go further and rename decltype to simply type. That would mean fewer keywords in the language:

function1: (a: A, b: B) -> type(a*b) = { /*statements*/ }

function2: (a: A, b: B) -> <A*type(b)> = { /*statements*/ }

a: A = ();
variable1: type(a) = /*expression*/;

In this way type is for declaring a type, but type(...) is a way to get the type of an expression exactly like decltype. The point is that type is already a keyword in Cpp2.

@msadeqhe
Author

msadeqhe commented May 28, 2023

Briefly it would mean:

a: = ... // It's a variable or function object. It depends on the right hand side of assignment.
b: A = ... // It's a variable.
c: <A> = ... // It's a variable. <A> is a composed type. Currently it's an error in Cpp2.
d: <T> T = ... // It's a variable template.
e: <T> <T> = ... // It's a variable template. The second <T> is a composed type.
f: <T> type(expr) = ... // It's a variable template. `type` is `decltype` here.
g: type(expr) = ... // It's a variable. `type` is `decltype` here.
h: type = ... // It's a type.
i: <T> type = ... // It's a type template.
j: <T> (args...) = ... // It's a function template. The return type is `void`.
k: <T> (args...) -> T = ... // It's a function template.
l: <T> (args...) -> <T> = ... // It's a function template. The return type <T> is a composed type.
m: <T> (args...) -> type(expr) = ... // It's a function template. `type` is `decltype` here.

type means "Type" (itself), but type(expr) means "Type of expression".

In general:

// <A> is a composed type.
n: <A> = ...

// <T> is a template parameter.
// `something` can be either a type, composed type, function type, `type` or `namespace`.
o: <T> something = ... // `o` is a template.

// `something` can be either a type, composed type, function type, `type` or `namespace`.
p: something = ... // `p` is not a template.

@JohelEGP
Contributor

Also, Cpp2 can go further and rename decltype to simply type. That would mean fewer keywords in the language:

C23 got typeof and typeof_unqual.
IIUC, they'll be C++ when it's rebased on C23.

@AbhinavK00

typeof is not decltype though. It's like decltype but with references removed. So,

typeof :== std::remove_reference_t<decltype(T)>;

decltype would still be used and therefore renaming it could be considered.
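The distinction described here can be checked in today's C++. A minimal sketch using only `decltype` and `std::remove_reference_t` from the standard library; it illustrates the reference-stripping behaviour claimed for `typeof`, not C23 itself, and the names `value`, `ref` and `stripped` are made up for the example:

```cpp
#include <type_traits>

int value = 0;
int& ref = value;

// decltype keeps the declared reference.
static_assert(std::is_same_v<decltype(ref), int&>);

// A parenthesized lvalue expression also yields a reference type.
static_assert(std::is_same_v<decltype((value)), int&>);

// Stripping the reference gives the typeof-like result described above.
using stripped = std::remove_reference_t<decltype(ref)>;
static_assert(std::is_same_v<stripped, int>);
```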

@msadeqhe
Author

msadeqhe commented May 29, 2023

I examined () for type composition within declarations, it seems () is not needed at all within declarations.

<> for grouping types

Currently we use parentheses to group expressions: (1 + 2) * 3. I suggest using angle brackets <> to group types, making them syntactic sugar for decltype in this way:

2:<N*m> == 2:decltype(2:N * 2:m)
2:<int> == 2:decltype(2:int)

Also, it's syntactically possible to use () instead of <> without any conflict. That's based on the rule that the type in a typed expression cannot be a function type (except for unnamed functions, where the left side of : doesn't exist), because a function body must come after its signature. So we would have:

a: = 120:A;

// `(A)` is not a function signature, because it's a typed expression.
a: = 120:(A);

// But `(A)` is a function signature, because it's a declaration.
a: (A) = ...

Examples:

2:int == 2:(int)

2:(N*m) == 2:(kg*m2/s2)
1:(J/mol/K) == 1:(m2*kg/s2/mol/K)

1:(J/(mol*K)) == 1:(m2*kg/(s2*mol*K))

a: Type = 0;

2:(decltype(a)*int)

2:((Abc + Xyz)++ * (Abc + Xyz)++)
2:(Abc < Xyz)
2:(Abc > Xyz)

Although () is more readable than <>, <> always means the same thing, in contrast to ().

So type composition within declarations would be like this:

A: type = { /*declarations*/ }
B: type = { /*declarations*/ }
function: <T> (a: A*T, b: B*T) -> A*B = { /*statements*/ }

X: type = { /*declarations*/ }
variable: <T> X*T = /*value*/;

That works because , has the lowest precedence here: it's not an operator, just a separator. So () should not be used for type composition within declarations.

Briefly, it means that:

  • For type composition within declarations, () should not be used around a composed type, because () means a function type:
    function: <T> (a: A*T, b: B*T) -> A*B = { /*statements*/ }
    Now compare it with type aliases in which the type composition depends on template parameters.
    This doesn't work:
    x: X = (); // OK.
    t: T = (); // ERROR! `T` is a template parameter.
    XT: == decltype(x*t); // No, it doesn't work.
    function: <T> (a: XT, b: XT) -> XT = { /*statements*/ }
    Because in this case, a variable template is needed:
    x: X = (); // OK.
    t: <T> T = (); // OK. `T` is a template parameter.
    XT: <T> == decltype(x*t<T>); // OK, it works.
    function: <T> (a: XT<T>, b: XT<T>) -> XT<T> = { /*statements*/ }
    So type composition within declarations makes the code simpler to write and easier to read when type composition depends on template parameters.
  • For type composition within typed expressions, () has to be used around a composed type, because operators have lower precedence than :, and typed expressions cannot be function types, so () wouldn't conflict with them:
    x: = something:(A*B);
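The alias-template workaround from the first bullet has a direct Cpp1 analogue. A minimal sketch, assuming hypothetical types `X`, `Meters` and `Product<L, R>` with a generic `operator*`; here `std::declval` stands in for the variable template `t<T>`:

```cpp
#include <type_traits>
#include <utility>

// Hypothetical types, for illustration only.
struct X {};
struct Meters {};
template <class L, class R> struct Product {};

// Multiplying an X by any T yields Product<X, T>.
template <class T>
constexpr Product<X, T> operator*(X, T) { return {}; }

// The composed type depends on T, so the alias must itself be a template.
template <class T>
using XT = decltype(std::declval<X>() * std::declval<T>());

// Each use site has to repeat the template argument; T is not deducible
// through the alias, so callers must write function<Meters>(...).
template <class T>
constexpr XT<T> function(XT<T> a, XT<T>) { return a; }

static_assert(std::is_same_v<XT<Meters>, Product<X, Meters>>);
```

Writing the composed type inline, as in `(a: X*T, b: X*T) -> X*T`, would avoid both the extra alias and the repeated `<T>`.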

@msadeqhe
Author

msadeqhe commented May 29, 2023

C23 got typeof and typeof_unqual.
IIUC, they'll be C++ when it's rebased on C23.

That's interesting. Also C23 has auto.

decltype would still be used and therefore renaming it could be considered.

I like how Cpp2 already has type keyword, and type(...) could be used to mean decltype(...).

@realgdman

1:N:m (1 Newton metre)
What does it mean to construct a quantity of metres from Newtons?

I agree that's not intuitive from a human-language point of view either. In Cpp2, : generally means is-a, like "f is-a function", and "Newton is-a metre" makes little sense.

@JohelEGP
Contributor

1:<N*m> solves that and the $n-1$ constructions of 1:N:m, which translates to m(N(1)).

@msadeqhe
Author

msadeqhe commented May 30, 2023

Yet Another Alternative Solution

What if we first try to fix arg.Type(other_args...) for object construction? I've described the problems of UFCS on types in this comment. If we use (all_args...).Type or expr.Type instead of arg.Type(other_args...), it would be like this:

Abc: type = { /*declarations*/ }

x1: = 1.Abc;   // this suggestion
x2: = 1.Abc(); // in current Cpp2

y1: = (1, 2).Abc; // this suggestion
y2: = (1).Abc(2); // in current Cpp2

z1: = (1, 2, 3).Abc; // this suggestion
z2: = (1).Abc(2, 3); // in current Cpp2

So (all_args...).Type doesn't have the problems of UFCS on types:

  • It's not UFCS, because it doesn't have parentheses after Type, just as UFCS doesn't work on member variables.
    • So the first parameter this of operator= will not be (all_args...) in (all_args...).Type.
    • It's not inconsistent with UFCS on functions, because it's not UFCS at all.
  • It is less likely than UFCS on types to be overridden by a member.
    • Because most of the time, member variables are not public.
      Abc: type = {
          member: int = 0;
      }
      
      member: type = {
          operator=: (out this, value: Abc) = { /*statements*/ }
      }
      
      main: () = {
          x: Abc = ();
          
          // `member` is a type, because `Abc::member` is not public.
          // It calls the constructor `member::operator=(x)`.
          y: = x.member;
      }
  • It complements UFCS.
    • UFCS works on non-member functions and member functions.
      • Member functions will override non-member functions within obj.func(args).
      Abc: type = {
          // It overrides non-member function `func`.
          func: (this) = { /*statements*/ }
      }
      
      Cde: type = {
          // Nothing overrides non-member function `func`.
      }
      
      func: (obj: Abc) = { /*statements*/ }
      func: (obj: Cde) = { /*statements*/ }
      
      main: () = {
          x: Abc = ();
          y: Cde = ();
      
          // It calls member function `func`.
          x.func();
      
          // It calls non-member function `func`.
          y.func();
      }
    • This suggestion works on types and member variables.
      • Member variables will override type constructions within (args).Type or expr.Type.
      Abc: type = {
          // It overrides type construction.
          vari: int = 0;
      }
      
      Cde: type = {
          // Nothing overrides type construction.
      }
      
      vari: type = {
          operator=: (out this, value: Abc) = { /*statements*/ }
          operator=: (out this, value: Cde) = { /*statements*/ }
      }
      
      main: () = {
          x: Abc = ();
          y: Cde = ();
      
          // It gets member variable `vari`.
          a: = x.vari;
      
          // It constructs an instance of `vari`.
          b: = y.vari;
      }
    • It would be possible to add member variables and functions outside type definition (something similar to extension properties and methods).
  • All arguments are on one side, because they all play the same role in constructing the object:
    Abc: type = {
        operator=: (out this, a: int, b: int) = { /*statements*/ }
    }
    
    main: () = {
        // this suggestion
        x: = (1, 2).Abc;
    
        // With UFCS on types, it looks strange.
        y: = (1).Abc(2);
    }
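The member-first rule for `obj.func(args)` described in the list above can be emulated in C++17 with the detection idiom. A minimal sketch, assuming a hypothetical helper `ufcs_func`; this is not how cppfront implements UFCS, only an illustration of the lookup order:

```cpp
#include <type_traits>
#include <utility>

struct Abc {
    // Member function: preferred by the dispatcher below.
    constexpr int func() const { return 1; }
};

struct Cde {};  // No member `func`.

// Non-member overloads.
constexpr int func(const Abc&) { return 2; }
constexpr int func(const Cde&) { return 3; }

// Detects whether `obj.func()` is well-formed for a const T.
template <class T, class = void>
struct has_member_func : std::false_type {};

template <class T>
struct has_member_func<T, std::void_t<decltype(std::declval<const T&>().func())>>
    : std::true_type {};

// Hypothetical dispatcher: try the member first, then the free function.
template <class T>
constexpr int ufcs_func(const T& obj) {
    if constexpr (has_member_func<T>::value) {
        return obj.func();
    } else {
        return func(obj);
    }
}

static_assert(ufcs_func(Abc{}) == 1);  // the member wins
static_assert(ufcs_func(Cde{}) == 3);  // falls back to the non-member
```

The proposed `expr.Type` rule mirrors this order with member variables in place of member functions and construction in place of the free-function call.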

Using . instead of : has the advantage of allowing () for type composition, because it wouldn't be ambiguous with function types. <> can still be used instead of ():

10.(N*m) == 10 * 1.kg * 1.m2 / 1.s2

// <> can be used instead of ()
10.<N*m> == 10 * 1.kg * 1.m2 / 1.s2

But this alternative solution has a problem (opinion-based). The problem is that if something is a type, 1.something would call the constructor function, but it doesn't have () like other function calls.

@msadeqhe
Author

msadeqhe commented May 30, 2023

Now, if we use (all_args...).Type to create an instance of Type with arguments (all_args...), and if we use parentheses around composed types within typed expressions, and if we don't use anything around composed types within declarations, this is how the code would look:

// Type composition within declarations:
A: type = { /*statements*/ }
B: type = { /*statements*/ }
function: <T> (a: A*T, b: B*T) -> A*B = { /*statements*/ }

point: type = {
    operator=: (out this, a: A, b: B) = { /*statements*/ }
}

main: () = {
    a: = ().A; // It calls the default constructor.
    b: = 12.B; // It calls the constructor `operator=(out this, 12)`.
    c: = 12.int.B; // Also they can be chained.

    // Type composition within typed expressions:
    x: = 10.(N*m) == 10 * 1.kg * 1.m2 / 1.s2;
    y: = 10.(N*m) == 10.(kg*m2/s2);

    z: = (a, b).point;
}

@JohelEGP
Contributor

JohelEGP commented May 30, 2023

I don't think that's context-free:

Cpp2 strictly avoids this, and never requires sema to guide parsing. When I say "context-free parsing" this is the primary thing I have in mind... the compiler (and the human) can always parse the current piece of code without getting non-local information from elsewhere.
-- Extract from https://github.com/hsutter/cppfront/wiki/Design-note%3A-Unambiguous-parsing.

x.b is already valid today and means member access.
We'd need to know whether b is a type to know what x.b means.

Even with parentheses, (x).b and (0,1,x).b are valid today.

@msadeqhe
Author

Yes, it's not context-free, because it's based on UFCS a.Something(args) for types in which Something may be a type or a function. Whereas that syntax is changed to (a, args).Something in which Something may be a type or a variable.

The motivation of this alternative suggestion is that if UFCS a.Something(args) for types is currently acceptable in Cpp2 as a context-free language, (a, args).Something would be acceptable too.

@JohelEGP
Contributor

They are not so comparable.
The function call func(arg) and construction type(arg) are very similar, even in Cpp1.
The member access arg.something and construction something(arg) are way more dissimilar.

@msadeqhe
Author

That's right, but something(args) itself is not context-free, so a.something(args) is not context-free either: something can be either a type or a function.

To clarify, I'm not saying that 1.stuff is better than 1:stuff or 1stuff; it's just another alternative to consider.

@msadeqhe
Author

msadeqhe commented Jun 15, 2023

From this comment:

I have to correct my suggestion about derived units e.g. <A*B> in previous comment:

  • The type of <A * B> is always A.
    • For every arithmetic operator, the type of A op B is always A.
  • The type of <A && B> is always bool.
    • For every logical operator, the type of A op B is always bool.
  • The type of <A += B> is always A&.
    • For every assignment operator, the type of A op B is always A&.
  • But the type of the following operators depends on the signature of their functions:
    • <A*> aka Indirection
    • <A&> aka Address-of
    • <A()> aka Function Call
    • <A[]> aka Subscript

For arithmetic, logical and assignment operators, the type of the expression is always known from the operands themselves. So <> is not needed at all to get their type. Also, the <A*>, <A&>, <A()> and <A[]> operations are not useful within derived units.

So simply EXPR:TYPE is enough and EXPR:<TYPE> is not needed.

@JohelEGP
Contributor

The type of 1s + 1.0s is their common type, which is the type of 1.0s.
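This counterexample can be verified with `std::chrono` in standard C++. A minimal sketch; the alias names `Int`, `Float` and `Sum` are made up for the example:

```cpp
#include <chrono>
#include <type_traits>

using namespace std::chrono_literals;

using Int   = decltype(1s);        // integer-based seconds
using Float = decltype(1.0s);      // floating-point seconds
using Sum   = decltype(1s + 1.0s);

// The sum has the common type of both operands, not the type of the left one.
static_assert(std::is_same_v<Sum, std::common_type_t<Int, Float>>);
static_assert(std::is_same_v<Sum, Float>);
static_assert(!std::is_same_v<Sum, Int>);
```

So a rule like "the type of A op B is always A" does not hold for existing library types.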

@SebastianTroy

SebastianTroy commented Jun 15, 2023 via email

@msadeqhe
Author

What if Cpp2 uses the following syntax for named arguments?

x: = call(name: = "someone", age: int = 20);

It may conflict with this suggestion (EXPR:TYPE).

Repository owner locked and limited conversation to collaborators Aug 30, 2023
@hsutter hsutter converted this issue into discussion #631 Aug 30, 2023
