[stdlib] Update stdlib corresponding to 2024-05-06 nightly/mojo #2559

Merged
merged 27 commits into from
May 6, 2024
866f43a
[stdlib] Migrating atomic.mojo Pointer -> UnsafePointer (#39244)
rparolin May 3, 2024
50127c0
[stdlib] reference.mojo renaming Pointer -> LegacyPointer NFC (#39245)
rparolin May 3, 2024
cdd8fe5
[mojo-stdlib] Remove `math.div_ceil` in favor of `math.ceildiv` (#39273)
laszlokindrat May 3, 2024
896847c
[stdlib] Combine device_print.mojo with stdlib _printf (#39292)
ConnorGray May 3, 2024
6af6c32
[mojo-stdlib] Make `ceildiv` generic (#39279)
laszlokindrat May 3, 2024
04a1b29
[KGEN] Remove SignatureType from CoroutineType (#39274)
Mogball May 3, 2024
f5a15ee
[mojo-stdlib] Move `Coroutine` callback setting into `__await__` (#39…
Mogball May 3, 2024
6296df3
[LSP] Detect unused variables (#39072)
AmaranthineCodices May 4, 2024
e6f8d34
[KGEN] Rename `co.await` to `co.suspend` (NFC) (#39285)
Mogball May 4, 2024
b6a3417
[mojo-stdlib] Wrap `co.suspend` raw op uses into a helper function (N…
Mogball May 4, 2024
4348f8b
[stdlib] feature: Change SIMD to use `Formattable` (#39200)
ConnorGray May 4, 2024
3368a7f
[KGEN] Drop the result types on `!co.routine` (#39325)
Mogball May 4, 2024
685308b
[KGEN] Add the current coroutine handle as an argument to `co.suspend…
Mogball May 4, 2024
064f9a7
[SDLC] Add mojo proposals to Copybara take 2 (#39277)
patrickdoc May 4, 2024
5ddbcc9
[mojo-stdlib] Move `Int` and `IntLiteral` __init__ of `-> Self` synta…
lattner May 4, 2024
b39ef0c
[Docs] Free memory in file read example (#39343)
jackos May 4, 2024
21c8ac7
[stdlib] Migrating memory.mojo use cases from Pointer -> UnsafePointe…
rparolin May 4, 2024
dc3e73a
[mojo-lang] Enhance CheckLifetimes copy elision for 'inout self' regp…
lattner May 4, 2024
6f311a4
[mojo-stdlib] Tidy up Arc (#39354)
lattner May 5, 2024
210b0bc
[mojo-stdlib] Continued cleanup of Arc. (#39356)
lattner May 5, 2024
cfba93c
[External] [stdlib] Use `UnsafePointer` in `vector.mojo` (#38804)
gabrieldemarmiesse May 5, 2024
c2319e2
[External] [docs] Correct link (#39359)
YichengDWu May 5, 2024
e59c501
[External] [stdlib] Remove FileCheck relics (#39360)
gabrieldemarmiesse May 5, 2024
f514304
[mojo-stdlib] Massively simplify parametric mutability reference usag…
lattner May 6, 2024
28928b3
[mojo-lang] Update the MOF + testsuite for self of `Reference` type. …
lattner May 6, 2024
6d041e4
[External] [mojo-stdlib] implement all bitwise operators for `object`…
LJ-9801 May 6, 2024
cead219
[stdlib] Bump compiler version to 2024.5.622
modularbot May 6, 2024
2 changes: 1 addition & 1 deletion docs/changelog-released.md
@@ -46,7 +46,7 @@ modular update mojo

- [`initialize_pointee_copy`](/mojo/stdlib/memory/unsafe_pointer/initialize_pointee_copy)
- [`initialize_pointee_move`](/mojo/stdlib/memory/unsafe_pointer/initialize_pointee_move)
- [`move_from_pointee()`](/mojo/stdlib/memory/unsafe_pointer/initialize_pointee_move)
- [`move_from_pointee()`](/mojo/stdlib/memory/unsafe_pointer/move_from_pointee)
- [`move_pointee`](/mojo/stdlib/memory/unsafe_pointer/move_pointee)

- A new
16 changes: 16 additions & 0 deletions docs/changelog.md
@@ -87,6 +87,19 @@ what we publish.
directory, now outputs a Mojo package to `my-dir/my-package.mojopkg`.
Previously, this had to be spelled out, as in `-o my-dir/my-package.mojopkg`.

- The Mojo Language Server now reports a warning when a local variable is unused.

- The `math` module now has `CeilDivable` and `CeilDivableRaising` traits that
allow users to opt into the `math.ceildiv` function.

- Mojo now allows methods to declare `self` as a `Reference` directly, which
can be useful for advanced cases of parametric mutability and custom lifetime
processing. Previously it required the use of an internal MLIR type to
achieve this.

- `object` now implements all the bitwise operators.
([PR #2324](https://github.com/modularml/mojo/pull/2324) by [@LJ-9801](https://github.com/LJ-9801))

### 🦋 Changed

- The `abs` and `round` functions have moved from `math` to `builtin`, so you no
@@ -110,6 +123,9 @@ what we publish.
- The `math.roundeven` function has been removed from the `math` module. The new
`SIMD.roundeven` method now provides the identical functionality.

- The `math.div_ceil` function has been removed in favor of the `math.ceildiv`
function.

### 🛠️ Fixed

- [#2363](https://github.com/modularml/mojo/issues/2363) Fix LSP crashing on
49 changes: 34 additions & 15 deletions proposals/byte-as-uint8.md
Original file line number Diff line number Diff line change
@@ -1,25 +1,44 @@
# Standardise the representation of byte sequence as a sequence of unsigned 8 bit integers

At this point in time, a sequence of bytes is often represented as a sequence of signed 8 bit integers in Mojo standard library.
Most noticeable example is the underlying data of string types `String`, `StringLiteral`, `StringRef` and `InlinedString`, but also APIs like for example the hash function `fn hash(bytes: DTypePointer[DType.int8], n: Int) -> Int:`.
At this point in time, a sequence of bytes is often represented as a sequence of
signed 8-bit integers in the Mojo standard library. The most noticeable example
is the underlying data of the string types `String`, `StringLiteral`,
`StringRef` and `InlinedString`, but APIs are affected as well, for example the
hash function `fn hash(bytes: DTypePointer[DType.int8], n: Int) -> Int:`.

# Motivation
## Motivation

Logically a byte is an integer value between `0` and `255`. Lots of algorithms make use of arithmetics ground by this assumption.
A signed 8 bit integer on the contrary represents values between `-128` and `127`. This introduces very subtle bugs, when an algorithm written for unsigned 8 bit integer is used on a signed 8 bit integer.
Logically a byte is an integer value between `0` and `255`. Many algorithms
rely on arithmetic grounded in this assumption. A signed 8-bit integer, by
contrast, represents values between `-128` and `127`. This introduces very
subtle bugs when an algorithm written for unsigned 8-bit integers is used on
signed 8-bit integers.

Another motivation for this change is that Mojo aims to be familiar to Python users. Those Python users are familiar with the `bytes` class, which itself is working with values between `0` and `255`, not values between `-128` and `127`.
Another motivation for this change is that Mojo aims to be familiar to Python
users. Those Python users are familiar with the `bytes` class, which itself
works with values between `0` and `255`, not values between `-128` and `127`.
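
To make the contrast concrete, here is a minimal Python sketch (not part of the proposal) that reads the same three bytes once as Python's `bytes` does and once reinterpreted as signed 8-bit values. The helper name `as_int8` and the sample data are illustrative assumptions.

```python
def as_int8(byte: int) -> int:
    """Reinterpret an unsigned byte value (0..255) as a signed 8-bit integer."""
    return int.from_bytes(bytes([byte]), "little", signed=True)


data = bytes([0x41, 0xFC, 0xFF])

# Iterating over a Python `bytes` object yields unsigned values.
unsigned_view = list(data)

# The same bit patterns read as signed 8-bit integers.
signed_view = [as_int8(b) for b in data]

print(unsigned_view)  # [65, 252, 255]
print(signed_view)    # [65, -4, -1]
```

Any value above `127` changes sign under the signed reading, which is where code written for the `0` to `255` range starts to misbehave.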

## Examples:
## Examples

### Division:
A value `-4` represented as `Int8` has the same bit pattern as value `252` represented as `UInt8`.
`-4 // 4` equals to `-1` (`bx11111111`), where `252 // 4` equals to `63` (`bx00111111`) as we can see the bit patterns are different.
### Division

### Bit shift:
Values `-1` and `255` have the same bit pattern as `Int8` and `UInt8` `bx11111111` but `-1 >> 1` results in `-1` (same bit pattern), where `255 >> 1` results in `127` (`bx01111111`)
A value `-4` represented as `Int8` has the same bit pattern as the value `252`
represented as `UInt8`. `-4 // 4` equals `-1` (`0b11111111`), whereas
`252 // 4` equals `63` (`0b00111111`); as we can see, the bit patterns differ.
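
The division case can be checked with a short Python sketch. It assumes Python's floor division stands in for the integer division discussed here, and `bits8` is a hypothetical helper that prints the 8-bit two's-complement pattern of a value.

```python
def bits8(value: int) -> str:
    """Return the 8-bit two's-complement bit pattern of `value`."""
    return format(value & 0xFF, "08b")


signed, unsigned = -4, 252

# Same bits going in ...
same_input_bits = bits8(signed) == bits8(unsigned)  # both "11111100"

# ... but different bits coming out of the division.
signed_quotient = signed // 4      # -1
unsigned_quotient = unsigned // 4  # 63

print(bits8(signed_quotient))    # 11111111
print(bits8(unsigned_quotient))  # 00111111
```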

# Proposal
### Bit shift

A text based search for `DTypePointer[DType.int8]` and `Pointer[Int8]` on current open-sourced standard library revealed 29 results for `Pointer[Int8]` and 78 results for `DTypePointer[DType.int8]`.
Replacing `DTypePointer[DType.int8]` with `DTypePointer[DType.uint8]` and `Pointer[Int8]` with `Pointer[UInt8]` on case by case bases is a substantial refactoring effort, but it will prevent a certain class of logical bugs (see https://github.com/modularml/mojo/pull/2098). As it is a breaking change in sense of API design, it is sensible to do the refactoring as soon as possible.
Values `-1` and `255` have the same bit pattern (`0b11111111`) as `Int8` and
`UInt8` respectively, but `-1 >> 1` results in `-1` (same bit pattern), whereas
`255 >> 1` results in `127` (`0b01111111`).
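
The same effect shows up in Python, whose `>>` on negative integers is an arithmetic (sign-preserving) shift. The sketch below is only an illustration; `bits8` is a hypothetical helper, not a standard-library function.

```python
def bits8(value: int) -> str:
    """Return the 8-bit two's-complement bit pattern of `value`."""
    return format(value & 0xFF, "08b")


signed, unsigned = -1, 255

# Identical input bit patterns.
same_input_bits = bits8(signed) == bits8(unsigned)  # both "11111111"

# An arithmetic shift keeps the sign bit; the unsigned value shifts in a zero.
signed_shifted = signed >> 1      # -1
unsigned_shifted = unsigned >> 1  # 127

print(bits8(signed_shifted))    # 11111111
print(bits8(unsigned_shifted))  # 01111111
```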

## Proposal

A text-based search for `DTypePointer[DType.int8]` and `Pointer[Int8]` in the
currently open-sourced standard library revealed 29 results for `Pointer[Int8]`
and 78 results for `DTypePointer[DType.int8]`. Replacing
`DTypePointer[DType.int8]` with `DTypePointer[DType.uint8]` and `Pointer[Int8]`
with `Pointer[UInt8]` on a case-by-case basis is a substantial refactoring
effort, but it will prevent a certain class of logical bugs (see
<https://github.com/modularml/mojo/pull/2098>). As it is a breaking change in
the sense of API design, it is sensible to do the refactoring as soon as
possible.
116 changes: 84 additions & 32 deletions proposals/inferred-parameters.md
@@ -1,6 +1,8 @@
# Inferring Parameters from Other Parameters

A common feature in programming language with generics is the ability to infer the value of generics/templates/parameters from the argument types. Consider C++:
A common feature in programming languages with generics is the ability to infer
the value of generics/templates/parameters from the argument types. Consider
C++:

```cpp
template <typename T>
Expand All @@ -12,7 +14,8 @@ inferMe(x);
inferMe<int>(x);
```

Mojo is a parametric language and also supports this feature in a variety of use cases that make code significantly less verbose:
Mojo is a parametric language and also supports this feature in a variety of use
cases that make code significantly less verbose:

```python
fn infer_me[dt: DType, size: Int](x: SIMD[dt, size]): pass
Expand All @@ -22,53 +25,76 @@ infer_me(Int32())
infer_me[DType.int32, 1](Int32())
```

But Mojo pushes these needs a step further. As a language that encourages heavy parameterization, dependent types are very common throughout the language. Consider:
But Mojo pushes these needs a step further. As a language that encourages heavy
parameterization, dependent types are very common throughout the language.
Consider:

```python
fn higher_order_func[dt: DType, unary: fn(Scalar[dt]) -> Scalar[dt]](): pass

fn scalar_param[dt: DType, x: Scalar[dt]](): pass
```

Language users commonly encounter cases where dependent types could infer their parameter values from other parameters in the same way from argument types. Consider `scalar_param` in the example above: `dt` could be inferred from the type of `x` if `x` were passed as an argument, but we have no syntax to express inferring it from `x` as a parameter since the user is required to pass `dt` as the first parameter.
Language users commonly encounter cases where dependent types could infer their
parameter values from other parameters in the same way as from argument types.
Consider `scalar_param` in the example above: `dt` could be inferred from the
type of `x` if `x` were passed as an argument, but we have no syntax to express
inferring it from `x` as a parameter since the user is required to pass `dt` as
the first parameter.

```python
scalar_param[DType.int32, Int32()]() # 'dt' parameter is required
```

This has been requested multiple times in various forms, especially given the new autoparameterization feature. The current tracking feature request:
This has been requested multiple times in various forms, especially given the
new autoparameterization feature. The current tracking feature request:

- https://github.com/modularml/mojo/issues/1245
- <https://github.com/modularml/mojo/issues/1245>

# Proposal
## Proposal

In the above example, we want to be able to infer `dt` instead of explicitly specifying it:
In the above example, we want to be able to infer `dt` instead of explicitly
specifying it:

```python
scalar_param[Int32()]()
```

Laszlo Kindrat and I proposed several options to remedy this and members of the “Mojo Language Committee” met to discuss these ideas, summarized below.
Laszlo Kindrat and I proposed several options to remedy this and members of the
“Mojo Language Committee” met to discuss these ideas, summarized below.

We decided to move forward with the following option. Mojo will introduce a new keyword, `inferred`, as a specifier for parameters only. `inferred` parameters must precede all non-inferred parameters in the parameter list, and they **cannot** be specified by a caller — they can **only** be inferred from other parameters. This allows us to express:
We decided to move forward with the following option. Mojo will introduce a new
keyword, `inferred`, as a specifier for parameters only. `inferred` parameters
must precede all non-inferred parameters in the parameter list, and they
**cannot** be specified by a caller — they can **only** be inferred from other
parameters. This allows us to express:

```python
fn scalar_param[inferred dt: DType, x: Scalar[dt]](): pass

scalar_param[Int32()]() # 'dt' is skipped and 'Int32()' is bound to 'x'
```

Where `dt` is inferred from `x`. The decision to choose a keyword instead of introducing a new punctuation character [like Python does for keyword-only arguments](https://docs.python.org/3/tutorial/controlflow.html#special-parameters) is because a keyword clearly indicates the intent of the syntax, and it’s easy to explain in documentation and find via internet search.
Where `dt` is inferred from `x`. The decision to choose a keyword instead of
introducing a new punctuation character [like Python does for keyword-only
arguments](https://docs.python.org/3/tutorial/controlflow.html#special-parameters)
is because a keyword clearly indicates the intent of the syntax, and it’s easy
to explain in documentation and find via internet search.
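
The binding rule can be modelled in a few lines of Python. This is only a sketch of the rule as described in this proposal, not of the Mojo compiler: `Param`, `Scalar`, and `bind_positional` are hypothetical names, and the final inference step is hard-coded for the `scalar_param` example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Param:
    name: str
    inferred: bool = False


@dataclass(frozen=True)
class Scalar:
    dtype: str
    value: int


def bind_positional(params, args):
    """Bind caller-supplied values to the non-inferred parameters only."""
    seen_explicit = False
    for p in params:
        # Rule: inferred parameters must precede all non-inferred ones.
        if p.inferred and seen_explicit:
            raise ValueError(f"inferred parameter {p.name!r} must come first")
        seen_explicit = seen_explicit or not p.inferred

    explicit = [p for p in params if not p.inferred]
    # Rule: callers can never specify an inferred parameter themselves.
    if len(args) > len(explicit):
        raise TypeError("too many parameters: inferred ones cannot be passed")
    return {p.name: a for p, a in zip(explicit, args)}


# Model of: fn scalar_param[inferred dt: DType, x: Scalar[dt]]()
params = [Param("dt", inferred=True), Param("x")]

bound = bind_positional(params, [Scalar("int32", 0)])  # 'dt' is skipped
bound["dt"] = bound["x"].dtype                         # 'dt' inferred from 'x'

print(bound["dt"])  # int32
```

In this model, passing two values, as in `scalar_param[DType.int32, Int32()]()`, raises a `TypeError`, mirroring the rule that inferred parameters cannot be specified by the caller.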

# Aside: Inferring from Keyword Parameters
## Aside: Inferring from Keyword Parameters

Related but separate to the proposal, we can enable parameter inference from other parameters using keyword arguments. This allows specifying function (and type) parameters out-of-order, where we can infer parameters left-to-right:
Related but separate to the proposal, we can enable parameter inference from
other parameters using keyword arguments. This allows specifying function (and
type) parameters out-of-order, where we can infer parameters left-to-right:

```python
scalar_param[x=Int32()]() # 'dt' is inferred from 'x'
```

We should absolutely enable this in the language, since this does not work today. However, with respect to the above proposal, in many cases this still ends up being more verbose than one would like, especially if the parameter name is long:
We should absolutely enable this in the language, since this does not work
today. However, with respect to the above proposal, in many cases this still
ends up being more verbose than one would like, especially if the parameter name
is long:

```python
scalar_param[infer_stuff_from_me=Int32()]()
Expand All @@ -79,53 +105,76 @@ scalar_param[Int32()]()

So this feature is orthogonal to the `inferred` parameter proposal.

# Alternatives Considered
## Alternatives Considered

Several alternative ideas were considered for this problem.

## Non-Lexical Parameter Lists
### Non-Lexical Parameter Lists

This solution would alter the name resolution rules inside parameter lists, allowing forward references to parameters within the same list. The above example would be expressed as:
This solution would alter the name resolution rules inside parameter lists,
allowing forward references to parameters within the same list. The above
example would be expressed as:

```python
fn scalar_param[x: Scalar[dt], dt: DType](): pass
```

Where any parameter is inferrable from any previous parameter. The benefits of this approach are that the order of parameters at the callsite match the order in the declaration: `scalar_param[Int32()]()`
Where any parameter is inferrable from any previous parameter. The benefit of
this approach is that the order of parameters at the callsite matches the order
in the declaration: `scalar_param[Int32()]()`

This alternative was rejected because:

1. Non-lexical parameters are potentially confusing to users, who normally expect named declarations to be lexical. Relatedly, we are moving towards removing non-lexical parameters in general from the language.
2. This would incur a huge implementation burden on the compiler, because the type system needs to track the topological order of the parameters.
1. Non-lexical parameters are potentially confusing to users, who normally
expect named declarations to be lexical. Relatedly, we are moving towards
removing non-lexical parameters in general from the language.

## New Special Separator Parameter
2. This would incur a huge implementation burden on the compiler, because the
type system needs to track the topological order of the parameters.

This solution is fundamentally the same as the accepted proposal, but differs only in syntax. Instead of annotating each parameter as `inferred`, they are separated from the rest using a new undecided sigil (`%%%` is a placeholder):
### New Special Separator Parameter

This solution is fundamentally the same as the accepted proposal, but differs
only in syntax. Instead of annotating each parameter as `inferred`, they are
separated from the rest using a new undecided sigil (`%%%` is a placeholder):

```python
fn scalar_param[dt: DType, %%%, x: Scalar[dt]](): pass
```

The benefit of this approach is this matches the [Python syntax](https://docs.python.org/3/tutorial/controlflow.html#special-parameters) for separating position-only and keyword-only parameters. It also structurally guarantees that all infer-only parameters appear at the beginning of the list.
The benefit of this approach is that it matches the [Python
syntax](https://docs.python.org/3/tutorial/controlflow.html#special-parameters)
for separating position-only and keyword-only parameters. It also structurally
guarantees that all infer-only parameters appear at the beginning of the list.

This alternative was rejected because:

1. There was no agreement over the syntax, and any selected sigil would introduce additional noise into the language.
2. `inferred` clearly indicates the intent of the syntax, and can be found via internet search, and is overall easier to explain syntax than introducing a new argument separator.
1. There was no agreement over the syntax, and any selected sigil would
introduce additional noise into the language.

2. `inferred` clearly indicates the intent of the syntax, can be found via
internet search, and is overall easier to explain than introducing a new
argument separator.

## Special Separator Parameter at the End
### Special Separator Parameter at the End

This is a variation on the above, where the infer-only parameters would appear at the end of the parameter list, and subsequent parameters would be allowed to be non-lexical:
This is a variation on the above, where the infer-only parameters would appear
at the end of the parameter list, and subsequent parameters would be allowed to
be non-lexical:

```python
fn scalar_param[x: Scalar[dt], %%%, dt: DType](): pass
```

The benefit of this approach is that the parameters appear in the same position at the callsite. This alternative was rejected for a combination of the reasons for rejecting a new separator and non-lexical parameters.
The benefit of this approach is that the parameters appear in the same position
at the callsite. This alternative was rejected for a combination of the reasons
for rejecting a new separator and non-lexical parameters.

## Segmented Parameter Lists
### Segmented Parameter Lists

This proposal would allow functions to declare more than one parameter list and enable right-to-left inference of the parameter “segments”. The above would be expressed as:
This proposal would allow functions to declare more than one parameter list and
enable right-to-left inference of the parameter “segments”. The above would be
expressed as:

```python
fn scalar_param[dt: DType][x: Scalar[dt]](): pass
Expand All @@ -137,7 +186,10 @@ The callsite would look like
scalar_param[Int32()]()
```

And call resolution would match the specified parameter list to the last parameter list and infer `dt`. This proposal was rejected because
And call resolution would match the specified parameter list to the last
parameter list and infer `dt`. This proposal was rejected because:

1. The right-to-left inference rules are potentially confusing.
2. This is an overkill solution to the problem, because this opens to door to arbitrary higher-order parameterization of functions.

2. This is an overkill solution to the problem, because it opens the door to
arbitrary higher-order parameterization of functions.