Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Feature Request] Convert the stdlib bytes from List[Int8] to List[UInt8] #2317

Closed
1 task done
gabrieldemarmiesse opened this issue Apr 17, 2024 · 6 comments
Closed
1 task done
Labels
enhancement New feature or request mojo-repo Tag all issues with this label q2-2024

Comments

@gabrieldemarmiesse
Copy link
Contributor

gabrieldemarmiesse commented Apr 17, 2024

Review Mojo's priorities

What is your request?

This issue is a follow up to track the progress on the implementation of the proposal #2099 . @JoeLoser mentioned that Modular doesn't have the bandwidth internally to make the change in the stdlib, it's up to the community to do it. See #2099 (comment)

We should convert all bytes usage from Int8 to UInt8. Here is some reference:

git grep 'List\[Int8\]' | cat - 
stdlib/src/base64/base64.mojo:    var out = List[Int8](capacity=length + 1)
stdlib/src/builtin/file.mojo:    fn read_bytes(self, size: Int64 = -1) raises -> List[Int8]:
stdlib/src/builtin/file.mojo:        var list = List[Int8](capacity=int(size_copy))
stdlib/src/builtin/hex.mojo:    var buf = List[Int8]()
stdlib/src/builtin/hex.mojo:    inout fmt: List[Int8],
stdlib/src/builtin/string.mojo:    alias _buffer_type = List[Int8]
stdlib/src/builtin/string.mojo:        var buf = List[Int8]()
stdlib/src/builtin/string.mojo:    fn as_bytes(self) -> List[Int8]:
stdlib/src/builtin/string.mojo:        var res = List[Int8]()
stdlib/src/builtin/string.mojo:        var res = List[Int8]()
stdlib/src/pathlib/path.mojo:    fn read_bytes(self) raises -> List[Int8]:
stdlib/src/utils/inlined_string.mojo:            var buffer = List[Int8](capacity=total_len)
stdlib/test/collections/test_list.mojo:    var some_list = List[Int8](new_pointer, size=3, capacity=5)
stdlib/test/collections/test_list.mojo:    initial_list = List[Int8](0, 1, 2)
stdlib/test/collections/test_list.mojo:    var some_list = List[Int8](
stdlib/test/utils/issue_13632.mojo:fn sum_items(data: List[Int8]) -> Int:
stdlib/test/utils/issue_13632.mojo:fn make_abcd_vector() -> List[Int8]:
stdlib/test/utils/issue_13632.mojo:    return List[Int8](97, 98, 99, 100)

It's possible that some of those usages don't need to be converted or that there are more, but it gives an overview of what to do. Don't hesitate to add unit tests if you're scared of breaking something.

What is your motivation for this change?

The proposal was accepted.

Any other details?

No response

@gabrieldemarmiesse gabrieldemarmiesse added enhancement New feature or request mojo Issues that are related to mojo labels Apr 17, 2024
@ematejska ematejska added the mojo-stdlib Tag for issues related to standard library label Apr 18, 2024
JoeLoser pushed a commit that referenced this issue Apr 21, 2024
)

Related to #2317

I think we can do this safely by making PRs for one struct at a time.
Here is the change for the `Error` struct. I believe that as we progress
on the migration, the `bitcast` will go away.

Signed-off-by: gabrieldemarmiesse <gabrieldemarmiesse@gmail.com>
@StandinKP
Copy link

Can we close this if this is merged?

@gabrieldemarmiesse
Copy link
Contributor Author

I only did a small part of the work listed here, so I don't think we can close this issue yet

@StandinKP
Copy link

Oh please let me know if I can help or any subtask I can pick up

@gabrieldemarmiesse
Copy link
Contributor Author

There are a few things to do, but we might want to wait a few days that the number of PRs opened go down a bit. With so many PRs opened, that might lead to git conflicts later on. It would be a pain for all the contributors and maintainers to fix those conflicts

@JoeLoser
Copy link
Collaborator

Changing

fn read_bytes(self, size: Int64 = -1) raises -> List[Int8]:
read_bytes to return List[UInt8] seems like a nice thing to pick off next in the list towards this goal.

@StandinKP
Copy link

Sounds good. I'll pick this up @JoeLoser

@linear linear bot removed mojo Issues that are related to mojo mojo-stdlib Tag for issues related to standard library labels Apr 29, 2024
@ematejska ematejska added the mojo-repo Tag all issues with this label label Apr 29, 2024
patrickdoc added a commit that referenced this issue May 2, 2024
…essage (#2… (#38373)

…318)

Related to #2317

I think we can do this safely by making PRs for one struct at a time.
Here is the change for the `Error` struct. I believe that as we progress
on the migration, the `bitcast` will go away.

Signed-off-by: gabrieldemarmiesse <gabrieldemarmiesse@gmail.com>

mojo-orig-commit: 67d406a

Co-authored-by: Gabriel de Marmiesse <gabrieldemarmiesse@gmail.com>
MODULAR_ORIG_COMMIT_REV_ID: 260cec3712af85d41d0904925f6c064facec452a
JoeLoser pushed a commit that referenced this issue May 3, 2024
…nd list too (#38859)

[External] [stdlib] Allow `String()` to be created from `UInt8` ptr and
list too

This is related to #2317.

It allows a graceful and slow migration to UInt8 as the primary byte
type. This allows everyone to give whatever (Int8 or UInt8) to the
String struct and it just works™

When everything uses Uint8, we can remove the old methods that work with
Int8, and the migration will be complete.

Co-authored-by: gabrieldemarmiesse <gabrieldemarmiesse@gmail.com>
Closes #2448
MODULAR_ORIG_COMMIT_REV_ID: d12649d6dfa178eb756d56df563ad155b5240093
JoeLoser pushed a commit that referenced this issue May 3, 2024
…8 (#38989)

[External] [stdlib] Allow `StringRef` to work with both Int8 and UInt8

This is related to #2317

This is similar to #2448 but for
StringRef.

With this, users and maintainers can work with both Int8 and UInt8 and
will allow a graceful transition. This should also allow users to switch
during the next release.

Co-authored-by: gabrieldemarmiesse <gabrieldemarmiesse@gmail.com>
Closes #2454
MODULAR_ORIG_COMMIT_REV_ID: 4ef46b217579ad62ca22defc2d8a31930ef37c5a
@rparolin rparolin added the q2-2024 label May 16, 2024 — with Linear
lsh pushed a commit to lsh/mojo that referenced this issue May 17, 2024
…nd list too (#38859)

[External] [stdlib] Allow `String()` to be created from `UInt8` ptr and
list too

This is related to modularml#2317.

It allows a graceful and slow migration to UInt8 as the primary byte
type. This allows everyone to give whatever (Int8 or UInt8) to the
String struct and it just works™

When everything uses Uint8, we can remove the old methods that work with
Int8, and the migration will be complete.

Co-authored-by: gabrieldemarmiesse <gabrieldemarmiesse@gmail.com>
Closes modularml#2448
MODULAR_ORIG_COMMIT_REV_ID: d12649d6dfa178eb756d56df563ad155b5240093

Signed-off-by: Lukas Hermann <lukashermann28@gmail.com>
lsh pushed a commit to lsh/mojo that referenced this issue May 17, 2024
…8 (#38989)

[External] [stdlib] Allow `StringRef` to work with both Int8 and UInt8

This is related to modularml#2317

This is similar to modularml#2448 but for
StringRef.

With this, users and maintainers can work with both Int8 and UInt8 and
will allow a graceful transition. This should also allow users to switch
during the next release.

Co-authored-by: gabrieldemarmiesse <gabrieldemarmiesse@gmail.com>
Closes modularml#2454
MODULAR_ORIG_COMMIT_REV_ID: 4ef46b217579ad62ca22defc2d8a31930ef37c5a

Signed-off-by: Lukas Hermann <lukashermann28@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request mojo-repo Tag all issues with this label q2-2024
Projects
None yet
Development

No branches or pull requests

5 participants