Skip to content

Conversation

@martinvuyk
Copy link
Contributor

Add all split overloads with examples. Part of #3528

cc: @ConnorGray

@ConnorGray
Copy link
Collaborator

This looks fantastic and is much easier to keep all the context in my head, making review snappy. Thank you so much for doing that Martin 😄

Other than a question about method placement, this looks ready to go.

@ConnorGray
Copy link
Collaborator

!sync

@modularbot modularbot added the imported-internally Signals that a given pull request has been imported internally. label Feb 25, 2025
@ConnorGray
Copy link
Collaborator

Hmm, it seems this is running into Mojo compiler assertion failures in CI internally.

I'm able to reproduce a crash using $ mojo package ./stdlib/src/ to build this PR. (Curious that the CI here succeeds? That's odd.). The crash I've been seeing looks like this:

1.  Crash resolving decl body at loc("collections/string/string.mojo":1445:8)
    >>     fn split(self, sep: StringSlice, maxsplit: Int) raises -> List[String]:
              ^...................................................................
    >>         """Split the string by a separator.
    >> 
    >>         Args:
    >>             sep: The string to split on.
2.  Crash parsing statement at loc("collections/string/string.mojo":1466:9)
    >>         return _to_string_list(self.as_string_slice().split(sep, maxsplit))
               ^..................................................................<

Are you able to call these new split() functions in any example programs locally, without running into a compiler crash?

I think this is running into the same parameter inference problem I ran into when I recently did some work on StringSlice.split(), which led me to use the explicit parameters you see in the signature here:

    fn split[
        sep_mut: Bool,
        sep_origin: Origin[sep_mut], //,
    ](
        self,
        sep: StringSlice[sep_origin],
        maxsplit: Int = -1,
    ) raises -> List[
        String
    ]:

I notice that you still change StringSlice.split() to return a List[StringSlice] (spelled as List[Self]) in this PR — I'm fairly certain that's the cause of the crash.

Originally I thought this PR was writing out each of these overloads as a way you found to workaround that crash, but it appears that may not have been the goal?

My best guess is that this issue occurs when something goes wrong inferring origin parameters when StringSlice appears in both as the sep: StringSlice argument and in the return List[StringSlice[..]], but I haven't narrowed down a minimal repro myself.

@martinvuyk martinvuyk requested a review from a team as a code owner February 26, 2025 22:06
@martinvuyk
Copy link
Contributor Author

martinvuyk commented Feb 26, 2025

@ConnorGray I can't repro the crash

I notice that you still change StringSlice.split() to return a List[StringSlice] (spelled as List[Self]) in this PR — I'm fairly certain that's the cause of the crash.

AFAIK the Self type has the origin bound defined and is not the same as some_argument: StringSlice

Originally I thought this PR was writing out each of these overloads as a way you found to workaround that crash, but it appears that may not have been the goal?

I didn't experience any crash 🤷‍♂️ . The List[Self] was mainly because I found List[StringSlice[origin]] verbose and I'm lazy

Could you test it with reverting back to parametrizing the origin?

@martinvuyk martinvuyk requested a review from a team as a code owner March 11, 2025 16:09
@martinvuyk
Copy link
Contributor Author

@ConnorGray gentle ping and also to ask whether the crash is still there, and if I can help in any way to repro/fix it

@JoeLoser
Copy link
Collaborator

!sync

@JoeLoser
Copy link
Collaborator

This is still crashing parameter inference in the compiler internally. Perhaps we can help it out by explicitly specifying origins etc here.

@soraros

This comment was marked as resolved.

@martinvuyk
Copy link
Contributor Author

This is quite unfortunate. Should we add a TODO to indicate future cleanups?

Probably, but I'm not even sure what is causing the errors or how to replicate them to open an issue

@martinvuyk
Copy link
Contributor Author

martinvuyk commented Mar 19, 2025

@JoeLoser @ConnorGray I think I found the issue, could any of you test it?

fn _to_string_list[
    O: ImmutableOrigin, //
](items: List[StringSlice[O]]) -> List[String]:

Changed it to O: Origin

@JoeLoser
Copy link
Collaborator

!sync

@JoeLoser JoeLoser force-pushed the add-split-overloads branch from 99e4f97 to d817560 Compare March 26, 2025 16:18
@JoeLoser
Copy link
Collaborator

!sync

@martinvuyk martinvuyk force-pushed the add-split-overloads branch 3 times, most recently from 6667307 to ad47d94 Compare April 8, 2025 19:38
@martinvuyk
Copy link
Contributor Author

@JoeLoser I think I found the issue, when a StringSlice[MutableOrigin] is passed to split() and it returned a List[StringSlice[MutableOrigin]] it allowed writing to an already referenced memory location. So I just made it so that it returns a List[StringSlice[ImmutableOrigin]] regardless of the initial origin's mutability.

@martinvuyk martinvuyk force-pushed the add-split-overloads branch from ad47d94 to 4494db1 Compare April 8, 2025 23:19
@martinvuyk martinvuyk force-pushed the add-split-overloads branch 2 times, most recently from 73984db to 28a6b49 Compare April 21, 2025 17:00
@martinvuyk
Copy link
Contributor Author

@JoeLoser this PR is ready for testing as well

@martinvuyk martinvuyk force-pushed the add-split-overloads branch 2 times, most recently from 577702a to e7cef57 Compare June 21, 2025 17:52
@martinvuyk martinvuyk force-pushed the add-split-overloads branch from e7cef57 to 9b77f79 Compare July 10, 2025 12:10
@ConnorGray
Copy link
Collaborator

!sync

@martinvuyk martinvuyk force-pushed the add-split-overloads branch 2 times, most recently from 34c8962 to d2ca636 Compare July 29, 2025 12:43
@ConnorGray
Copy link
Collaborator

!sync

@martinvuyk martinvuyk force-pushed the add-split-overloads branch from d2ca636 to 8010318 Compare July 30, 2025 12:12
Signed-off-by: martinvuyk <martin.vuyklop@gmail.com>
@martinvuyk martinvuyk force-pushed the add-split-overloads branch from 8010318 to f875b86 Compare July 30, 2025 15:21
@ConnorGray
Copy link
Collaborator

!sync

@modularbot
Copy link
Collaborator

✅🟣 This contribution has been merged 🟣✅

Your pull request has been merged to the internal upstream Mojo sources. It will be reflected here in the Mojo repository on the main branch during the next Mojo nightly release, typically within the next 24-48 hours.

We use Copybara to merge external contributions, click here to learn more.

@modularbot modularbot added the merged-internally Indicates that this pull request has been merged internally label Jul 31, 2025
@modularbot
Copy link
Collaborator

Landed in 4e1c074! Thank you for your contribution 🎉

@modularbot modularbot added the merged-externally Merged externally in public mojo repo label Aug 1, 2025
@modularbot modularbot closed this in 4e1c074 Aug 1, 2025
@github-actions github-actions bot locked and limited conversation to collaborators Aug 1, 2025
@martinvuyk martinvuyk deleted the add-split-overloads branch August 1, 2025 19:32
SasankYadati pushed a commit to SasankYadati/modular that referenced this pull request Aug 5, 2025
[External] [stdlib] Add split overloads

Add all split overloads with examples. Part of modular#3528

ORIGINAL_AUTHOR=martinvuyk
<110240700+martinvuyk@users.noreply.github.com>
PUBLIC_PR_LINK=modular#4015

Co-authored-by: martinvuyk <110240700+martinvuyk@users.noreply.github.com>
Closes modular#4015
MODULAR_ORIG_COMMIT_REV_ID: 870f9e2d80e3a9cc657540af44932fd66da8ea22
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

imported-internally Signals that a given pull request has been imported internally. merged-externally Merged externally in public mojo repo merged-internally Indicates that this pull request has been merged internally

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants