Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[stdlib] String comparisons implemented #2409

Closed
wants to merge 13 commits into from
Closed

[stdlib] String comparisons implemented #2409

wants to merge 13 commits into from

Conversation

siitron
Copy link
Contributor

@siitron siitron commented Apr 25, 2024

For issue #2346 (as an alternative to #2378). All four comparisons (__lt__, __le__, __gt__, & __ge__) uses a single __lt__ comparison (instead of checking less/greater than + potentially another "equals to"-check, for __le__ & __ge__). Sorry if this is considered a duplicate PR, I only meant to give an alternative suggestion. This is my first ever PR on GitHub.

StringLiterals also get comparisons.

@siitron siitron requested a review from a team as a code owner April 25, 2024 17:23
@siitron siitron closed this Apr 25, 2024
@siitron siitron deleted the nightly branch April 25, 2024 17:36
@siitron siitron restored the nightly branch April 25, 2024 17:57
@VMois
Copy link

VMois commented Apr 27, 2024

Are there any reasons to close a PR? This looks like quite a nice implementation.

@siitron siitron reopened this Apr 28, 2024
@siitron
Copy link
Contributor Author

siitron commented Apr 28, 2024

@VMois Thank you! I closed this PR before because I couldn't pass the tests (because I didn't know that the math package wasn't supported to use because of cyclic dependencies). But now I've replaced min with if/else-statements and that nicely resulted in fewer comparisons.

I can't complete ./stdlib/scripts/run-tests.sh locally due to some issues I have with WSL, but the current version should be fine now.

… added tests.

Signed-off-by: Simon Hellsten <simonhellsten@hotmail.com>
Signed-off-by: Simon Hellsten <simonhellsten@hotmail.com>
…n_operators() for String.

Signed-off-by: Simon Hellsten <simonhellsten@hotmail.com>
Signed-off-by: Simon Hellsten <simonhellsten@hotmail.com>
…al (+tests)

Signed-off-by: Simon Hellsten <simonhellsten@hotmail.com>
@siitron siitron requested a review from a team as a code owner April 29, 2024 19:19
siitron and others added 2 commits April 29, 2024 21:32
…g_literal.mojo with if/else-statements. Also fixed the string comparison tests to make them look better when formatted correctly.

Signed-off-by: Simon Hellsten <simonhellsten@hotmail.com>
var len1 = len(self)
var len2 = len(rhs)

if len1 < len2:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestion: you can consolidate the duplication between this and the implementation in String by implementing the comparison on StringRef

Copy link
Contributor Author

@siitron siitron Apr 30, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you mean something like return StringRef(self) < StringRef(rhs) as __lt__ for StringLiteral (and return self._strref_dangerous() < rhs._strref_dangerous() for String), etc.?

I didn't even consider StringRef (forgot it existed), and I tried to match __lt__ with their respective __eq__-methods. Right now String, StringLiteral, and StringRef does __eq__ in three separate ways, so I thought there was some reason for keeping them apart ATM(?). Would it still be safe to use memory.memcmp in the StringRef __lt__ implementation (considering it currently looks like StringLiteral has to use it's own local _memcmp, and StringRef's __eq__ also avoids it), or should StringRef's __lt__ be done like its __eq__?

What do you think of:

    @always_inline("nodebug")
    fn __lt__(self, rhs: StringRef) -> Bool:
        """Compare this StringRef to the RHS using LT comparison.

        Args:
            rhs: The other StringRef to compare against.

        Returns:
            True if this StringRef is strictly less than the RHS StringRef and False otherwise.
        """
        var len1 = len(self)
        var len2 = len(rhs)

        if len1 < len2:
            return memcmp(self.data, rhs.data, len1) <= 0
        else:
            return memcmp(self.data, rhs.data, len2) < 0

@JoeLoser JoeLoser requested review from abduld and a team and removed request for gabrieldemarmiesse May 5, 2024 00:37
@ematejska ematejska added the mojo-repo Tag all issues with this label label May 6, 2024
siitron and others added 4 commits May 7, 2024 13:55
…iteral's __lt__-methods. Also updated the tests to use the dunder methods directly (instead of their operators).

Signed-off-by: Simon Hellsten <simonhellsten@hotmail.com>
@laszlokindrat laszlokindrat self-assigned this May 10, 2024
Copy link
Contributor

@laszlokindrat laszlokindrat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is great, thank you! Do you think you could do a followup where you move the core logic to StringRef and delegate to that from both String and StringLiteral?

@modularbot
Copy link
Collaborator

✅🟣 This contribution has been merged 🟣✅

Your pull request has been merged to the internal upstream Mojo sources. It will be reflected here in the Mojo repository on the nightly branch during the next Mojo nightly release, typically within the next 24-48 hours.

We use Copybara to merge external contributions, click here to learn more.

@modularbot modularbot added the merged-internally Indicates that this pull request has been merged internally label May 10, 2024
@siitron
Copy link
Contributor Author

siitron commented May 10, 2024

This is great, thank you! Do you think you could do a followup where you move the core logic to StringRef and delegate to that from both String and StringLiteral?

Sure np, I'll create a new PR for the StringRef stuff!

Some thoughts (that I'll probably repeat in the PR):

@laszlokindrat Does this mean that StringLiteral no longer needs its private _memcmp-function (assuming __eq__ and __ne__ are also delegated to StringRef), i.e. StringLiteral (and StringRef) are ok to use memory.memcmp now? If yes, then I can make the 18 comparison methods all just use __eq__ and __lt__ in StringRef. I'm asking since StringLiteral's private _memcmp function has this comment:

# Use a local memcmp rather than memory.memcpy to avoid #31139 and #25100.

(I can't access "#31139 and #25100", I'm guessing its some internal stuff).

Also, for String to be delegated to StringRef I'd have to use String's _strref_dangerous-method which I was a bit hesitant to do

JoeLoser pushed a commit that referenced this pull request May 11, 2024
[External] [stdlib] String comparisons implemented

For issue #2346 (as an alternative to #2378). All four comparisons
(`__lt__`, `__le__`, `__gt__`, & `__ge__`) uses a single `__lt__`
comparison (instead of checking less/greater than + potentially another
"equals to"-check, for `__le__` & `__ge__`). Sorry if this is considered
a duplicate PR, I only meant to give an alternative suggestion. This is
my first ever PR on GitHub.

StringLiterals also get comparisons.

ORIGINAL_AUTHOR=Simon Hellsten
<56205346+siitron@users.noreply.github.com>
PUBLIC_PR_LINK=#2409

---------

Co-authored-by: Simon Hellsten <56205346+siitron@users.noreply.github.com>
Closes #2409
MODULAR_ORIG_COMMIT_REV_ID: b2ed4756c2741fd27387fa295515f4a7222e0ca5
@JoeLoser
Copy link
Collaborator

Landed in today's nightly: #2615. Thanks for the contribution!

@JoeLoser JoeLoser closed this May 11, 2024
@JoeLoser JoeLoser added merged-externally Merged externally in public mojo repo imported-internally Signals that a given pull request has been imported internally. labels May 11, 2024
@laszlokindrat
Copy link
Contributor

@laszlokindrat Does this mean that StringLiteral no longer needs its private _memcmp-function (assuming __eq__ and __ne__ are also delegated to StringRef), i.e. StringLiteral (and StringRef) are ok to use memory.memcmp now? If yes, then I can make the 18 comparison methods all just use __eq__ and __lt__ in StringRef. I'm asking since StringLiteral's private _memcmp function has this comment:

Use a local memcmp rather than memory.memcpy to avoid #31139 and #25100.

(I can't access "#31139 and #25100", I'm guessing its some internal stuff).

Good question. If you don't mind, would you just give it a go, and let's see what breaks? If things don't work, I will take a look at those tickets.

Also, for String to be delegated to StringRef I'd have to use String's _strref_dangerous-method which I was a bit hesitant to do

I think this should be okay, since the lifetimes of self should be guaranteed to outlive method calls, so you can take a dangerous reference in these method implementations.

@siitron
Copy link
Contributor Author

siitron commented May 13, 2024

If you don't mind, would you just give it a go, and let's see what breaks? If things don't work, I will take a look at those tickets.

@laszlokindrat You can take a look at #2620 if you'd like :)

lsh pushed a commit to lsh/mojo that referenced this pull request May 17, 2024
[External] [stdlib] String comparisons implemented

For issue modularml#2346 (as an alternative to modularml#2378). All four comparisons
(`__lt__`, `__le__`, `__gt__`, & `__ge__`) uses a single `__lt__`
comparison (instead of checking less/greater than + potentially another
"equals to"-check, for `__le__` & `__ge__`). Sorry if this is considered
a duplicate PR, I only meant to give an alternative suggestion. This is
my first ever PR on GitHub.

StringLiterals also get comparisons.

ORIGINAL_AUTHOR=Simon Hellsten
<56205346+siitron@users.noreply.github.com>
PUBLIC_PR_LINK=modularml#2409

---------

Co-authored-by: Simon Hellsten <56205346+siitron@users.noreply.github.com>
Closes modularml#2409
MODULAR_ORIG_COMMIT_REV_ID: b2ed4756c2741fd27387fa295515f4a7222e0ca5

Signed-off-by: Lukas Hermann <lukashermann28@gmail.com>
modularbot added a commit that referenced this pull request May 17, 2024
[External] [stdlib] Delegate string comparisons to `StringRef`

This is a follow-up to #2409.

String comparisons for `StringRef` are implemented. `StringRef` make use
of `memory.memcmp` for all of its 6 comparisons now, hopefully this
change is ok.

`String`'s and `StringLiteral`'s comparisons (`__eq__`, `__ne__`,
`__lt__`, `__le__`, `__gt__`, & `__ge__`) are delegated to `StringRef`.
As a result:
1. `String` comparisons uses its `_strref_dangerous`-method. And,
2. `StringLiteral`'s local `_memcmp`-function is removed (which was used
"rather than memory.memcpy to avoid #31139 and #25100").

The base logic for all 18 comparisons are implemented in `StringRef`'s
`__ne__` and `__lt__` methods. All other comparisons uses some
variation/negation of either `__ne__` or `__lt__`.

ORIGINAL_AUTHOR=Simon Hellsten
<56205346+siitron@users.noreply.github.com>
PUBLIC_PR_LINK=#2620

---------

Co-authored-by: Simon Hellsten <56205346+siitron@users.noreply.github.com>
Closes #2620
MODULAR_ORIG_COMMIT_REV_ID: ed03ee44f06caed7cad77c9c909e83d3c2c2d3cc
msaelices pushed a commit to msaelices/mojo that referenced this pull request May 18, 2024
[External] [stdlib] Delegate string comparisons to `StringRef`

This is a follow-up to modularml#2409.

String comparisons for `StringRef` are implemented. `StringRef` make use
of `memory.memcmp` for all of its 6 comparisons now, hopefully this
change is ok.

`String`'s and `StringLiteral`'s comparisons (`__eq__`, `__ne__`,
`__lt__`, `__le__`, `__gt__`, & `__ge__`) are delegated to `StringRef`.
As a result:
1. `String` comparisons uses its `_strref_dangerous`-method. And,
2. `StringLiteral`'s local `_memcmp`-function is removed (which was used
"rather than memory.memcpy to avoid #31139 and #25100").

The base logic for all 18 comparisons are implemented in `StringRef`'s
`__ne__` and `__lt__` methods. All other comparisons uses some
variation/negation of either `__ne__` or `__lt__`.

ORIGINAL_AUTHOR=Simon Hellsten
<56205346+siitron@users.noreply.github.com>
PUBLIC_PR_LINK=modularml#2620

---------

Co-authored-by: Simon Hellsten <56205346+siitron@users.noreply.github.com>
Closes modularml#2620
MODULAR_ORIG_COMMIT_REV_ID: ed03ee44f06caed7cad77c9c909e83d3c2c2d3cc
martinvuyk pushed a commit to martinvuyk/mojo that referenced this pull request May 24, 2024
[External] [stdlib] Delegate string comparisons to `StringRef`

This is a follow-up to modularml#2409.

String comparisons for `StringRef` are implemented. `StringRef` make use
of `memory.memcmp` for all of its 6 comparisons now, hopefully this
change is ok.

`String`'s and `StringLiteral`'s comparisons (`__eq__`, `__ne__`,
`__lt__`, `__le__`, `__gt__`, & `__ge__`) are delegated to `StringRef`.
As a result:
1. `String` comparisons uses its `_strref_dangerous`-method. And,
2. `StringLiteral`'s local `_memcmp`-function is removed (which was used
"rather than memory.memcpy to avoid #31139 and #25100").

The base logic for all 18 comparisons are implemented in `StringRef`'s
`__ne__` and `__lt__` methods. All other comparisons uses some
variation/negation of either `__ne__` or `__lt__`.

ORIGINAL_AUTHOR=Simon Hellsten
<56205346+siitron@users.noreply.github.com>
PUBLIC_PR_LINK=modularml#2620

---------

Co-authored-by: Simon Hellsten <56205346+siitron@users.noreply.github.com>
Closes modularml#2620
MODULAR_ORIG_COMMIT_REV_ID: ed03ee44f06caed7cad77c9c909e83d3c2c2d3cc
modularbot added a commit that referenced this pull request Jun 7, 2024
[External] [stdlib] String comparisons implemented

For issue #2346 (as an alternative to #2378). All four comparisons
(`__lt__`, `__le__`, `__gt__`, & `__ge__`) uses a single `__lt__`
comparison (instead of checking less/greater than + potentially another
"equals to"-check, for `__le__` & `__ge__`). Sorry if this is considered
a duplicate PR, I only meant to give an alternative suggestion. This is
my first ever PR on GitHub.

StringLiterals also get comparisons.

ORIGINAL_AUTHOR=Simon Hellsten
<56205346+siitron@users.noreply.github.com>
PUBLIC_PR_LINK=#2409

---------

Co-authored-by: Simon Hellsten <56205346+siitron@users.noreply.github.com>
Closes #2409
MODULAR_ORIG_COMMIT_REV_ID: b2ed4756c2741fd27387fa295515f4a7222e0ca5
modularbot added a commit that referenced this pull request Jun 7, 2024
[External] [stdlib] Delegate string comparisons to `StringRef`

This is a follow-up to #2409.

String comparisons for `StringRef` are implemented. `StringRef` make use
of `memory.memcmp` for all of its 6 comparisons now, hopefully this
change is ok.

`String`'s and `StringLiteral`'s comparisons (`__eq__`, `__ne__`,
`__lt__`, `__le__`, `__gt__`, & `__ge__`) are delegated to `StringRef`.
As a result:
1. `String` comparisons uses its `_strref_dangerous`-method. And,
2. `StringLiteral`'s local `_memcmp`-function is removed (which was used
"rather than memory.memcpy to avoid #31139 and #25100").

The base logic for all 18 comparisons are implemented in `StringRef`'s
`__ne__` and `__lt__` methods. All other comparisons uses some
variation/negation of either `__ne__` or `__lt__`.

ORIGINAL_AUTHOR=Simon Hellsten
<56205346+siitron@users.noreply.github.com>
PUBLIC_PR_LINK=#2620

---------

Co-authored-by: Simon Hellsten <56205346+siitron@users.noreply.github.com>
Closes #2620
MODULAR_ORIG_COMMIT_REV_ID: ed03ee44f06caed7cad77c9c909e83d3c2c2d3cc
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
imported-internally Signals that a given pull request has been imported internally. merged-externally Merged externally in public mojo repo merged-internally Indicates that this pull request has been merged internally mojo-repo Tag all issues with this label
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants