Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[mojo-stdlib] Added fn to normalize index calculation #2386

Closed

Conversation

StandinKP
Copy link
Contributor

Fixes #2251

@StandinKP StandinKP requested a review from a team as a code owner April 23, 2024 15:14
@StandinKP StandinKP changed the title Added fn to normalize index calculation [mojo-stdlib] Added fn to normalize index calculation Apr 23, 2024
@always_inline
fn _normalize_index(self, index: Int) -> Int:
if index < 0:
return _max(0, len(self) + index)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need the _max here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be here just to prevent the negative indexes. How would you suggest to do it?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given that index outside [-len(self), len(self)) is out of bound, we can simply skip the _max call.

return index if index >= 0 else index + len(self)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried this but it gives a seg fault. I will try to resolve it but till then I think keeping max would be good

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@JoeLoser @gabrieldemarmiesse can I get your views on this ?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @StandinKP — sorry for the slow reply here. We discussed this the other day during our team meeting and here's my notes:

  • The current code seems wrong because it will silently turn an out of bounds negative value into an index of the 0th element by using _max. But I don’t think the solution is to remove the _max and allow negative indices to segfault (or worse) at runtime. Instead, let's abort on out-of-bound indices.
  • In the future, it would be nice to have one "normalized index" function for use in the library for any contiguous data structure we have in the library such as List, String, etc. Nothing to do for this PR, but just pointing out how we'd like to clean this up over time.
  • In the short (and long-term), I think aborting on out-of-bounds access is a reasonable approach, both here in List and elsewhere.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure. Makes sense. Do you want me to keep an issue filed to add the common "normalized function" and update all references?

@StandinKP StandinKP force-pushed the add_normalized_index_calculation branch from fe381c0 to 6720cad Compare April 30, 2024 05:31
Copy link
Contributor

@abduld abduld left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks

stdlib/src/collections/list.mojo Outdated Show resolved Hide resolved
@ematejska ematejska added the mojo-repo Tag all issues with this label label May 6, 2024
@StandinKP StandinKP force-pushed the add_normalized_index_calculation branch from 6720cad to 57a9113 Compare May 7, 2024 04:27
@StandinKP
Copy link
Contributor Author

@JoeLoser @soraros I am facing this issue right now after rebasing with nightly.

❯ ./stdlib/scripts/run-tests.sh
Creating build directory for building the stdlib running the tests in.
Packaging up the Standard Library.
Included from /Users/kaushalphulgirkar/Documents/Projects/mojo/stdlib/src/collections/__init__.mojo:1:
/Users/kaushalphulgirkar/Documents/Projects/mojo/stdlib/src/collections/list.mojo:573:39: error: invalid initialization: could not deduce parameter 'type' of parent struct 'Reference'
        var normalized_idx = Reference(self)._normalize_index(i)
                             ~~~~~~~~~^~~~~~
/Users/kaushalphulgirkar/Documents/Projects/mojo/stdlib/src/__init__.mojo:1:1: note:  struct declared here
# ===----------------------------------------------------------------------=== #
^
Included from /Users/kaushalphulgirkar/Documents/Projects/mojo/stdlib/src/memory/__init__.mojo:1:
/Users/kaushalphulgirkar/Documents/Projects/mojo/stdlib/src/memory/reference.mojo:222:8: note: function declared here
    fn __init__(inout self, value: Self._mlir_type):
       ^
mojo: error: failed to parse the provided Mojo source module

Not able to figure how to exactly fix this. Any suggestions? Did we change the Reference initializations?

@gabrieldemarmiesse
Copy link
Contributor

gabrieldemarmiesse commented May 7, 2024

Hi @StandinKP, indeed you are right :) there was a change with how Reference works right now, see 0fe4cb7

I think that in your case, you don't need to do Reference(self) since self is already a reference. The error message from the compiler is not very clear. Let's hope it will get better in the future!

@StandinKP
Copy link
Contributor Author

Makes sense. Thanks @gabrieldemarmiesse

@StandinKP
Copy link
Contributor Author

StandinKP commented May 8, 2024

@gabrieldemarmiesse if I use self[]._normalize_index(i) it gives a very nasty error and crashes all the test suite
trying to figure out this but if you have any idea let me know Thanks

Signed-off-by: Kaushal Phulgirkar <kaushalpp01@gmail.com>
Signed-off-by: Kaushal Phulgirkar <kaushalpp01@gmail.com>
@StandinKP StandinKP force-pushed the add_normalized_index_calculation branch from 4c83594 to c3b5a6d Compare May 10, 2024 04:41
@StandinKP
Copy link
Contributor Author

@soraros can you provide some guidance here? Thanks

@gabrieldemarmiesse
Copy link
Contributor

@StandinKP I think we should merge #2677 first and then you can open one pull request per method to change. This will ensure the easy changes will be merged to nightly quickly and the harder changes (I know there are some out of bounds bugs in the tests that you will hit) can be merged afterwards. What do you think?

modularbot pushed a commit that referenced this pull request May 28, 2024
[External] [stdlib] Add the `normalize_index` function

This PR is some kind of mix between
#2386 and
#2384 which have issues (aborting)
or have too many conflicts because the PR is too big.

This PR solves part of #2251 and
#2337

We try here to give the ground work for indexing correctly. This
function added can then be used wherever we work with sequences.

Two things I noticed during development:
1) The `debug_assert` does not run in unit tests. Is there any way to
enable it? We currently have out-of-bounds bugs in our test suite.
2) The null terminator is causing pain, again, again, and again. Do we
have any plans to make it optional when working with String? I opened
#2678 to discuss this.

To avoid to fix those issues in this PR, I used the `normalize_index` on
the `__refitem__` of `InlineArray` which doesn't have widespread use yet
and isn't impacted by out-of-bounds bugs.

My recommendation would be to merge this PR then to rebase
#2386 and
#2384 on it. We should also
afterwards fix the out of bounds bugs that can be triggered in the test
suite by enabling debug_assert.

The diff might seem big, but no worries, it's mostly the licenses and
docstrings :)

Co-authored-by: Gabriel de Marmiesse <gabriel.demarmiesse@datadoghq.com>
Closes #2677
MODULAR_ORIG_COMMIT_REV_ID: 66e7121a6a333c16284eb33a89eb85c034c296c3
@JoeLoser
Copy link
Collaborator

Closing as this is now implemented at

. Thanks for pushing on this @StandinKP and @gabrieldemarmiesse!

@JoeLoser JoeLoser closed this May 29, 2024
modularbot pushed a commit that referenced this pull request Jun 7, 2024
[External] [stdlib] Add the `normalize_index` function

This PR is some kind of mix between
#2386 and
#2384 which have issues (aborting)
or have too many conflicts because the PR is too big.

This PR solves part of #2251 and
#2337

We try here to give the ground work for indexing correctly. This
function added can then be used wherever we work with sequences.

Two things I noticed during development:
1) The `debug_assert` does not run in unit tests. Is there any way to
enable it? We currently have out-of-bounds bugs in our test suite.
2) The null terminator is causing pain, again, again, and again. Do we
have any plans to make it optional when working with String? I opened
#2678 to discuss this.

To avoid to fix those issues in this PR, I used the `normalize_index` on
the `__refitem__` of `InlineArray` which doesn't have widespread use yet
and isn't impacted by out-of-bounds bugs.

My recommendation would be to merge this PR then to rebase
#2386 and
#2384 on it. We should also
afterwards fix the out of bounds bugs that can be triggered in the test
suite by enabling debug_assert.

The diff might seem big, but no worries, it's mostly the licenses and
docstrings :)

Co-authored-by: Gabriel de Marmiesse <gabriel.demarmiesse@datadoghq.com>
Closes #2677
MODULAR_ORIG_COMMIT_REV_ID: 66e7121a6a333c16284eb33a89eb85c034c296c3
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
mojo-repo Tag all issues with this label
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants