Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add fast path when extraction leads to intnat #90

Merged
merged 4 commits into from
Jan 2, 2023

Conversation

recoules
Copy link
Contributor

@recoules recoules commented Nov 9, 2020

Proposition for #88.
At least, it passes the make tests.

caml_z.c Outdated
@@ -934,6 +934,30 @@ CAMLprim value ml_z_extract(value arg, value off, value len)
if (x >= 0) return Val_long(x);
/* If x < 0, fall through slow path */
}
} else if (l <= Z_BASE2_LENGTH_OP) {
Z_ARG(arg);
c1 = o / Z_LIMB_BITS;
Copy link
Collaborator

@pascal-cuoq pascal-cuoq Nov 11, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If anyone else reads this and wonders whether it wouldn't be better to convert o to an unsigned integer type before computing / and %, the answer is, “with GCC and Clang at -O, yes, although not at -O2. See https://gcc.godbolt.org/z/Ya4vxW

I would recommend casting o to uintnat before these operations.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that I simply reused what I found below: this part is the same as the slow path. So if needed, it should be changed in both places.
Yet I would have directly used corresponding bitwise operation >> and & but I though that / and % were still clearer and that the code would be compiled with optimisation so we would have not to bother with this kind of subtlety.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new version should address this issue.

@recoules
Copy link
Contributor Author

@pascal-cuoq @antoinemine
An update for #88 in relation to #79.

@recoules
Copy link
Contributor Author

@antoinemine @xavierleroy
As it moves in the same direction than #79, what is yours though about it?

@xavierleroy
Copy link
Contributor

Yes, it's nice to handle the "small" case in a separate, "noalloc" C function.

However, I haven't reviewed the bit-twiddling that implements this "small" case.

caml_z.c Outdated
if (csz > 0) {
if (c2) {
x = ptr_arg[c1] >> c2;
if ((o + l > (intnat)Z_LIMB_BITS) && (csz > 1))
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After careful reading, I think that the test should be ((c2 + l > (intnat)Z_LIMB_BITS) && (csz > 1)) (c2 instead of o).

@antoinemine
Copy link
Collaborator

I've checked and run a few tests. It seems OK to me. Thanks for the contribution.
I'm ready to merge, unless there is more to it.

It should close #88, but also #103 (which could now be implemented efficiently in pure OCaml on top of Zarith, but I feel belongs to a separate library).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants