Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

py/objint.c: Add support for int.bit_length(). #11679

Open
wants to merge 5 commits into
base: master
Choose a base branch
from

Conversation

dmazzella
Copy link
Contributor

As requested into: #4065

add support for int.bit_length()

@dmazzella dmazzella changed the title py/objint.c: Add support for int.bit_length py/objint.c: Add support for int.bit_length(). Jun 1, 2023
@github-actions
Copy link

github-actions bot commented Jun 1, 2023

Code size report:

   bare-arm:    +0 +0.000% 
minimal x86:    +0 +0.000% 
   unix x64:    +0 +0.000% standard
      stm32:    +0 +0.000% PYBV10
        rp2:    +0 +0.000% PICO

@codecov
Copy link

codecov bot commented Jun 1, 2023

Codecov Report

Merging #11679 (9dd4f8f) into master (5159304) will increase coverage by 0.00%.
The diff coverage is 100.00%.

❗ Current head 9dd4f8f differs from pull request most recent head d6ee656. Consider uploading reports for the commit d6ee656 to get more accurate results

@@           Coverage Diff           @@
##           master   #11679   +/-   ##
=======================================
  Coverage   98.40%   98.40%           
=======================================
  Files         156      156           
  Lines       20609    20628   +19     
=======================================
+ Hits        20281    20300   +19     
  Misses        328      328           
Impacted Files Coverage Δ
py/mpz.h 100.00% <ø> (ø)
py/mpz.c 100.00% <100.00%> (ø)
py/objint.c 100.00% <100.00%> (ø)
py/objint_mpz.c 100.00% <100.00%> (ø)

Signed-off-by: Damiano Mazzella <damianomazzella@gmail.com>
@massimosala
Copy link

massimosala commented Jun 5, 2023

Hi

Wondering about the possible use-cases, and the impact on firmware size,
I propose to have bit_length and bit_count working only on native integers, not on mpn/mpz.

Rationale:

  • many cpus have specific instructions for these two operations, on native integers
  • the routines working only on native integers are smaller and easier to mantain
  • the code path is more efficient, it hasn't to fork based on the object types.

With my proposal, I think we can have the two features, bit_length and bit_count, which require less memory than just the bit_length proposal above.

IMHO these scaled down implementations will cover 99% of their real usage.

Addendum: if for special cases (?) you need to use "big integers", you can code the two routines in python and use @micropython.native
In the docs for bit_length and bit_count, we can show the python code for "big integers".

@dpgeorge, what do you think?

@robert-hh
Copy link
Contributor

I propose to have bit_length and bit_count working only on native integers, not on mpn/mpz.

A user cannot tell which internal representation is used for an integer, and Python code must just work irrespective of the internal representation of a number. With your proposal there is the risk that bit_length() and bit_count() sometimes work and sometimes not. That is hardly acceptable.

Copy link
Member

@jimmo jimmo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW, this PR is +128 bytes on PYBV11.

I have implemented my suggestions in the review comments here: master...jimmo:micropython:int-bit-length

It comes out slightly smaller at +108 bytes.

I also tried a version that used builtin intrinsic bit operations (specifically CLZ). This didn't help with code size, and also because we still need to provide fallbacks it made the code a lot more complicated.

@@ -42,3 +42,8 @@
#define MICROPY_TRACKED_ALLOC (1)
#define MICROPY_WARNINGS_CATEGORY (1)
#define MICROPY_PY_UCRYPTOLIB_CTR (1)

// Enable int.bit_length(n)
#ifndef MICROPY_INT_BIT_LENGTH
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needs to be defined (and defaulted) in mpconfig.h.

In general rather than enabling things in coverage, just make them MICROPY_CONFIG_ROM_LEVEL_AT_LEAST_EVERYTHING in mpconfig, which will be enabled by coverage.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also for consistency I think this should be called MICROPY_PY_BUILTINS_INT_BIT_LENGTH.

return 0;
}

mpz_t *dest = mpz_clone(n);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of copying the mpz into a new integer and modifying it, you can just count the digits in the mpz.

#if MICROPY_INT_BIT_LENGTH
STATIC mp_obj_t int_bit_length(size_t n_args, const mp_obj_t *args) {
(void)n_args;
#if MICROPY_LONGINT_IMPL == MICROPY_LONGINT_IMPL_MPZ
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This unnecessarily creates an mpz if the input is a small int.

Instead, these functions should handle small ints separately, and only send them off to mpz if necessary.

#else
mp_uint_t dest = MP_OBJ_SMALL_INT_VALUE(args[0]);
mp_uint_t num_bits = 0;
while (dest > 0) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This does not work for negative numbers. It will always return 32 (or 64 depending on sizeof(mp_uint_t)) as it will just keep shifting down the sign bit.

The reason the tests pass with negative numbers is that this code is never hit.

if (n == &n_temp) {
mpz_deinit(n);
}
return mp_obj_new_int_from_ull(res);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

res will always fit in a small int, so can just use MP_OBJ_NEW_SMALL_INT directly.

#if MICROPY_INT_BIT_LENGTH
mp_obj_t mp_obj_int_mpz_bit_length(mp_obj_t size) {
mpz_t n_temp;
mpz_t *n = mp_mpz_for_int(size, &n_temp);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As above, if you handle small ints directly, you don't need to allocate an mpz here.

print('SKIP')
raise SystemExit

n = -37
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Needs tests for big ints (i.e. mpz). Also some more interesting bit patterns around the boundaries.

return mp_obj_new_int_from_uint(num_bits);
#endif
}
STATIC MP_DEFINE_CONST_FUN_OBJ_VAR_BETWEEN(int_bit_length_obj, 0, 1, int_bit_length);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We also need to provide this for the "long long" implementation.

@jimmo
Copy link
Member

jimmo commented Jun 5, 2023

@massimosala

Wondering about the possible use-cases, and the impact on firmware size, I propose to have bit_length and bit_count working only on native integers, not on mpn/mpz.

As @robert-hh has pointed out, MicroPython's integer type is a "small integer" until it needs to be big. But the integer type itself is what says that it has a bit_length function. So what you're proposing is that bit_length just stops working (raises not implemented?) when the integer is too big?

Rationale:

* **many cpus have specific instructions for these two operations, on native integers**

There are operations like clz, but they are not universally available. Additionally, different compilers have different ways of accessing these instructions, plus we have to provide a fallback. Also they come with quirks (e.g. clz is undefined in the input is zero), which leads to more code size.

* the routines working only on native integers are smaller and easier to mantain

See my proposed version linked above, the MPZ version is actually simpler because MPZ works entirely in unsigned digits.

* the code path is more efficient, it hasn't to fork based on the object types.

You still have to fork to decide at runtime whether the given integer is small or big. You end up with extra code to handle the big case anyway (e.g. raising an error).

With my proposal, I think we can have the two features, bit_length and bit_count, which require less memory than just the bit_length proposal above.

Unfortunately this is not the case. These things have to be measured and it takes time and effort and the results can be counterintuitive.

@dmazzella
Copy link
Contributor Author

@jimmo good savings of bytes and optimizations, I flatly agree with your review.

@massimosala
Copy link

@jimmo wrote:
... results can be counterintuitive

I see, I was too focused on bit banging, where smallints are fine.

If I need to, I'll write a specialized viper routine.
And eventually I will compare its performance against this generic implementation ;-)

@massimosala
Copy link

massimosala commented Jun 23, 2023

FWIW, this PR is +128 bytes on PYBV11.

I have implemented my suggestions in the review comments here: master...jimmo:micropython:int-bit-length

It comes out slightly smaller at +108 bytes.

I also tried a version that used builtin intrinsic bit operations (specifically CLZ). This didn't help with code size, and also because we still need to provide fallbacks it made the code a lot more complicated.

A very nice code rewrite.
If I understand, the loop is done only on the last digit.

Quite smart, it overtakes Damiano code and my attempt to improve it.
(#4065)

Thanks to Damiano for the startup and to Jim for the improvement.
I hope bit_length will be included in the next MP release.

@massimosala
Copy link

@jimmo

Hi Jim

Is it possible to add your bit_length to the milestones of MP 1.21 ?

@projectgus
Copy link
Contributor

This is an automated heads-up that we've just merged a Pull Request
that removes the STATIC macro from MicroPython's C API.

See #13763

A search suggests this PR might apply the STATIC macro to some C code. If it
does, then next time you rebase the PR (or merge from master) then you should
please replace all the STATIC keywords with static.

Although this is an automated message, feel free to @-reply to me directly if
you have any questions about this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants