PEP 237: int/long unification #1329

BCSharp · 2022-02-26T05:14:06Z

This PR implements PEP 237, as discussed under #52. I have tried to keep it functionally self-contained (i.e. working consistently) but still as small as possible. As a result, there are still a dozen or so smaller updates to come, mostly cleanup (notably in LongOps/IntOps). Therefore I checked off PEP 237 as implemented in "WhatsNewInPython30" but want to keep #52 open for a while until everything gets into the mainline.

Despite that, it is still a sizable PR, though most of the changes are in the tests. The old tests used long profusely, which was aliased to int. This made the tests pass, but since int was Int32, the BigInteger codepaths were not really well tested and the tests were often testing just the same thing as for the old int. I've reviewed each use of long one by one and replaced it with either int or big, which is a function that creates BigInteger instances for small integer values that would normally be Int32. Some tests that became obsolete are removed, and some new tests are added, but in most cases the existing tests are repurposed to test both Int32 and BigInteger.
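The repurposing pattern described above can be sketched roughly as follows. This is a minimal illustration, not the PR's actual test code: `big` here is a hypothetical stand-in that just returns a plain `int`, because CPython has only one integer representation, whereas the helper described above would return a BigInteger-backed instance under IronPython. The test class and `run_tests` are likewise made up for the example.

```python
import unittest


def big(value):
    # Hypothetical stand-in for the helper mentioned in the PR description.
    # On CPython there is only one int representation, so this is a no-op.
    return int(value)


class AddMulTest(unittest.TestCase):
    def test_tuple_repeat(self):
        # Run the same assertions once per integer "flavour" so that both
        # code paths are exercised on a runtime that distinguishes them.
        for make in (int, big):
            with self.subTest(make=make.__name__):
                self.assertEqual((1, 2, 3) * make(2), (1, 2, 3, 1, 2, 3))
                self.assertEqual(make(2) * (1, 2, 3), (1, 2, 3, 1, 2, 3))


def run_tests():
    # Load and run the single test case, returning the unittest result.
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(AddMulTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Looping over the constructors keeps one test body per behaviour instead of duplicating each test for Int32 and BigInteger.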

Before going full scale with this implementation, I did some prototyping of the remaining possible scenarios. All together, I've considered the following scenarios:

1. int is pure BigInteger and Int32 is treated like all other unmanaged .NET types (e.g. Int64).
2. int type is BigInteger but instances of both BigInteger and Int32 are treated as int for performance reasons (this proposal).
3. int type stays Int32 but instances of both BigInteger and Int32 are treated as int for performance reasons.
4. There are two separate Python types for BigInteger and Int32, but both are named int and are otherwise indistinguishable from the Python level.

After going back and forth between them, I am happy to conclude that the chosen scenario seems to be the best choice. Each of the scenarios has some strengths and weaknesses, but scenario 2 seems like the best compromise.

For instance, scenario 1 is the cleanest and most logical, but gives up some performance and is not so convenient for interop, since lots of .NET APIs return just Int32 and not BigInteger.

Scenario 3 comes close and has the advantage that existing IronPython 2.x code would be easier to migrate to 3.x, but it is difficult to keep free of surprises, mostly because not all int instances would fit in objects/collections/generic methods strongly typed for int. I wouldn't even try to explain it to people.

Scenario 4 becomes convoluted when trying to get full compatibility with CPython.

This leaves scenario 2, which is fairly clean, straightforward and (mostly) free of surprises. For the migration, one must remember that Python 2.x long is renamed to int in Python 3.x, and Python 2.x int is "renamed" to System.Int32. In practice it seems that the only time a developer has to pay attention to that distinction is when dealing with generics, since using int will imply BigInteger rather than Int32.
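The Python-level behaviour that PEP 237 asks for, and that scenario 2 is meant to preserve, can be shown with a few lines of plain Python. This sketch only demonstrates the language semantics on any Python 3 runtime; it says nothing about which .NET type backs each value, and the names `INT32_MAX` and `same_python_type` are made up for the example.

```python
INT32_MAX = 2**31 - 1


def same_python_type(a, b):
    # Both values must report exactly the built-in int type.
    return type(a) is type(b) is int


small = 1                # would fit in a 32-bit integer
large = 1 << 64          # does not fit in 32 or 64 bits
crossed = INT32_MAX + 1  # arithmetic crosses the 32-bit boundary silently
```

From the Python side there is a single int type and no overflow at the 32-bit boundary; the Int32/BigInteger distinction only surfaces when a .NET type is named explicitly, as in the generics case mentioned above.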

There is one regression: instances of subclasses of int behave differently during overloaded method resolution. The root cause is not so much this PR as the fact that method resolution works differently for arguments of type BigInteger and Int32. I think something is wrong here. For instance, given argument BigInteger(-200) and a method group with overloads taking either Int32 or Byte, why would the resolver choose the latter and then choke on it with an overflow exception? The tests for those cases are in test_methodbinder2, which, according to the code comment, were originally generated algorithmically, but maybe never got properly reviewed and simply froze incorrect behaviour. In any case, this is one of the loose ends I want to investigate in a separate PR, so for now the myint tests in that test module are commented out.
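To make the expectation in the paragraph above concrete, here is a small model of which overloads can represent a given argument at all. It is only a sketch of the reasoning, not the DLR binder's algorithm: the tuples, the `applicable_overloads` function, and the idea of filtering purely by value range are assumptions made for illustration.

```python
# (name, minimum, maximum) for the two parameter types in the example.
INT32 = ("Int32", -2**31, 2**31 - 1)
BYTE = ("Byte", 0, 255)


def applicable_overloads(value, overloads):
    # Keep only the overloads whose parameter type can hold the value
    # without overflow; the order of the input list is preserved.
    return [name for name, low, high in overloads if low <= value <= high]
```

Under this model `applicable_overloads(-200, [INT32, BYTE])` leaves only the Int32 overload, so picking Byte and then failing with an overflow looks like a binder problem rather than expected behaviour.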

Incidentally, this PR also resolves #894.

slozier

Didn't manage to go through it all (probably going to take me a few passes to review). Here are my initial comments.

I'm guessing one of the follow-ups will be killing class test(System.Int32): pass and getting rid of uses of Extensible<int>?

Tests/test_metaclass.py

slozier · 2022-02-26T15:34:50Z

Src/IronPython/Modules/Builtin.cs

@@ -656,7 +656,7 @@ public static object eval(CodeContext/*!*/ context, [NotNull]FunctionCode code,
            return (BigInteger)res;
        }

-        public static PythonType @int => DynamicHelpers.GetPythonTypeFromType(typeof(int));
+        public static PythonType @int => DynamicHelpers.GetPythonTypeFromType(typeof(BigInteger));


Not necessarily for this PR, but we should consider using TypeCache.BigInteger instead. Think I saw this in other places as well.

Good point. I'll add it to my follow-up list.

slozier · 2022-02-26T15:37:36Z

Src/IronPython/Lib/iptest/type_util.py

@@ -44,7 +52,8 @@ def remove_clr_specific_attrs(attr_list):

    # CLR array shortcut
    array_cli       = System.Array
-    array_int       = System.Array[int]
+    array_int       = System.Array[System.Int32]


Probably a good idea to make a note of this difference in Upgrading from IronPython 2 to 3. I'm sure I've used System.Array[int] in my own code assuming I'll get System.Array[System.Int32] .

I know, me too. I didn't notice Upgrading from IronPython 2 to 3 before. I will put a comprehensive note there (separate PR).

slozier · 2022-02-26T16:20:59Z

Src/IronPython/Runtime/Operations/LongOps.cs

-                return cls.CreateInstance(context, value);
+        #region Constructors
+
+        private static object FastNew(CodeContext/*!*/ context, object o, int @base = 10) {


I guess this one will become the master and replace Int32Ops.FastNew? I wonder if there's a "nicer" way to do this from a git history perspective (e.g. instead of replicating the code here we could update and call the Int32Ops versions). Maybe it's a non-issue and the follow-up LongOps/IntOps unification will take care of it...

Yes, I was planning to handle it in the follow-up LongOps/IntOps unification. It is kind of a separate issue since there are several ways of doing it. The idea I had was to make BigIntegerOps the main version and call it from Int32Ops as appropriate. This also applies to other methods in Int32Ops.

Src/IronPythonTest/EngineTest.cs

Tests/modules/misc/test_math.py

Tests/test_bigint.py

Tests/test_bool.py

Tests/test_class.py

Tests/test_dlrkwarg.py

slozier

Alright, turns out I had a bit more time. Here are some more comments after my first pass.

Tests/test_ironmath.py

Tests/test_methodbinder1.py

Src/IronPython/Runtime/ConversionWrappers.cs

BCSharp · 2022-02-26T20:07:12Z

I'm guessing one of the follow-ups will be killing class test(System.Int32): pass and getting rid of uses of Extensible<int>?

Thanks for the review. Yes. I didn't post the list of follow-ups I have because it is tentative and more like a jot-down of ideas and mental anchors than clean text, but maybe it is useful to post it anyway, esp. if you will be going through in multiple passes; you may get more questions for follow-ups that may or may not be on the list (in which case please raise the issue). So here it is:

# TODO: PythonOps.ThrowingConvertToLong and NonThrowingConvertToLong: rename "Long" to "BigInt".
# Similarly, rename "Int" to "Int32".
# TODO: Clean up BigIntegerOps/Int32Ops (see TODO in the assertions below)
# TODO: remove all references to Extensible<int>
# TODO: check out Tests/Tools/cmodule.py
# TODO: Support Extensible<BigInteger> in MetaUserObject.TryPythonConversion
# I suppose if the object is derived from int, no __int__ is called
# but the assignment/conversion succeeds based on the inheritance.
# See how it is handled for String and Int32. Surprisingly, it is not done for Double or Complex, maybe a bug?
# Example test for the case is in StdLib/Lib/test/test_int.IntTestCases.test_int_subclass_with_int
# What about bool?
# TODO: Scan code for all references to https://github.com/IronLanguages/ironpython3/issues/52 and check if everything's OK
# TODO: generate_alltypes generates a bunch of __new__ constructors, but does not test for Extensible<Complex>. Bug?
# TODO: Idem, Complex64 is used. It is marked as obsolete. Consider removal.
# TODO: ConversionWrappers, when converting Int32 objects to BigInteger are inefficient because of boxing.
# The problem is a cast from BigInteger to generic T, which is not supported; it has to go through object
# Perhaps having specialized subtypes of the wrappers would work. Or inlining IL code.
# TODO: Type assertions using @clr.accepts and @clr.returns do not properly display type names in error messages.
# When fixed, add tests to test_functions.py
# TODO: test_methodbinder2.py and BinderTest.cs: add tests using BigInteger where Int32 is tested
# It seems to me that tests in test_methodbinder2.py are incorrect. If they were originally script-generated,
# then the code just froze incorrect behaviour at that moment and the tests never really got fully reviewed
# Example: method group COverloads_Int32.M102 has two overloads: Int32 and Boolean. How can the test expect
# argument BigInteger(-200) to fail due to overflow? Basically, I'd expect big(200) and big(-200) to work exactly the same
# on all signed overloads. Once this is done, myint(100) can be enabled.
# TODO: Make test_numtypes.py pass. Probably best after a cleanup in IntOps.
# TODO: test_methoddispatch.py: test_multical_generator: add overload M3 and test BigInteger dispatch.
# TODO: Test that IComparable<BigInteger> and IEquatable<BigInteger> accept Int32.
# TODO: PythonOps.ConvertFloatToComplex can be simplified to the form of PythonOps.ConvertInt32ToBigInt
# TODO:  Extensible<T> fails on accessing static properties: https://github.com/IronLanguages/ironpython3/issues/1326
# When fixed, enable the rest of test_type_descs (test_cliclass.py:360)
# TODO (DLR): C:\Code\ironlang\ironpython3\Src\DLR\Tests\ClrAssembly\Src has a test file fieldtests.cs
# It is being used by tests in .\Tests\interop\net\field\
# It may be appropriate to write similar tests for properties, given that support for properties is kinda broken (#1326)
# TODO: consider using TypeCache.BigInteger instead of DynamicHelpers.GetPythonTypeFromType(typeof(BigInteger));
# TODO: Simplify nullability checks of PythonType instances.

i = 1            # Int32
j = 1<<64        # BigInteger

# before import System
assert set(dir(j)) == set(dir(i))
assert set(dir(i)) == set(dir(j))

import System

assert set(dir(i)) - set(dir(j)) == {'MinValue', 'MaxValue'}
assert set(dir(j)) - set(dir(i)) == set(),  "TODO: should be empty: " + str(set(dir(j)) - set(dir(i)))

slozier

Looks good to me!

Side note, guess we'll have to update https://ironpython.net/documentation/dotnet/dotnet.html#mapping-between-python-builtin-types-and-net-types

slozier · 2022-02-28T15:07:18Z

Tests/test_tuple.py

@@ -47,13 +47,13 @@ def test_add_mul(self):
        self.assertEqual((1,2,3) * 2, (1,2,3,1,2,3))
        self.assertEqual(2 * (1,2,3), (1,2,3,1,2,3))

-        class mylong(long): pass
+        class mylong(int): pass


Maybe not worth the effort but I guess we could search the codebase for this pattern and use myint. Probably something for a follow-up.

I thought about it too but tried to restrain myself from too much cleanup in this PR, to keep it as small as possible, but also because cleaning up tests can become a big distraction: there is so much that can be cleaned up, and it is tempting.

In this particular case, if you just object to the name mylong, I have planned a pass on that in my follow-ups. Since Python long is gone, long can only be read as Int64 and is better not used at all except as a C# type.

If you meant using myint from type_util, then I was not planning on using it here, because the test also defines/uses mylong2 and I think having both types defined side by side is more explicit about the test's intentions. So here I would probably use the names myint1/myint2 instead but keep the class definitions.

slozier · 2022-02-28T15:41:22Z

Src/IronPython/Runtime/ConversionWrappers.cs


        public ListGenericWrapper(IList<object> value) { _value = value; }

        #region IList<T> Members

        public int IndexOf(T item) {
-            return _value.IndexOf(item);
+            int pos = _value.IndexOf(item);
+            if (IsBigIntWrapper && item is BigInteger bi && bi >= int.MinValue && bi <= int.MaxValue) {


I hope the jit is smart enough to discard all this and inline the thing. I wish C# had generic specialization...

I'm not sure it will get inlined or even compiled in an optimal way. For instance, even if T is BigInteger, going from T item to BigInteger bi likely involves boxing, as does (in other places) casting a BigInteger result to T. I have put on my follow-up list to look into the performance of the conversion wrappers, but right now, short of creating dedicated classes for BigInteger and Nullable<BigInteger> or dropping down to IL, I don't have any ideas.
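The intent of the range check in the diff above can be modelled in a few lines of Python. The excerpt stops right after the `if`, so what follows it is not shown; the fallback lookup below is an assumption about the purpose (also matching an element stored as Int32 when searching for an equal BigInteger), and `Boxed` and `index_of` are hypothetical names used only for this sketch.

```python
from dataclasses import dataclass

INT32_MIN, INT32_MAX = -2**31, 2**31 - 1


@dataclass(frozen=True)
class Boxed:
    # Models a boxed .NET value: the kind tag makes an Int32 5 and a
    # BigInteger 5 compare unequal, as distinct boxed objects would.
    kind: str   # "Int32" or "BigInteger"
    value: int


def index_of(items, item):
    # Always look for the item as given.
    candidates = [item]
    # Assumed fallback: a BigInteger within Int32 range may also be
    # stored in the list under its Int32 representation.
    if item.kind == "BigInteger" and INT32_MIN <= item.value <= INT32_MAX:
        candidates.append(Boxed("Int32", item.value))
    positions = [items.index(c) for c in candidates if c in items]
    return min(positions, default=-1)
```

The sketch sidesteps the boxing cost discussed above, since Python has no generic T to cast through; it only illustrates why a second lookup may be needed when a list mixes the two representations.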

slozier · 2022-02-28T16:28:46Z

Tests/test_file.py

@@ -721,7 +721,7 @@ def test_buffering_kwparam(self):
        with self.assertRaises(ValueError): # can't have unbuffered text I/O
            open(file=fname, mode='w', buffering=0)

-        self.assertRaisesMessage(TypeError, "expected int, got float" if is_cli else "integer argument expected, got float",
+        self.assertRaisesMessage(TypeError, "expected Int32, got float" if is_cli else "integer argument expected, got float",


Not a big fan of having Int32 pop up on a TypeError for a standard Python operation. Though not a showstopper since it's an error message... Wonder if we'd be able to make the distinction between a method on a PythonType and a regular .NET method.

Interesting, I had the same reaction here. On the other hand, I do like seeing the actual type in error messages from calling regular .NET methods, but then how do we tell the difference between them and the methods used to implement Python operations? Maybe check whether the method comes from one of the IronPython assemblies?

Frankly at this moment, the whole error message creation in DLR is half-broken and would benefit from a redesign, though it is not high on my list of interests... If it comes to that point, I can keep this case in mind.

slozier · 2022-02-28T16:33:09Z

Tests/test_function.py

@@ -753,48 +753,54 @@ def classmeth(cls): pass
        self.assertEqual(D.classmeth.__class__, MethodType)

    def test_cases(self):
-        def runTest(testCase):
+        from collections import deque


Hah, was wondering how it worked before without the deque import, but I guess it didn't even run! Good catch.

BCSharp · 2022-02-28T20:45:39Z

Looks good to me!

Side note, guess we'll have to update https://ironpython.net/documentation/dotnet/dotnet.html#mapping-between-python-builtin-types-and-net-types

Is this page generated from some source, like RST? If so, where is it? The style sheets look rather outdated. I would be OK to review and update the page, and probably learn a few things for myself in the process.

Another side note: the link "Tools" at the top of that page is broken.

And more side notes: I thought I saw somewhere an issue report about some documentation that didn't get properly migrated from project main (or ironpython2) but now I can't locate it. Any ideas? If indeed there is some documentation missing, it could be taken/migrated in the same cleanup action (though probably separate PRs).

slozier · 2022-02-28T21:00:52Z

Unfortunately that particular page seems to be in html instead of RST. ~~I wonder if it was RST once upon a time...~~ Found this https://github.com/IronLanguages/main/blob/ipy-2.7-maint/Languages/IronPython/Public/Doc/dotnet-integration.rst which we might be able to salvage... The site hasn't gotten much attention in the past years.

Were you thinking of this issue? #1295

BCSharp · 2022-02-28T21:06:56Z

Were you thinking of this issue? #1295

Yes! Thank you.

BCSharp added 2 commits February 25, 2022 20:03

PEP 237: int/long unification

1eadfae

Update tests

58adfc0

slozier reviewed Feb 26, 2022


Update after review

4c8346f

slozier approved these changes Feb 28, 2022


slozier merged commit 3ed501b into IronLanguages:master Feb 28, 2022

BCSharp deleted the big_int branch February 28, 2022 21:07

This was referenced Mar 6, 2022

Eliminate Extensible<int> #1332

Merged

Revert workarounds in StdLib for Issue #52 #1333

Merged

Rename LongOps.cs to BigIntegerOps.cs #1335

Merged

Delete Complex64 #1336

Merged

Cleanup of constructors in BigIntegerOps/IntOps #1347

Merged

BCSharp mentioned this pull request Mar 27, 2022

Implement __int__ across all integer types #1382

Merged

BCSharp mentioned this pull request Apr 4, 2022

Cleanup IntOps #1390

Merged

This was referenced Apr 13, 2022

Mimic BigInteger members on Int32 #1399

Merged

Rename ConvertToLong to ConvertToInt #1403

Merged

BCSharp restored the big_int branch September 12, 2022 22:00

BCSharp deleted the big_int branch September 14, 2022 04:22
