New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GSOC 2019: Nuitka all built-ins optimized #232

Open
kayhayen opened this Issue Jan 26, 2019 · 14 comments

Comments

Projects
None yet
3 participants
@kayhayen
Copy link
Member

kayhayen commented Jan 26, 2019

Project description: Nuitka has support for many built-ins, e.g. len already, which means dedicated C code, compile time evaluation, type shapes produced (in this case an int), but there are some notable exceptions, e.g. enumerate where we know types too, that are still missing, but definitely can have high performance impact on some loops.

Your task would be to immitate existing built-in codes to achieve a complete support for ultimately all C built-ins. The first step would be to identify which ones are missing (by means of a warning added), then to find out in test runs of the test suites, which ones are warned about, and to resolve as many of those as possible. It is assumed that all should be possible.

Skills: Python and C programming, platform wouldn't matter

@bksahu

This comment has been minimized.

Copy link
Contributor

bksahu commented Feb 3, 2019

Hello @kayhayen

This sounds really interesting so I am interested to work on it this summer. I am proficient in Python and also have some experience in C. Could you please let me know how should I get started? Also could you please provide some relevant pointers to this issue?

@kayhayen

This comment has been minimized.

Copy link
Member Author

kayhayen commented Feb 3, 2019

I am currently updating the ideas page with more details. I will also post information to this issue later today.

@kayhayen

This comment has been minimized.

Copy link
Member Author

kayhayen commented Feb 3, 2019

So that update is out. Basically this is pointer number one, each (called) built-in at compile time is represented with a node like this one:

def getTypeShape(self):

Then, should this node not be optimized away during compile time, there is code generation for it that then points to a function that emits C code with an entry in a dictionary:

"EXPRESSION_BUILTIN_LEN" : generateBuiltinLenCode,

This results then in pretty generic code:

def generateBuiltinLenCode(to_name, expression, emit, context):
    generateCAPIObjectCode(
        to_name          = to_name,
        capi             = "BUILTIN_LEN",
        arg_desc         = (
            ("len_arg", expression.getValue()),
        ),
        may_raise        = expression.mayRaiseException(BaseException),
        conversion_check = decideConversionCheckNeeded(to_name, expression),
        source_ref       = expression.getCompatibleSourceReference(),
        emit             = emit,
        context          = context
    )

This basically emits C code that calls one single C function. These are typically very easy to write, and we can help with this:

   PyObject *BUILTIN_LEN(PyObject *value) {
       CHECK_OBJECT(value);

       Py_ssize_t res = PyObject_Size(value);

       if (unlikely(res < 0 && ERROR_OCCURRED())) {
           return NULL;
       }

       return PyInt_FromSsize_t(res);
   }

@kayhayen

This comment has been minimized.

Copy link
Member Author

kayhayen commented Feb 3, 2019

Then to turn a built-in reference like len that is called, into the mentioned node type, there is an optimization code like this:

def len_extractor(node):

Which is tied to the len name like this:

"len" : len_extractor,

in a dictionary.

@kayhayen

This comment has been minimized.

Copy link
Member Author

kayhayen commented Feb 3, 2019

Names in the dictionary

and

builtin_named_values = dict(

will mismatch in extent. Your first task would be identify what's missing, and then tackle them, probably starting out with something simple, say all and any, then work your way up to enumerate and so on.

@kayhayen

This comment has been minimized.

Copy link
Member Author

kayhayen commented Feb 3, 2019

One thing that I forgot to point out, during the optimization of the built-in name call to the len node, it's pretty generic, and uses a definition of that built-in that lives here:

builtin_len_spec = BuiltinParameterSpecNoKeywords("len", ("object",), 0)

This one defines len as a function with one argument, that refuses keyword arguments (cannot call as len(object = x) and in that file there are classes for defining classes to use for built-in signatures, often the harder part can be to find out the truth of how many arguments are really required, are names accepted, or not.

@kayhayen

This comment has been minimized.

Copy link
Member Author

kayhayen commented Feb 3, 2019

As you can see, this is going to give you a nice tour of Nuitka from optimization to code generation, and across various built-ins you may not even know exist, even if you are proficient at Python. I sure didn't know all of those implemented so far.

@bksahu

This comment has been minimized.

Copy link
Contributor

bksahu commented Feb 3, 2019

I totally agree with you. First let me setup the my environment and then I will try to optimize a simple build-in.

@bksahu

This comment has been minimized.

Copy link
Contributor

bksahu commented Feb 5, 2019

Hi. I have been reading the docs but couldn't find anything on How to test those build-in optimizations codes. I tried compiling a small script having both optimized build-ins (like len) and not-optimized (like any) codes in it with Nuitka but couldn't find any significant difference.

@kayhayen

This comment has been minimized.

Copy link
Member Author

kayhayen commented Feb 5, 2019

@bksahu Good point, please allow a bit of time, and I will try and post a diff with an example.

@bksahu bksahu referenced this issue Feb 6, 2019

Open

Adding support for "any" built-in #246

1 of 4 tasks complete
@bksahu

This comment has been minimized.

Copy link
Contributor

bksahu commented Feb 6, 2019

@kayhayen I have tried to imitate for any like the len built-in. Also, where can find docs for classes involved in this whole optimization process?

@MrJiggsaw

This comment has been minimized.

Copy link

MrJiggsaw commented Feb 7, 2019

Hello I am interested in this topic and I would love to work on this.I am good at Python and I would like to contribute to this.Any guide for how to proceed will be grateful.I have seen @bksahu imitating any to the environment but I am little confused .It would be good if you guide me with a path

@kayhayen

This comment has been minimized.

Copy link
Member Author

kayhayen commented Feb 8, 2019

@bksahu So this is an example of expected effect for generated C code:

Using this test program:

def f():
    x = something()
    return len(x)


-        PyObject *tmp_called_name_2;
-        PyObject *tmp_args_element_name_1;
-        tmp_called_name_2 = LOOKUP_BUILTIN( const_str_plain_len );
-        assert( tmp_called_name_2 != NULL );
+        PyObject *tmp_len_arg_1;
         CHECK_OBJECT( var_x );
-        tmp_args_element_name_1 = var_x;
-        frame_8b78d490904d40bde1d1d570167070ef->m_frame.f_lineno = 4;
-        {
-            PyObject *call_args[] = { tmp_args_element_name_1 };
-            tmp_return_value = CALL_FUNCTION_WITH_ARGS1( tmp_called_name_2, call_args );
-        }
-
+        tmp_len_arg_1 = var_x;
+        tmp_return_value = BUILTIN_LEN( tmp_len_arg_1 );

I got that by disabling the len built-in call optimization, and it shows that instead of calling the built-in name len as a generic function call, we have the dedicated helper BUILTIN_LEN being called.

That is the kind of difference this is about when facing C code. As you can see, this code is not only simpler, but BUILTIN_LEN will be visible to the linker for link time optimization (LTO) once we have that, and can become amazingly faster instead of a relatively generic Python call.

@MrJiggsaw

This comment has been minimized.

Copy link

MrJiggsaw commented Feb 9, 2019

Wow I got a nice glimpse of it .Can you refer me more examples ?Maybe which is inside this repo

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment