Access to raw code objects #11

Closed
joonis opened this issue Mar 11, 2018 · 118 comments

Comments

Contributor

joonis commented Mar 11, 2018

I obfuscated the following Python script.

examples/test/mymod.py:

from __future__ import division, absolute_import, print_function, unicode_literals

def func2():
    print('func2')

def func3():
    print(err)

Build pyarmor:

python pyarmor.py init --src=examples/test --entry=mymod.py projects/test

python pyarmor.py config --disable-restrict-mode=1 projects/test

cd projects/test
./pyarmor build

In Python shell:

>>> import mymod, dis
>>> dis.dis(mymod)
Disassembly of func2:
  4     >>    0 LOAD_GLOBAL              0 (print)
              3 LOAD_CONST               1 (u'func2')
              6 CALL_FUNCTION            1
              9 POP_TOP             
             10 LOAD_CONST               0 (None)
             13 RETURN_VALUE        
             14 NOP                 
             15 NOP                 
             16 NOP                 
             17 NOP                 
             18 NOP                 
             19 NOP                 
             20 NOP                 
             21 NOP                 
             22 NOP                 
             23 NOP                 
             24 JUMP_ABSOLUTE            0

Disassembly of func3:
  7     >>    0 LOAD_GLOBAL              0 (print)
              3 LOAD_GLOBAL              1 (err)
              6 CALL_FUNCTION            1
              9 POP_TOP             
             10 LOAD_CONST               0 (None)
             13 RETURN_VALUE        
             14 NOP                 
             15 NOP                 
             16 NOP                 
             17 NOP                 
             18 NOP                 
             19 NOP                 
             20 NOP                 
             21 NOP                 
             22 NOP                 
             23 NOP                 
             24 JUMP_ABSOLUTE            0

Is it the correct behaviour that you have access to the raw code objects of a module?
So, what is the advantage of using pyarmor? Thanks.

Contributor

jondy commented Mar 13, 2018

If restrict mode is enabled, which is the default, there is one limitation:

  • Obfuscated scripts can't be imported from non-obfuscated scripts.

That is, you CANNOT disassemble the obfuscated module "mymod" with "dis", because no script outside the obfuscated scripts can import "mymod".

Refer to Restricted mode

But if restrict mode is disabled, as in your case, obfuscated scripts can be imported from other scripts. The byte code cannot be disassembled at first, but the byte code of a code object is restored after its first call, so you can disassemble any code object that has already been executed.

I have considered obfuscating the byte code again after the function returns, but that will affect performance. Until I find a good way, restrict mode is one solution, though of course it does not apply to all cases.

You're a great guy!
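
To make the question concrete: for an ordinary, non-obfuscated function, any script that can import the module can read the raw bytecode and disassemble it. The following is a minimal sketch using only the standard library; the function name func2 mirrors the example above and nothing in it is Pyarmor-specific.

```python
import dis

def func2():
    print('func2')

# The code object is reachable from the function itself.
code = func2.__code__
raw = code.co_code                      # raw bytecode as a bytes object
names = [ins.opname for ins in dis.get_instructions(func2)]

print(type(raw).__name__, len(raw) > 0)
print(code.co_consts)                   # constants include the string literal
print(names)                            # opcode names, as shown by dis.dis()
```

This is the baseline that obfuscation has to improve on: if the restored code object stays reachable after the first call, an importer sees the same information.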

Contributor Author

joonis commented Mar 13, 2018

Well, in this case the Odoo example is more or less useless.

https://github.com/dashingsoft/pyarmor/blob/master/src/user-guide.md#obfuscate-odoo-module

Because I could just ship the raw bytecode (.pyc) instead of encrypting in unrestricted mode and get the same level of obfuscation.

Pyarmor should wrap all functions and methods:

def wrap(func):
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)
    wrapper.__module__ = func.__module__
    wrapper.__name__ = func.__name__
    wrapper.__doc__ = func.__doc__
    wrapper.__dict__.update(func.__dict__)
    return wrapper

Any downsides to this?

Contributor

jondy commented Mar 13, 2018

The only downside of wrapping all functions is reduced performance, but it's a way. I'll try to implement it and test the performance. If it's acceptable, I'll add this feature in the next minor release. Thanks.

Contributor

jondy commented Mar 13, 2018

The workaround may be like this

  1. Add a built-in function __wraparmor__
    It restores func_code before the function is called and obfuscates func_code after it returns.

  2. Add a decorator wraparmor

def wraparmor(func):
    def wrapper(*args, **kwargs):
        __wraparmor__(func.func_code)
        result = func(*args, **kwargs)
        __wraparmor__(func.func_code)
        return result
    wrapper.__module__ = func.__module__
    wrapper.__name__ = func.__name__
    wrapper.__doc__ = func.__doc__
    wrapper.__dict__.update(func.__dict__)
    return wrapper

  3. Compile the Python scripts to AST nodes and add the decorator "wraparmor" to each function and class method.

  4. Obfuscate the wrapped Python scripts.

Contributor Author

joonis commented Mar 13, 2018

But it should go into a try/finally:

def wraparmor(func):
    def wrapper(*args, **kwargs):
        __wraparmor__(func.func_code)
        try:
            return func(*args, **kwargs)
        finally:
            __wraparmor__(func.func_code)
    wrapper.__module__ = func.__module__
    wrapper.__name__ = func.__name__
    wrapper.__doc__ = func.__doc__
    wrapper.__dict__.update(func.__dict__)
    return wrapper

With the downside that we lose the function signatures...
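
The lost signature can be seen with a small experiment. This is only a sketch in Python 3 (the thread itself targets Python 2, where inspect.signature is not available); wrap_plain, wrap_with_wraps and add are illustrative names, not part of Pyarmor.

```python
import functools
import inspect

def wrap_plain(func):
    # Same shape as the wrapper above: metadata copied by hand.
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)
    wrapper.__name__ = func.__name__
    wrapper.__doc__ = func.__doc__
    return wrapper

def wrap_with_wraps(func):
    # functools.wraps also sets wrapper.__wrapped__ = func,
    # which inspect.signature() follows by default.
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)
    return wrapper

def add(a, b=2):
    return a + b

plain_sig = str(inspect.signature(wrap_plain(add)))
wraps_sig = str(inspect.signature(wrap_with_wraps(add)))
print(plain_sig)   # the generic wrapper signature
print(wraps_sig)   # the signature of the wrapped function
```

Note the trade-off: functools.wraps recovers the signature by exposing the original function as `__wrapped__`, which works against the goal of hiding the wrapped function, so it may not be suitable here.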

Contributor

jondy commented Mar 13, 2018

Nice. I also read the source of the built-in decorators classmethod and staticmethod in Objects/funcobject.c. Maybe that is another way to write a fast decorator.

Contributor

jondy commented Mar 15, 2018

The final decorator will be

def wraparmor(func):
    def wrapper(*args, **kwargs):
        __wraparmor__(func)
        try:
            return func(*args, **kwargs)
        finally:
            __wraparmor__(func, 1)
    wrapper.__module__ = func.__module__
    wrapper.__name__ = func.__name__
    wrapper.__doc__ = func.__doc__
    wrapper.__dict__.update(func.__dict__)
    func.__refcalls__ = 0
    return wrapper

The attribute __refcalls__ is added to fix problems with recursive calls and multiple threads.
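
Why a counter helps with recursion can be sketched in pure Python. This is a simulation under stated assumptions, not Pyarmor's implementation (which lives in C): CallGuard and guarded are hypothetical names, and "restore"/"obfuscate" are only logged instead of touching any byte code.

```python
import threading

class CallGuard:
    """Hypothetical stand-in for the __wraparmor__ / __refcalls__ pair."""

    def __init__(self):
        self.refcalls = 0
        self.restored = False
        self.log = []
        self._lock = threading.Lock()

    def enter(self):
        with self._lock:
            self.refcalls += 1
            if self.refcalls == 1:       # outermost call: restore byte code
                self.restored = True
                self.log.append('restore')

    def leave(self):
        with self._lock:
            self.refcalls -= 1
            if self.refcalls == 0:       # last active call returned: obfuscate
                self.restored = False
                self.log.append('obfuscate')

def guarded(func):
    guard = CallGuard()
    def wrapper(*args, **kwargs):
        guard.enter()
        try:
            return func(*args, **kwargs)
        finally:
            guard.leave()
    wrapper.guard = guard
    return wrapper

@guarded
def factorial(n):
    # The recursive call goes through the wrapper again, so the code
    # must still be in the restored state while inner calls run.
    assert factorial.guard.restored
    return 1 if n <= 1 else n * factorial(n - 1)

result = factorial(5)
print(result, factorial.guard.log)
```

Without the counter, the innermost call would obfuscate the code on return while the outer calls are still executing it; with the counter, restore and obfuscate each happen once per outermost call.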

Contributor

jondy commented Mar 15, 2018

Released in v3.7.0

Here is the basic usage: Use decorator to protect code objects when disable restrict mode

Here is an example: Protect module with decorator "wraparmor"

Contributor Author

joonis commented Mar 15, 2018

Seems to work. Even the debugger crashes ;) but this needs further testing...

One thing to note: If an exception occurs in func someone could go back in the stack frames and access the local variables of func. To prevent this you could modify the wrapper function:

def wraparmor(func):
    def wrapper(*args, **kwargs):
        __wraparmor__(func)
        try:
            return func(*args, **kwargs)
+       except Exception as err:
+           raise err
        finally:
            __wraparmor__(func, 1)
    wrapper.__module__ = func.__module__
    wrapper.__name__ = func.__name__
    wrapper.__doc__ = func.__doc__
    wrapper.__dict__.update(func.__dict__)
    func.__refcalls__ = 0
    return wrapper

This way the corresponding frame is bypassed. The downside is that the exception reports the wrong location, which makes debugging more difficult. But if you have secret variables in a particular function, you could use this decorator instead.
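
The leak itself is easy to reproduce with plain Python. The sketch below is not Pyarmor code; func, wrapper and locals_from_traceback are illustrative names. It walks the traceback of a caught exception and collects the locals of every frame on the way.

```python
import sys

def func():
    secret = 'top-secret'          # local variable we would like to hide
    raise ValueError('boom')

def wrapper():
    return func()

def locals_from_traceback(tb):
    """Collect {function name: f_locals} for every frame in the traceback."""
    found = {}
    while tb is not None:
        frame = tb.tb_frame
        found[frame.f_code.co_name] = dict(frame.f_locals)
        tb = tb.tb_next
    return found

try:
    wrapper()
except ValueError:
    leaked = locals_from_traceback(sys.exc_info()[2])

print(leaked['func'])              # includes the value of "secret"
```

Anyone who can catch the exception can do this, which is why the frames need to be cleared or dropped before the exception leaves the protected code.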

Contributor Author

joonis commented Mar 16, 2018

Users must also know that globals are mutable but locals are not. So you should always wrap the whole module within a function and just make public functions global:

@wraparmor
def __load__():

    @wraparmor
    def my_private_func():
        pass

    global my_public_func
    @wraparmor
    def my_public_func():
        pass

__load__()
del __load__
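
The difference between the two lookups can be shown without Pyarmor. In this sketch (all names are illustrative) one public function reaches its helper through the enclosing namespace by name, the other through a closure created inside a loader function.

```python
def _load():
    def _private():
        return 'original'

    def public():
        # _private is resolved through the closure of _load().
        return _private()

    return public

public = _load()
del _load

def helper():
    return 'original'

def public_via_global():
    # helper is looked up by name in the enclosing namespace on every call.
    return helper()

# An outside caller can rebind the shared name ...
helper = lambda: 'patched'
# ... but binding a new "_private" here does not reach into the closure.
_private = lambda: 'patched'

print(public_via_global(), public())
```

The closure is not tamper-proof on its own, since the cells remain reachable through introspection (`public.__closure__`); that is presumably why the pattern is combined with the wraparmor decorator.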

Contributor Author

joonis commented Mar 16, 2018

One more thing: Would it be possible to allow the import of the entry module only, but restrict the direct import of any other module?

Contributor

jondy commented Mar 16, 2018

Good idea! The workaround would be: only the entry module can run "pyarmor_runtime()".

Rationale:

  • pyarmor_runtime calls the C function "init_runtime"
  • init_runtime checks whether the current code object in the stack frame belongs to the entry module
  • The identifying information of the entry module is stored in the project license (license.lic)

If there is more than one entry script, end users just generate a different project license for each one.

Why didn't I think of it? It's terrific! After all, the decorator approach is somewhat complicated and, just as you mentioned, needs a lot more testing.

Contributor Author

joonis commented Mar 16, 2018

Next issue: Only encrypted modules must be allowed to call __wraparmor__. Otherwise everyone could just decrypt an encrypted function.

Contributor

jondy commented Mar 16, 2018

From v3.7.1, __wraparmor__ can only be called from the decorator wraparmor. The check looks like this:

static PyObject *wrapcaller = NULL;
static PyObject*
__wraparmor__(PyObject *self, PyObject *args)
{
    PyObject *co_code = PyEval_GetFrame()->f_code->co_code;
    // First time call, set wrapcaller to code object of function wrapper in decorator
    if (!wrapcaller) 
        wrapcaller = co_code;
    // If it's not called from the same code object, return NULL, and no exception set
    else if (wrapcaller != co_code)
       return NULL;
    // Go on ...
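
The same "remember the first caller" idea can be sketched in Python to see how it behaves. This is an illustration only: guarded_builtin and _trusted are hypothetical names, the sketch compares whole code objects rather than co_code, and it returns False where the C snippet returns NULL.

```python
import sys

_trusted = {'code': None}   # code object of the first caller (hypothetical)

def guarded_builtin():
    """Refuse calls that do not come from the first caller's code object."""
    caller_code = sys._getframe(1).f_code
    if _trusted['code'] is None:
        _trusted['code'] = caller_code     # first call: remember the caller
    elif _trusted['code'] is not caller_code:
        return False                       # different caller: refuse
    return True

def wraparmor(func):
    def wrapper(*args, **kwargs):
        allowed = guarded_builtin()
        return allowed, func(*args, **kwargs)
    return wrapper

@wraparmor
def f():
    return 'f'

@wraparmor
def g():
    return 'g'

def outsider():
    return guarded_builtin()

print(f(), g(), outsider())
```

Every wrapper created by the decorator shares one code object, so f and g both pass while outsider is refused. The scheme trusts whoever calls first, so it relies on the decorator running before any untrusted code does.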

Contributor

jondy commented Mar 16, 2018

Adding checks to the C function init_runtime can NOT prevent obfuscated modules from being imported by other scripts; it only works when the entry script is the module __main__. So it does not apply to the Odoo module case. I thought it over and found this case difficult to deal with. For now, using the decorator "wraparmor" is one alternative solution.

Contributor Author

joonis commented Mar 17, 2018

Great, __wraparmor__ seems to work reliably now.

Contributor Author

joonis commented Mar 17, 2018

As soon as my time allows I will migrate a project to pyarmor and buy a license. I will let you know if I run into trouble.

Great work!

Contributor Author

joonis commented Mar 20, 2018

Is it somehow possible to prevent a wrapped function from decrypting itself when it is simply called?
Because a wrapped function should be decryptable only by __wraparmor__...

Contributor

jondy commented Mar 20, 2018

But I think that as long as the wrapped function is encrypted again as soon as it returns, however it returns, it should not be a problem.

Contributor Author

joonis commented Mar 20, 2018

Well, I think you are right. I just triggered an exception, got the respective wrapper frame, took the wrapped function from its locals and tried to call it directly. It did not work. So everything seems to be fine. But shouldn't it be callable this way?

Contributor

jondy commented Mar 21, 2018

I have refined __wraparmor__ in v3.8.1: accessing the original func_code outside the decorator will now crash.

@joonis joonis closed this as completed Mar 21, 2018
Contributor Author

joonis commented Mar 29, 2018

Unfortunately ipython crashes as well now, apparently because it tries to access the func_code.
Anyway, no problems with v3.8.0. Maybe Pyarmor should just raise a SystemError again, instead of causing a segfault. What do you think?

@joonis joonis reopened this Mar 29, 2018
Contributor

jondy commented Mar 30, 2018

It's not possible to raise SystemError because __wraparmor__ just modifies some members of func_code when the function returns. One way is to add an option to disable this constraint. For example,

cd /path/to/project
# When mode is 3, do not change members of func_code, so ipython can work
./pyarmor config --disable-restrict-mode=3

Contributor Author

joonis commented Mar 30, 2018

Patching IPython works as well: https://stackoverflow.com/a/28758396
It's the better solution I think.

@joonis joonis closed this as completed Mar 30, 2018
Contributor Author

joonis commented Mar 30, 2018

The problem is accessing co_varnames of the wrapped code object.
Maybe there is a possibility to leave this attribute untouched?

@joonis joonis reopened this Mar 30, 2018
Contributor

jondy commented Mar 30, 2018

It's no problem.

Contributor

jondy commented Apr 7, 2018

Not yet, I found this issue just now. Fixed in v3.8.8.

Contributor Author

joonis commented Apr 7, 2018

Unfortunately it doesn't work in v3.8.8 either.
The getter must be set in the first call of __wraparmor__ which decrypts the bytecode.

Contributor Author

joonis commented Apr 7, 2018

Furthermore, it should be possible to call __wraparmor__(func, tb) with just two arguments.
In this case the frames should be cleared but the code object should not be encrypted again.
I don't know how this is handled at the moment. It already seems to work, but I don't know whether the frames are cleared correctly.

This is required to support dynamically created functions and it solves the problem with generator functions.

Are there any issues calling __wraparmor__(func) repeatedly without finalizing it
with __wraparmor__(func, tb, 1)?

Contributor Author

joonis commented Apr 7, 2018

Ok, encryption depends on func.__refcalls__. If I set it initially to 1 the bytecode is never encrypted again, right? But is it cleaned on exceptions?

Contributor

jondy commented Apr 7, 2018

func.__refcalls__ must be 0 initially. And each __wraparmor__(func, tb, 1) must match a __wraparmor__(func).
Otherwise, pyarmor will think the function is still running and do nothing.
Calling __wraparmor__(func) increments func.__refcalls__, and __wraparmor__(func, tb, 1) decrements it.
Only when func.__refcalls__ equals 1 does pyarmor go into action.

Can the callback still get data from frame.f_locals? In my simple test case, the callback only gets an empty dictionary.

Contributor

jondy commented Apr 7, 2018

Calling __wraparmor__(func, tb) does not decrement func.__refcalls__, so the function is not encrypted again and the frame is not cleared with tp_clear. But the customized getter still takes effect, so frame.f_locals still returns {}.

Contributor Author

joonis commented Apr 7, 2018

If I call an external function from an encrypted function regardless of an exception,
the getter is not set and the external function can access the locals of the encrypted function.
Therefore the getter should be set initially or at least in the first call of __wraparmor__(func).

Furthermore calling __wraparmor__(func, tb, 1) should clear the frames in any case, regardless of whether func.__refcalls__ counts zero or not. Because the frames are independent from that value
and certainly finalized on exceptions. It's only important for the code object of the function.

Contributor Author

joonis commented Apr 7, 2018

The latter would also solve the issues with generator functions, since the wrapper can initially set func.__refcalls__ to 1. Such functions won't be encrypted again, but their locals are still protected.

Contributor Author

joonis commented Apr 7, 2018

Have a look at this example:

from sys import exc_info

def wraparmor(func, shift_error=False, is_generator=False):
    # Start the counter at 1 for generators so they are never encrypted again
    func.__refcalls__ = 1 if is_generator else 0
    def wrapper(*args, **kwargs):
        __wraparmor__(func)
        tb = None
        try:
            return func(*args, **kwargs)
        except Exception as err:
            tb = exc_info()[2]
            if shift_error:
                raise err
            raise
        finally:
            __wraparmor__(func, tb, 1)
    return wrapper

def wrapgenerator(func):
    return wraparmor(func, is_generator=True)

@wrapgenerator
def myfunc():
    for i in range(10):
        yield i

You cannot encrypt myfunc before the generator exits, otherwise it will cause a segfault.
func.__refcalls__ = 1 solves this issue.
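
The underlying reason is that calling a generator function returns before any of its body has run, so the wrapper's finally clause fires too early. A small sketch without Pyarmor shows the ordering; naive_wrap and the event strings are illustrative stand-ins for the restore and obfuscate steps.

```python
events = []

def naive_wrap(func):
    def wrapper(*args, **kwargs):
        events.append('restore')
        try:
            return func(*args, **kwargs)
        finally:
            events.append('obfuscate')
    return wrapper

@naive_wrap
def myfunc():
    events.append('body starts')
    for i in range(3):
        yield i

gen = myfunc()            # returns a generator object; the body has not run yet
events.append('call returned')
values = list(gen)        # the body runs only now, during iteration

print(events)
print(values)
```

The 'obfuscate' event is recorded before 'body starts', which corresponds to encrypting the byte code before the generator has executed it.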

Contributor Author

joonis commented Apr 8, 2018

And here is an example of the first issue.

The encrypted module:

@wraparmor
def do_something(progress_callback):
    foo = 'secret'
    progress_callback()

End user script:

import sys

from encrypted_module import do_something

def mycallback():
    frame = sys._getframe(1)
    print(frame.f_code.co_name, frame.f_locals)

do_something(mycallback)

Contributor

jondy commented Apr 8, 2018

No extra parameter is_generator=False is required; it's better to detect a generator inside __wraparmor__:

    if (co_flags & CO_GENERATOR) {
       // Do not obfuscate byte-code
    }

And func.__refcalls__ is always set to 0 initially.
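
The same flag test can be tried from Python, where the constant is exposed as inspect.CO_GENERATOR. This is just a sketch of the check, with is_generator_code as an illustrative helper name; it says nothing about how __wraparmor__ uses the result.

```python
import inspect

def is_generator_code(func):
    """True if the function's code object has the CO_GENERATOR flag set."""
    return bool(func.__code__.co_flags & inspect.CO_GENERATOR)

def plain():
    return 1

def gen():
    yield 1

print(is_generator_code(plain), is_generator_code(gen))
```

The standard library helper inspect.isgeneratorfunction(gen) performs an equivalent check for ordinary functions.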

Contributor Author

joonis commented Apr 8, 2018

Even better, yes.

Contributor

jondy commented Apr 8, 2018

About the callback issue: I have almost the same test case as your example, and the callback only gets an empty dictionary in v3.8.8. Could a cached older version still be in effect?

Contributor Author

joonis commented Apr 8, 2018

No, version is 3.8.8.
I will try a simpler test case tomorrow...

Contributor Author

joonis commented Apr 8, 2018

Here it doesn't work. I still get the locals.
When is the getter set in v3.8.8?

Contributor

jondy commented Apr 8, 2018

The getter is set the first time any __wraparmor__(func) is called.

Contributor

jondy commented Apr 8, 2018

Here is the latest version:

http://pyarmor.dashingsoft.com/downloads/platforms/linux_x86_64/_pytransform.so
md5sum: 12d5e6b48ba58c11b00d8958f6580933

In this version

  • Generators are never obfuscated
  • The frame is cleared as long as tb is not None

I need to be sure the callback issue is fixed before publishing.

Contributor Author

joonis commented Apr 8, 2018

Latest version does work!
There is just one last issue with some nested decorators.
But I haven't managed to break it down to a small piece of code yet...

Contributor Author

joonis commented Apr 8, 2018

Ok, I got it. The following example causes a segfault:

def factory():
    
    @wraparmor
    def nestedfunc(f=None):
        if f:
            f()
  
    return nestedfunc

func1 = factory()
func2 = factory()
func1(func2)

The problem is that the code object of nestedfunc is always the same. So func1 and func2 hold a reference to the same code object. But we use a reference counter for each function instead of for the code object. I don't know if we could misuse some attribute of the code object for that ...
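
The sharing is easy to confirm in plain Python. In this sketch the decorator is left out and the attribute name calls is only an illustrative stand-in for a per-function counter.

```python
def factory():
    def nestedfunc(f=None):
        if f:
            f()
    return nestedfunc

func1 = factory()
func2 = factory()

distinct_functions = func1 is not func2
shared_code = func1.__code__ is func2.__code__

# Attributes live on the function object, so a per-function counter
# does not see that both functions run the same code object.
func1.calls = 1
has_counter_on_func2 = hasattr(func2, 'calls')

print(distinct_functions, shared_code, has_counter_on_func2)
```

So when func1 calls func2, a counter stored on func2 starts from zero even though the shared code object is already in use by func1.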

Contributor

jondy commented Apr 8, 2018

Fixed in v3.8.10. It uses co_names->ob_refcnt as the counter for each function call.

Contributor Author

joonis commented Apr 8, 2018

So func.__refcalls__ is no longer required?

Contributor

jondy commented Apr 8, 2018

It's still required in v3.8.10. But it can be removed in the next version as long as co_names->ob_refcnt proves stable.

Contributor Author

joonis commented Apr 8, 2018

The example above works now. But there is still an issue with a similar construct. Strange.
What is func.__refcalls__ used for at the moment?

Contributor

jondy commented Apr 9, 2018

I'm not sure whether co_names->ob_refcnt works in complex circumstances, so I am keeping func.__refcalls__ for now.

Contributor Author

joonis commented May 22, 2018

Since v3.9 it runs into a segfault again, even if obfuscated without --obf-code-mode=wrap.

Contributor

jondy commented May 22, 2018

Maybe it's better to use the previous version of Pyarmor and only update to the latest dynamic library _pytransform.so.

--obf-code-mode=wrap is the default mode from v3.9 when obfuscating a package, and yes, it conflicts with the decorator wraparmor. But it should work with other modes, for example:

    python pyarmor.py init --src /path/to/package projects/mypackage
    cd projects/mypackage
    ./pyarmor config --obf-code-mode=des
    ./pyarmor build

Contributor Author

joonis commented May 22, 2018

Just updating the dynamic libraries results in a segfault as well.
Same with --obf-code-mode=des and v3.9
