PyInstaller with built in Cythonize option and _MEIxxxx on Memory #7584

Closed
gaamaaresosa opened this issue Apr 19, 2023 · 16 comments
Labels
feature Feature request

Comments

@gaamaaresosa

gaamaaresosa commented Apr 19, 2023

Is there any plan to build Cython conversion into PyInstaller?

I have tried some ready-made samples, but they throw a lot of errors and I have had no success.
I am not expert enough in Linux and Python to fix those errors.

You are the best team to add this feature: Cythonizing most of the files within PyInstaller itself
and bundling them into a single executable.
That would be faster and more secure.

Is there any way to extract the _MEIxxxx temp folder into memory (a ramdisk)?
And to keep it hidden from casual view?
--runtime-tmpdir /mnt/.pyinst
A dot prefix is not a very robust way to hide it. Is there a better approach on Linux?
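
A rough sketch of the ramdisk idea, for reference: `--runtime-tmpdir` can point at a memory-backed (tmpfs) mount. The snippet only assembles the command line and shows how a frozen program can report where it was unpacked. The `/dev/shm` location and the `app.py` script name are assumptions for illustration, not details from this thread.

```python
import sys


def build_command(script, tmpdir):
    """Assemble a PyInstaller onefile command that unpacks into tmpdir."""
    return ["pyinstaller", "--onefile", "--runtime-tmpdir", tmpdir, script]


def extraction_dir():
    """Return the unpack directory of a PyInstaller-frozen app.

    sys._MEIPASS is only set inside a frozen application, so this
    returns None when run as a plain script.
    """
    return getattr(sys, "_MEIPASS", None)


if __name__ == "__main__":
    # /dev/shm is assumed to be a tmpfs mount; check your own system.
    print(" ".join(build_command("app.py", "/dev/shm")))
    print("unpacked to:", extraction_dir())
```

Note that this only changes where the files land. They still have to be readable by the user who runs the program.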

Note:
Meanwhile, please suggest a simple tutorial for bundling a Python/Flask project with
Cython and PyInstaller.

Thanks in advance!

System:
PyInstaller 5.10.1
Raspberry Pi 4
Linux Buster 32bit

gaamaaresosa added the "feature" (Feature request) and "triage" (Please triage and relabel this issue) labels Apr 19, 2023
gaamaaresosa changed the title from "PyInstaller with built in Cithonize option and _MEIxxxx on Memory" to "PyInstaller with built in Cythonize option and _MEIxxxx on Memory" Apr 19, 2023
@rokm
Member

rokm commented Apr 20, 2023

> Is there any plan to build Cython conversion into PyInstaller?

There was a tentative and low-priority plan, but based on #6999 (comment) and subsequent comments, it seems rather unlikely now.

> Is there any way to extract the _MEIxxxx temp folder into memory (a ramdisk)?
> And to keep it hidden from casual view?
> --runtime-tmpdir /mnt/.pyinst
> A dot prefix is not a very robust way to hide it. Is there a better approach on Linux?

Why bother? The format of the executable-embedded archive is not exactly a secret and can be read using the PyInstaller-provided pyi-archive_viewer utility. Even if that were not the case, wherever you unpack your executable, the contents need to be readable by the user in order for them to be able to run the program.

I get that you're trying to push for a code obfuscation/protection solution, but PyInstaller is not a code obfuscation and protection tool, and does not intend to become one. If you need a solution for that, you're in the wrong place.

rokm removed the "triage" (Please triage and relabel this issue) label Apr 20, 2023
rokm closed this as not planned Apr 20, 2023
@bwoodsend
Member

Even if Cython was incorporated into PyInstaller, it still wouldn't be much easier. It's compiled code so to make it ABI compatible with more than just newer versions of your own OS, you'd need to get gcc, python, your project, your dependencies and umpteen -devel packages, all in a docker container and build in there.

I'd be reluctant even to document this into an example because the bottom line would be, if you're not already comfortable with Cython and with distributing compiled code across distributions (i.e. you don't need an example from us) then you shouldn't be attempting it.

@gaamaaresosa
Author

gaamaaresosa commented Apr 20, 2023

> (i.e. you don't need an example from us) then you shouldn't be attempting it.

Thanks to everyone!
I have successfully completed a proof of concept combining Cython and PyInstaller.
I also got a very lightweight executable.
It took only a few lines of Python code.

Why did I need PyInstaller?
It seamlessly bundles my Flask static/template folders along with all the Cythonized .so files.

A million thanks to the PyInstaller team!

But again, I am not an expert like you.

And please don't think security is unimportant for protecting source code, credentials, API keys, etc.
You experts know how to secure these things.
But intermediate developers like us rely on easy, valuable tools like PyInstaller and move on...

Thanks again!

@gaamaaresosa
Author

> I get that you're trying to push for a code obfuscation/protection solution, but PyInstaller is not a code obfuscation and protection tool, and does not intend to become one. If you need a solution for that, you're in the wrong place.

Maybe, yes!
What's wrong with that?
Sooner or later everyone has to move to the next level, right?
Maybe someone will take the PyInstaller code and add very good security to the built binary,
perhaps under the name "CyInstaller".

Some communities may not bother to secure their source code, but most others do.
We should not assume everyone is the same.

I accept that adding security to PyInstaller may not be of interest to you.
I respect that.

We will wait for "CyInstaller" to arrive soon!
In any case, everyone thanks PyInstaller first!

@da-woods

> > Is there any plan to build Cython conversion into PyInstaller?
>
> There was a tentative and low-priority plan, but based on #6999 (comment) and subsequent comments, it seems rather unlikely now.

Just to clarify - if there's anything we can do to make it easier to bundle Cython-compiled modules into PyInstaller or similar tools I'd be happy to look at it. There are plenty of good reasons people might want to do that. (I don't personally know how well it works right now).

The main thing we're not really interested in is being an obfuscation tool, so feature requests purely around that are likely to be rejected.

@gaamaaresosa
Author

gaamaaresosa commented Apr 25, 2023

> bundle Cython-compiled modules into PyInstaller

I have already done that in 3 days, in a tool I call "CyInstaller".
It is just 300 lines of code:

  1. Read the OriginalMain.py file, scan its strings, and convert them to chr() calls.
    E.g.: API_KEY = 'This is my API Key'
    becomes API_KEY = chr(84) + chr(104) + chr(105) + chr(115) + chr(32) + chr(105) + chr(115) + chr(32) + chr(109) + chr(121) + chr(32) + chr(65) + chr(80) + chr(73) + chr(32) + chr(75) + chr(101) + chr(121)

  2. Scan OriginalMain.py, list all "import" and "from" lines, and write them to a dummy.py.

  3. Run the cythonize command on OriginalMain.py to get the OriginalMain.so file.

  4. Create a starter NewMain.py that imports the Cythonized OriginalMain.so and the dummy.py file.

  5. Pass this NewMain.py to PyInstaller, which produces a single executable file
    containing all our modules as *.so files (native binaries that cannot easily be reversed to source code).

  6. Try to extract the PyInstaller executable and you will find only the protected *.so files,
    plus NewMain.py, which holds no real secrets.
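
Steps 1, 2 and 4 of the list above could be sketched roughly as follows. This is not the actual CyInstaller code (which was not posted). The function names are hypothetical, `ast.unparse` assumes Python 3.9 or newer, and the idea that dummy.py exists so PyInstaller's analysis can still see imports hidden inside the compiled module is an assumption. Step 3 (the cythonize call) is left out.

```python
import ast


def to_chr_expr(text):
    """Step 1: rewrite a string as a chain of chr() calls."""
    if not text:
        return "''"
    return " + ".join(f"chr({ord(ch)})" for ch in text)


def collect_imports(source):
    """Step 2: list the top-level import statements found in the source."""
    tree = ast.parse(source)
    return [
        ast.unparse(node)
        for node in tree.body
        if isinstance(node, (ast.Import, ast.ImportFrom))
    ]


def make_starter(module_name, dummy_name="dummy"):
    """Step 4: source text of a starter script importing the compiled module."""
    return f"import {dummy_name}\nimport {module_name}\n"


source = "import os\nfrom flask import Flask\nAPI_KEY = 'This is my API Key'\n"
print(to_chr_expr("This"))  # chr(84) + chr(104) + chr(105) + chr(115)
print(collect_imports(source))
print(make_starter("OriginalMain"))
```

The chr() chain evaluates back to the original string as soon as the module is imported, so it only hides the literal from a plain text search of the file.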

I am still working on hiding the string information in the Cythonized *.so files,
which currently expose all their strings in the binary.
That makes it easy for any beginner cracker to patch the binary and bypass the license check.

NOTE:
If the above process were implemented by experts (the PyInstaller team or the Cython team), it would be an ultimate secure solution for the whole Python community.

@gaamaaresosa
Author

gaamaaresosa commented Apr 25, 2023

> The main thing we're not really interested in is being an obfuscation tool, so feature requests purely around that are likely to be rejected.

One must understand why securing source code is highly important nowadays.
Only a hard-working coder knows the pain of spending years building software, only for competitors to easily obtain the source code, modify it, and release it under a different name,
and sell it cheaper too.
This happens more with modern platforms such as .NET, Python, and other languages that do not compile to native binaries.
Easy to code, and easier to crack.

@bwoodsend
Member

> I have already done that in 3 days, in a tool I call "CyInstaller".
> It is just 300 lines of code:

And in 30 seconds and two lines of code I can run:

import OriginalMain
print(OriginalMain.API_KEY)

Just avoiding writing strings in plain text isn't going to stop anyone armed with more than just a hexdump. Your attackers are going to be able to load your libraries, run your code with a debugger or tracer attached and scrape all its variables and __code__ attributes, use heap dumps to export your program's memory where they can scan for in-memory secrets (i.e. after they've been de-obfuscated), use wireshark to sniff network traffic going in and out, or swap out libraries like libssl.dll with ones that log every piece of information your program encrypts or decrypts.
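
To make the two-line example concrete, here is a minimal self-contained sketch. It uses a hypothetical stand-in for the compiled module (built with `types.ModuleType` rather than a real .so) whose key is stored only as chr() calls, and shows that the plain value is available as soon as the module has been loaded.

```python
import types

# Hypothetical stand-in for OriginalMain: the key never appears as a literal.
OBFUSCATED_SOURCE = "API_KEY = " + " + ".join(
    f"chr({ord(c)})" for c in "This is my API Key"
)


def load_module(name, source):
    """Execute source into a fresh module object, much as an import would."""
    module = types.ModuleType(name)
    exec(source, module.__dict__)
    return module


module = load_module("OriginalMain", OBFUSCATED_SOURCE)
print(module.API_KEY)  # This is my API Key
```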

@gaamaaresosa
Author

gaamaaresosa commented Apr 25, 2023

> And in 30 seconds and two lines of code I can run

Boss, you are always great!
I have already admitted I am not an expert.

I am not trying to challenge real crackers; my first concern is student-level attackers.
As you said, anything can be cracked and any house lock can be opened.
Then why do we have crypto, Themida, and costly locks for homes, and why secret API keys and app keys?

Right now, PyInstaller binaries do not even require student-level skill.
Any developer, or the users themselves, can disassemble them to get the source code.

There are several layers of security.
We only expect some basic layers, enough to keep local attackers away.

import OriginalMain
print(OriginalMain.API_KEY)

This is the level of security we get from the experts who develop compilers.
You can only call "OriginalMain.API_KEY" if you know that name, right?
If the name is really hidden (obfuscated), how will you call it?
You would have to waste a lot of time, get bored, and leave your desk.
That is the most that developers can do: delay.

Patching is usually easy with JE/JNE/JMP, but if you keep 20 hotspots, it means more delay.

That's why I get upset that all our compilers expose these strings to the world.
They could easily hide them.

I learnt cracking 20 years ago, and that is how I have managed to keep my products safe.
I know how to do this on Windows, but I am new to Linux.
Windows tools do not support Linux binaries,
and recreating all my tools for Linux would take ages.

Please let me know if you come across any tips on this topic.

@bwoodsend
Member

> You can only call "OriginalMain.API_KEY" if you know that name, right?

dir(OriginalMain) would give me all available variables and functions.
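
A small sketch of that point: even deliberately meaningless names are listed by `dir()`. The module and its contents here are hypothetical stand-ins, created in memory so the example runs on its own.

```python
import types


def defined_names(module):
    """Names defined on a module, skipping the dunder attributes Python adds."""
    return [
        name
        for name in dir(module)
        if not (name.startswith("__") and name.endswith("__"))
    ]


# Hypothetical module whose author picked meaningless names on purpose.
demo = types.ModuleType("OriginalMain")
exec("_x1 = 'secret'\ndef _f9():\n    return _x1\n", demo.__dict__)

for name in defined_names(demo):
    print(name, "=", getattr(demo, name))
```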

@gaamaaresosa
Author

gaamaaresosa commented Apr 25, 2023

> > You can only call "OriginalMain.API_KEY" if you know that name, right?
>
> dir(OriginalMain) would give me all available variables and functions.

My point is about locking the street gate, tying up an Alsatian, locking the front gate, locking the main gate,
and then locking my locker.
Can anyone say "No one could break into my locker"?

But I still expect a good lock on my street gate first.
:)

@gaamaaresosa
Author

gaamaaresosa commented Apr 26, 2023

@bwoodsend

> dir(OriginalMain)

Thank you so much for opening my eyes.
I tried this and everything is exposed.
Why are our technologies so weak?
We might as well ship the source code itself to the clients.

By the way, am I right that executables are safer than libraries?
Or is there a way to explore executables as well, like with dir()?

Please give me some tips on this kind of security.
If this thread is not the right place for such questions, where can I reach you?

@rokm
Member

rokm commented Apr 26, 2023

> By the way, am I right that executables are safer than libraries?

After all this discussion, are you still somehow hoping for an answer that is not a "no"? :)

Assuming I know your executable is made with PyInstaller, I can use PyInstaller's pyi-archive_viewer tool to extract the embedded PYZ archive, and any collected .pyc module from it. Then, assuming I have a matching Python version available (which is not terribly hard to guess, since it's in the name of the collected Python shared library, and is also explicitly stored in the embedded archive's header), I can load any of your top-secret modules and dir them. If necessary, I can extract all modules from the PYZ onto the filesystem, and re-use them as regular source-less modules. For example, I can reuse them in my own application (assuming I figure out the API, and as long as I don't change the Python version).
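
A minimal sketch of the "load and dir a source-less module" step, assuming the extracted module has already been saved as a regular .pyc file for the running Python version. To stay self-contained, the example byte-compiles a throwaway file itself; the `topsecret` name and its contents are hypothetical, and the extraction from the PYZ is not shown.

```python
import importlib.machinery
import importlib.util
import os
import py_compile
import tempfile


def load_sourceless(name, pyc_path):
    """Import a module from a .pyc file alone; no .py source is needed."""
    loader = importlib.machinery.SourcelessFileLoader(name, pyc_path)
    spec = importlib.util.spec_from_loader(name, loader)
    module = importlib.util.module_from_spec(spec)
    loader.exec_module(module)
    return module


with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "topsecret.py")
    pyc = os.path.join(tmp, "topsecret.pyc")
    with open(src, "w") as f:
        f.write("LICENSE_KEY = 'abc123'\ndef check():\n    return True\n")
    py_compile.compile(src, cfile=pyc, doraise=True)
    os.remove(src)  # only the byte-compiled file is left
    module = load_sourceless("topsecret", pyc)

names = [n for n in dir(module) if not n.startswith("__")]
print(names)
```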

And if you use the built-in bytecode encryption (the --key option) to obfuscate .pyc modules, I can extract the pyimod00_crypto_key module, load and dir it to obtain the encryption key (or just find the string with a hex editor), and decrypt them before using them.

@gaamaaresosa
Author

gaamaaresosa commented Apr 26, 2023

@rokm
First of all, thanks for your detailed explanations.
As a developer I can't ignore this security concern and carry on wasting my life coding.

> Assuming I know your executable is made with PyInstaller

I am sorry I didn't explain this before.
No doubt, in the end I want to use PyInstaller as my single-executable maker.
I have used it for years and it is bug-free and stable.
Now my PyInstaller executable has been reversed, and the same application is being sold by crackers.
That's why I am fed up and looking into security first.

So I need to produce my executable and libraries as native binaries, which can't be reversed to the original human-readable source code. Maybe using Cython.

  1. I don't know how to make a onefile executable with PyInstaller without a Starter.py file.
    And Starter.py would need to import my executable binary (along with the other shared lib.so files), but it is not possible to
    import an executable.
    There are some PyInstaller steps here that I am not familiar with.
    I mean:
    A) A Cython-built executable binary file.
    B) Other Cython-built shared libraries (*.so files).
    C) I need a single executable file combining A and B.

    > I can use PyInstaller's pyi-archive_viewer tool to extract the embedded PYZ archive

    After that, even if the PYZ is extracted, only native binaries would be available to them.

  2. Is there anything on Linux like an executable packer that will do the above A + B = C?
    Of course, I don't want any system dependencies beyond those already present on my Raspberry Pi OS
    image.

  3. Is there any commercial tool to do this on Linux?
    On Windows, Themida is the best.

Thanks again!

@rokm
Member

rokm commented Apr 26, 2023

> So I need to produce my executable and libraries as native binaries, which can't be reversed to the original human-readable source code. Maybe using Cython.

Well, you're in the wrong place then. PyInstaller is not a code obfuscation and protection tool, and it does not intend to become one.

And we are not in obfuscation/protection business, so we cannot do much more than tell you that PyInstaller is not the tool to address your concerns.

>   1. I don't know how to make a onefile executable with PyInstaller without a Starter.py file.
>     And Starter.py would need to import my executable binary (along with the other shared lib.so files), but it is not possible to
>     import an executable.

Yes, we do require an entry-point script (which can be a minimal one that just loads and runs some obfuscated module or compiled extension). But that makes no difference in the grand scheme of things. Your byte-compiled entry-point script, byte-compiled pure-python modules (.pyc), and binary extension modules (.so; either cythonized modules or original binary extensions) are all collected in the frozen application and can be extracted one way or another (either from PKG archive or PYZ archive). So it makes no difference if you need starter.py to import topsecret binary extension or if the extension could be launched directly (equivalent of Python's python -m topsecret vs python start.py).
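
For illustration, a minimal "load-and-run" entry-point script might look like the sketch below. The `topsecret` module name comes from the comment above, while the `main` function and the in-memory stand-in module are assumptions so that the example runs without a real compiled extension.

```python
import importlib
import sys
import types


def run_entry_point(module_name, func_name="main"):
    """Import the named module and call its entry function."""
    module = importlib.import_module(module_name)
    return getattr(module, func_name)()


# Stand-in for the compiled extension, registered so the import succeeds.
fake = types.ModuleType("topsecret")
fake.main = lambda: "started"
sys.modules["topsecret"] = fake

print(run_entry_point("topsecret"))  # started
```

In a real starter.py only the import and the call would remain, which is why the script itself holds nothing worth protecting.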

> There are some PyInstaller steps here that I am not familiar with.
> I mean:
> A) A Cython-built executable binary file.
> B) Other Cython-built shared libraries (*.so files).
> C) I need a single executable file combining A and B.

There's no Cython involved in PyInstaller. We use a pre-built bootloader executable (that's written in C), to which we append the archive containing data, binaries, etc. When you run the assembled executable, it scans itself for the embedded archive, extracts its contents (if onefile), sets up the embedded Python interpreter, and runs your entry-point script.
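
The general "append an archive to an executable and find it again" pattern can be sketched as below. This is not PyInstaller's real on-disk format: the `DEMOARCH` marker, the trailer layout, and the function names are all hypothetical, chosen only to show the idea of a stub followed by a payload that the program locates by reading its own tail.

```python
import struct

MAGIC = b"DEMOARCH"              # hypothetical marker, not PyInstaller's
TRAILER = struct.Struct("<8sQ")  # marker followed by the payload length


def append_archive(stub, payload):
    """Return stub + payload + a trailer recording the payload length."""
    return stub + payload + TRAILER.pack(MAGIC, len(payload))


def find_archive(blob):
    """Read the trailer at the end of blob and return the embedded payload."""
    if len(blob) < TRAILER.size:
        raise ValueError("too short to contain a trailer")
    magic, length = TRAILER.unpack(blob[-TRAILER.size:])
    if magic != MAGIC:
        raise ValueError("no embedded archive found")
    end = len(blob) - TRAILER.size
    if length > end:
        raise ValueError("corrupt trailer")
    return blob[end - length:end]


exe = append_archive(b"\x7fELF...bootloader...", b"data, binaries, scripts")
print(find_archive(exe))
```

Anything that rewrites the file after assembly and moves or drops the trailer would make `find_archive` fail, which is the kind of breakage a post-processing protector could cause.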

> > I can use PyInstaller's pyi-archive_viewer tool to extract the embedded PYZ archive
>
> After that, even if the PYZ is extracted, only native binaries would be available to them.

PYZ contains byte-compiled pure-python modules. So those are, at least in theory, cross-platform. But yes, if you cythonized everything, there would be no .pyc modules in the PYZ. But the cythonized .so files could still be extracted (from the parent PKG archive) and loaded under the same Python version on Raspberry Pi, either for analysis or for re-use in a counterfeit application.

>   2. Is there anything on Linux like an executable packer that will do the above A + B = C?
>     Of course, I don't want any system dependencies beyond those already present on my Raspberry Pi OS
>     image.

I don't know, you'll have to look around yourself. Or, in all likelihood, you'll have to make your own.

>   3. Is there any commercial tool to do this on Linux?
>     On Windows, Themida is the best.

Is Themida really applicable here? From what I remember, you can protect blocks of C++ code using special macros and whatever magic it does then presumably gets applied during program compilation. And even if you can post-process the whole executable in some other protection mode, this will likely corrupt PyInstaller's embedded archive detection (unless it also unpacks the original executable and runs it).

You might want to check out PyArmor. And maybe Nuitka, since they actually compile your Python code (in contrast to PyInstaller, which collects everything as-is, except for byte-compilation of pure-python modules). I think both have commercial plans, so they might be more inclined to humor your concerns.

@gaamaaresosa
Author

@rokm
Thanks a million for your kind explanation.
I really have no more questions left, as you have answered them all well.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Jun 26, 2023