v.1.1.28.1. Added support for running Opy utility in an alternate co… #32

BuvinJT · 2018-10-15T22:31:18Z

Added support for employing Opy in an alternate context, i.e. as an import that another Python script can use to provide a more robust packaging process. The obfuscation can then act as one programmatically driven "stage" within that process.

Added mask_external_modules option/feature to slightly improve the obfuscation.

…text, i.e. as an import from another script. Enabled installation as a site-packages library. Changed setup.py to use distutils.core rather than setuptools. Added __init__.py to make this into a standard library. Added opymaster.py module as a bridge between the library and the original module. Changed reference in opy.py to __builtins__ to __builtin__ (no "s") after importing that module. This resolved a problem with supporting the alternate contexts. Added unit test for the new context.

…ls.core back to setuptools module. Refactored opymaster.py module into settings.py module, and eliminated the need to copy that into a project when implementing opy in the original manner. Expanded upon / improved the readme.

…is able to employ for the extended configurations rather than requiring an external hard file.

…y provides aliases for those external imports which are set to not be obfuscated, before the main obfuscation process. When the main process then runs, the result is to obfuscate those aliases, thereby increasing the total amount of obfuscation.
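The masking idea described above can be sketched roughly as follows. This is only an illustration under simplifying assumptions, not the fork's actual implementation: the function name `mask_import` and the alias `ext_os` are hypothetical, and the regular expressions only handle a plain, unaliased `import module` line.

```python
import re


def mask_import(source, module, alias):
    """Rewrite a plain "import module" so the code refers to it through an alias."""
    # Turn "import module" into "import module as alias" (unaliased imports only).
    import_line = re.compile(r"^import {}[ \t]*$".format(re.escape(module)), re.MULTILINE)
    source = import_line.sub("import {} as {}".format(module, alias), source)
    # Point attribute accesses such as "module.func" at the alias instead.
    # The lookbehind skips dotted names like "pkg.module.func".
    return re.sub(r"(?<![\w.]){}\.".format(re.escape(module)), alias + ".", source)


masked = mask_import("import os\nprint(os.getcwd())\n", "os", "ext_os")
```

After this step the module name `os` appears only in the import line, while every use goes through `ext_os`. Since the alias is not on the list of protected external names, the main obfuscation pass is free to rename it.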

…into a separate module. Added options features: dry_run and subset_files. Added tracking of import identifiers which are obfuscated or not. Added analyze() function to library, drawing upon each of these enhancements.

JdeH · 2018-11-05T08:25:27Z

Hi, thanks for all the work.
Currently I am a bit busy, but I'll look into your PRs as soon as I find the time.
Thanks in advance.

Jacques

BuvinJT · 2018-11-05T13:23:32Z

Great! No rush.

This is a really cool program, btw. There is a definite need for this, and a great many engineers could benefit from it.

I have more enhancements (and likely patches to those!) on the way. I'm working on this in parallel to another library (currently closed source) which uses this as a resource, and that is helping to drive the development here. I'm trying my best to minimize my changes to your code, and as you'll see, on a few occasions I changed or added something, and then found a cleaner way to get the original work back closer to what it was.

If you find that you have the time, I'd like to discuss a few things. I have both suggestions and questions.

geatec · 2018-11-05T14:40:24Z

Ok, I'll get back to you on this, probably within 14 days!
Currently working against a deadline for a customer...
Jacques (sorry, used my company git account)

…tched glitch in library method of wrapping the utility (i.e. using module reload rather than simply import).

…es of known problems to be resolved. Updated documentation.

…, this will allow for leaving all publicly accessible module members in clear text. The value in that option is for library obfuscation, where many (public) identifiers must be preserved in clear text. Perhaps more importantly, this includes the first example implementation of the ast module (a built-in comprehensive Python language parser), which may have many applications moving forward.

…rnal modules" references in clear text, you may instead opt to bundle the source of those into your obfuscated version of the code, so that you can obfuscate that as well. Whereupon you might find it necessary (or easier) to modify your imports during the obfuscation process. Added new beta feature "prepped_only". Similar to dry_run, it prevents the production of obfuscated results. Instead, the clear text, "pre-obfuscation stage" of the files will be produced. This includes module "replacements", "masks", string obfuscations, etc.

… patches" to obfuscated results. When the utility (or the user configuration) isn't quite working as desired, this lets you just tweak specific file lines using functions such as "replaceInLine".

…y 2 & 3. Added six library as requirement.

…ry analyze function. This allows wrapper scripts to map clear text paths to the obfuscated results. The primary purpose at this point being for use with OpyFile patching.

JdeH · 2018-12-02T16:40:27Z

Hi,

How time flies... Do you feel all of this has stabilized enough for me to take a thorough look in anticipation of merging the pull requests or do you expect there's more to come / change? In the latter case maybe I'd better still wait a little. What do you think?

I'd like to do as much as possible in one go, since it'll require some testing (which no doubt you'll also have done already).

Kind regards
Jacques

BuvinJT · 2018-12-02T19:45:22Z

Hi Jacques,

It's pretty reliable right now. I've documented the known bugs and weaknesses, both in the readme and in examples. I was focused on Python 2 to begin with, but I just ironed things out for Py3 and pushed that. The dual support is one of the major advantages of your project over other such work, so it was important to me that it be preserved.

I am developing this as a supporting component to a larger project, for building distributions. I'm close to posting a preliminary version of that. I think you'll really want to check that out when I make it available, because the two of these support each other. (It should be a matter of days at most before I post that.)

One thing I really want to resolve here is the need for leaving "external" imports in clear text. In my wrapper project, I've started rolling in pip to gather source code from such libraries, which can then be bundled into a distribution, and the whole thing obfuscated via Opy. In a dream scenario, that could be made to work with minimal effort from the client.

I added a "parser" module as you'll see. In part, I do a lot of ugly, painful, manual parsing of the language. But then, in another feature I started employing the "ast" library - which is the right way to achieve such. The ugly parts of my code should be rewritten to use that awesome library instead at some point.

As for the specific merge question, I recommend merging my fork into your project on an alternate "develop" branch. That provides the benefit of including my work, but there is no immediate need to brand it "release" quality today.

BuvinJT · 2018-12-02T19:51:17Z

Perhaps, bump this "develop" branch to v 1.2?

Note that everything I added is optional, btw. So nothing is required to be used that really diverges from your work. Other than the new "masking" feature, I believe that I defaulted all those things to NOT being employed automatically.

…ing the same values as provided by (the "dry run") analyze function.

…bfuscate and analyze functions, rather than a tuple, making the return values more easily expandable / "future proof" compared to unpacking a tuple with an expected order / length.

BuvinJT · 2018-12-11T15:55:51Z

I'm hoping that I can leave this alone for the moment now, to give you time to review and merge it. I discovered today that I had been adding commits with emails/user names that I didn't want in an open source / public repo, however. As such, I had to run a script to fix that and perform a forced push. If this rendered it difficult to merge those commits into your original repo, let me know. I'll figure out how to correct it by forking from your work again then reapplying my changes on top of that.

BuvinJT · 2018-12-11T15:58:41Z

I released a version of the "counterpart" project to Opy I've been talking about, i.e. "Distribution Builder". Here's the url: https://github.com/BuvinJT/distbuilder
It's still an early beta, but it does function, and will give you a good idea about what I'm doing there and how I'm wrapping and building upon Opy.

JdeH · 2018-12-13T15:43:07Z

@BuvinJT

Since you want to use Opy as part of another Python application or library, I've made it easier to import Opy as a module. This work is in the opy_as_a_module branch. Sorry that this comes so late, but I couldn't find the time until now.

Example on how to use Opy as a module, from opy/developmen/tests/dog_walker/obfuscate.py:

import sys

sys.path.append ('../../../..')

import opy

print (opy.run ('plain_code', 'obfuscated_code', 'obfuscate.cnf'))

I've read through your additions and I'd like to provide some feedback, starting with what in my view is most urgent. If I understood correctly, you've introduced the use of the AST module. However, abstract syntax trees are Python version dependent: e.g. ASTs for Python 3.6 and 3.7 differ, and those for 2.7 and 3.7 differ quite a lot.

While a much more powerful version of Opy is possible using ASTs, they were deliberately avoided up to now to prevent version dependencies, as Opy should be usable from Python 2.7 up to 3.7 and beyond.

If ASTs are used, it's a whole different ballgame: more flexibility in obfuscation and much simpler code, but also no more version independence.

What's your take on this?

Kind regards
Jacques

BuvinJT · 2018-12-13T22:31:30Z

Thanks for reviewing this work and getting started on incorporating it, Jacques!

First, regarding the import:

I converted Opy into an actual library, installable via pip and then importable from anywhere. That is a necessity, especially when it is nested inside another library which also works that way. So, we should preserve that and not require something like appending on to the sys.path.

Also note that when my fork runs opy, it returns a collection of results (like a dictionary mapping clear text file paths to obfuscated ones, the words which were obfuscated, etc.). Alternatively, an "analyze" function can be called which works in a similar way, but employs my "dry run" option to not actually create any files. Both of those are hard requirements for my wrapper project.
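As a rough illustration of the two entry points described here, the sketch below models a result object and a dry-run switch. Every name in it (`Results`, `obfuscate`, `analyze`, the made-up output file names) is hypothetical and chosen for illustration only; it does not reproduce the fork's actual API or behavior.

```python
import os


class Results(object):
    """Hypothetical container; fields can be added later without breaking callers."""

    def __init__(self, obfuscated_file_dict, obfuscated_words):
        self.obfuscated_file_dict = obfuscated_file_dict  # clear path -> obfuscated path
        self.obfuscated_words = obfuscated_words          # identifiers that were renamed


def obfuscate(source_files, target_dir, dry_run=False):
    """Stand-in for the real work: map each file to a target path, optionally writing it."""
    file_dict = {}
    for index, path in enumerate(source_files):
        target = os.path.join(target_dir, "l{}.py".format(index))  # made-up naming scheme
        file_dict[path] = target
        if not dry_run:
            with open(target, "w") as handle:
                handle.write("# obfuscated stand-in for {}\n".format(path))
    return Results(file_dict, obfuscated_words=[])


def analyze(source_files, target_dir):
    """Same bookkeeping as obfuscate(), but never touches the file system."""
    return obfuscate(source_files, target_dir, dry_run=True)
```

A wrapper script could then call `analyze(["a.py"], "out").obfuscated_file_dict` to learn where each file would end up before committing to a real run.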

Regarding AST:

I will not presume to know much on the subject. I never used AST until I did so here. But in the little bit that I used it, there were no issues with crossing versions.

I love that Opy works in 2 and 3. Every competitor that I've seen only works in 3. Yet I personally have tons of v2 legacy code I would like to employ this on as well. I think for sure that feature must be preserved, as a way to differentiate your work from others.

That said, if a better product can be produced using AST, I'd have to argue in favor of that. Better trumps shorter/cleaner. I want to seriously protect proprietary code, or code which is security sensitive. I love Python, but the fact that it is inherently in clear text, and even a standalone version of a program (via pyinstaller or py2exe) can be reverse engineered back to the original code, is a HUGE problem.

If using AST requires more code, and/or explicit checks for v2 / 3, then so be it. Speed is not overly relevant. No one needs this process to be lightning quick. Also, the code is pretty minimal right now (great job, by the way, doing so much with so little!). If the code base swells to double or triple that, it's still very short.

BuvinJT · 2018-12-13T22:44:18Z

I don't know if you ever used Qt? I do daily, as I'm also a C++ developer. Qt just released a whole new version of "PySide", called "PySide 2". That is Qt for Python! It lets you create interfaces using all of the Qt Library, or even QML. That's like a dream come true, because Python is the "king of the backend", and Qt is the "king of the frontend" arguably. Now you can use both in one project.

Anyway, I've recently spoken with the lead developer of PySide 2 face-to-face. What they are actively working on right now is exactly what my "Distribution Builder" library does. I asked if they would build in an obfuscator, and the answer was that they didn't yet have plans for it, but the concept is intriguing. When I shared that I already wrote what they are planning (wrapping PyInstaller, Qt Installer, etc.), plus an obfuscator built-in, there was interest in getting my work if I wanted to share it open source (as they would in turn).

My point being, I'd like to present my project to Qt, along with this awesome Opy component. If we have this well enough developed, the work could get rolled into the PySide project potentially. In which case, it's likely to be employed by a gigantic number of users (even if they don't know it!).

BuvinJT · 2018-12-13T22:51:30Z

For my own personal plans, the reason I'm developing these libraries is because I'm creating a large scale project in Python that I want to sell commercially. I can't have the source readily accessible, thus this tool is critical.

Also, I'm breaking my project into a slew of libraries, which all come together to form the final product. I want to be able to share obfuscated versions of the libraries with multiple collaborators, as they work on their own library components in clear text. Then they could build and run the big main product, with the new code they are developing on the fly, to confirm that it works in the target context. But each of those collaborators would not need to have the clear text work from everyone else, or the option of stealing the project as a whole.

BuvinJT · 2018-12-13T23:12:20Z

I skipped over the point you made regarding "future proofing" with AST. That is important for sure. It's very hard to be confident that any code is future proof though. Python 4 could break away from 3 in any number of ways. Some of the major Python libraries created for 2 were not immediately made available in v3. For instance, "Twisted" was not fully functional in v3 for a long time (and I'm not sure if it's finally squared away now?). Anyway, I think that the utility could just write a warning to stderr when run on a new Python release, stating that it has not yet been tested and confirmed.
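A minimal sketch of such a warning might look like the following. The set of tested versions and both function names are assumptions made for illustration; nothing here is taken from Opy itself.

```python
import sys

# Assumption for illustration: the (major, minor) versions the tool was tested on.
TESTED_VERSIONS = frozenset([(2, 7), (3, 6), (3, 7)])


def untested_warning(version=None):
    """Return a warning message for an untested Python version, else None."""
    if version is None:
        version = sys.version_info[:2]
    version = tuple(version)
    if version in TESTED_VERSIONS:
        return None
    return "Warning: not yet tested and confirmed on Python {}.{}".format(*version)


def warn_if_untested(version=None):
    """Write the warning (if any) to stderr; the run itself is not aborted."""
    message = untested_warning(version)
    if message is not None:
        sys.stderr.write(message + "\n")
```

Calling `warn_if_untested()` once at startup would keep the tool usable on a new release while making it clear that the results are unverified.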

Potentially, we might want to consider developing formal unit tests too, where we run a series of short, atomic tests and confirm the results are as expected, displaying SUCCESS/FAILURE on the screen as each is tried. With that in place, we'd readily identify bugs in future Python releases.
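One possible shape for such a test runner is sketched below. The `run_checks` helper and the two stand-in checks are hypothetical; real checks would presumably obfuscate a small snippet, run the result, and compare its output with that of the clear text version.

```python
import sys


def run_checks(checks, stream=None):
    """Run (name, callable, expected) triples and report SUCCESS/FAILURE for each.

    Returns the number of failed checks.
    """
    if stream is None:
        stream = sys.stdout
    failures = 0
    for name, func, expected in checks:
        try:
            passed = func() == expected
        except Exception:  # an exception counts as a failed check
            passed = False
        if not passed:
            failures += 1
        stream.write("{}: {}\n".format(name, "SUCCESS" if passed else "FAILURE"))
    return failures


# Stand-in checks, used here only to exercise the runner.
CHECKS = [
    ("string round trip", lambda: "abc"[::-1][::-1], "abc"),
    ("integer division", lambda: 7 // 2, 3),
]
```

Running `run_checks(CHECKS)` under each supported interpreter would then flag any release on which a check starts failing.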

JdeH · 2018-12-14T11:31:14Z

@BuvinJT

Thanks for your clear explanation of what's required for the distbuilder project.
I think that the possibility to distribute obfuscated versions of Python software will contribute to the popularity of Python,
although I hope that most Python software will remain human readable!

About your additions to Opy:

Using the AST module has some drawbacks with regard to Python version independency.
But if you decide to use it in your fork it opens up a world of possibilities.

Opy's simplistic way of parsing, while version independent, results in a number of limitations.
At the core of Opy is a bag of tricks to circumvent these limitations.
Once you have the AST at your disposition, there's no need to perform those tricks,
like replacing strings by placeholders and then, after obfuscation, replacing them back again.
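The placeholder trick mentioned above can be sketched like this. It is a simplified illustration, not Opy's actual code: the helper names and the `__STR0__` placeholder format are invented, and the pattern only covers single-line string literals.

```python
import re

# Single- or double-quoted literals on one line; triple-quoted strings are not handled.
SINGLE = r"'(?:\\.|[^'\\\n])*'"
DOUBLE = r'"(?:\\.|[^"\\\n])*"'
STRING_RE = re.compile(SINGLE + "|" + DOUBLE)


def protect_strings(source):
    """Swap each string literal for a numbered placeholder and remember the original."""
    saved = []

    def stash(match):
        saved.append(match.group(0))
        return "__STR{}__".format(len(saved) - 1)

    return STRING_RE.sub(stash, source), saved


def rename(source, mapping):
    """Rename whole-word identifiers according to mapping (clear name -> obfuscated name)."""
    pattern = re.compile(r"\b(" + "|".join(map(re.escape, mapping)) + r")\b")
    return pattern.sub(lambda m: mapping[m.group(1)], source)


def restore_strings(source, saved):
    """Put the remembered string literals back in place of their placeholders."""
    return re.sub(r"__STR(\d+)__", lambda m: saved[int(m.group(1))], source)


clear = "greeting = 'greeting text'\nprint(greeting)"
masked, saved = protect_strings(clear)
result = restore_strings(rename(masked, {"greeting": "l1l1"}), saved)
```

Without the protect/restore steps, the word `greeting` inside the string literal would have been renamed as well, which is the kind of problem a parse tree avoids by construction.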

Also all of the following restrictions would completely disappear if you use the AST everywhere:

A comment after a string literal should be preceded by whitespace.
A ' or " inside a string literal should be escaped with \ rather than doubled.
If the pep8_comments option is False (the default), a # in a string literal can only be used at the start, so use 'p''#''r' rather than 'p#r'.
If the pep8_comments option is set to True, however, only a # cannot be used in the middle or at the end of a string literal
No renaming backdoor support for methods starting with __ (non-overridable methods, also known as private methods)

If you have the parse tree at your disposal, you can easily distinguish between e.g. names of variables and functions and the contents of string literals.
You can also easily find out what the imported modules are.
You can take apart expressions and put them back together again in an obfuscated way and many other things.
In short, obfuscation based on the AST is far superior to what Opy currently does.
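To make the points above concrete, here is a small sketch using the standard `ast` module. It assumes Python 3.8 or later, where string literals are parsed as `ast.Constant` nodes; earlier 3.x releases produce `ast.Str` instead and Python 2 has no `ast.arg`, which is itself an instance of the version dependence discussed in this thread. The `summarize` function and the sample source are invented for illustration.

```python
import ast

SOURCE = '''
import os
from sys import argv as args

def greet(name):
    message = "hello " + name
    return message
'''


def summarize(source):
    """Collect imported modules, identifiers and string literals from a parse tree."""
    imports, names, strings = set(), set(), set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            imports.update(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom):
            imports.add(node.module)  # None for relative imports like "from . import x"
        elif isinstance(node, ast.FunctionDef):
            names.add(node.name)
        elif isinstance(node, ast.Name):
            names.add(node.id)
        elif isinstance(node, ast.arg):
            names.add(node.arg)
        elif isinstance(node, ast.Constant) and isinstance(node.value, str):
            strings.add(node.value)
    return imports, names, strings


imports, names, strings = summarize(SOURCE)
```

Identifiers and string contents end up in separate sets without any placeholder tricks, because the parser has already decided which is which.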

So if you decide to use the AST anyhow, I think it's better to base the whole obfuscation on that.
I anticipate that, from where you are now, gradually you'll use the AST more and more,
moving away from Opy's regular expression based parsing scheme, since ASTs will give you much more flexibility.
But since you've already invested quite some time, it may indeed be a gradual evolution.

My suggestion is to leave Opy more or less "as is", maybe with some simple improvements,
and to have a separate branch for distbuilder, including AST based obfuscation, as it is a different approach.
For now I've called it opy_db and branched it from the current master branch, as that's what your code is currently based on.
Furthermore, maybe it would be a good thing to reserve the name opy_db at PyPi.
If you prefer I can do it for you.
Of course you can pick any other name you like, or just make it part of the distbuilder project.

KR
Jacques

BuvinJT · 2018-12-17T17:00:58Z

Sorry for the slow response! I've been ill, and haven't been able to work for several days.

Thank you for adding the branch! Also, I'd really appreciate it if you could set up the PyPi hooks for it. I've never done that myself, but I do plan to now for the distbuilder project. Since Opy is your project, and I'm just tacking on features, it probably makes sense for the PyPi registration etc. to be in your name / control.

One minor request: could the branch and the PyPi project be renamed to opy_distbuilder, or perhaps opy_dstbldr to reduce characters? The trouble with the suffix "db" is that it's the standard abbreviation for database, so I don't want to create confusion for anyone who might be led to think this is related to something entirely different.

In theory, rewriting Opy to use AST at its foundation would be a great thing to get done for sure. But as this is a side-project on a side-project, a few layers deep, I can't realistically expect to get that done any time soon. Using AST selectively will have to suffice for the moment.

BuvinJT · 2018-12-17T17:05:35Z

With this new long lived, parallel branch, we will need to work out a name / version resolution. Should the branch still install a library named "opy" or should it install one called "opy_distbuilder"? Should the new branch start over on the version number, be kept the same as opy, or skip ahead of it? This is extremely important for the distbuilder, because it will define package/version requirements and automatically manage them via pip.

JdeH · 2018-12-21T12:26:26Z

@BuvinJT

I've renamed opy_db to opy_distbuilder for clarity, as you proposed.
It may be wise to consider that branch the master branch of the opy_distbuilder variety of opy.
So maybe your development should happen on branches derived from opy_distbuilder, aiming for opy_distbuilder itself to evolve into a stable, tested branch. I leave that to your own judgement.
In general I'd like to lay the responsibility for the opy_distbuilder branch with you.
In principle I'll accept pull requests from you for that branch without retesting.

Since these branches diverge, let's decouple the versioning.
It isn't a problem if e.g. there would be an opy 4.0.0 and an opy_distbuilder 4.0.0.
The distinction is still clear from the names opy and opy_distbuilder.

Having independent versions means you can hand out version numbers for opy_distbuilder at will,
keeping maximal control of the version relations between distbuilder and opy_distbuilder.

Having dependent versions, on the other hand, would suggest co-evolution,
which may not always be the case.

Of course (parts of) the code may converge in the future, one including fruitful new parts of the other and vice versa, but that doesn't pose any problems.
I think of them as sibling projects, based on the same principles, but sufficiently different to have different (but related) names.

BuvinJT · 2018-12-26T18:37:41Z

Thanks for the branch rename. I will treat that like the master for my fork now.

I set up both distbuilder and opy_distbuilder on PyPi. Executing pip install distbuilder will install both packages on your machine. Note that while the package which is installed is called opy_distbuilder, it is still imported as opy. So long as the original is not installed into site packages, that will work without conflict, generally speaking. Let me know if you'd prefer changing that.

I restarted the versioning on this branch / fork, branding it as a beta release v.0.9.0.

On the PyPi registration, I listed you as the primary author, followed by myself, and then I set the email address to point to me. I assigned the "home page" to your GitHub page, on the opy_distbuilder branch.

When you get a chance, perhaps you could check the PyPi details, and give me your approval? There are also a few metadata changes I made, and pushed to my fork, which should be merged into the dedicated branch on your repo sometime. With that done, I'm hoping to leave this alone for a little while (or at least not have a merge request for a few months to bother you with).

My distbuilder project will be actively developed for a little while. There are a few important things I still need to add to it before I can even bump it from "alpha" to "beta", but I think you'll find it interesting / useful in its present state if you want to play around with it.

JdeH and others added 6 commits December 16, 2017 15:03

Update README.rst

9b90089

Update README.rst

c9c47fb

Merge branch 'master' of https://github.com/QQuick/Opy

1d0ff53

Added class ConfigSettings (aka OpyConfigSettings) which the library …

fac7dbf

…is able to employ for the extended configurations rather than requiring an external hard file.

BuvinJT force-pushed the master branch from 6a058fc to 78891a9 Compare October 26, 2018 18:04

BuvinJT force-pushed the master branch from 78891a9 to 13f09c8 Compare October 26, 2018 21:07

BuvinJ added 3 commits October 27, 2018 16:09

Refactor to squash.

7fec9c4

Minor refactoring for library interface.

2170067

JdeH added the STATE: under consideration label Nov 5, 2018

BuvinJ added 9 commits November 5, 2018 16:31

Patched minor glitch with original plain_files argument / setting. Pa…

2f1441a

…tched glitch in library method of wrapping the utility (i.e. using module reload rather than simply import).

Patched glitches in masking feature. Added bugs directory with exampl…

38b5cf7

…es of known problems to be resolved. Updated documentation.

Patched bug with bytes type (i.e. string literals with "b" prefixes).

4886dc5

Added class OpyFile, within new module "patcher", for applying "quick…

5ee7abd

… patches" to obfuscated results. When the utility (or the user configuration) isn't quite working as desired, this lets you just tweak specific file lines using functions such as "replaceInLine".

Added "high-level" / "convenience" patching functions.

d6cef75

Patched fork for Python 3. Now seemingly fully functional again for P…

06f54cd

…y 2 & 3. Added six library as requirement.

Added "obfuscatedFileDict", and now returning it as part of the libra…

a2d33bc

…ry analyze function. This allows wrapper scripts to map clear text paths to the obfuscated results. The primary purpose at this point being for use with OpyFile patching.

JdeH added the NEED: feedback label Dec 2, 2018

Added a return tuple from the library core obfuscate function contain…

7bbf07d

…ing the same values as provided by (the "dry run") analyze function.

BuvinJ added 3 commits December 6, 2018 20:55

Defined class OpyResults. Now returning an object of that type from o…

e068af3

…bfuscate and analyze functions, rather than a tuple, making the return values more easily expandable / "future proof" compared to unpacking a tuple with an expected order / length.

Fixed glitch in obfuscatedFileDict keys.

9d35d97

Fixed bug in beta import masking feature when * wildcard encountered.

39ab183

BuvinJT force-pushed the master branch 2 times, most recently from 87e3c62 to 39ab183 Compare December 11, 2018 13:54

JdeH changed the base branch from master to opy_db December 14, 2018 11:24

JdeH merged commit 81335da into QQuick:opy_db Dec 14, 2018

JdeH added IS: enhancement STATE: part. complete and removed STATE: under consideration NEED: feedback labels Dec 15, 2018
