Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Doc: minor updates to the documentation for MERGE. #7236

Merged
merged 1 commit into from
Nov 18, 2022

Conversation

berleant
Copy link
Contributor

Several people have encountered the same issue using MERGE (described here: #6053). This is a simple update of the documentation based on the solution presented in that issue. Issue 6094 (#6094) suggests a more extensive update to the docs, but these minor updates will be helpful in the meantime.

Copy link
Member

@rokm rokm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for trying to address this.

Unfortunately, the situation with MERGE is rather complicated, as MERGE really should be used only when one has multiple onefile executables.

It makes practically no sense for any scenario involving onedir builds (such as #6053, although we failed to mention that there), because it forces a onefile semantic upon onedir executables (i.e., the executable will copy all the referenced files into temporary directory on each run).

I think the docs really don't make this point clear enough, and there should be a note about this at the beginning of the MERGE documentation. Because for combining multiple onedir executables into a single package, a shared collect should be used instead, as discussed in #7140.

And yes, the MERGE docs should be updated as per #6094, but I think MERGE in its current form should be scrapped completely, and instead replaced with an option to use a shared external PKG archive among multiple onefile executables (and a shared external PYZ archive among multiple onedir executables).

@@ -584,7 +584,8 @@ a list of tuples, each tuple having three elements:

* The second element is the script name of the analyzed app (without the ``.py`` extension).

* The third element is the name for the executable (usually the same as the script).
* The third element is the path to the executable relative to the ``dist``
directory (e.g. ``script_dir/script_name``).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When an executable is a onedir executable, the relative path in dist directory is indeed script_dir/script_name (usually just script_name/script_name). But when the executable is a onefile executable, it is just script_name. This is an important distinction, and is probably what is throwing off most of the users that run into problems here.

MERGE( (foo_a, 'foo', 'foo'), (bar_a, 'bar', 'bar'), (zap_a, 'zap', 'zap') )
MERGE( (foo_a, 'foo', os.path.join('foo', 'foo'),
(bar_a, 'bar', os.path.join('bar', 'bar'),
(zap_a, 'zap', os.path.join('zap', 'zap') )
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As per the comment above, both the original and the new example are correct (and coversely, incorrect at the same time), depending on the context; i.e., are executables onedir or onefile ones.

That said, MERGE doesn't really make sense with onedir executables, even if it is technically possible and even though we even have tests for various mixed-mode scenarios:

But for onedir executables, MERGE brings no benefits and forces the executable into onefile semantics (copying the files to temporary directory on each run). For onedir executables, a shared COLLECT should be used instead.

@berleant
Copy link
Contributor Author

berleant commented Nov 12, 2022

I can modify this PR with the information you've shared here, but are you sure that MERGE is not intended to be used for one-folder apps? The documentation contains multiple references (both indirect and direct) to it. At this point, that's not just insufficient documentation, but incorrect documentation.

  1. https://github.com/pyinstaller/pyinstaller/blame/c076ee096fdb782784a654cf86807adcfc9ee1bd/doc/spec-files.rst#L555
  2. https://github.com/pyinstaller/pyinstaller/blame/c076ee096fdb782784a654cf86807adcfc9ee1bd/doc/spec-files.rst#L543,
  3. https://github.com/pyinstaller/pyinstaller/blame/c076ee096fdb782784a654cf86807adcfc9ee1bd/doc/spec-files.rst#L645,
  4. https://github.com/pyinstaller/pyinstaller/blame/c076ee096fdb782784a654cf86807adcfc9ee1bd/doc/spec-files.rst#L652

@rokm
Copy link
Member

rokm commented Nov 12, 2022

I can modify this PR with the information you've shared here, but are you sure that MERGE is not intended to be used for one-folder apps?

While you can use it with onedir builds, it really makes no sense to do so, at least in my opinion. Because the resources that would be shared are already external anyway; but after the merge, they need to be copied to temporary directory on each run, thus slowing down the execution. So you gain nothing at the cost of the incurred onefile semantics.

So for multiple onedir applications, a shared COLLECT is the way to go, because then all external resources are implicitly shared due to being in the same directory. What remains in the executables themselves is the (very lightweight) PKG archive with executable-specific PYZ archive (and bootloader options, and bootstrap modules and runtime hooks), and those currently cannot be shared (with or without MERGE).

The mixed-mode really makes no sense. Making a onedir build depend on a onefile one is basically the same as making the seocnd build a onefile one in the first place. And conversely, making a onefile build depend on a onedir build is almost the same as making both onedir and using a shared COLLECT, except that with the latter, neither executable would need to copy files to temporary directory.

Lastly, I suppose using MERGE with onefile builds at least makes some sense, but the way all executables depend on the first one is a rather poor choice, either. Because if user decides they don't need the first executable for whatever reason and remove it, all execuables will stop working.

But the current implementation is the one that we've inherited, along with its documentation, such as it is...

@berleant
Copy link
Contributor Author

Sounds good, I'll update the documentation a little bit to reflect the knowledge you've shared here. Thanks!

Several people have encountered the same issue using MERGE (described
here: pyinstaller#6053). This is
a simple update of the documentation based on the solution presented in
that issue. Issue 6094 (pyinstaller#6094)
suggests a more extensive update to the docs, but these minor updates
will be helpful in the meantime.
@berleant
Copy link
Contributor Author

Updated, hopefully this will suffice.

@bwoodsend
Copy link
Member

Looks ok to me although I've honestly never used this feature so I wouldn't know.

@rokm
Copy link
Member

rokm commented Nov 18, 2022

Looks good to me, thanks for the update!


As a side note, I've tried to create a onefile multipackage following the updated instructions to make sure we haven't missed anything, and found out that MERGE has been effectively broken since 14949d3 because marking entries-to-be-removed by assinging toc[i] = (None, None, None) does not work anymore (admittedly, doing it like that in the first place is asking for trouble...):

toc[i] = (None, None, None)
# Clean the list
toc[:] = [tpl for tpl in toc if tpl != (None, None, None)]

So while MERGE is supposed to remove entries (that are found in preceding executable(s)) in .binaries and .datas and replace them with references in .dependencies (and then all of them need to be passed the EXE), the entries from .binaries and .datas are not cleared (so the executables are actually not reduced in size, and you get warnings about files that should not exist at runtime).

But that's our problem to solve...

@rokm rokm merged commit f47d59d into pyinstaller:develop Nov 18, 2022
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Jan 17, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants