Improve the main module documentation #83633

maggyero · 2020-01-25T14:00:08Z

BPO	39452
Nosy	@gvanrossum, @terryjreedy, @ncoghlan, @cameron-simpson, @stevendaprano, @ambv, @maggyero, @andresdelfino, @miss-islington, @iritkatriel, @jdevries3133
PRs	bpo-39452: Improve the __main__ module documentation #14487 bpo-39452: rewrite and expand __main__.rst #26883 [3.10] bpo-39452: Rewrite and expand __main__.rst (GH-26883) #27932 bpo-39452: [doc] Change "must" to "can", on relative import style in __main__ modules #29379 [3.10] bpo-39452: [doc] Change "must" to "can" on relative import style in `__main__` (GH-29379) #29449
Files	less_prescriptive.diff

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

GitHub fields:

assignee = None
closed_at = <Date 2021-08-24.20:55:28.252>
created_at = <Date 2020-01-25.14:00:07.649>
labels = ['3.11', 'type-feature', '3.10', 'docs']
title = 'Improve the __main__ module documentation'
updated_at = <Date 2021-11-06.18:50:09.178>
user = 'https://github.com/maggyero'

bugs.python.org fields:

activity = <Date 2021-11-06.18:50:09.178>
actor = 'lukasz.langa'
assignee = 'docs@python'
closed = True
closed_date = <Date 2021-08-24.20:55:28.252>
closer = 'lukasz.langa'
components = ['Documentation']
creation = <Date 2020-01-25.14:00:07.649>
creator = 'maggyero'
dependencies = []
files = ['50249']
hgrepos = []
issue_num = 39452
keywords = ['patch']
message_count = 19.0
messages = ['360682', '360695', '377026', '377035', '377050', '396157', '396443', '399348', '400219', '400238', '400239', '400671', '400781', '400782', '400791', '400793', '401441', '405875', '405877']
nosy_count = 12.0
nosy_names = ['gvanrossum', 'terry.reedy', 'ncoghlan', 'cameron', 'steven.daprano', 'docs@python', 'lukasz.langa', 'maggyero', 'adelfino', 'miss-islington', 'iritkatriel', 'jack__d']
pr_nums = ['14487', '26883', '27932', '29379', '29449']
priority = 'normal'
resolution = 'fixed'
stage = 'resolved'
status = 'closed'
superseder = None
type = 'enhancement'
url = 'https://bugs.python.org/issue39452'
versions = ['Python 3.10', 'Python 3.11']

maggyero · 2020-01-25T14:00:07Z

This PR will apply the following changes on the __main__ module documentation:

replace the phrase "run as script" with "run from the file system" (as used in the runpy documentation), since "run as script" does not mean the intended python foo.py but python -m foo (cf. PEP-338);
replace the phrase "run with -m" with "run from the module namespace" (as used in the runpy documentation), since the module can equivalently be run with runpy.run_module('foo') instead of python -m foo;
make the block comment PEP-8-compliant (placed before the if block, starting with a capital letter, ending with a period);
add a missing case in which a package's __main__.py is executed (when the package is run from the file system: python foo/).

stevendaprano · 2020-01-25T16:10:06Z

There are some serious problems with the PR.

You state that these two phrases are from the runpy documentation:

"run from the module namespace"
"run from the file system"

but neither of those phrases appear in the runpy documentation here:

https://docs.python.org/3/library/runpy.html

You also say:

"run as script" does not mean the intended python foo.py
but python -m foo

but this is incorrect, and I think based on a misunderstanding of PEP-338. The title of PEP-338, "Executing modules as scripts", is not exclusive: the PEP is about the -m mechanism for locating the module in order to run it as a script. It doesn't imply that python spam.py should no longer be considered to be running a script.

In common parlance, "run as a script" certainly does include the case where you specify the module by filename python spam.py as well as the -m case where you specify it as a module name and let the interpreter locate the file. In other words, both

python pathname/spam.py
python -m spam

are correctly described as "running spam.py as a script" (and other variations). They differ in how the script is specified, but both mechanisms treat the spam.py file as a script and run it.

See for example https://duckduckgo.com/?q=how+to+run+a+python+script for examples of common usage.

Consequently, it is simply wrong to say that the intended usage of "run a script" is the -m mechanism.

The PR changes the term "scope" to "environment", but I think that is wrong. An environment is potentially greater than a scope. __main__ is a module namespace, hence a scope. The environment includes things outside of that scope, such as the builtins, environment variables, the current working directory, the python path, etc. We don't talk about modules being an environment, but as making up a scope.

The PR introduces the phrase "when the module is run from the file system" to mean the case where a script is run using python spam.py, but it equally applies to the case of python -m spam. In both cases, spam is located somewhere in the file system.

(It is conceivable that -m could locate and run a built-in module, but I don't know any cases where that actually works. Even if it does, we surely don't need to complicate the docs for this corner case. It's enough to know that -m will locate the module and run it.)

The PR describes three cases: running from the file system, running from stdin, and running "from the module namespace" but that last one is a clumsy phrase which, it seems to me, is not correct. How do you run a module from its own namespace? Modules *are* a namespace, and we say code runs *in* a namespace, not "from" it.

In any case, it doesn't matter whether the script is specified on the command line as a file name, or as a module name with -m, or double-clicked in a GUI: in all three cases the module's code is executed in the module's namespace.

So it is wrong to distinguish "from the file system" and "from (in) the module namespace" as two distinct cases. They are the same case.

The PR replaces the comment inside the if block:

# execute only if run as a script

with a comment above the `if` statement:

# Execute only if the module is not imported.

but the new comment is factually incorrect on two counts. Firstly, it is not correct that the if statement executes only if the module is not imported. There is no magic to the if statement. It always executes, regardless of whether the module is being run as a script or not. We can write code like this:

    if print("Hello, this always runs!") or __name__ == '__main__':
        # execute only if run as a script
        print('running as a script')
    else:
        # execute only if *not* run as a script
        print('not run as a script')

Placing the comment above the if, where it will apply to the entire if statement, is incorrect.

The second problem is that when running a module with -m it *is* imported. PEP-338 is clear about this:

"if -m is used to execute a module the PEP-302 import mechanisms are used to locate the module and retrieve its compiled code, before executing the module"

(in other words: import the module). We can test this, for example, if you create a package:

spam/
+-- __init__.py
+-- eggs.py

and then run python -m spam.eggs, not only __main__ (the eggs.py module) but also spam will be found in sys.modules. So the new comment is simply wrong.
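The package experiment described above can be reproduced with a short script. This is only a sketch: the helper name `modules_seen_by_dash_m` is made up for illustration, and it assumes the interpreter adds the current working directory to `sys.path` when `-m` is used (the default behaviour).

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def modules_seen_by_dash_m() -> dict:
    """Run ``python -m spam.eggs`` in a throwaway package and report what it sees."""
    with tempfile.TemporaryDirectory() as tmp:
        # Build the spam/ package from the example: __init__.py plus eggs.py.
        pkg = Path(tmp) / "spam"
        pkg.mkdir()
        (pkg / "__init__.py").write_text("")
        (pkg / "eggs.py").write_text(
            "import sys\n"
            "print(__name__)\n"
            "print('spam' in sys.modules)\n"
        )
        # Run from the temporary directory so that "spam" is importable.
        result = subprocess.run(
            [sys.executable, "-m", "spam.eggs"],
            cwd=tmp, capture_output=True, text=True, check=True,
        )
    name, has_spam = result.stdout.split()
    return {"name": name, "spam_imported": has_spam == "True"}
```

Calling `modules_seen_by_dash_m()` should report that eggs.py ran under the name `__main__` while the parent package `spam` was imported along the way.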

There may be other issues with the PR.

maggyero · 2020-09-16T21:15:26Z

Thanks for your extended review Steven.

> You state that these two phrases are from the runpy documentation:
>
> "run from the module namespace"
>
> "run from the file system"
>
> but neither of those phrases appear in the runpy documentation here:
>
> https://docs.python.org/3/library/runpy.html

I agree. Actually the first paragraph of the page uses the phrases:

"located using the module namespace";
"located using the file system",

so instead of saying:

"run a module located using the module namespace" to mean "python -m <module>",
"run a module located using the file system" to mean "python <file>",

I simplified to:

"run from the module namespace"
"run from the file system"

But since the terminology is misleading I have used these phrases instead:

python: "module initialized from an interactive prompt";
python < <file>: "module initialized from standard input";
python <file>: "module initialized from a file argument";
python -c <code>: "module initialized from a -c argument";
python -m <module>: "module initialized from a -m argument";
import <module>: "module initialized from an import statement".

What the documentation tries to explain is that in all of these cases except the last one, code is executed in the __main__ module.
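The claim that only the import case runs code outside `__main__` can be checked by printing `__name__` under each invocation. The sketch below is illustrative only: the file name `probe.py` and the helper `observed_names` are assumptions, and the interactive prompt case is left out because it needs a terminal.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

PROBE = "print(__name__)\n"  # the only statement in the hypothetical probe module


def observed_names() -> dict:
    """Return the value of __name__ seen by probe.py for each way of starting it."""
    with tempfile.TemporaryDirectory() as tmp:
        (Path(tmp) / "probe.py").write_text(PROBE)

        def run(args, stdin=None):
            done = subprocess.run(
                [sys.executable, *args],
                cwd=tmp, input=stdin, capture_output=True, text=True, check=True,
            )
            return done.stdout.strip()

        return {
            "python < <file>": run([], stdin=PROBE),
            "python <file>": run(["probe.py"]),
            "python -c <code>": run(["-c", PROBE]),
            "python -m <module>": run(["-m", "probe"]),
            "import <module>": run(["-c", "import probe"]),
        }
```

Every entry except the last should come back as `__main__`; the import case reports the module's own name, `probe`.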

I have updated the PR.

----

> The PR changes the term "scope" to "environment", but I think that is wrong. An environment is potentially greater than a scope. __main__ is a module namespace, hence a scope. The environment includes things outside of that scope, such as the builtins, environment variables, the current working directory, the python path, etc. We don't talk about modules being an environment, but as making up a scope.

I disagree. According to Wikipedia (https://en.wikipedia.org/wiki/Scope_(computer_science)), the term "scope" is the part of a program where a name binding is valid, while the term "environment" (synonym of "context") is the set of name bindings that are valid within a part of a program. Therefore "scope" is a property of a name binding (a name binding has a scope), and "environment" is a property of a part of a program (a part of a program has an environment).

And the term "environment" is actually already used in the original title and synopsis of the document (and it is correct):

:mod:`__main__` --- Top-level script environment

.. module:: __main__
:synopsis: The environment where the top-level script is run.

So my change to the body fixes the inconsistent and incorrect usage of "scope":

- '__main__' is the name of the scope in which top-level code executes.
+ '__main__' is the name of the environment where top-level code is run.

- A module can discover whether or not it is running in the main scope
+ A module can discover whether or not it is running in the main environment

----

> Placing the comment above the if, where it will apply to the entire if statement, is incorrect.

I agree. Sometimes you see comments before if statements but they usually don't start with "execute".

I have updated the PR.

----

> The second problem is that when running a module with -m it *is* imported. PEP-338 is clear about this:

I agree. I should have said "when the module is not initialized from an import statement".

But note that even before my change the original document already used the phrase "not imported":

- executing code in a module when it is run as a script or with ``python
- -m`` but not when it is imported::
+ executing code in a module when it is not imported::

-     # execute only if run as a script
+     # Execute only if the module is not imported.

I have updated the PR.

terryjreedy · 2020-09-17T07:18:50Z

The main issue I have with the existing doc is its use of 'top-level' to mean the main, initial, startup module that first executes the user code for a python 'program'. We routinely use 'top-level' instead for the global scope of a module. Example: https://docs.python.org/3/glossary.html, 'qualified name' entry, line 2: "For top-level functions and classes, ..." Within '__main__', some code is top-level, but class and function bodies are not.

But this does not have to be part of this PR.

maggyero · 2020-09-17T09:43:11Z

I agree with you Terry. Another thing that bothers me: in the current document, the __main__ module is reduced to its environment (aka context or dictionary), whereas a module object has other important attributes such as its code.

So how about adding the following changes?

- :mod:`__main__` --- Top-level code environment
- ==============================================
+ :mod:`__main__` --- Startup module
+ ==================================

- :synopsis: The environment where top-level code is run.
+ :synopsis: The first module from which the code is executed at startup.

- '__main__' is the name of the environment where top-level code is run.
+ '__main__' is the name of the startup module.

- A module can discover whether or not it is running in the main environment
+ A module can discover whether or not it is initialized as the :mod:`__main__` module

iritkatriel · 2021-06-20T00:00:25Z

See also bpo-24632 and bpo-17359.

jdevries3133 · 2021-06-23T19:28:26Z

Hi All,

As I wrote on the PR:

I am picking up the torch on 39452, continuing where @maggyero left 
off, and also implementing my discourse proposal, which seemed to be 
well-liked.

Feel free to leave any feedback for me on the GitHub PR, I'm looking forward to continuing to develop this work based on community feedback.

jdevries3133 · 2021-08-10T17:48:36Z

Hi All,

I'm pinging everyone here on the bpo because my GitHub PR has been through a lot of revision and review. Maybe it's close to being ready to merge (I hope)!

Feel free to take a look if you are interested: #26883

ambv · 2021-08-24T17:01:49Z

New changeset 7cba231 by Jack DeVries in branch 'main':
bpo-39452: Rewrite and expand __main__.rst (GH-26883)
7cba231

miss-islington · 2021-08-24T20:54:19Z

New changeset ec5a031 by Miss Islington (bot) in branch '3.10':
bpo-39452: Rewrite and expand __main__.rst (GH-26883)
ec5a031

ambv · 2021-08-24T20:55:28Z

Thanks a lot, Géry and Jack! ✨ 🍰 ✨

gvanrossum · 2021-08-30T21:31:57Z

Thanks, the rewrite is great!

I have one nit: did you consider which of these two idioms is better?

if __name__ == "__main__":
    main()

vs.

if __name__ == "__main__":
    sys.exit(main())

Your docs seem to promote the second, whereas I've usually preferred the former. Was this a considered choice on your part?
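One observable difference between the two idioms is what happens to the return value of main(). The following sketch is an illustration, not part of the merged docs: the `TEMPLATE` script and the `exit_status` helper are hypothetical, and the failure code 2 is arbitrary.

```python
import subprocess
import sys

# A tiny script whose main() reports failure through its return value.
TEMPLATE = """
import sys

def main():
    return 2  # hypothetical failure code

if __name__ == "__main__":
    {call}
"""


def exit_status(call: str) -> int:
    """Run the template with the given final call and return the process exit status."""
    source = TEMPLATE.format(call=call)
    return subprocess.run([sys.executable, "-c", source]).returncode
```

With the first idiom, `exit_status("main()")`, the return value is discarded and the process exits with status 0. With the second, `exit_status("sys.exit(main())")`, the return value becomes the exit status.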

maggyero · 2021-08-31T21:31:37Z

@jack__d

Thanks for the rewrite! This is a great expansion. Unfortunately I didn’t have the time to review it before the merge. If I find something to be improved I will let you know.

@gvanrossum

> Your docs seem to promote the second, whereas I've usually preferred the former.

Are you sure? In your 2003 blog post "Python main() functions" you promoted the opposite: the idiom if __name__ == "__main__": sys.exit(main()) over the idiom if __name__ == "__main__": main():

> Now the sys.exit() calls are annoying: when main() calls sys.exit(), your interactive Python interpreter will exit! The remedy is to let main()'s return value specify the exit status.

I am interested in the rationale if you changed your mind.

gvanrossum · 2021-08-31T21:40:03Z

You're right, I'm being inconsistent. :-( I withdraw my objection.

There are cases where sys.exit() is easier than returning an exit code, e.g. when the error is discovered deep inside some other code. But it's probably better to raise a dedicated exception in that case and catch it in main(), rather than just calling sys.exit() deep inside the other code. It's probably too fine a point for a tutorial. Sorry!

jdevries3133 · 2021-08-31T22:31:41Z

> Your docs seem to promote the second, whereas I've usually preferred the
> former. Was this a considered choice on your part?

First and foremost, stupid GitHub is not letting the permalink load for some
reason, but yes; this was discussed in the conversation with @graingert on
June 29th – it was his suggestion. Later, @pradyunsg from PyPA added some
suggestions about how the document described console script entrypoints,
and the documentation around this issue changed a bit again.

From my perspective, I never use the sys.exit idiom
myself. After all, an exception is going to cause a non-zero exit code, and a
traceback is always going to have a lot more value than an exit code.

I was, however, surprised to learn how pip treats console script entry points
in the course of working on this document. Specifically, it generates an
executable script that does wrap the function in sys.exit. I definitely think
that the way the document communicates this fact while teaching the idiom is a
good thing, so I think that whole "Idiomatic Usage" section is good.

I do think we can tweak the document slightly to make it less prescriptive,
though, because in reality a lot of people _don't_ use this idiom, so
presenting it as a de-facto standard is misleading. Plus, it's not
Pythonic to dole out prescriptive boilerplate.

I attached a diff that steers in that direction. What do you all think? It is
a pretty slight change, but I think it better strikes a balance.

maggyero · 2021-08-31T22:55:18Z

No worries, it was almost twenty years ago.

> But it's probably better to raise a dedicated exception in that case and catch it in main(), rather than just calling sys.exit() deep inside the other code.

Yes I agree, and I think you explained very clearly why it is better in the blog post:

> Another refinement is to define a Usage() exception, which we catch in an except clause at the end of main():
> […]
> This gives the main() function a single exit point, which is preferable over multiple return 2 statements.

So I think you made two independent points:

raising a dedicated exception instead of calling sys.exit inside nested functions and catching it inside main allows a single exit point;
calling sys.exit outside of main instead of inside prevents exiting the Python interpreter in an interactive session.
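The two points can be sketched together as follows. This is a minimal illustration rather than the code from the blog post: the `Usage` class body, the `parse_count` helper, and the exit code 2 are assumptions chosen for the example.

```python
import sys


class Usage(Exception):
    """Hypothetical exception signalling command-line misuse."""


def parse_count(argv):
    # Nested helper: it raises instead of calling sys.exit() itself.
    if len(argv) != 2 or not argv[1].isdigit():
        raise Usage(f"usage: {argv[0]} COUNT")
    return int(argv[1])


def main(argv=None):
    """Return an exit status instead of exiting, so main() is safe to call interactively."""
    if argv is None:
        argv = sys.argv
    try:
        count = parse_count(argv)
    except Usage as err:
        # Single place where a usage error is turned into an exit status.
        print(err, file=sys.stderr)
        return 2
    print("spam " * count)
    return 0
```

A script built this way would finish with the `if __name__ == "__main__": sys.exit(main())` guard, so that sys.exit is only reached when the file is run, while `main(["prog", "3"])` can be called from an interactive session without terminating it.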

ncoghlan · 2021-09-09T07:48:17Z

These changes are excellent - thanks for the patch!

Something even the updated version doesn't cover yet is directory and zipfile execution, so I filed bpo-45149 as a follow up ticket for that (the info does exist elsewhere in the documentation, so it's mostly just a matter of adding it to the newly expanded page, and deciding what new cross-references, if any, would be appropriate)
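Directory and zipfile execution can be tried out with a few lines. The sketch below assumes a directory named `foo` and an archive named `foo.zip`, both hypothetical, each holding a one-line `__main__.py`.

```python
import subprocess
import sys
import tempfile
import zipfile
from pathlib import Path

MAIN_SOURCE = "print(__name__)\n"  # contents of the hypothetical __main__.py


def run_directory_and_zip() -> tuple:
    """Execute a directory and a zip archive that each contain a __main__.py."""
    with tempfile.TemporaryDirectory() as tmp:
        app_dir = Path(tmp) / "foo"
        app_dir.mkdir()
        (app_dir / "__main__.py").write_text(MAIN_SOURCE)

        # The archive needs __main__.py at its top level.
        archive = Path(tmp) / "foo.zip"
        with zipfile.ZipFile(archive, "w") as zf:
            zf.write(app_dir / "__main__.py", arcname="__main__.py")

        outputs = []
        for target in (app_dir, archive):
            done = subprocess.run(
                [sys.executable, str(target)],
                capture_output=True, text=True, check=True,
            )
            outputs.append(done.stdout.strip())
    return tuple(outputs)
```

In both cases the interpreter should locate `__main__.py` inside the target and run it under the name `__main__`.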

ambv · 2021-11-06T18:09:27Z

New changeset 57457a1 by Andre Delfino in branch 'main':
bpo-39452: [doc] Change "must" to "can" on relative import style in __main__ (GH-29379)
57457a1

ambv · 2021-11-06T18:50:09Z

New changeset e53cb98 by Miss Islington (bot) in branch '3.10':
bpo-39452: [doc] Change "must" to "can" on relative import style in __main__ (GH-29379) (GH-29449)
e53cb98

maggyero mannequin added the 3.8 only security fixes label Jan 25, 2020

maggyero mannequin assigned docspython Jan 25, 2020

maggyero mannequin added docs Documentation in the Doc dir type-feature A feature request or enhancement 3.8 only security fixes labels Jan 25, 2020

maggyero mannequin added docs Documentation in the Doc dir type-feature A feature request or enhancement labels Jan 25, 2020

iritkatriel added 3.11 only security fixes and removed 3.8 only security fixes labels Jun 20, 2021

jdevries3133 mannequin added 3.7 (EOL) end of life 3.8 only security fixes 3.9 only security fixes 3.10 only security fixes labels Jun 23, 2021

zware removed 3.7 (EOL) end of life 3.8 only security fixes labels Jun 23, 2021

zware removed the 3.8 only security fixes label Jun 23, 2021

ambv removed the 3.9 only security fixes label Aug 24, 2021

ambv closed this as completed Aug 24, 2021

ezio-melotti transferred this issue from another repository Apr 10, 2022
