Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add check for PYQGIS_STARTUP to avoid executing provided file twice #56827

Merged
merged 2 commits into from Apr 15, 2024

Conversation

ostr00000
Copy link
Contributor

Description

Add a check for PYQGIS_STARTUP to avoid executing the same file twice.

This happens on Linux when an absolute path is used in PYQGIS_STARTUP. See Problem source in issue #56823 description for more details.
I think a backport is not necessary, because there is a possible workaround for existing implementations.

fixes #56823

@github-actions github-actions bot added this to the 3.38.0 milestone Mar 12, 2024
@@ -60,7 +60,7 @@ bool QgsPythonUtilsImpl::checkSystemImports()
// isolating/loading the initial environ without requiring a virt env, e.g. homebrew or MacPorts installs on Mac
runString( QStringLiteral( "pyqgstart = os.getenv('PYQGIS_STARTUP')\n" ) );
runString( QStringLiteral( "if pyqgstart is not None and os.path.exists(pyqgstart):\n with open(pyqgstart) as f:\n exec(f.read())\n" ) );
runString( QStringLiteral( "if pyqgstart is not None and os.path.exists(os.path.join('%1', pyqgstart)):\n with open(os.path.join('%1', pyqgstart)) as f:\n exec(f.read())\n" ).arg( pythonPath() ) );
runString( QStringLiteral( "if pyqgstart is not None and os.path.join('%1', pyqgstart) != pyqgstart and os.path.exists(os.path.join('%1', pyqgstart)):\n with open(os.path.join('%1', pyqgstart)) as f:\n exec(f.read())\n" ).arg( pythonPath() ) );
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It took me a while to understand this logic -- it's correct, but I think we could clean this up.

What about something like:

const QString script = R"""(pyqgstart = os.getenv('PYQGIS_STARTUP')
if pyqgstart is not None and os.path.exists(pyqgstart):
    with open(pyqgstart) as f:
        exec(f.read())
elif pyqgstart is not None and os.path.exists(os.path.join('%1', pyqgstart)):
    with open(os.path.join('%1', pyqgstart)) as f:\
        exec(f.read())
)""");
runString(script);

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be good to clean this a bit - this one line is extremely long (I used to max 88 char in Python). Moreover, it is not obvious that these three runString calls are strictly related.

First thing I notice in your sample is elif statement. This will change current behavior where 2 files may be executed, but probably this is expected.
If a user wants to call multiple files, then it must be individually implemented. This is what I did, and I saw somebody else wanted this (third bullet), but this will be another issue - "poorly scalable startup script configuration".

Also, I think it would be very helpful to notify about invalid path. If somebody sets PYQGIS_STARTUP and that path does not exist, then something is misconfigured.

What about something like this? (only Python code for simplicity):

pyqgstart = os.getenv('PYQGIS_STARTUP')
if pyqgstart is not None:
    if os.path.exists(pyqgstart):
        with open(pyqgstart) as f:
            exec(f.read())
    elif os.path.exists('%1', pyqgstart):
        with open(os.path.exists('%1', pyqgstart)) as f:
            exec(f.read())
    else:
        print(
            f"Error: PYQGIS_STARTUP is set, but neither {pyqgstart} "
            f"nor {os.path.exists('%1', pyqgstart)} exist!", 
            file=sys.stderr,
        )

Maybe an empty value for PYQGIS_STARTUP also should be correct?


Out of curiosity, I frequently see in various project that in c++ python os.path is used instead of pathlib.Path. Is there any strong reason to use this low level API, or this is a leftover from previous Python 2 code?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be good to clean this a bit - this one line is extremely long (I used to max 88 char in Python). Moreover, it is not obvious that these three runString calls are strictly related.

I totally agree! 👍

First thing I notice in your sample is elif statement. This will change current behavior where 2 files may be executed, but probably this is expected.

Ah, that's a good point. Let's err on the side of caution and NOT break that, so we better avoid the elif here.

Your proposed version looks good, but with the elif removal.

Also, I think it would be very helpful to notify about invalid path. If somebody sets PYQGIS_STARTUP and that path does not exist, then something is misconfigured

That also sounds very reasonable! (You can use the QgsMessageLog class for this)

Out of curiosity, I frequently see in various project that in c++ python os.path is used instead of pathlib.Path. Is there any strong reason to use this low level API, or this is a leftover from previous Python 2 code?

It's 100% historic reasons only. Feel free to use pathlib and upgrade existing code to pathlib as you see fit!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. I compiled QGIS to deeply test these changes (It was not so simple - I have issue Missing qt sip modules during compile #54184)

  2. Turns out that we cannot use in runString method a multiline python code. The error is:
    SyntaxError: multiple statements found while compiling a single statement

  3. Maybe there are some dirty tricks, but a function is also a statement, so I decided to create a new function.
    I gave it meaningful name, because may be useful to call this from user script (multiple startup scripts scenario).
    Sometimes function like that are deleted (del) after it is executed, but I consider it as bad practice.
    If a user wants to use the same function, then it is copied instead of reusing it.

  4. I keep the old name for pyqgstart for compatibility.

  5. I used compile built-in function to assign the executed code a name (second argument).
    This is very useful in stack traces (otherwise, empty string is used as file path).

  6. I tested my attempt on my existing startup scripts and to my surprise, there was NameError in my script.

My simplified code (using workaround from issue description) looks like this:

import __main__

if getattr(__main__, 'startupExecuted', False):
    def configure():
        pass

    def run():
        configure()
        print("here are some other calls")

    run()
    startupExecuted = True
  1. Finally, I found an error cause and a fix.
    Now we must provide a second argument (global) to exec function, otherwise, changes are performed in current scope, which is newly introduced function run_startup_script.
    Also note, we cannot provide third argument (local), because this is the same situation as without providing these arguments at all.

  2. We still must check for startup_path_2 != startup_path_1, because pathlib.Path behavior is similar to os.path.

  3. As suggested, I used QgsMessageLog.
    There was a runtime python error - we must add from qgis.core import Qgis, QgsMessageLog,
    because it is very beginning of python interpreter.
    I do not know how this early import may impact further code.
    Please look at this code (few lines below my changes):

    // import QGIS bindings
    QString error_msg = QObject::tr( "Couldn't load PyQGIS." ) + '\n' + QObject::tr( "Python support will be disabled." );
    if ( !runString( QStringLiteral( "from qgis.core import *" ), error_msg ) || !runString( QStringLiteral( "from qgis.gui import *" ), error_msg ) )
    {
    return false;
    }

I see possible alternatives:

  • use print - but no info in QGIS GUI,
  • move import after exec (give chance to a user to manipulate paths?),
  • keep QgsMessageLog only for not script_executed (but the import still exists).
  1. Please review these changes and make some decisions.
    This is starting to be a huge change for such a tiny fix 😄

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ostr00000

Turns out that we cannot use in runString method a multiline python code. The error is:
SyntaxError: multiple statements found while compiling a single statement

Ah, it's non obvious (and non-documented 🙄 ), but you need to set the single argument to runString to false.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nyalldawson Can I request review?

Although I missed argument single (I peeked at runString but somehow I have ignored that argument 🤔 ),
I believe that proposed solution (first declaring a new function and then calling it) is a better solution, because the global scope is not polluted with internal variables.

The remaining question from my previous message is whether we are fine with importing from qgis.core.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ostr00000
The new approach looks good!

The remaining question from my previous message is whether we are fine with importing from qgis.core

That's an unknown 😄 I'd say there's a risk, but it's low risk and we'll only know if we make the change and then someone reports bugs. And it's simple enough to drop the logging if it does cause issues.

- use pathlib,
- change complex one-line condition to function,
- use `QgsMessageLog`,
Copy link

🪟 Windows builds ready!

Windows builds of this PR are available for testing here. Debug symbols for this build are available here.

(Built from commit b491b0e)

Copy link

github-actions bot commented Apr 6, 2024

The QGIS project highly values your contribution and would love to see this work merged! Unfortunately this PR has not had any activity in the last 14 days and is being automatically marked as "stale". If you think this pull request should be merged, please check

  • that all unit tests are passing

  • that all comments by reviewers have been addressed

  • that there is enough information for reviewers, in particular

    • link to any issues which this pull request fixes

    • add a description of workflows which this pull request fixes

    • add screenshots if applicable

  • that you have written unit tests where possible
    In case you should have any uncertainty, please leave a comment and we will be happy to help you proceed with this pull request.
    If there is no further activity on this pull request, it will be closed in a week.

@github-actions github-actions bot added stale Uh oh! Seems this work is abandoned, and the PR is about to close. and removed stale Uh oh! Seems this work is abandoned, and the PR is about to close. labels Apr 6, 2024
@nyalldawson nyalldawson merged commit 65ef22f into qgis:master Apr 15, 2024
28 checks passed
@agiudiceandrea
Copy link
Contributor

agiudiceandrea commented Apr 17, 2024

Hi @ostr00000, a recent Windows MinGW64 build (62c5427) displays the following error during the start up:

An error occurred during execution of following code:
run_startup_script(pyqgstart)


Traceback (most recent call last):
  File "", line 1, in 
RuntimeError: lost sys.stdout


Python version:
3.11.8 (main, Feb 16 2024, 00:00:00) [GCC 13.2.1 20230728 (Fedora MinGW 13.2.1-5.fc39)]

QGIS version:
3.37.0-Master 'Master', exported

May such error be due to this PR?

@ostr00000
Copy link
Contributor Author

ostr00000 commented Apr 17, 2024

Hi @agiudiceandrea,

I have never seen that error before.
The error position (File "", line 1, in ) indicates that this error is inside run_startup_script (Error in a user function should have a nice stacktrace).
According to this answer https://stackoverflow.com/a/65694243, the error may be caused by not fully initialized stdout. Maybe it is initialized in QGIS somewhere later in code? I will search for it.

May such error be due to this PR?

I think it may be possible - we were unsure about using loggers at that early program stage.
Unfortunately, I tested it only on Linux (I have no access to Windows) and there was no such error.

Have you created a new issue? I would prefer to continue the discussion in a new issue.

@agiudiceandrea
Copy link
Contributor

I would prefer to continue the discussion in a new issue.

@ostr00000, please see: #57173.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Python script set in PYQGIS_STARTUP environment variable runs twice
3 participants