
FIFO pipe code will hang indefinitely preventing any logging output #11453

Open
hans2520 opened this issue Dec 13, 2021 · 6 comments

Comments

@hans2520

hans2520 commented Dec 13, 2021

Originally posted by @hans2520 in #10489 (comment)

While debugging the issue above, I wrote some code that could improve both the reliability and the debuggability of the open_fifo_write method.

This method is critical during the startup of an AWX playbook job; if it hangs, the user gets no indication whatsoever of what is happening.

The code below was just my first pass at improving the method, written before the ultimate root cause of my issue was discovered (a recent Crowdstrike policy update deploying script-based execution monitoring, which blocked the FIFO pipe in the first place). The basic goal of the improvement is to force-kill the job if it doesn't finish within a certain allotted time, capture the error, and thereby give the user something useful to work with as to the cause. The flushing here should also help reliability somewhat, especially with larger keys.

But it's a work in progress for someone else to finish. In particular, the join will not have the intended effect if the pipe genuinely hasn't finished its write. Using 'daemon' mode may be required here; in addition, consider using the multiprocessing library rather than Thread, as it is likely better suited to this task (a process can be terminated on timeout; a thread cannot).

```python
import logging
import os
import stat
import threading

logger = logging.getLogger(__name__)

def open_fifo_write(path, data):
    '''open_fifo_write opens the fifo named pipe in a new thread.
    This blocks the thread until an external process (such as ssh-agent)
    reads data from the pipe.
    '''
    logger.debug(f'Pipe write! path: {path}')
    try:
        os.mkfifo(path, stat.S_IRUSR | stat.S_IWUSR)

        def write_data(p, d):
            # open() blocks here until a reader opens the other end of the FIFO
            with open(p, 'wb') as f:
                f.write(d)
                f.flush()

        # daemon=True so a blocked writer cannot keep the interpreter alive
        t1 = threading.Thread(target=write_data, args=(path, data), daemon=True)
        t1.start()
        t1.join(3)  # note: join() only waits; it cannot stop the thread on timeout
        logger.debug(f"thread {t1} finished writing")
    except Exception as e:
        logger.exception(e)
```
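The multiprocessing variant suggested above could look roughly like this. This is only a sketch, not AWX's actual code: the `timeout` parameter, the `_write_data` helper, and raising `TimeoutError` are my own additions for illustration. Unlike a thread, a child process blocked in `open()` can be force-killed with `terminate()`, which is what the timeout behavior needs.

```python
import logging
import multiprocessing
import os
import stat

logger = logging.getLogger(__name__)

def _write_data(path, data):
    # open() blocks here until some reader opens the other end of the FIFO
    with open(path, 'wb') as f:
        f.write(data)
        f.flush()

def open_fifo_write(path, data, timeout=10):
    # Create the FIFO readable/writable only by the current user
    os.mkfifo(path, stat.S_IRUSR | stat.S_IWUSR)
    # The 'fork' start method keeps this sketch self-contained on Unix
    ctx = multiprocessing.get_context('fork')
    proc = ctx.Process(target=_write_data, args=(path, data), daemon=True)
    proc.start()
    proc.join(timeout)
    if proc.is_alive():
        # Unlike a thread, a stuck process can be force-killed on timeout
        proc.terminate()
        proc.join()
        logger.error('FIFO write to %s timed out after %ss; is anything reading the pipe?',
                     path, timeout)
        raise TimeoutError(f'FIFO write to {path} timed out after {timeout}s')
```

With this shape, a hung write surfaces as a logged error and an exception after `timeout` seconds instead of blocking the job startup indefinitely.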
@craph
Contributor

craph commented Jan 11, 2022

I have a similar issue #11518 and it seems linked to Crowdstrike too, but only when AWX is on K8s. With an older version (17.x) installed in local Docker, and Crowdstrike running too, the project update works.

Do you have any idea what changed between versions 17.x and 19.x in the playbook processing for project updates?

@hans2520
Author

This area of code is unchanged since version 5.0.0, where it originally appeared.

@craph
Contributor

craph commented Jan 11, 2022

This area of code is unchanged from version 5.0.0, where it was originally found

Why are you saying it was found in version 5.0.0? I don't see that in the referenced issue; #10489 refers to version 19.2.1.

@hans2520
Author

hans2520 commented Jan 12, 2022 via email

@craph
Contributor

craph commented Jan 12, 2022

@hans2520 Can you please tell me which file the issue is in?

@hans2520
Author

The file is linked in the issue description.
