Env vars missing - root cause of 101 issue #1

Open
NewtonChutney opened this issue Oct 13, 2023 · 3 comments

Comments

@NewtonChutney

NewtonChutney commented Oct 13, 2023

Hello @danielbeach,
I think each task run gets its own temporary environment. I've found with the BashOperator that each run executes in a different temporary folder. Maybe the PythonOperator is similar?

UPDATE:
I noticed you were using a specific temporary directory for storage, so this shouldn't be an issue.

And when you run on a set of Kubernetes workers, shouldn't the download, chmod, and execution be in the same task on the DAG, so they get executed inside the same worker?

@NewtonChutney
Author

The issue is caused by the lack of env variables

Had you tried check_output, it would have been obvious. 😅

NewtonChutney changed the title from "Possible environment difference" to "Env vars missing - root cause of 101 issue" on Oct 14, 2023
@veloxl

veloxl commented Oct 26, 2023

I was also going to suggest using subprocess.check_output rather than subprocess.check_call here:

subprocess.check_call(f"{temp_dir} {uri}", shell=True)

More info on stackoverflow.

The TL;DR is that check_call() returns as soon as the shell it started exits, without waiting for any background processes that shell launched. check_output() keeps reading until stdout is closed, so it also waits for background Python processes that inherited the pipe. As @NewtonChutney mentioned, check_output() also returns the output from the child processes, which should make debugging easier.
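
A minimal sketch of the debugging point, assuming a hypothetical child command that prints a message and exits non-zero. The helper name `run_and_capture` and the child command are illustrative only and are not taken from the project:

```python
import subprocess
import sys


def run_and_capture(cmd):
    """Run cmd (a list of arguments) and return (exit_code, combined_output)."""
    try:
        # stderr is merged into stdout so error messages are captured too.
        out = subprocess.check_output(cmd, stderr=subprocess.STDOUT, text=True)
        return 0, out
    except subprocess.CalledProcessError as exc:
        # On a non-zero exit, the captured output is attached to the exception.
        return exc.returncode, exc.output


# Hypothetical failing child: prints a message, then exits with status 3.
failing_child = [
    sys.executable,
    "-c",
    "import sys; print('missing env var'); sys.exit(3)",
]

code, output = run_and_capture(failing_child)
print(code, output.strip())
```

With check_call() the same failure would still raise CalledProcessError, but the exception would carry no captured output to inspect.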

Also, you might be able to pass env variables to the subprocess command like this:

subprocess.check_output(f"{temp_dir} {uri}", shell=True, env={"KEY": "VALUE"})

I haven't tested it but might be worth a look.
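
One caveat worth sketching: the `env` argument replaces the child's whole environment rather than adding to it, so passing only `{"KEY": "VALUE"}` would drop variables such as PATH. A hedged example that merges the extra variables into a copy of the parent's environment follows. The helper name `run_with_extra_env` and the child command are assumptions for illustration, not the project's code:

```python
import os
import subprocess
import sys


def run_with_extra_env(cmd, extra_env):
    """Run cmd with the parent's environment plus extra_env; return its stdout."""
    # Copy os.environ first so PATH and friends survive, then overlay the extras.
    child_env = {**os.environ, **extra_env}
    return subprocess.check_output(cmd, env=child_env, text=True)


# Hypothetical child that just echoes the injected variable.
child = [sys.executable, "-c", "import os; print(os.environ['KEY'])"]
print(run_with_extra_env(child, {"KEY": "VALUE"}))
```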

@NewtonChutney
Author

Yep, we have a similar setup to inject env variables for the BashOperator.
Currently, we're using it to run pytest inside Airflow's worker environment.
