dora-daemon and dora-coordinator are background processes that do not see CLI environment variables #363

Open
haixuanTao opened this issue Oct 18, 2023 · 5 comments
Labels
bug (Something isn't working), cli (CLI), coordinator, daemon

Comments

@haixuanTao
Collaborator

Describe the bug
Sometimes an environment variable is updated before running the dora CLI. But because the coordinator and the daemon are background processes that are already running, they do not see the updated value. This can cause confusion.

To Reproduce
Steps to reproduce the behavior:

  1. Start the dora daemon: dora up
  2. Change an env variable
  3. Start a new dataflow: dora start dataflow.yaml
  4. Expect the dataflow to see the latest value of the env variable, but observe the previous value or an empty one
  5. Stop the dataflow: dora stop
  6. Destroy the dataflow: dora destroy
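
The steps above come down to how process environments work: a process receives a copy of its parent's environment when it is spawned, and later changes in the shell are not propagated to it. Below is a minimal Python sketch of that effect. It is not dora code: `DEMO_VAR`, `CHILD_SOURCE`, and `value_seen_by_background_process` are made-up names, and the child process only stands in for the daemon started by dora up.

```python
import os
import subprocess
import sys

# Stand-in for a long-running daemon (hypothetical, not dora code): it waits
# for a request on stdin and only then reports the DEMO_VAR value it can see.
CHILD_SOURCE = (
    "import os, sys\n"
    "sys.stdin.readline()\n"
    "print(os.environ.get('DEMO_VAR', '<unset>'))\n"
)


def value_seen_by_background_process(before: str, after: str) -> str:
    """Start a child with DEMO_VAR=before, then set it to `after` in the parent."""
    saved = os.environ.get("DEMO_VAR")
    try:
        os.environ["DEMO_VAR"] = before
        # Like `dora up`: the child gets a copy of the environment as of now.
        child = subprocess.Popen(
            [sys.executable, "-c", CHILD_SOURCE],
            stdin=subprocess.PIPE,
            stdout=subprocess.PIPE,
            text=True,
        )
        # Like step 2: the variable changes only in the parent's environment.
        os.environ["DEMO_VAR"] = after
        # Like `dora start`: ask the already-running process what it sees.
        output, _ = child.communicate("report\n")
        return output.strip()
    finally:
        # Restore the caller's environment.
        if saved is None:
            os.environ.pop("DEMO_VAR", None)
        else:
            os.environ["DEMO_VAR"] = saved


if __name__ == "__main__":
    print(value_seen_by_background_process("old", "new"))  # prints "old"
```

The child keeps reporting the value from the moment it was spawned, which matches the "previous one or an empty one" observation in step 4.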

Expected behavior
I would expect the CLI to pass the latest value to the daemon and coordinator.

@github-actions bot added the bug, cli, coordinator, and daemon labels Oct 18, 2023
@phil-opp
Collaborator

I would expect the CLI to pass the latest value to the daemon and coordinator.

We need to keep deployment to remote machines in mind. We probably don't want the CLI to pass the local env variables to the remote deploy machine.

@phil-opp
Collaborator

We already support setting environment variables through the dataflow yaml file, right? Would this be a possible alternative for your use case?

@haixuanTao
Collaborator Author

That is exactly the problem: when we use environment variables within the YAML graph, the old values are used.

If we think that environment variables should not be shared in a distributed environment, which I can understand, maybe we should change our approach to using variables.

@haixuanTao
Collaborator Author

One alternative would be to add an --env argument that lets users pass env variables when running dataflows, in the manner of Docker.
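
To make the proposal concrete, here is a rough Python sketch of how repeated --env options could be collected, loosely following the docker run convention where KEY=VALUE sets a value and a bare KEY forwards the caller's current value. Everything in it is hypothetical: dora has no such flag at the time of this issue, and `parse_env_overrides`, the `start-sketch` program name, and the example variables are made up for illustration.

```python
import argparse
import os
from typing import Dict, Mapping, Optional, Sequence


def parse_env_overrides(
    argv: Sequence[str], local_env: Optional[Mapping[str, str]] = None
) -> Dict[str, str]:
    """Collect repeated --env options into a dict of explicit overrides."""
    if local_env is None:
        local_env = os.environ

    # Hypothetical command line; this is not the real dora CLI parser.
    parser = argparse.ArgumentParser(prog="start-sketch")
    parser.add_argument("dataflow")
    parser.add_argument(
        "--env", "-e", action="append", default=[], metavar="KEY[=VALUE]"
    )
    args = parser.parse_args(argv)

    overrides: Dict[str, str] = {}
    for item in args.env:
        key, sep, value = item.partition("=")
        if sep:
            # KEY=VALUE: use the value given on the command line.
            overrides[key] = value
        elif key in local_env:
            # Bare KEY: forward the value from the caller's environment.
            overrides[key] = local_env[key]
        # A bare KEY that is not set locally is skipped.
    return overrides


if __name__ == "__main__":
    print(parse_env_overrides(
        ["dataflow.yaml", "--env", "MODEL=yolo", "-e", "TOKEN"],
        {"TOKEN": "abc"},
    ))
```

Only variables named explicitly are forwarded, so the rest of the local environment would never reach a remote machine.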

@phil-opp
Collaborator

Ah, understood! So I guess the main issue is the env expansion that we introduced in 933dadc. In some cases (such as the $HOME example in the commit description) we want the expansion to happen on the target machine, but in other cases we want it to happen on the CLI machine.

I'm fine with changing the behavior so that the env expansion already happens on the CLI machine, but I fear that there will always be some chance of confusion, depending on what you expect. I'm not sure what we can do to avoid this, though, aside from documenting it clearly.
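
To illustrate why the place of expansion matters, here is a small Python sketch that expands the same strings against two different environments. It is an assumption-laden illustration, not the expansion code from 933dadc: `expand_env`, the two example environments, and all paths are invented, and Python's `string.Template` may differ in detail from what dora does.

```python
from string import Template
from typing import Mapping


def expand_env(value: str, env: Mapping[str, str]) -> str:
    """Replace $NAME and ${NAME} using `env`; unknown names are left untouched."""
    return Template(value).safe_substitute(env)


# Hypothetical environments of the machine running the CLI and of the
# machine running the daemon.
cli_env = {"HOME": "/home/alice", "MODEL_PATH": "/tmp/new-model"}
daemon_env = {"HOME": "/home/robot", "MODEL_PATH": "/tmp/old-model"}

if __name__ == "__main__":
    for raw in ("$HOME/recordings", "$MODEL_PATH"):
        print(raw)
        print("  expanded on CLI machine:   ", expand_env(raw, cli_env))
        print("  expanded on target machine:", expand_env(raw, daemon_env))
```

For a value like $HOME the target machine's result is usually the intended one, while for a variable the user has just changed in their shell the CLI machine's result is what they expect. No single expansion site satisfies both cases.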
