Skip to content

Conversation

@qmlmoon
Copy link
Contributor

@qmlmoon qmlmoon commented Jun 25, 2014

No description provided.

@rmetzger
Copy link
Contributor

I guess the rewrite of the yaml file works well with this pull request. But I think this approach is not always working.
This approach assumes that the configuration file is shared among all cluster nodes (for example via NFS).
I would implement this by passing (via the cli invocation) the Jobmanager's hostname (or ip address) to the TaskManagers when starting them and leave the yaml file untouched.

@qmlmoon
Copy link
Contributor Author

qmlmoon commented Jun 27, 2014

I would like to rewrite at least the yaml file on the master node. Because when you run stratosphere, it also reads config file to get the jobmanager address. And CliFrontend already has an option -m to pass jobmanager address, which requires host:port. I think it won't be nice to add another option like default jobmanager address.

@qmlmoon
Copy link
Contributor Author

qmlmoon commented Jun 30, 2014

The newest commit using cli invocation to pass jobmanager's hostname to JobManager and TaskManager. It still try to rewrite the config file on master node since CliFrontend also reads config to get jobmanager address.

@rmetzger
Copy link
Contributor

I agree. Rewriting the config for the CliFrontend is good.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was the & removed on purpose?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@StephanEwen
Copy link
Contributor

The rewriting of the config by the script seems fragile. Why is that even needed, when the taskmanagers get the jobmanager host as a parameter?

Also: What happens if someone put an address there (like a specific address to bind to? Seems that it will be just overwritten. That we cannot do.

@qmlmoon
Copy link
Contributor Author

qmlmoon commented Aug 15, 2014

The rewriting of the config is because CliFrontend also reads config to get jobmanager address. If someone wants to configure the address, it will be overwritten, this is the drawback of automatically configuring the address.

@StephanEwen
Copy link
Contributor

Then I think we need a bit of a different approach.

Can we remove the jobmanager rpc address from the config file, and have the script check whether there is an entry for the jobmanager address. If not, it uses the local hostname.

I would actually prefer to have something like a .jobmanager file in yarn, rather than rewriting the config. Seems much cleaner.

Conflicts:
	stratosphere-dist/src/main/stratosphere-bin/bin/config.sh
	stratosphere-dist/src/main/stratosphere-bin/bin/jobmanager.sh
	stratosphere-dist/src/main/stratosphere-bin/bin/start-cluster.sh
	stratosphere-dist/src/main/stratosphere-bin/bin/taskmanager.sh
	stratosphere-runtime/src/main/java/eu/stratosphere/nephele/jobmanager/JobManager.java
	stratosphere-runtime/src/main/java/eu/stratosphere/nephele/taskmanager/TaskManager.java
@StephanEwen
Copy link
Contributor

I am skeptical whether it is a good idea to modify the config file at all. I bet that many people make the implicit assumption that this file is constant. And modifying it when the host is "localhost" is hidden constraint that makes it tricky to understand why the script sometimes configures things for you and sometimes not.

I vote for an approach to add this as a command line parameter to the ssh-transmitted call to start the taskmanager. The command line parser should check for that argument similarly as it checks for the config directory.

@qmlmoon
Copy link
Contributor Author

qmlmoon commented Dec 1, 2014

I agree. Modifying the config file not sounds like a good approach. But otherwise user has to type like ./start-cluster.sh -m default to tell flink don't read job manager address from config file, use my hostname. Also in the run command ./flink run -m default. And the motivation of this issue was to make it more convenient.

@StephanEwen
Copy link
Contributor

Why don't we comment out the jm address in the config by default? Then no entry in the config would mean that the parameter appended to the cli command should be taken...

@qmlmoon
Copy link
Contributor Author

qmlmoon commented Dec 1, 2014

Ok. I have this in mind, will do it in another PR since most of the commits here would be irrelevant. Then this one could be closed.

@rmetzger
Copy link
Contributor

rmetzger commented Dec 1, 2014

I think only the author of the pull request (you) can close it.

@qmlmoon
Copy link
Contributor Author

qmlmoon commented Dec 1, 2014

Yeah. I meant it.

@uce
Copy link
Contributor

uce commented Apr 23, 2015

Thanks for the PR. I've looked into the other approach (#248) and I think you can safely close this on now.

@qmlmoon qmlmoon closed this Apr 23, 2015
tillrohrmann pushed a commit to tillrohrmann/flink that referenced this pull request Sep 27, 2018
zhijiangW pushed a commit to zhijiangW/flink that referenced this pull request Jul 23, 2019
tzulitai added a commit to tzulitai/flink that referenced this pull request Jan 15, 2021
dinchamion pushed a commit to dinchamion/flink that referenced this pull request Feb 23, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants