cylc get-config: memory usage and reduce use from Rose #22

Closed
arjclark opened this issue Oct 15, 2012 · 11 comments

@arjclark
Contributor

Calls to cylc get-config can kill the server. Need to investigate, and possibly further reduce our reliance on it.

This problem is less urgent now that we have reduced the number of cylc get-config calls to 1 per task event. If and when we have a database of Cylc task events or statuses, we should be able to improve on this further.

@ghost assigned matthewrmshin Oct 15, 2012
@matthewrmshin
Member

Looks like this is still an issue for large suites, as cylc get-config can take a long time and is very memory hungry. We should develop a workaround in the meantime.

@matthewrmshin
Member

See cylc/cylc-flow#170.

@hjoliver
Contributor

Matt, I've just pushed my config parsing changes to the cylc repository, for cylc/cylc-flow#170, having managed to get get-config() down to under 10s for your test suite: 1000 tasks, each of which configures almost 300 environment variables, going through 4 levels of inheritance.

Cylc holds all task configuration data, the result of inheritance processing, in memory ready for use, and it also has to add in default values for the entire [runtime] structure for each task. So I think it's clear that, particularly with Jinja2, it will always be easy to generate a suite that will take a lot of time and/or memory to parse.

If your only problematic use of get-config() is in extracting task host and owner in an event handler script (is it?), I would suggest again that cylc should just add task host and owner to the argument list supplied to task event handlers. Do you have a specific objection to that? Event handlers that don't need the extra information can ignore it.
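
As a rough illustration of the suggestion above, a handler could accept owner and host as optional trailing arguments and simply ignore them when they are not needed. This is only a sketch: the argument order (event, suite, task ID, message, then owner and host) and the names `HandlerArgs` and `parse_handler_args` are assumptions made for illustration, not cylc's actual interface.

```python
from typing import List, NamedTuple, Optional


class HandlerArgs(NamedTuple):
    """Hypothetical container for task event handler arguments."""
    event: str
    suite: str
    task_id: str
    message: str
    owner: Optional[str] = None  # assumed extra argument, may be absent
    host: Optional[str] = None   # assumed extra argument, may be absent


def parse_handler_args(argv: List[str]) -> HandlerArgs:
    """Parse handler arguments, tolerating a missing owner and host."""
    if len(argv) < 4:
        raise ValueError("expected at least: event suite task_id message")
    # Use at most six fields; anything beyond them is ignored.
    return HandlerArgs(*argv[:6])
```

A handler script would call `parse_handler_args(sys.argv[1:])` and read `args.owner` and `args.host` directly, with no need for a get-config call. Handlers written for the shorter argument list keep working because the two extra fields default to `None`.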

@hjoliver
Contributor

A comment on further efficiency gains that may be possible: cylc/cylc-flow#170 (comment)

matthewrmshin added a commit to matthewrmshin/rose that referenced this issue Nov 21, 2012
Support the new syntax of "cylc get-config", which can return owner and
host information of a task or all tasks in a single call.
benfitzpatrick added a commit that referenced this issue Nov 21, 2012
@matthewrmshin
Member

cylc get-config and usage within Rose have been made much more efficient, so I am pushing back the milestone of this issue.

cylc/cylc-flow#86, cylc/cylc-flow#140, cylc/cylc-flow#181 may improve things.

@matthewrmshin
Member

Several technical challenges remain. In order to pull the relevant job logs of a task at a given cycle time from a remote host, we'll need the user@host of each exited job of this task at this cycle time. I think we are not storing them in the suite run time DB at the moment.

@arjclark
Contributor Author

@matthewrmshin - we have been storing user@host in the suite db for a while now, found in the task_states table. N.B. these are not stored in the task_events table.

@matthewrmshin
Member

As discussed, we'll need the user@host for each submit for a generic solution.
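
To make the per-submit requirement concrete, here is a minimal sketch using an in-memory SQLite database. The table name `task_submits` and all column names are hypothetical and do not reflect the actual suite DB schema. The point is only that keying rows by submit number keeps the user@host of every submit, whereas a table with one row per task and cycle time would presumably retain only the most recent one.

```python
import sqlite3


def make_db():
    """Create an in-memory DB with a hypothetical per-submit table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE task_submits ("
        " name TEXT, cycle TEXT, submit_num INTEGER, user_at_host TEXT,"
        " PRIMARY KEY (name, cycle, submit_num))"
    )
    return conn


def record_submit(conn, name, cycle, submit_num, user_at_host):
    """Store the user@host used for one submit of a task."""
    conn.execute(
        "INSERT INTO task_submits VALUES (?, ?, ?, ?)",
        (name, cycle, submit_num, user_at_host),
    )


def hosts_for_task(conn, name, cycle):
    """Return (submit_num, user@host) pairs for a task at a cycle time."""
    cursor = conn.execute(
        "SELECT submit_num, user_at_host FROM task_submits"
        " WHERE name = ? AND cycle = ? ORDER BY submit_num",
        (name, cycle),
    )
    return cursor.fetchall()
```

With rows like these, the job logs for each submit of a task could be pulled from whichever host that submit actually ran on, even if a retry went elsewhere.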

@arjclark
Contributor Author

@matthewrmshin - will put up a pull request shortly that does this.

@arjclark
Contributor Author

cylc/cylc-flow#296 adds the required functionality for this.

@matthewrmshin
Member

See #477.
