Cluster_check

What:
Cluster_check is a collection of R functions that let any RStudio or RStudio Server user inspect the processes currently running on a computer or a server.
Why:
We developed these functions to raise awareness of the resources available to RStudio Server users, and of the importance of sharing them considerately.
We encourage all users who often run commands in parallel on the server to use these functions to check the current server usage, and to estimate how long their jobs will run, before choosing the number of cores/threads they want to use. Only 32 cores are available on the server!
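
As an illustration of the kind of check suggested above, here is a minimal sketch in base R. The helper polite_cores() and its 25 % cap are hypothetical and not part of Cluster_check; the number of busy cores is something you would estimate yourself from the current server usage.

```r
# Hypothetical helper, not part of Cluster_check: suggest a core count that
# leaves room for other users. The 25 % cap is an arbitrary illustration.
polite_cores <- function(total_cores, busy_cores, max_share = 0.25) {
  free_cores <- max(total_cores - busy_cores, 0)
  cap <- floor(total_cores * max_share)
  # Always return at least one core, never more than the cap or the free cores.
  max(1L, as.integer(min(free_cores, cap)))
}

total <- parallel::detectCores()  # may return NA on some platforms
if (!is.na(total)) {
  # Pretend that half of the machine is already in use.
  message("Suggested cores: ", polite_cores(total, busy_cores = total / 2))
}
```

The returned value could then be passed to whatever parallel backend you use, for example as the mc.cores argument of parallel::mclapply().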

Authors: SIMON M.¹, PAGEAUD Y.¹
Contributors: All contributors are welcome to join!
¹ DKFZ - Division of Applied Bioinformatics, Germany.

R Compatibility: Version 3.6.0
Last Update: 14/11/2019

Content

This repository currently contains 2 scripts:

checkCluster.R which contains the function checkCluster().
ps_to_df.R which contains the function ps.to.df().

Prerequisites

If you plan to use the function checkCluster(), you need to install the R package magrittr first.
If you plan to use the function ps.to.df(), no additional package is needed.
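
One possible way to load the functions into an R session is sketched below. The helper load_cluster_check() is hypothetical and not part of the repository; it assumes the scripts sit in a local copy of the repository and that they are meant to be loaded with source().

```r
# Hypothetical helper, not part of the repository: source the scripts from a
# local copy of the repository after checking the prerequisites.
load_cluster_check <- function(repo_dir = ".",
                               scripts = c("checkCluster.R", "ps_to_df.R")) {
  paths <- file.path(repo_dir, scripts)
  missing <- paths[!file.exists(paths)]
  if (length(missing) > 0) {
    stop("Script(s) not found: ", paste(missing, collapse = ", "))
  }
  # checkCluster() depends on magrittr; ps.to.df() has no extra dependency.
  if ("checkCluster.R" %in% scripts &&
      !requireNamespace("magrittr", quietly = TRUE)) {
    stop("Install magrittr first: install.packages('magrittr')")
  }
  for (p in paths) source(p)  # source() evaluates in the global environment
  invisible(paths)
}
```

For example, load_cluster_check(scripts = "ps_to_df.R") would load only ps.to.df(), which works without magrittr.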

Documentation

checkCluster()

⚠️ Work in progress!

ps.to.df()

Description:
ps.to.df() returns the output of a ps command as an R data.frame. Its parameters accept values much like the options you would pass to ps in a regular Linux terminal.
Parameters:

simple.selection - A character string specifying a ps option listed as a 'SIMPLE PROCESS SELECTION' option (Default: simple.selection = '-A').
bylist.selection - A character string specifying a ps option listed as a 'PROCESS SELECTION BY LIST' option.
process.sort - A character string specifying one or multiple keywords to be used for sorting processes. If multiple keywords are specified, they should be comma separated. A keyword can be preceded by a '-' sign for decreasing order (Default: process.sort = '-%cpu').
top.rows - An integer to specify the number of top rows to keep in the final data.frame.
other - A character string to specify a ps option not related to the selection of processes (Supported value: other = 'L').
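
To make the link between these parameters and the ps command line more concrete, here is a minimal sketch. The helper build_ps_args() and its columns argument are hypothetical; this is not the code of ps.to.df(), only one way such arguments could be assembled for the Linux (procps) version of ps.

```r
# Hypothetical helper: turn ps.to.df()-style arguments into ps arguments.
build_ps_args <- function(simple.selection = "-A",
                          bylist.selection = NULL,
                          process.sort = "-%cpu",
                          columns = "pid,ppid,user,%cpu,%mem,comm") {
  # Assumption: a by-list selection replaces the simple selection, because ps
  # combines selections additively and '-A' would otherwise select everything.
  selection <- if (is.null(bylist.selection)) {
    simple.selection
  } else {
    bylist.selection
  }
  # --sort takes comma-separated keywords, each optionally prefixed with '-'.
  c(selection, paste0("--sort=", process.sort), "-o", columns)
}

# Arguments for "all processes, sorted by decreasing CPU usage":
build_ps_args()
# Arguments for "rsession processes, sorted by decreasing memory usage":
build_ps_args(bylist.selection = "-C rsession", process.sort = "-%mem")
```

The resulting character vector could be passed to system2("ps", args, stdout = TRUE) to capture the raw ps output as lines of text.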

Value:
An R data.frame with one process per row. For each process it gives the percentage of CPU used, the percentage of memory used, the PID and PPID, the user name, the command name, the date and time when the process was created, the time elapsed since the process started, and the status of the process.
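
The general technique of turning ps text output into a data.frame can be sketched as follows. This is an illustration under stated assumptions, not the implementation of ps.to.df(): the helper parse_ps_output() is hypothetical, it handles only a subset of the columns listed above, and it assumes that every column except the last one is free of spaces.

```r
# Hypothetical helper: parse lines such as those produced by
#   ps -A -o pid,ppid,user,%cpu,%mem,comm
# The first line is the header; the command name comes last because it may
# contain spaces.
parse_ps_output <- function(lines) {
  header <- strsplit(trimws(lines[1]), "\\s+")[[1]]
  n <- length(header)
  # n - 1 whitespace-free fields, then the rest of the line as the last field.
  pattern <- paste0("^\\s*", strrep("(\\S+)\\s+", n - 1), "(.*)$")
  fields <- regmatches(lines[-1], regexec(pattern, lines[-1]))
  # Drop the full match (first element) and keep the captured fields.
  mat <- do.call(rbind, lapply(fields, function(x) trimws(x[-1])))
  df <- as.data.frame(mat, stringsAsFactors = FALSE)
  names(df) <- header
  # Convert numeric-looking columns; text columns stay character.
  df[] <- lapply(df, type.convert, as.is = TRUE)
  df
}

# Made-up sample output, used here instead of a live ps call.
sample_lines <- c(
  "  PID  PPID USER     %CPU %MEM COMMAND",
  "    1     0 root      0.0  0.1 systemd",
  " 4321     1 alice    12.5  3.2 Web Content"
)
procs <- parse_ps_output(sample_lines)
str(procs)
```

On a Linux machine, the sample lines could be replaced by the result of system2("ps", c("-A", "-o", "pid,ppid,user,%cpu,%mem,comm"), stdout = TRUE).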

Examples:
To list all ongoing processes:

ps.to.df()

To list all ongoing "rsession" processes:

ps.to.df(bylist.selection = '-C rsession')

To sort all ongoing processes by percentage of memory used, in decreasing order:

ps.to.df(process.sort='-%mem')

To list the 10 ongoing processes with the highest CPU usage, in decreasing order:

ps.to.df(process.sort = '-%cpu', top.rows = 10)
ps.to.df(top.rows = 10) # Simplified: processes are sorted by decreasing %CPU usage by default.
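
Once the processes are in a data.frame, they can be summarised with ordinary R tools. The sketch below totals CPU usage per user on mock data; the column names USER and CPU are assumptions chosen for illustration, so adapt them to the names that ps.to.df() actually returns.

```r
# Mock data standing in for the result of ps.to.df(); the column names and
# values are made up for this illustration.
procs <- data.frame(
  USER = c("alice", "alice", "bob", "root"),
  CPU  = c(99.5, 98.7, 45.0, 0.3),
  stringsAsFactors = FALSE
)

# Sum the CPU percentages per user and sort from heaviest to lightest.
cpu_by_user <- sort(
  vapply(split(procs$CPU, procs$USER), sum, numeric(1)),
  decreasing = TRUE
)
cpu_by_user
```

A per-user total like this gives a rough idea of who is currently using the server before you start a parallel job of your own.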

References:
Additional information can be found in the original documentation of ps: http://man7.org/linux/man-pages/man1/ps.1.html
