xltop is a continuous Lustre monitor with batch scheduler integration.
The organization of xltop is shown below. It consists of a single
master process (xltop-master), an ncurses-based client (xltop), one or
more processes that push batch scheduler job mappings (xltop-clusd), and,
on each MDS/OSS, a daemon (xltop-servd) that monitors Lustre export
stats via /proc.

                     xltop-master
                    /     |      \
                   /      |       \
                  /       |        \
             xltop        |     xltop-servd
      (ncurses client)    |   (MDS/OSS monitor daemon)
                          |
                     xltop-clusd
                (job mapping daemon)

xltop-master maintains a hierarchy of flows between consumers (hosts,
jobs, clusters, the universe) and providers (targets (TODO), servers,
filesystems, and the universe).
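
To make the idea of a flow hierarchy concrete, here is a minimal Python sketch. It is not taken from the xltop-master source. It assumes that each consumer (host, job, cluster) and each provider (server, filesystem) has a single parent, and that counters reported for a (host, server) pair are added to every (consumer, provider) ancestor pair. The node names, the parent tables, and the helpers ancestors() and add_flow() are all hypothetical.

```python
from collections import defaultdict
from itertools import product

# Hypothetical parent links, for illustration only.
# Consumers: host -> job -> cluster -> universe.
CONSUMER_PARENT = {
    "host:c1": "job:100",
    "host:c2": "job:100",
    "job:100": "clus:ranger",
    "clus:ranger": "universe",
}
# Providers: server -> filesystem -> universe.
PROVIDER_PARENT = {
    "serv:oss1": "fs:scratch",
    "serv:oss2": "fs:scratch",
    "fs:scratch": "universe",
}


def ancestors(node, parent):
    """Return node followed by each of its ancestors, ending at the root."""
    chain = [node]
    while node in parent:
        node = parent[node]
        chain.append(node)
    return chain


def add_flow(flows, host, serv, wr, rd, reqs):
    """Add counters for (host, serv) to every (consumer, provider) ancestor pair."""
    consumers = ancestors(host, CONSUMER_PARENT)
    providers = ancestors(serv, PROVIDER_PARENT)
    for pair in product(consumers, providers):
        counters = flows[pair]
        counters[0] += wr
        counters[1] += rd
        counters[2] += reqs


# Each flow holds [bytes written, bytes read, requests].
flows = defaultdict(lambda: [0, 0, 0])
add_flow(flows, "host:c1", "serv:oss1", 100, 0, 3)
add_flow(flows, "host:c2", "serv:oss2", 50, 20, 2)
```

With this layout, a query such as "job 100 to filesystem scratch" is a single lookup, flows[("job:100", "fs:scratch")], rather than a scan over every host and server.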
The design of xltop assumes that:
0) hosts never run more than one job at a time,
1) file systems may be mounted by multiple clusters,
2) hosts may belong to multiple lnets,
3) two servers may use different NIDs for the same host,
4) the same lnet name (for example o2ib) may be used multiple times.
Client Usage: xltop [OPTIONS]... [EXPRESSION...]
Expression may be one of:
owner=OWNER, clus[=CLUS], job[=JOB], host[=HOST], fs[=FS], serv[=SERV]
Types may be given by their first character; use 'u' and 'v' for the universes.
OPTIONS:
  -c, --conf-dir=DIR            read configuration from DIR
  -f, --full-names              show full host, job, and server names
  -h, --help                    display this help and exit
  -i, --interval=SECONDS        update every SECONDS seconds
  -k, --key=KEY1[,KEY2...]      sort results by KEY1,...
  -l, --limit=NUM               limit responses to NUM results
  -p, --remote-port=PORT        connect to master at PORT
  -r, --remote-host=HOST        connect to master on HOST
  -s, --show-sum                show sums rather than rates
  -u, --ubuntu                  look snazzy on my terminal (terrible on xterms)
SORTING:
Sort keys are case insensitive. Keys containing 'W' select bytes written
(as a rate or sum according to use of -s, --show-sum). Similarly keys
containing 'R' and 'Q' are interpreted as bytes read and requests.
EXAMPLES:
  xltop job serv (or xltop j s)            # Show traffic between jobs and servers
  xltop h=i101-101.ranger.tacc.utexas.edu  # Show i101-101 to each filesystem
  xltop o=kbdcat h s                       # Show hosts in kbdcat's jobs to /scratch servers
  xltop h fs=ranger-scratch s              # All hosts to /scratch servers
  xltop h s=oss23.ranger.tacc.utexas.edu   # All hosts to oss23
  xltop j=2411369@ranger s                 # Job 2411369 to all servers
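
The expression syntax from the usage text (TYPE[=VALUE], with types abbreviated to their first character) can be illustrated with a short Python sketch. This is not the client's actual parser. The function parse_expression() and the TYPES list are hypothetical, and the sketch assumes that only the full type name or its first character is accepted and that owner always needs a value.

```python
# Type names as listed in the usage text; "u" and "v" name the two universes.
TYPES = ["owner", "clus", "job", "host", "fs", "serv", "u", "v"]


def parse_expression(expr):
    """Split 'TYPE[=VALUE]' into (type, value), expanding one-letter types.

    ASSUMPTION: a type is accepted either in full or as its first character.
    The value is None when the expression carries no '=VALUE' part.
    """
    name, _, value = expr.partition("=")
    for type_name in TYPES:
        if name == type_name or name == type_name[0]:
            if type_name == "owner" and not value:
                raise ValueError("owner requires a value: owner=OWNER")
            return type_name, (value or None)
    raise ValueError(f"unknown type in expression: {expr!r}")


# The arguments of 'xltop o=kbdcat h s' from the examples above.
parsed = [parse_expression(arg) for arg in ["o=kbdcat", "h", "s"]]
```

Here parsed holds ("owner", "kbdcat"), ("host", None), and ("serv", None): one filter on the owner, followed by the two types whose flows are listed.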
Client screenshot:
FILESYSTEM     MDS/T LOAD1 LOAD5 LOAD15 TASKS  OSS/T LOAD1 LOAD5 LOAD15 TASKS NIDS
ranger-work      1/1  1.52  3.48   4.41   609  14/84  2.74  2.08   2.09  1347 4212
ranger-scratch   1/1  0.13  0.20   0.54   584 50/300  2.52  1.94   1.52  1348 4213
ranger-share     1/1  0.93  1.20   1.72   544   6/36  3.55  1.37   0.90  1203 3960

JOB      FS              WR_MB/S  RD_MB/S    REQS/S OWNER    NAME       HOSTS
2526717  ranger-scratch  321.557    5.994  3556.133 tg803155 NST3.28-r0    20
login4   ranger-scratch   38.489   55.054   469.943 NONE     NONE           1
2530927  ranger-scratch   16.526    0.000    39.942 dkcira   Parametric     1
2529449  ranger-work      11.754    0.000    24.088 bealing  PE-OH          4
2530975  ranger-work      11.108    0.007    23.620 vishnam2 batch         16
2529064  ranger-scratch   10.823    0.091    30.231 jc64248  runwrf        12
2530142  ranger-scratch    9.418    0.948    49.386 subin    hnbo          16
2530928  ranger-scratch    7.991    0.000    18.092 dkcira   Parametric     1
2530142  ranger-work        7.066    8.501   410.608 subin    hnbo          16
2528099  ranger-scratch    6.775    0.000    69.664 dvt5076  wrf_nolan     32
2529489  ranger-work        6.166    0.000    13.467 yhe      tenWcross_    32
2529788  ranger-work        6.116    0.000    13.332 yhe      tenWcross_    32
2528098  ranger-scratch    6.022    0.000    67.693 dvt5076  wrf_nolan     32
2530176  ranger-scratch    5.972    0.000    33.241 tg458470 d2o_mol_32     8
2530055  ranger-scratch    5.215    0.000    18.108 yping    bivkp          9
2529790  ranger-work        5.125    0.000    12.169 yhe      tenWcross_    32
2529920  ranger-work        5.085    0.009    12.057 hli      iascpl        16
2531057  ranger-work        3.835    0.001     8.347 tibor    tt_thermo_    60
2530526  ranger-scratch    2.572    0.000    20.855 pchen    mguadtpc      61
2530179  ranger-scratch    2.554    0.000    24.010 tg458470 d2o_mol_32     8
2526200  ranger-scratch    2.128    0.000     5.035 tg803418 ca0.1v0.75     1
2529517  ranger-scratch    1.839    0.000     3.773 lianyuan usf_fvcom     32
2526222  ranger-scratch    1.751    0.000     4.738 tg803418 ca0.6v0.75     1
2528401  ranger-scratch    1.638    0.000     3.397 gnelson  equil_outs     8
2531086  ranger-work        1.626    0.340    13.370 cliu0001 schmIin_de    15
2530273  ranger-work        1.563    0.000     5.624 amorris  Thrust4.00     3
2528400  ranger-scratch    1.528    0.000     3.151 gnelson  equil_midd     8
2522312  ranger-work        1.089    0.366     3.846 frederic roms_nep6_    16
2528755  ranger-scratch    1.003    0.000     2.530 tg458315 h1.5dy0.35     1
2529262  ranger-scratch    0.992    0.000     2.532 tg459103 pip3           4
2526210  ranger-scratch    0.973    0.006     2.319 tg803418 ca0.3v0.75     1
2530934  ranger-scratch    0.913    0.000     5.108 pedram   Shear19       24
2529294  ranger-scratch    0.895    0.000     2.798 tgaborek wat-PLL-56     5
2530294  ranger-share      0.848    0.000    10.359 psr53    CMS21st        4
2529399  ranger-scratch    0.834    0.000     2.173 tg459103 pip2           4
2531092  ranger-work        0.755    0.000     7.542 avm      aka2n_74       4
2527596  ranger-scratch    0.721    0.000    36.401 gmcivor  uli_256       16
1-37 out of 2730                             xltop - Wed Apr 25 08:01:15 2012
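
The SORTING rules from the usage text can be tried out on a few rows of the screenshot with the Python sketch below. It is an illustration, not xltop's implementation. The helpers key_to_field() and sort_rows() are hypothetical. The sketch makes two assumptions that the usage text does not state: results are sorted in descending order (as the WR_MB/S column of the screenshot suggests), and when a key contains more than one of the letters, 'Q' is checked before 'W' and 'W' before 'R', so that the column headers themselves work as keys.

```python
def key_to_field(key):
    """Map a sort key to a row field, ignoring case.

    ASSUMPTION: precedence is Q, then W, then R, so that 'REQS/S' selects
    requests and 'WR_MB/S' selects bytes written.
    """
    k = key.upper()
    if "Q" in k:
        return "reqs"
    if "W" in k:
        return "wr"
    if "R" in k:
        return "rd"
    raise ValueError(f"unrecognized sort key: {key!r}")


def sort_rows(rows, keys, limit=None):
    """Sort rows by the given keys (largest first) and keep at most limit rows."""
    fields = [key_to_field(k) for k in keys]
    ordered = sorted(rows, key=lambda row: tuple(row[f] for f in fields), reverse=True)
    return ordered[:limit]


# Three rows copied from the screenshot (rates in MB/s and requests/s).
ROWS = [
    {"job": "2526717", "fs": "ranger-scratch", "wr": 321.557, "rd": 5.994, "reqs": 3556.133},
    {"job": "login4", "fs": "ranger-scratch", "wr": 38.489, "rd": 55.054, "reqs": 469.943},
    {"job": "2530142", "fs": "ranger-work", "wr": 7.066, "rd": 8.501, "reqs": 410.608},
]

by_read = [row["job"] for row in sort_rows(ROWS, ["r"])]
top_writer = sort_rows(ROWS, ["W"], limit=1)[0]["job"]
```

In this sketch, sorting by "r" puts login4 first, since it has the highest read rate of the three rows, while sorting by "W" with a limit of 1 returns job 2526717.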
Build requires ncurses, libcurl, libev-4.4, and libconfuse-2.7.