<h1>Table of Contents<span class="tocSkip"></span></h1>
<div class="toc"><ul class="toc-item"><li><span><a href="#Using-paramiko,-pandas-and-a-VM-class" data-toc-modified-id="Using-paramiko,-pandas-and-a-VM-class-1">Using paramiko, pandas and a VM class</a></span><ul class="toc-item"><li><span><a href="#Package-Imports" data-toc-modified-id="Package-Imports-1.1">Package Imports</a></span></li><li><span><a href="#Constants" data-toc-modified-id="Constants-1.2">Constants</a></span></li><li><span><a href="#Creating-1-VM-Class" data-toc-modified-id="Creating-1-VM-Class-1.3">Creating 1 VM Class</a></span></li><li><span><a href="#DataFrame-of-Classes" data-toc-modified-id="DataFrame-of-Classes-1.4">DataFrame of Classes</a></span></li></ul></li></ul></div>

# Using paramiko, pandas and a VM class
This notebook accompanies a Medium blog post in which I provide the background and methodology.

## Package Imports

In [1]:
#Data Science Stack
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

#virtualmachine.py python script
from virtualmachine import VM, clean_columns

#pandas notebook display formatting 
pd.options.display.max_columns = 100
pd.options.display.max_rows = 100

## Constants

In [2]:
HOST = '35.204.7.190'
USERNAME = 'louwjlabuschagne_gmail_com'
PUB_KEY = '/Users/louwrenslabuschagne/.ssh/id_rsa.pub'

## Creating 1 VM Class

In [3]:
vm = VM(HOST, USERNAME, PUB_KEY)

In [4]:
print(vm)

louwjlabuschagne_gmail_com@35.204.7.190 ✅
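The `VM` class itself lives in `virtualmachine.py` and isn't shown in this notebook. To give a rough idea of what such a wrapper could look like, here is a minimal sketch built on paramiko's `SSHClient`. The name `SketchVM`, the optional `client` parameter and the ❌ marker for a failed connection are assumptions made for illustration, not the actual implementation.

```python
class SketchVM:
    """Hypothetical stand-in for the notebook's VM class (not the real one)."""

    def __init__(self, host, username, key_path, client=None):
        self.host = host
        self.username = username
        self.key_path = key_path
        # An injected client (assumption) lets the class be tried without a server
        self.client = client if client is not None else self._make_client()
        try:
            self.client.connect(hostname=host, username=username,
                                key_filename=key_path, timeout=10)
            self.connected = True
        except Exception:
            # Any connection failure just marks the VM as unreachable
            self.connected = False

    @staticmethod
    def _make_client():
        # Imported lazily so the sketch can be read and tested without paramiko
        import paramiko
        client = paramiko.SSHClient()
        client.set_missing_host_key_policy(paramiko.AutoAddPolicy())
        return client

    def exec_command(self, command):
        """Run a command over SSH and return its stdout as a list of lines."""
        _stdin, stdout, _stderr = self.client.exec_command(command)
        return stdout.read().decode().splitlines()

    def __repr__(self):
        marker = '✅' if self.connected else '❌'
        return '%s@%s %s' % (self.username, self.host, marker)
```

Because `__repr__` returns the `user@host` string with a status marker, an instance of this sketch would print the way `vm` does above and would show up the same way inside a DataFrame cell.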


The default command for the `parse_key_value_output()` function is `lscpu`, but any colon-separated command output will be parsed into a DataFrame.

In [5]:
vm_lscpu = vm.parse_key_value_output('lscpu').T.head(10)
vm_lscpu.to_csv('../table_outputs/vm_lscpu.csv')
vm_lscpu

Unnamed: 0,0
ARCHITECTURE,x86_64
CPUOPMODES,"32-bit,64-bit"
BYTEORDER,LittleEndian
CPUS,1
ONLINECPUSLIST,0
THREADSPERCORE,1
CORESPERSOCKET,1
SOCKETS,1
NUMANODES,1
VENDORID,GenuineIntel


The same call works for the output of the `cat /proc/meminfo` command.

In [6]:
vm_meminfo = vm.parse_key_value_output('cat /proc/meminfo').T.head(10)
vm_meminfo.to_csv('../table_outputs/vm_meminfo.csv')
vm_meminfo

Unnamed: 0,0
MEMTOTAL,1020416kB
MEMFREE,869804kB
MEMAVAILABLE,832908kB
BUFFERS,9104kB
CACHED,53076kB
SWAPCACHED,0kB
ACTIVE,90008kB
INACTIVE,19704kB
ACTIVEANON,47680kB
INACTIVEANON,4060kB
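The parsing behind `parse_key_value_output()` is not shown here, but the tables above suggest the idea: split each line at the first colon, upper-case the key and strip punctuation and spaces, and remove whitespace from the value. A minimal sketch of that idea follows; `clean_key` and `parse_key_value_lines` are hypothetical names, and the real `clean_columns` helper may behave differently.

```python
import re


def clean_key(key):
    """Upper-case a key and drop everything except letters, digits and underscores."""
    return re.sub(r'[^A-Z0-9_]', '', key.upper())


def parse_key_value_lines(lines):
    """Turn 'Key: value' lines into a dict; lines without a colon are skipped."""
    parsed = {}
    for line in lines:
        # partition splits at the first colon only, so values may contain colons
        key, sep, value = line.partition(':')
        if not sep:
            continue
        # Remove all whitespace from the value, as in the tables above
        parsed[clean_key(key)] = ''.join(value.split())
    return parsed


sample = [
    'Architecture:        x86_64',
    'CPU op-mode(s):      32-bit, 64-bit',
    'Byte Order:          Little Endian',
    'MemTotal:        1020416 kB',
]
print(parse_key_value_lines(sample))
```

Wrapping the resulting dict in a list and passing it to `pd.DataFrame` would give a one-row DataFrame with one column per key.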


I advise writing predefined functions for bash commands whose output isn't always as straightforward to parse, but which you'll use a lot.

In [7]:
vm_df = vm.cmd_df()
vm_df.to_csv('../table_outputs/vm1_df.csv', index=False)
vm_df

Unnamed: 0,FILESYSTEM,1KBLOCKS,USED,AVAILABLE,USE_percentage_,MOUNTED_ON
0,udev,499068,0,499068,0.0,/dev
1,tmpfs,102044,4212,97832,0.05,/run
2,/dev/sda1,10253588,1092372,8620648,0.12,/
3,tmpfs,510208,0,510208,0.0,/dev/shm
4,tmpfs,5120,0,5120,0.0,/run/lock
5,tmpfs,510208,0,510208,0.0,/sys/fs/cgroup


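Below we sum the `AVAILABLE` column across all mounted filesystems, including the `tmpfs` mounts. `df` reports sizes in 1K blocks, so dividing the total by 1e3 gives approximately megabytes: about 10,243 MB, or roughly 10 GB.

In [8]:
print('%.2f MB total storage'%(vm.cmd_df().AVAILABLE.astype(int).sum()/1e3))

10243.08 MB total storage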
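Since `df` counts in 1K blocks of 1024 bytes each, the unit conversion is easy to get wrong. As a small worked example, the sketch below converts the `AVAILABLE` values from the table above into gigabytes; `blocks_to_gb` is a hypothetical helper name.

```python
def blocks_to_gb(blocks_1k):
    """Convert a count of 1K blocks (1024 bytes each) to gigabytes (10**9 bytes)."""
    return blocks_1k * 1024 / 1e9


# AVAILABLE column copied from the df table above
available = [499068, 97832, 8620648, 510208, 5120, 510208]
total_blocks = sum(available)
print('%d 1K blocks = %.2f GB' % (total_blocks, blocks_to_gb(total_blocks)))
# prints: 10243084 1K blocks = 10.49 GB
```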


Or any other command you can think of.

In [9]:
vm.exec_command('which python')

['/usr/bin/python']

Literally...

In [10]:
vm.exec_command('ls --help')[0:20]

['Usage: ls [OPTION]... [FILE]...',
 'List information about the FILEs (the current directory by default).',
 'Sort entries alphabetically if none of -cftuvSUX nor --sort is specified.',
 '',
 'Mandatory arguments to long options are mandatory for short options too.',
 '  -a, --all                  do not ignore entries starting with .',
 '  -A, --almost-all           do not list implied . and ..',
 '      --author               with -l, print the author of each file',
 '  -b, --escape               print C-style escapes for nongraphic characters',
 '      --block-size=SIZE      scale sizes by SIZE before printing them; e.g.,',
 "                               '--block-size=M' prints sizes in units of",
 '                               1,048,576 bytes; see SIZE format below',
 '  -B, --ignore-backups       do not list implied entries ending with ~',
 '  -c                         with -lt: sort by, and show, ctime (time of last',
 '                               modification of file st

## DataFrame of Classes

Now the magic begins: we combine the conveniences of `DataFrame`s with our VM class.


First we create a `DataFrame` with all our VMs.

In [23]:
ips = ['35.204.7.190',
'35.204.197.82',
'35.204.4.10',
'35.204.193.154']

VMs = pd.DataFrame(dict(IP=ips))
VMs

Unnamed: 0,IP
0,35.204.7.190
1,35.204.197.82
2,35.204.4.10
3,35.204.193.154


Then we create a VM instance for each IP and store it in the VM column of the VMs DataFrame.

In [24]:
VMs['VM'] = VMs.apply(lambda row: VM(row.IP, USERNAME, PUB_KEY), axis=1)
VMs

Unnamed: 0,IP,VM
0,35.204.7.190,louwjlabuschagne_gmail_com@35.204.7.190 ✅
1,35.204.197.82,louwjlabuschagne_gmail_com@35.204.197.82 ✅
2,35.204.4.10,louwjlabuschagne_gmail_com@35.204.4.10 ✅
3,35.204.193.154,louwjlabuschagne_gmail_com@35.204.193.154 ✅


In [25]:
VMs.to_csv('../table_outputs/VM_connected_state.csv', index=False)

Now we can do some amazing things, like getting the CPU details for all these machines.

In [26]:
lscpu = VMs.VM.apply(lambda vm: vm.parse_key_value_output('lscpu'))
lscpu

0      ARCHITECTURE     CPUOPMODES     BYTEORDER CP...
1      ARCHITECTURE     CPUOPMODES     BYTEORDER CP...
2      ARCHITECTURE     CPUOPMODES     BYTEORDER CP...
3      ARCHITECTURE     CPUOPMODES     BYTEORDER CP...
Name: VM, dtype: object

We just need a little helper function to convert our Series of DataFrames into a single DataFrame; this is an undesired artifact of using `apply`. Any suggestions are welcome.

In [27]:
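def series_to_df(series):
    """Stack a Series of DataFrames into one DataFrame with a fresh 0..n-1 index."""
    df = pd.DataFrame()
    for d in series:
        # sort=True sorts the column labels when the frames' columns differ
        df = pd.concat([df, d], axis=0, sort=True)
    df.reset_index(inplace=True, drop=True)
    return df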

In [28]:
series_to_df(lscpu)

Unnamed: 0,ARCHITECTURE,BOGOMIPS,BYTEORDER,CORESPERSOCKET,CPUFAMILY,CPUMHZ,CPUOPMODES,CPUS,FLAGS,HYPERVISORVENDOR,L1DCACHE,L1ICACHE,L2CACHE,L3CACHE,MODEL,MODELNAME,NUMANODE0CPUS,NUMANODES,ONLINECPUSLIST,SOCKETS,STEPPING,THREADSPERCORE,VENDORID,VIRTUALIZATIONTYPE
0,x86_64,4000.37,LittleEndian,1,6,2000.186,"32-bit,64-bit",1,fpuvmedepsetscmsrpaemcecx8apicsepmtrrpgemcacmo...,KVM,32K,32K,256K,56320K,85,Intel(R)Xeon(R)CPU@2.00GHz,0,1,0,1,3,1,GenuineIntel,full
1,x86_64,4000.37,LittleEndian,1,6,2000.186,"32-bit,64-bit",2,fpuvmedepsetscmsrpaemcecx8apicsepmtrrpgemcacmo...,KVM,32K,32K,256K,56320K,85,Intel(R)Xeon(R)CPU@2.00GHz,1,1,1,1,3,2,GenuineIntel,full
2,x86_64,4000.36,LittleEndian,1,6,2000.184,"32-bit,64-bit",1,fpuvmedepsetscmsrpaemcecx8apicsepmtrrpgemcacmo...,KVM,32K,32K,256K,56320K,85,Intel(R)Xeon(R)CPU@2.00GHz,0,1,0,1,3,1,GenuineIntel,full
3,x86_64,4000.36,LittleEndian,1,6,2000.184,"32-bit,64-bit",2,fpuvmedepsetscmsrpaemcecx8apicsepmtrrpgemcacmo...,KVM,32K,32K,256K,56320K,85,Intel(R)Xeon(R)CPU@2.00GHz,1,1,1,1,3,2,GenuineIntel,full
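One possible simplification of the helper: `pd.concat` accepts a whole list of DataFrames, so the loop can be replaced by a single call, which also avoids copying the growing frame on every iteration. This is a sketch of an alternative, not the code used in this notebook; `series_to_df_concat` is a hypothetical name and the two small frames are made-up sample data.

```python
import pandas as pd


def series_to_df_concat(frames):
    """Stack an iterable of DataFrames (e.g. a Series of them) into one DataFrame."""
    # A single concat over the full list instead of one concat per element
    combined = pd.concat(list(frames), axis=0, sort=True)
    # Replace the repeated per-frame indices with a fresh 0..n-1 index
    return combined.reset_index(drop=True)


# Made-up one-row frames standing in for per-VM lscpu results
frames = [
    pd.DataFrame({'CPUS': ['1'], 'ARCHITECTURE': ['x86_64']}),
    pd.DataFrame({'CPUS': ['2'], 'ARCHITECTURE': ['x86_64']}),
]
print(series_to_df_concat(frames))
```

Iterating over a Series yields its values, so the Series returned by `VMs.VM.apply(...)` could be passed in directly.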


Now we can merge the `lscpu` DataFrame back with our VMs DataFrame, column-wise.

In [29]:
VMs = pd.concat([VMs, series_to_df(lscpu)], axis=1)
VMs

Unnamed: 0,IP,VM,ARCHITECTURE,BOGOMIPS,BYTEORDER,CORESPERSOCKET,CPUFAMILY,CPUMHZ,CPUOPMODES,CPUS,FLAGS,HYPERVISORVENDOR,L1DCACHE,L1ICACHE,L2CACHE,L3CACHE,MODEL,MODELNAME,NUMANODE0CPUS,NUMANODES,ONLINECPUSLIST,SOCKETS,STEPPING,THREADSPERCORE,VENDORID,VIRTUALIZATIONTYPE
0,35.204.7.190,louwjlabuschagne_gmail_com@35.204.7.190 ✅,x86_64,4000.37,LittleEndian,1,6,2000.186,"32-bit,64-bit",1,fpuvmedepsetscmsrpaemcecx8apicsepmtrrpgemcacmo...,KVM,32K,32K,256K,56320K,85,Intel(R)Xeon(R)CPU@2.00GHz,0,1,0,1,3,1,GenuineIntel,full
1,35.204.197.82,louwjlabuschagne_gmail_com@35.204.197.82 ✅,x86_64,4000.37,LittleEndian,1,6,2000.186,"32-bit,64-bit",2,fpuvmedepsetscmsrpaemcecx8apicsepmtrrpgemcacmo...,KVM,32K,32K,256K,56320K,85,Intel(R)Xeon(R)CPU@2.00GHz,1,1,1,1,3,2,GenuineIntel,full
2,35.204.4.10,louwjlabuschagne_gmail_com@35.204.4.10 ✅,x86_64,4000.36,LittleEndian,1,6,2000.184,"32-bit,64-bit",1,fpuvmedepsetscmsrpaemcecx8apicsepmtrrpgemcacmo...,KVM,32K,32K,256K,56320K,85,Intel(R)Xeon(R)CPU@2.00GHz,0,1,0,1,3,1,GenuineIntel,full
3,35.204.193.154,louwjlabuschagne_gmail_com@35.204.193.154 ✅,x86_64,4000.36,LittleEndian,1,6,2000.184,"32-bit,64-bit",2,fpuvmedepsetscmsrpaemcecx8apicsepmtrrpgemcacmo...,KVM,32K,32K,256K,56320K,85,Intel(R)Xeon(R)CPU@2.00GHz,1,1,1,1,3,2,GenuineIntel,full


In [33]:
VMs.drop('FLAGS', axis=1).T.to_csv('../table_outputs/VMs_lscpu.csv')

And voilà, we've got a `DataFrame` with our VM data populated. Another interesting command to use would be `cat /proc/meminfo`. This time we call `series_to_df` directly on the Series of DataFrames returned by `apply()`.

In [18]:
meminfo = series_to_df(VMs.VM.apply(lambda vm: vm.parse_key_value_output('cat /proc/meminfo')))

meminfo

Concatenate the DataFrames again.

In [20]:
VMs = pd.concat([VMs, meminfo], axis=1)

In [21]:
VMs

Unnamed: 0,IP,VM,ACTIVE,ACTIVEANON,ACTIVEFILE,ANONHUGEPAGES,ANONPAGES,BOUNCE,BUFFERS,CACHED,COMMITLIMIT,COMMITTED_AS,DIRECTMAP1G,DIRECTMAP2M,DIRECTMAP4K,DIRTY,HARDWARECORRUPTED,HUGEPAGESIZE,HUGEPAGES_FREE,HUGEPAGES_RSVD,HUGEPAGES_SURP,HUGEPAGES_TOTAL,INACTIVE,INACTIVEANON,INACTIVEFILE,KERNELSTACK,MAPPED,MEMAVAILABLE,MEMFREE,MEMTOTAL,MLOCKED,NFS_UNSTABLE,PAGETABLES,SHMEM,SHMEMHUGEPAGES,SHMEMPMDMAPPED,SLAB,SRECLAIMABLE,SUNRECLAIM,SWAPCACHED,SWAPFREE,SWAPTOTAL,UNEVICTABLE,VMALLOCCHUNK,VMALLOCTOTAL,VMALLOCUSED,WRITEBACK,WRITEBACKTMP,ARCHITECTURE,BOGOMIPS,BYTEORDER,CORESPERSOCKET,CPUFAMILY,CPUMHZ,CPUOPMODES,CPUS,FLAGS,HYPERVISORVENDOR,L1DCACHE,L1ICACHE,L2CACHE,L3CACHE,MODEL,MODELNAME,NUMANODE0CPUS,NUMANODES,ONLINECPUSLIST,SOCKETS,STEPPING,THREADSPERCORE,VENDORID,VIRTUALIZATIONTYPE
0,35.204.7.190,louwjlabuschagne_gmail_com@35.204.7.190 ✅,90520kB,47692kB,42828kB,0kB,47564kB,0kB,9136kB,53356kB,510208kB,122912kB,0kB,1003520kB,45044kB,32kB,0kB,2048kB,0,0,0,0,19516kB,4060kB,15456kB,1164kB,25484kB,833352kB,870088kB,1020416kB,0kB,0kB,2964kB,4212kB,0kB,0kB,19924kB,11052kB,8872kB,0kB,0kB,0kB,0kB,0kB,34359738367kB,0kB,0kB,0kB,x86_64,4000.37,LittleEndian,1,6,2000.186,"32-bit,64-bit",1,fpuvmedepsetscmsrpaemcecx8apicsepmtrrpgemcacmo...,KVM,32K,32K,256K,56320K,85,Intel(R)Xeon(R)CPU@2.00GHz,0,1,0,1,3,1,GenuineIntel,full
1,35.204.197.82,louwjlabuschagne_gmail_com@35.204.197.82 ✅,91228kB,47804kB,43424kB,0kB,47648kB,0kB,9144kB,53096kB,1026232kB,113492kB,0kB,2052096kB,45044kB,8kB,0kB,2048kB,0,0,0,0,18664kB,4132kB,14532kB,1280kB,25028kB,1850344kB,1891216kB,2052468kB,0kB,0kB,2936kB,4288kB,0kB,0kB,21372kB,11176kB,10196kB,0kB,0kB,0kB,0kB,0kB,34359738367kB,0kB,0kB,0kB,x86_64,4000.37,LittleEndian,1,6,2000.186,"32-bit,64-bit",2,fpuvmedepsetscmsrpaemcecx8apicsepmtrrpgemcacmo...,KVM,32K,32K,256K,56320K,85,Intel(R)Xeon(R)CPU@2.00GHz,1,1,1,1,3,2,GenuineIntel,full
2,35.204.4.10,louwjlabuschagne_gmail_com@35.204.4.10 ✅,89964kB,47692kB,42272kB,0kB,47568kB,0kB,9096kB,53012kB,510208kB,122944kB,0kB,1005568kB,42996kB,12kB,0kB,2048kB,0,0,0,0,19688kB,4060kB,15628kB,1164kB,25584kB,832880kB,869824kB,1020416kB,0kB,0kB,2948kB,4212kB,0kB,0kB,19944kB,11020kB,8924kB,0kB,0kB,0kB,0kB,0kB,34359738367kB,0kB,0kB,0kB,x86_64,4000.36,LittleEndian,1,6,2000.184,"32-bit,64-bit",1,fpuvmedepsetscmsrpaemcecx8apicsepmtrrpgemcacmo...,KVM,32K,32K,256K,56320K,85,Intel(R)Xeon(R)CPU@2.00GHz,0,1,0,1,3,1,GenuineIntel,full
3,35.204.193.154,louwjlabuschagne_gmail_com@35.204.193.154 ✅,90576kB,47976kB,42600kB,0kB,47836kB,0kB,9088kB,53100kB,1026232kB,113444kB,0kB,2052096kB,45044kB,12kB,0kB,2048kB,0,0,0,0,19436kB,4132kB,15304kB,1328kB,25708kB,1849908kB,1890804kB,2052468kB,0kB,0kB,2976kB,4288kB,0kB,0kB,21496kB,11180kB,10316kB,0kB,0kB,0kB,0kB,0kB,34359738367kB,0kB,0kB,0kB,x86_64,4000.36,LittleEndian,1,6,2000.184,"32-bit,64-bit",2,fpuvmedepsetscmsrpaemcecx8apicsepmtrrpgemcacmo...,KVM,32K,32K,256K,56320K,85,Intel(R)Xeon(R)CPU@2.00GHz,1,1,1,1,3,2,GenuineIntel,full


In [22]:
# Select the columns of interest (each listed once) and transpose for display
df_2_save = VMs[['IP', 'ACTIVE', 'MEMAVAILABLE', 'MEMFREE', 'MEMTOTAL',
                 'CPUFAMILY', 'CPUMHZ', 'CPUOPMODES', 'CPUS', 'HYPERVISORVENDOR',
                 'VENDORID', 'MODEL', 'MODELNAME', 'NUMANODE0CPUS',
                 'NUMANODES', 'ONLINECPUSLIST', 'SOCKETS', 'STEPPING', 'THREADSPERCORE',
                 'VIRTUALIZATIONTYPE']].T
df_2_save.to_csv('../table_outputs/VMs_info.csv')
df_2_save

Unnamed: 0,0,1,2,3
IP,35.204.7.190,35.204.197.82,35.204.4.10,35.204.193.154
ACTIVE,90520kB,91228kB,89964kB,90576kB
MEMAVAILABLE,833352kB,1850344kB,832880kB,1849908kB
MEMFREE,870088kB,1891216kB,869824kB,1890804kB
MEMTOTAL,1020416kB,2052468kB,1020416kB,2052468kB
CPUFAMILY,6,6,6,6
CPUMHZ,2000.186,2000.186,2000.184,2000.184
CPUOPMODES,"32-bit,64-bit","32-bit,64-bit","32-bit,64-bit","32-bit,64-bit"
CPUS,1,2,1,2
HYPERVISORVENDOR,KVM,KVM,KVM,KVM
