<h1>Table of Contents<span class="tocSkip"></span></h1>
<div class="toc"><ul class="toc-item"><li><span><a href="#Using-paramiko,-pandas-and-a-VM-class" data-toc-modified-id="Using-paramiko,-pandas-and-a-VM-class-1">Using paramiko, pandas and a VM class</a></span><ul class="toc-item"><li><span><a href="#Package-Imports" data-toc-modified-id="Package-Imports-1.1">Package Imports</a></span></li><li><span><a href="#Constants" data-toc-modified-id="Constants-1.2">Constants</a></span></li><li><span><a href="#Creating-1-VM-Class" data-toc-modified-id="Creating-1-VM-Class-1.3">Creating 1 VM Class</a></span></li><li><span><a href="#DataFrame-of-Classes" data-toc-modified-id="DataFrame-of-Classes-1.4">DataFrame of Classes</a></span></li></ul></li></ul></div>

# Using paramiko, pandas and a VM class
This notebook accompanies a Medium blog post in which I provide the background and methodology.

## Package Imports

In [1]:
#Data Science Stack
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

#virtualmachine.py python script
from virtualmachine import VM, clean_columns

#pandas notebook display formatting 
pd.options.display.max_columns = 100
pd.options.display.max_rows = 100

## Constants

In [2]:
HOST = '35.204.255.178'
USERNAME = 'louwjlabuschagne_gmail_com'
PUB_KEY = '/Users/louwrenslabuschagne/.ssh/id_rsa.pub'

## Creating 1 VM Class

In [3]:
vm = VM(HOST, USERNAME, PUB_KEY)

In [4]:
print(vm)

louwjlabuschagne_gmail_com@35.204.96.40 ✅


The default command for the `parse_key_value_output()` function is `lscpu`, but any colon-separated bash output can be parsed into a DataFrame.

In [5]:
vm_lscpu = vm.parse_key_value_output('lscpu').T.head(10)
vm_lscpu.to_csv('../table_outputs/vm_lscpu.csv')
vm_lscpu

| | 0 |
|---|---|
| ARCHITECTURE | x86_64 |
| CPUOPMODES | 32-bit,64-bit |
| BYTEORDER | LittleEndian |
| CPUS | 1 |
| ONLINECPUSLIST | 0 |
| THREADSPERCORE | 1 |
| CORESPERSOCKET | 1 |
| SOCKETS | 1 |
| NUMANODES | 1 |
| VENDORID | GenuineIntel |
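
The source of `parse_key_value_output()` lives in `virtualmachine.py` and is not shown in this notebook, so the following is only a minimal sketch of how colon-separated output could be turned into a one-row DataFrame. The function name `parse_colon_separated`, the key-cleaning rule and the sample lines are assumptions made for illustration, not the actual implementation.

```python
import re
import pandas as pd

def parse_colon_separated(lines):
    """Turn 'key: value' lines into a one-row DataFrame (hypothetical helper)."""
    record = {}
    for line in lines:
        if ':' not in line:
            continue  # skip lines that are not key/value pairs
        key, value = line.split(':', 1)  # split on the first colon only
        # Upper-case the key and keep only letters, digits and underscores
        key = re.sub(r'[^A-Z0-9_]', '', key.upper())
        # Remove all whitespace from the value
        record[key] = ''.join(value.split())
    return pd.DataFrame([record])

# Sample lines in the style of `lscpu` output (assumed for this sketch)
sample = ['Architecture:        x86_64',
          'CPU op-mode(s):      32-bit, 64-bit',
          'Byte Order:          Little Endian']
parsed = parse_colon_separated(sample)
print(parsed.T)
```

Splitting on the first colon only keeps values that themselves contain colons intact.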


The same works for other commands, like `cat /proc/meminfo`.

In [6]:
vm_meminfo = vm.parse_key_value_output('cat /proc/meminfo').T.head(10)
vm_meminfo.to_csv('../table_outputs/vm_meminfo.csv')
vm_meminfo

| | 0 |
|---|---|
| MEMTOTAL | 1020416kB |
| MEMFREE | 872124kB |
| MEMAVAILABLE | 835732kB |
| BUFFERS | 9820kB |
| CACHED | 52020kB |
| SWAPCACHED | 0kB |
| ACTIVE | 92332kB |
| INACTIVE | 17356kB |
| ACTIVEANON | 47996kB |
| INACTIVEANON | 2784kB |


I advise writing predefined functions for bash commands whose output isn't as straightforward to parse, but which you'll use a lot.

In [7]:
vm_df = vm.cmd_df()
vm_df.to_csv('../table_outputs/vm1_df.csv', index=False)
vm_df

| | FILESYSTEM | 1KBLOCKS | USED | AVAILABLE | USE_percentage_ | MOUNTED_ON |
|---|---|---|---|---|---|---|
| 0 | udev | 499068 | 0 | 499068 | 0.0 | /dev |
| 1 | tmpfs | 102044 | 2936 | 99108 | 0.03 | /run |
| 2 | /dev/sda1 | 10253588 | 1090856 | 8622164 | 0.12 | / |
| 3 | tmpfs | 510208 | 0 | 510208 | 0.0 | /dev/shm |
| 4 | tmpfs | 5120 | 0 | 5120 | 0.0 | /run/lock |
| 5 | tmpfs | 510208 | 0 | 510208 | 0.0 | /sys/fs/cgroup |
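
As an illustration of such a predefined function, here is a rough sketch of how the whitespace-separated output of `df` could be parsed. The real `cmd_df()` is defined in `virtualmachine.py` and may work differently; `parse_df_output`, its column handling and the sample lines are assumptions for this example.

```python
import pandas as pd

def parse_df_output(lines):
    """Parse the text output of `df` into a DataFrame (hypothetical stand-in for cmd_df)."""
    header, *rows = [line for line in lines if line.strip()]
    # 'Mounted on' is one column name containing a space, so protect it before splitting
    columns = header.replace('Mounted on', 'Mounted_on').split()
    # Split each row into at most len(columns) fields so mount points with spaces stay whole
    records = [row.strip().split(None, len(columns) - 1) for row in rows]
    table = pd.DataFrame(records, columns=columns)
    for col in ['1K-blocks', 'Used', 'Available']:
        table[col] = table[col].astype(int)
    # Convert a string such as '12%' to the fraction 0.12
    table['Use%'] = table['Use%'].str.rstrip('%').astype(int) / 100
    return table

# Sample lines in the style of `df` output (assumed for this sketch)
sample = ['Filesystem     1K-blocks    Used Available Use% Mounted on',
          'udev              499068       0    499068   0% /dev',
          '/dev/sda1       10253588 1090856   8622164  12% /']
disk = parse_df_output(sample)
print(disk)
```

Unlike colon-separated output, `df` prints a header row followed by aligned columns, which is why it needs its own parser.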


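Below we sum the `AVAILABLE` column, which `df` reports in 1K blocks, so dividing by 1e3 gives roughly megabytes. This VM has about 10 GB available across all of its filesystems.

In [8]:
print('%.2f MB total available storage' % (vm.cmd_df().AVAILABLE.astype(int).sum() / 1e3))

10245.88 MB total available storage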
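
For reference, the unit conversion can be worked through by hand. This small sketch uses the `AVAILABLE` values from the table above and assumes each block is 1024 bytes, as is usual for the `1K-blocks` column of `df`.

```python
# AVAILABLE values (in 1K blocks) taken from the df table above
available_1k_blocks = [499068, 99108, 8622164, 510208, 5120, 510208]

total_kib = sum(available_1k_blocks)   # total number of 1K blocks
total_bytes = total_kib * 1024         # assuming 1 block = 1024 bytes
total_gib = total_bytes / 1024 ** 3    # bytes to gibibytes
total_gb = total_bytes / 1e9           # bytes to decimal gigabytes

print(total_kib)
print(round(total_gib, 2), 'GiB')
print(round(total_gb, 2), 'GB')
```

Note that this total includes `tmpfs` and `udev` mounts, which are memory-backed rather than disk space.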


Or any other command you can think of.

In [9]:
vm.exec_command('which python')

['/usr/bin/python']

Literally...

In [10]:
vm.exec_command('ls --help')[0:20]

['Usage: ls [OPTION]... [FILE]...',
 'List information about the FILEs (the current directory by default).',
 'Sort entries alphabetically if none of -cftuvSUX nor --sort is specified.',
 '',
 'Mandatory arguments to long options are mandatory for short options too.',
 '  -a, --all                  do not ignore entries starting with .',
 '  -A, --almost-all           do not list implied . and ..',
 '      --author               with -l, print the author of each file',
 '  -b, --escape               print C-style escapes for nongraphic characters',
 '      --block-size=SIZE      scale sizes by SIZE before printing them; e.g.,',
 "                               '--block-size=M' prints sizes in units of",
 '                               1,048,576 bytes; see SIZE format below',
 '  -B, --ignore-backups       do not list implied entries ending with ~',
 '  -c                         with -lt: sort by, and show, ctime (time of last',
 '                               modification of file st
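
The `exec_command()` method is defined in `virtualmachine.py`, which is not shown here. As a hedged sketch of the idea, the function below runs a command through any client object that behaves like `paramiko.SSHClient`, whose `exec_command()` returns a `(stdin, stdout, stderr)` tuple of file-like objects. The names `run_command` and `FakeClient` are hypothetical, and the fake client only exists so the example runs without a server.

```python
import io

def run_command(client, command):
    """Run `command` through an SSH client and return stdout as a list of lines.

    `client` is assumed to behave like paramiko.SSHClient: exec_command()
    returns a (stdin, stdout, stderr) tuple of file-like objects.
    """
    _stdin, stdout, _stderr = client.exec_command(command)
    return stdout.read().decode('utf-8').splitlines()


class FakeClient:
    """Offline stand-in for an SSH client (hypothetical, for illustration only)."""
    def exec_command(self, command):
        # Always answer with the same canned output, whatever the command is
        return None, io.BytesIO(b'/usr/bin/python\n'), io.BytesIO(b'')


result = run_command(FakeClient(), 'which python')
print(result)
```

With a real connection, `client` would be a connected `paramiko.SSHClient` instead of `FakeClient`.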

## DataFrame of Classes

Now... the magic begins when we combine the abilities of `DataFrame`s with our VM class.


First we create a `DataFrame` with all our VMs.

In [11]:
ips = ['35.204.255.178',
       '35.204.115.95',
       '35.204.96.40',
       '35.204.213.24']

VMs = pd.DataFrame(dict(IP=ips))
VMs

| | IP |
|---|---|
| 0 | 35.204.255.178 |
| 1 | 35.204.115.95 |
| 2 | 35.204.96.40 |
| 3 | 35.204.213.24 |


Then we create a VM instance for each IP and store it in the `VM` column of the `VMs` DataFrame.

In [12]:
VMs['VM'] = VMs.apply(lambda row: VM(row.IP, USERNAME, PUB_KEY), axis=1)
VMs

| | IP | VM |
|---|---|---|
| 0 | 35.204.255.178 | louwjlabuschagne_gmail_com@35.204.255.178 ✅ |
| 1 | 35.204.115.95 | louwjlabuschagne_gmail_com@35.204.115.95 ✅ |
| 2 | 35.204.96.40 | louwjlabuschagne_gmail_com@35.204.96.40 ✅ |
| 3 | 35.204.213.24 | louwjlabuschagne_gmail_com@35.204.213.24 ✅ |
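
A DataFrame column can hold arbitrary Python objects, which is what makes this pattern possible. The sketch below shows the same idea offline with a hypothetical `FakeVM` class that opens no SSH connection. The class, the IP addresses and the user name are all made up for illustration.

```python
import pandas as pd

class FakeVM:
    """Hypothetical stand-in for the VM class; it opens no SSH connection."""
    def __init__(self, host, username):
        self.host = host
        self.username = username

    def __repr__(self):
        # pandas uses repr() when it displays objects stored in a column
        return '%s@%s' % (self.username, self.host)

hosts = pd.DataFrame({'IP': ['10.0.0.1', '10.0.0.2']})
# Build one object per row and store it in a new column
hosts['VM'] = hosts['IP'].apply(lambda ip: FakeVM(ip, 'demo_user'))

print(hosts)
print(hosts['VM'].dtype)
```

The column has dtype `object`, and each cell is a live instance whose methods can be called through `apply`.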


In [13]:
VMs.to_csv('../table_outputs/VM_connected_state.csv', index=False)

Now we can do some amazing things, like collecting the memory information for all of these machines at once.

In [14]:
meminfo = VMs.VM.apply(lambda vm: vm.parse_key_value_output('cat /proc/meminfo'))
meminfo

We just need a little helper function to convert our Series of DataFrames into a single DataFrame. This is an undesired artifact of using `apply`, which here returns a Series holding one DataFrame per VM. Any suggestions are welcome.

In [16]:
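def series_to_df(series):
    # Stack the per-VM DataFrames row-wise with a single concat call
    df = pd.concat(list(series), axis=0, sort=True)
    # Sort the columns alphabetically and renumber the rows 0..n-1
    df = df.sort_index(axis=1).reset_index(drop=True)
    return df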

In [17]:
series_to_df(meminfo)

Unnamed: 0,ACTIVE,ACTIVEANON,ACTIVEFILE,ANONHUGEPAGES,ANONPAGES,BOUNCE,BUFFERS,CACHED,COMMITLIMIT,COMMITTED_AS,DIRECTMAP1G,DIRECTMAP2M,DIRECTMAP4K,DIRTY,HARDWARECORRUPTED,HUGEPAGESIZE,HUGEPAGES_FREE,HUGEPAGES_RSVD,HUGEPAGES_SURP,HUGEPAGES_TOTAL,INACTIVE,INACTIVEANON,INACTIVEFILE,KERNELSTACK,MAPPED,MEMAVAILABLE,MEMFREE,MEMTOTAL,MLOCKED,NFS_UNSTABLE,PAGETABLES,SHMEM,SHMEMHUGEPAGES,SHMEMPMDMAPPED,SLAB,SRECLAIMABLE,SUNRECLAIM,SWAPCACHED,SWAPFREE,SWAPTOTAL,UNEVICTABLE,VMALLOCCHUNK,VMALLOCTOTAL,VMALLOCUSED,WRITEBACK,WRITEBACKTMP
0,93816kB,48268kB,45548kB,0kB,48136kB,0kB,10016kB,53496kB,510208kB,122748kB,0kB,1001472kB,47092kB,40kB,0kB,2048kB,0,0,0,0,17816kB,4060kB,13756kB,1116kB,25924kB,832100kB,868292kB,1020416kB,0kB,0kB,2924kB,4212kB,0kB,0kB,19976kB,11128kB,8848kB,0kB,0kB,0kB,0kB,0kB,34359738367kB,0kB,0kB,0kB
1,91424kB,47996kB,43428kB,0kB,47888kB,0kB,9640kB,53188kB,1026232kB,113008kB,0kB,2050048kB,47092kB,8kB,0kB,2048kB,0,0,0,0,19248kB,4132kB,15116kB,1248kB,25860kB,1850372kB,1890916kB,2052468kB,0kB,0kB,2952kB,4288kB,0kB,0kB,21360kB,11248kB,10112kB,0kB,0kB,0kB,0kB,0kB,34359738367kB,0kB,0kB,0kB
2,92428kB,48036kB,44392kB,0kB,47892kB,0kB,9860kB,52028kB,510208kB,113296kB,0kB,1005568kB,42996kB,20kB,0kB,2048kB,0,0,0,0,17348kB,2784kB,14564kB,1132kB,25828kB,833044kB,869408kB,1020416kB,0kB,0kB,2964kB,2936kB,0kB,0kB,19980kB,11128kB,8852kB,0kB,0kB,0kB,0kB,0kB,34359738367kB,0kB,0kB,0kB
3,92632kB,48252kB,44380kB,0kB,48088kB,0kB,9724kB,55768kB,1026232kB,125520kB,0kB,2050048kB,47092kB,8kB,0kB,2048kB,0,0,0,0,20960kB,6700kB,14260kB,1272kB,25948kB,1847852kB,1888348kB,2052468kB,0kB,0kB,3004kB,6856kB,0kB,0kB,21436kB,11248kB,10188kB,0kB,0kB,0kB,0kB,0kB,34359738367kB,0kB,0kB,0kB


Now we can merge it back with our VMs DataFrame.

In [18]:
VMs = pd.concat([VMs, series_to_df(meminfo)], axis=1)
VMs

Unnamed: 0,IP,VM,ACTIVE,ACTIVEANON,ACTIVEFILE,ANONHUGEPAGES,ANONPAGES,BOUNCE,BUFFERS,CACHED,COMMITLIMIT,COMMITTED_AS,DIRECTMAP1G,DIRECTMAP2M,DIRECTMAP4K,DIRTY,HARDWARECORRUPTED,HUGEPAGESIZE,HUGEPAGES_FREE,HUGEPAGES_RSVD,HUGEPAGES_SURP,HUGEPAGES_TOTAL,INACTIVE,INACTIVEANON,INACTIVEFILE,KERNELSTACK,MAPPED,MEMAVAILABLE,MEMFREE,MEMTOTAL,MLOCKED,NFS_UNSTABLE,PAGETABLES,SHMEM,SHMEMHUGEPAGES,SHMEMPMDMAPPED,SLAB,SRECLAIMABLE,SUNRECLAIM,SWAPCACHED,SWAPFREE,SWAPTOTAL,UNEVICTABLE,VMALLOCCHUNK,VMALLOCTOTAL,VMALLOCUSED,WRITEBACK,WRITEBACKTMP
0,35.204.255.178,louwjlabuschagne_gmail_com@35.204.255.178 ✅,93816kB,48268kB,45548kB,0kB,48136kB,0kB,10016kB,53496kB,510208kB,122748kB,0kB,1001472kB,47092kB,40kB,0kB,2048kB,0,0,0,0,17816kB,4060kB,13756kB,1116kB,25924kB,832100kB,868292kB,1020416kB,0kB,0kB,2924kB,4212kB,0kB,0kB,19976kB,11128kB,8848kB,0kB,0kB,0kB,0kB,0kB,34359738367kB,0kB,0kB,0kB
1,35.204.115.95,louwjlabuschagne_gmail_com@35.204.115.95 ✅,91424kB,47996kB,43428kB,0kB,47888kB,0kB,9640kB,53188kB,1026232kB,113008kB,0kB,2050048kB,47092kB,8kB,0kB,2048kB,0,0,0,0,19248kB,4132kB,15116kB,1248kB,25860kB,1850372kB,1890916kB,2052468kB,0kB,0kB,2952kB,4288kB,0kB,0kB,21360kB,11248kB,10112kB,0kB,0kB,0kB,0kB,0kB,34359738367kB,0kB,0kB,0kB
2,35.204.96.40,louwjlabuschagne_gmail_com@35.204.96.40 ✅,92428kB,48036kB,44392kB,0kB,47892kB,0kB,9860kB,52028kB,510208kB,113296kB,0kB,1005568kB,42996kB,20kB,0kB,2048kB,0,0,0,0,17348kB,2784kB,14564kB,1132kB,25828kB,833044kB,869408kB,1020416kB,0kB,0kB,2964kB,2936kB,0kB,0kB,19980kB,11128kB,8852kB,0kB,0kB,0kB,0kB,0kB,34359738367kB,0kB,0kB,0kB
3,35.204.213.24,louwjlabuschagne_gmail_com@35.204.213.24 ✅,92632kB,48252kB,44380kB,0kB,48088kB,0kB,9724kB,55768kB,1026232kB,125520kB,0kB,2050048kB,47092kB,8kB,0kB,2048kB,0,0,0,0,20960kB,6700kB,14260kB,1272kB,25948kB,1847852kB,1888348kB,2052468kB,0kB,0kB,3004kB,6856kB,0kB,0kB,21436kB,11248kB,10188kB,0kB,0kB,0kB,0kB,0kB,34359738367kB,0kB,0kB,0kB
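
Concatenating with `axis=1` pairs rows by index label, which is why the helper resets the index to 0..n-1 before the merge. A small offline sketch with made-up IP addresses shows what happens when the indexes do and do not line up.

```python
import pandas as pd

vms = pd.DataFrame({'IP': ['10.0.0.1', '10.0.0.2']})
mem = pd.DataFrame({'MEMTOTAL': ['1020416kB', '2052468kB']})

# Both frames use the default 0..n-1 index, so rows pair up one to one
aligned = pd.concat([vms, mem], axis=1)

# With a different index nothing lines up and the gaps are filled with NaN
mem_shifted = mem.copy()
mem_shifted.index = [5, 6]
misaligned = pd.concat([vms, mem_shifted], axis=1)

print(aligned)
print(misaligned)
```

The aligned result has two rows, while the misaligned one has four rows, each half empty.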


And voilà, we've got a `DataFrame` populated with our VM data. Another interesting command to use is `lscpu`. This time we call `series_to_df` directly on the Series of DataFrames returned by `apply()`.

In [19]:
lcspu = series_to_df(VMs.VM.apply(lambda vm: vm.parse_key_value_output('lscpu')))

In [20]:
lcspu

Unnamed: 0,ARCHITECTURE,BOGOMIPS,BYTEORDER,CORESPERSOCKET,CPUFAMILY,CPUMHZ,CPUOPMODES,CPUS,FLAGS,HYPERVISORVENDOR,L1DCACHE,L1ICACHE,L2CACHE,L3CACHE,MODEL,MODELNAME,NUMANODE0CPUS,NUMANODES,ONLINECPUSLIST,SOCKETS,STEPPING,THREADSPERCORE,VENDORID,VIRTUALIZATIONTYPE
0,x86_64,4000.34,LittleEndian,1,6,2000.17,"32-bit,64-bit",1,fpuvmedepsetscmsrpaemcecx8apicsepmtrrpgemcacmo...,KVM,32K,32K,256K,56320K,85,Intel(R)Xeon(R)CPU@2.00GHz,0,1,0,1,3,1,GenuineIntel,full
1,x86_64,4000.34,LittleEndian,1,6,2000.174,"32-bit,64-bit",2,fpuvmedepsetscmsrpaemcecx8apicsepmtrrpgemcacmo...,KVM,32K,32K,256K,56320K,85,Intel(R)Xeon(R)CPU@2.00GHz,1,1,1,1,3,2,GenuineIntel,full
2,x86_64,4000.36,LittleEndian,1,6,2000.184,"32-bit,64-bit",1,fpuvmedepsetscmsrpaemcecx8apicsepmtrrpgemcacmo...,KVM,32K,32K,256K,56320K,85,Intel(R)Xeon(R)CPU@2.00GHz,0,1,0,1,3,1,GenuineIntel,full
3,x86_64,4000.4,LittleEndian,1,6,2000.2,"32-bit,64-bit",2,fpuvmedepsetscmsrpaemcecx8apicsepmtrrpgemcacmo...,KVM,32K,32K,256K,56320K,85,Intel(R)Xeon(R)CPU@2.00GHz,1,1,1,1,3,2,GenuineIntel,full


Concatenate the DataFrames again.

In [21]:
VMs = pd.concat([VMs, lcspu], axis=1)

In [22]:
VMs

Unnamed: 0,IP,VM,ACTIVE,ACTIVEANON,ACTIVEFILE,ANONHUGEPAGES,ANONPAGES,BOUNCE,BUFFERS,CACHED,COMMITLIMIT,COMMITTED_AS,DIRECTMAP1G,DIRECTMAP2M,DIRECTMAP4K,DIRTY,HARDWARECORRUPTED,HUGEPAGESIZE,HUGEPAGES_FREE,HUGEPAGES_RSVD,HUGEPAGES_SURP,HUGEPAGES_TOTAL,INACTIVE,INACTIVEANON,INACTIVEFILE,KERNELSTACK,MAPPED,MEMAVAILABLE,MEMFREE,MEMTOTAL,MLOCKED,NFS_UNSTABLE,PAGETABLES,SHMEM,SHMEMHUGEPAGES,SHMEMPMDMAPPED,SLAB,SRECLAIMABLE,SUNRECLAIM,SWAPCACHED,SWAPFREE,SWAPTOTAL,UNEVICTABLE,VMALLOCCHUNK,VMALLOCTOTAL,VMALLOCUSED,WRITEBACK,WRITEBACKTMP,ARCHITECTURE,BOGOMIPS,BYTEORDER,CORESPERSOCKET,CPUFAMILY,CPUMHZ,CPUOPMODES,CPUS,FLAGS,HYPERVISORVENDOR,L1DCACHE,L1ICACHE,L2CACHE,L3CACHE,MODEL,MODELNAME,NUMANODE0CPUS,NUMANODES,ONLINECPUSLIST,SOCKETS,STEPPING,THREADSPERCORE,VENDORID,VIRTUALIZATIONTYPE
0,35.204.255.178,louwjlabuschagne_gmail_com@35.204.255.178 ✅,93816kB,48268kB,45548kB,0kB,48136kB,0kB,10016kB,53496kB,510208kB,122748kB,0kB,1001472kB,47092kB,40kB,0kB,2048kB,0,0,0,0,17816kB,4060kB,13756kB,1116kB,25924kB,832100kB,868292kB,1020416kB,0kB,0kB,2924kB,4212kB,0kB,0kB,19976kB,11128kB,8848kB,0kB,0kB,0kB,0kB,0kB,34359738367kB,0kB,0kB,0kB,x86_64,4000.34,LittleEndian,1,6,2000.17,"32-bit,64-bit",1,fpuvmedepsetscmsrpaemcecx8apicsepmtrrpgemcacmo...,KVM,32K,32K,256K,56320K,85,Intel(R)Xeon(R)CPU@2.00GHz,0,1,0,1,3,1,GenuineIntel,full
1,35.204.115.95,louwjlabuschagne_gmail_com@35.204.115.95 ✅,91424kB,47996kB,43428kB,0kB,47888kB,0kB,9640kB,53188kB,1026232kB,113008kB,0kB,2050048kB,47092kB,8kB,0kB,2048kB,0,0,0,0,19248kB,4132kB,15116kB,1248kB,25860kB,1850372kB,1890916kB,2052468kB,0kB,0kB,2952kB,4288kB,0kB,0kB,21360kB,11248kB,10112kB,0kB,0kB,0kB,0kB,0kB,34359738367kB,0kB,0kB,0kB,x86_64,4000.34,LittleEndian,1,6,2000.174,"32-bit,64-bit",2,fpuvmedepsetscmsrpaemcecx8apicsepmtrrpgemcacmo...,KVM,32K,32K,256K,56320K,85,Intel(R)Xeon(R)CPU@2.00GHz,1,1,1,1,3,2,GenuineIntel,full
2,35.204.96.40,louwjlabuschagne_gmail_com@35.204.96.40 ✅,92428kB,48036kB,44392kB,0kB,47892kB,0kB,9860kB,52028kB,510208kB,113296kB,0kB,1005568kB,42996kB,20kB,0kB,2048kB,0,0,0,0,17348kB,2784kB,14564kB,1132kB,25828kB,833044kB,869408kB,1020416kB,0kB,0kB,2964kB,2936kB,0kB,0kB,19980kB,11128kB,8852kB,0kB,0kB,0kB,0kB,0kB,34359738367kB,0kB,0kB,0kB,x86_64,4000.36,LittleEndian,1,6,2000.184,"32-bit,64-bit",1,fpuvmedepsetscmsrpaemcecx8apicsepmtrrpgemcacmo...,KVM,32K,32K,256K,56320K,85,Intel(R)Xeon(R)CPU@2.00GHz,0,1,0,1,3,1,GenuineIntel,full
3,35.204.213.24,louwjlabuschagne_gmail_com@35.204.213.24 ✅,92632kB,48252kB,44380kB,0kB,48088kB,0kB,9724kB,55768kB,1026232kB,125520kB,0kB,2050048kB,47092kB,8kB,0kB,2048kB,0,0,0,0,20960kB,6700kB,14260kB,1272kB,25948kB,1847852kB,1888348kB,2052468kB,0kB,0kB,3004kB,6856kB,0kB,0kB,21436kB,11248kB,10188kB,0kB,0kB,0kB,0kB,0kB,34359738367kB,0kB,0kB,0kB,x86_64,4000.4,LittleEndian,1,6,2000.2,"32-bit,64-bit",2,fpuvmedepsetscmsrpaemcecx8apicsepmtrrpgemcacmo...,KVM,32K,32K,256K,56320K,85,Intel(R)Xeon(R)CPU@2.00GHz,1,1,1,1,3,2,GenuineIntel,full


In [23]:
# Select a subset of memory and CPU columns (each listed once) and transpose
# so that every VM becomes a column
cols = ['IP', 'ACTIVE', 'MEMAVAILABLE', 'MEMFREE', 'MEMTOTAL',
        'CPUFAMILY', 'CPUMHZ', 'CPUOPMODES', 'CPUS', 'HYPERVISORVENDOR',
        'VENDORID', 'MODEL', 'MODELNAME', 'NUMANODE0CPUS', 'NUMANODES',
        'ONLINECPUSLIST', 'SOCKETS', 'STEPPING', 'THREADSPERCORE',
        'VIRTUALIZATIONTYPE']
df_2_save = VMs[cols].T
df_2_save.to_csv('../table_outputs/VMs_info.csv')
df_2_save

| | 0 | 1 | 2 | 3 |
|---|---|---|---|---|
| IP | 35.204.255.178 | 35.204.115.95 | 35.204.96.40 | 35.204.213.24 |
| ACTIVE | 93816kB | 91424kB | 92428kB | 92632kB |
| MEMAVAILABLE | 832100kB | 1850372kB | 833044kB | 1847852kB |
| MEMFREE | 868292kB | 1890916kB | 869408kB | 1888348kB |
| MEMTOTAL | 1020416kB | 2052468kB | 1020416kB | 2052468kB |
| CPUFAMILY | 6 | 6 | 6 | 6 |
| CPUMHZ | 2000.170 | 2000.174 | 2000.184 | 2000.200 |
| CPUOPMODES | 32-bit,64-bit | 32-bit,64-bit | 32-bit,64-bit | 32-bit,64-bit |
| CPUS | 1 | 2 | 1 | 2 |
| HYPERVISORVENDOR | KVM | KVM | KVM | KVM |
