
### Assignment: Monitoring Resources Using Psutil

## Learning Objectives

At the end of the experiment, you will be able to:
 
- Understand what monitoring your device means
- Explore various functions of `psutil` package
- Explore `multiprocessing` package
- Evaluate the advantage of parallelism using Psutil

### What does monitoring your device mean?

It means keeping track of the various resources in a system and how heavily each of them is being used.


Resources such as:
- CPU
- GPU 
- Memory (RAM, Swap space, and Hard disk space)
- Disks 
- Network 
- Sensors

### Why do we want to monitor various resources?

1. Monitoring and measuring help us understand resource allocation better.
2. Monitoring helps to regularly evaluate the performance of critical system resources.
3. It helps to identify the process that is using the most resources.
4. It helps to evaluate whether the current system's resources are sufficient to execute a particular task.
5. It helps to catch problems early, before they escalate.

### Psutil

Psutil is a Python cross-platform library used to access system details and process utilities.
This library is used for system monitoring, profiling, limiting process resources, and the management of running processes.

See the [psutil project page on PyPI](https://pypi.org/project/psutil/), which links to the official documentation.

Let us explore various functions of Psutil.

### Import required packages

In [None]:
# Importing libraries
import psutil
import platform

### System profile




Here we will explore the Psutil functions that help us learn about the system.

Profile your system to find out the system name, the OS version, whether the system has a 64-bit or 32-bit architecture, the number of physical and virtual cores, and the max and min frequency of the CPU.

In [None]:
#Windows or Linux
uname = platform.uname()
print(f"System: {uname.system}")  

System: Linux


In [None]:
# System name
print(f"Node Name: {uname.node}") 

Node Name: f623167558d9


In [None]:
# OS release version like  10(Windows) or 5.4.0-72-generic(linux)
print(f"Release: {uname.release}") 

Release: 5.4.144+


In [None]:
print(f"Version: {uname.version}")

Version: #1 SMP Tue Dec 7 09:58:10 PST 2021


In [None]:
# machine can be AMD64 or x86-64
print(f"Machine: {uname.machine}")  

Machine: x86_64


In [None]:
#  Intel64 Family 6 or x86_64
print(f"Processor: {uname.processor}") 

Processor: x86_64


In [None]:
#Number of physical cores
print("Physical cores:", psutil.cpu_count(logical=False))

Physical cores: 1


In [None]:
print("Total cores:", psutil.cpu_count(logical=True))

Total cores: 2


`psutil.cpu_times()` returns the system CPU times as a named tuple. Each field is the number of seconds the CPU has spent in the given mode:

* user – time spent by normal processes executing in user mode.
* system – time spent by processes executing in kernel mode.
* idle – time when the system was idle.
* nice – time spent by niced (priority-adjusted) processes executing in user mode.
* iowait – time spent waiting for I/O to complete. This is not accounted for in the idle time counter.
* irq – time spent servicing hardware interrupts.
* softirq – time spent servicing software interrupts.
* steal – time spent by other operating systems running in a virtualized environment.
* guest – time spent running a virtual CPU for guest operating systems under the control of the Linux kernel.

In [None]:
print(psutil.cpu_times())

scputimes(user=41.53, nice=0.0, system=18.53, idle=28.48, iowait=8.83, irq=0.0, softirq=0.55, steal=0.05, guest=0.0, guest_nice=0.0)


`psutil.cpu_percent()` returns the current system-wide CPU utilization as a percentage. It is recommended to pass a time interval (in seconds) over which the usage is measured. Without an interval, the value is computed relative to the previous call, so the result of the first call is not meaningful and readings can vary widely.

In [None]:
print(psutil.cpu_percent(1))

13.0
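To see why the interval matters, it helps to know that a utilization percentage is derived from two samples of the CPU time counters: the busy time accumulated between the samples divided by the total time accumulated. The following is a simplified sketch of that calculation, not psutil's actual implementation. The `CpuSample` tuple and the `busy_percent` helper are hypothetical names, and only three of the `cpu_times()` fields are modelled.

```python
from collections import namedtuple

# Hypothetical, reduced stand-in for the tuple returned by psutil.cpu_times()
CpuSample = namedtuple("CpuSample", ["user", "system", "idle"])


def busy_percent(before, after):
    """Return the share of non-idle CPU time between two samples, in percent."""
    deltas = [b - a for a, b in zip(before, after)]
    total = sum(deltas)
    if total <= 0:
        # No time elapsed between the samples, so nothing can be computed
        return 0.0
    idle = after.idle - before.idle
    return round(100.0 * (total - idle) / total, 1)


# Two made-up samples taken some time apart (values in seconds)
first = CpuSample(user=10.0, system=5.0, idle=85.0)
second = CpuSample(user=12.0, system=6.0, idle=102.0)
print(busy_percent(first, second))
```

A very short measurement window leaves only a few counter ticks in the deltas, which is one reason readings taken without a reasonable interval tend to fluctuate.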


`psutil.cpu_stats()` returns various CPU statistics as a named tuple:

* ctx_switches – number of context switches since boot.
* interrupts – number of interrupts since boot.
* soft_interrupts – number of software interrupts since boot.
* syscalls – number of system calls since boot. Always set to 0 on Linux.

In [None]:
print("CPU Statistics", psutil.cpu_stats())

CPU Statistics scpustats(ctx_switches=386044, interrupts=218633, soft_interrupts=226355, syscalls=0)


In [None]:
print(psutil.boot_time())

1646920632.0


`psutil.boot_time()` returns the system boot time, expressed in seconds since the epoch.
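A raw epoch value is hard to read, so it is usually converted to a calendar date. The sketch below uses only the standard library and the timestamp printed above; in practice the value would come from `psutil.boot_time()`. The variable names are illustrative.

```python
from datetime import datetime, timezone

# Boot timestamp copied from the output above (seconds since the epoch)
boot_timestamp = 1646920632.0

# Interpret the timestamp in UTC so the result does not depend on the local time zone
boot_dt = datetime.fromtimestamp(boot_timestamp, tz=timezone.utc)
boot_text = boot_dt.strftime("%Y-%m-%d %H:%M:%S")
print(f"Boot time (UTC): {boot_text}")
```

Dropping the `tz` argument would give the same instant expressed in the machine's local time zone.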

### Monitoring and Limiting Memory

Virtual memory is a combination of RAM and the disk space that all the processes running on the CPU use, while Swap space is the portion of virtual memory on the hard disk used by the running processes when the RAM is full.

`psutil.virtual_memory()` returns statistics about system memory usage as a named tuple, with all sizes expressed in bytes:

* total – total physical memory excluding swap.
* available – the memory that can be given instantly to processes without the system going into swap.
* used – memory used.
* free – memory that is not being used at all and is readily available.
* active – memory currently in use or very recently used.
* inactive – memory that is marked as not used.
* buffers – cache for things like file system metadata.
* cached – cache for various other things.
* shared – memory that may be accessed by multiple processes.

In [None]:
print(psutil.virtual_memory())

svmem(total=13622181888, available=12801183744, percent=6.0, used=549330944, free=11013804032, active=996847616, inactive=1388367872, buffers=122408960, cached=1936637952, shared=1208320, slab=157732864)
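The `percent` field of `virtual_memory()` is calculated as (total - available) / total * 100. A quick check of that formula with the numbers printed above, as plain arithmetic that does not call psutil:

```python
# Values copied from the svmem output above (bytes)
total = 13622181888
available = 12801183744

# Percentage usage as (total - available) / total * 100
percent = round((total - available) / total * 100, 1)
print(percent)
```

The result agrees with the `percent=6.0` reported in the tuple.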


`psutil.swap_memory()` returns swap memory statistics as a named tuple:

* total – total swap memory in bytes
* used – used swap memory in bytes
* free – free swap memory in bytes
* percent – the percentage of swap in use, calculated as used / total * 100
* sin – the number of bytes the system has swapped in from disk (cumulative)
* sout – the number of bytes the system has swapped out to disk (cumulative)

In [None]:
print(psutil.swap_memory())

sswap(total=0, used=0, free=0, percent=0, sin=0, sout=0)


In [None]:
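def get_size(num_bytes, suffix="B"):
    """
    Scale a byte count to a readable unit, e.g. KB, MB, GB, TB or PB
    (using a factor of 1024).
    """
    factor = 1024
    for unit in ["", "K", "M", "G", "T"]:
        if num_bytes < factor:
            return f"{num_bytes:.2f}{unit}{suffix}"
        num_bytes /= factor
    # Anything this large is reported in petabytes instead of returning None
    return f"{num_bytes:.2f}P{suffix}"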

In [None]:
print("Virtual memory")
svmem = psutil.virtual_memory()
print(f"Total: {get_size(svmem.total)}")
print(f"Available: {get_size(svmem.available)}")
print(f"Used: {get_size(svmem.used)}")
print(f"Percentage: {svmem.percent}%")


Virtual memory
Total: 12.69GB
Available: 11.92GB
Used: 524.44MB
Percentage: 6.0%


In [None]:
# get the swap memory details (if exists)
swap = psutil.swap_memory()
print("SWAP memory")
print(f"Total: {get_size(swap.total)}")
print(f"Free: {get_size(swap.free)}")
print(f"Used: {get_size(swap.used)}")
print(f"Percentage: {swap.percent}%")

SWAP memory
Total: 0.00B
Free: 0.00B
Used: 0.00B
Percentage: 0%


### Monitoring and Limiting Hard Disk Space

`psutil.disk_partitions()` returns all mounted disk partitions as a list of named tuples that include the device, mount point, filesystem type and mount options.

In [None]:
print(psutil.disk_partitions())

[sdiskpart(device='/dev/root', mountpoint='/sbin/docker-init', fstype='ext2', opts='ro,relatime'), sdiskpart(device='/dev/sda1', mountpoint='/etc/resolv.conf', fstype='ext4', opts='rw,nosuid,nodev,relatime,commit=30'), sdiskpart(device='/dev/sda1', mountpoint='/etc/hostname', fstype='ext4', opts='rw,nosuid,nodev,relatime,commit=30'), sdiskpart(device='/dev/sda1', mountpoint='/etc/hosts', fstype='ext4', opts='rw,nosuid,nodev,relatime,commit=30')]


`psutil.disk_usage(path)` returns disk usage statistics for the partition that contains the given path, as a named tuple. Total, used and free space are expressed in bytes, along with the percentage usage.

In [None]:
print(psutil.disk_usage('/'))

sdiskusage(total=115658190848, used=45016252416, free=70625161216, percent=38.9)
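For comparison, the Python standard library offers a similar call, `shutil.disk_usage`, which returns total, used and free space in bytes but no percentage. The sketch below derives the percentage by hand. The figure may differ slightly from the one psutil reports, since the two are not guaranteed to treat reserved space the same way; the variable names are illustrative.

```python
import os
import shutil

# Root of the current drive: "/" on Linux and macOS
root = os.path.abspath(os.sep)

usage = shutil.disk_usage(root)  # named tuple: total, used, free (bytes)
percent_used = round(usage.used / usage.total * 100, 1)
print(f"Total: {usage.total} bytes")
print(f"Used:  {usage.used} bytes ({percent_used}%)")
print(f"Free:  {usage.free} bytes")
```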


In [None]:
print("Hard Disk Information")
print("Partitions and Usage:")
# get all disk partitions on the device
partitions = psutil.disk_partitions()
for partition in partitions:
    print("Device:", partition.device)
    print("Partition Mountpoint: ", partition.mountpoint)
    print("Partition File system type", partition.fstype)
    try:
        partition_usage = psutil.disk_usage(partition.mountpoint)
    except PermissionError:
        # the mount point is not readable by this user, so skip its usage figures
        continue
    print("Total Size: ", get_size(partition_usage.total))
    print("Used Space: ", get_size(partition_usage.used))
    print("Free hard disk Space", get_size(partition_usage.free))
    print("Hard disk Used Percentage: ", partition_usage.percent, "%")
    # warn once usage crosses the 82% threshold
    if partition_usage.percent > 82:
        print("Disk space nearing full")

Hard Disk Information
Partitions and Usage:
Device: /dev/root
Partition Mountpoint:  /sbin/docker-init
Partition File system type ext2
Total Size:  1.91GB
Used Space:  1.11GB
Free hard disk Space 816.61MB
Hard disk Used Percentage:  58.3 %
Device: /dev/sda1
Partition Mountpoint:  /etc/resolv.conf
Partition File system type ext4
Total Size:  80.69GB
Used Space:  45.66GB
Free hard disk Space 35.01GB
Hard disk Used Percentage:  56.6 %
Device: /dev/sda1
Partition Mountpoint:  /etc/hostname
Partition File system type ext4
Total Size:  80.69GB
Used Space:  45.66GB
Free hard disk Space 35.01GB
Hard disk Used Percentage:  56.6 %
Device: /dev/sda1
Partition Mountpoint:  /etc/hosts
Partition File system type ext4
Total Size:  80.69GB
Used Space:  45.66GB
Free hard disk Space 35.01GB
Hard disk Used Percentage:  56.6 %


### Monitoring and Limiting Network Usage

All network protocols are associated with a specific address family. An address family provides services like packet fragmentation and reassembly, routing, addressing, and transporting. The address family provides interprocess communication between processes that run on the same system or different systems.

An address family is normally comprised of several protocols, one per socket type.

Different network address families and their purpose:
* AF_INET: IPv4 Internet protocols
* AF_INET6: IPv6 Internet protocols
* AF_NETLINK: Kernel user interface device
* AF_PACKET: Low-level packet interface

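`psutil.net_if_addrs()` returns the addresses associated with each network interface card (NIC) as a dictionary that maps interface names to lists of named tuples with the following fields:

* family – the address family, either AF_INET, AF_INET6 or `psutil.AF_LINK`, which refers to a MAC address (shown as AF_PACKET on Linux)
* address – the primary NIC address
* netmask – the netmask address
* broadcast – the broadcast address.
* ptp – stands for “point to point”; it is the destination address on a point-to-point interface.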

In [None]:
print(psutil.net_if_addrs())

{'lo': [snicaddr(family=<AddressFamily.AF_INET: 2>, address='127.0.0.1', netmask='255.0.0.0', broadcast=None, ptp=None), snicaddr(family=<AddressFamily.AF_PACKET: 17>, address='00:00:00:00:00:00', netmask=None, broadcast=None, ptp=None)], 'eth0': [snicaddr(family=<AddressFamily.AF_INET: 2>, address='172.28.0.2', netmask='255.255.0.0', broadcast='172.28.255.255', ptp=None), snicaddr(family=<AddressFamily.AF_PACKET: 17>, address='02:42:ac:1c:00:02', netmask=None, broadcast='ff:ff:ff:ff:ff:ff', ptp=None)]}
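The netmask determines which part of an address identifies the network, and the broadcast address follows from the address and the netmask. As a small illustration with the standard `ipaddress` module, using the `eth0` values printed above (the variable name `iface` is illustrative):

```python
import ipaddress

# Address and netmask of eth0, copied from the output above
iface = ipaddress.ip_interface("172.28.0.2/255.255.0.0")

print("Network:  ", iface.network)
print("Broadcast:", iface.network.broadcast_address)
```

The derived broadcast address matches the `broadcast` field that psutil reports for `eth0`.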


In [None]:
# socket provides the address-family constants used in the comparisons below
import socket

print("Network Information")
# get all network interfaces (virtual and physical)
if_addrs = psutil.net_if_addrs()
for interface_name, interface_addresses in if_addrs.items():
    for address in interface_addresses:
        print(" Interface: ", interface_name)
        # compare against the constants rather than their string form,
        # which is not the same across Python versions
        if address.family == socket.AF_INET:
            print("  IP Address: ", address.address)
            print("  Netmask: ", address.netmask)
            print("  Broadcast IPv4: ", address.broadcast)
        elif address.family == psutil.AF_LINK:
            # psutil.AF_LINK identifies MAC addresses (AF_PACKET on Linux)
            print(f"  MAC Address: {address.address}")
            print(f"  Netmask: {address.netmask}")
            print(f"  Broadcast MAC: {address.broadcast}")
        elif address.family == socket.AF_INET6:
            print("  IP Address: ", address.address)
            print("  Netmask: ", address.netmask)
            print("  Broadcast IPv6: ", address.broadcast)

Network Information
 Interface:  lo
  IP Address:  127.0.0.1
  Netmask:  255.0.0.0
  Broadcast IPv4:  None
 Interface:  lo
  MAC Address: 00:00:00:00:00:00
  Netmask: None
  Broadcast MAC: None
 Interface:  eth0
  IP Address:  172.28.0.2
  Netmask:  255.255.0.0
  Broadcast IPv4:  172.28.255.255
 Interface:  eth0
  MAC Address: 02:42:ac:1c:00:02
  Netmask: None
  Broadcast MAC: ff:ff:ff:ff:ff:ff


`psutil.net_io_counters()` returns system-wide network I/O statistics as a named tuple, such as bytes sent, bytes received, and the number of incoming or outgoing packets that were dropped:

* bytes_sent – number of bytes sent
* bytes_recv – number of bytes received
* packets_sent – number of packets sent
* packets_recv – number of packets received
* errin – total number of errors while receiving
* errout – total number of errors while sending
* dropin – total number of incoming packets which were dropped
* dropout – total number of outgoing packets which were dropped

In [None]:
print(psutil.net_io_counters())

snetio(bytes_sent=354705, bytes_recv=379976, packets_sent=1044, packets_recv=1093, errin=0, errout=0, dropin=0, dropout=0)


In [None]:
net_io = psutil.net_io_counters()
print("Total Bytes Sent: ", get_size(net_io.bytes_sent))
print("Total Bytes Received: ", get_size(net_io.bytes_recv))
print("Total outgoing packets dropped: ", net_io.dropout)
print("Total incoming packets dropped:", net_io.dropin)
print("Total outgoing errors: ", net_io.errout)
print("Total incoming errors:", net_io.errin)

Total Bytes Sent:  355.30KB
Total Bytes Received:  379.98KB
Total outgoing packets dropped:  0
Total incoming packets dropped: 0
Total outgoing errors:  0
Total incoming errors: 0
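The counters above are cumulative totals, so a transfer rate has to be derived from two readings taken some time apart. The sketch below shows that arithmetic on made-up readings. `NetSample` and `throughput` are hypothetical names; in practice the two samples would come from two `psutil.net_io_counters()` calls separated by a pause.

```python
from collections import namedtuple

# Hypothetical, reduced stand-in for the tuple returned by psutil.net_io_counters()
NetSample = namedtuple("NetSample", ["bytes_sent", "bytes_recv"])


def throughput(before, after, seconds):
    """Return (upload, download) rates in bytes per second between two samples."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    sent_rate = (after.bytes_sent - before.bytes_sent) / seconds
    recv_rate = (after.bytes_recv - before.bytes_recv) / seconds
    return sent_rate, recv_rate


# Made-up counter readings taken 2 seconds apart
first = NetSample(bytes_sent=354705, bytes_recv=379976)
second = NetSample(bytes_sent=356753, bytes_recv=384072)
up, down = throughput(first, second, 2)
print(f"Upload: {up:.0f} B/s, Download: {down:.0f} B/s")
```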


`psutil.net_connections()` returns the system-wide socket connections as a list of named tuples with the following fields:

* fd – the socket file descriptor.
* family – the socket family, either AF_INET, AF_INET6 or AF_UNIX.
* type – the socket type, either SOCK_STREAM, SOCK_DGRAM or SOCK_SEQPACKET.
* laddr – the local address as a (ip, port) named tuple
* raddr – the remote address as a (ip, port) named tuple
* status – represents the status of a TCP connection.
* pid – the PID of the process which opened the socket, if retrievable, else None.

In [None]:
print(psutil.net_connections())

[sconn(fd=46, family=<AddressFamily.AF_INET: 2>, type=<SocketKind.SOCK_STREAM: 1>, laddr=addr(ip='127.0.0.1', port=53421), raddr=(), status='LISTEN', pid=66), sconn(fd=28, family=<AddressFamily.AF_INET6: 10>, type=<SocketKind.SOCK_STREAM: 1>, laddr=addr(ip='::ffff:172.28.0.2', port=8080), raddr=addr(ip='::ffff:172.28.0.1', port=47280), status='ESTABLISHED', pid=7), sconn(fd=50, family=<AddressFamily.AF_INET: 2>, type=<SocketKind.SOCK_STREAM: 1>, laddr=addr(ip='127.0.0.1', port=41981), raddr=addr(ip='127.0.0.1', port=60336), status='ESTABLISHED', pid=66), sconn(fd=5, family=<AddressFamily.AF_INET: 2>, type=<SocketKind.SOCK_STREAM: 1>, laddr=addr(ip='127.0.0.1', port=18824), raddr=addr(ip='127.0.0.1', port=60738), status='ESTABLISHED', pid=86), sconn(fd=57, family=<AddressFamily.AF_INET: 2>, type=<SocketKind.SOCK_STREAM: 1>, laddr=addr(ip='127.0.0.1', port=42318), raddr=addr(ip='127.0.0.1', port=43379), status='ESTABLISHED', pid=66), sconn(fd=21, family=<AddressFamily.AF_INET: 2>, type=<

### Exploring Multiprocessing

Let us implement multiprocessing and try to use the full capacity of the machine with the `multiprocessing` package.

1. We will make use of the `multiprocessing` module to work around Python's Global Interpreter Lock (GIL).

2. We will monitor the resources of the machine using `psutil`.


One of the key objectives for a developer is to make code run faster. However, many tasks take a long time to process, even on fast computers with several cores. Part of the reason is Python's Global Interpreter Lock (GIL), which allows only one thread at a time to execute Python code in the interpreter, so a single Python process running a CPU-bound function never uses the whole power of the machine.

The `multiprocessing` package offers both local and remote concurrency, effectively side-stepping the Global Interpreter Lock by using subprocesses instead of threads. Due to this, the multiprocessing module allows the programmer to fully leverage multiple processors on a given machine.

Refer to the official documentation for `multiprocessing` [here](https://docs.python.org/3/library/multiprocessing.html).

In [None]:
# Importing the libraries
from datetime import datetime
import multiprocessing as mp

#### Without Multiprocessing 

Here we run a loop over ranges as large as 300000000, so that the computation takes long enough for the difference between using and not using multiprocessing to become evident.

In [None]:
start = datetime.now()
def loop(r):
    for n in range(r):
        result = (n*(n+1))/2
    return result
ranges = [100000000, 200000000, 300000000]
results = []
if __name__ == "__main__":
    for r in ranges:
        results.append(loop(r))
    print(f"Result: {sum(results)}")
    print(f"Time spent: {datetime.now() - start}")

Result: 6.99999997e+16
Time spent: 0:01:49.398577


Here we can see that the time taken for the above code to execute is almost 2 minutes. Now, we want to see the advantage of using the Multiprocessing package by making use of more than one core to run the code.

#### With Multiprocessing and using Psutil to evaluate the process.

In [None]:
start = datetime.now()
def loop(core, r):
    # pin the current worker process to one CPU core
    # (cpu_affinity is not available on every platform, e.g. macOS)
    proc = psutil.Process()
    proc.cpu_affinity([core])
    for n in range(r):
        result = (n*(n+1))/2
    return result
cores = [0, 1]
ranges = [100000000, 200000000, 300000000]
results = []
if __name__ == "__main__":
    with mp.Pool() as pool:
        # one task per core: core 0 gets ranges[0] and core 1 gets ranges[1];
        # ranges[2] is not dispatched
        for core in cores:
            p = pool.apply_async(func=loop, args=(core, ranges[core],))
            results.append(p)
        pool.close()
        pool.join()
    # collect the value returned by each worker
    result = 0
    for p in results:
        result = result + p.get()
    print(f"Result: {result}")
    print(f"Time spent: {datetime.now() - start}")

Result: 2.499999985e+16
Time spent: 0:00:51.558776


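Here the run that uses the `multiprocessing` package finishes in about 51 seconds. Note that this run dispatches one range per core, so only the first two entries of `ranges` (100000000 and 200000000) are processed, which is why the result differs from that of the sequential run and why the two timings are not a like-for-like comparison. The two loops execute at the same time on separate cores, so the elapsed time is roughly that of the longer loop rather than the sum of both.

This package helps us overcome the challenge posed by Python's Global Interpreter Lock, so that multiple cores can work on sub-parts of the task.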
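As a hedged variation on the cells above, the sketch below hands every entry of a list of ranges to the pool with `Pool.map`, so that the parallel result can be checked against the sequential one. It uses deliberately small ranges (the hypothetical `small_ranges`) so that it runs quickly, and it leaves out the CPU pinning; it is an illustration, not a benchmark.

```python
import multiprocessing as mp


def loop(r):
    """Return the last value of n*(n+1)/2 for n in range(r)."""
    result = 0.0
    for n in range(r):
        result = (n * (n + 1)) / 2
    return result


# Deliberately small ranges so the sketch runs in well under a second
small_ranges = [1000, 2000, 3000]

if __name__ == "__main__":
    sequential = sum(loop(r) for r in small_ranges)
    # Pool.map distributes the ranges across the worker processes
    # and returns the results in the original order
    with mp.Pool(processes=2) as pool:
        parallel = sum(pool.map(loop, small_ranges))
    print(f"Sequential: {sequential}, Parallel: {parallel}")
```

With ranges this small, the cost of starting the worker processes outweighs any gain; the benefit only shows once each task runs for long enough.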

**Question:** Which method does the `multiprocessing` package use to implement parallelism in Python?

A. By using subprocesses instead of threads

B. By taking control over the specific core

C. By making use of multiple threads

Answer: A

### References:

1. Renu Khandelwal, 2021. Monitoring your devices in Python. _Medium_
2. Rupani Sweety, Psutil module in Python. _GeeksforGeeks_
3. Julio Souto, 2021. Parallelism with Python Multiprocessing. _Medium_