# Lab 8: Resource Management

This notebook proposes some exploratory exercises on resource management in Python. The goal is to learn good programming practices rather than to solve concrete programming problems.


## List of exercises

1. Let's explore the information about our processes in Python. To do so, the module `psutil` will be used to gather the process information. 

* Learn more about this module in the following link: https://psutil.readthedocs.io/en/latest/

In [0]:
from os import getpid

from psutil import Process

#Getting information from the current process
p = Process(getpid())
print(p)
print("Name: " + p.name())
print("ID: {}".format(p.pid))
#ppid, memory_info and username are methods, so they must be called
print("ID parent: {}".format(p.ppid()))
print("Memory info: {}".format(p.memory_info()))
print("User: {}".format(p.username()))
#Dump the internal attributes of the Process object
d = p.__dict__
for k, v in d.items():
  print(k," :", v)

In [0]:
import psutil
#Getting some information from the CPU
psutil.cpu_stats()


scpustats(ctx_switches=330932, interrupts=228824, soft_interrupts=161918, syscalls=0)

In [0]:
import psutil
#Getting some information about the virtual memory
#Learn more: https://psutil.readthedocs.io/en/latest/#psutil.virtual_memory
mem = psutil.virtual_memory()
mem

svmem(total=13653561344, available=12787277824, percent=6.3, used=578035712, free=10951147520, active=730005504, inactive=1709690880, buffers=76378112, cached=2048000000, shared=925696, slab=165543936)

In [0]:
import psutil
#Getting some information from the disk
#Learn more: https://psutil.readthedocs.io/en/latest/#psutil.disk_partitions
psutil.disk_partitions()

[sdiskpart(device='/dev/sda1', mountpoint='/etc/resolv.conf', fstype='ext4', opts='rw,nosuid,nodev,relatime,commit=30'),
 sdiskpart(device='/dev/sda1', mountpoint='/etc/hostname', fstype='ext4', opts='rw,nosuid,nodev,relatime,commit=30'),
 sdiskpart(device='/dev/sda1', mountpoint='/etc/hosts', fstype='ext4', opts='rw,nosuid,nodev,relatime,commit=30')]

In [0]:
import psutil
#Getting some information from the network
#Learn more: https://psutil.readthedocs.io/en/latest/#psutil.net_io_counters
psutil.net_io_counters()

snetio(bytes_sent=600719, bytes_recv=604583, packets_sent=1786, packets_recv=1966, errin=0, errout=0, dropin=0, dropout=0)

In [0]:
psutil.net_connections()

In [0]:
import psutil
#Getting some information from the hardware temperature sensors
#An empty dict means that no sensors were detected
#Learn more: https://psutil.readthedocs.io/en/latest/#psutil.sensors_temperatures
psutil.sensors_temperatures()

{}

2. Gathering information about the system and its processes.

In [0]:
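import datetime
import psutil
#System boot time (seconds since the epoch), shown as a readable date
datetime.datetime.fromtimestamp(psutil.boot_time())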

In [0]:
import psutil
#Processes ids
psutil.pids()

In [0]:
import psutil
#Process table
for proc in psutil.process_iter(['pid', 'name', 'username']):
  print(proc.info)

3. In this exploratory exercise, we look at the bytecode that CPython generates for a Python program, using the `dis` module.

Learn more: https://docs.python.org/3/library/dis.html

To proceed, let's follow the next steps:

* Create a simple program that adds two numbers.
* Save the file as `dis_example.py`.
* Run the following command: `python -m dis dis_example.py`

In [0]:
#Program version 1
if __name__=="__main__":  
    a = 2
    b = 3
    c = a + b
    print(c)

This is the output: the bytecode of our program (18 instructions, at byte offsets 0 to 34):

* Three variables are created: `a`, `b` and `c`.
* `a` and `b` are initialized with constant values.
* `c` is assigned the value of `a+b`.

```
  8           0 LOAD_NAME                0 (__name__)
              2 LOAD_CONST               0 ('__main__')
              4 COMPARE_OP               2 (==)
              6 POP_JUMP_IF_FALSE       32

 10           8 LOAD_CONST               1 (2)
             10 STORE_NAME               1 (a)

 11          12 LOAD_CONST               2 (3)
             14 STORE_NAME               2 (b)

 12          16 LOAD_NAME                1 (a)
             18 LOAD_NAME                2 (b)
             20 BINARY_ADD
             22 STORE_NAME               3 (c)

 13          24 LOAD_NAME                4 (print)
             26 LOAD_NAME                3 (c)
             28 CALL_FUNCTION            1
             30 POP_TOP
        >>   32 LOAD_CONST               3 (None)
             34 RETURN_VALUE

```



In [0]:
#Program version 2
if __name__=="__main__":
    a = 2
    b = 3
    print(a+b)

Here, the output is quite similar, but we have saved 2 instructions (4 bytes of bytecode): the `STORE_NAME` and `LOAD_NAME` for `c` are no longer needed.

```
  8           0 LOAD_NAME                0 (__name__)
              2 LOAD_CONST               0 ('__main__')
              4 COMPARE_OP               2 (==)
              6 POP_JUMP_IF_FALSE       28

 10           8 LOAD_CONST               1 (2)
             10 STORE_NAME               1 (a)

 11          12 LOAD_CONST               2 (3)
             14 STORE_NAME               2 (b)

 12          16 LOAD_NAME                3 (print)
             18 LOAD_NAME                1 (a)
             20 LOAD_NAME                2 (b)
             22 BINARY_ADD
             24 CALL_FUNCTION            1
             26 POP_TOP
        >>   28 LOAD_CONST               3 (None)
             30 RETURN_VALUE
```
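
The two versions can also be compared programmatically. The sketch below is a minimal example, assuming the two programs are kept in the strings `VERSION_1` and `VERSION_2` (hypothetical names, not files from this lab). It compiles each one without running it and lists its instructions with `dis.get_instructions`. The exact opcodes and totals depend on the Python version, so only the difference between the two versions is meaningful.

```python
import dis

# Hypothetical stand-ins for the two programs above, kept as strings
VERSION_1 = """
a = 2
b = 3
c = a + b
print(c)
"""

VERSION_2 = """
a = 2
b = 3
print(a + b)
"""

def instructions(source):
    """Compile source (without executing it) and return its bytecode instructions."""
    code = compile(source, "<example>", "exec")
    return list(dis.get_instructions(code))

v1 = instructions(VERSION_1)
v2 = instructions(VERSION_2)

print("version 1:", len(v1), "instructions")
print("version 2:", len(v2), "instructions")

# Names stored by each version: only version 1 stores the temporary c
stored_v1 = [ins.argval for ins in v1 if ins.opname == "STORE_NAME"]
stored_v2 = [ins.argval for ins in v2 if ins.opname == "STORE_NAME"]
print(stored_v1, stored_v2)
```

Working on code objects rather than on saved files makes it easy to try small variations of a program directly in the notebook.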



* Some code optimizations can be performed with a family of techniques called "partial evaluation". These are advanced techniques that can save execution time by analyzing which parts of the code can be computed ahead of time. For instance, the `peval` module focuses on the following optimizations:

  * constant propagation
  * constant folding
  * unreachable code elimination
  * function inlining

* Take a look at these introductory slides:

  * http://www.cs.utexas.edu/~wcook/presentations/2011-PartialEval-simple.pdf
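
Constant folding, one of the optimizations listed above, can be observed in CPython's own compiler, independently of `peval`. The following sketch is only an illustration under that assumption (the variable name `seconds_per_day` is made up for the example): it compiles an expression built only from constants and checks whether any arithmetic instruction is left in the bytecode. Which optimizations are applied is an implementation detail of the interpreter.

```python
import dis

# An expression built only from constants; the name is illustrative
source = "seconds_per_day = 60 * 60 * 24"
code = compile(source, "<example>", "exec")
instrs = list(dis.get_instructions(code))

# Opcodes that would perform arithmetic at run time (BINARY_MULTIPLY, BINARY_OP, ...)
arithmetic = [ins.opname for ins in instrs if ins.opname.startswith("BINARY_")]

# Constants loaded by the bytecode
loaded = [ins.argval for ins in instrs if ins.opname == "LOAD_CONST"]

print("arithmetic instructions:", arithmetic)
print("constants loaded:", loaded)
```

On an interpreter that folds constants, the first list is expected to be empty and the product should appear as a single precomputed constant.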

4. Let's now look at the time spent in the different parts of our code. To do so, we will use the `cProfile` module that ships with CPython.

Learn more: https://docs.python.org/3/library/profile.html

To proceed, let's follow the next steps:

* Create a simple program that calls a function which waits five seconds.
* Save the file as `profiling_example.py`.
* Run the following command: `python -m cProfile profiling_example.py`

In [0]:
import time

def wait():
    time.sleep(5)

if __name__=="__main__":
    wait()

Here, we can see the calls and the time spent in each one. Our function takes 5 seconds, all of it inside `time.sleep` (see the `tottime` column).

```
         5 function calls in 5.006 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    5.006    5.006 profiling_example.py:7(<module>)
        1    0.000    0.000    5.006    5.006 profiling_example.py:9(wait)
        1    0.000    0.000    5.006    5.006 {built-in method builtins.exec}
        1    5.006    5.006    5.006    5.006 {built-in method time.sleep}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
```
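
The profiler can also be driven from inside a program or a notebook cell instead of the command line. The sketch below is a minimal example, assuming a shortened `wait` function (0.05 seconds instead of 5, so that it finishes quickly). It profiles only the selected call and uses `pstats` to sort the report by cumulative time.

```python
import cProfile
import io
import pstats
import time

def wait(seconds=0.05):
    """Stand-in for the wait() function above, with a shorter pause."""
    time.sleep(seconds)

profiler = cProfile.Profile()
profiler.enable()
wait()
profiler.disable()

# Collect the report in a string, sorted by cumulative time
buffer = io.StringIO()
stats = pstats.Stats(profiler, stream=buffer)
stats.sort_stats("cumulative").print_stats(5)  # show at most 5 entries
report = buffer.getvalue()
print(report)
```

Enabling the profiler only around the code of interest keeps interpreter start-up and unrelated calls out of the report.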



* There is an interesting library for inspecting memory usage:
  * [Guppy](https://pypi.org/project/guppy3/).

5. Let's now explore the number of references to an object. This is the information CPython uses to decide when an object can be reclaimed. Recall the operations that increase the count:
  * Assigning the object to another variable.
  * Passing the object as an argument.
  * Including the object in a list.

Notice that calling `sys.getrefcount` itself temporarily adds one reference, because the object is passed as an argument.

In [0]:
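import sys

def f(param):
  #Inside the function, the parameter holds an extra reference
  print(sys.getrefcount(param))

a = []
b = a
#f(a)
#Counts the references held by a, by b, and the temporary argument reference
print(sys.getrefcount(a))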

3
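
The three ways of adding a reference listed above can be checked one by one. The sketch below is a minimal example with made-up names (`data`, `alias`, `container`, `count_inside`). It compares counts relative to a baseline, since the absolute numbers and the exact effect of a function call depend on the interpreter version.

```python
import sys

def count_inside(obj):
    # Called with the object as an argument, like f(param) above
    return sys.getrefcount(obj)

data = []
base = sys.getrefcount(data)      # includes getrefcount's own temporary reference

alias = data                      # assignment to another variable
after_alias = sys.getrefcount(data)

container = [data]                # inclusion in a list
after_list = sys.getrefcount(data)

inside = count_inside(data)       # passed as an argument (exact value is version-dependent)

del alias, container              # dropping references lowers the count again
after_del = sys.getrefcount(data)

print(after_alias - base, after_list - base, after_del - base)
print("inside a call:", inside, "outside:", after_list)
```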


6. Let's explore the garbage collector in more detail, using a list that references itself.

In [0]:
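import gc

#Keep every unreachable object in gc.garbage instead of freeing it
gc.set_debug(gc.DEBUG_SAVEALL)

print(gc.get_count())
lst = []
lst.append(lst)   #The list references itself: a reference cycle
list_id = id(lst)
del lst           #The cycle keeps the reference count above zero
gc.collect()
for item in gc.garbage:
    print(item)
    print(list_id == id(item))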
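
Without `DEBUG_SAVEALL`, the collector frees the objects of a cycle instead of keeping them in `gc.garbage`. The sketch below is a minimal example, assuming a small helper class `Node` (hypothetical, introduced only for this illustration) and a weak reference used as a probe: the objects survive `del` because of the cycle and disappear only after `gc.collect()`.

```python
import gc
import weakref

class Node:
    """Minimal object that can point to another Node (hypothetical helper)."""
    def __init__(self):
        self.other = None

gc.set_debug(0)   # clear debug flags such as DEBUG_SAVEALL so objects are really freed
gc.disable()      # pause automatic collection to make the demo deterministic
try:
    first, second = Node(), Node()
    first.other, second.other = second, first   # reference cycle
    probe = weakref.ref(first)                  # observes first without keeping it alive

    del first, second
    alive_before = probe() is not None          # the cycle keeps both objects alive

    found = gc.collect()                        # number of unreachable objects found
    alive_after = probe() is not None
finally:
    gc.enable()

print(alive_before, alive_after, found)
```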

7. A key way to improve the performance of our programs is to rely on Python's built-in constructs and standard library. As an example, let's compare the execution time of two ways of building a list.

In [0]:
import time

def create_list(n):
  alist = []
  i = 0
  while i<n:
    alist.append(i)
    i = i + 1
  return alist

def create_list_2(n):
  return [i for i in range(n)]


n = 10000000
print("Manual list creation")
t = time.time()
create_list(n)
print("\n\tTime Taken: %.5f sec\n" % (time.time()-t))

print("Comprehension list creation")
t = time.time()
create_list_2(n)
print("\n\tTime Taken: %.5f sec\n" % (time.time()-t))
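
A single `time.time()` measurement can be noisy. As a complement, the sketch below repeats each measurement with `timeit` and keeps the best run. It is a minimal example with made-up function names, and it adds a third variant, `list(range(n))`, which is not part of the cell above. The numbers will differ from machine to machine.

```python
import timeit

def with_while(n):
    result = []
    i = 0
    while i < n:
        result.append(i)
        i += 1
    return result

def with_comprehension(n):
    return [i for i in range(n)]

def with_list(n):
    # Let the list constructor consume the range directly
    return list(range(n))

n = 100_000  # smaller than in the cell above so the repeats finish quickly
builders = (with_while, with_comprehension, with_list)
timings = {}
for build in builders:
    # Best of 3 repeats, 5 calls each
    timings[build.__name__] = min(timeit.repeat(lambda: build(n), number=5, repeat=3))
    print("%-20s %.5f sec" % (build.__name__, timings[build.__name__]))
```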

8. Let's iterate over a list directly and over a `range` of the same length. In the run below, both finish within a few hundredths of a second. However, a `range` manages memory better: it does not store its values but computes each one on demand.

In [0]:
import time

def iterate_values(alist):
  for v in alist:
    pass

def iterate_range(alist, n):
  for i in range(n):
    pass
    
def create_list_2(n):
  return [i for i in range(n)]


n = 1000000
alist = create_list_2(n)
print("Iterate using values")
t = time.time()
iterate_values(alist)
print("\n\tTime Taken: %.5f sec" % (time.time()-t))

print("Iterate using range")
t = time.time()
iterate_range(alist, n)
print("\n\tTime Taken: %.5f sec" % (time.time()-t))

Iterate using values

	Time Taken: 0.01137 sec
Iterate using range

	Time Taken: 0.02347 sec
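
The memory behaviour of `range` can be checked with `sys.getsizeof`. The sketch below is a minimal illustration (the variable names are made up). Note that `sys.getsizeof` reports only the size of the container itself, not of the integers it refers to, and that the exact byte counts depend on the interpreter.

```python
import sys

small = range(10)
large = range(10_000_000)
as_list = list(range(100_000))

# A range stores only start, stop and step, so its size does not grow with its length
print("range(10):        ", sys.getsizeof(small), "bytes")
print("range(10_000_000):", sys.getsizeof(large), "bytes")

# A list stores one reference per element (getsizeof excludes the int objects themselves)
print("list of 100_000:  ", sys.getsizeof(as_list), "bytes")

# Values are still available on demand, without being stored
print(large[9_999_999], len(large))
```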


9. This example shows how a list of uppercase strings can be created in different ways, each with a different impact on the execution time. Again, note the use of built-in functions.

In [0]:
import time

n = 1000000
alist=[str(i) for i in range(n)]
upper_list = []
print("Manual conversion to uppercase")
t = time.time()
for word in alist:
      upper_list.append(word.upper())
print("\n\tTime Taken: %.5f sec" % (time.time()-t))

print("Using builtins to uppercase")
t = time.time()
#map is lazy in Python 3: list() forces the conversion so that it is actually timed
upper_list_2=list(map(str.upper,alist))
print("\n\tTime Taken: %.5f sec" % (time.time()-t))
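
When timing `map`, keep in mind that in Python 3 it returns a lazy iterator: the function is applied only when the result is consumed. The sketch below is a small illustration with a made-up word list, showing that the iterator does its work on the first pass and is empty afterwards.

```python
words = ["alpha", "beta", "gamma"]

lazy = map(str.upper, words)      # returns immediately; nothing is converted yet
print(type(lazy).__name__)

first_pass = list(lazy)           # consuming the iterator does the actual work
second_pass = list(lazy)          # the iterator is now exhausted

print(first_pass)
print(second_pass)

# A list comprehension is an eager alternative with the same result
eager = [word.upper() for word in words]
print(eager == first_pass)
```

A fair timing comparison therefore has to consume the result of `map`, for example with `list()`.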

10. This example shows that repetitive, time-consuming operations should be moved out of a loop whenever possible.

In [7]:
import time

s = "Hello"
n = 1000000

print("N times evaluation")
t = time.time()
for i in range(n):
  a = (len(s)+i)
print("\n\tTime Taken: %.5f sec" % (time.time()-t))

print("N times NO evaluation")
t = time.time()
l = len(s)
for i in range(n):
  a = (l+i)
print("\n\tTime Taken: %.5f sec" % (time.time()-t))

N times evaluation

	Time Taken: 0.16597 sec
N times NO evaluation

	Time Taken: 0.10354 sec


11. In this example, we will explore the different ways of reversing a Python string.

In [0]:
import time

def reverse_1(s):
  reverse = ""
  for i in range(len(s)-1,-1,-1):
    reverse = reverse + s[i]
  return reverse

def reverse_2(s):
  reverse = ""
  for i in range(len(s)):
    reverse = s[i] + reverse 
  return reverse

def reverse_slicing(s):
  return s[::-1]

def reverse_list(s):
  sl = list(s)
  sl.reverse()
  return ''.join(sl)

def reverse_builtin(s):
  return ''.join(reversed(s))

s = "Hello"
print("Backward")
t = time.time()
reverse_1(s)
print("\n\tTime Taken: %.8f sec" % (time.time()-t))

print("Forward")
t = time.time()
reverse_2(s)
print("\n\tTime Taken: %.8f sec" % (time.time()-t))

print("Slicing")
t = time.time()
reverse_slicing(s)
print("\n\tTime Taken: %.8f sec" % (time.time()-t))

print("Reverse list")
t = time.time()
reverse_list(s)
print("\n\tTime Taken: %.8f sec" % (time.time()-t))

print("Reverse builtin")
t = time.time()
reverse_builtin(s)
print("\n\tTime Taken: %.8f sec" % (time.time()-t))


In [0]:
#To run from the command line
if __name__ == '__main__':
    test_to_code = '''
reverse_1('abcdefghijklmnopqrstuvwxyz')
    '''
    import timeit
    print(timeit.repeat(stmt=test_to_code, setup="from __main__ import reverse_1", repeat=5))
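
Before comparing speed, it is worth checking that the variants agree on the result, since an off-by-one error in the index range is easy to make. The sketch below is a minimal example using simplified re-implementations of three of the approaches (not the exact cells above). It checks correctness first and then times each variant with `timeit`.

```python
import timeit

def reverse_loop(s):
    # Walk the indices from the last one down to 0 (inclusive)
    result = ""
    for i in range(len(s) - 1, -1, -1):
        result += s[i]
    return result

def reverse_slicing(s):
    return s[::-1]

def reverse_builtin(s):
    return "".join(reversed(s))

sample = "abcdefghijklmnopqrstuvwxyz"
candidates = (reverse_loop, reverse_slicing, reverse_builtin)

# Correctness first: every variant must agree on the result
results = {func.__name__: func(sample) for func in candidates}
print(results)

# Then timing: best of 3 repeats, 10_000 calls each
for func in candidates:
    best = min(timeit.repeat(lambda: func(sample), number=10_000, repeat=3))
    print("%-16s %.5f sec" % (func.__name__, best))
```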

12. String concatenation. A common discussion in the Python community is how best to concatenate strings: since they are immutable, each operation creates a new string, which consumes memory.

Take a look at the official documentation: 
* https://docs.python.org/3/faq/programming.html#what-is-the-most-efficient-way-to-concatenate-many-strings-together

Other options are:
* The use of `io.StringIO`.
* The use of f-strings.

In [0]:
msg = "value 1\n"
msg +="value 2\n"

my_msg=["line1","line2"]
msg2 = "\n".join(my_msg)
# Because strings are immutable, every time you append to a string, Python creates a new string object at a new address.
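
The options mentioned above can be put side by side. The sketch below is a minimal example with a made-up list of lines. It builds the same text with `str.join`, with an `io.StringIO` buffer and with an f-string, and checks that the three results are identical.

```python
import io

lines = ["line1", "line2", "line3"]

# 1. str.join: builds the result in one pass
joined = "\n".join(lines)

# 2. io.StringIO: an in-memory text buffer that is written to incrementally
buffer = io.StringIO()
for index, line in enumerate(lines):
    if index:
        buffer.write("\n")
    buffer.write(line)
buffered = buffer.getvalue()

# 3. f-string: convenient when the number of pieces is small and fixed
first, second, third = lines
formatted = f"{first}\n{second}\n{third}"

print(joined == buffered == formatted)
```

`str.join` and `io.StringIO` suit an unknown number of pieces, while an f-string reads best when the pieces are few and known in advance.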

In [0]:
import time

def concat_1(s1, s2): 
    return s1 + s2 

def concat_2(s1, s2): 
    return "%s%s" % (s1, s2) 

def concat_3(s1, s2): 
    return "{0}{1}".format(s1, s2) 


s1 = "123"
s2 = "abc"

print("Short strings...")

print("Concat +")
t = time.time()
concat_1(s1, s2)
print("\n\tTime Taken: %.8f sec\n" % (time.time()-t))

print("Concat %")
t = time.time()
concat_2(s1, s2)
print("\n\tTime Taken: %.8f sec\n" % (time.time()-t))


print("Concat format")
t = time.time()
concat_3(s1, s2)
print("\n\tTime Taken: %.8f sec\n" % (time.time()-t))

s1 = "123" * 100000
s2 = "abc" * 100000

print("Large strings...")

print("Concat +")
t = time.time()
concat_1(s1, s2)
print("\n\tTime Taken: %.8f sec\n" % (time.time()-t))

print("Concat %")
t = time.time()
concat_2(s1, s2)
print("\n\tTime Taken: %.8f sec\n" % (time.time()-t))


print("Concat format")
t = time.time()
concat_3(s1, s2)
print("\n\tTime Taken: %.8f sec\n" % (time.time()-t))

In general, some tips for improving our Python programs are:

* Use built-in functions.
* Try to move large calculations out of loops.
* Prefer `while True` for infinite loops: in Python 3, `True` is a keyword, so the old Python 2 trick of writing `while 1` no longer brings a benefit.
* Use lazy iterables such as `range` and generators.
* Try to use constants when possible.
* Be careful with operations on immutable types like strings.
* Use an up-to-date Python version.
* Optimize only what is worth optimizing, and no more.


## References

* Fluent Python book.
* https://wiki.python.org/moin/PythonSpeed/PerformanceTips/