## Understanding Slowness

top : CPU / memory usage
iotop : disk I/O usage
iftop : network bandwidth usage

ab -n 500 site.example.com/
ssh webserver
clear
top
q

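nice : start a process with a given priority; niceness runs from -20 (highest priority) to 19 (lowest priority), default 0
renice : change the priority of a running process
pidof : print the PIDs of a running program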

for pid in $(pidof ffmpeg); do renice 19 $pid; done
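
When a Python script launches the heavy job itself, the same idea can be applied at start-up instead of with renice afterwards. A minimal sketch, assuming a Linux/Unix host; the child command is a stand-in that only reports its own niceness (it is not the ffmpeg job from these notes), and `run_with_lower_priority` is a hypothetical helper name:

```python
import os
import subprocess
import sys


def run_with_lower_priority(command, increment=5):
    """Run command in a child process whose niceness is raised by increment."""
    return subprocess.run(
        command,
        # os.nice() runs in the child just before exec, so only the child is affected.
        preexec_fn=lambda: os.nice(increment),
        capture_output=True,
        text=True,
        check=True,
    )


# Stand-in child: prints its own niceness (os.nice(0) returns the current value).
child = [sys.executable, "-c", "import os; print(os.nice(0))"]
result = run_with_lower_priority(child, increment=5)

child_niceness = int(result.stdout.strip())
parent_niceness = os.getpriority(os.PRIO_PROCESS, 0)
print(f"parent niceness: {parent_niceness}, child niceness: {child_niceness}")
```

Raising niceness (lowering priority) needs no special rights; lowering it below the current value normally requires root.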

ps ax
less

ps ax | less
/ffmpeg
locate static/001.webm
cd /srv/deploy_videos/
ls -l
grep ffmpeg *
vim deploy.sh
delete daemonize

clear
killall -STOP ffmpeg
for pid in $(pidof ffmpeg); do while kill -CONT $pid; do sleep 1; done; done

## Slow Code

gprof : C program
cProfile : Python program
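
As a rough illustration of the Python side, cProfile can be driven from inside a script. A minimal sketch; `slow_sum` is a made-up hot spot used only to have something to measure:

```python
import cProfile
import io
import pstats


def slow_sum(n):
    """Hypothetical hot spot: sums squares with a plain loop."""
    total = 0
    for i in range(n):
        total += i * i
    return total


profiler = cProfile.Profile()
profiler.enable()
slow_sum(200_000)
profiler.disable()

# Write the report into a string so it can be printed or saved.
report = io.StringIO()
stats = pstats.Stats(profiler, stream=report)
stats.sort_stats("cumulative").print_stats(5)
print(report.getvalue())
```

The same profiler can also be run without touching the code, with `python3 -m cProfile -s cumulative script.py`.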

List : fast to add/remove data at the end
        fast to retrieve data by index
        slow to add data in the middle
        slow to find data at an unknown position
* If you need to access elements by position, or will always iterate through all the elements, use a list to store them.

Dictionary (hash/map) : fast to find the value for a key in one operation
* If we need to look up the elements using a key, we'll use a dictionary.
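
The difference between the two structures can be sketched as follows. The user data and the helper `find_email_in_list` are invented for illustration:

```python
# Hypothetical data: (username, email) pairs.
users_list = [
    ("alice", "alice@example.com"),
    ("bob", "bob@example.com"),
    ("carol", "carol@example.com"),
]


def find_email_in_list(users, name):
    """Look up by key in a list: has to scan entries until one matches."""
    for username, email in users:
        if username == name:
            return email
    return None


# Same data keyed by username: a lookup is a single operation.
users_dict = dict(users_list)

print(users_list[1])                            # access by position suits a list
print(find_email_in_list(users_list, "carol"))  # scan through the list
print(users_dict["carol"])                      # direct lookup by key
```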

If you do an expensive operation inside a loop, you multiply the time it takes by the number of times the loop repeats.
Make sure the list of elements you're iterating through is only as long as you really need it to be.
Also remember to break out of the loop once you've found what you were looking for (break).
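
A small sketch of both loop rules, using a made-up `load_names` function as the expensive operation and a counter to show how often it runs:

```python
load_calls = {"count": 0}  # tracks how often the expensive step runs


def load_names():
    """Stand-in for an expensive operation (parsing a file, calling a service)."""
    load_calls["count"] += 1
    return {"alice": "Alice A.", "bob": "Bob B."}


def first_known_name_slow(usernames):
    found = None
    for username in usernames:
        names = load_names()  # expensive work repeated on every iteration
        if found is None and username in names:
            found = names[username]
        # no break: the loop keeps running after a match
    return found


def first_known_name_fast(usernames):
    names = load_names()  # expensive work done once, before the loop
    found = None
    for username in usernames:
        if username in names:
            found = names[username]
            break  # stop as soon as we have an answer
    return found


usernames = ["zed", "bob", "yan", "alice"]

slow_result = first_known_name_slow(usernames)
slow_calls = load_calls["count"]

load_calls["count"] = 0
fast_result = first_known_name_fast(usernames)
fast_calls = load_calls["count"]

print(slow_result, slow_calls)
print(fast_result, fast_calls)
```

Both versions return the same name; only the number of expensive calls differs.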

time ./send_reminders.py "2020-01-13|Example|test1"

Real : The actual elapsed time it took to execute the command (wall-clock time)
User : The time spent doing operations in user space
Sys : The time spent doing system-level operations (kernel space)
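
The gap between wall-clock time and CPU time can also be seen from inside Python. A minimal sketch, where `measure` is a hypothetical helper and `time.process_time()` reports user plus system CPU time of the current process:

```python
import time


def measure(func):
    """Return (wall_seconds, cpu_seconds) spent running func()."""
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    func()
    wall = time.perf_counter() - wall_start
    cpu = time.process_time() - cpu_start
    return wall, cpu


# Sleeping takes wall-clock time but almost no CPU time.
wall, cpu = measure(lambda: time.sleep(0.2))
print(f"wall: {wall:.3f}s  cpu (user+sys): {cpu:.3f}s")
```

A large real time with small user and sys times usually means the program is waiting (on disk, network, or sleep) rather than computing.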

time ./send_reminders.py "2020-01-13|Example|test1,test2,test3,test4,test5,test6,test7,test8,test9"

pprofile3 -f callgrind -o profile.out ./send_reminders.py "2020-01-13|Example|test1,test2,test3,test4,test5,test6,test7,test8,test9"

kcachegrind profile.out
clear
atom send_reminders.py
get_name function

## When Slowness Problems Get Complex

Python : threading / asyncio modules. These modules let us specify which parts of the code we want to run in separate threads or as separate asynchronous events.
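
For the asyncio side, a minimal sketch of running several waits as separate asynchronous events. The `fetch` coroutine and its delays are invented, with `asyncio.sleep` standing in for real I/O:

```python
import asyncio
import time


async def fetch(name, delay):
    """Stand-in for an I/O-bound call; only waits, then returns a label."""
    await asyncio.sleep(delay)
    return f"{name} done"


async def run_all():
    # gather() runs the three coroutines concurrently and keeps their order.
    return await asyncio.gather(
        fetch("a", 0.2),
        fetch("b", 0.2),
        fetch("c", 0.2),
    )


start = time.perf_counter()
results = asyncio.run(run_all())
elapsed = time.perf_counter() - start

print(results)
print(f"elapsed: {elapsed:.2f}s")
```

Run one after another, the three waits would take about 0.6 seconds; run concurrently they overlap.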

from concurrent import futures
import argparse
import logging
import os
import sys

import PIL
import PIL.Image

from tqdm import tqdm


def process_options():

    kwargs = {
        'format': '[%(levelname)s] %(message)s',
    }

    parser = argparse.ArgumentParser(
        description='Thumbnail generator',
        fromfile_prefix_chars='@'
    )
    parser.add_argument('--debug', action='store_true')
    parser.add_argument('-v', '--verbose', action='store_true')
    parser.add_argument('-q', '--quiet', action='store_true')

    options = parser.parse_args()

    if options.debug:
        kwargs['level'] = logging.DEBUG
    elif options.verbose:
        kwargs['level'] = logging.INFO
    elif options.quiet:
        kwargs['level'] = logging.ERROR
    else:
        kwargs['level'] = logging.WARNING

    logging.basicConfig(**kwargs)

    return options


def process_file(root, basename):
    filename = f'{root}/{basename}'
    image = PIL.Image.open(filename)

    size = (128, 128)
    image.thumbnail(size)

    new_name = f'thumbnails/{basename}'
    image.save(new_name, "JPEG")
    return new_name


def progress_bar(files):
    return tqdm(files, desc='Processing', total=len(files), dynamic_ncols=True)


def main():

    process_options()

    # Create the thumbnails directory
    if not os.path.exists('thumbnails'):
        os.mkdir('thumbnails')

#     executor = futures.ThreadPoolExecutor()
    executor = futures.ProcessPoolExecutor()
    for root, _, files in os.walk('images'):
        for basename in progress_bar(files):
            if not basename.endswith('.jpg'):
                continue
            executor.submit(process_file, root, basename)
#             process_file(root, basename)
    print("Waiting for all workers to finish.")
    executor.shutdown()
    return 0


if __name__ == "__main__":
    sys.exit(main())

Executor : The process that's in charge of distributing the work among the different workers
futures module : Provides a couple of different executors: one that uses threads and another that uses processes

Threads rely on locking to keep two threads from writing to the same variable at the same time. As a result, threads may spend a few milliseconds waiting for their turn to write, and those waits add up to the small difference in run time between the thread-based and process-based executors.
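
A minimal sketch of an executor handing work to workers and collecting the results. The `square` function and the input numbers are placeholders:

```python
from concurrent import futures


def square(n):
    """Placeholder task; a real job would do I/O or heavier computation."""
    return n * n


numbers = [1, 2, 3, 4, 5]
results = {}

with futures.ThreadPoolExecutor(max_workers=4) as executor:
    # submit() returns a Future immediately; map each one back to its input.
    submitted = {executor.submit(square, n): n for n in numbers}
    for future in futures.as_completed(submitted):
        # result() returns the task's value, or re-raises its exception.
        results[submitted[future]] = future.result()

print(sorted(results.items()))
```

Calling `result()` matters: when the returned futures are discarded, an exception raised inside a worker goes unnoticed. Swapping in `futures.ProcessPoolExecutor` works the same way, provided the code that creates the executor sits under `if __name__ == "__main__":`.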

## Fix a slow system with Python

sudo apt install python3-pip
pip3 install psutil
python3
import psutil
psutil.cpu_percent()
psutil.disk_io_counters()
psutil.net_io_counters()
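
The I/O counters above are cumulative totals, so a rate needs two samples taken some time apart. A rough sketch under that assumption; `rate_per_second` and `sample_usage` are hypothetical helper names, and the import is guarded only so the pure helper still runs where psutil is not installed:

```python
try:
    import psutil  # third-party; installed with `pip3 install psutil`
except ImportError:  # keeps rate_per_second usable without psutil
    psutil = None


def rate_per_second(before, after, seconds):
    """Average per-second growth of a cumulative counter over a window."""
    return (after - before) / seconds


def sample_usage(seconds=1.0):
    """Sample CPU and network usage over `seconds` (requires psutil)."""
    net_before = psutil.net_io_counters()
    # With an interval, cpu_percent() blocks and measures over that window.
    cpu = psutil.cpu_percent(interval=seconds)
    net_after = psutil.net_io_counters()
    return {
        "cpu_percent": cpu,
        "net_sent_per_s": rate_per_second(
            net_before.bytes_sent, net_after.bytes_sent, seconds),
        "net_recv_per_s": rate_per_second(
            net_before.bytes_recv, net_after.bytes_recv, seconds),
    }


# Worked example of the rate calculation with made-up counter values.
print(rate_per_second(1_000, 6_000, 5))
```

With psutil installed, `sample_usage(1.0)` returns a dict of CPU percentage and bytes per second sent and received. Passing an interval also avoids the meaningless 0.0 that the first call to `psutil.cpu_percent()` returns when no interval is given.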

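rsync [Options] [Source-Files-Dir] [Destination]
rsync -zvh [Source-Files-Dir] [Destination]    # -z compress, -v verbose, -h human-readable sizes
rsync -zavh [Source-Files-Dir] [Destination]   # -a archive mode: recursive, preserves permissions, times, links
rsync -zrvh [Source-Files-Dir] [Destination]   # -r recursive, without the other archive-mode attributes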

python3
import subprocess
src = "<source-path>" # replace <source-path> with the source directory
dest = "<destination-path>" # replace <destination-path> with the destination directory
subprocess.call(["rsync", "-arq", src, dest])
    
ls ~/scripts
sudo chmod +x ~/scripts/multisync.py
./scripts/multisync.py

nano ~/scripts/dailysync.py
sudo chmod +x ~/scripts/dailysync.py
./scripts/dailysync.py

#!/usr/bin/env python3
# multisync.py
from multiprocessing import Pool


def run(task):
    # Do something with task here
    print("Handling {}".format(task))


if __name__ == "__main__":
    tasks = ['task1', 'task2', 'task3']
    # Create a pool with one worker process per task
    p = Pool(len(tasks))
    # Start each task within the pool
    p.map(run, tasks)

#!/usr/bin/env python3
# dailysync.py
import subprocess
src = "/data/prod/"
dest = "/data/prod_backup/"
subprocess.call(["rsync", "-arq", src, dest])

#!/usr/bin/env python3
import subprocess
from multiprocessing import Pool
import os


def backup(src):
    dest = os.getcwd() + "/data/prod_backup/"
    print("Backing up {} into {}".format(src, dest))
    subprocess.call(["rsync", "-arq", src, dest])


if __name__ == "__main__":
    src = os.getcwd() + "/data/prod/"
    list_of_files = os.listdir(src)
    all_files = []

    for value in list_of_files:
        full_path = os.path.join(src, value)
        all_files.append(full_path)

    # Create the pool once, after the file list is complete, and close it when done
    with Pool(len(all_files)) as pool:
        pool.map(backup, all_files)