# heptabot inference time and RAM usage

In this notebook we show the runtime characteristics of heptabot. It was tested on a [vast.ai](https://vast.ai/console/create/) instance created using the `tensorflow/tensorflow:2.3.0-gpu-jupyter` image and our [Install](https://github.com/lcl-hse/heptabot/blob/master/notebooks/Install.ipynb) script. Because *heptabot* is currently hosted on a machine with an NVIDIA GeForce GTX 1080 Ti graphics card and 32 GB of total system RAM, the results are shown for the same configuration.

First, we check the Python version and enter our working directory. Keep in mind that the code is executed within the `heptabot` virtual environment.

In [1]:
!python --version

Python 3.6.9


In [2]:
%cd heptabot

/tf/heptabot


Let's get the current load on the GPU:

In [3]:
!nvidia-smi

Wed Oct  7 17:03:16 2020       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 455.23.05    Driver Version: 455.23.05    CUDA Version: 11.1     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|   0  GeForce GTX 108...  On   | 00000000:04:00.0 Off |                  N/A |
| 21%   38C    P8    10W / 300W |      2MiB / 11178MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Proces

Next, the total amount of used RAM:

In [4]:
!free -h

              total        used        free      shared  buff/cache   available
Mem:            31G         13G        7.9G         57M         10G         17G
Swap:           14G        1.8G         13G


And the current CPU tasks:

In [5]:
!top -H -n 1

top - 17:03:22 up 4 days,  1:47,  0 users,  load average: 15.35, 12.93, 12.33
Threads:  30 total,   1 running,  29 sleeping,   0 stopped,   0 zombie
%Cpu(s): 17.5 us, 13.7 sy,  1.0 ni, 67.4 id,  0.3 wa,  0.0 hi,  0.0 si,  0.0 st
KiB Mem : 32899856 total,  8255364 free, 14028320 used, 10616172 buff/cache
KiB Swap: 15624188 total, 13715856 free,  1908332 used. 18368528 avail Mem

  PID USER      PR  NI    VIRT    RES    SHR S %CPU %MEM     TIME+ COMMAND
   27 root      20   0  364836  76272  17340 S  6.7  0.2   0:11.32 jupyter-not+


It is also possible to place the models in CPU RAM: to do this, execute the code in the following cell. We, however, want to test the models on the GPU, so we leave this code commented out.

In [None]:
# import os

# os.environ["MODEL_PLACE"] = "cpu"
# os.environ['CUDA_VISIBLE_DEVICES'] = '-1'

Now we import our functions. This cell will also download the `sentence_transformers` model if it has not been downloaded already.

In [6]:
%%time

from models import batchify, process_batch, result_to_div




100%|██████████| 245M/245M [01:09<00:00, 3.54MB/s] 


Instructions for updating:
This function will only be available through the v1 compatibility library as tf.compat.v1.saved_model.loader.load or tf.compat.v1.saved_model.load. There will be a new function for importing SavedModels in Tensorflow 2.0.


INFO:tensorflow:Restoring parameters from models/savemodel/variables/variables


CPU times: user 57.9 s, sys: 15.7 s, total: 1min 13s
Wall time: 1min 58s
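
`%%time` reports CPU time (time the process spent computing) separately from wall time (elapsed real time). The wall time is larger here, and the 245M download shown above is a plausible reason, since waiting on the network consumes little CPU. The sketch below illustrates the same distinction with the standard library only; `measure` is a hypothetical helper and not part of heptabot.

```python
import time
from contextlib import contextmanager


@contextmanager
def measure(label, results):
    """Record wall-clock and CPU time spent inside the with-block."""
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    try:
        yield
    finally:
        results[label] = {
            "wall": time.perf_counter() - wall_start,
            "cpu": time.process_time() - cpu_start,
        }


timings = {}
with measure("sleep", timings):
    # Sleeping waits without computing, much like waiting on a download.
    time.sleep(0.05)

print(timings["sleep"])
```

In the notebook, the same `with measure(...)` block could wrap the import cell to reproduce both figures without the cell magic.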


Note that the models are now placed on the GPU:

In [7]:
!nvidia-smi

Wed Oct  7 17:05:29 2020       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 455.23.05    Driver Version: 455.23.05    CUDA Version: 11.1     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|   0  GeForce GTX 108...  On   | 00000000:04:00.0 Off |                  N/A |
| 21%   42C    P2    59W / 300W |  10657MiB / 11178MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Proces
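
To put numbers on the change, the `Memory-Usage` column of the `nvidia-smi` snapshots in this notebook can be compared directly. The sketch below parses that column with a regular expression; `parse_memory_mib` is a hypothetical helper, and the sample rows are copied from the outputs in this notebook (before the import, after the import, and after the first run, which is shown further down).

```python
import re

MEMORY_RE = re.compile(r"(\d+)MiB\s*/\s*(\d+)MiB")


def parse_memory_mib(row):
    """Return (used, total) in MiB from an nvidia-smi table row."""
    match = MEMORY_RE.search(row)
    if match is None:
        raise ValueError("no memory usage found in row")
    return int(match.group(1)), int(match.group(2))


rows = {
    "before import": "| 21%   38C    P8    10W / 300W |      2MiB / 11178MiB |      0%      Default |",
    "after import": "| 21%   42C    P2    59W / 300W |  10657MiB / 11178MiB |      0%      Default |",
    "after first run": "| 37%   58C    P2    76W / 140W |  10979MiB / 11178MiB |      0%      Default |",
}

used = {name: parse_memory_mib(row)[0] for name, row in rows.items()}
print(used["after import"] - used["before import"])    # MiB taken when the models load
print(used["after first run"] - used["after import"])  # extra MiB after the first run
```

Keep in mind that TensorFlow by default reserves most of the GPU memory up front, so the first difference may overstate what the models strictly need.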

It's time to choose a text and run our code!

In [8]:
text = """Crime committed by the youth is an actual problem of the modern world. The facts discover that the number of these crimes is higher nowadays than ever before. I would like to look at the reasons of this terrifying trend and search some adequate solutions for this problem. 
It is clear that alcohol and drugs are the reasons of many crimes. From my point of view, advertising and many ways of distribution of both are one of the most important reasons of the crime growth. Many teens and young people addicted to alcohol and drugs become criminals. There is factual evidence that the highest percentage of imprisoners are drug dealers and drug holders. The second reason is a bad financial state of many families and young men. Such negative financial environment destructively influences development of children and teenagers. They become criminals because it is the only way for them to survive and get some money. The third reason is a luck of education. Young men with weak education cannot find a well-paid job and meet different life difficulties that can incline them to a criminal path. 
As far as I am concerned, each of the reasons is important and should be solved carefully. The best way to deal with alcohol-and-drugs problem is to control students from schools and universities and advertise a good and healthy lifestyle. There should be some control tests among student and undergraduates to indicate drug addiction. Some lessons and sport events should be organized to make active life and sport common among youngsters. The second problem of poverty can be handled through regular revision of families in terms of their financial stability. Children from poor families should be given some financial support. Government should organize different institutions to help young graduates to find a well-paid job. The final reason, lack of education, can be solved through the wide range of different scholarships that will allow young men to go to university. All schools and higher schools should be able to provide some standards of the quality of education so that their student an compete with others. 
To sum up, the problem of crime is actual among young people nowadays. This problem can be explained through 3 main reasons: alcohol and drugs, poverty lack of education. Although the problem is difficult, it can be solved through adequate means of the governors and local authorities: a health lifestyle promotion and support of educational programs. We shouldn't close our eyes and should do our best to help young people to choose the right way."""

In [9]:
%%time

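# Assumption: InputOverflow is not imported anywhere in this notebook, so a
# minimal stand-in is defined here to keep the length check from failing
# with a NameError.
class InputOverflow(Exception):
    pass


def process_text(text, task_type="correction"):
    # Split the input into batches for the model.
    batches, delims = batchify(text, task_type)

    # Refuse inputs that produce more than 50 batches.
    if len(batches) > 50:
        raise InputOverflow(task_type)

    # Process every batch, then flatten the per-batch results into one list.
    processed = [process_batch(batch) for batch in batches]
    plist = [item for subl in processed for item in subl]

    # Render the results as HTML markup.
    return result_to_div(text, plist, delims, task_type)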


print(process_text(text))
print("\n")
print("Processed text")

<span>Crime committed by the youth is an actual problem of the modern world. The facts </span><div style="display: inline;" onmouseover="showcomment(this, event);" onmouseleave="hidecomment(this);"><del class="hidden vocab" style="cursor: pointer;" onclick="showhide(this);">discover</del><div class="vocab error-hider" onclick="showhide(this);"></div><ins class="vocab" style="cursor: pointer;" onclick="showhide(this);">show</ins><hgroup class="vocab error-type" style="left: 709.15px; visibility: hidden; top: 70.5px; --left-pos:NaNpx;"><span>Vocabulary error</span></hgroup></div><span> that the number of these crimes is higher nowadays than ever before. I would like to look at the reasons </span><div style="display: inline;" onmouseover="showcomment(this, event);" onmouseleave="hidecomment(this);"><del class="hidden gram" style="cursor: pointer;" onclick="showhide(this);">of</del><div class="gram error-hider" onclick="showhide(this);"></div><ins class="gram" style="cursor: pointer;" oncl
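
The control flow of `process_text` (split into batches, refuse oversized inputs, process each batch, flatten the results) can be tried without the models. This is only an illustrative sketch: `toy_batchify`, `toy_process_batch` and `toy_process_text` are hypothetical stand-ins that say nothing about how heptabot's real functions split or correct text, and a built-in `ValueError` stands in for `InputOverflow`.

```python
def toy_batchify(text, batch_size=2):
    # Hypothetical stand-in for batchify: split on newlines, group into batches.
    lines = [line for line in text.split("\n") if line]
    return [lines[i:i + batch_size] for i in range(0, len(lines), batch_size)]


def toy_process_batch(batch):
    # Hypothetical stand-in for process_batch: one output item per input item.
    return [item.upper() for item in batch]


def toy_process_text(text, max_batches=50):
    batches = toy_batchify(text)
    if len(batches) > max_batches:
        # ValueError stands in for InputOverflow in this sketch.
        raise ValueError("too many batches: {}".format(len(batches)))
    processed = [toy_process_batch(batch) for batch in batches]
    # Flatten the list of per-batch results into one flat list.
    return [item for sublist in processed for item in sublist]


print(toy_process_text("a\nb\nc"))
```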

Let's check if something has changed on the GPU:

In [11]:
!nvidia-smi

Wed Oct  7 17:06:53 2020       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 455.23.05    Driver Version: 455.23.05    CUDA Version: 11.1     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|   0  GeForce GTX 108...  On   | 00000000:04:00.0 Off |                  N/A |
| 37%   58C    P2    76W / 140W |  10979MiB / 11178MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Proces

Now, repeat the procedure. It should run a little bit faster this time:

In [10]:
%%time


print(process_text(text))
print("\n")
print("Processed text")

<span>Crime committed by the youth is an actual problem of the modern world. The facts </span><div style="display: inline;" onmouseover="showcomment(this, event);" onmouseleave="hidecomment(this);"><del class="hidden vocab" style="cursor: pointer;" onclick="showhide(this);">discover</del><div class="vocab error-hider" onclick="showhide(this);"></div><ins class="vocab" style="cursor: pointer;" onclick="showhide(this);">show</ins><hgroup class="vocab error-type" style="left: 709.15px; visibility: hidden; top: 70.5px; --left-pos:NaNpx;"><span>Vocabulary error</span></hgroup></div><span> that the number of these crimes is higher nowadays than ever before. I would like to look at the reasons </span><div style="display: inline;" onmouseover="showcomment(this, event);" onmouseleave="hidecomment(this);"><del class="hidden gram" style="cursor: pointer;" onclick="showhide(this);">of</del><div class="gram error-hider" onclick="showhide(this);"></div><ins class="gram" style="cursor: pointer;" oncl
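
One way to check whether a repeated run is faster is to time several consecutive calls and compare them. This is a minimal sketch using only the standard library; `time_runs` is a hypothetical helper, and `workload` is a stand-in for `process_text(text)`.

```python
import time


def time_runs(func, repeats=3):
    """Call func repeatedly and return the wall time of each call in seconds."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        func()
        durations.append(time.perf_counter() - start)
    return durations


def workload():
    # Stand-in for process_text(text); replace with the real call in the notebook.
    return sum(i * i for i in range(10_000))


durations = time_runs(workload)
for run, seconds in enumerate(durations, start=1):
    print(f"run {run}: {seconds:.4f} s")
```

In the notebook, `time_runs(lambda: process_text(text))` would give the first-run and repeat-run times side by side.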

In [14]:
!nvidia-smi

Wed Oct  7 17:08:12 2020       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 455.23.05    Driver Version: 455.23.05    CUDA Version: 11.1     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|   0  GeForce GTX 108...  On   | 00000000:04:00.0 Off |                  N/A |
| 35%   48C    P8    11W / 145W |  10979MiB / 11178MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Proces

Finally, let's check our RAM and running processes again:

In [12]:
!free -h

              total        used        free      shared  buff/cache   available
Mem:            31G         18G        2.3G         85M         10G         12G
Swap:           14G        1.8G         13G
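
The difference between the two RAM snapshots can be worked out more precisely from the KiB figures that `top` prints at the same two moments (14028320 KiB used before the models were loaded, 19063336 KiB used after the first run). The short calculation below is only arithmetic on those reported numbers; other processes on the machine also contribute to them, so the result is an estimate rather than an exact footprint.

```python
KIB_PER_GIB = 1024 ** 2

used_before_kib = 14028320  # "used" in the first top snapshot
used_after_kib = 19063336   # "used" in the second top snapshot

delta_kib = used_after_kib - used_before_kib
delta_gib = delta_kib / KIB_PER_GIB
print(f"{delta_kib} KiB = {delta_gib:.2f} GiB")
```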


In [13]:
!top -H -n 1

top - 17:06:56 up 4 days,  1:51,  0 users,  load average: 13.58, 13.46, 12.67
Threads: 191 total,   1 running, 190 sleeping,   0 stopped,   0 zombie
%Cpu(s): 17.5 us, 13.8 sy,  1.0 ni, 67.4 id,  0.3 wa,  0.0 hi,  0.0 si,  0.0 st
KiB Mem : 32899856 total,  2467672 free, 19063336 used, 11368848 buff/cache
KiB Swap: 15624188 total, 13716112 free,  1908076 used. 13305120 avail Mem

  PID USER      PR  NI    VIRT    RES    SHR S %CPU %MEM     TIME+ COMMAND
 3494 root      20   0 41.769g 5.759g 847620 S  6.2 18.4   1:16.75 python


And that's it – our benchmark ends here!