# PERO-OCR

## About PERO OCR
PERO OCR is an optical character recognition (OCR) system developed by the PERO project team at Brno University of Technology. Its main technical advantages are high accuracy and adaptability across document types, together with the ability to handle low-quality printed and handwritten documents. It supports most European languages, including Latin, and can handle older documents set in German and Czech Fraktur and similar typefaces. Its handwriting support is mainly for Czech documents. It also provides an efficient text correction interface and several downloadable transcription formats, such as ALTO, PAGE XML and plain text. PERO OCR can be used in several ways: through an API, a web application, or a Python package.

## Usage of PERO OCR (for this project)
Previously, when testing the web application, we found that PERO OCR produced outstanding [results](./path/to/local/image.jpg). It is not only very fast (on average about 3 seconds to recognize the layout of a newspaper page and about 30 seconds to recognize its text), but also highly accurate (the text recognition accuracy for each newspaper page was generally above 95%).

Nevertheless, the PERO OCR web application only supports single-page uploads, does not support batch processing, and limits the size of the uploaded images.

PERO OCR's [API](https://app.swaggerhub.com/apis-docs/LachubCz/PERO-API/1.0.4#/external/post_processing_request) is designed for automation and integration. However, testing showed that the API expects URL links to the images to be processed and does not accept images uploaded directly from a local machine. This means the pre-processed images must first be uploaded to a cloud storage service, and the resulting URLs must then be passed to the API. In addition, since the API is a pre-defined set of functions and interfaces, its internals cannot be modified, which would make it inconvenient to adjust the image size limit later. For these reasons, we set the API aside for the time being.

We therefore decided to use the [Python package](https://pypi.org/project/pero-ocr/0.2/). According to PERO OCR's [guidance](https://github.com/DCGM/pero-ocr/blob/master/README.md#integration-of-the-pero-ocr-python-module) on GitHub, a series of packages must be installed first. A publicly available [model with its configuration file](https://github.com/DCGM/pero-ocr#available-models) can then be downloaded as needed. Once both steps are done, run the following code, adjusting the source code where necessary.
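Before starting the pipeline, it can help to confirm that the downloaded configuration file was actually found, because `ConfigParser.read` silently skips missing files. The following is a minimal sketch using only the standard library. `load_pero_config` is a hypothetical helper name, and the demo writes a throwaway file with a placeholder key rather than a real PERO OCR configuration.

```python
import configparser
import os
import tempfile


def load_pero_config(config_path):
    """Read an .ini file and check that it contains a [PAGE_PARSER] section."""
    config = configparser.ConfigParser()
    # read() returns the list of files it managed to parse; a missing file
    # is skipped without raising, so an empty list means nothing was loaded.
    read_files = config.read(config_path)
    if not read_files:
        raise FileNotFoundError(f"Config file not found or unreadable: {config_path}")
    if "PAGE_PARSER" not in config:
        raise KeyError(f"No [PAGE_PARSER] section in {config_path}")
    return config


# Demo with a temporary file; EXAMPLE_KEY is a placeholder, not a PERO OCR option.
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = os.path.join(tmp_dir, "config.ini")
    with open(demo_path, "w", encoding="utf-8") as handle:
        handle.write("[PAGE_PARSER]\nEXAMPLE_KEY = yes\n")
    demo_config = load_pero_config(demo_path)
    demo_sections = demo_config.sections()

print(demo_sections)  # ['PAGE_PARSER']
```

With a real setup, the returned object could be passed to `PageParser` in place of the config read directly with `configparser`.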


In [None]:
import os
import configparser
import cv2
import numpy as np
from pero_ocr.document_ocr.layout import PageLayout
from pero_ocr.document_ocr.page_parser import PageParser

# Read config file.
config_path = "./config_file.ini"
config = configparser.ConfigParser()
config.read(config_path)

# Init the OCR pipeline. 
# You have to specify config_path to be able to use relative paths
# inside the config file.
page_parser = PageParser(config, config_path=os.path.dirname(config_path))

# Read the document page image.
input_image_path = "page_image.jpg"
image = cv2.imread(input_image_path, 1)

# Init empty page content. 
# This object will be updated by the ocr pipeline. id can be any string and it is used to identify the page.
page_layout = PageLayout(id=input_image_path,
     page_size=(image.shape[0], image.shape[1]))

# Process the image by the OCR pipeline
page_layout = page_parser.process_page(image, page_layout)

page_layout.to_pagexml('output_page.xml') # Save results as Page XML.
page_layout.to_altoxml('output_ALTO.xml') # Save results as ALTO XML.

# Render detected text regions and text lines into the image and
# save it into a file.
rendered_image = page_layout.render_to_image(image) 
cv2.imwrite('page_image_render.jpg', rendered_image)

# Save each cropped text line in a separate .jpg file.
for region in page_layout.regions:
  for line in region.lines:
     cv2.imwrite(f'file_id-{line.id}.jpg', line.crop.astype(np.uint8))

In [1]:
import os
import configparser
import cv2
import numpy as np
from pero_ocr.document_ocr.layout import PageLayout
from pero_ocr.document_ocr.page_parser import PageParser


numba available, importing jit


In [4]:
# Read config file.
config_path = "./config.ini"
config = configparser.ConfigParser()
config.read(config_path)

# Init the OCR pipeline. 
# You have to specify config_path to be able to use relative paths
# inside the config file.
page_parser = PageParser(config, config_path=os.path.dirname(config_path))

# Read the document page image.
input_image_path = "/home/vivek/Desktop/pero_eu_cz_print_newspapers_2022-09-26/01_01_00000001_cropped.jpg"
image = cv2.imread(input_image_path, 1)

# Init empty page content. 
# This object will be updated by the ocr pipeline. id can be any string and it is used to identify the page.
page_layout = PageLayout(id=input_image_path,
     page_size=(image.shape[0], image.shape[1]))

# Process the image by the OCR pipeline
page_layout = page_parser.process_page(image, page_layout)

page_layout.to_pagexml('output_page.xml') # Save results as Page XML.
page_layout.to_altoxml('output_ALTO.xml') # Save results as ALTO XML.

# Render detected text regions and text lines into the image and
# save it into a file.
rendered_image = page_layout.render_to_image(image) 
cv2.imwrite('page_image_render.jpg', rendered_image)

# Save each cropped text line in a separate .jpg file.
for region in page_layout.regions:
  for line in region.lines:
     cv2.imwrite(f'file_id-{line.id}.jpg', line.crop.astype(np.uint8))

LayoutEngine params are line_end_weight:1.0 vertical_line_connection_range:3 smooth_line_predictions:False line_detection_threshold:0.2 adaptive_downsample:True
NET INPUT 258048 Mpx.
NET INPUT 2408448 Mpx.
GET MAPS TIME: 0.7413203716278076
MAP RES: (1773, 1281, 5)


  return cascaded_union(triangles)
  for poly in region_poly:
  line.baseline = SmartRegionSorter.rotate_line(line.baseline, angle)
Compilation is falling back to object mode WITH looplifting enabled because Function "reverse_line_mapping" failed type inference due to: non-precise type pyobject
During: typing of argument at /home/vivek/anaconda3/lib/python3.8/site-packages/pero_ocr/document_ocr/crop_engine.py (104)

File "../../anaconda3/lib/python3.8/site-packages/pero_ocr/document_ocr/crop_engine.py", line 104:
    def reverse_line_mapping(self, forward_mapping, sample_positions, sampled_values):
        backward_mapping = np.zeros_like(sample_positions)
        ^

  @jit
Compilation is falling back to object mode WITHOUT looplifting enabled because Function "reverse_line_mapping" failed type inference due to: cannot determine Numba type of <class 'numba.core.dispatcher.LiftedLoop'>

File "../../anaconda3/lib/python3.8/site-packages/pero_ocr/document_ocr/crop_engine.py", line 106:
  

In [6]:
from configparser import ConfigParser
config = ConfigParser()
config.read(config_path)

# Check whether the PAGE_PARSER section was read successfully
if 'PAGE_PARSER' in config:
    page_parser = PageParser(config, config_path=os.path.dirname(config_path))
else:
    print("PAGE_PARSER section not found")


LayoutEngine params are line_end_weight:1.0 vertical_line_connection_range:3 smooth_line_predictions:False line_detection_threshold:0.2 adaptive_downsample:True


## Environment configuration issues
In theory, the code above is simple to run. In practice, the hardest part was configuring the environment.
This project uses an Anaconda environment.

### Create a new environment
conda create -n pero-ocr python=3.9
conda activate pero-ocr
The first step is to create a new Anaconda environment from the terminal and activate it. Starting from an environment that is as clean as possible makes the later package installations easier and avoids version conflicts with packages that are already installed.

### Install the necessary packages to use PERO OCR
conda install -c anaconda jupyter

pip install pero_ocr

While installing shapely and libgeos we ran into version conflicts, so we removed both and reinstalled compatible versions:

conda remove shapely

conda remove libgeos

conda install -c conda-forge shapely=1.7.1

conda install -c conda-forge libgeos=3.8.1

A note on version conflicts:
The pero_ocr version used here is 0.6.1, released on October 20, 2022, so its dependencies should be pinned to versions released before that date. Prefer stable releases, and expect to test several combinations.
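When testing several version combinations, it is useful to print what is actually installed in the active environment. The sketch below uses `importlib.metadata` from the standard library (Python 3.8 or later). The helper name `installed_versions` and the list of packages to check are illustrative assumptions, chosen from the packages mentioned in this document.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


# Packages that caused conflicts in this project (illustrative selection).
report = installed_versions(["pero-ocr", "numpy", "scipy", "numba", "shapely"])
for name, installed in report.items():
    print(f"{name:10s} {installed or 'not installed'}")
```

Running this inside the activated `pero-ocr` environment shows at a glance whether a downgrade took effect or whether another install step pulled a newer release back in.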

### NumPy version downgrade issues
After the packages above were installed successfully, running the code raised a NumPy version error:

AttributeError: module 'numpy' has no attribute 'float'.
np.float was a deprecated alias for the builtin float. To avoid this error in existing code, use float by itself. Doing this will not modify any behavior and If you specifically wanted the numpy scalar type, use np.float64 here.
The aliases was originally deprecated in NumPy 1.20; for more details and guidance see the original release note at.
https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations

The error message says that the `np.float` alias was deprecated in NumPy 1.20 (it was removed entirely in NumPy 1.24), so we downgraded NumPy to a version below 1.20. However, once NumPy was lowered to 1.19.5, the installed SciPy and Numba were still at their latest versions and conflicted with the older NumPy, and the conflict persisted after several further downgrades.
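A workaround that is sometimes used instead of downgrading (not the approach taken in this project) is to restore the removed alias before importing the code that relies on it. The sketch below assumes that `np.float` is the only missing attribute. `restore_removed_aliases` is a hypothetical helper, and the demo uses a stand-in object so that it runs without NumPy installed.

```python
import types


def restore_removed_aliases(module, aliases):
    """Add back attributes that `module` no longer provides.

    Returns the list of names that were actually patched.
    """
    patched = []
    for name, target in aliases.items():
        if not hasattr(module, name):
            setattr(module, name, target)
            patched.append(name)
    return patched


# Stand-in for a NumPy release without the alias (hypothetical object,
# used only so this demo does not depend on NumPy being installed).
fake_numpy = types.SimpleNamespace(float64=float)
patched_names = restore_removed_aliases(fake_numpy, {"float": float})
print(patched_names)  # ['float']
```

With a real installation, the call would be `restore_removed_aliases(np, {"float": float})`, placed after `import numpy as np` and before the first `pero_ocr` import. This only hides the symptom, so other incompatibilities between a new NumPy and older dependencies may still appear.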

In [14]:
import os
import cv2
import numpy as np
import configparser
from pero_ocr.document_ocr.layout import PageLayout
from pero_ocr.document_ocr.page_parser import PageParser

def process_image(image_path, output_dir, page_parser):
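    image = cv2.imread(image_path, 1)
    if image is None:
        # cv2.imread returns None instead of raising when a file cannot be read.
        raise FileNotFoundError(f"Could not read image: {image_path}")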
    page_layout = PageLayout(id=image_path, page_size=(image.shape[0], image.shape[1]))
    page_layout = page_parser.process_page(image, page_layout)

    # Create the output subfolder
    os.makedirs(output_dir, exist_ok=True)

    # Save the Page XML and ALTO XML
    page_layout.to_pagexml(os.path.join(output_dir, 'output_page.xml'))
    page_layout.to_altoxml(os.path.join(output_dir, 'output_ALTO.xml'))

    # Render the image and save it
    rendered_image = page_layout.render_to_image(image)
    cv2.imwrite(os.path.join(output_dir, 'page_image_render.jpg'), rendered_image)

    # Save each cropped text line
    for region in page_layout.regions:
        for line in region.lines:
            cv2.imwrite(os.path.join(output_dir, f'{line.id}.jpg'), line.crop.astype(np.uint8))

# Configure the OCR pipeline
config_path = "./config.ini"
config = configparser.ConfigParser()
config.read(config_path)
page_parser = PageParser(config, config_path=os.path.dirname(config_path))

# Directory containing the image files
base_directory = '/home/vivek/Desktop/result'

for folder in sorted(os.listdir(base_directory)):
    if folder.endswith('_bestresult'):
        input_folder = os.path.join(base_directory, folder)
        output_folder_base = os.path.join(base_directory, folder.split('_')[0] + '_ocr')

        for file in sorted(os.listdir(input_folder)):
            if file.endswith('_cropped.jpg'):
                input_path = os.path.join(input_folder, file)
                output_dir = os.path.join(output_folder_base, file.replace('_cropped.jpg',""))

                process_image(input_path, output_dir, page_parser)


LayoutEngine params are line_end_weight:1.0 vertical_line_connection_range:3 smooth_line_predictions:False line_detection_threshold:0.2 adaptive_downsample:True
NET INPUT 258048 Mpx.
NET INPUT 2408448 Mpx.
GET MAPS TIME: 0.5423362255096436
MAP RES: (1773, 1281, 5)
NET INPUT 2179072 Mpx.
GET MAPS TIME: 0.40151381492614746
MAP RES: (1764, 1212, 5)
NET INPUT 2408448 Mpx.
GET MAPS TIME: 0.08796572685241699
MAP RES: (1777, 1283, 5)
NET INPUT 2179072 Mpx.
GET MAPS TIME: 0.45198655128479004
MAP RES: (1773, 1212, 5)
NET INPUT 2293760 Mpx.
GET MAPS TIME: 0.0980064868927002
MAP RES: (1778, 1219, 5)
NET INPUT 2293760 Mpx.
GET MAPS TIME: 0.08708620071411133
MAP RES: (1771, 1229, 5)
NET INPUT 2179072 Mpx.
NET INPUT 4423680 Mpx.
GET MAPS TIME: 0.22426223754882812
MAP RES: (2503, 1701, 5)


  areas = np.array([poly.area for poly in textline_is])
  textline_is = textline_is[np.argmax(areas)]


NET INPUT 4313088 Mpx.
GET MAPS TIME: 0.1549825668334961
MAP RES: (2482, 1723, 5)
NET INPUT 4423680 Mpx.
GET MAPS TIME: 0.15709209442138672
MAP RES: (2499, 1707, 5)
NET INPUT 4313088 Mpx.
GET MAPS TIME: 0.1544816493988037
MAP RES: (2486, 1728, 5)
NET INPUT 4423680 Mpx.
GET MAPS TIME: 0.1558675765991211
MAP RES: (2497, 1706, 5)
NET INPUT 4587520 Mpx.
GET MAPS TIME: 0.16545748710632324
MAP RES: (2499, 1741, 5)
NET INPUT 4153344 Mpx.
GET MAPS TIME: 0.14337992668151855
MAP RES: (2496, 1660, 5)
NET INPUT 4472832 Mpx.
GET MAPS TIME: 0.15700244903564453
MAP RES: (2485, 1733, 5)
NET INPUT 4313088 Mpx.
GET MAPS TIME: 0.15517091751098633
MAP RES: (2494, 1698, 5)


  lengths = np.array([line.length for line in baseline_is])
  baseline_is = baseline_is[np.argmax(lengths)]


NET INPUT 4587520 Mpx.
GET MAPS TIME: 0.16600990295410156
MAP RES: (2497, 1738, 5)
NET INPUT 4313088 Mpx.
GET MAPS TIME: 0.15271472930908203
MAP RES: (2492, 1701, 5)
NET INPUT 4472832 Mpx.
GET MAPS TIME: 0.1529233455657959
MAP RES: (2493, 1730, 5)
NET INPUT 4313088 Mpx.
GET MAPS TIME: 0.152848482131958
MAP RES: (2481, 1668, 5)
NET INPUT 4472832 Mpx.
GET MAPS TIME: 0.1552739143371582
MAP RES: (2495, 1756, 5)
NET INPUT 4313088 Mpx.
GET MAPS TIME: 0.14705777168273926
MAP RES: (2481, 1698, 5)
NET INPUT 4313088 Mpx.
GET MAPS TIME: 0.15154004096984863
MAP RES: (2486, 1716, 5)
NET INPUT 4313088 Mpx.
GET MAPS TIME: 0.15316343307495117
MAP RES: (2481, 1692, 5)
NET INPUT 4472832 Mpx.
GET MAPS TIME: 0.15662264823913574
MAP RES: (2492, 1743, 5)
NET INPUT 4313088 Mpx.
GET MAPS TIME: 0.16516828536987305
MAP RES: (2482, 1693, 5)
NET INPUT 4313088 Mpx.
GET MAPS TIME: 0.1587224006652832
MAP RES: (2481, 1726, 5)
NET INPUT 4313088 Mpx.
GET MAPS TIME: 0.16374969482421875
MAP RES: (2476, 1693, 5)
NET INPUT

  line_coords = self.get_crop_inputs(baseline, heights, self.line_height)
  line_coords = crop_engine.get_crop_inputs(line.baseline, line.heights, 16)


NET INPUT 4358144 Mpx.
GET MAPS TIME: 0.15303683280944824
MAP RES: (2416, 1732, 5)
NET INPUT 4202496 Mpx.
GET MAPS TIME: 0.14709091186523438
MAP RES: (2401, 1700, 5)
NET INPUT 4358144 Mpx.
GET MAPS TIME: 0.15265941619873047
MAP RES: (2415, 1737, 5)
NET INPUT 4202496 Mpx.
GET MAPS TIME: 0.145859956741333
MAP RES: (2404, 1700, 5)
NET INPUT 4358144 Mpx.
GET MAPS TIME: 0.154249906539917
MAP RES: (2415, 1729, 5)
NET INPUT 4046848 Mpx.
GET MAPS TIME: 0.140397310256958
MAP RES: (2409, 1636, 5)
NET INPUT 4358144 Mpx.
GET MAPS TIME: 0.15424251556396484
MAP RES: (2421, 1736, 5)
NET INPUT 4202496 Mpx.
GET MAPS TIME: 0.14322757720947266
MAP RES: (2400, 1707, 5)
NET INPUT 4358144 Mpx.
GET MAPS TIME: 0.15292906761169434
MAP RES: (2419, 1732, 5)
NET INPUT 4202496 Mpx.
GET MAPS TIME: 0.14733171463012695
MAP RES: (2384, 1688, 5)
NET INPUT 4202496 Mpx.
GET MAPS TIME: 0.14695310592651367
MAP RES: (2411, 1723, 5)
NET INPUT 4202496 Mpx.
GET MAPS TIME: 0.1465771198272705
MAP RES: (2398, 1702, 5)
NET INPUT 4

  line_coords = self.get_crop_inputs(baseline, heights, self.line_height)
  line_coords = crop_engine.get_crop_inputs(line.baseline, line.heights, 16)


NET INPUT 4202496 Mpx.
GET MAPS TIME: 0.14685726165771484
MAP RES: (2388, 1718, 5)
NET INPUT 4202496 Mpx.
GET MAPS TIME: 0.14847898483276367
MAP RES: (2389, 1726, 5)
NET INPUT 4202496 Mpx.
GET MAPS TIME: 0.14891529083251953
MAP RES: (2388, 1708, 5)
NET INPUT 4202496 Mpx.
GET MAPS TIME: 0.15073037147521973
MAP RES: (2389, 1724, 5)
NET INPUT 4202496 Mpx.
GET MAPS TIME: 0.15085291862487793
MAP RES: (2391, 1711, 5)
NET INPUT 4046848 Mpx.
GET MAPS TIME: 0.13994455337524414
MAP RES: (2385, 1648, 5)
NET INPUT 4202496 Mpx.
GET MAPS TIME: 0.15040063858032227
MAP RES: (2388, 1683, 5)
NET INPUT 4358144 Mpx.
GET MAPS TIME: 0.15889334678649902
MAP RES: (2385, 1733, 5)
NET INPUT 4202496 Mpx.
GET MAPS TIME: 0.14747071266174316
MAP RES: (2391, 1712, 5)
NET INPUT 4202496 Mpx.
GET MAPS TIME: 0.1505885124206543
MAP RES: (2388, 1726, 5)
NET INPUT 4202496 Mpx.
GET MAPS TIME: 0.1486060619354248
MAP RES: (2387, 1711, 5)
NET INPUT 4358144 Mpx.
GET MAPS TIME: 0.15884184837341309
MAP RES: (2385, 1729, 5)
NET IN

  line_coords = self.get_crop_inputs(baseline, heights, self.line_height)
  line_coords = crop_engine.get_crop_inputs(line.baseline, line.heights, 16)


NET INPUT 4046848 Mpx.
GET MAPS TIME: 0.14107131958007812
MAP RES: (2396, 1603, 5)
NET INPUT 4202496 Mpx.
GET MAPS TIME: 0.14958786964416504
MAP RES: (2385, 1686, 5)
NET INPUT 3891200 Mpx.
GET MAPS TIME: 0.13425898551940918
MAP RES: (2392, 1598, 5)
NET INPUT 4202496 Mpx.
GET MAPS TIME: 0.1470789909362793
MAP RES: (2387, 1681, 5)
NET INPUT 4046848 Mpx.
GET MAPS TIME: 0.14192914962768555
MAP RES: (2392, 1623, 5)
NET INPUT 4202496 Mpx.
GET MAPS TIME: 0.14894843101501465
MAP RES: (2394, 1681, 5)
NET INPUT 4046848 Mpx.
GET MAPS TIME: 0.14611315727233887
MAP RES: (2392, 1620, 5)
NET INPUT 4202496 Mpx.
GET MAPS TIME: 0.14972162246704102
MAP RES: (2389, 1681, 5)
NET INPUT 4046848 Mpx.
GET MAPS TIME: 0.14708185195922852
MAP RES: (2387, 1609, 5)
NET INPUT 4202496 Mpx.
GET MAPS TIME: 0.1515655517578125
MAP RES: (2388, 1681, 5)
NET INPUT 4046848 Mpx.
GET MAPS TIME: 0.14302849769592285
MAP RES: (2396, 1646, 5)
NET INPUT 4202496 Mpx.
GET MAPS TIME: 0.146071195602417
MAP RES: (2393, 1685, 5)
NET INPU

  line_coords = self.get_crop_inputs(baseline, heights, self.line_height)
  line_coords = crop_engine.get_crop_inputs(line.baseline, line.heights, 16)


NET INPUT 3108864 Mpx.
GET MAPS TIME: 0.11125612258911133
MAP RES: (2087, 1451, 5)
NET INPUT 3108864 Mpx.
NET INPUT 5107712 Mpx.
GET MAPS TIME: 0.26009583473205566
MAP RES: (2692, 1849, 5)
NET INPUT 4988928 Mpx.
GET MAPS TIME: 0.1761000156402588
MAP RES: (2661, 1842, 5)
NET INPUT 4988928 Mpx.
GET MAPS TIME: 0.17500019073486328
MAP RES: (2684, 1847, 5)
NET INPUT 5160960 Mpx.
GET MAPS TIME: 0.1811199188232422
MAP RES: (2685, 1858, 5)
NET INPUT 5160960 Mpx.
GET MAPS TIME: 0.18631315231323242
MAP RES: (2685, 1862, 5)
NET INPUT 4988928 Mpx.
GET MAPS TIME: 0.17668747901916504
MAP RES: (2675, 1848, 5)
NET INPUT 5160960 Mpx.
GET MAPS TIME: 0.1882181167602539
MAP RES: (2680, 1865, 5)
NET INPUT 4988928 Mpx.
NET INPUT 3203072 Mpx.
GET MAPS TIME: 0.2726740837097168
MAP RES: (2122, 1461, 5)
NET INPUT 3342336 Mpx.
GET MAPS TIME: 0.11647295951843262
MAP RES: (2122, 1477, 5)
NET INPUT 3203072 Mpx.
GET MAPS TIME: 0.11294722557067871
MAP RES: (2124, 1471, 5)
NET INPUT 3342336 Mpx.
GET MAPS TIME: 0.11597

  line_coords = self.get_crop_inputs(baseline, heights, self.line_height)


NET INPUT 2973696 Mpx.
GET MAPS TIME: 0.10639762878417969
MAP RES: (2056, 1404, 5)
NET INPUT 3244032 Mpx.
GET MAPS TIME: 0.11777448654174805
MAP RES: (2057, 1505, 5)
NET INPUT 2973696 Mpx.
GET MAPS TIME: 0.10826826095581055
MAP RES: (2051, 1404, 5)
NET INPUT 3244032 Mpx.
NET INPUT 5160960 Mpx.
GET MAPS TIME: 0.27140283584594727
MAP RES: (2634, 1896, 5)
NET INPUT 4702208 Mpx.
GET MAPS TIME: 0.16570401191711426
MAP RES: (2614, 1784, 5)
NET INPUT 5038080 Mpx.
NET INPUT 3244032 Mpx.
GET MAPS TIME: 0.2822587490081787
MAP RES: (2058, 1498, 5)
NET INPUT 2883584 Mpx.
GET MAPS TIME: 0.10543966293334961
MAP RES: (2046, 1408, 5)
NET INPUT 3244032 Mpx.
GET MAPS TIME: 0.11890292167663574
MAP RES: (2074, 1509, 5)
NET INPUT 2883584 Mpx.
GET MAPS TIME: 0.10557317733764648
MAP RES: (2048, 1401, 5)
NET INPUT 3244032 Mpx.
GET MAPS TIME: 0.11511898040771484
MAP RES: (2074, 1510, 5)
NET INPUT 2973696 Mpx.
GET MAPS TIME: 0.10875487327575684
MAP RES: (2052, 1398, 5)
NET INPUT 3244032 Mpx.
NET INPUT 5038080 M

  line_coords = self.get_crop_inputs(baseline, heights, self.line_height)


NET INPUT 5038080 Mpx.
GET MAPS TIME: 0.1782674789428711
MAP RES: (2609, 1891, 5)
NET INPUT 4870144 Mpx.
GET MAPS TIME: 0.17124271392822266
MAP RES: (2604, 1833, 5)
NET INPUT 5206016 Mpx.
GET MAPS TIME: 0.19191932678222656
MAP RES: (2590, 1931, 5)
NET INPUT 4870144 Mpx.
GET MAPS TIME: 0.16856074333190918
MAP RES: (2600, 1809, 5)
NET INPUT 5038080 Mpx.
GET MAPS TIME: 0.1811811923980713
MAP RES: (2614, 1901, 5)
NET INPUT 4870144 Mpx.
GET MAPS TIME: 0.16950368881225586
MAP RES: (2603, 1816, 5)
NET INPUT 5038080 Mpx.
GET MAPS TIME: 0.17996668815612793
MAP RES: (2622, 1903, 5)
NET INPUT 4870144 Mpx.
GET MAPS TIME: 0.1725754737854004
MAP RES: (2596, 1816, 5)
NET INPUT 5038080 Mpx.
GET MAPS TIME: 0.18022441864013672
MAP RES: (2599, 1887, 5)
NET INPUT 4870144 Mpx.
GET MAPS TIME: 0.1679849624633789
MAP RES: (2604, 1824, 5)
NET INPUT 5038080 Mpx.
NET INPUT 3244032 Mpx.
GET MAPS TIME: 0.28159427642822266
MAP RES: (2087, 1524, 5)
NET INPUT 3108864 Mpx.
GET MAPS TIME: 0.11014771461486816
MAP RES: (
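The batch cell above derives each output directory from the input folder and file names. The sketch below isolates that naming rule as a pure function so it can be checked without running OCR. `ocr_output_dir` is a hypothetical helper, and the folder name `01_bestresult` is an assumed example; only the `_bestresult`, `_cropped.jpg` and `_ocr` suffixes come from the cell itself.

```python
import os


def ocr_output_dir(base_directory, folder, file):
    """Mirror the batch loop's naming: <base>/<prefix>_ocr/<page name>."""
    # Same rule as the loop: keep only the text before the first underscore.
    prefix = folder.split("_")[0]
    # Strip the suffix that marks a cropped page image.
    page_name = file.replace("_cropped.jpg", "")
    return os.path.join(base_directory, prefix + "_ocr", page_name)


example = ocr_output_dir("result", "01_bestresult", "01_01_00000001_cropped.jpg")
print(example)
```

Note that `split("_")[0]` keeps only the part before the first underscore, so two input folders such as `a_b_bestresult` and `a_c_bestresult` would both write to `a_ocr`.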

In [11]:
pwd

'/home/vivek/Desktop'

'/home/vivek/Desktop/pero_eu_cz_print_newspapers_2022-09-26'

In [8]:
cd  

'/home/vivek/Desktop/pero_eu_cz_print_newspapers_2022-09-26'

In [8]:
vivek@technik-MS-7D53:~$ pip list
Package                       Version
----------------------------- ------------
anaconda-anon-usage           0.4.3
anaconda-client               1.12.2
anaconda-cloud-auth           0.1.4
anaconda-navigator            2.5.2
anaconda-project              0.11.1
anyio                         3.5.0
arabic-reshaper               3.0.0
archspec                      0.2.1
argon2-cffi                   21.3.0
argon2-cffi-bindings          21.2.0
asttokens                     2.0.5
async-lru                     2.0.4
attrs                         23.1.0
Babel                         2.11.0
backcall                      0.2.0
backports.functools-lru-cache 1.6.4
backports.tempfile            1.0
backports.weakref             1.0.post1
beautifulsoup4                4.12.2
bleach                        4.1.0
boltons                       23.0.0
brnolm                        0.3.0
Brotli                        1.0.9
certifi                       2023.11.17
cffi                          1.16.0
chardet                       4.0.0
charset-normalizer            2.0.4
click                         8.1.7
clyent                        1.2.2
comm                          0.1.2
conda                         23.11.0
conda-build                   3.28.4
conda-content-trust           0.2.0
conda_index                   0.3.0
conda-libmamba-solver         23.12.0
conda-pack                    0.6.0
conda-package-handling        2.2.0
conda_package_streaming       0.9.0
conda-repo-cli                1.0.75
conda-token                   0.4.0
conda-verify                  3.4.2
contourpy                     1.1.1
cryptography                  41.0.7
cycler                        0.12.1
debugpy                       1.6.7
decorator                     5.1.1
defusedxml                    0.7.1
distro                        1.8.0
executing                     0.8.3
fastjsonschema                2.16.2
filelock                      3.13.1
fonttools                     4.47.2
fsspec                        2023.12.2
future                        0.18.3
idna                          3.4
imageio                       2.33.1
imgaug                        0.4.0
importlib-metadata            7.0.1
importlib-resources           6.1.1
ipykernel                     6.28.0
ipython                       8.12.2
jaraco.classes                3.2.1
jedi                          0.18.1
jeepney                       0.7.1
Jinja2                        3.1.2
joblib                        1.3.2
json5                         0.9.6
jsonpatch                     1.32
jsonpointer                   2.1
jsonschema                    4.19.2
jsonschema-specifications     2023.7.1
jupyter_client                8.6.0
jupyter_core                  5.5.0
jupyter-events                0.8.0
jupyter-lsp                   2.2.0
jupyter_server                2.10.0
jupyter_server_terminals      0.4.4
jupyterlab                    4.0.8
jupyterlab-pygments           0.1.2
jupyterlab_server             2.25.1
keyring                       23.13.1
kiwisolver                    1.4.5
lazy_loader                   0.3
Levenshtein                   0.23.0
libarchive-c                  2.9
libmambapy                    1.5.6
llvmlite                      0.33.0
lmdb                          1.4.1
lxml                          5.1.0
MarkupSafe                    2.1.3
matplotlib                    3.2.0
matplotlib-inline             0.1.6
menuinst                      2.0.1
mistune                       2.0.4
more-itertools                10.1.0
mpmath                        1.3.0
navigator-updater             0.4.0
nbclient                      0.8.0
nbconvert                     7.10.0
nbformat                      5.9.2
nest-asyncio                  1.5.6
networkx                      3.1
notebook                      7.0.6
notebook_shim                 0.2.3
numba                         0.50.0
numpy                         1.18.1
nvidia-cublas-cu12            12.1.3.1
nvidia-cuda-cupti-cu12        12.1.105
nvidia-cuda-nvrtc-cu12        12.1.105
nvidia-cuda-runtime-cu12      12.1.105
nvidia-cudnn-cu12             8.9.2.26
nvidia-cufft-cu12             11.0.2.54
nvidia-curand-cu12            10.3.2.106
nvidia-cusolver-cu12          11.4.5.107
nvidia-cusparse-cu12          12.1.0.106
nvidia-nccl-cu12              2.18.1
nvidia-nvjitlink-cu12         12.3.101
nvidia-nvtx-cu12              12.1.105
opencv-python                 4.5.4.60
overrides                     7.4.0
packaging                     23.1
pandocfilters                 1.5.0
parso                         0.8.3
pathlib                       1.0.1
pero-ocr                      0.6.1
pexpect                       4.8.0
pickleshare                   0.7.5
Pillow                        10.0.1
pip                           23.3.1
pkce                          1.0.3
pkginfo                       1.9.6
pkgutil_resolve_name          1.3.10
platformdirs                  3.10.0
pluggy                        1.0.0
ply                           3.11
pooch                         1.8.0
prometheus-client             0.14.1
prompt-toolkit                3.0.43
psutil                        5.9.0
ptyprocess                    0.7.0
pure-eval                     0.2.2
pyamg                         5.0.1
pycosat                       0.6.6
pycparser                     2.21
pydantic                      1.10.12
Pygments                      2.15.1
PyJWT                         2.4.0
pyOpenSSL                     23.2.0
pyparsing                     3.1.1
PyQt5                         5.15.10
PyQt5-sip                     12.13.0
PySocks                       1.7.1
python-dateutil               2.8.2
python-dotenv                 0.21.0
python-json-logger            2.0.7
pytz                          2023.3.post1
PyWavelets                    1.4.1
PyYAML                        6.0.1
pyzmq                         25.1.2
QtPy                          2.4.1
rapidfuzz                     3.6.1
referencing                   0.30.2
requests                      2.31.0
requests-toolbelt             1.0.0
rfc3339-validator             0.1.4
rfc3986-validator             0.1.1
rpds-py                       0.10.6
ruamel.yaml                   0.17.21
ruamel.yaml.clib              0.2.6
ruamel-yaml-conda             0.17.21
safe-gpu                      1.5.1
scikit-image                  0.17.1
scikit-learn                  1.3.2
scipy                         1.5.0
SecretStorage                 3.3.1
semver                        2.13.0
Send2Trash                    1.8.2
setuptools                    68.2.2
Shapely                       1.8.0
sip                           6.7.12
six                           1.16.0
sniffio                       1.3.0
soupsieve                     2.5
stack-data                    0.2.0
sympy                         1.12
terminado                     0.17.1
threadpoolctl                 3.2.0
tifffile                      2023.7.10
tinycss2                      1.2.1
tomli                         2.0.1
torch                         2.1.2
tornado                       6.3.3
tqdm                          4.65.0
traitlets                     5.7.1
triton                        2.1.0
typing_extensions             4.9.0
ujson                         5.4.0
urllib3                       1.26.18
wcwidth                       0.2.5
webencodings                  0.5.1
websocket-client              0.58.0
wheel                         0.41.2
zipp                          3.17.0
zstandard                     0.19.0


'/home/vivek/Desktop/pero_eu_cz_print_newspapers_2022-09-26'

# It is running now, please do not close it until Wednesday, thanks a lot!
