# Dealing with rotated documents

Sometimes you have to deal with documents that are rotated at the page level or that contain text in multiple orientations. This notebook shows how to handle such cases with the `docTR` library.

In [None]:
# Install docTR
#!pip install python-doctr[viz]
# From source
!pip install python-doctr[viz]@git+https://github.com/mindee/doctr.git

In [2]:
# Imports
import requests
import cv2

from doctr.io import DocumentFile
from doctr.models import ocr_predictor

Let's load an example and see how we can deal with it.

In [None]:
# Download a sample
!wget https://github.com/mindee/doctr/releases/download/v0.1.0/back_cover.jpg

# Display the image with matplotlib
import matplotlib.pyplot as plt

img = plt.imread('back_cover.jpg')
plt.imshow(img)
plt.axis('off')
plt.show()

As we can see, our document is slightly rotated.

We have several options to deal with it.

First, we should set `assume_straight_pages` to `False` to indicate that the predictor has to deal with possible rotations.
Second, we should set `detect_orientation` to `True` to get the orientation appended to our results.

If we only deal with small rotations, roughly in the range of -45 to 45 degrees, we can additionally disable the page orientation classification by setting `disable_page_orientation` to `True`. If our document contains only horizontal text, we can also set `disable_crop_orientation` to `True` to speed up the pipeline.

In [None]:
doc = DocumentFile.from_images(['back_cover.jpg'])
predictor = ocr_predictor(
    pretrained=True,
    det_arch="fast_base",
    reco_arch="parseq",
    assume_straight_pages=False,
    detect_orientation=True,
    disable_crop_orientation=True,
    disable_page_orientation=True,
    straighten_pages=False
)  # append .cuda().half() here when running on a GPU
result = predictor(doc)

# Visualize the result
result.show()

# Export the result to a JSON-like dictionary
json_export = result.export()
print(f"Detected orientation: {json_export['pages'][0]['orientation']['value']} degrees")
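The flag choices described above can be summarised in a small helper. This is only a sketch: `rotation_kwargs` and its two parameters are hypothetical names, not part of docTR, and the 45-degree threshold simply mirrors the rule of thumb given in the text. The helper only builds keyword arguments and does not load any model.

```python
def rotation_kwargs(max_abs_angle: float, horizontal_text_only: bool) -> dict:
    """Build ocr_predictor keyword arguments for possibly rotated documents.

    Hypothetical helper, not part of docTR.
    """
    return {
        # Always required when pages may be rotated
        "assume_straight_pages": False,
        # Append the detected page orientation to the results
        "detect_orientation": True,
        # Page orientation classification can be skipped for small rotations
        "disable_page_orientation": max_abs_angle <= 45,
        # Crop orientation classification can be skipped for horizontal-only text
        "disable_crop_orientation": horizontal_text_only,
    }


small = rotation_kwargs(max_abs_angle=10, horizontal_text_only=True)
large = rotation_kwargs(max_abs_angle=180, horizontal_text_only=False)
print(small)
print(large)
```

The resulting dictionary could then be unpacked into the predictor, for example `ocr_predictor(pretrained=True, **small)`.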

Let's see how it looks when we have to deal with larger rotations and page orientation classification is enabled.

In [None]:
from doctr.utils.geometry import rotate_image

doc = DocumentFile.from_images(['back_cover.jpg'])
# Let's rotate the document by 180 degrees
doc = [rotate_image(doc[0], 180, expand=False)]

predictor = ocr_predictor(
    pretrained=True,
    det_arch="fast_base",
    reco_arch="parseq",
    assume_straight_pages=False,
    detect_orientation=True,
    disable_crop_orientation=False,
    disable_page_orientation=False,
    straighten_pages=False
)  # append .cuda().half() here when running on a GPU
result = predictor(doc)

# Visualize the result
result.show()

# Export the result to a JSON-like dictionary
json_export = result.export()
print(f"Detected orientation: {json_export['pages'][0]['orientation']['value']} degrees")
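When a document has several pages, it can be handy to read the orientation of each page from the export in one go. The sketch below assumes the same `pages[i]['orientation']['value']` layout used in the cells above. `page_orientations` is a hypothetical helper, not part of docTR, and the dictionary it is run on is a hand-written stand-in with made-up values rather than real predictor output.

```python
def page_orientations(export: dict) -> list:
    """Return the detected orientation (in degrees) of every page in an export.

    Hypothetical helper, not part of docTR. Pages without an orientation
    entry yield None instead of raising a KeyError.
    """
    values = []
    for page in export.get("pages", []):
        orientation = page.get("orientation") or {}
        values.append(orientation.get("value"))
    return values


# Hand-written stand-in for result.export(); the numbers are made up
fake_export = {
    "pages": [
        {"orientation": {"value": 180, "confidence": 0.99}},
        {"orientation": {"value": None, "confidence": None}},
    ]
}
print(page_orientations(fake_export))
```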

Now let's correct this by setting `straighten_pages` to `True`.

In [None]:
from doctr.utils.geometry import rotate_image

doc = DocumentFile.from_images(['back_cover.jpg'])
# Let's rotate the document by 180 degrees
doc = [rotate_image(doc[0], 180, expand=False)]

predictor = ocr_predictor(
    pretrained=True,
    det_arch="fast_base",
    reco_arch="parseq",
    assume_straight_pages=False,
    detect_orientation=True,
    disable_crop_orientation=False,
    disable_page_orientation=False,
    straighten_pages=True
)  # append .cuda().half() here when running on a GPU
result = predictor(doc)

# Visualize the result
result.show()

# Export the result to a JSON-like dictionary
json_export = result.export()
print(f"Detected orientation: {json_export['pages'][0]['orientation']['value']} degrees")
print()
print(f"Extracted text:\n{result.render()}")
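To build intuition for what straightening an upside-down page means, here is a minimal NumPy sketch. It does not reproduce docTR's internal implementation; it only illustrates that undoing a 180-degree rotation amounts to applying another 180-degree rotation. The tiny array is a stand-in for a real page image.

```python
import numpy as np

# Tiny stand-in for a page image of shape (H, W, C)
page = np.arange(2 * 3 * 3, dtype=np.uint8).reshape(2, 3, 3)

# An upside-down page is the original rotated by 180 degrees in the image plane
upside_down = np.rot90(page, k=2, axes=(0, 1))

# Straightening applies the inverse rotation, which for 180 degrees is again 180 degrees
straightened = np.rot90(upside_down, k=2, axes=(0, 1))

print(np.array_equal(straightened, page))
```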