From b7caeecc9ef8bdaaa5bae4aa0a38777f2ddcb0cf Mon Sep 17 00:00:00 2001
From: nitink <nitinkumar96@live.com>
Date: Tue, 6 Feb 2024 15:19:48 +0530
Subject: [PATCH 1/3] Updated ImageToTextV2 documentation

---
 docs/en/ocr_pipeline_components.md | 22 ++++++++++++++++++++--
 1 file changed, 20 insertions(+), 2 deletions(-)
diff --git a/docs/en/ocr_pipeline_components.md b/docs/en/ocr_pipeline_components.md
index 9277022fde..db8a90a548 100644
--- a/docs/en/ocr_pipeline_components.md
+++ b/docs/en/ocr_pipeline_components.md
@@ -3211,6 +3211,8 @@ others. One could almost say they feed on and grow on ideas.
 
 `ImageToTextV2` can work on CPU, but GPU is preferred in order to achieve acceptable performance.
 
+`ImageToTextV2` can be used with caching enabled.
+
 `ImageToTextV2` can receive regions representing single line texts, or regions coming from a text detection model.
 
 </div><div class="h3-box" markdown="1">
@@ -3221,6 +3223,7 @@ others. One could almost say they feed on and grow on ideas.
 | Param name | Type | Default | Column Data Description |
 | --- | --- | --- | --- |
 | inputCols | Array[string] | [image] | Can use as input image struct ([Image schema](ocr_structures#image-schema))  and regions. |
+| regionsColumn | string | regions | Input column containing regions to be processed. |
 
 </div><div class="h3-box" markdown="1">
 
@@ -3232,6 +3235,14 @@ others. One could almost say they feed on and grow on ideas.
 | lineTolerance | integer | 15 | Line tolerance in pixels. It's used for grouping text regions by lines. |
 | borderWidth | integer | 5 | A value of more than 0 enables to border text regions with width equal to the value of the parameter. |
 | spaceWidth | integer | 10 | A value of more than 0 enables to add white spaces between words on the image. |
+| limitMultiplier | float | 1.5 | Used to control the length of the final output text ,a higher value will result in longer text sequence if available. Defaults to 1.5 |
+| maxImageRatio | float | 11.25 | Value for the width/height ratio of images that are fed to the model. Large values reduce inference time, but may cause the model to diverge. Defaults to 11.25. |
+| groupImages | boolean | True | Whether to group images to maximize detection quality or not. |
+| batchSize | integer | 3 | Number of text patches to feed the model at the same time. |
+| taskParallelism | integer | 8 | How many threads to use when processing a single region. |
+| useGPU | boolean | False | Whether to run inference on the GPU. |
+| useCaching | boolean | True | Whether to enable caching. |
+| keepInput | boolean | True | Whether to preserve the input column in the output. |
 
 </div><div class="h3-box" markdown="1">
 
@@ -3240,7 +3251,9 @@ others. One could almost say they feed on and grow on ideas.
 {:.table-model-big}
 | Param name | Type | Default | Column Data Description |
 | --- | --- | --- | --- |
-| outputCol | string | text | Recognized text |
+| outputCol | string | text | Recognized text. |
+| positionsCol | string | positions | Name of the output column for positions. |
+| outputFormat | Enum | OcrOutputFormat.TEXT | Format of the returned output. |
 
 **Example:**
 
@@ -3251,6 +3264,7 @@ others. One could almost say they feed on and grow on ideas.
 ```python
 from pyspark.ml import PipelineModel
 from sparkocr.transformers import *
+from sparkocr.enums import *
 
 imagePath = "path to image"
 
@@ -3271,7 +3285,11 @@ text_detector = ImageTextDetectorV2 \
     .setSizeThreshold(20)
 
 ocr = ImageToTextV2.pretrained("ocr_base_printed", "en", "clinical/ocr") \
-    .setInputCols(["image", "text_regions"]) \
+    .setInputCols(["image"]) \
+    .setRegionsColumn("text_regions") \
+    .setUseGPU(True) \
+    .setUseCaching(True) \
+    .setOutputFormat(OcrOutputFormat.TEXT) \
     .setOutputCol("text")
 
 # Define pipeline

From cd0a52685914108c4276063f86ddab1333bfda6c Mon Sep 17 00:00:00 2001
From: nitink <nitinkumar96@live.com>
Date: Tue, 6 Feb 2024 15:28:57 +0530
Subject: [PATCH 2/3] updates ImageToTextV2

---
 docs/en/ocr_pipeline_components.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/docs/en/ocr_pipeline_components.md b/docs/en/ocr_pipeline_components.md
index db8a90a548..ea19f6cd4f 100644
--- a/docs/en/ocr_pipeline_components.md
+++ b/docs/en/ocr_pipeline_components.md
@@ -3235,8 +3235,8 @@ others. One could almost say they feed on and grow on ideas.
 | lineTolerance | integer | 15 | Line tolerance in pixels. It's used for grouping text regions by lines. |
 | borderWidth | integer | 5 | A value of more than 0 enables to border text regions with width equal to the value of the parameter. |
 | spaceWidth | integer | 10 | A value of more than 0 enables to add white spaces between words on the image. |
-| limitMultiplier | float | 1.5 | Used to control the length of the final output text ,a higher value will result in longer text sequence if available. Defaults to 1.5 |
-| maxImageRatio | float | 11.25 | Value for the width/height ratio of images that are fed to the model. Large values reduce inference time, but may cause the model to diverge. Defaults to 11.25. |
+| limitMultiplier | float | 1.5 | Used to control the length of the final output text ,a higher value will result in longer text sequence if available. |
+| maxImageRatio | float | 11.25 | Value for the width/height ratio of images that are fed to the model. Large values reduce inference time, but may cause the model to diverge. |
 | groupImages | boolean | True | Whether to group images to maximize detection quality or not. |
 | batchSize | integer | 3 | Number of text patches to feed the model at the same time. |
 | taskParallelism | integer | 8 | How many threads to use when processing a single region. |

From a29fe9d0ee4e0fdf96402c9ef77a7b6878591c9a Mon Sep 17 00:00:00 2001
From: Nitin Kumar <72322393+nogifeet@users.noreply.github.com>
Date: Wed, 14 Feb 2024 17:00:35 +0530
Subject: [PATCH 3/3] Update ocr_pipeline_components.md

Remove edge case param limitMultiplier
---
 docs/en/ocr_pipeline_components.md | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/docs/en/ocr_pipeline_components.md b/docs/en/ocr_pipeline_components.md
index ea19f6cd4f..873bf9dcd9 100644
--- a/docs/en/ocr_pipeline_components.md
+++ b/docs/en/ocr_pipeline_components.md
@@ -3235,7 +3235,6 @@ others. One could almost say they feed on and grow on ideas.
 | lineTolerance | integer | 15 | Line tolerance in pixels. It's used for grouping text regions by lines. |
 | borderWidth | integer | 5 | A value of more than 0 enables to border text regions with width equal to the value of the parameter. |
 | spaceWidth | integer | 10 | A value of more than 0 enables to add white spaces between words on the image. |
-| limitMultiplier | float | 1.5 | Used to control the length of the final output text ,a higher value will result in longer text sequence if available. |
 | maxImageRatio | float | 11.25 | Value for the width/height ratio of images that are fed to the model. Large values reduce inference time, but may cause the model to diverge. |
 | groupImages | boolean | True | Whether to group images to maximize detection quality or not. |
 | batchSize | integer | 3 | Number of text patches to feed the model at the same time. |
@@ -4409,4 +4408,4 @@ Output:
 
 ```
 
-</div>
\ No newline at end of file
+</div>