I can't use the CLI. I have been trying for many days but can't find any solution. #523

Closed
1 task done
alysonhower opened this issue May 26, 2024 · 3 comments

Comments

@alysonhower

alysonhower commented May 26, 2024

Issues

  • I have browsed through the Issues and confirmed that this is not a duplicate.

Umi-OCR version

2.1.0

Windows version

win11

OCR plugins used

PaddleOCR, RapidOCR

Reproduction steps

I followed the CLI tutorial at https://github.com/hiroi-sora/Umi-OCR/blob/main/docs/README_CLI.md

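Problem screenshots or related files (optional)

I humbly apologize for my terrible Chinese. Unfortunately, I am still learning and am at a very basic level, not yet able to convey complex information, so I am using a generic translator to go from Portuguese/English to Simplified Chinese.

The CLI itself is probably fine, but unfortunately I could not follow the advanced instructions to add PDF documents and apply double-layer OCR text to them. I either get no response or cannot get the files processed. I tried to follow the instructions at ( https://github.com/hiroi-sora/Umi-OCR/blob/main/docs/README_CLI.md ), but I believe I must be doing something wrong. I am trying to use the syntax: Umi-OCR.exe --call_qml [name] --func [function] [..paras]

Could you show me a simple command that adds one or two PDF documents and then runs the process that applies double-layer OCR text to them?

Thank you! Your application is probably the best among free and even closed-source solutions!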

@hiroi-sora
Owner

hiroi-sora commented May 27, 2024

I apologize for the imperfections in our English documentation. In the future, we will continue to focus on providing convenience for users of multiple languages.

Below is the solution to your request:

Objective: Use CLI to add a PDF document task and generate a double-layer searchable PDF.

This task is not covered by the preset quick commands, so it is somewhat cumbersome and takes several consecutive commands.

Procedure:

  1. (Optional) If the Batch Documents OCR tab is not currently open, open it:

    • 1.1. Query all current page templates:
    ./umi-ocr --all_pages
    • 1.2. It is known that the template_index of the BatchDOC tab is 3. Create this tab:
    ./umi-ocr --add_page 3
    • 1.3. Check if the BatchDOC module already exists:
    ./umi-ocr --all_modules
    • If BatchDOC_1 is found in Qml modules, then it is correct.
  2. Input the paths of multiple documents into the software:

    • Suppose you want to add the following files:
    C:\Users\My\Desktop\111.epub
    C:\Users\My\Desktop\222.pdf
    • Use the following command to input the document paths (the \ in the path needs to be changed to /):
    ./umi-ocr --call_qml BatchDOC --func addDocs '[\"C:/Users/My/Desktop/111.epub\",\"C:/Users/My/Desktop/222.pdf\"]'
    • Note that the format of the addDocs parameter is: '[\"document path\",\"document path\"...]'. Also, backslashes \ cannot be used in the path; / must be used instead.
  3. Start the task:

    ./umi-ocr --call_qml BatchDOC --func docStart
    • Currently, it is not possible to change the file type to be saved via CLI (the default is layered.pdf Double-layer Searchable Document). To add other save types, you must check them in the software interface.
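For readers who would rather script steps 2 and 3 than type them into a shell, here is a minimal Python sketch. It rests on assumptions that are not confirmed in this thread: that the executable can be started as `Umi-OCR.exe` from the current directory or PATH, and that addDocs accepts the same JSON-style array when it is passed as one argument. The helper names `add_docs_argv`, `doc_start_argv`, and `run` are hypothetical and are not part of Umi-OCR.

```python
import json
import subprocess
from pathlib import PureWindowsPath

# Assumption: the executable is reachable under this name; adjust to your install.
UMI_EXE = "Umi-OCR.exe"


def add_docs_argv(paths, exe=UMI_EXE):
    """Build the argument list for the addDocs call shown in step 2."""
    # as_posix() replaces backslashes with forward slashes, as step 2 requires.
    normalized = [PureWindowsPath(p).as_posix() for p in paths]
    # Compact array of quoted paths: ["path_1","path_2"]
    payload = json.dumps(normalized, separators=(",", ":"), ensure_ascii=False)
    return [exe, "--call_qml", "BatchDOC", "--func", "addDocs", payload]


def doc_start_argv(exe=UMI_EXE):
    """Build the argument list for the docStart call shown in step 3."""
    return [exe, "--call_qml", "BatchDOC", "--func", "docStart"]


def run(argv):
    """Run one command without going through a shell and return the result."""
    # Passing a list means no shell parses the quotes; Python escapes them itself.
    return subprocess.run(argv, capture_output=True, text=True, check=False)
```

A possible usage is `run(add_docs_argv([r"C:\Users\My\Desktop\222.pdf"]))` followed by `run(doc_start_argv())`, with the BatchDOC tab already open as in step 1. Because the array is passed as a single list element, the manual `\"` escaping from step 2 is not written by hand here. This has not been tested against Umi-OCR.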

@alysonhower
Author

./umi-ocr --call_qml BatchDOC --func docStart

Thank you for the quick response and detailed step-by-step instructions! Problem solved! With your help I was able to process the documents, although I ran into difficulties with a more recent version of PowerShell (version 7.4.2). To make it work I had to run the commands in Windows PowerShell version 5.1.22621.2506 (the default version preinstalled on Windows 11). If you are curious, here is how I proceeded:

  1. Initially, when running the commands as directed using PowerShell version 7.4.2, only the following commands appear to execute correctly: ".\Umi-OCR.exe --all_pages" and ".\Umi-OCR.exe --add_page 3". The above commands launch the application and open the Batch Documents OCR page respectively.

  2. After running the previous commands using PowerShell version 7.4.2, the following commands DO NOT run or DO NOT appear to run: ".\Umi-OCR.exe --call_qml BatchDOC --func addDocs '["C:/Users /account/Downloads/example.pdf"]'" and ".\Umi-OCR.exe --call_qml BatchDOC --func docStart". Although they have no visible effect, the message 'Calling "docStart" in main thread.' is returned as if something were happening, but I see no CPU load or memory usage and no document is created.

After several failed attempts with these commands, I realized I was using Windows Terminal, which in turn was launching PowerShell (version 7.4.2). So I ran the same commands in the same order, this time using Windows PowerShell version 5.1.22621.2506, and luckily everything worked correctly!

Thank you for your help! You're doing an awesome job and making my life easier so please wait for me to buy you some coffees or provide you with any help here if you need! Much love and affection from your friend in Brazil

@hiroi-sora
Owner

commands DO NOT run or DO NOT appear to run

This issue is most likely caused by incorrect parsing of double quotes " by Windows PowerShell. Additionally, the parsing rules differ between PowerShell and Terminal, resulting in different formats for the paths array that we need to input:

  • In PowerShell, the outermost layer of the array should be enclosed in single quotes ', and there must be a space before the opening double quote of each path. That is: addDocs '[■\"path_1\",■\"path_2\",■\"path_3\"]' (replace each ■ with a space).
  • In Terminal, the outermost layer of the array should be enclosed in double quotes ". That is: addDocs "[\"path_1\",\"path_2\",\"path_3\"]".

The above commands have been tested and work on the latest Windows 11. There may be slight differences in other versions of the system.
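To make the two quoting rules concrete, here is a small Python sketch that builds the addDocs parameter string in each form from a list of paths. The function names `powershell_param` and `terminal_param` are hypothetical. The sketch only follows the two patterns described above, and it assumes the paths contain no quote characters of their own.

```python
def _forward_slashes(path):
    # Backslashes are not accepted in the path, so convert them to "/".
    return path.replace("\\", "/")


def powershell_param(paths):
    """PowerShell form: single quotes outside, a space before each opening \\"."""
    items = ",".join(' \\"' + _forward_slashes(p) + '\\"' for p in paths)
    return "'[" + items + "]'"


def terminal_param(paths):
    """Terminal form: double quotes outside, no extra spaces."""
    items = ",".join('\\"' + _forward_slashes(p) + '\\"' for p in paths)
    return '"[' + items + ']"'


def add_docs_command(param, exe="./umi-ocr"):
    """Assemble the full command line to paste into the matching shell."""
    return exe + " --call_qml BatchDOC --func addDocs " + param
```

For example, `print(add_docs_command(powershell_param(["C:/Users/My/Desktop/222.pdf"])))` prints a line to paste into PowerShell, and swapping in `terminal_param` gives the Terminal variant. Which form a particular shell version accepts still has to be checked by hand, as the report above about PowerShell 7.4.2 shows.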

Microsoft makes something as simple as quoting strings a convoluted mess. 😂
