Note: This update includes information about using the local pix2text server, instructions for starting the server, and details about the new settings options.
Install Pix2Text
pip install pix2text
Start the server
p2t serve -l en -H 0.0.0.0 -p 8503
Download the zip file from releases
Import Plugin:
Logseq > Plugins > Load unpacked plugin
and point to the unzipped folder
Now you can use it offline without any external API.
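Before loading the plugin it can help to confirm that the server started above is actually listening. The following is a minimal sketch using only the Python standard library; the function name is_server_reachable is hypothetical, and the default host 127.0.0.1 and port 8503 are assumptions that match the example command, so adjust them if you started the server differently.

```python
import socket


def is_server_reachable(host: str = "127.0.0.1", port: int = 8503,
                        timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout.

    This only checks that something is listening on the port; it does not
    verify that the listener is a Pix2Text server.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host unreachable.
        return False


# Example usage (port 8503 is taken from the command above):
# print("Server reachable:", is_server_reachable())
```

If this returns False, check that the p2t process is still running and that the port matches the one passed with -p.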
Convert LaTeX formula images from clipboard to LaTeX code in Logseq using Transformers.
Use cases:
- Preparation of scientific presentations or papers
- Transcribing lectures
- Technical reports
- Self-study
(For me it was useful because I hate copying formulas by hand and I hate pasting screenshots of formulas into Logseq😅)
- /display-formula-ocr: Insert LaTeX code on a new line
- /inline-formula-ocr: Insert LaTeX code within a paragraph
Notes:
- The image in the clipboard must be a LaTeX formula image
- Initial use may be slow due to model loading
- With the free Hugging Face plan you can make about 30k calls per month
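To put the monthly figure in perspective, here is a rough back-of-the-envelope calculation. It assumes a 30-day month and takes the approximate 30k figure above at face value; the real limit depends on Hugging Face's current terms.

```python
MONTHLY_CALLS = 30_000   # approximate figure quoted in the notes above
DAYS_PER_MONTH = 30      # assumption: a 30-day month

# Average number of formula conversions available per day.
calls_per_day = MONTHLY_CALLS // DAYS_PER_MONTH
print(calls_per_day)  # 1000
```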
Manual + Hugging Face
- Requirements: Node.js, Yarn, Parcel, Hugging Face User Access Token
- Clone repo:
git clone https://github.com/olmobaldoni/logseq-formula-ocr-plugin.git
- Install dependencies:
cd logseq-formula-ocr-plugin && yarn && yarn build
- Enable developer mode:
Logseq > Settings > Advanced > Developer mode
- Import Plugin:
Logseq > Plugins > Load unpacked plugin
and point to the cloned repo
Marketplace + Hugging Face
- Requirements: Hugging Face User Access Token
- Search for
LaTeX Formula OCR
in the Logseq marketplace and install directly
Marketplace + Docker (Recommended)
- Requirements: Docker
- Search for
LaTeX Formula OCR
in the Logseq marketplace and install directly
- Pull image:
docker pull olmobaldoni/nougat-ocr-api:latest
- Run container:
docker run -d -p 80:80 olmobaldoni/nougat-ocr-api:latest
Manual + Pix2Text (Offline)
- Requirements: Node.js, Yarn, Parcel, Pix2Text
- Clone repo:
git clone https://github.com/vikasmistry/logseq-formula-ocr-plugin.git
- Install dependencies:
cd logseq-formula-ocr-plugin && yarn && yarn build
- Install Pix2Text Python package
- Start the server, e.g.
p2t serve -l en -H 0.0.0.0 -p 8503
- Enable developer mode:
Logseq > Settings > Advanced > Developer mode
- Import Plugin:
Logseq > Plugins > Load unpacked plugin
and point to the cloned repo
- In the plugin settings, enable the "Use Local API" option and set the "Local API Address" to the appropriate IP address and port (default is http://0.0.0.0:8503)
Note: For more information on how to use the other local API visit: https://github.com/olmobaldoni/LaTex-Formula-OCR-API
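A note on the "Local API Address": 0.0.0.0 is a bind address that tells the server to listen on all interfaces, and on some systems a client cannot connect to it directly. If the default address does not work, pointing the plugin at 127.0.0.1 (or the machine's LAN IP) with the same port may help. The sketch below illustrates that substitution; client_address is a hypothetical helper and is not part of the plugin.

```python
from urllib.parse import urlsplit, urlunsplit


def client_address(address: str) -> str:
    """Rewrite an address whose host is 0.0.0.0 to use 127.0.0.1 instead.

    Any other address is returned unchanged. Hypothetical helper for
    illustration only.
    """
    parts = urlsplit(address)
    if parts.hostname != "0.0.0.0":
        return address
    # Keep the port (if any) and swap only the host.
    netloc = "127.0.0.1" if parts.port is None else f"127.0.0.1:{parts.port}"
    return urlunsplit((parts.scheme, netloc, parts.path,
                       parts.query, parts.fragment))


print(client_address("http://0.0.0.0:8503"))  # http://127.0.0.1:8503
```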
Hugging Face
- In Hugging Face:
Settings > Access Tokens > New Token > Name+Role(read) > Generate a token
- In Logseq:
Plugins Settings > LaTeX Formula OCR > Hugging Face User Access Token
and paste the token.
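For context on what the token is for: Hugging Face's hosted APIs accept a User Access Token as a Bearer token in the Authorization header. The sketch below only builds that header and sends nothing; auth_headers is a hypothetical helper, and how the plugin itself attaches the token is not shown in this README.

```python
def auth_headers(token: str) -> dict[str, str]:
    """Build the Authorization header for a Hugging Face User Access Token.

    Hypothetical helper for illustration; it performs no network request.
    """
    token = token.strip()
    if not token:
        raise ValueError("Hugging Face token is empty")
    return {"Authorization": f"Bearer {token}"}


# "hf_example" is a placeholder, not a real token.
print(auth_headers("hf_example"))
```

Treat the token like a password: a read-role token, as described above, limits what it can do if it leaks.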
Local API
- In Logseq:
Plugins Settings > LaTeX Formula OCR > Use Local API
to switch between Hugging Face and local
The Hugging Face API may truncate responses (see Issue #2 and Issue #487).
Note: The Docker or local (Pix2Text) method is recommended for full functionality.
This plugin is based on nougat-latex-base, a fine-tune of facebook/nougat-base on im2latex-100k, made by NormXU.
Pix2Text: Used for the local OCR server.
In addition, this plugin was inspired by xxchan and their plugin logseq-ocr.
MIT