Build ffmpeg with the --enable-libtesseract option #297
Comments
OK got as far as this:
but it gives
hmm |
OK overcame that particular problem with |
next was fix |
hopefully got it in 9261995 |
Thank you for working on this. I will try it out over the weekend. |
Have you actually tested it yourself? Half a year ago I did a test compile with Tesseract myself, and it wasn't particularly easy.
I can compile Leptonica perfectly without it, but maybe it depends on the system, I don't know.
At first I also thought I could compile Leptonica and Tesseract without any external image library, because hey... there's libavcodec, right? You've probably already noticed LibTiff is obligatory, or Tesseract won't compile at all. But this way you're only satisfying the Tesseract compilation process, because you're not compiling Leptonica with LibTiff. This results in:
LibTiff seems to be rather crucial. The
My code:
build_libleptonica() {
  do_git_checkout https://github.com/DanBloomberg/leptonica.git
  cd leptonica_git
    export PKG_CONFIG="pkg-config --static" # Automatically detect all Leptonica's dependencies.
    generic_configure "--disable-programs"
    do_make_and_make_install
    unset PKG_CONFIG
  cd ..
}
build_libtesseract() {
  do_git_checkout https://github.com/tesseract-ocr/tesseract.git
  cd tesseract_git
    if [[ ! -f tesseract.pc.in.bak ]]; then
      sed -i.bak "s/-lpthread/-lpthread -lstdc++ -lws2_32/" tesseract.pc.in
    fi
    generic_configure_make_install
  cd ..
}
export LIBJP2K_CFLAGS="-DOPJ_STATIC -I$mingw_w64_x86_64_prefix/include/openjpeg-2.2 -I$mingw_w64_x86_64_prefix/include"
export LIBJP2K_LIBS="-L$mingw_w64_x86_64_prefix/lib -lopenjp2"
|
Ahh, I was probably missing the OPJ_STATIC there.
Thanks for the help!
How do you compile Leptonica with libtiff/png? Is it automatic?
bizarrely my libtiff-4.pc doesn't mention lzma even though it's
present...hmm...
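One quick way to check what a .pc file declares is to print its Libs.private field directly. The sketch below uses a made-up stand-in file, not the real libtiff-4.pc, and the helper name pc_private_libs is hypothetical. Note that a dependency may also be listed under Requires.private instead, which this sketch does not inspect.

```shell
#!/bin/sh
# Hypothetical helper: print the Libs.private entries of a .pc file
# (prints nothing if the field is absent).
pc_private_libs() {
  sed -n 's/^Libs\.private:[[:space:]]*//p' "$1"
}

# Made-up stand-in for a libtiff-4.pc that does not mention lzma.
pc=$(mktemp)
cat > "$pc" <<'EOF'
Name: libtiff
Description: stand-in file for illustration only
Version: 0.0
Libs: -L/usr/local/lib -ltiff
Libs.private: -lz -lm
EOF

private=$(pc_private_libs "$pc")

# Pad with spaces so that "-llzma" only matches as a whole word.
case " $private " in
  *" -llzma "*) lzma_listed=yes ;;
  *)            lzma_listed=no ;;
esac

echo "Libs.private: $private (lzma listed: $lzma_listed)"
rm -f "$pc"
```

If lzma is missing here even though liblzma is installed, a static link that pulls in libtiff will likely fail on lzma symbols unless the flag is added some other way.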
…On Sun, Feb 25, 2018 at 8:54 AM, Reino Wijnsma ***@***.***> wrote:
hopefully got it in 9261995
Have you actually tested it yourself? Half a year ago I did a test
compile with Tesseract myself, and it wasn't particularly easy.
+ # autoconf-archive is just for leptonica FWIW
I can compile Leptonica perfectly without it, but maybe it depends on the
system, I don't know.
+build_leptonica() {
+  do_git_checkout https://github.com/DanBloomberg/leptonica.git
+  cd leptonica_git
+    generic_configure "--without-libopenjpeg"
+    do_make_and_make_install
+  cd ..
+}
+
+build_libtiff() {
+  generic_download_and_make_and_install ftp://download.osgeo.org/libtiff/tiff-4.0.9.tar.gz
+}
+
+build_libtesseract() {
+  build_leptonica
+  build_libtiff # no disable option? odd...
+  do_git_checkout_and_make_install https://github.com/tesseract-ocr/tesseract.git
+  sed -i.bak 's/-ltesseract.*$/-ltesseract -lstdc++ -lws2_32/' $PKG_CONFIG_PATH/tesseract.pc # why does it needs winsock? LOL
+}
At first I also thought I could compile Leptonica and Tesseract without
any external image library, because hey... there's libavcodec, right?
You've probably already noticed LibTiff is obligatory, or Tesseract won't
compile at all. But this way you're only satisfying the Tesseract
compilation process, because you're not compiling Leptonica with LibTiff.
This results in:
ffprobe.exe -hide_banner -show_entries frame_tags=lavfi.ocr.text -f lavfi -i "movie='input.png',ocr"
Error in pixReadMemTiff: function not present
Error in pixReadMem: tiff: no pix returned
Error in pixaGenerateFontFromString: pix not made
Error in bmfCreate: font pixa not made
Error in pixWriteMemPng: function not present
ObjectCache(020BB810)::~ObjectCache(): WARNING! LEAK! object 02ABD388 still has count 1 (id [...]\tessdata/eng.traineddatalstm-punc-dawg)
ObjectCache(020BB810)::~ObjectCache(): WARNING! LEAK! object 02B92130 still has count 1 (id [...]\tessdata/eng.traineddatalstm-word-dawg)
ObjectCache(020BB810)::~ObjectCache(): WARNING! LEAK! object 02B84AC0 still has count 1 (id [...]\tessdata/eng.traineddatalstm-number-dawg)
LibTiff seems to be rather crucial. The pixWriteMemPng message also
worried me, so I compiled Leptonica with LibTiff *and* LibPNG. This
results in:
ffprobe.exe -hide_banner -show_entries frame_tags=lavfi.ocr.text -f lavfi -i "movie='input.png',ocr"
Input #0, lavfi, from 'movie='input.png',ocr':
Duration: N/A, start: 0.000000, bitrate: N/A
Stream #0:0: Video: rawvideo (444P / 0x50343434), yuv444p, 636x131 [SAR 1:1 DAR 636:131], 25 tbr, 25 tbn, 25 tbc
[FRAME]
TAG:lavfi.ocr.text=Er komt oorlog.
Ik Weet niet wanneer ... maar er komt oorlog.
[/FRAME]
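To tell the two outcomes above apart in a script, one could scan the captured ffprobe output for Leptonica's "function not present" messages. This is only a sketch: the helper name missing_leptonica_functions is hypothetical, and the sample log is just three of the error lines quoted above.

```shell
#!/bin/sh
# Hypothetical helper: read a log on stdin and print the name of every
# Leptonica function that reported "function not present".
missing_leptonica_functions() {
  sed -n 's/^Error in \([A-Za-z0-9_]*\): function not present$/\1/p'
}

# Three of the error lines from the failing run quoted above.
log='Error in pixReadMemTiff: function not present
Error in pixReadMem: tiff: no pix returned
Error in pixWriteMemPng: function not present'

missing=$(printf '%s\n' "$log" | missing_leptonica_functions)

if [ -n "$missing" ]; then
  echo "Leptonica was built without support for:"
  echo "$missing"
else
  echo "No missing Leptonica functions reported."
fi
```

For this sample it lists pixReadMemTiff and pixWriteMemPng, which matches the conclusion that both LibTiff and LibPNG support were missing from that Leptonica build.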
My code:
build_libleptonica() {
  do_git_checkout https://github.com/DanBloomberg/leptonica.git
  cd leptonica_git
    export PKG_CONFIG="pkg-config --static" # Automatically detect all Leptonica's dependencies.
    generic_configure "--disable-programs"
    do_make_and_make_install
    unset PKG_CONFIG
  cd ..
}
build_libtesseract() {
  do_git_checkout https://github.com/tesseract-ocr/tesseract.git
  cd tesseract_git
    if [[ ! -f tesseract.pc.in.bak ]]; then
      sed -i.bak "s/-lpthread/-lpthread -lstdc++ -lws2_32/" tesseract.pc.in
    fi
    generic_configure_make_install
  cd ..
}
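The .bak check in build_libtesseract presumably keeps the sed patch from being applied twice when the build script is re-run. A minimal sketch of that behaviour follows, using a made-up stand-in for tesseract.pc.in (the real file has more fields) and a hypothetical wrapper name patch_pc:

```shell
#!/bin/sh
workdir=$(mktemp -d)
pc="$workdir/tesseract.pc.in"

# Made-up stand-in for tesseract.pc.in, for illustration only.
printf '%s\n' 'Libs: -L${libdir} -ltesseract' 'Libs.private: -lpthread' > "$pc"

# Hypothetical wrapper around the sed call from build_libtesseract.
patch_pc() {
  # The backup written by "sed -i.bak" doubles as an "already patched" marker.
  if [ ! -f "$pc.bak" ]; then
    sed -i.bak "s/-lpthread/-lpthread -lstdc++ -lws2_32/" "$pc"
  fi
}

patch_pc
patch_pc   # second call is a no-op because the .bak file now exists

patched_line=$(grep 'lpthread' "$pc")
echo "$patched_line"
rm -rf "$workdir"
```

Without the guard, a second run would match -lpthread again and append -lstdc++ -lws2_32 a second time.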
- I've removed OpenJPEG some time ago in my repo. FFmpeg has full
support for it already, so there's no need for it anymore, if you ask me.
If you still want to use it and want to compile Leptonica with it
(even though you don't need to), then I suggest you recompile OpenJPEG.
Nowadays OpenJPEG automatically generates a pkgconfig file, which Leptonica
in turn is looking for. I've successfully compiled Leptonica this way, so
there's no need for --without-libopenjpeg.
Last year I still had to do this:
export LIBJP2K_CFLAGS="-DOPJ_STATIC -I$mingw_w64_x86_64_prefix/include/openjpeg-2.2 -I$mingw_w64_x86_64_prefix/include"
export LIBJP2K_LIBS="-L$mingw_w64_x86_64_prefix/lib -lopenjp2"
- export PKG_CONFIG="pkg-config --static" is necessary, or the
Tesseract compilation process will complain it can't find LibTiff's Libs.private:
-llzma!
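As a rough illustration of what --static changes: pkg-config only includes a package's Libs.private entries in its --libs output when --static is given. The sketch below uses a made-up package called demo (not the real libtiff-4.pc) and is skipped if pkg-config is not installed.

```shell
#!/bin/sh
# Made-up package "demo" in a temporary directory; for illustration only.
tmpdir=$(mktemp -d)
cat > "$tmpdir/demo.pc" <<'EOF'
Name: demo
Description: made-up package for illustration
Version: 1.0
Libs: -L/opt/demo/lib -ldemo
Libs.private: -llzma
EOF

plain_libs=
static_libs=
if command -v pkg-config >/dev/null 2>&1; then
  # Without --static, only the public Libs field is reported.
  plain_libs=$(PKG_CONFIG_PATH="$tmpdir" pkg-config --libs demo)
  # With --static, the Libs.private entries are reported as well.
  static_libs=$(PKG_CONFIG_PATH="$tmpdir" pkg-config --static --libs demo)
  echo "plain:  $plain_libs"
  echo "static: $static_libs"
else
  echo "pkg-config is not installed; skipping the demo" >&2
fi
rm -rf "$tmpdir"
```

Exporting PKG_CONFIG="pkg-config --static" before running configure makes configure scripts that honour the PKG_CONFIG variable behave like the second call.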
|
As you can see above, I don't have to manually set up the path to LibTiff or LibPNG (nor is it needed anymore for OpenJPEG), so yes, it is:
The initial 'lept.pc' (on my system at least):
After
Strange. This is when I build LibTiff with 'liblzma.pc' in
|
How hard would it be to build ffmpeg.exe to include the --enable-libtesseract option? I am interested in using the OCR feature. I am cross-compiling from Ubuntu 16.04 to win64. I successfully ran the cross_compile_ffmpeg.sh script to produce a 64-bit ffmpeg.exe. This was just an "out-of-the-box" build, without adding in --enable-libtesseract, as I just wanted to make sure my build system was properly working.
FWIW, OSX has ffmpeg via homebrew with tesseract and OCR:
https://twitter.com/dericed/status/786965160762155009
I tried the OSX version with OCR and it works great, but I really need a win64 version.
Thanks.