Merged
96 changes: 96 additions & 0 deletions docs/OCR.txt
@@ -0,0 +1,96 @@

Overview
========
OCR (Optical Character Recognition) is a technique used to
extract text from images. In the world of subtitles, subtitles stored
in bitmap format are common and even necessary. OCR is used to convert
subtitles in bitmap format to subtitles in text format.

Dependencies
============
Tesseract (OCR library by Google)
Leptonica (image processing library)

How to compile CCExtractor on Linux with OCR
=============================================

Download and install Leptonica.
-------------------------------
This package is available from many distributions; you need the liblept-devel library.

If Leptonica isn't available for your distribution, or you want to use a newer version
than they offer, you can compile your own.

You can download Leptonica from http://www.leptonica.com/download.html

Download and Install Tesseract.
-------------------------------
Tesseract is available directly from many Linux distributions. The package is generally
called 'tesseract' or 'tesseract-ocr' - search your distribution's repositories to
find it. Packages are also generally available for language training data (search the
repositories), but if not you will need to download the appropriate training data,
unpack it, and copy the .traineddata file into the 'tessdata' directory, probably
/usr/share/tesseract-ocr/tessdata or /usr/share/tessdata.

If Tesseract isn't available for your distribution, or you want to use a newer version
than they offer, you can compile your own.

If you compile Tesseract yourself, the following commands, run in its source directory, are enough:
./autogen.sh
./configure
make
sudo make install
sudo ldconfig

Note:
1) CCExtractor is tested with Tesseract version 3.02.02.

You can download Tesseract from https://drive.google.com/folderview?id=0B7l10Bj_LprhQnpSRkpGMGV2eE0&usp=sharing



Compile CCExtractor, passing the flag as follows
-------------------------------------------------
make ENABLE_OCR=yes


How to compile CCExtractor on Windows with OCR
==============================================

Download the prebuilt Leptonica library from the following link:
http://www.leptonica.com/source/leptonica-1.68-win32-lib-include-dirs.zip

Download the prebuilt Tesseract library from the following official Tesseract link:
https://code.google.com/p/tesseract-ocr/downloads/detail?name=tesseract-3.02.02-win32-lib-include-dirs.zip

Add the lib/include paths of Leptonica and Tesseract to the library paths.
Step 1) In Visual Studio 2013, right-click <Project> and select Properties.
Step 2) Select Configuration Properties in the left panel (column) of the properties window.
Step 3) Select VC++ Directories.
Step 4) In the right pane, in the right-hand column of the VC++ Directories property,
open the drop-down menu and choose Edit.
Step 5) Add the path of the directory where you have kept the uncompressed Leptonica
and Tesseract libraries.


Set the preprocessor flag ENABLE_OCR=1
Step 1) In Visual Studio 2013, right-click <Project> and select Properties.
Step 2) In the left panel, select Configuration Properties, C/C++, Preprocessor.
Step 3) In the right panel, in the right-hand column of the Preprocessor Definitions property, open the drop-down menu and choose Edit.
Step 4) In the Preprocessor Definitions dialog box, add ENABLE_OCR=1. Choose OK to save your changes.
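
As a rough illustration of what the ENABLE_OCR definition controls, the sketch
below shows a function whose body is chosen at compile time. The names
recognise_text and ocr_is_compiled_in are hypothetical and are not part of
CCExtractor; the placeholder string stands in for a real call into Tesseract.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an OCR entry point such as ocr_bitmap().
 * With ENABLE_OCR defined it returns text; without it, it returns NULL. */
static const char *recognise_text(void)
{
#ifdef ENABLE_OCR
    /* A real build would call into Tesseract here. */
    return "text from the OCR engine";
#else
    /* OCR support was compiled out. */
    return NULL;
#endif
}

/* Reports whether this translation unit was built with ENABLE_OCR. */
static int ocr_is_compiled_in(void)
{
#ifdef ENABLE_OCR
    return 1;
#else
    return 0;
#endif
}
```

If the definition is missing, the project still builds, but the OCR branch is
never compiled, so checking the flag is a reasonable first step when no text
appears in the output.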

Add the libraries to the linker
Step 1) Open the project's properties.
Step 2) Select Configuration Properties.
Step 3) Select Linker in the left panel (column).
Step 4) Select Input.
Step 5) Select Additional Dependencies in the right panel.
Step 6) Add libtesseract302.lib on a new line.
Step 7) Add liblept168.lib on a new line.

Download the language data from the following link:
https://code.google.com/p/tesseract-ocr/downloads/list
After downloading tesseract-ocr-3.02.eng.tar, extract the tar file and put the
tessdata folder where you have kept the ccextractor executable.

Copy the Tesseract and Leptonica DLLs into the folder of the executable or into system32.
5 changes: 5 additions & 0 deletions linux/Makefile
@@ -35,6 +35,11 @@ INSTLALL = cp -f -p
INSTLALL_PROGRAM = $(INSTLALL)
DESTDIR = /usr/bin

ifeq ($(ENABLE_OCR),yes)
CFLAGS+=-I/usr/local/include/tesseract -I/usr/local/include/leptonica
CFLAGS+=-DENABLE_OCR
LDFLAGS+= $(shell pkg-config --libs tesseract)
endif
.PHONY: all
all: objs_dir $(TARGET)

105 changes: 70 additions & 35 deletions src/dvb_subtitle_decoder.c
@@ -33,6 +33,7 @@

#include "dvb_subtitle_decoder.h"
#include "spupng_encoder.h"
#include "ocr.h"
#define DEBUG

#ifdef DEBUG
@@ -255,9 +256,17 @@ int mapclut_paletee(png_color *palette, png_byte *alpha, uint32_t *clut,
}
return 0;
}

/*
* @param alpha out
* @param intensity in
* @param palette out should be already initialized
* @param bitmap in
* @param size in size of bitmap
* @param max_color in
* @param nb_color in
*/
int quantize_map(png_byte *alpha, uint8_t *intensity, png_color *palette,
uint8_t *bitmap, int h, int w, int max_color, int nb_color)
uint8_t *bitmap, int size, int max_color, int nb_color)
{
/*
* occurrence of color in image
@@ -301,12 +310,9 @@ int quantize_map(png_byte *alpha, uint8_t *intensity, png_color *palette,
memset(mcit, 0, nb_color * sizeof(uint32_t));

/* calculate histogram of image */
for (int i = 0; i < h; i++)
for (int i = 0; i < size; i++)
{
for (int j = 0; j < w; j++)
{
histogram[bitmap[i * w + (j)]]++;
}
histogram[bitmap[i]]++;
}
sort_intensity_wise((uint8_t*) alpha, (uint8_t*) intensity, iot, nb_color);
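
Review note: the flattened loop gives the same histogram as the nested one because the count for each palette index does not depend on where a pixel sits in the image. A minimal sketch of that equivalence follows; the helper names index_histogram and index_histogram_2d are hypothetical, and a fixed 256-entry table is assumed for 8-bit indices.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: count how often each palette index occurs in an
 * indexed bitmap of `size` bytes. hist must have room for 256 counters. */
static void index_histogram(const uint8_t *bitmap, size_t size, uint32_t hist[256])
{
    assert(bitmap != NULL || size == 0);
    memset(hist, 0, 256 * sizeof(hist[0]));
    for (size_t i = 0; i < size; i++)
        hist[bitmap[i]]++;
}

/* The pre-change shape of the loop (rows and columns), kept for comparison. */
static void index_histogram_2d(const uint8_t *bitmap, int h, int w, uint32_t hist[256])
{
    memset(hist, 0, 256 * sizeof(hist[0]));
    for (int i = 0; i < h; i++)
        for (int j = 0; j < w; j++)
            hist[bitmap[i * w + j]]++;
}
```

Calling both on the same buffer with size equal to h * w should fill identical tables, which is what lets the caller pass width*height instead of two dimensions.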

@@ -365,52 +371,65 @@ int quantize_map(png_byte *alpha, uint8_t *intensity, png_color *palette,
freep(&iot);
return ret;
}
static int save_spupng(const char *filename, uint8_t *bitmap, int w, int h,
uint32_t *clut, uint8_t *luit,uint8_t depth)
{
FILE *f = NULL;
png_structp png_ptr = NULL;
png_infop info_ptr = NULL;
png_bytep* row_pointer = NULL;
int i, j, ret;
int k = 0;

png_color *palette = NULL;
png_byte *alpha = NULL;

static int pre_process_bitmap(png_color **palette, png_byte **alpha, int size,
uint32_t *clut, uint8_t *luit, uint8_t *bitmap, uint8_t depth)
{
/*local pointer to palette */
png_color *lpalette = NULL;
/* local pointer to alpha */
png_byte *lalpha = NULL;
int nb_color = (1<< depth);
int ret = 0;

if(!h)
h = 1;
if(!w)
w = 1;

palette = (png_color*) malloc(nb_color * sizeof(png_color));
if(!palette)
lpalette = (png_color*) malloc(nb_color * sizeof(png_color));
if(!lpalette)
{
ret = -1;
goto end;
}
alpha = (png_byte*) malloc(nb_color * sizeof(png_byte));
if(!alpha)
lalpha = (png_byte*) malloc(nb_color * sizeof(png_byte));
if(!lalpha)
{
ret = -1;
goto end;
}


if(clut)
mapclut_paletee(palette, alpha, clut, nb_color);
mapclut_paletee(lpalette, lalpha, clut, nb_color);
else
{
/* initialize colors with white */
memset(palette,0xff,sizeof(nb_color * sizeof(*palette)));
memset(lpalette,0xff,nb_color * sizeof(*lpalette));

/* initialize transparency as complete transparent */
memset(alpha,0,sizeof(nb_color * sizeof(*alpha)));
memset(lalpha,0,nb_color * sizeof(*lalpha));
}

if(bitmap)
quantize_map(alpha, luit, palette, bitmap, h, w, 3, nb_color);
{
quantize_map(lalpha, luit, lpalette, bitmap, size, 3, nb_color);
}
*palette = lpalette;
*alpha = lalpha;
end:
return ret;
}
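
Review note: when filling a freshly allocated table with memset, the byte count is the element count times the element size. Wrapping that product in another sizeof yields the size of a size_t instead, so only the first few bytes would be written. A minimal sketch under stated assumptions: struct rgb is a hypothetical stand-in for libpng's png_color, and alloc_white_palette is not a function from this patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for png_color: three 8-bit channels. */
struct rgb { uint8_t red, green, blue; };

/* Allocate nb_color entries and set every byte to 0xff (white).
 * Returns NULL on allocation failure; the caller frees the result. */
static struct rgb *alloc_white_palette(size_t nb_color)
{
    struct rgb *pal = malloc(nb_color * sizeof(*pal));
    if (!pal)
        return NULL;
    /* The length is nb_color * sizeof(*pal). Writing
     * sizeof(nb_color * sizeof(*pal)) would give sizeof(size_t) instead. */
    memset(pal, 0xff, nb_color * sizeof(*pal));
    return pal;
}
```

Note also that the memset target has to be the local buffer itself, not the address of the caller's pointer.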
static int save_spupng(const char *filename, uint8_t *bitmap, int w, int h,
png_color *palette, png_byte *alpha, int nb_color)
{
FILE *f = NULL;
png_structp png_ptr = NULL;
png_infop info_ptr = NULL;
png_bytep* row_pointer = NULL;
int i, j, ret = 0;
int k = 0;
if(!h)
h = 1;
if(!w)
w = 1;


f = fopen(filename, "wb");
if (!f)
@@ -791,6 +810,11 @@ static void save_display_set(DVBSubContext *ctx)

if (x_pos >= 0)
{
png_color *palette = NULL;
png_byte *alpha = NULL;
#ifdef ENABLE_OCR
char*str = NULL;
#endif

filename = get_spupng_filename(sp);
inc_spupng_fileindex(sp);
@@ -821,17 +845,28 @@ static void save_display_set(DVBSubContext *ctx)
}

}
save_spupng(filename, pbuf, width, height,
clut->clut16, clut->ilut16,region->depth);
pre_process_bitmap(&palette,&alpha,width*height,clut->clut16, clut->ilut16,pbuf,region->depth);
#ifdef ENABLE_OCR
str = ocr_bitmap(palette,alpha,pbuf,width,height);
if(str)
{
write_spucomment(sp,str);
}
#endif
save_spupng(filename, pbuf, width, height, palette, alpha,(1 << region->depth));

free(pbuf);
freep(&palette);
freep(&alpha);
}
else if(!ctx->prev_start)
{
png_color palette = {0,0,0};
png_byte alpha = 0;
filename = get_spupng_filename(sp);
inc_spupng_fileindex(sp);
/* save dummy frame */
save_spupng(filename,NULL,1,1,NULL,NULL,0);
save_spupng(filename,NULL,1,1,&palette,&alpha,1);

}
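
Review note: the call site relies on both out-parameters being either valid buffers or NULL. One way to guarantee that is to allocate into locals and publish them only when every allocation succeeded. The sketch below uses plain byte buffers rather than the libpng types; alloc_tables is a hypothetical name and the sizes (3 bytes per colour, 1 byte per alpha entry) are assumptions for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical allocator with two out-parameters.
 * On success returns 0 and stores zero-filled buffers in *palette and *alpha.
 * On failure returns -1, frees anything it allocated, and stores NULL in both. */
static int alloc_tables(uint8_t **palette, uint8_t **alpha, size_t nb_color)
{
    uint8_t *lpalette = calloc(nb_color, 3); /* 3 bytes per colour entry */
    uint8_t *lalpha = calloc(nb_color, 1);   /* 1 byte per alpha entry */

    assert(palette != NULL && alpha != NULL);
    if (!lpalette || !lalpha) {
        free(lpalette); /* free(NULL) is a no-op */
        free(lalpha);
        *palette = NULL;
        *alpha = NULL;
        return -1;
    }
    *palette = lpalette;
    *alpha = lalpha;
    return 0;
}
```

A caller can then test the return value before handing the buffers to the OCR and PNG stages, and free both unconditionally afterwards.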

1 change: 1 addition & 0 deletions src/dvb_subtitle_decoder.h
@@ -86,6 +86,7 @@ int parse_dvb_description(struct dvb_config* cfg, unsigned char*data,
*
*/
void dvbsub_set_write(void *dvb_ctx, struct ccx_s_write *out);

#ifdef __cplusplus
}
#endif
57 changes: 57 additions & 0 deletions src/ocr.c
@@ -0,0 +1,57 @@
#include "png.h"
#ifdef ENABLE_OCR
#include "platform.h"
#include "capi.h"
#include "allheaders.h"

char* ocr_bitmap(png_color *palette,png_byte *alpha, unsigned char* indata,int w, int h)
{
TessBaseAPI* api;
PIX *pix;
char*text_out= NULL;
int i,j,index,ret;
unsigned int wpl;
unsigned int *data,*ppixel;
api = TessBaseAPICreate();

pix = pixCreate(w, h, 32);
if(pix == NULL)
{
return NULL;
}
wpl = pixGetWpl(pix);
data = pixGetData(pix);
#if LEPTONICA_VERSION > 69
pixSetSpp(pix, 4);
#endif
for (i = 0; i < h; i++)
{
ppixel = data + i * wpl;
for (j = 0; j < w; j++)
{
index = indata[i * w + (j)];
composeRGBPixel(palette[index].red, palette[index].green,palette[index].blue, ppixel);
SET_DATA_BYTE(ppixel, L_ALPHA_CHANNEL,alpha[index]);
ppixel++;
}
}

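ret = TessBaseAPIInit3(api,"", "eng");
if(ret < 0)
{
/* release the image and the API handle before bailing out */
pixDestroy(&pix);
TessBaseAPIDelete(api);
return NULL;
}

//text_out = TessBaseAPIProcessPages(api, "/home/anshul/test_videos/dvbsubtest.d/sub0018.png", 0, 0);
text_out = TessBaseAPIProcessPage(api, pix, 0, NULL, NULL, 1000);
if(!text_out)
printf("\nOCR returned no text for this bitmap\n");
/* the returned string is owned by the caller; the image and handle are not needed any more */
pixDestroy(&pix);
TessBaseAPIEnd(api);
TessBaseAPIDelete(api);
return text_out;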
}
#else
char* ocr_bitmap(png_color *palette,png_byte *alpha, unsigned char* indata,int w, int h)
{
mprint("ocr not supported without tesseract\n");
return NULL;
}
#endif
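
Review note: the pixel loop in ocr_bitmap expands an indexed bitmap through the palette and alpha tables into 32-bit pixels, one output row at a time. The sketch below shows the same idea without Leptonica. It assumes a packed layout with red in the top byte and alpha in the lowest byte, which is my understanding of Leptonica's 32 bpp pixels; struct rgb, pack_rgba and expand_indexed are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for png_color: three 8-bit channels. */
struct rgb { uint8_t red, green, blue; };

/* Pack one pixel as 0xRRGGBBAA (assumed layout, see the note above). */
static uint32_t pack_rgba(struct rgb c, uint8_t a)
{
    return ((uint32_t)c.red << 24) | ((uint32_t)c.green << 16) |
           ((uint32_t)c.blue << 8) | (uint32_t)a;
}

/* Expand a w*h indexed bitmap into out.
 * wpl is the number of 32-bit words per output row and must be >= w;
 * words beyond column w-1 in each row are left untouched. */
static void expand_indexed(const uint8_t *indata, int w, int h,
                           const struct rgb *palette, const uint8_t *alpha,
                           uint32_t *out, int wpl)
{
    assert(wpl >= w);
    for (int i = 0; i < h; i++) {
        uint32_t *ppixel = out + (size_t)i * wpl;
        for (int j = 0; j < w; j++) {
            uint8_t index = indata[i * w + j];
            *ppixel++ = pack_rgba(palette[index], alpha[index]);
        }
    }
}
```

The separate wpl stride matters because an image library may pad each row, so the output row start is computed from wpl while the input row start is computed from w.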
6 changes: 6 additions & 0 deletions src/ocr.h
@@ -0,0 +1,6 @@
#ifndef OCR_H
#define OCR_H
#include <png.h>
char* ocr_bitmap(png_color *palette,png_byte *alpha, unsigned char* indata,int w, int h);

#endif
14 changes: 11 additions & 3 deletions windows/ccextractor.sln
@@ -1,18 +1,26 @@

Microsoft Visual Studio Solution File, Format Version 10.00
# Visual C++ Express 2008
Project("{8BC9CEB8-8B4A-11D0-8D11-00A0C91BC942}") = "ccextractor", "ccextractor.vcproj", "{0F0063C4-BCBC-4379-A6D5-84A5669C940A}"
Microsoft Visual Studio Solution File, Format Version 12.00
# Visual Studio Express 2013 for Windows Desktop
VisualStudioVersion = 12.0.21005.1
MinimumVisualStudioVersion = 10.0.40219.1
Project("{8BC9CEB8-8B4A-11D0-8D11-00A0C91BC942}") = "ccextractor", "ccextractor.vcxproj", "{0F0063C4-BCBC-4379-A6D5-84A5669C940A}"
EndProject
Global
GlobalSection(SolutionConfigurationPlatforms) = preSolution
Debug|Win32 = Debug|Win32
Debug|x64 = Debug|x64
Release|Win32 = Release|Win32
Release|x64 = Release|x64
EndGlobalSection
GlobalSection(ProjectConfigurationPlatforms) = postSolution
{0F0063C4-BCBC-4379-A6D5-84A5669C940A}.Debug|Win32.ActiveCfg = Debug|Win32
{0F0063C4-BCBC-4379-A6D5-84A5669C940A}.Debug|Win32.Build.0 = Debug|Win32
{0F0063C4-BCBC-4379-A6D5-84A5669C940A}.Debug|x64.ActiveCfg = Debug|x64
{0F0063C4-BCBC-4379-A6D5-84A5669C940A}.Debug|x64.Build.0 = Debug|x64
{0F0063C4-BCBC-4379-A6D5-84A5669C940A}.Release|Win32.ActiveCfg = Release|Win32
{0F0063C4-BCBC-4379-A6D5-84A5669C940A}.Release|Win32.Build.0 = Release|Win32
{0F0063C4-BCBC-4379-A6D5-84A5669C940A}.Release|x64.ActiveCfg = Release|x64
{0F0063C4-BCBC-4379-A6D5-84A5669C940A}.Release|x64.Build.0 = Release|x64
EndGlobalSection
GlobalSection(SolutionProperties) = preSolution
HideSolutionNode = FALSE