Is there an example of document segmentation with hOCR output?

It is quite hard to find an example of document layout analysis with the tesseract-ocr Python binding.

I want to analyse a PDF document and separate its regions into text and images, but I cannot find any relevant pages or threads on this.
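
To make the goal concrete, here is a minimal sketch of the kind of post-processing I have in mind, using only the Python standard library. It assumes the hOCR marks text blocks with class `ocr_carea` and image blocks with class `ocr_photo`, each carrying a `bbox x0 y0 x1 y1` entry in its `title` attribute. Whether Tesseract actually emits `ocr_photo` for a given page and version is an assumption I have not verified. `SAMPLE_HOCR`, `RegionCollector`, and `split_regions` are hypothetical names used only for illustration.

```python
import re
from html.parser import HTMLParser

# hOCR stores coordinates in the title attribute as "bbox x0 y0 x1 y1".
BBOX_RE = re.compile(r"bbox (\d+) (\d+) (\d+) (\d+)")

# Assumed mapping from hOCR class names to coarse region kinds.
REGION_KINDS = {"ocr_carea": "text", "ocr_photo": "image"}


class RegionCollector(HTMLParser):
    """Collects (kind, bbox) pairs for block-level hOCR elements."""

    def __init__(self):
        super().__init__()
        self.regions = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        kind = next((REGION_KINDS[c] for c in classes if c in REGION_KINDS), None)
        match = BBOX_RE.search(attrs.get("title") or "")
        if kind and match:
            bbox = tuple(int(v) for v in match.groups())
            self.regions.append((kind, bbox))


def split_regions(hocr):
    """Return a list of (kind, (x0, y0, x1, y1)) found in an hOCR string."""
    collector = RegionCollector()
    collector.feed(hocr)
    return collector.regions


# Hand-written hOCR fragment, not real Tesseract output.
SAMPLE_HOCR = """
<div class='ocr_page' title='image "page.png"; bbox 0 0 800 1000'>
  <div class='ocr_carea' id='block_1_1' title="bbox 50 40 750 300">
    <p class='ocr_par' title="bbox 50 40 750 300">Some text</p>
  </div>
  <div class='ocr_photo' id='block_1_2' title="bbox 100 350 700 900"></div>
</div>
"""

if __name__ == "__main__":
    for kind, bbox in split_regions(SAMPLE_HOCR):
        print(kind, bbox)
```

The hOCR string itself would have to come from Tesseract after rasterizing each PDF page to an image. If the binding in use is pytesseract, its `image_to_pdf_or_hocr` function with `extension="hocr"` looks like the relevant call, but that is a guess about which binding applies here.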

Thanks a lot.