In [1]:
!nvidia-smi

Mon Dec  2 02:23:09 2024       
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.104.05             Driver Version: 535.104.05   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|   0  NVIDIA A100-SXM4-40GB          Off | 00000000:00:04.0 Off |                    0 |
| N/A   32C    P0              42W / 400W |      2MiB / 40960MiB |      0%      Default |
|                                         |                      |             Disabled |
+-----------------------------------------+----------------------+----------------------+
                                                                    

In [2]:
#BASE_DIR= '/content/drive/MyDrive/DACON/Finance/reprocessed/'
BASE_DIR='/content/drive/MyDrive/kdt-EST-AI/project/dacon_fis/src/'

# 설명

## Question - Answering with Retrieval

본 대회의 과제는 중앙정부 재정 정보에 대한 **검색 기능**을 개선하고 활용도를 높이는 질의응답 알고리즘을 개발하는 것입니다. <br>이를 통해 방대한 재정 데이터를 일반 국민과 전문가 모두가 쉽게 접근하고 활용할 수 있도록 하는 것이 목표입니다. <br><br>
베이스라인에서는 평가 데이터셋만을 활용하여 source pdf 마다 Vector DB를 구축한 뒤 langchain 라이브러리와 llama-2-ko-7b 모델을 사용하여 RAG 프로세스를 통해 추론하는 과정을 담고 있습니다. <br>( train_set을 활용한 훈련 과정은 포함하지 않으며, test_set  에 대한 추론만 진행합니다. )

## Mount/Login

구글 드라이브를 마운트하고 허깅페이스에 로그인
- 이때 허깅페이스 토큰은 kdt3 그룹에 대해 읽기/쓰기 권한이 있는 토큰이어야 함

## Download Library
필요/사용 라이브러리 다운로드
이때 버전 문제로 설치를 한 뒤 세션을 한번 재시작해줘야 합니다
<br>(그리고 세션 완전히 끊기면 다운로드 후 재시작을 다시 해줘야...)

## Import Library
한번 재시작했으면 위 과정 없이 Import만 실행해주면 됩니다

## Vector DB
문서를 여러 조각(chunk)로 나누고, 임베딩 유사도를 통해 관련 조각을 찾을 수 있게 DB화하는 함수들이 정의되어 있습니다.

## DB 생성
Vector DB에서 정의된 함수들로 문서 DB를 만들어줍니다.<br><br>
이때 Train과 Test를 한번에 하려고 하면 코랩이 터질 확률이 높으므로 Train하고 Create Dataset까지 실행해 업로드 한 뒤 재시작해서 램을 비우고 Test를 하는 것이 좋습니다.<br> 또한 문서 임베딩을 어떤 모델로 할지 인자로 넘겨줄 수 있습니다

## Create Dataset
DB 생성에서 만든 db와 데이터 dataframe을 사용해 HuggingFace 데이터셋 생성 후 업로드

## Fine-Tuning
학습 데이터셋으로 모델에 대한 파인튜닝 진행 후 Huggingface에 업로드<br>
4비트 양자화 LoRA로 파인튜닝<br>
기반 모델 또는 넣어줄때 사용할 프롬프트, 학습 관련 하이퍼파라미터 수정 가능

## Langchain 을 이용한 추론
모델을 사용한 추론


## 실행
### 기본
Mount/Login -> Download Library -> 재시작 (처음 1번)
Mount/Login -> Import Library (이후)

### 데이터셋 만들기
기본 -> Vector DB -> DB 생성 -> Create Dataset에서 첫 셀 + Train/Valid/Test 중 해당하는 셀

### 모델 학습하기
기본 -> Fine-Tuning(업로드할 위치, 데이터셋 위치, 모델 링크 확인 필수)

### 학습된 모델로 추론하기
기본 -> Langchain을 이용한 추론(모델 링크, 데이터셋 위치 확인) -> Submission(저장할 파일명 확인)

# Mount/Login

In [3]:
from google.colab import drive
drive.mount('/content/drive')

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).


In [4]:
ls {BASE_DIR}

241008_csv_checker.ipynb             [0m[01;34mgemma2_financeQA-finetune[0m/  [01;34mtest_source[0m/
combined_train_aug_v3.5_editted.csv  [01;34mprocessed[0m/                  train.csv
combined_train_aug_v3.csv            sample_submission.csv       [01;34mtrain_source[0m/
combined_train_aug_v3_editted.csv    [01;34msub[0m/                        Untitled0.ipynb
[01;34mdata[0m/                                [01;34mtemp[0m/
[01;34meval[0m/                                test.csv


In [5]:
import os

token_path = os.path.join(BASE_DIR,'data','token')
with open(token_path,'r') as f:
    master_token = f.readline().strip('\n')

In [6]:
from huggingface_hub import login

login(token=master_token, add_to_git_credential=True)

# Download Library

In [7]:
!apt-get install tesseract-ocr
!apt-get install poppler-utils

!pip install orjson==3.10.6

!pip install accelerate
!pip install -i https://pypi.org/simple/ bitsandbytes
!pip install transformers[torch] -U

!pip install datasets
!pip install langchain
!pip install langchain_community
!pip install langchain-teddynote
!pip install PyMuPDF
!pip install sentence-transformers
!pip install faiss-gpu
!pip install unstructured pdfminer.six
!pip install pillow-heif
!pip install pikepdf pypdf

!pip install pymupdf4llm

Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
tesseract-ocr is already the newest version (4.1.1-2.1build1).
0 upgraded, 0 newly installed, 0 to remove and 49 not upgraded.
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
poppler-utils is already the newest version (22.02.0-2ubuntu0.5).
0 upgraded, 0 newly installed, 0 to remove and 49 not upgraded.
Looking in indexes: https://pypi.org/simple/


# Import Library

In [8]:
import os
import unicodedata
import torch
import pandas as pd
from tqdm.auto import tqdm
import fitz  # PyMuPDF

from langchain.document_loaders.parsers.pdf import PDFPlumberParser

from transformers import (
    AutoTokenizer,
    AutoModelForCausalLM,
    pipeline,
    BitsAndBytesConfig
)
from accelerate import Accelerator

from langchain.embeddings import HuggingFaceEmbeddings
from langchain.vectorstores import FAISS
from langchain.schema import Document
from langchain.text_splitter import RecursiveCharacterTextSplitter, MarkdownHeaderTextSplitter

# PDF 로딩/청크화 관련
from langchain.document_loaders.parsers.pdf import PDFPlumberParser
from langchain.document_loaders.pdf import PDFPlumberLoader
from langchain.document_loaders import UnstructuredPDFLoader
from langchain_teddynote.retrievers import KiwiBM25Retriever
from langchain.retrievers import EnsembleRetriever, MultiQueryRetriever

from unstructured.cleaners.core import clean_extra_whitespace, clean, clean_non_ascii_chars

import pymupdf4llm
import pymupdf

device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
device

device(type='cuda', index=0)

In [9]:
# gpu memory 할당 해제
import gc, time

def free_cuda():
  mem = 1
  while mem > 0 :
    time.sleep(0.5)
    mem = gc.collect()
    torch.cuda.empty_cache()
    print("freed : ",mem)

# Vector DB

In [10]:
from operator import itemgetter
from langchain_text_splitters import RecursiveCharacterTextSplitter
from unstructured.cleaners.core import clean_extra_whitespace, clean, clean_non_ascii_chars


# 불릿포인트 제거용 함수
def remove_bulletpoints(text):
    cleaned_text = text
    for symbol in ['ㅇ','-','□', '※', '▸','∙','●','☞','■','','','·']:
        cleaned_text = cleaned_text.replace(symbol, f"-")
    return cleaned_text

def replace_sign_symbol(text):
    cleaned_text = text
    cleaned_text = cleaned_text.replace('△', "-")
    return cleaned_text


# 숫자 심볼 숫자로 변환
def replace_num_symbols_with_number(text):
    cleaned_text = text
    for idx, symbol in enumerate(['①', '②', '③', '④', '⑤', '⑥', '⑦', '⑧', '⑨', '⑩', '⑪', '⑫', '⑬', '⑭', '⑮']):
        cleaned_text = cleaned_text.replace(symbol, f"{idx+1})")
    return cleaned_text

def erase_unicode_chr(text):
  return re.sub(r'\\u[0-9a-fA-F]{4}','-',text)

In [11]:
def normalize_path(path):
    """경로 유니코드 정규화"""
    return unicodedata.normalize('NFC', path)

def process_path(base_dir,file_path):
  norm_path = normalize_path(file_path)
  if not os.path.isabs(norm_path):
    return os.path.normpath(os.path.join(base_dir, norm_path))
  else : return norm_path

def subpath_list(dir_path):
  return list(map(lambda x : os.path.join(dir_path,x),os.listdir(dir_path)))

def processed_path_matcher(dir_path,file_path):
  sub_list = subpath_list(dir_path)
  path_list = list()
  for sub in sub_list:
    path_list.extend(subpath_list(sub))
  prcssd_list =list(map(normalize_path,path_list))
  for real_path,prcssd_path in zip(path_list,prcssd_list) :
    if file_path == prcssd_path : return real_path
  else : return file_path

In [12]:
from operator import itemgetter
import re

def remove_table_spaces(text):
  text = re.sub(r'[ \t\r]+',' ',text)
  text = re.sub(r'[\n\v\f]+','\n',text)
  text = re.sub(r'\|:?[\-]+:?(?=[\|])','|-',text)
  return text


def clean_string(text):
    text_string = clean(text, dashes=True,trailing_punctuation=True, bullets=True)
    text_string = replace_num_symbols_with_number(text_string)
    text_string = remove_bulletpoints(text_string)
    return text_string

def clean_table(text_string):
#    text_string = remove_table_spaces(text_string)
    text_string = replace_num_symbols_with_number(text_string)
    text_string = replace_sign_symbol(text_string)
    text_string = remove_bulletpoints(text_string)
    return erase_unicode_chr(text_string)

# 전체 마크다운 처리
def process_pdf(file_path, chunk_size=256, chunk_overlap=32):
    """PDF 텍스트 추출 후 chunk 단위로 나누기"""
    # PDF 파일 열기
    doc = pymupdf4llm.to_markdown(file_path)

    headers_to_split_on = [
        ("#","Header 1"),
        ("##","Header 2"),
        ("###","Header 3"),
    ]

    md_splitter = MarkdownHeaderTextSplitter(headers_to_split_on=headers_to_split_on, strip_headers=False)
    md_chunks = md_splitter.split_text(doc)

    splitter = RecursiveCharacterTextSplitter(
        chunk_size=chunk_size,
        chunk_overlap=chunk_overlap
    )
    chunks = splitter.split_documents(md_chunks)

    return chunks


def create_vector_db(chunks, model_path="intfloat/multilingual-e5-small"):
    """FAISS DB 생성"""
    # 임베딩 모델 설정
    model_kwargs = {'device': 'cuda'}
    encode_kwargs = {'normalize_embeddings': True}
    embeddings = HuggingFaceEmbeddings(
        model_name=model_path,
        model_kwargs=model_kwargs,
        encode_kwargs=encode_kwargs
    )
    # FAISS DB 생성 및 반환
    db = FAISS.from_documents(chunks, embedding=embeddings)
    return db




In [13]:
import pickle

def check_and_mkdir(func):
    def wrapper(*args,**kwargs):
        if not os.path.exists(args[0]): os.makedirs(args[0])
        return func(*args,**kwargs)
    return wrapper

@check_and_mkdir
def save_pkl(save_dir,file_name,save_object):
    if not os.path.exists(save_dir): os.mkdir(save_dir)
    file_path = os.path.join(save_dir,file_name)
    with open(file_path,'wb') as f:
        pickle.dump(save_object,f)

def load_pkl(file_path):
    with open(file_path,'rb') as f:
        data = pickle.load(f)
    return data

# Preprocessing Tables

In [20]:
!pip install gmft
!pip install git+https://github.com/conjuncts/gmft_pymupdf.git

Collecting gmft
  Downloading gmft-0.4.0-py3-none-any.whl.metadata (10 kB)
Collecting pypdfium2>=4 (from gmft)
  Downloading pypdfium2-4.30.0-py3-none-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (48 kB)
[2K     [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m48.5/48.5 kB[0m [31m4.1 MB/s[0m eta [36m0:00:00[0m
Downloading gmft-0.4.0-py3-none-any.whl (73 kB)
[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m73.2/73.2 kB[0m [31m6.1 MB/s[0m eta [36m0:00:00[0m
[?25hDownloading pypdfium2-4.30.0-py3-none-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (2.8 MB)
[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m2.8/2.8 MB[0m [31m87.6 MB/s[0m eta [36m0:00:00[0m
[?25hInstalling collected packages: pypdfium2, gmft
Successfully installed gmft-0.4.0 pypdfium2-4.30.0
Collecting git+https://github.com/conjuncts/gmft_pymupdf.git
  Cloning https://github.com/conjuncts/gmft_pymupdf.git to /tmp/pip-req-build-kh8krizb
  Running command git clone --

In [21]:
import gmft.table_detection
import gmft
import markdown
from gmft.auto import CroppedTable, TableDetector, AutoTableFormatter, AutoFormatConfig
from gmft.auto import AutoTableDetector, TATRDetectorConfig
from gmft.pdf_bindings import PyPDFium2Document
from gmft_pymupdf import PyMuPDFPage
from collections import defaultdict
import copy

In [22]:
!pip install ipdb

Collecting ipdb
  Downloading ipdb-0.13.13-py3-none-any.whl.metadata (14 kB)
Collecting jedi>=0.16 (from ipython>=7.31.1->ipdb)
  Downloading jedi-0.19.2-py2.py3-none-any.whl.metadata (22 kB)
Downloading ipdb-0.13.13-py3-none-any.whl (12 kB)
Downloading jedi-0.19.2-py2.py3-none-any.whl (1.6 MB)
[?25l   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m0.0/1.6 MB[0m [31m?[0m eta [36m-:--:--[0m[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m1.6/1.6 MB[0m [31m67.6 MB/s[0m eta [36m0:00:00[0m
[?25hInstalling collected packages: jedi, ipdb
Successfully installed ipdb-0.13.13 jedi-0.19.2


In [23]:
from matplotlib import pyplot as plt
from matplotlib.pyplot import imshow
import numpy as np
from PIL import Image
from ipdb import set_trace

import matplotlib.pyplot as plt

def close_event():
    plt.close() #timer calls this function after 3 seconds and closes the window

fig = plt.figure()
timer = fig.canvas.new_timer(interval = 3000) #creating a timer object and setting an interval of 3000 milliseconds
timer.add_callback(close_event)

<Figure size 640x480 with 0 Axes>

## functions

### process info about page and box

In [24]:
'''
-height
8pt : 2.82mm , a4 : 210mm * 297mm
ratio : 2.82/297 ~ 0.0094
-width
same length as height : get height and use it
'''

def bound_page(box,page):
  x0 = min(max(box[0],page[0]),page[2])
  x1 = min(max(box[1],page[1]),page[3])
  x2 = max(min(box[2],page[2]),page[0])
  x3 = max(min(box[3],page[3]),page[1])
  return (x0,x1,x2,x3)

def check_exclusv_range(ran1,ran2,ths):
  flag1 = (ran1[0] - ran2[1] >= -ths)
  flag2 = (ran2[0] - ran1[1] >= -ths)
  return flag1 or flag2

def check_exclusive_box(box1,box2,ths):
  flag_horiz = check_exclusv_range((box1[0],box1[2]),(box2[0],box2[2]),ths)
  flag_verti = check_exclusv_range((box1[1],box1[3]),(box2[1],box2[3]),ths)
  return flag_horiz or flag_verti

def union_box(box1,box2):
  return min(box1[0],box2[0]),min(box1[1],box2[1]),max(box1[2],box2[2]),max(box1[3],box2[3])

def check_pairly_linked(ele_list,check_not_link,union_func):
  elements = copy.deepcopy(ele_list)
  for i0,e0 in enumerate(elements):
    for i1,e1 in enumerate(elements):
      if i0 >= i1 : continue
      if not check_not_link(e0,e1):
        elements.pop(i1)
        elements.pop(i0)
        new = union_func(e0,e1)
        elements.append(new)
    #    print(i0,i1,new,len(elements))
    #  else : print('=',i0,i1)
  return elements

import copy

def union_pairly_linked(ele_list,check_not_link,union_func):
  elements = ele_list
  cnt,bnd = 0,2**len(ele_list)
  while True :
    #print('-'*5,cnt,'-'*5)
    rslt = check_pairly_linked(elements,check_not_link,union_func)
    if len(rslt) == len(elements) or len(rslt) < 2 : break
    if cnt > bnd : break
    elements,cnt = rslt, cnt+1
  return rslt

def get_ths(page,ths_ratio=0.0094):
  return ths_ratio * (page[3]-page[1])

def organize_box(box_list,page,ths):
  box_list = list(map(lambda x : bound_page(x,page),box_list))
  check_exclusv = lambda x,y : check_exclusive_box(x,y,ths)
  return union_pairly_linked(box_list,check_exclusv,union_box)

def get_bbox(tables):
  return list(map(lambda x : x.bbox,list(tables)))

def get_page_size(page,ths=0):
  area = (0,0,page[2]-page[0],page[3]-page[1])
  return expand_bbox_by_ths(area,area,ths)

def expand_bbox_by_ths(bbox,area,ths=0):
  bbox =(bbox[0]-ths,bbox[1]-ths,bbox[2]+ths,bbox[3]+ths)
  return bound_page(bbox,area)

def larger_v_ths(area,ths=0):
  size = get_page_size(area)
  return (size[2]>=ths) and (size[3]>=ths)

def infer_bbox_pos(area,bbox):
  return bbox[0]+area[0],bbox[1]+area[1],bbox[2]+area[0],bbox[3]+area[1]

### process detector,formatter,tables

In [26]:
from gmft.auto import AutoTableDetector
from gmft_pymupdf import PyMuPDFPage


def get_ft_bbox(ft,area,ths=0):
#  return ft.rect.bbox
  rslt = organize_box(ft.fctn_results['boxes'],area,ths)[0]
  return infer_bbox_pos(ft.rect.bbox,rslt) #expand_bbox_by_ths(rslt,area,ths)

def define_formatter():
    config = AutoFormatConfig()
    config.semantic_spanning_cells=True
    config.semantic_hierarchical_left_fill='deep'
    config.enable_multi_header=False
    config.torch_device= device
    config.total_overlap_reject_threshold = 0.3
    config.large_table_assumption = True
    config.verbose = 2
    formatter = AutoTableFormatter(config=config)
    return formatter

def define_detector():
    config =TATRDetectorConfig()
    config.torch_device= device
    config.detector_base_threshold=0.75
#    config.detector_base_threshold=0.6
    detector = AutoTableDetector(config=config)
    return detector
import re

def erase_constant_rowcol(df,val):
  cols = range(len(df.columns))
  df_temp = df.set_axis(cols,axis=1)
  cond = df_temp == val
  target = list(filter(lambda col : np.sum(cond[col]) != len(cond),cols))
  df_temp, cond = df_temp[target], cond[target]
  cond2 = np.sum(cond,axis=1) != len(target)
  return df_temp[cond2]

def maybe_numeric_table(df,ths=0.35,line_ths=0.9):
  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)
  df_temp = erase_constant_rowcol(df_temp,'0')
  #df_temp = df.replace(r'(?:(\d+?)),(\d+?)',r'\1\2',regex=True)
  df_temp = df.replace(r'[△\,\(\)\-\+\.\s\%\[\]]','',regex=True)
  df_temp = df_temp[list(df_temp.columns)].apply(pd.to_numeric,errors='coerce')
  rslt = ~df_temp.isna()
  rowwise = rslt.apply(sum,axis=1)
  colwise = rslt.apply(sum,axis=0)
  if np.sum(rowwise * line_ths <= len(rslt)) > 0 : return True
  if np.sum(colwise * line_ths <= len(rslt.columns)) > 0 : return True
  if np.sum(rslt.values) > len(rslt)*len(rslt.columns)*ths : return True
  return False

def check_table_df_soundness(df,ths=0.5):
  if len(df) < 1 : return False
  if maybe_numeric_table(df) : return True
  df_temp = df.replace(to_replace=[None], value='PD_NONE')
  cols = list(df.columns.astype(np.string_))
  cols = list(map(lambda x : '' if x is None else str(x),cols))
  cols = list(map(lambda x : re.sub(r'[\s]*','',x),cols))
  null_col = list(filter(lambda x: len(x)<1,cols))
  if len(null_col) > len(cols) * ths : return False
  if np.sum(df_temp=='PD_NONE') > len(df)*len(cols)*ths : return False
  return True

def detect_sound_table(dt):
  if len(dt.text()) > 2 : return True
  return False

def find_tables(page,area,ths=0):
  detector,formatter = define_detector(),define_formatter()
  doc = PyMuPDFPage(page)
  dt_whole = detector.extract(doc)
  dt_whole = list(filter(detect_sound_table,dt_whole))
  gmft_bboxes = list(map(lambda x : x.bbox,dt_whole))
  searched = get_bbox(page.find_tables())
  searched.extend(gmft_bboxes)
  searched = organize_box(searched,area,ths)
  searched = list(map(lambda x: expand_bbox_by_ths(x,area,ths),searched))
#  rslt = list(map(lambda x:make_table(x,doc,area,formatter,ths),searched))
  return list(filter(lambda x : x is not None,searched))

def make_table(bbox,doc,area,formatter,ths=0):
  rect = gmft.common.Rect(bbox)
  temp = gmft.table_detection.CroppedTable(doc,rect,0.8) #confidence level 조정이 표 인식에 영향 있을지도
  ft = formatter.extract(temp)
  #display(ft.rect.bbox,ft.visualize())
  try :
    tab_box = get_ft_bbox(ft,area,ths)
    caption = '\t'.join(ft.captions()) if 'captions' in ft.__dir__() else ''
    df_tab = ft.df()
    if not check_table_df_soundness(df_tab) :
      print('table does not sound')
      raise Exception('table does not sound')
    #else : print('table sounds')
    table = {'content':df_tab,'bbox':tab_box,'caption':caption} #, 'ft':ft}
    return table
  except Exception as e:
    try :
      display(df_tab)
      df_tab.msg = e
      table = df_tab
    except : table = None
#    display(ft.visualize())
    print('exception : ',e)
    return table

### search in page

In [27]:
'''
if table found, search left and right
and then, the area will be colored
'''

def get_area(area,page,left):
  if left : return page[0],area[1],area[0],area[3]
  else : return area[2],area[1],page[2],area[3]


def get_blank(searched_row,page,ths=0):
  rslt,point = list(), page[1]
  searched_row = sorted(searched_row,key=lambda x: x[1])
  searched_row.append((page[0],page[3],page[2],page[3]))
  for row in searched_row[:-1]:
    if row[1] - point < ths : continue
    rslt.append((page[0],point,page[2],row[1]))
    point=row[3]
  return [bound_page(box,page) for box in rslt]


def extend_list_dict(a,b):
  rslt = defaultdict(list)
  for key,val in a.items():
    rslt[key].extend(val)
  for key,val in b.items():
    rslt[key].extend(val)
  return rslt

def search_page(page,area,ths=0,depth=0):
  if not larger_v_ths(area,ths) : return list()
  if depth >=10 : raise Exception(depth)
#  print('depth : ',depth,'area : ',area)
  page.set_cropbox(area) #area : page에서 절대적 위치. cropbox를 하게 되면 상대적 위치로 바뀜
  rslt,searched = list(),list()
#  set_trace()
  detected = find_tables(page,get_page_size(area),ths)
  if len(detected)==0 : return rslt
  for target in detected:
#    if target is None : continue
#    if type(target) is not dict : rslt['errs'].append(target)
    rect = infer_bbox_pos(area,target)
    bbox = {'bbox':rect,'depth':depth}
    rslt.append(bbox)
    print('detected:',rect,'\t at',area,f' in depth {depth}')

    left_area = get_area(rect,area,True)
    right_area = get_area(rect,area,False)
    if depth > 5 : print(f'left {left_area}\tright{right_area}')
    left = search_page(page,left_area,ths,depth+1)
    right = search_page(page,right_area,ths,depth+1)
    searched.append((area[0],rect[1],area[2],rect[3]))
    if depth > 5 : print(searched)
    rslt.extend(left+right)

  if len(rslt)==0 : return rslt
  searched = organize_box(searched,area,ths)
  blanks = get_blank(searched,area,ths)
  if depth >5 : print(f'in {area}\n\t',blanks)
  for row in blanks:
    detected = search_page(page,row,ths,depth+1)
    rslt.extend(detected)

#  rslt= organize_box(rslt,area,ths) : can't apply directly like this
#  print('in search page, err : ',len(rslt['errs']))
  page.set_cropbox(area)
  return rslt

### extract tables and reform pdfs

In [14]:
def replace_area_to_mark(mu_page,area,mark):
    mu_page.add_redact_annot(area)
    mu_page.apply_redactions()
    mu_page.draw_rect(area,color=(.0,0,0),fill=(.99,.99,.99))
    rc = mu_page.insert_htmlbox(area,mark,scale_low=0)
    return mu_page

def extract_tables_from_pdf(full_path,tab_word='[[TABLE_{0}]]'):
    pdf = pymupdf.open(full_path)
    chunks, tables_dict, cnt= list(), defaultdict(list),0
    err_dict = dict()
    if pdf is None : return None, tables_dict
    for pnum, page in enumerate(tqdm(pdf)):
      page_area = tuple(page.mediabox)
      ths = get_ths(page_area)

      detected = search_page(page,page_area,ths)
      bboxes = [dic['bbox'] for dic in detected]
      bboxes = organize_box(bboxes,page_area,ths)
      tables,errors,doc =list(),list(), PyMuPDFPage(page)
      formatter = define_formatter()
      for box in bboxes:
        table = make_table(box,doc,page_area,formatter,get_ths(page_area))
        if table is None :
          print('error : ',box)
          continue
        if type(table) is not dict : errors.append(table)
        else : tables.append(table)

      if len(errors)>0 :
        err_dict[pnum] = errors
        print('errs : ',len(errors))
      if len(tables) == 0 : continue
      print(f'detected in p.{pnum} :\t',len(tables),' tables')
      tables = sorted(tables,key=lambda x : (x['bbox'][0],x['bbox'][1]))
      for idx,tab in enumerate(tables):
        tab_mark = tab_word.format(cnt+idx)
        table_md = clean_table(tab['content'].to_markdown(index=False))
        tables_dict[pnum].append((tab_mark,table_md + f"\n{tab['caption']}"))

        try :
          area = tab['bbox']
          page.add_redact_annot(area)
          page.apply_redactions()
          page.draw_rect(area,color=(.0,0,0),fill=(.99,.99,.99))
          rc = page.insert_htmlbox(area,tab_mark,scale_low=0)
        except :
          print(page.mediabox)
          print(page.cropbox)
          print(tab['bbox'])
          display(tab['content'])
          print(table_md)

      cnt+=len(tables)

    print(cnt, len(tables_dict), sum([len(tables) for tables in tables_dict.values()]))
    return pdf, tables_dict,err_dict

def extract_table_and_pdf(pdf_path,base_path,save_dir):
    # 경로 정규화 및 절대 경로 생성
    norm_path = normalize_path(pdf_path)
    full_path = process_path(base_path,pdf_path)
    full_path = processed_path_matcher(base_path,full_path)
    pdf_name = os.path.basename(full_path)

    print(f"Processing {pdf_name}...")
    save_path = os.path.join(save_dir, norm_path)
    print(full_path,save_path)
    pdf_dir = os.path.dirname(save_path)
    if not os.path.exists(pdf_dir) : os.makedirs(pdf_dir)
    new_pdf,tab_list,err_list = extract_tables_from_pdf(full_path,tab_word)
    save_pkl(pdf_dir,pdf_name[:-4]+'.pkl',tab_list)
    save_pkl(pdf_dir,'err_'+pdf_name[:-4]+'.pkl',err_list)
    new_pdf.save(save_path,garbage=4,deflate=True)
    return tab_list,err_list

def reform_pdfs_from_df(df, base_path,save_dir,name='data'):
    """딕셔너리에 pdf명을 키로해서 DB, retriever 저장"""
    unique_paths = df['Source_path'].unique()
    tab_dict,err_dict = dict(), dict()
    for path in tqdm(unique_paths, desc="Processing PDFs"):
      print(path,base_path)
      tab_dict[path],err_dict[path]=extract_table_and_pdf(path,base_path,save_dir)
    save_pkl(os.path.join(save_dir,'tables'),f'tab_{name}.pkl',tab_dict)
    return tab_dict,err_dict

## read table marks

In [14]:
def convert_neg_idx(idxs,len_obj):
  rslt = deepcopy(idxs)
  for i in idxs:
    if i >= 0 : continue
    new_num = i + len_obj
    del rslt[i]
    rslt.append(new_num)
  return rslt

def add_escape(sent):
  idxs = list(filter(lambda x : sent[x] in ['[',']'],range(len(sent))))
  temp, idxs = list(sent), convert_neg_idx(idxs,len(sent))
  idxs = sorted(idxs)[::-1]
  for i in idxs:
    temp.insert(i,'\\')
  return ''.join(temp)

In [15]:
import re
from copy import deepcopy
from collections import defaultdict
from langchain_core.documents import Document as Doc

def get_former_idx(err_list):
  rslt = list()
  for i in err_list:
    cand = list(filter(lambda x : x not in err_list,range(i)))
    idx = max(cand) if cand else 0
    rslt.append(idx)
  return rslt

def get_latter_idx(err_list):
  rslt = list()
  for i in err_list:
    cand = list(filter(lambda x : x not in err_list,range(i,err_list[-1]+2)))
    idx = min(cand) if cand else err_list[-1]+1
    rslt.append(idx)
  return rslt

def make_table_page(content,tab_mark,table,tab_caption=None,th_len=100):
  if len(content)<len(tab_mark) : return None, None
  if tab_caption is None : tab_caption = tab_mark
  re_sep = '[\s\|]*'
  re_mark = insert_btwn_chr(tab_mark,re_sep)
  re_trgt = re.compile(re_mark)
  flag = list(re.finditer(re_trgt,content))
  if flag :
      front,end = flag[0].pos,flag[0].endpos
      start = min(0,front-th_len)
      new_page = content[start:front] + '\n' + table + f'\n{tab_caption}'
      #page = content[:front]+tab_caption+content[end:]
      page = content
  else : new_page, page = None, None
  return new_page,page

def get_insert_idx(former_idx,latter_idx,tab_page):
  rslt = list()
  for former,latter in zip(former_idx,latter_idx):
    pages = tab_page.values()
    first, last = 0,max(pages)
    i0 = tab_page[former] if former in tab_page else first
    i1 = tab_page[latter] if latter in tab_page else last
    rslt.append(int((i0+i1)/2))
  return rslt

def set_err_tab_page(new_pages,err_list,table_list,tab_page,tab_caption):
  if len(err_list) == 0 : return new_pages
  err_idx, err_tabs = zip(*err_list)
  if len(new_pages) == 0 : insert_idx = [-1 for _ in err_list]
  else :
    former_idx = get_former_idx(err_idx)
    latter_idx = get_latter_idx(err_idx)
    insert_idx = get_insert_idx(former_idx,latter_idx,tab_page)
  for page,i_tab,tab in zip(insert_idx,err_idx,err_tabs):
    content =tab +'\n'+ tab_caption.format(i_tab)
    new_pages[page].append(Doc(page_content=content))
  return new_pages

def convert_neg_num_page(page_dict,book_len):
  rslt = deepcopy(page_dict)
  for page,docs in page_dict.items():
    if page >= 0 : continue
    new_num = page+book_len
    del rslt[page]
    rslt[new_num] = docs
  return rslt

def insert_pages(doc_list,new_pages):
  new_pages = convert_neg_num_page(new_pages,len(doc_list))
  page_list = sorted(list(new_pages.keys()))[::-1]
  for page in page_list:
    if page >= len(doc_list) -1 : doc_list += new_pages[page]
    else : doc_list = doc_list[:page+1]+new_pages[page]+doc_list[page+1:]
  return doc_list

def get_table_page(num,doc_list,tab_mark,table):
    this_page = doc_list[num]['text']
    next_page = doc_list[num+1]['text'] if num+1 < len(doc_list) else ''
    both_page = this_page + next_page if next_page != '' else ''

    this_rslt,page0 = make_table_page(this_page,tab_mark,table)
    next_rslt,page1 = make_table_page(next_page,tab_mark,table)
    both_rslt,page2 = make_table_page(both_page,tab_mark,table)

    if this_rslt is not None : page_content,page = this_rslt, page0
    if next_rslt is not None : page_content,page,num = next_rslt, page1,num+1
    elif both_rslt is not None :
      page_content,page = both_rslt, page2[:len(this_page)+abs(len(both_page)-len(both_rslt))]
    else : page_content,page = None, this_page
    return page_content, page, num

def expand_pages(doc_list,new_pages):
  for page,tables in new_pages.items():
    content = doc_list[page]['text']+'\n'+'\n'.join(tables)
    doc_list[page]['text'] = content
  return doc_list

def insert_table_2_doc(doc_list,table_dict,tab_word='[[TABLE_{0}]]'):
  new_pages,cnt = defaultdict(list),0
  for num,table_info in table_dict.items():
    for tab_mark,table in table_info:
      table_page,page_adjst,page_num = get_table_page(num,doc_list,tab_mark,table)
      if table_page is not None:
        new_pages[page_num].append(table_page)
#        doc_list[page_num]['text'] = page_adjst
        cnt+=1
      else : new_pages[num].append(table+'\n'+tab_mark)

  doc_list = expand_pages(doc_list,new_pages)
  return doc_list, cnt/sum(map(len,table_dict.values()))

def insert_btwn_chr(sent,sep):
  c = list(add_escape(sent))
  d = c.copy()
  diff = (len(c)-len(sent))//2
  for i in range(2*diff+1,len(c)-diff*2+2)[::-1]:
    d.insert(i-1,sep)
  return ''.join(d)



In [16]:
import difflib

def union_strs(str0,str1):
    output_list = difflib.ndiff(str0, str1)
    return ''.join(map(lambda x : x[-1],output_list))

def pdf_2_chunck_w_table(file_path, tables, tab_word,chunk_size=256, chunk_overlap=32):
    """PDF 텍스트 추출 후 chunk 단위로 나누기"""
    # PDF 파일 열기
    doc = pymupdf4llm.to_markdown(file_path,page_chunks=True,table_strategy='lines')
    doc,rate = insert_table_2_doc(doc,tables,tab_word)
    doc0 = '\n'.join(map(lambda x : x['text'],doc))
#    doc1 = pymupdf4llm.to_markdown(file_path)
#    docs = union_strs(doc0,doc1)
    md_splitter = MarkdownHeaderTextSplitter(headers_to_split_on=headers_to_split_on, strip_headers=False)
    md_chunks = md_splitter.split_text(doc0)
    splitter = RecursiveCharacterTextSplitter(
        chunk_size=chunk_size,
        chunk_overlap=chunk_overlap
    )
    chunks = splitter.split_documents(md_chunks)
    print(file_path)
    print(f'table mark detect rate : {rate:.5f}')

    return chunks,rate

def make_chunk_dict_from_df(df, base_dir, table_dict, chunk_size=256):
    """딕셔너리에 pdf명을 키로해서 DB, retriever 저장"""
    unique_paths = df['Source_path'].unique()
    chunk_dict = dict()
    err_tab_dict = dict()

    for file_path in tqdm(unique_paths, desc="Processing PDFs"):
        # 경로 정규화 및 절대 경로 생성
        full_path = process_path(base_dir,file_path)
        full_path = processed_path_matcher(base_dir,full_path)
        pdf_title = os.path.basename(full_path)
        print(f"Processing {pdf_title}...")

        # PDF 처리 및 벡터 DB 생성
        chunk_dict[file_path]= pdf_2_chunck_w_table(full_path,table_dict[file_path],chunk_size)
    return chunk_dict

In [17]:
#앙상블
def process_pdfs_from_df(df, base_dir, table_dict, tab_word, chunk_size=256, model_path = "intfloat/multilingual-e5-small"):
    """딕셔너리에 pdf명을 키로해서 DB, retriever 저장"""
    pdf_databases = {}
    unique_paths = df['Source_path'].unique()
    rate_dict=dict()

    for file_path in tqdm(unique_paths, desc="Processing PDFs"):
        # 경로 정규화 및 절대 경로 생성
        full_path = process_path(base_dir,file_path)
        full_path = processed_path_matcher(base_dir,full_path)
        pdf_title = os.path.basename(full_path)
        print(f"Processing {pdf_title}...")

        # PDF 처리 및 벡터 DB 생성
        chunks,rate =pdf_2_chunck_w_table(full_path,table_dict[file_path],tab_word,chunk_size)
        db = create_vector_db(chunks, model_path=model_path)

        kiwi_bm25_retriever = KiwiBM25Retriever.from_documents(chunks)
        faiss_retriever = db.as_retriever()
        retriever = EnsembleRetriever(
            retrievers=[kiwi_bm25_retriever, faiss_retriever],
            weights=[0.5, 0.5],
            search_type="mmr",
        )

        # 결과 저장
        pdf_databases[pdf_title] = {
                'db': db,
                'retriever': retriever
        }
        rate_dict[pdf_title] = rate
    return pdf_databases, rate_dict

## run codes

In [18]:
headers_to_split_on = [
    ("#","Header 1"),
    ("##","Header 2"),
    ("###","Header 3"),
]

tab_word = '!표{0}!'

In [20]:
train_df = pd.read_csv(f'{BASE_DIR}train.csv')
test_df = pd.read_csv(f'{BASE_DIR}test.csv')

#### move files to runtime

In [31]:
ls {BASE_DIR}

241008_csv_checker.ipynb             [0m[01;34mgemma2_financeQA-finetune[0m/  [01;34mtest_source[0m/
combined_train_aug_v3.5_editted.csv  [01;34mprocessed[0m/                  train.csv
combined_train_aug_v3.csv            sample_submission.csv       [01;34mtrain_source[0m/
combined_train_aug_v3_editted.csv    [01;34msub[0m/                        Untitled0.ipynb
[01;34mdata[0m/                                [01;34mtemp[0m/
[01;34meval[0m/                                test.csv


In [32]:
src_dirs = ['train_source/','test_source/']
file_path = ' '.join([os.path.join(BASE_DIR,sub) for sub in src_dirs])
temp_path = '/content/src/'
if not os.path.exists(temp_path) : os.makedirs(temp_path)

In [33]:
for sub in src_dirs:
  src_path = os.path.join(BASE_DIR,sub)
  dst_path = os.path.join(temp_path,sub)
  if not os.path.exists(dst_path) : os.makedirs(dst_path)
  !rsync -rvzh {src_path} {dst_path} --bwlimit 4096000000000000 --progress

sending incremental file list
1-1 2024 주요 재정통계 1권.pdf
         12.79M 100%   18.05MB/s    0:00:00 (xfr#1, to-chk=15/17)
2024 나라살림 예산개요.pdf
          7.63M 100%   18.82MB/s    0:00:00 (xfr#2, to-chk=14/17)
2024년도 성과계획서(총괄편).pdf
          5.66M 100%    4.14MB/s    0:00:01 (xfr#3, to-chk=13/17)
고용노동부_내일배움카드(일반).pdf
        143.29K 100%  164.44kB/s    0:00:00 (xfr#4, to-chk=12/17)
고용노동부_조기재취업수당.pdf
        166.37K 100%   63.71MB/s    0:00:00 (xfr#5, to-chk=11/17)
고용노동부_청년일자리창출지원.pdf
        135.51K 100%  200.81kB/s    0:00:00 (xfr#6, to-chk=10/17)
국토교통부_민간임대(융자).pdf
        116.94K 100%   80.28MB/s    0:00:00 (xfr#7, to-chk=9/17)
국토교통부_소규모주택정비사업.pdf
        305.15K 100%  460.59kB/s    0:00:00 (xfr#8, to-chk=8/17)
국토교통부_전세임대(융자).pdf
        136.00K 100%   98.45MB/s    0:00:00 (xfr#9, to-chk=7/17)
보건복지부_노인일자리 및 사회활동지원.pdf
 

In [34]:
## caution to unicode normalize
display('1-1 2024 주요 재정통계 1권.pdf' == normalize_path('1-1 2024 주요 재정통계 1권.pdf'))
display('1-1 2024 주요 재정통계 1권.pdf' == '1-1 2024 주요 재정통계 1권.pdf')

True

False

In [35]:
PROCESSEDDIR = os.path.join(BASE_DIR,'processed')
if not os.path.exists(PROCESSEDDIR) : os.makedirs(PROCESSEDDIR)
ERRORDIR = os.path.join(PROCESSEDDIR,'ERRORS')
if not os.path.exists(ERRORDIR) : os.makedirs(ERRORDIR)

In [37]:
reform_pdfs_from_df(train_df, temp_path,PROCESSEDDIR,'trn')
reform_pdfs_from_df(test_df, temp_path,PROCESSEDDIR,'tst');

Processing PDFs:   0%|          | 0/16 [00:00<?, ?it/s]

./train_source/1-1 2024 주요 재정통계 1권.pdf /content/src/
Processing 1-1 2024 주요 재정통계 1권.pdf...
/content/src/train_source/1-1 2024 주요 재정통계 1권.pdf /content/drive/MyDrive/kdt-EST-AI/project/dacon_fis/src/processed/./train_source/1-1 2024 주요 재정통계 1권.pdf


  0%|          | 0/137 [00:00<?, ?it/s]

detected: (106.16761850585938, 262.17433237304687, 496.45875478515626, 664.8613426757812) 	 at (0.0, 0.0, 556.0, 754.0)  in depth 0
detected in p.2 :	 1  tables
detected: (181.28433298339843, 170.5902564453125, 486.7898705078125, 636.3243553710937) 	 at (0.0, 0.0, 556.0, 754.0)  in depth 0
Filling in gap at top of table
detected in p.5 :	 1  tables
detected: (154.67742990722655, 115.70205759277344, 484.81419301757813, 655.3303368164062) 	 at (0.0, 0.0, 556.0, 754.0)  in depth 0
detected in p.6 :	 1  tables
detected: (182.44290231933593, 113.56134104003907, 484.57386708984376, 623.3132469726562) 	 at (0.0, 0.0, 556.0, 754.0)  in depth 0
detected in p.7 :	 1  tables
detected: (197.99774240722655, 113.601853125, 486.1949913574219, 584.6992333007812) 	 at (0.0, 0.0, 556.0, 754.0)  in depth 0
detected in p.9 :	 1  tables
detected: (161.9510962890625, 103.98705362548829, 485.0072472167969, 643.39961171875) 	 at (0.0, 0.0, 556.0, 754.0)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.12 :	 1  tables
detected: (50.01510974019369, 114.9621394897461, 474.5486515258789, 157.16454914957683) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (50.01510974019369, 411.4477860555013, 474.5486515258789, 453.6505314086914) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.15 :	 2  tables
detected: (63.92701290961371, 114.9621394897461, 488.6794895385742, 157.16454914957683) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (63.92701290961371, 411.4477860555013, 488.6794895385742, 453.6505314086914) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (311.4406551147461, 187.724116007487, 482.97935831298827, 380.39973631998697) 	 at (0.0, 157.16454914957683, 538.5830078125, 411.4477860555013)  in depth 1
detected: (97.8418544555664, 226.53278641153975, 252.10516702880858, 380.39973631998697) 	 at (0.0, 187.724116007487, 311.4406551147461, 380.39973631998697)  in depth 2


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.16 :	 4  tables
detected: (50.03778979159269, 114.9621394897461, 474.62398111572264, 157.16454914957683) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (50.03778979159269, 397.9834611043294, 474.62398111572264, 440.18587076416014) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (52.100555780029296, 175.3492495218913, 204.7850528930664, 343.6281603922526) 	 at (0.0, 157.16454914957683, 538.5830078125, 397.9834611043294)  in depth 1
detected: (275.6678237915039, 178.6810694163005, 461.9567482421875, 343.6281603922526) 	 at (204.7850528930664, 175.3492495218913, 538.5830078125, 343.6281603922526)  in depth 2
detected in p.17 :	 4  tables
detected: (64.03212392578125, 114.9621394897461, 488.65676135428293, 157.16454914957683) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (64.03212392578125, 397.9834611043294, 488.65676135428293, 440.18587076416014) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688) 

  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)
  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)
  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


exception :  No rows or columns detected
error :  (387.988574621582, 177.70336786905926, 406.8864500473022, 339.26182774454753)
detected in p.18 :	 6  tables


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected: (50.47743642578125, 114.9621394897461, 474.21116983642577, 157.16454914957683) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (50.47743642578125, 154.64713704833986, 474.21116983642577, 413.1056400512695) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (50.47743642578125, 428.54444539794923, 474.21116983642577, 470.746875402832) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (50.90212412923177, 157.16454914957683, 473.9226973429362, 413.10563841959635) 	 at (0.0, 157.16454914957683, 538.5830078125, 428.54444539794923)  in depth 1
detected in p.19 :	 3  tables
detected: (63.93812215576172, 114.9621394897461, 488.62988626708983, 157.16454914957683) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (81.91419637451172, 170.13764608154298, 358.6410557006836, 395.98286783447264) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (390.6940383911133, 220.181271319

  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected: (50.26509257405599, 114.9621394897461, 474.64387857666014, 157.16454914957683) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (50.26509257405599, 333.229442956543, 474.64387857666014, 375.43187296142577) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (56.70512426147461, 369.0881160522461, 457.5760624145508, 503.6998783325195) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (56.70512426147461, 519.4657609090169, 457.5760624145508, 658.329852697754) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (79.61212575683594, 159.79749715576173, 443.0735813354492, 298.19087564697264) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (79.53112447509766, 159.79748884836835, 443.0735813354492, 298.19087401529947) 	 at (0.0, 157.16454914957683, 538.5830078125, 333.229442956543)  in depth 1
detected: (56.19564473876953, 402.1082763671875, 454.9139591430664, 517.32952045

  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.21 :	 5  tables
detected: (50.26509257405599, 114.9621394897461, 474.64387857666014, 157.16454914957683) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (64.64712560424805, 178.07045400390626, 456.25487100830077, 374.1754337524414) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (50.26509257405599, 389.2134517456055, 474.64387857666014, 431.41588175048827) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (64.64712560424805, 472.97791707763673, 456.25487100830077, 664.8340844685873) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.23 :	 4  tables
detected: (64.00626681780133, 114.9621394897461, 488.5552707885742, 157.16454914957683) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (64.00626681780133, 196.3244899536133, 488.5552707885742, 380.642864714898) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (64.00626681780133, 40

  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.33 :	 2  tables
detected: (58.04240106811523, 233.34211801757812, 482.9176018310547, 641.9293358398437) 	 at (0.0, 0.0, 556.0, 754.0)  in depth 0
detected: (510.55267333984375, 445.3434614990234, 530.2152543212891, 641.9293358398437) 	 at (482.9176018310547, 233.34211801757812, 556.0, 641.9293358398437)  in depth 1
detected: (507.95045471191406, 263.7101073120117, 530.5006928588867, 641.9293358398437) 	 at (482.9176018310547, 233.34211801757812, 556.0, 641.9293358398437)  in depth 1
detected: (509.6359443664551, 234.62090633850096, 534.1127457763672, 641.9293358398437) 	 at (482.9176018310547, 233.34211801757812, 556.0, 641.9293358398437)  in depth 1
detected: (92.91384195556641, 76.48510812988282, 488.83906484375, 215.15138173828126) 	 at (0.0, 0.0, 556.0, 233.34211801757812)  in depth 1
detected: (512.8574619293213, 102.37444447021485, 534.9698166992188, 176.52246856689453) 	 at (488.83906484375, 76.48510812988282, 556.0, 215.15138173828126)  in depth 2
exception :  No 

  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.36 :	 2  tables
detected: (63.6123994913737, 41.19639658203125, 556.0, 93.411574609375) 	 at (0.0, 0.0, 556.0, 754.0)  in depth 0
detected: (75.0223967956543, 451.05641245117187, 480.39759755859376, 490.67161489257813) 	 at (0.0, 0.0, 556.0, 754.0)  in depth 0
detected: (98.62076639404297, 93.411574609375, 505.91249013671876, 202.80551768798827) 	 at (0.0, 93.411574609375, 556.0, 451.05641245117187)  in depth 1
Filling in gap at top of table
detected in p.37 :	 3  tables


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)
  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)
  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected: (58.04240106811523, 444.27638315429687, 483.39759755859376, 521.7434227539062) 	 at (0.0, 0.0, 556.0, 754.0)  in depth 0
detected in p.38 :	 1  tables
detected: (63.6123994913737, 41.19639658203125, 556.0, 93.411574609375) 	 at (0.0, 0.0, 556.0, 754.0)  in depth 0
detected: (72.20239710083008, 407.2010962890625, 497.6175987792969, 679.2635643554687) 	 at (0.0, 0.0, 556.0, 754.0)  in depth 0
Filling in gap at top of table
exception :  The identified boxes have significant overlap: 81.49% of area is overlapping (Max is 30.00%)
error :  (63.6123994913737, 41.19639658203125, 556.0, 93.411574609375)
detected in p.39 :	 1  tables
detected: (58.04240106811523, 80.736374609375, 483.45761037597657, 649.8394310546875) 	 at (0.0, 0.0, 556.0, 754.0)  in depth 0
detected: (514.2798194885254, 80.736374609375, 536.6038651611328, 492.1121520996094) 	 at (483.45761037597657, 80.736374609375, 556.0, 649.8394310546875)  in depth 1
detected: (508.7919235229492, 80.736374609375, 534.9229607727051

  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.45 :	 1  tables
detected: (58.04240106811523, 164.99232553710937, 483.2475884033203, 400.0973351074219) 	 at (0.0, 0.0, 556.0, 754.0)  in depth 0
detected: (58.04240106811523, 497.79640268554687, 483.2475884033203, 653.88105703125) 	 at (0.0, 0.0, 556.0, 754.0)  in depth 0
detected: (510.1627960205078, 571.046006665039, 535.6560921813965, 645.5890655517578) 	 at (483.2475884033203, 497.79640268554687, 556.0, 653.88105703125)  in depth 1
detected: (56.235783331298826, 145.419968359375, 484.94212270507813, 164.99232553710937) 	 at (0.0, 0.0, 556.0, 164.99232553710937)  in depth 1
detected: (485.3305149078369, 145.419968359375, 552.8097977783203, 164.99232553710937) 	 at (484.94212270507813, 145.419968359375, 556.0, 164.99232553710937)  in depth 2
exception :  The identified boxes have significant overlap: 125.40% of area is overlapping (Max is 30.00%)
error :  (485.3305149078369, 145.419968359375, 552.8097977783203, 164.99232553710937)
detected in p.46 :	 4  tables
detected

  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.49 :	 1  tables
detected: (75.74239801635743, 214.561569921875, 479.85760427246095, 661.3231346679687) 	 at (0.0, 0.0, 556.0, 754.0)  in depth 0
detected: (485.81422233581543, 214.561569921875, 546.5346512939453, 652.6962280273438) 	 at (479.85760427246095, 214.561569921875, 556.0, 661.3231346679687)  in depth 1
detected: (47.7512671875, 90.33945535888672, 520.8496849609375, 214.3497307373047) 	 at (0.0, 0.0, 556.0, 214.561569921875)  in depth 1
exception :  No rows or columns detected
error :  (485.81422233581543, 214.561569921875, 546.5346512939453, 652.6962280273438)
exception :  The identified boxes have significant overlap: 33.85% of area is overlapping (Max is 30.00%)
error :  (47.7512671875, 90.33945535888672, 520.8496849609375, 214.3497307373047)
detected in p.50 :	 1  tables
detected: (63.6123994913737, 41.19639658203125, 556.0, 93.411574609375) 	 at (0.0, 0.0, 556.0, 754.0)  in depth 0
detected: (72.20239710083008, 181.394821875, 497.3775932861328, 649.986830957

  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected: (58.04240106811523, 480.57814096679687, 483.2775871826172, 679.2093651367187) 	 at (0.0, 0.0, 556.0, 754.0)  in depth 0
detected: (510.5171012878418, 566.8933577392578, 534.5011086608887, 657.1777191162109) 	 at (483.2775871826172, 480.57814096679687, 556.0, 679.2093651367187)  in depth 1
exception :  No rows or columns detected
error :  (510.5171012878418, 566.8933577392578, 534.5011086608887, 657.1777191162109)
detected in p.54 :	 1  tables
detected: (63.6123994913737, 41.19639658203125, 556.0, 644.3733055664062) 	 at (0.0, 0.0, 556.0, 754.0)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.55 :	 1  tables
detected: (58.04240106811523, 166.03006052246093, 483.3376, 670.3652489257812) 	 at (0.0, 0.0, 556.0, 754.0)  in depth 0
detected: (508.8524589538574, 218.0155615661621, 533.7627015258789, 658.5581207275391) 	 at (483.3376, 166.03006052246093, 556.0, 670.3652489257812)  in depth 1
detected: (513.2426452636719, 213.00618885498045, 534.7271218444824, 596.8578643798828) 	 at (483.3376, 166.03006052246093, 556.0, 670.3652489257812)  in depth 1
exception :  No rows or columns detected
error :  (508.8524589538574, 213.00618885498045, 534.7271218444824, 658.5581207275391)
detected in p.56 :	 1  tables
detected: (63.6123994913737, 41.19639658203125, 556.0, 682.5964500976562) 	 at (0.0, 0.0, 556.0, 754.0)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.57 :	 1  tables
detected: (58.04240106811523, 80.736374609375, 483.3376, 632.8257591796875) 	 at (0.0, 0.0, 556.0, 754.0)  in depth 0
detected: (508.1311569213867, 80.736374609375, 536.7985215332031, 632.8257591796875) 	 at (483.3376, 80.736374609375, 556.0, 632.8257591796875)  in depth 1
detected in p.58 :	 2  tables
detected: (63.6123994913737, 41.19639658203125, 556.0, 93.411574609375) 	 at (0.0, 0.0, 556.0, 754.0)  in depth 0
detected: (72.20239710083008, 168.84422373046874, 497.9775841308594, 683.065871484375) 	 at (0.0, 0.0, 556.0, 754.0)  in depth 0
Filling in gap at top of table
exception :  The identified boxes have significant overlap: 96.65% of area is overlapping (Max is 30.00%)
error :  (63.6123994913737, 41.19639658203125, 556.0, 93.411574609375)
detected in p.59 :	 1  tables
detected: (58.04240106811523, 80.736374609375, 483.81759572753907, 553.6569359375) 	 at (0.0, 0.0, 556.0, 754.0)  in depth 0
detected: (514.3370819091797, 80.736374609375, 536.580355181

  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected: (58.04240106811523, 166.84396433105468, 484.41761708984376, 632.500930078125) 	 at (0.0, 0.0, 556.0, 754.0)  in depth 0
detected: (513.9482383728027, 211.59060810546873, 535.1559662963867, 632.500930078125) 	 at (484.41761708984376, 166.84396433105468, 556.0, 632.500930078125)  in depth 1
detected: (50.42978739013672, 87.10965799560547, 492.46314931640626, 161.34271169433595) 	 at (0.0, 0.0, 556.0, 166.84396433105468)  in depth 1
detected in p.62 :	 3  tables
detected: (63.6123994913737, 41.19639658203125, 556.0, 93.411574609375) 	 at (0.0, 0.0, 556.0, 754.0)  in depth 0
detected: (75.11239694824219, 270.12037729492187, 479.40760732421876, 661.9751732421875) 	 at (0.0, 0.0, 556.0, 754.0)  in depth 0
detected: (72.18076395263672, 93.411574609375, 482.4360802246094, 268.54122783203127) 	 at (0.0, 93.411574609375, 556.0, 270.12037729492187)  in depth 1
Filling in gap at top of table


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)
  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


Filling in gap at top of table
detected in p.63 :	 3  tables
detected: (57.69658731689453, 175.30050729980468, 483.3675987792969, 634.8647606445312) 	 at (0.0, 0.0, 556.0, 754.0)  in depth 0
detected: (513.5808525085449, 215.97095631103514, 533.8905739929199, 634.8647606445312) 	 at (483.3675987792969, 175.30050729980468, 556.0, 634.8647606445312)  in depth 1
detected: (46.89956354370117, 79.16478800048829, 495.74708486328126, 172.24209523925782) 	 at (0.0, 0.0, 556.0, 175.30050729980468)  in depth 1
detected: (512.3548812866211, 105.89547108154298, 537.685884967041, 172.24209523925782) 	 at (495.74708486328126, 79.16478800048829, 556.0, 172.24209523925782)  in depth 2
table does not sound


Unnamed: 0,Unnamed: 1


exception :  table does not sound
errs :  1
detected in p.64 :	 3  tables
detected: (63.6123994913737, 41.19639658203125, 556.0, 93.411574609375) 	 at (0.0, 0.0, 556.0, 754.0)  in depth 0
detected: (78.7124005086263, 235.29637216796874, 477.6376183105469, 605.5716087890625) 	 at (0.0, 0.0, 556.0, 754.0)  in depth 0
detected: (71.45253061523438, 93.411574609375, 556.0, 235.29637216796874) 	 at (0.0, 93.411574609375, 556.0, 235.29637216796874)  in depth 1
Filling in gap at top of table
exception :  The identified boxes have significant overlap: 49.33% of area is overlapping (Max is 30.00%)
error :  (71.45253061523438, 93.411574609375, 556.0, 235.29637216796874)
detected in p.65 :	 2  tables


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)
  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected: (58.04240106811523, 141.74908518066405, 483.27760244140626, 633.8734276367187) 	 at (0.0, 0.0, 556.0, 754.0)  in depth 0
detected: (509.01636123657227, 179.55132435302733, 533.0187974121094, 633.8734276367187) 	 at (483.27760244140626, 141.74908518066405, 556.0, 633.8734276367187)  in depth 1
detected: (488.8713388442993, 141.74908518066405, 539.8649315979004, 179.55132435302733) 	 at (483.27760244140626, 141.74908518066405, 556.0, 179.55132435302733)  in depth 2


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.66 :	 3  tables
detected: (63.6123994913737, 41.19639658203125, 556.0, 93.411574609375) 	 at (0.0, 0.0, 556.0, 754.0)  in depth 0
detected: (72.20239710083008, 393.8763892578125, 497.40760732421876, 479.91624379882813) 	 at (0.0, 0.0, 556.0, 754.0)  in depth 0
detected: (72.20239710083008, 555.1564185546875, 497.40760732421876, 669.107863671875) 	 at (0.0, 0.0, 556.0, 754.0)  in depth 0
exception :  The identified boxes have significant overlap: 60.71% of area is overlapping (Max is 30.00%)
error :  (63.6123994913737, 41.19639658203125, 556.0, 93.411574609375)
detected in p.67 :	 2  tables
detected: (58.04240106811523, 135.65217661132812, 483.21758962402345, 649.8818504882812) 	 at (0.0, 0.0, 556.0, 754.0)  in depth 0
detected: (507.9073848724365, 135.65217661132812, 532.7137894775391, 649.8818504882811) 	 at (483.21758962402345, 135.65217661132812, 556.0, 649.8818504882812)  in depth 1
exception :  No rows or columns detected
error :  (507.9073848724365, 135.652176611328

  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)
  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


table does not sound


Unnamed: 0,Unnamed: 1,Unnamed: 2,Unnamed: 3,Unnamed: 4


exception :  table does not sound
errs :  1
detected in p.70 :	 6  tables


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected: (63.6123994913737, 41.19639658203125, 556.0, 93.411574609375) 	 at (0.0, 0.0, 556.0, 754.0)  in depth 0
detected: (72.20239710083008, 160.80777048339843, 497.3776085449219, 656.1749413085937) 	 at (0.0, 0.0, 556.0, 754.0)  in depth 0
Filling in gap at top of table
exception :  The identified boxes have significant overlap: 55.03% of area is overlapping (Max is 30.00%)
error :  (63.6123994913737, 41.19639658203125, 556.0, 93.411574609375)
detected in p.71 :	 1  tables
detected: (58.04240106811523, 145.920944921875, 483.2176048828125, 663.000930078125) 	 at (0.0, 0.0, 556.0, 754.0)  in depth 0
detected: (507.8150577545166, 145.920944921875, 532.4707093383789, 648.8040161132812) 	 at (483.2176048828125, 145.920944921875, 556.0, 663.000930078125)  in depth 1
exception :  No rows or columns detected
error :  (507.8150577545166, 145.920944921875, 532.4707093383789, 648.8040161132812)
detected in p.72 :	 1  tables
detected: (63.6123994913737, 41.19639658203125, 556.0, 93.41157460937

  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected: (58.04240106811523, 141.02754282226562, 483.2176048828125, 655.7586205078125) 	 at (0.0, 0.0, 556.0, 754.0)  in depth 0
detected: (507.86230659484863, 141.02754282226562, 532.4931931640625, 646.7403259277344) 	 at (483.2176048828125, 141.02754282226562, 556.0, 655.7586205078125)  in depth 1
exception :  No rows or columns detected
error :  (507.86230659484863, 141.02754282226562, 532.4931931640625, 646.7403259277344)
detected in p.74 :	 1  tables
detected: (63.6123994913737, 41.19639658203125, 556.0, 93.411574609375) 	 at (0.0, 0.0, 556.0, 754.0)  in depth 0
detected: (75.0223967956543, 252.51640390625, 477.57760549316407, 542.2581932617187) 	 at (0.0, 0.0, 556.0, 754.0)  in depth 0
detected: (66.91192697753907, 93.411574609375, 556.0, 252.51640390625) 	 at (0.0, 93.411574609375, 556.0, 252.51640390625)  in depth 1
Filling in gap at top of table
detected in p.75 :	 3  tables


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)
  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected: (58.04240106811523, 190.90065073242187, 483.27760244140626, 623.6825096679687) 	 at (0.0, 0.0, 556.0, 754.0)  in depth 0
detected: (511.95357513427734, 191.32189224700926, 532.5071473266602, 425.3271484375) 	 at (483.27760244140626, 190.90065073242187, 556.0, 623.6825096679687)  in depth 1
detected: (513.2022438049316, 211.120978817749, 533.3483224060059, 563.019775390625) 	 at (483.27760244140626, 190.90065073242187, 556.0, 623.6825096679687)  in depth 1
detected: (510.71828842163086, 368.86526630859373, 531.2711167480469, 623.6825096679687) 	 at (483.27760244140626, 190.90065073242187, 556.0, 623.6825096679687)  in depth 1
detected: (509.8399353027344, 237.4175505493164, 531.8055482055664, 623.6825096679687) 	 at (483.27760244140626, 190.90065073242187, 556.0, 623.6825096679687)  in depth 1
detected: (511.80811309814453, 215.59061382751463, 534.344015612793, 623.6825096679687) 	 at (483.27760244140626, 190.90065073242187, 556.0, 623.6825096679687)  in depth 1
detected: (50.

  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.78 :	 2  tables
detected: (63.6123994913737, 41.19639658203125, 556.0, 93.411574609375) 	 at (0.0, 0.0, 556.0, 754.0)  in depth 0
detected: (72.20239710083008, 292.52342294921874, 497.43760610351563, 602.3794090820312) 	 at (0.0, 0.0, 556.0, 754.0)  in depth 0
detected: (149.45008920898437, 157.714111328125, 495.4187462402344, 247.87144328613283) 	 at (0.0, 93.411574609375, 556.0, 292.52342294921874)  in depth 1
Filling in gap at top of table
exception :  The identified boxes have significant overlap: 93.87% of area is overlapping (Max is 30.00%)
error :  (63.6123994913737, 41.19639658203125, 556.0, 93.411574609375)
detected in p.79 :	 2  tables
detected: (58.04240106811523, 154.14483713378905, 483.21758962402345, 660.3211815429687) 	 at (0.0, 0.0, 556.0, 754.0)  in depth 0
detected: (508.32080459594727, 186.5984358642578, 533.2198853637695, 650.7772979736328) 	 at (483.21758962402345, 154.14483713378905, 556.0, 660.3211815429687)  in depth 1
detected: (492.3889102935791,

  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected: (63.6123994913737, 41.19639658203125, 556.0, 93.411574609375) 	 at (0.0, 0.0, 556.0, 754.0)  in depth 0
detected: (72.20239710083008, 134.74580454101562, 497.3776085449219, 624.2804100585937) 	 at (0.0, 0.0, 556.0, 754.0)  in depth 0
Filling in gap at top of table
exception :  The identified boxes have significant overlap: 105.67% of area is overlapping (Max is 30.00%)
error :  (63.6123994913737, 41.19639658203125, 556.0, 93.411574609375)
detected in p.81 :	 1  tables
detected: (58.04240106811523, 135.88211130371093, 483.21758962402345, 633.2837669921875) 	 at (0.0, 0.0, 556.0, 754.0)  in depth 0
detected: (508.36968994140625, 135.88211130371093, 533.4263901855469, 633.2837669921875) 	 at (483.21758962402345, 135.88211130371093, 556.0, 633.2837669921875)  in depth 1
detected in p.82 :	 2  tables
detected: (63.6123994913737, 41.19639658203125, 556.0, 93.411574609375) 	 at (0.0, 0.0, 556.0, 754.0)  in depth 0
detected: (72.20239710083008, 151.6936042236328, 497.3776085449219, 6

  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected: (58.04240106811523, 141.986862890625, 480.5176079345703, 641.4082787109375) 	 at (0.0, 0.0, 556.0, 754.0)  in depth 0
detected: (511.12511444091797, 141.986862890625, 533.2381196166992, 617.9985656738281) 	 at (480.5176079345703, 141.986862890625, 556.0, 641.4082787109375)  in depth 1
detected: (506.9401512145996, 141.986862890625, 530.5020013000488, 641.4082787109375) 	 at (480.5176079345703, 141.986862890625, 556.0, 641.4082787109375)  in depth 1
detected in p.88 :	 2  tables
detected: (63.6123994913737, 41.19639658203125, 556.0, 93.411574609375) 	 at (0.0, 0.0, 556.0, 754.0)  in depth 0
detected: (72.20239710083008, 134.82228159179687, 497.3776085449219, 645.1687767578125) 	 at (0.0, 0.0, 556.0, 754.0)  in depth 0
Filling in gap at top of table
exception :  The identified boxes have significant overlap: 143.31% of area is overlapping (Max is 30.00%)
error :  (63.6123994913737, 41.19639658203125, 556.0, 93.411574609375)
detected in p.89 :	 1  tables
detected: (58.0424010681

  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.93 :	 1  tables
detected: (0.0, 48.136419368489584, 493.5876, 702.8316185546875) 	 at (0.0, 0.0, 556.0, 754.0)  in depth 0
detected: (511.4871063232422, 164.35292703450523, 530.7353100921631, 702.8316185546875) 	 at (493.5876, 48.136419368489584, 556.0, 702.8316185546875)  in depth 1
detected: (513.2592144012451, 295.5276249104818, 531.2771020080567, 702.8316185546875) 	 at (493.5876, 48.136419368489584, 556.0, 702.8316185546875)  in depth 1
detected: (511.58531951904297, 48.136419368489584, 530.5104146148682, 589.4162190755208) 	 at (493.5876, 48.136419368489584, 556.0, 702.8316185546875)  in depth 1
detected: (512.7431716918945, 48.136419368489584, 531.1697335388184, 410.89388020833337) 	 at (493.5876, 48.136419368489584, 556.0, 702.8316185546875)  in depth 1
detected: (512.1368961334229, 95.384188478597, 531.3536572601319, 702.8316185546875) 	 at (493.5876, 48.136419368489584, 556.0, 702.8316185546875)  in depth 1
detected: (511.76224517822266, 78.61436349690754, 530.6

  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


exception :  list index out of range
error :  (511.4871063232422, 76.7725818806966, 531.3536572601319, 702.8316185546875)
detected in p.94 :	 1  tables
detected: (63.6123994913737, 41.19639658203125, 556.0, 93.411574609375) 	 at (0.0, 0.0, 556.0, 754.0)  in depth 0
Filling in gap at top of table
exception :  The identified boxes have significant overlap: 72.33% of area is overlapping (Max is 30.00%)
error :  (63.6123994913737, 41.19639658203125, 556.0, 93.411574609375)
detected: (58.04239725341797, 94.35276865234376, 484.17759633789063, 599.4027245117187) 	 at (0.0, 0.0, 556.0, 754.0)  in depth 0
detected: (510.11183166503906, 94.35276865234376, 535.4008736755371, 584.550048828125) 	 at (484.17759633789063, 94.35276865234376, 556.0, 599.4027245117187)  in depth 1
detected in p.96 :	 2  tables
detected: (63.6123994913737, 41.19639658203125, 556.0, 93.411574609375) 	 at (0.0, 0.0, 556.0, 754.0)  in depth 0
detected: (72.20239328613282, 96.92528604736329, 497.34759959309895, 618.422560937

  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


Filling in gap at top of table
detected in p.101 :	 4  tables
detected: (0.0, 49.88140939941406, 493.5876, 704.5116112304687) 	 at (0.0, 0.0, 556.0, 754.0)  in depth 0
detected: (511.59666442871094, 159.97292469482423, 530.8235154296875, 704.5116112304687) 	 at (493.5876, 49.88140939941406, 556.0, 704.5116112304687)  in depth 1
detected: (513.3475933074951, 279.2244887207031, 531.3755345489502, 704.5116112304687) 	 at (493.5876, 49.88140939941406, 556.0, 704.5116112304687)  in depth 1
detected: (511.57434272766113, 49.88140939941406, 530.5055947448731, 584.1537628173828) 	 at (493.5876, 49.88140939941406, 556.0, 704.5116112304687)  in depth 1
detected: (512.781286239624, 49.88140939941406, 531.212610736084, 400.19615173339844) 	 at (493.5876, 49.88140939941406, 556.0, 704.5116112304687)  in depth 1
detected: (512.1860446929932, 92.84768436889648, 531.3808636810303, 703.1012115478516) 	 at (493.5876, 49.88140939941406, 556.0, 704.5116112304687)  in depth 1
detected: (511.80723762512207,

  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


exception :  list index out of range
error :  (511.59666442871094, 73.97473858337402, 531.3808636810303, 704.5116112304687)
detected in p.102 :	 1  tables
detected: (63.6123994913737, 41.19639658203125, 556.0, 93.411574609375) 	 at (0.0, 0.0, 556.0, 754.0)  in depth 0
detected: (72.20239710083008, 186.2570044921875, 497.34759959309895, 670.6468041015625) 	 at (0.0, 0.0, 556.0, 754.0)  in depth 0
Filling in gap at top of table
exception :  The identified boxes have significant overlap: 138.31% of area is overlapping (Max is 30.00%)
error :  (63.6123994913737, 41.19639658203125, 556.0, 93.411574609375)
detected in p.103 :	 1  tables
detected: (58.04240106811523, 80.736374609375, 483.18760610351563, 548.5952904296875) 	 at (0.0, 0.0, 556.0, 754.0)  in depth 0
detected: (513.8092079162598, 80.736374609375, 535.5365586425781, 467.70013427734375) 	 at (483.18760610351563, 80.736374609375, 556.0, 548.5952904296875)  in depth 1
detected: (510.14403533935547, 80.736374609375, 533.8989739562988,

  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.107 :	 3  tables
detected: (0.0, 48.29641285807291, 493.5876, 705.8916161132812) 	 at (0.0, 0.0, 556.0, 754.0)  in depth 0
detected: (510.98949432373047, 182.54092040201823, 530.5044655944824, 705.8916161132812) 	 at (493.5876, 48.29641285807291, 556.0, 705.8916161132812)  in depth 1
detected: (513.061408996582, 335.3213311686198, 531.4066205169678, 705.8916161132812) 	 at (493.5876, 48.29641285807291, 556.0, 705.8916161132812)  in depth 1
detected: (511.1003837585449, 48.29641285807291, 530.2921662475586, 561.7083536783854) 	 at (493.5876, 48.29641285807291, 556.0, 705.8916161132812)  in depth 1
detected: (512.1620807647705, 48.29641285807291, 530.9011902954102, 375.5569559733073) 	 at (493.5876, 48.29641285807291, 556.0, 705.8916161132812)  in depth 1
detected: (511.90734672546387, 102.45469679972331, 531.3776078369141, 705.8916161132812) 	 at (493.5876, 48.29641285807291, 556.0, 705.8916161132812)  in depth 1
detected: (511.4331817626953, 77.97995772501628, 530.4971261

  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


exception :  list index out of range
error :  (510.98949432373047, 74.30570616861979, 531.4066205169678, 705.8916161132812)
detected in p.108 :	 1  tables
detected: (63.6123994913737, 41.19639658203125, 556.0, 93.411574609375) 	 at (0.0, 0.0, 556.0, 754.0)  in depth 0
Filling in gap at top of table
exception :  The identified boxes have significant overlap: 76.61% of area is overlapping (Max is 30.00%)
error :  (63.6123994913737, 41.19639658203125, 556.0, 93.411574609375)
detected: (58.04240106811523, 102.79741739501954, 483.30760122070313, 630.8676903320312) 	 at (0.0, 0.0, 556.0, 754.0)  in depth 0
detected: (513.3724365234375, 102.79741739501954, 535.1304459716797, 446.9816665649414) 	 at (483.30760122070313, 102.79741739501954, 556.0, 630.8676903320312)  in depth 1
detected: (507.381290435791, 102.79741739501954, 534.4263749267578, 630.8676903320312) 	 at (483.30760122070313, 102.79741739501954, 556.0, 630.8676903320312)  in depth 1
detected in p.110 :	 2  tables
detected: (63.6123

  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


exception :  No rows or columns detected
error :  (510.8658676147461, 301.7223365234375, 535.7474122192383, 634.2408142089844)
detected in p.120 :	 1  tables
detected: (63.6123994913737, 41.19639658203125, 556.0, 93.411574609375) 	 at (0.0, 0.0, 556.0, 754.0)  in depth 0
detected: (72.23239778747559, 165.996384375, 497.7376015258789, 417.6509239746094) 	 at (0.0, 0.0, 556.0, 754.0)  in depth 0
detected: (72.23239778747559, 544.9114844726563, 497.7376015258789, 660.1353905273437) 	 at (0.0, 0.0, 556.0, 754.0)  in depth 0
Filling in gap at top of table
exception :  The identified boxes have significant overlap: 74.08% of area is overlapping (Max is 30.00%)
error :  (63.6123994913737, 41.19639658203125, 556.0, 93.411574609375)
detected in p.121 :	 2  tables
detected: (58.04240106811523, 263.35020517578124, 483.21758962402345, 628.8802025390625) 	 at (0.0, 0.0, 556.0, 754.0)  in depth 0
detected: (507.4714546203613, 263.35020517578124, 531.3145013000488, 628.8802025390625) 	 at (483.217589

  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected: (58.04240106811523, 397.776413671875, 483.5176079345703, 620.7642357421875) 	 at (0.0, 0.0, 556.0, 754.0)  in depth 0
detected: (510.1268119812012, 575.9777693603515, 534.4307809020996, 620.7642357421875) 	 at (483.5176079345703, 397.776413671875, 556.0, 620.7642357421875)  in depth 1
exception :  list index out of range
error :  (510.1268119812012, 575.9777693603515, 534.4307809020996, 620.7642357421875)
detected in p.126 :	 1  tables
detected: (63.6123994913737, 41.19639658203125, 556.0, 93.411574609375) 	 at (0.0, 0.0, 556.0, 754.0)  in depth 0
detected: (72.23239715169271, 453.2163855957031, 497.5876, 653.3457787109375) 	 at (0.0, 0.0, 556.0, 754.0)  in depth 0
Filling in gap at top of table
exception :  The identified boxes have significant overlap: 131.18% of area is overlapping (Max is 30.00%)
error :  (63.6123994913737, 41.19639658203125, 556.0, 93.411574609375)
detected in p.127 :	 1  tables
detected: (58.04240106811523, 461.9147193359375, 483.6676018310547, 657.5755

  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


exception :  No rows or columns detected
error :  (509.6331100463867, 101.33640360107422, 533.3627801086426, 463.3010787963867)
detected in p.130 :	 1  tables
detected: (63.6123994913737, 41.19639658203125, 556.0, 93.411574609375) 	 at (0.0, 0.0, 556.0, 754.0)  in depth 0
Filling in gap at top of table
detected in p.131 :	 1  tables
detected: (58.04240106811523, 216.52439645996094, 483.72759938964845, 332.4866478515625) 	 at (0.0, 0.0, 556.0, 754.0)  in depth 0
detected: (52.894589178466795, 106.40469431152344, 518.9245140625, 170.47758413085938) 	 at (0.0, 0.0, 556.0, 216.52439645996094)  in depth 1
detected in p.132 :	 2  tables
detected: (63.6123994913737, 41.19639658203125, 556.0, 93.411574609375) 	 at (0.0, 0.0, 556.0, 754.0)  in depth 0
detected: (72.20239710083008, 524.9764127999442, 497.67759633789063, 673.2835838867187) 	 at (0.0, 0.0, 556.0, 754.0)  in depth 0
exception :  The identified boxes have significant overlap: 148.79% of area is overlapping (Max is 30.00%)
error :  (

  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected: (58.04240106811523, 481.3564002441406, 483.7875969482422, 654.3969872070312) 	 at (0.0, 0.0, 556.0, 754.0)  in depth 0
detected: (510.6060600280762, 571.3387236450195, 535.4316582824707, 646.9358215332031) 	 at (483.7875969482422, 481.3564002441406, 556.0, 654.3969872070312)  in depth 1


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


exception :  No rows or columns detected
error :  (510.6060600280762, 571.3387236450195, 535.4316582824707, 646.9358215332031)
detected in p.136 :	 1  tables
195 113 195
./train_source/2024 나라살림 예산개요.pdf /content/src/
Processing 2024 나라살림 예산개요.pdf...
/content/src/train_source/2024 나라살림 예산개요.pdf /content/drive/MyDrive/kdt-EST-AI/project/dacon_fis/src/processed/./train_source/2024 나라살림 예산개요.pdf


  0%|          | 0/314 [00:00<?, ?it/s]

detected: (49.465633752441406, 30.693985345458984, 166.19385111083983, 101.52121317138672) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.0 :	 1  tables
detected: (56.615623834228515, 65.35612142333984, 219.15677225341796, 106.35188638916016) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
exception :  The identified boxes have significant overlap: 63.35% of area is overlapping (Max is 30.00%)
error :  (56.615623834228515, 65.35612142333984, 219.15677225341796, 106.35188638916016)
detected: (349.0251277709961, 70.21669042358398, 481.73164713134764, 101.49095499267578) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.2 :	 1  tables
detected: (344.71012532958986, 33.09214437255859, 474.4845157836914, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.8 :	 1  tables
detected: (63.937923791503906, 33.09214437255859, 174.63487589111327, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.9 :	 1  tables
detected: (344.71012532958986, 33.09214437255859, 474.6447941040039, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (80.61442220458984, 100.00746572265625, 397.3158146118164, 600.402118322754) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


exception :  The identified boxes have significant overlap: 32.91% of area is overlapping (Max is 30.00%)
error :  (80.61442220458984, 100.00746572265625, 397.3158146118164, 600.402118322754)
detected in p.10 :	 1  tables
detected: (63.937923791503906, 33.09214437255859, 174.63487589111327, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.11 :	 1  tables
detected: (344.71012532958986, 33.09214437255859, 474.6447941040039, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (79.5702480102539, 100.45055806884766, 415.52928507080077, 660.4654728149414) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


exception :  The identified boxes have significant overlap: 35.18% of area is overlapping (Max is 30.00%)
error :  (79.5702480102539, 100.45055806884766, 415.52928507080077, 660.4654728149414)
detected in p.12 :	 1  tables
detected: (63.937923791503906, 33.09214437255859, 174.63487589111327, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.13 :	 1  tables
detected: (344.71012532958986, 33.09214437255859, 474.6447941040039, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (80.3805965209961, 92.94704091796875, 402.2480655883789, 643.6858707641602) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


exception :  The identified boxes have significant overlap: 33.75% of area is overlapping (Max is 30.00%)
error :  (80.3805965209961, 92.94704091796875, 402.2480655883789, 643.6858707641602)
detected in p.14 :	 1  tables
detected: (63.937923791503906, 33.09214437255859, 174.63487589111327, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (65.91046560058594, 131.85474813232423, 331.2320743774414, 667.3730045532227) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.15 :	 2  tables
detected: (344.71012532958986, 33.09214437255859, 474.6447941040039, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (80.95605886230469, 104.06467092285156, 387.888965246582, 643.526874182129) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


exception :  The identified boxes have significant overlap: 33.93% of area is overlapping (Max is 30.00%)
error :  (80.95605886230469, 104.06467092285156, 387.888965246582, 643.526874182129)
detected in p.16 :	 1  tables
detected: (63.937923791503906, 33.09214437255859, 174.63487589111327, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (62.83668935546875, 104.12819326171875, 368.73082315673827, 667.0338932250977) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


exception :  The identified boxes have significant overlap: 30.33% of area is overlapping (Max is 30.00%)
error :  (62.83668935546875, 104.12819326171875, 368.73082315673827, 667.0338932250977)
detected in p.17 :	 1  tables
detected: (344.71012532958986, 33.09214437255859, 474.6447941040039, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (80.36610830078125, 91.17310750732422, 412.45067178955077, 661.489520666504) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


exception :  The identified boxes have significant overlap: 36.27% of area is overlapping (Max is 30.00%)
error :  (80.36610830078125, 91.17310750732422, 412.45067178955077, 661.489520666504)
detected in p.18 :	 1  tables
detected: (63.937923791503906, 33.09214437255859, 174.63487589111327, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (63.74367940673828, 90.83350026855469, 296.02681314697264, 669.8951603149414) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


Filling in gap at top of table
detected in p.19 :	 2  tables
detected: (344.71012532958986, 33.09214437255859, 474.6447941040039, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (81.07874715576172, 95.09920919189453, 420.1833683227539, 582.5309025024414) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.20 :	 2  tables
detected: (63.937923791503906, 33.09214437255859, 174.63487589111327, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (63.14352071533203, 103.9420741821289, 350.2516361450195, 631.623553869629) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.21 :	 2  tables
detected: (344.71012532958986, 33.09214437255859, 474.6447941040039, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (81.7035716796875, 102.40396535644531, 369.504687902832, 579.1539493774414) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


exception :  The identified boxes have significant overlap: 35.02% of area is overlapping (Max is 30.00%)
error :  (81.7035716796875, 102.40396535644531, 369.504687902832, 579.1539493774414)
detected in p.22 :	 1  tables
detected: (63.937923791503906, 33.09214437255859, 174.63487589111327, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.23 :	 1  tables
detected: (344.71012532958986, 33.09214437255859, 474.6447941040039, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.24 :	 1  tables
detected: (63.93817556152344, 33.09214437255859, 174.63487589111327, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (65.71325338134766, 218.64997518310548, 320.3548771118164, 621.3927189086914) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.27 :	 2  tables
detected: (344.71012532958986, 33.09214437255859, 474.6447941040039, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (80.91174733886719, 102.02887380371094, 359.46629678955077, 563.7361637329102) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.28 :	 2  tables
detected: (63.937923791503906, 33.09214437255859, 174.63487589111327, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (64.12557637939453, 108.85490072021484, 458.84956705322264, 651.625995275879) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


exception :  The identified boxes have significant overlap: 32.25% of area is overlapping (Max is 30.00%)
error :  (64.12557637939453, 108.85490072021484, 458.84956705322264, 651.625995275879)
detected in p.29 :	 1  tables
detected: (344.71012532958986, 33.09214437255859, 474.6447941040039, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (82.55669057617187, 102.88779104003906, 340.26912271728514, 599.1190983032227) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


exception :  The identified boxes have significant overlap: 33.36% of area is overlapping (Max is 30.00%)
error :  (82.55669057617187, 102.88779104003906, 340.26912271728514, 599.1190983032227)
detected in p.30 :	 1  tables
detected: (63.937923791503906, 33.09214437255859, 174.63487589111327, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (63.31973684082031, 98.80473745117187, 352.22737467041014, 663.204120275879) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.31 :	 2  tables
detected: (344.71012532958986, 33.09214437255859, 474.6447941040039, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (80.36347615966797, 112.93844259033203, 465.9967228149414, 603.5064274047852) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


exception :  The identified boxes have significant overlap: 32.32% of area is overlapping (Max is 30.00%)
error :  (80.36347615966797, 112.93844259033203, 465.9967228149414, 603.5064274047852)
detected in p.32 :	 1  tables
detected: (63.937923791503906, 33.09214437255859, 174.63487589111327, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (65.05871999511719, 121.19465291748047, 287.8639103149414, 611.5890690063477) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


exception :  The identified boxes have significant overlap: 32.29% of area is overlapping (Max is 30.00%)
error :  (65.05871999511719, 121.19465291748047, 287.8639103149414, 611.5890690063477)
detected in p.33 :	 1  tables
detected: (344.71012532958986, 33.09214437255859, 474.6447941040039, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (80.52187764892578, 102.22414052734375, 372.90953409423827, 589.171649572754) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


exception :  The identified boxes have significant overlap: 32.09% of area is overlapping (Max is 30.00%)
error :  (80.52187764892578, 102.22414052734375, 372.90953409423827, 589.171649572754)
detected in p.34 :	 1  tables
detected: (63.937923791503906, 33.09214437255859, 174.63487589111327, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.35 :	 1  tables
detected: (344.71012532958986, 33.09214437255859, 474.6447941040039, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.36 :	 1  tables
detected: (78.11112630615234, 360.8319438720703, 474.4986759399414, 454.057666418457) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (482.3903408050537, 360.8319438720703, 532.5715897033691, 443.56502532958984) 	 at (474.4986759399414, 360.8319438720703, 538.5830078125, 454.057666418457)  in depth 1
detected: (78.11112630615234, 603.4559444213867, 474.4986759399414, 668.3351627563477) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (75.44180715332031, 232.90214193115236, 472.21556436767577, 325.28074991455077) 	 at (0.0, 0.0, 538.5830078125, 360.8319438720703)  in depth 1
detected: (481.6947841644287, 232.90214193115236, 528.8897011230468, 325.28074991455077) 	 at (472.21556436767577, 232.90214193115236, 538.5830078125, 325.28074991455077)  in depth 2
detected: (69.80060231933594, 455.2197847366333, 489.1095157836914, 595.3314430664062) 	 at (0.0, 454.057666418457, 538.5830078125, 603.4559444213867)  in de

  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


Unnamed: 0,Unnamed: 1,Unnamed: 2,Unnamed: 3,Unnamed: 4


exception :  table does not sound


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


errs :  1
detected in p.42 :	 6  tables
detected: (63.93824041137695, 33.09214437255859, 174.63487589111327, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (63.93824041137695, 493.7788373521593, 460.47498412679033, 658.2466617797852) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.43 :	 2  tables
detected: (344.71012532958986, 33.09214437255859, 475.1443592285156, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.44 :	 1  tables
detected: (63.93820302734375, 33.09214437255859, 174.63487589111327, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (63.93820302734375, 465.0632577894423, 460.4708744262695, 666.0057560180664) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.45 :	 2  tables
detected: (344.71012532958986, 33.09214437255859, 474.64407999267576, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (78.11132657775879, 300.6639657972548, 474.64407999267576, 402.39366495361327) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)
  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.46 :	 2  tables
detected: (63.9380938180106, 33.09214437255859, 174.63487589111327, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (63.9380938180106, 218.80192220458986, 460.4708744262695, 319.03038370361327) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (63.9380938180106, 456.0098418551975, 460.4708744262695, 568.1153751586914) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (61.188099267578124, 110.81037902832031, 462.0997501586914, 181.8722984741211) 	 at (0.0, 61.12102091064453, 538.5830078125, 218.80192220458986)  in depth 1
detected: (60.27688253173828, 328.2047710418701, 462.4894291137695, 421.35542225341794) 	 at (0.0, 319.03038370361327, 538.5830078125, 456.0098418551975)  in depth 1
detected in p.47 :	 5  tables
detected: (344.71012532958986, 33.09214437255859, 474.6447941040039, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.48 :	 1  tables
detected: (63.93824041137695, 33.09214437255859, 174.63487589111327, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (63.93824041137695, 504.2321522393121, 460.4743127400716, 645.6470524047852) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.49 :	 2  tables
detected: (344.71012532958986, 33.09214437255859, 475.1443592285156, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.50 :	 1  tables
detected: (63.93822324523926, 33.09214437255859, 174.63487589111327, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (63.93822324523926, 427.4819080986871, 460.4708744262695, 549.053973791504) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.51 :	 2  tables
detected: (344.71012532958986, 33.09214437255859, 474.64407999267576, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (78.11132657775879, 367.9543439439562, 474.64407999267576, 561.0513492797852) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (507.99390335083007, 367.9543439439562, 528.6122040222168, 553.1839006212023) 	 at (474.64407999267576, 367.9543439439562, 538.5830078125, 561.0513492797852)  in depth 1


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)
  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.52 :	 3  tables
detected: (64.08314321289062, 33.09214437255859, 174.63487589111327, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (64.08314321289062, 508.10924777967665, 460.76475870361327, 649.524188635254) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.53 :	 2  tables
detected: (344.71012532958986, 33.09214437255859, 475.1443592285156, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.54 :	 1  tables
detected: (63.938324334716796, 33.09214437255859, 174.63487589111327, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (63.33326375732422, 207.63044393310548, 461.892627355957, 542.9926334594727) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.55 :	 2  tables
detected: (344.71012532958986, 33.09214437255859, 474.6442142700195, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (78.11112630615234, 126.30033529052734, 474.6442142700195, 314.15636026611327) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (495.2071075439453, 132.5676896621704, 531.8792183349609, 314.15636026611327) 	 at (474.6442142700195, 126.30033529052734, 538.5830078125, 314.15636026611327)  in depth 1


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)
  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


exception :  No rows or columns detected
error :  (495.2071075439453, 132.5676896621704, 531.8792183349609, 314.15636026611327)
detected in p.56 :	 2  tables
detected: (63.938324334716796, 33.09214437255859, 174.63487589111327, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.61 :	 1  tables
detected: (344.71012532958986, 33.09214437255859, 474.6447941040039, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.62 :	 1  tables
detected: (63.938324334716796, 33.09214437255859, 174.63487589111327, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.63 :	 1  tables
detected: (344.71012532958986, 33.09214437255859, 474.84850504150387, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.64 :	 1  tables
detected: (63.938072564697265, 33.09214437255859, 174.63487589111327, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (63.938072564697265, 421.5388861124675, 460.4708744262695, 643.4391666625977) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (61.99228322753906, 121.05779266357422, 461.43822061767577, 373.88135456542966) 	 at (0.0, 61.12102091064453, 538.5830078125, 421.5388861124675)  in depth 1
detected in p.65 :	 3  tables
detected: (344.71012532958986, 33.09214437255859, 474.6447941040039, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.66 :	 1  tables
detected: (63.93805603434245, 33.09214437255859, 174.63487589111327, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (63.93805603434245, 262.8042957939996, 460.47188150634764, 518.062274572754) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (60.05788839111328, 124.78962707519531, 461.608447668457, 219.18706440429685) 	 at (0.0, 61.12102091064453, 538.5830078125, 262.8042957939996)  in depth 1


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.67 :	 3  tables
detected: (344.71012532958986, 33.09214437255859, 474.1445957397461, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (78.11112630615234, 462.4378474975586, 474.1445957397461, 663.545673010254) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (507.3219795227051, 462.4378474975586, 530.0016930053711, 638.0877075195312) 	 at (474.1445957397461, 462.4378474975586, 538.5830078125, 663.545673010254)  in depth 1


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)
  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


exception :  No rows or columns detected
error :  (507.3219795227051, 462.4378474975586, 530.0016930053711, 638.0877075195312)
detected in p.68 :	 2  tables
detected: (63.938324334716796, 33.09214437255859, 174.63487589111327, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.69 :	 1  tables
detected: (344.71012532958986, 33.09214437255859, 474.6447941040039, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (77.87677419433594, 116.65551412353516, 474.90535318603514, 667.5775943969727) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


exception :  The identified boxes have significant overlap: 39.36% of area is overlapping (Max is 30.00%)
error :  (77.87677419433594, 116.65551412353516, 474.90535318603514, 667.5775943969727)
detected in p.70 :	 1  tables
detected: (63.93805603434245, 33.09214437255859, 174.63487589111327, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (63.93805603434245, 113.3732478881836, 460.47188150634764, 346.93345987548827) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.71 :	 2  tables
detected: (344.71012532958986, 33.09214437255859, 474.6447941040039, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (97.02577626953125, 127.8417323852539, 476.31273233642577, 650.393817541504) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


exception :  The identified boxes have significant overlap: 42.92% of area is overlapping (Max is 30.00%)
error :  (97.02577626953125, 127.8417323852539, 476.31273233642577, 650.393817541504)
detected in p.72 :	 1  tables
detected: (63.937923791503906, 33.09214437255859, 174.63487589111327, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.73 :	 1  tables
detected: (344.71012532958986, 33.09214437255859, 474.8948398803711, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (142.74113118896486, 501.19103963623047, 445.44685709228514, 609.2838932250977) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (466.3026752471924, 504.4032792617798, 530.5097877929687, 609.2838932250977) 	 at (445.44685709228514, 501.19103963623047, 538.5830078125, 609.2838932250977)  in depth 1
detected: (76.50065267333984, 108.05928802490234, 473.084766027832, 426.3070442626953) 	 at (0.0, 61.12102091064

  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)
  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)
  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)
  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


exception :  No rows or columns detected
error :  (502.28099060058594, 130.42403829345704, 526.889617199707, 426.3070442626953)
detected in p.74 :	 4  tables
detected: (63.937923791503906, 33.09214437255859, 174.63487589111327, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.75 :	 1  tables
detected: (344.71012532958986, 33.09214437255859, 474.9779239868164, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.76 :	 1  tables
detected: (63.937923791503906, 33.09214437255859, 174.63487589111327, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (63.62820089111328, 174.94807088623048, 461.3246341918945, 397.369922277832) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (52.18993413696289, 158.96421813964844, 471.9319034790039, 174.94807088623048) 	 at (0.0, 61.12102091064453, 538.5830078125, 174.94807088623048)  in depth 1
exception :  The identified boxes have significant overlap: 47.26% of area is overlapping (Max is 30.00%)
error :  (52.18993413696289, 158.96421813964844, 471.9319034790039, 174.94807088623048)
detected in p.77 :	 2  tables
detected: (344.71012532958986, 33.09214437255859, 474.6448347941081, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (78.11139333496094, 294.70570028076173, 474.6448347941081, 433.6728702758789) 	 at (0.0, 0.0, 538.5830078125, 737.00

  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)
  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


exception :  No rows or columns detected
error :  (498.0743471781413, 311.0000617553711, 529.8311823954265, 431.1328125)
detected in p.78 :	 4  tables
detected: (63.93805603434245, 33.09214437255859, 174.63487589111327, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (63.93805603434245, 107.18493137342665, 460.4711185668945, 208.9146915649414) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.79 :	 2  tables
detected: (344.71012532958986, 33.09214437255859, 474.64407694091796, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (78.11112630615234, 109.08495578748915, 474.64407694091796, 491.7003666137695) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (505.36307525634766, 119.42533776991102, 526.1346352050781, 431.25282118055554) 	 at (474.64407694091796, 109.08495578748915, 538.5830078125, 491.7003666137695)  in depth 1


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)
  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


Filling in gap at top of table
exception :  No rows or columns detected
error :  (505.36307525634766, 119.42533776991102, 526.1346352050781, 431.25282118055554)
detected in p.80 :	 2  tables
detected: (63.938324334716796, 33.09214437255859, 174.63487589111327, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.81 :	 1  tables
detected: (344.71012532958986, 33.09214437255859, 474.6447941040039, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (78.34643972167969, 130.8620418334961, 475.137256262207, 643.4602848266602) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


exception :  The identified boxes have significant overlap: 38.51% of area is overlapping (Max is 30.00%)
error :  (78.34643972167969, 130.8620418334961, 475.137256262207, 643.4602848266602)
detected in p.82 :	 1  tables
detected: (63.938324334716796, 33.09214437255859, 174.63487589111327, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.83 :	 1  tables
detected: (344.71012532958986, 33.09214437255859, 474.6448347941081, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (78.11112630615234, 288.873255348036, 474.6448347941081, 365.091357824707) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (75.24203145751953, 121.62742614746094, 474.2590519165039, 288.873255348036) 	 at (0.0, 61.12102091064453, 538.5830078125, 288.873255348036)  in depth 1
detected: (497.8317756652832, 125.53396451721191, 530.5673630187988, 288.873255348036) 	 at (474.2590519165039, 121.62742614746094, 538

  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)
  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.84 :	 4  tables
detected: (63.937923791503906, 33.09214437255859, 174.63487589111327, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.85 :	 1  tables
detected: (344.71012532958986, 33.09214437255859, 474.9779239868164, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (79.52277791748047, 298.24586141357423, 474.80232584228514, 566.9911686157227) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (74.09305989990234, 225.3861083984375, 471.6910587524414, 281.5420143554687) 	 at (0.0, 61.12102091064453, 538.5830078125, 298.24586141357423)  in depth 1
detected: (475.68940830230713, 225.3861083984375, 526.9448120544433, 281.5420143554687) 	 at (471.6910587524414, 225.3861083984375, 538.5830078125, 281.5420143554687)  in depth 2


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


exception :  The identified boxes have significant overlap: 105.30% of area is overlapping (Max is 30.00%)
error :  (475.68940830230713, 225.3861083984375, 526.9448120544433, 281.5420143554687)
detected in p.86 :	 3  tables
detected: (63.937923791503906, 33.09214437255859, 174.63487589111327, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.87 :	 1  tables
detected: (344.71012532958986, 33.09214437255859, 474.6447941040039, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.88 :	 1  tables
detected: (63.93817992117745, 33.09214437255859, 174.63487589111327, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (63.93817992117745, 419.927200253635, 460.4708744262695, 613.024188635254) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (62.01801717529297, 168.8687515258789, 461.8793216918945, 406.1869575927734) 	 at (0.0, 61.12102091064453, 538.5830078125, 419.927200253635)  in depth 1
exception :  The identified boxes have significant overlap: 32.21% of area is overlapping (Max is 30.00%)
error :  (63.93817992117745, 419.927200253635, 460.4708744262695, 613.024188635254)
detected in p.89 :	 2  tables
detected: (344.71012532958986, 33.09214437255859, 474.6448703979492, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (78.11112630615234, 461.7202238736239, 474.6448703979492, 648.6964908813477) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)

  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


exception :  No rows or columns detected
error :  (498.41696548461914, 461.7202238736239, 530.4742844055176, 647.5635043057528)
exception :  The identified boxes have significant overlap: 31.45% of area is overlapping (Max is 30.00%)
error :  (79.3356746459961, 132.01703643798828, 475.0975528930664, 439.21485065917966)
exception :  No rows or columns detected
error :  (507.0418472290039, 132.01703643798828, 528.3614876220703, 418.0252338623047)
detected in p.90 :	 2  tables
detected: (63.938324334716796, 33.09214437255859, 174.63487589111327, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.91 :	 1  tables
detected: (344.71012532958986, 33.09214437255859, 474.6447941040039, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.92 :	 1  tables
detected: (63.93808248291016, 33.09214437255859, 174.63487589111327, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (63.93808248291016, 120.20414388427734, 460.47098123779296, 301.96236002197264) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
exception :  The identified boxes have significant overlap: 48.74% of area is overlapping (Max is 30.00%)
error :  (63.93808248291016, 120.20414388427734, 460.47098123779296, 301.96236002197264)
detected in p.93 :	 1  tables
detected: (344.71012532958986, 33.09214437255859, 474.6447941040039, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.94 :	 1  tables
detected: (63.93805603434245, 33.09214437255859, 174.63487589111327, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (63.93805603434245, 240.13516737603084, 460.4708744262695, 416.14888345947264) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (60.04661214599609, 102.39830017089844, 462.06798135986327, 214.40695881347654) 	 at (0.0, 61.12102091064453, 538.5830078125, 240.13516737603084)  in depth 1


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.95 :	 3  tables
detected: (344.71012532958986, 33.09214437255859, 474.9779239868164, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.96 :	 1  tables
detected: (63.937923791503906, 33.09214437255859, 174.63487589111327, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.97 :	 1  tables
detected: (344.71012532958986, 33.09214437255859, 474.6447941040039, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.98 :	 1  tables
detected: (63.938324334716796, 33.09214437255859, 174.63487589111327, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.99 :	 1  tables
detected: (344.71012532958986, 33.09214437255859, 474.6447941040039, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.100 :	 1  tables
detected: (63.937923791503906, 33.09214437255859, 174.63487589111327, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.101 :	 1  tables
detected: (344.71012532958986, 33.09214437255859, 474.6447941040039, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.102 :	 1  tables
detected: (63.938193363444014, 33.09214437255859, 174.63487589111327, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.103 :	 1  tables
detected: (344.71012532958986, 33.09214437255859, 474.6447941040039, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.104 :	 1  tables
detected: (63.938324334716796, 33.09214437255859, 174.63487589111327, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.105 :	 1  tables
detected: (344.71012532958986, 33.09214437255859, 474.6443465128581, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (78.11112630615234, 471.26160848388673, 474.6443465128581, 593.1149479125977) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (487.99763743082684, 471.26160848388673, 532.6694659342448, 593.1149479125977) 	 at (474.6443465128581, 471.26160848388673, 538.5830078125, 593.1149479125977)  in depth 1


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)
  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)
  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.106 :	 3  tables
detected: (63.93805603434245, 33.09214437255859, 174.63487589111327, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (63.93805603434245, 325.48560750732423, 460.4709659790039, 436.11497843017577) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (61.61341512451172, 139.91595458984375, 461.08406412353514, 311.85947346191404) 	 at (0.0, 61.12102091064453, 538.5830078125, 325.48560750732423)  in depth 1


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.107 :	 3  tables
detected: (344.71012532958986, 33.09214437255859, 474.6447941040039, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.108 :	 1  tables
detected: (63.93805603434245, 203.75507772216798, 460.46997924397783, 655.2362858032227) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.109 :	 1  tables
detected: (63.937923791503906, 33.09214437255859, 174.63487589111327, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (61.234615686035156, 199.53399312744142, 463.89683878173827, 665.0237613891602) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (61.47402608642578, 97.93407821655273, 462.153125402832, 177.70962452392578) 	 at (0.0, 61.12102091064453, 538.5830078125, 199.53399312744142)  in depth 1
exception :  The identified boxes have significant overlap: 33.86% of area is overlapping (Max is 30.00%)
error :  (61.234615686035156, 199.53399312744142, 463.89683878173827, 665.0237613891602)
detected in p.113 :	 2  tables
detected: (344.71012532958986, 33.09214437255859, 474.64415323486327, 61.12102091064453) 	

  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)
  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.114 :	 3  tables
detected: (63.93820302734375, 33.09214437255859, 174.63487589111327, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (63.93820302734375, 351.3981746459961, 460.4708744262695, 573.298602697754) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.115 :	 2  tables
detected: (344.71012532958986, 33.09214437255859, 474.5451923583984, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (78.11112630615234, 300.20411336669923, 474.5451923583984, 612.4601017211914) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (502.9308052062988, 336.271084552002, 529.0151131103515, 612.4601017211914) 	 at (474.5451923583984, 300.20411336669923, 538.5830078125, 612.4601017211914)  in depth 1
detected: (77.22063100585937, 117.86360168457031, 473.85969888916014, 297.64246296386716) 	 at (0.0, 61.12102091064453, 538.5830078125, 300.20411336669923)  i

  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)
  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.116 :	 5  tables
detected: (63.937923791503906, 33.09214437255859, 174.63487589111327, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.117 :	 1  tables
detected: (344.71012532958986, 33.09214437255859, 474.7063862060547, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (78.11112630615234, 120.20414388427734, 474.7063862060547, 357.9952579711914) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (507.7499122619629, 120.20414388427734, 528.3690216491699, 326.5148010253906) 	 at (474.7063862060547, 120.20414388427734, 538.5830078125, 357.9952579711914)  in depth 1


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)
  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


exception :  No rows or columns detected
error :  (507.7499122619629, 120.20414388427734, 528.3690216491699, 326.5148010253906)
detected in p.118 :	 2  tables
detected: (63.938072564697265, 33.09214437255859, 174.63487589111327, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (63.938072564697265, 220.20411336669923, 460.47098632405596, 409.60667764892577) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (60.12856710205078, 104.45193481445312, 462.1156498168945, 186.94150471191406) 	 at (0.0, 61.12102091064453, 538.5830078125, 220.20411336669923)  in depth 1
Filling in gap at top of table
detected in p.119 :	 3  tables


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected: (344.71012532958986, 33.09214437255859, 474.6447941040039, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.120 :	 1  tables
detected: (63.938072564697265, 33.09214437255859, 176.86720082792397, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (63.938072564697265, 120.20414388427734, 460.47188150634764, 637.1055484985352) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.121 :	 2  tables
detected: (344.71012532958986, 33.09214437255859, 474.6447941040039, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.122 :	 1  tables
detected: (63.937923791503906, 33.09214437255859, 174.63487589111327, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (64.23973500976562, 107.03940236816406, 461.28438150634764, 671.4933048461914) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
exception :  The identified boxes have significant overlap: 45.32% of area is overlapping (Max is 30.00%)
error :  (64.23973500976562, 107.03940236816406, 461.28438150634764, 671.4933048461914)
detected in p.123 :	 1  tables
detected: (344.71012532958986, 33.09214437255859, 474.644206640625, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (78.11112630615234, 240.20411336669923, 474.644206640625, 659.7552677368164) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (502.52133560180664, 240.20411336669923, 522.3415176818847, 634.8132934570312) 	 at (474.644206640625, 240.20411336669923, 538.583

  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)
  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


exception :  No rows or columns detected
error :  (502.52133560180664, 240.20411336669923, 522.3415176818847, 634.8132934570312)
detected in p.124 :	 2  tables
detected: (63.937923791503906, 33.09214437255859, 174.63487589111327, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.125 :	 1  tables
detected: (344.71012532958986, 33.09214437255859, 474.6448398803711, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (78.11112630615234, 120.20414388427734, 474.6448398803711, 359.00667154541014) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (506.5881156921387, 120.20414388427734, 526.6227447937011, 320.88568115234375) 	 at (474.6448398803711, 120.20414388427734, 538.5830078125, 359.00667154541014)  in depth 1


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)
  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


exception :  No rows or columns detected
error :  (506.5881156921387, 120.20414388427734, 526.6227447937011, 320.88568115234375)
detected in p.126 :	 2  tables
detected: (63.938072564697265, 33.09214437255859, 174.63487589111327, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (63.938072564697265, 423.20411336669923, 460.47188150634764, 584.461444494629) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (64.75919759521484, 104.69497680664062, 459.76509439697264, 375.5117866943359) 	 at (0.0, 61.12102091064453, 538.5830078125, 423.20411336669923)  in depth 1
detected in p.127 :	 3  tables
detected: (344.71012532958986, 33.09214437255859, 474.6441837524414, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (78.11112630615234, 436.73290254017223, 474.6441837524414, 578.1476017211914) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (489.4347381591797, 459.07

  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)
  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.128 :	 3  tables
detected: (63.937923791503906, 33.09214437255859, 174.63487589111327, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.129 :	 1  tables
detected: (344.71012532958986, 33.09214437255859, 474.6448398803711, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (78.11112630615234, 120.20414388427734, 474.6448398803711, 655.063495275879) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)
  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.130 :	 2  tables
detected: (63.937923791503906, 33.09214437255859, 174.63487589111327, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.131 :	 1  tables
detected: (344.71012532958986, 33.09214437255859, 474.6441837524414, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (78.11112630615234, 240.20411336669923, 474.6441837524414, 550.6250797485352) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (509.33788299560547, 240.20411336669923, 532.4841530273437, 547.7478942871094) 	 at (474.6441837524414, 240.20411336669923, 538.5830078125, 550.6250797485352)  in depth 1


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)
  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.132 :	 3  tables
detected: (63.688351037597656, 33.09214437255859, 174.63487589111327, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (61.85902059326172, 255.33735311279298, 462.30147135009764, 664.071796057129) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
exception :  The identified boxes have significant overlap: 38.98% of area is overlapping (Max is 30.00%)
error :  (61.85902059326172, 255.33735311279298, 462.30147135009764, 664.071796057129)
detected in p.133 :	 1  tables
detected: (344.71012532958986, 33.09214437255859, 474.644206640625, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (78.11112630615234, 177.38872945556642, 474.644206640625, 381.14647257080077) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (506.8268280029297, 177.38872945556642, 534.8433830688476, 381.14647257080077) 	 at (474.644206640625, 177.38872945556642, 538.5830

  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)
  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


exception :  No rows or columns detected
error :  (506.8268280029297, 177.38872945556642, 534.8433830688476, 381.14647257080077)
detected in p.134 :	 2  tables
detected: (63.938324334716796, 33.09214437255859, 174.63487589111327, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.135 :	 1  tables
detected: (344.71012532958986, 33.09214437255859, 474.6447941040039, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.136 :	 1  tables
detected: (63.93835866699219, 33.09214437255859, 174.63487589111327, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.137 :	 1  tables
detected: (344.71012532958986, 33.09214437255859, 474.644206640625, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (78.11112630615234, 403.92872274169923, 474.644206640625, 566.800067541504) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (78.90314901123047, 128.37010955810547, 473.49827921142577, 364.0778267333984) 	 at (0.0, 61.12102091064453, 538.5830078125, 403.92872274169923)  in depth 1
detected: (508.85314559936523, 128.37010955810547, 531.2913887451172, 364.0778267333984) 	 at (473.49827921142577, 128.37010955810547, 538.5830078125, 364.0778267333984)  in depth 2


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.138 :	 4  tables
detected: (63.938324334716796, 33.09214437255859, 174.63487589111327, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.139 :	 1  tables
detected: (344.71012532958986, 33.09214437255859, 474.644206640625, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (78.11112630615234, 398.74430501708986, 474.644206640625, 561.461444494629) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (500.60630798339844, 408.2205016662598, 522.3847420166015, 561.461444494629) 	 at (474.644206640625, 398.74430501708986, 538.5830078125, 561.461444494629)  in depth 1


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.140 :	 3  tables
detected: (63.937923791503906, 33.09214437255859, 174.63487589111327, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (63.23511159667969, 306.90174520263673, 461.1170841430664, 668.4169498657227) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (72.8558162475586, 132.65931701660156, 450.3753849243164, 263.8591225097656) 	 at (0.0, 61.12102091064453, 538.5830078125, 306.90174520263673)  in depth 1
exception :  The identified boxes have significant overlap: 30.38% of area is overlapping (Max is 30.00%)
error :  (72.8558162475586, 132.65931701660156, 450.3753849243164, 263.8591225097656)
detected in p.141 :	 2  tables
detected: (344.71012532958986, 33.09214437255859, 474.6447941040039, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (80.90245473632812, 122.40241658935547, 474.395068762207, 586.0706974243164) 	 at (0.0, 0.0, 538.5830078125, 737.0079

  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


exception :  The identified boxes have significant overlap: 39.34% of area is overlapping (Max is 30.00%)
error :  (80.90245473632812, 122.40241658935547, 474.395068762207, 586.0706974243164)
detected in p.142 :	 1  tables
detected: (63.938072564697265, 33.09214437255859, 174.63487589111327, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (63.938072564697265, 240.20411336669923, 460.47093546142577, 441.14647257080077) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (61.84511220703125, 107.19757461547852, 461.7712589477539, 207.60144733886716) 	 at (0.0, 61.12102091064453, 538.5830078125, 240.20411336669923)  in depth 1


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.143 :	 3  tables
detected: (344.71012532958986, 33.09214437255859, 474.644206640625, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (78.11112630615234, 468.73289822455513, 474.644206640625, 610.1476017211914) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (494.7815685272217, 468.73289822455513, 531.443816418457, 610.1476017211914) 	 at (474.644206640625, 468.73289822455513, 538.5830078125, 610.1476017211914)  in depth 1


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)
  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.144 :	 3  tables
detected: (63.93835866699219, 33.09214437255859, 174.63487589111327, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (59.14230764160156, 483.85368001708986, 461.79771768798827, 662.3045841430664) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (64.24266469726562, 416.8080139160156, 456.8508487915039, 483.85368001708986) 	 at (0.0, 61.12102091064453, 538.5830078125, 483.85368001708986)  in depth 1
detected in p.145 :	 3  tables
detected: (344.71012532958986, 33.09214437255859, 474.6447941040039, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (77.56774556884766, 220.48681295166017, 476.51390421142577, 595.5419498657227) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (501.0009765625, 220.48681295166017, 523.4353783081054, 595.5419498657227) 	 at (476.51390421142577, 220.48681295166017, 538.5830078125, 595.5419498657227) 

  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)
  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.146 :	 4  tables
detected: (63.937923791503906, 33.09214437255859, 174.63487589111327, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.147 :	 1  tables
detected: (344.71012532958986, 33.09214437255859, 474.6447941040039, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.148 :	 1  tables
detected: (63.938072564697265, 33.09214437255859, 174.63487589111327, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (63.938072564697265, 482.83008992919923, 460.47093546142577, 644.1307560180664) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.149 :	 2  tables
detected: (344.71012532958986, 33.09214437255859, 474.6447941040039, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.150 :	 1  tables
detected: (63.937923791503906, 33.09214437255859, 174.63487589111327, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (64.01748311767578, 125.8594325805664, 461.7169681762695, 668.1979557250977) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
exception :  The identified boxes have significant overlap: 40.79% of area is overlapping (Max is 30.00%)
error :  (64.01748311767578, 125.8594325805664, 461.7169681762695, 668.1979557250977)
detected in p.151 :	 1  tables
detected: (344.71012532958986, 33.09214437255859, 474.6447941040039, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (80.42506063232422, 118.02363240966797, 474.054980871582, 655.1543766235352) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


exception :  The identified boxes have significant overlap: 43.50% of area is overlapping (Max is 30.00%)
error :  (80.42506063232422, 118.02363240966797, 474.054980871582, 655.1543766235352)
detected in p.152 :	 1  tables
detected: (63.938072564697265, 33.09214437255859, 174.63487589111327, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (63.938072564697265, 177.00964010009767, 460.47093546142577, 460.51655924072264) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.153 :	 2  tables
detected: (344.71012532958986, 33.09214437255859, 474.6447941040039, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (78.61726796875, 304.11146199951173, 476.014209387207, 656.952655432129) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (82.82391774902344, 126.29156494140625, 471.49800455322264, 248.64437031249997) 	 at (0.0, 61.12102091064453, 538.5830078125, 304.111

  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)
  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.154 :	 4  tables
detected: (63.937923791503906, 33.09214437255859, 174.63487589111327, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (64.09949147949219, 147.9988826538086, 460.5927005981445, 454.797656652832) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.155 :	 2  tables
detected: (344.71012532958986, 33.09214437255859, 474.644206640625, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (78.11112630615234, 115.96269643554687, 474.644206640625, 519.5716739868164) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.156 :	 2  tables
detected: (63.937923791503906, 33.09214437255859, 174.63487589111327, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.157 :	 1  tables
detected: (344.71012532958986, 33.09214437255859, 474.6447941040039, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (79.57532155761719, 130.73902547607423, 475.18684732666014, 639.3024479125977) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (502.91284942626953, 347.3492515136719, 522.6991722534179, 541.6536560058594) 	 at (475.18684732666014, 130.73902547607423, 538.5830078125, 639.3024479125977)  in depth 1
detected: (507.34880447387695, 130.73902547607423, 529.0287735412597, 314.2465515136719) 	 at (475.18684732666014, 130.73902547607423, 538.5830078125, 347.3492515136719)  in depth 2


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


exception :  The identified boxes have significant overlap: 35.07% of area is overlapping (Max is 30.00%)
error :  (79.57532155761719, 130.73902547607423, 475.18684732666014, 639.3024479125977)
exception :  No rows or columns detected
error :  (502.91284942626953, 347.3492515136719, 522.6991722534179, 541.6536560058594)
exception :  list index out of range
error :  (507.34880447387695, 130.73902547607423, 529.0287735412597, 314.2465515136719)
detected in p.158 :	 1  tables
detected: (63.938072564697265, 33.09214437255859, 174.63487589111327, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (63.938072564697265, 114.3265956665039, 460.47093546142577, 340.989062902832) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.159 :	 2  tables
detected: (344.71012532958986, 33.09214437255859, 476.6114460205078, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (78.11112630615234, 279.130

  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)
  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)
  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.160 :	 4  tables
detected: (64.13318288574219, 33.09214437255859, 174.63487589111327, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.161 :	 1  tables
detected: (344.71012532958986, 33.09214437255859, 474.6441303466797, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (78.11112630615234, 459.8648494506836, 474.6441303466797, 641.303973791504) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (80.72218740234375, 108.47775650024414, 472.576221105957, 401.8478157470703) 	 at (0.0, 61.12102091064453, 538.5830078125, 459.8648494506836)  in depth 1


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)
  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


exception :  The identified boxes have significant overlap: 40.96% of area is overlapping (Max is 30.00%)
error :  (80.72218740234375, 108.47775650024414, 472.576221105957, 401.8478157470703)
detected in p.162 :	 2  tables
detected: (63.938324334716796, 33.09214437255859, 174.63487589111327, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (63.92468679199219, 245.32523763427736, 461.355029699707, 657.7769962524414) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.163 :	 2  tables
detected: (344.71012532958986, 33.09214437255859, 474.6442142700195, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (78.11112630615234, 300.20411336669923, 474.6442142700195, 481.3039737915039) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (76.40221059570312, 104.78326416015625, 473.0921817993164, 288.8071663330078) 	 at (0.0, 61.12102091064453, 538.5830078125, 300.20411336669923)  in depth 1
detected: (498.28889083862305, 136.38462865600587, 525.71122288208, 288.8071663330078) 	 at (473.0921817993164, 104.78326416015625, 538.5830078125, 288.8071663330078)  in depth 2


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)
  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.164 :	 4  tables
detected: (63.937923791503906, 33.09214437255859, 174.63487589111327, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.165 :	 1  tables
detected: (344.71012532958986, 33.09214437255859, 474.6447941040039, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (78.94534719238281, 135.57658040771486, 473.93944132080077, 583.1800113891602) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


exception :  The identified boxes have significant overlap: 36.58% of area is overlapping (Max is 30.00%)
error :  (78.94534719238281, 135.57658040771486, 473.93944132080077, 583.1800113891602)
detected in p.166 :	 1  tables
detected: (63.93805603434245, 33.09214437255859, 174.63487589111327, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (63.93805603434245, 117.30579030761719, 460.4708744262695, 241.77647745361327) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.167 :	 2  tables
detected: (344.71012532958986, 33.09214437255859, 474.6447941040039, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.168 :	 1  tables
detected: (63.93835866699219, 33.09214437255859, 174.63487589111327, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.169 :	 1  tables
detected: (344.71012532958986, 33.09214437255859, 474.6447941040039, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.170 :	 1  tables
detected: (63.938072564697265, 33.09214437255859, 174.63487589111327, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (63.938072564697265, 441.57706868896486, 460.4708744262695, 643.8157535766602) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (62.66445577392578, 136.308349609375, 461.38667642822264, 402.0192329833984) 	 at (0.0, 61.12102091064453, 538.5830078125, 441.57706868896486)  in depth 1
detected in p.171 :	 3  tables
detected: (344.71012532958986, 33.09214437255859, 474.6447941040039, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (78.78337514648437, 274.26591146240236, 474.97664224853514, 546.917499182129) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.172 :	 2  tables
detected: (63.938072564697265, 33.09214437255859, 174.63487589111327, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (63.938072564697265, 120.20414388427734, 460.4718509887695, 261.61897623291014) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.173 :	 2  tables
detected: (344.71012532958986, 33.09214437255859, 474.6448398803711, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (78.11112630615234, 120.20414388427734, 474.6448398803711, 261.71898233642577) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (491.49902153015137, 126.50137496490478, 530.6972496459961, 261.71898233642577) 	 at (474.6448398803711, 120.20414388427734, 538.5830078125, 261.71898233642577)  in depth 1


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)
  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)
  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.174 :	 3  tables
detected: (63.938072564697265, 33.09214437255859, 174.63487589111327, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (63.938072564697265, 260.20411336669923, 460.4718509887695, 422.86497843017577) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (60.75136983642578, 109.65083312988281, 460.34996378173827, 222.29070209960935) 	 at (0.0, 61.12102091064453, 538.5830078125, 260.20411336669923)  in depth 1


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.175 :	 3  tables
detected: (344.71012532958986, 33.09214437255859, 474.6448398803711, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (78.11112630615234, 320.20411336669923, 474.6448398803711, 481.461475012207) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (78.34616506347656, 103.11627197265625, 473.22481119384764, 261.92302631835935) 	 at (0.0, 61.12102091064453, 538.5830078125, 320.20411336669923)  in depth 1
detected: (494.6860885620117, 124.04874837646484, 530.2768623779297, 261.92302631835935) 	 at (473.22481119384764, 103.11627197265625, 538.5830078125, 261.92302631835935)  in depth 2


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)
  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)
  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.176 :	 4  tables
detected: (63.938072564697265, 33.09214437255859, 174.63487589111327, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (63.938072564697265, 420.20411336669923, 460.4718509887695, 660.8314396118164) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.177 :	 2  tables
detected: (344.71012532958986, 33.09214437255859, 474.6511824503581, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.178 :	 1  tables
detected: (63.938072564697265, 33.09214437255859, 174.63487589111327, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (63.938072564697265, 284.31315267333986, 460.47093546142577, 506.3275638793945) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (58.74237478027344, 139.9680938720703, 461.65266763916014, 239.45816730957029) 	 at (0.0, 61.12102091064453, 538.5830078125, 284.31315267333986)  in depth 1
detected in p.179 :	 3  tables
detected: (344.71012532958986, 33.09214437255859, 474.6447941040039, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.180 :	 1  tables
detected: (63.938072564697265, 33.09214437255859, 174.63487589111327, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (63.938072564697265, 115.0732997680664, 460.47093546142577, 281.461475012207) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.181 :	 2  tables
detected: (344.71012532958986, 33.09214437255859, 474.644206640625, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (78.11112630615234, 440.20411336669923, 474.644206640625, 607.1149479125977) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (491.6897430419922, 463.96730685729983, 530.4751350830078, 607.1149479125977) 	 at (474.644206640625, 440.20411336669923, 538.5830078125, 607.1149479125977)  in depth 1


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)
  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)
  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.182 :	 3  tables
detected: (63.93805603434245, 33.09214437255859, 174.63487589111327, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (63.93805603434245, 548.7328982245551, 460.4709659790039, 610.7774845336914) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (62.398189904785156, 205.21102905273438, 462.3563724731445, 512.3624336669922) 	 at (0.0, 61.12102091064453, 538.5830078125, 548.7328982245551)  in depth 1


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.183 :	 3  tables
detected: (344.71012532958986, 33.09214437255859, 474.6447941040039, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (78.72860372314453, 243.34235799560548, 475.7460209106445, 551.9237247680664) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (502.172420501709, 243.34235799560548, 532.0220024536133, 551.9237247680664) 	 at (475.7460209106445, 243.34235799560548, 538.5830078125, 551.9237247680664)  in depth 1


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.184 :	 3  tables
detected: (63.938072564697265, 33.09214437255859, 174.63487589111327, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (63.938072564697265, 122.97567403564453, 460.47093546142577, 228.03035318603514) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.185 :	 2  tables
detected: (344.71012532958986, 33.09214437255859, 474.68988243852095, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (78.11112630615234, 92.81308400878906, 290.3920170043945, 217.8778263305664) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (293.4731140136719, 92.81308400878906, 474.7961342285156, 217.8778263305664) 	 at (290.3920170043945, 92.81308400878906, 538.5830078125, 217.8778263305664)  in depth 1
detected: (491.4039779876709, 113.40163684387207, 530.1171520874024, 217.8778263305664) 	 at (474.7961342285156, 92.81308400878906, 538.5830078125, 217.

  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)
  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


Unnamed: 0,Unnamed: 1


exception :  table does not sound
exception :  No rows or columns detected
error :  (490.34586958451706, 113.40163684387207, 531.2599466230913, 217.8778263305664)
errs :  1
detected in p.186 :	 5  tables
detected: (63.937923791503906, 33.09214437255859, 174.63487589111327, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (63.3511546875, 308.76942098388673, 461.41026651611327, 655.8091617797852) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (60.55484426269531, 125.8310775756836, 451.2718692993164, 234.97183918457029) 	 at (0.0, 61.12102091064453, 538.5830078125, 308.76942098388673)  in depth 1
exception :  The identified boxes have significant overlap: 70.07% of area is overlapping (Max is 30.00%)
error :  (63.3511546875, 308.76942098388673, 461.41026651611327, 655.8091617797852)
detected in p.187 :	 2  tables
detected: (344.71012532958986, 33.09214437255859, 474.64409728597, 61.12102091064453) 	 at (0.0, 0.0, 538.583

  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.188 :	 2  tables
detected: (63.93835866699219, 33.09214437255859, 174.63487589111327, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (59.20310628662109, 466.18034017333986, 461.7269779418945, 665.4350162719727) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (61.71394002685547, 393.0751647949219, 459.12215006103514, 466.18034017333986) 	 at (0.0, 61.12102091064453, 538.5830078125, 466.18034017333986)  in depth 1
exception :  The identified boxes have significant overlap: 52.43% of area is overlapping (Max is 30.00%)
error :  (59.20310628662109, 466.18034017333986, 461.7269779418945, 665.4350162719727)
detected in p.189 :	 2  tables
detected: (344.71012532958986, 33.09214437255859, 474.6447941040039, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.190 :	 1  tables
detected: (63.938072564697265, 33.09214437255859, 174.63487589111327, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (63.938072564697265, 442.873255348036, 460.47093546142577, 584.2882877563477) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.191 :	 2  tables
detected: (344.71012532958986, 33.09214437255859, 474.6447941040039, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.192 :	 1  tables
detected: (63.937923791503906, 33.09214437255859, 174.63487589111327, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (64.94086110839844, 132.61054647216798, 462.05186807861327, 653.787860510254) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
exception :  The identified boxes have significant overlap: 36.15% of area is overlapping (Max is 30.00%)
error :  (64.94086110839844, 132.61054647216798, 462.05186807861327, 653.787860510254)
detected in p.193 :	 1  tables
detected: (344.71012532958986, 33.09214437255859, 474.644206640625, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (78.11112630615234, 260.20411336669923, 474.644206640625, 461.14647257080077) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (502.55459213256836, 260.20411336669923, 530.2172729919433, 430.2496795654297) 	 at (474.644206640625, 260.20411336669923, 538.5830

  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


exception :  No rows or columns detected
error :  (502.55459213256836, 260.20411336669923, 530.2172729919433, 430.2496795654297)
detected in p.194 :	 4  tables
detected: (63.937923791503906, 33.09214437255859, 174.63487589111327, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (64.3319896484375, 135.93363607177736, 462.064258215332, 666.6919742797852) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
exception :  The identified boxes have significant overlap: 37.64% of area is overlapping (Max is 30.00%)
error :  (64.3319896484375, 135.93363607177736, 462.064258215332, 666.6919742797852)
detected in p.195 :	 1  tables
detected: (344.71012532958986, 33.09214437255859, 474.644206640625, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (78.11112630615234, 240.20411336669923, 474.644206640625, 381.61897623291014) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (492.2

  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)
  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


table does not sound


Unnamed: 0,Unnamed: 1,Unnamed: 2


exception :  table does not sound
errs :  1
detected in p.196 :	 4  tables
detected: (63.937923791503906, 33.09214437255859, 174.63487589111327, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.197 :	 1  tables
detected: (344.71012532958986, 33.09214437255859, 474.644206640625, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (78.11112630615234, 200.1852840209961, 474.644206640625, 401.14647257080077) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (506.50426483154297, 200.1852840209961, 533.7728950927734, 401.14647257080077) 	 at (474.644206640625, 200.1852840209961, 538.5830078125, 401.14647257080077)  in depth 1


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


exception :  No rows or columns detected
error :  (506.50426483154297, 200.1852840209961, 533.7728950927734, 401.14647257080077)
detected in p.198 :	 2  tables
detected: (63.938072564697265, 33.09214437255859, 174.63487589111327, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (63.938072564697265, 279.4196590209961, 460.47093546142577, 421.61897623291014) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (60.78005635986328, 113.8348159790039, 460.4093204711914, 224.23628925781247) 	 at (0.0, 61.12102091064453, 538.5830078125, 279.4196590209961)  in depth 1


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


exception :  The identified boxes have significant overlap: 36.05% of area is overlapping (Max is 30.00%)
error :  (60.78005635986328, 113.8348159790039, 460.4093204711914, 224.23628925781247)
detected in p.199 :	 2  tables
detected: (344.71012532958986, 33.09214437255859, 476.07905924072264, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (78.77418935546875, 433.37171590576173, 475.98277628173827, 664.5865665649414) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.200 :	 2  tables
detected: (63.937923791503906, 33.09214437255859, 174.63487589111327, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (60.42407644042969, 208.84142720947267, 463.72117960205077, 659.3263736938477) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (61.48647725830078, 98.2142219543457, 463.1254154418945, 179.7128517578125) 	 at (0.0, 61.12102091064453, 538.5830078125, 208.84142720947267)  in depth 1
exception :  The identified boxes have significant overlap: 34.96% of area is overlapping (Max is 30.00%)
error :  (60.42407644042969, 208.84142720947267, 463.72117960205077, 659.3263736938477)
detected in p.201 :	 2  tables
detected: (344.71012532958986, 33.09214437255859, 474.644206640625, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (78.11132657775879, 199.98202169189454, 474.644206640625, 381.3039737915039) 	 at (0.0, 0.0, 538.5830078125, 737.007

  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)
  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.202 :	 3  tables
detected: (63.938072564697265, 33.09214437255859, 174.63487589111327, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (63.938072564697265, 335.6797908569336, 460.47093546142577, 457.30806314697264) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.203 :	 2  tables
detected: (344.71012532958986, 33.09214437255859, 474.6447941040039, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.204 :	 1  tables
detected: (63.937923791503906, 33.09214437255859, 174.63487589111327, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.205 :	 1  tables
detected: (344.71012532958986, 33.09214437255859, 474.644206640625, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (78.11112630615234, 115.67337453613281, 474.644206640625, 384.926074621582) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.206 :	 2  tables
detected: (63.937923791503906, 33.09214437255859, 174.63487589111327, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.207 :	 1  tables
detected: (344.71012532958986, 33.09214437255859, 474.6441837524414, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (78.11112630615234, 114.59383809814453, 474.6441837524414, 348.56667346365793) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (509.3785972595215, 114.59383809814453, 531.7764694641113, 333.1673126220703) 	 at (474.6441837524414, 114.59383809814453, 538.5830078125, 348.56667346365793)  in depth 1


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


exception :  No rows or columns detected
error :  (509.3785972595215, 114.59383809814453, 531.7764694641113, 333.1673126220703)
detected in p.208 :	 2  tables
detected: (63.938072564697265, 33.09214437255859, 174.63487589111327, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (63.938072564697265, 276.26713216552736, 460.47093546142577, 437.524463293457) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.209 :	 2  tables
detected: (344.71012532958986, 33.09214437255859, 474.6511824503581, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.210 :	 1  tables
detected: (63.938324334716796, 33.09214437255859, 174.63487589111327, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.211 :	 1  tables
detected: (344.71012532958986, 33.09214437255859, 474.6447941040039, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.212 :	 1  tables
detected: (63.938072564697265, 33.09214437255859, 174.63487589111327, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (63.938072564697265, 177.57934224853517, 460.47093546142577, 480.3590580200195) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.213 :	 2  tables
detected: (344.71012532958986, 33.09214437255859, 474.644206640625, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (78.11112630615234, 440.20411336669923, 474.644206640625, 581.6188541625977) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (489.5420169830322, 459.23844981689456, 531.1978599975586, 581.6188541625977) 	 at (474.644206640625, 440.20411336669923, 538.5830078125, 581.6188541625977)  in depth 1


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)
  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.214 :	 3  tables
detected: (63.93835866699219, 33.09214437255859, 174.63487589111327, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (61.48353994140625, 459.3118709350586, 461.97798502197264, 669.926532385254) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.215 :	 2  tables
detected: (344.71012532958986, 33.09214437255859, 474.644206640625, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (78.11112630615234, 565.5428957831489, 474.644206640625, 667.272479650879) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (492.20491790771484, 565.5428957831489, 530.4617569396972, 614.8380576239692) 	 at (474.644206640625, 565.5428957831489, 538.5830078125, 667.272479650879)  in depth 1


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)
  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)
  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.216 :	 3  tables
detected: (63.938324334716796, 33.09214437255859, 174.63487589111327, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.217 :	 1  tables
detected: (344.71012532958986, 33.09214437255859, 474.6447941040039, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.218 :	 1  tables
detected: (63.937923791503906, 33.09214437255859, 174.63487589111327, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.219 :	 1  tables
detected: (344.71012532958986, 33.09214437255859, 474.6448398803711, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (78.11112630615234, 220.20411336669923, 474.6448398803711, 440.98897135009764) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (499.5334777832031, 220.20411336669923, 534.817164654541, 427.99745178222656) 	 at (474.6448398803711, 220.20411336669923, 538.5830078125, 440.98897135009764)  in depth 1


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.220 :	 3  tables
detected: (63.937923791503906, 33.09214437255859, 174.63487589111327, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.221 :	 1  tables
detected: (344.71012532958986, 33.09214437255859, 474.6448627685547, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (78.11112630615234, 292.14197957763673, 474.9914737915039, 438.626758215332) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (77.13497579345703, 126.48355865478516, 473.38411295166014, 256.1567604492187) 	 at (0.0, 61.12102091064453, 538.5830078125, 292.14197957763673)  in depth 1
detected: (492.478723526001, 128.0447629714966, 531.0504677246093, 256.1567604492187) 	 at (473.38411295166014, 126.48355865478516, 538.5830078125, 256.1567604492187)  in depth 2


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)
  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


exception :  No rows or columns detected
error :  (77.13497579345703, 126.48355865478516, 473.38411295166014, 256.1567604492187)
detected in p.222 :	 3  tables
detected: (63.93835866699219, 33.09214437255859, 174.63487589111327, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.223 :	 1  tables
detected: (344.71012532958986, 33.09214437255859, 474.644206640625, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (78.11132657775879, 116.55101430664062, 474.644206640625, 281.461475012207) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (490.25519943237305, 127.93771625061035, 531.5257065246582, 281.461475012207) 	 at (474.644206640625, 116.55101430664062, 538.5830078125, 281.461475012207)  in depth 1


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)
  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.224 :	 3  tables
detected: (63.937923791503906, 33.09214437255859, 174.63487589111327, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.225 :	 1  tables
detected: (344.71012532958986, 33.09214437255859, 474.6448398803711, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (78.11112630615234, 120.20414388427734, 474.6448398803711, 340.989062902832) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (508.88882064819336, 120.20414388427734, 531.2062942932129, 338.45851135253906) 	 at (474.6448398803711, 120.20414388427734, 538.5830078125, 340.989062902832)  in depth 1


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


Filling in gap at top of table
exception :  No rows or columns detected
error :  (508.88882064819336, 120.20414388427734, 531.2062942932129, 338.45851135253906)
detected in p.226 :	 2  tables
detected: (63.937923791503906, 33.09214437255859, 174.63487589111327, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (63.79945791015625, 120.9534724975586, 461.12300455322264, 557.7974430297852) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
exception :  The identified boxes have significant overlap: 35.66% of area is overlapping (Max is 30.00%)
error :  (63.79945791015625, 120.9534724975586, 461.12300455322264, 557.7974430297852)
detected in p.227 :	 1  tables
detected: (344.71012532958986, 33.09214437255859, 474.6442142700195, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (78.11112630615234, 120.20414388427734, 474.6442142700195, 441.26366007080077) 	 at (0.0, 0.0, 538.5830078125, 737.00799

  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)
  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


exception :  No rows or columns detected
error :  (506.06286239624023, 132.57255626220703, 525.6965515563965, 425.6811828613281)
detected in p.228 :	 2  tables
detected: (63.937923791503906, 33.09214437255859, 174.63487589111327, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.229 :	 1  tables
detected: (344.71012532958986, 33.09214437255859, 474.6448398803711, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (78.11112630615234, 120.20414388427734, 474.6448398803711, 301.3039737915039) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (496.5252113342285, 135.46214938659668, 528.8669617126465, 292.45497131347656) 	 at (474.6448398803711, 120.20414388427734, 538.5830078125, 301.3039737915039)  in depth 1


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)
  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


exception :  No rows or columns detected
error :  (496.5252113342285, 135.46214938659668, 528.8669617126465, 292.45497131347656)
detected in p.230 :	 2  tables
detected: (63.938072564697265, 33.09214437255859, 174.63487589111327, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (63.938072564697265, 405.3772146184748, 460.4708744262695, 658.3648868774414) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
exception :  The identified boxes have significant overlap: 32.64% of area is overlapping (Max is 30.00%)
error :  (63.938072564697265, 405.3772146184748, 460.4708744262695, 658.3648868774414)
detected in p.231 :	 1  tables
detected: (344.71012532958986, 33.09214437255859, 474.6447941040039, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.232 :	 1  tables
detected: (63.937923791503906, 33.09214437255859, 174.63487589111327, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (63.72067678222656, 127.02540242919922, 460.9193302368164, 577.1613346313477) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
exception :  The identified boxes have significant overlap: 36.14% of area is overlapping (Max is 30.00%)
error :  (63.72067678222656, 127.02540242919922, 460.9193302368164, 577.1613346313477)
detected in p.233 :	 1  tables
detected: (341.9461722569057, 33.09214437255859, 474.644206640625, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (78.11112630615234, 120.20414388427734, 474.644206640625, 377.828662512207) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (507.49352264404297, 120.20414388427734, 529.0462715576172, 377.828662512207) 	 at (474.644206640625, 120.20414388427734, 538.58300781

  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)
  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


exception :  No rows or columns detected
error :  (507.49352264404297, 120.20414388427734, 529.0462715576172, 377.828662512207)
detected in p.234 :	 2  tables
detected: (63.93835866699219, 33.09214437255859, 174.63487589111327, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.235 :	 1  tables
detected: (344.71012532958986, 33.09214437255859, 474.644206640625, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (78.11112630615234, 482.21207167290584, 474.644206640625, 663.3118473266602) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (506.8296813964844, 482.21207167290584, 528.3669807861328, 615.493172539605) 	 at (474.644206640625, 482.21207167290584, 538.5830078125, 663.3118473266602)  in depth 1


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.236 :	 3  tables
detected: (63.938324334716796, 33.09214437255859, 174.63487589111327, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.237 :	 1  tables
detected: (344.71012532958986, 33.09214437255859, 474.644206640625, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (78.11112630615234, 116.74483907470703, 474.644206640625, 271.4110599731445) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (496.73669624328613, 126.33510661621094, 530.6569588134765, 271.4110599731445) 	 at (474.644206640625, 116.74483907470703, 538.5830078125, 271.4110599731445)  in depth 1


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.238 :	 3  tables
detected: (63.938072564697265, 33.09214437255859, 174.63487589111327, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (63.938072564697265, 116.10477864990234, 460.47093546142577, 261.61897623291014) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.239 :	 2  tables
detected: (344.71012532958986, 33.09214437255859, 474.644206640625, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (78.11112630615234, 115.30390584716797, 474.644206640625, 281.461475012207) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (490.4215660095215, 132.80665469665527, 531.6345932434082, 281.461475012207) 	 at (474.644206640625, 115.30390584716797, 538.5830078125, 281.461475012207)  in depth 1


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)
  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.240 :	 3  tables
detected: (63.938072564697265, 33.09214437255859, 174.63487589111327, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (63.938072564697265, 197.83404195556642, 460.47093546142577, 389.60667764892577) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (63.64974630126953, 97.68538665771484, 461.62776529541014, 170.12472462158203) 	 at (0.0, 61.12102091064453, 538.5830078125, 197.83404195556642)  in depth 1
detected in p.241 :	 3  tables
detected: (344.71012532958986, 33.09214437255859, 474.644206640625, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (78.11112630615234, 198.6741756225586, 474.644206640625, 382.6075626586914) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (492.7908821105957, 198.6741756225586, 533.5670196960449, 334.91529846191406) 	 at (474.644206640625, 198.6741756225586, 538.5830078125, 382.6075626586914) 

  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


exception :  No rows or columns detected
error :  (492.7908821105957, 198.6741756225586, 533.5670196960449, 334.91529846191406)
detected in p.242 :	 2  tables
detected: (78.11112630615234, 212.0389217163086, 474.6447534138997, 669.9849735473633) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.248 :	 1  tables
detected: (63.93808909505208, 33.09214437255859, 174.63487589111327, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (63.93808909505208, 412.51115682373046, 460.4708744262695, 547.8117008422852) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
Filling in gap at top of table
detected in p.249 :	 2  tables
detected: (344.71012532958986, 33.09214437255859, 474.467090246582, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (78.11112630615234, 92.53495443115234, 474.467090246582, 197.09911539306643) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in dept

  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


Unnamed: 0,Unnamed: 1,Unnamed: 2


exception :  table does not sound
errs :  1
detected in p.250 :	 4  tables
detected: (63.93809736022949, 33.09214437255859, 174.63487589111327, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (63.93809736022949, 132.03889555838447, 460.47163736572264, 213.92598306884764) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (63.93809736022949, 462.8089412475586, 460.47163736572264, 567.3731022094727) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (73.46751821289062, 227.37283325195312, 461.3525577758789, 462.8089412475586) 	 at (0.0, 213.92598306884764, 538.5830078125, 462.8089412475586)  in depth 1
detected in p.251 :	 4  tables
detected: (344.71012532958986, 33.09214437255859, 474.6448475097656, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (78.11112630615234, 132.03888509521482, 474.6448475097656, 281.95723306884764) 	 at (0.0, 0.0, 538.5830078125, 7

  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


Filling in gap at top of table
detected in p.252 :	 3  tables
detected: (63.938130372032745, 33.09214437255859, 174.63487589111327, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (63.938130372032745, 278.763134362793, 459.8295126447405, 654.1543877208363) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.253 :	 2  tables
detected: (344.71012532958986, 33.09214437255859, 474.7223275324895, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (78.11112630615234, 273.11141927490235, 474.7223275324895, 648.5030516911434) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (503.4687802241399, 273.11141927490235, 527.9307851191594, 602.9082824707032) 	 at (474.7223275324895, 273.11141927490235, 538.5830078125, 648.5030516911434)  in depth 1


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.254 :	 3  tables
detected: (64.44210068423622, 33.09214437255859, 174.63487589111327, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (64.44210068423622, 137.99288900146485, 461.00288736572264, 639.4077091430664) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.255 :	 2  tables
detected: (344.71012532958986, 33.09214437255859, 474.78257028808594, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (78.11112630615234, 137.99287346524324, 474.78257028808594, 639.4076866070087) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.256 :	 2  tables
detected: (63.938112709844674, 33.09214437255859, 174.63487589111327, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (63.938112709844674, 133.04631917724612, 460.470883581543, 662.807880041504) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.257 :	 2  tables
detected: (344.71012532958986, 33.09214437255859, 474.64483571888314, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (78.11112630615234, 132.86706706136067, 474.64483571888314, 662.6286023489816) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (502.3800770152699, 347.47944458821615, 522.4664868175159, 521.4495442708333) 	 at (474.64483571888314, 132.86706706136067, 538.5830078125, 662.6286023489816)  in depth 1
detected: (509.050029407848, 132.86706706136067, 531.7579068004261, 335.1014912923177) 	 at (474.64483571888314, 132.86706706136067, 538.5830078125, 34

  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


Filling in gap at top of table
exception :  No rows or columns detected
error :  (502.3800770152699, 347.47944458821615, 522.4664868175159, 521.4495442708333)
exception :  list index out of range
error :  (509.050029407848, 132.86706706136067, 531.7579068004261, 335.1014912923177)
detected in p.258 :	 2  tables
detected: (63.93811595687866, 33.09214437255859, 174.63487589111327, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (63.93811595687866, 118.87330281982422, 460.4709659790039, 669.5005680297852) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.259 :	 2  tables
detected: (344.71012532958986, 33.09214437255859, 474.0824238037109, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (78.11112630615234, 121.31720873690519, 474.0824238037109, 671.9442936157227) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.260 :	 2  tables
detected: (63.93811632151885, 33.09214437255859, 174.63487589111327, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (63.93811632151885, 92.53490865478516, 460.4709659790039, 676.6618284612483) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.261 :	 2  tables
detected: (344.71012532958986, 33.09214437255859, 474.04491617431637, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (78.11112630615234, 92.53488784734553, 474.04491617431637, 676.661826502901) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.262 :	 2  tables
detected: (63.938112709844674, 33.09214437255859, 174.63487589111327, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (63.938112709844674, 131.20414388427736, 460.4708744262695, 660.965643713379) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.263 :	 2  tables
detected: (344.71012532958986, 33.09214437255859, 473.74278327303796, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (78.11112630615234, 131.53603271891274, 473.74278327303796, 661.2976086966379) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.264 :	 2  tables
detected: (63.938111715537616, 33.09214437255859, 174.63487589111327, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (63.938111715537616, 132.92828939208982, 460.4708744262695, 634.3431095336914) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.265 :	 2  tables
detected: (344.71012532958986, 33.09214437255859, 473.74707376708983, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (78.11112630615234, 132.9283260131836, 473.74707376708983, 634.3430973266602) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)
  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


Filling in gap at top of table
detected in p.266 :	 2  tables
detected: (63.71390722045898, 33.09214437255859, 174.63487589111327, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (63.71390722045898, 171.10978353271486, 460.2348819946289, 665.7216722429548) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.267 :	 2  tables
detected: (344.71012532958986, 33.09214437255859, 473.6991749750046, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (78.8201259399414, 171.1097694476788, 473.6991749750046, 665.721651726936) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.268 :	 2  tables
detected: (63.93811595687866, 33.09214437255859, 174.63487589111327, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (63.93811595687866, 119.10930745849609, 460.47088427064955, 693.4658146118164) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.269 :	 2  tables
detected: (344.71012532958986, 33.09214437255859, 475.33140909423827, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (78.8201259399414, 120.88117635498047, 475.33140909423827, 692.8788377702986) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.270 :	 2  tables
detected: (63.93811664564345, 33.09214437255859, 174.63487589111327, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (63.93811664564345, 105.29086263427735, 460.47088314557755, 671.6188541625977) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.271 :	 2  tables
detected: (344.71012532958986, 33.09214437255859, 475.33372843017577, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (78.8201259399414, 102.6284984375, 476.46174967041014, 671.6188541625977) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.272 :	 2  tables
detected: (63.938115315614894, 33.09214437255859, 174.63487589111327, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (63.938115315614894, 120.29083415120442, 460.47094388006803, 669.8942622262138) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.273 :	 2  tables
detected: (344.71012532958986, 33.09214437255859, 474.6439741984049, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (78.11112630615234, 118.90455621066623, 474.6439741984049, 669.0678355536568) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.274 :	 2  tables
detected: (63.93811614472361, 33.09214437255859, 174.63487589111327, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (63.90771901855469, 92.53493408610026, 460.47093546142577, 673.3198167742048) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.275 :	 2  tables
detected: (344.71012532958986, 33.09214437255859, 474.8274014181698, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (78.30012166748047, 92.5349195539202, 474.8274014181698, 673.3198158053929) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.276 :	 2  tables
detected: (63.9381157569147, 33.09214437255859, 174.63487589111327, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (63.9381157569147, 92.53493408610026, 460.4709393991778, 655.1779449131558) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.277 :	 2  tables
detected: (344.71012532958986, 33.09214437255859, 474.82705938568114, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (78.30012166748047, 92.5349195539202, 474.82705938568114, 655.1779429755318) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.278 :	 2  tables
detected: (63.93811648821149, 33.09214437255859, 174.63487589111327, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (63.93811648821149, 118.8732478881836, 460.47188150634764, 682.6506778930665) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.279 :	 2  tables
detected: (344.71012532958986, 33.09214437255859, 474.6439175713433, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (78.11112630615234, 118.12594259033203, 474.6439175713433, 682.2965275024414) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.280 :	 2  tables
detected: (63.93811223754882, 33.09214437255859, 174.63487589111327, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (63.93811223754882, 120.0309627319336, 460.4708774780273, 654.4938256795248) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
Filling in gap at top of table
detected in p.281 :	 2  tables
detected: (344.71012532958986, 33.09214437255859, 475.31745950927734, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (78.8201259399414, 120.88117432047525, 475.31745950927734, 654.8942040974936) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.282 :	 2  tables
detected: (63.938112709844674, 33.09214437255859, 174.63487589111327, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (63.938112709844674, 123.71607422281902, 460.4708744262695, 676.1541813110352) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.283 :	 2  tables
detected: (344.71012532958986, 33.09214437255859, 475.45438185511995, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (78.11112630615234, 124.41501504171619, 475.45438185511995, 676.853117707095) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)
  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.284 :	 2  tables
detected: (63.93811223754882, 33.09214437255859, 174.63487589111327, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (63.93811223754882, 124.41501221110026, 460.4708744262695, 659.5619124308269) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.285 :	 2  tables
detected: (344.71012532958986, 33.09214437255859, 475.4523578857422, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (78.11112630615234, 124.41501504171619, 475.4523578857422, 659.561911015519) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.286 :	 2  tables
detected: (63.9381258389177, 33.09214437255859, 174.63487589111327, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (63.9381258389177, 205.4481502319336, 460.4742084716797, 684.1858707641602) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.287 :	 2  tables
detected: (344.71012532958986, 33.09214437255859, 474.6448477728482, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (78.11112630615234, 206.15699145174892, 474.6448477728482, 685.8867750234751) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (502.1600916632291, 206.15699145174892, 522.574222107565, 428.8575259121981) 	 at (474.6448477728482, 206.15699145174892, 538.5830078125, 685.8867750234751)  in depth 1
detected: (500.82383372865877, 206.15699145174892, 522.4893450934048, 557.775723544034) 	 at (474.6448477728482, 206.15699145174892, 538.5830078125, 685.88677

  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


exception :  No rows or columns detected
error :  (500.82383372865877, 206.15699145174892, 522.574222107565, 557.775723544034)
detected in p.288 :	 2  tables
detected: (63.827124002075195, 33.09214437255859, 174.63487589111327, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (63.827124002075195, 214.17438585476344, 460.41381876220703, 353.376758215332) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.289 :	 2  tables
detected: (344.71012532958986, 33.09214437255859, 474.3553704793294, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (77.72185679524739, 214.17438585476344, 474.3553704793294, 353.376758215332) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (496.2637418111165, 214.17438585476344, 531.4501271993001, 298.8043026394314) 	 at (474.3553704793294, 214.17438585476344, 538.5830078125, 353.376758215332)  in depth 1


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.290 :	 3  tables
detected: (63.93802297363281, 219.84399450073244, 460.46997924397783, 359.0461612915039) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.291 :	 1  tables
detected: (78.11112630615234, 205.92091024169923, 474.65120279541014, 671.7439762329102) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.294 :	 1  tables
detected: (63.938072564697265, 33.09214437255859, 174.63487589111327, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (63.938072564697265, 92.53495443115234, 461.88386881103514, 678.4607242797852) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.295 :	 2  tables
detected: (344.71012532958986, 33.09214437255859, 474.3516155456543, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (77.75722539672851, 212.00328481445314, 474.3516155456543, 651.2792545532227) 	 at (0.0, 0.0, 538.583007812

  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.296 :	 3  tables
detected: (63.93814351806641, 33.09214437255859, 174.63487589111327, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (63.93814351806641, 203.75598151198164, 460.475000402832, 671.279803869629) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.297 :	 2  tables
detected: (344.71012532958986, 33.09214437255859, 475.11686034749346, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (78.8201259399414, 103.88718362285907, 475.11686034749346, 672.8463932250977) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.298 :	 2  tables
detected: (62.87529154866536, 33.09214437255859, 174.63487589111327, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (62.87529154866536, 104.92105813457782, 458.8768802856445, 676.0785709594727) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.299 :	 2  tables
detected: (344.71012532958986, 33.09214437255859, 475.11686034749346, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (78.8201259399414, 105.39330591853215, 475.11686034749346, 668.2975040649414) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.300 :	 2  tables
detected: (64.08302266845703, 33.09214437255859, 174.63487589111327, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (64.08302266845703, 204.5032067598783, 460.76475870361327, 667.310077307129) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.301 :	 2  tables
detected: (344.71012532958986, 33.09214437255859, 474.6442549601237, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (78.11112630615234, 98.0681728149414, 474.6442549601237, 668.5442692016602) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.302 :	 2  tables
detected: (63.93805603434245, 33.09214437255859, 174.63487589111327, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (63.93805603434245, 103.59100759277344, 460.4708744262695, 666.335467932129) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.303 :	 2  tables
detected: (344.71012532958986, 33.09214437255859, 474.6442549601237, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (78.11112630615234, 104.5821346069336, 474.6442549601237, 666.2828556274414) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.304 :	 2  tables
detected: (78.11112630615234, 194.5821346069336, 474.65095865478514, 660.1720768188477) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.308 :	 1  tables
detected: (63.93805603434245, 33.09214437255859, 174.63487589111327, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (63.93805603434245, 92.53496315046038, 460.4708744262695, 660.563739416504) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.309 :	 2  tables
detected: (344.71012532958986, 33.09214437255859, 474.6442142700195, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (78.11112630615234, 92.53496315046038, 474.6442142700195, 660.563739416504) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.310 :	 2  tables
detected: (63.93805603434245, 33.09214437255859, 174.63487589111327, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (63.93805603434245, 92.53496315046038, 460.4708744262695, 676.8629947875977) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.311 :	 2  tables
detected: (344.71012532958986, 33.09214437255859, 474.6442142700195, 61.12102091064453) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (78.11112630615234, 92.53496315046038, 474.6442142700195, 678.2803531860352) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.312 :	 2  tables
527 283 527
./train_source/재정통계해설.pdf /content/src/
Processing 재정통계해설.pdf...
/content/src/train_source/재정통계해설.pdf /content/drive/MyDrive/kdt-EST-AI/project/dacon_fis/src/processed/./train_source/재정통계해설.pdf


  0%|          | 0/164 [00:00<?, ?it/s]

detected: (0.0, 0.0, 532.9130249023438, 728.5040283203125) 	 at (0.0, 0.0, 532.9130249023438, 728.5040283203125)  in depth 0
detected in p.5 :	 1  tables
detected: (0.0, 0.0, 532.9130249023438, 728.5040283203125) 	 at (0.0, 0.0, 532.9130249023438, 728.5040283203125)  in depth 0
detected in p.6 :	 1  tables
detected: (143.94719763183593, 89.30606146240234, 452.1672127685547, 695.5585091552734) 	 at (0.0, 0.0, 532.9130249023438, 728.5040283203125)  in depth 0
exception :  The identified boxes have significant overlap: 35.93% of area is overlapping (Max is 30.00%)
error :  (143.94719763183593, 89.30606146240234, 452.1672127685547, 695.5585091552734)
detected: (79.71106024169922, 89.66639013671875, 478.6707833251953, 642.9484627685547) 	 at (0.0, 0.0, 532.9130249023438, 728.5040283203125)  in depth 0
detected in p.8 :	 1  tables
detected: (67.40705938720703, 593.6350943603516, 192.4129403076172, 641.2349617919922) 	 at (0.0, 0.0, 532.9130249023438, 728.5040283203125)  in depth 0
detected: 

  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)
  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.11 :	 2  tables
detected: (86.77438612365722, 432.4782617520419, 478.42106476508246, 620.0303109130859) 	 at (0.0, 0.0, 532.9130249023438, 728.5040283203125)  in depth 0
detected in p.12 :	 1  tables
detected: (61.10299749755859, 237.47393103027343, 447.8851082763672, 629.4993660888672) 	 at (0.0, 0.0, 532.9130249023438, 728.5040283203125)  in depth 0
exception :  The identified boxes have significant overlap: 31.14% of area is overlapping (Max is 30.00%)
error :  (61.10299749755859, 237.47393103027343, 447.8851082763672, 629.4993660888672)
detected: (86.86185013834636, 115.54197912597655, 477.39898614501953, 290.47001224772134) 	 at (0.0, 0.0, 532.9130249023438, 728.5040283203125)  in depth 0
detected in p.14 :	 1  tables
detected: (55.78691584361684, 113.28783483886718, 446.21796960449217, 206.26168298339843) 	 at (0.0, 0.0, 532.9130249023438, 728.5040283203125)  in depth 0
detected in p.15 :	 1  tables
detected: (86.94521017456054, 115.54193029785156, 477.3989327392578

  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.18 :	 3  tables
detected: (67.86761541748047, 325.77300329589843, 447.87811975097657, 620.0497811279297) 	 at (0.0, 0.0, 532.9130249023438, 728.5040283203125)  in depth 0
detected in p.19 :	 1  tables
detected: (86.56176062011718, 89.83510656738281, 418.1649239501953, 259.3361249097741) 	 at (0.0, 0.0, 532.9130249023438, 728.5040283203125)  in depth 0
detected: (418.1649239501953, 98.49756661224365, 430.11607512664796, 259.3361249097741) 	 at (418.1649239501953, 89.83510656738281, 532.9130249023438, 259.3361249097741)  in depth 1
detected: (55.65790005111694, 350.1089747043186, 446.21795739746096, 477.9898717312283) 	 at (0.0, 0.0, 532.9130249023438, 728.5040283203125)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.21 :	 3  tables
detected: (83.20316381835937, 55.292524736676896, 476.8989327392578, 666.8199837646484) 	 at (0.0, 0.0, 532.9130249023438, 728.5040283203125)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.22 :	 1  tables
detected: (86.81417986479259, 115.5419669189453, 477.3190644226074, 278.9581368408203) 	 at (0.0, 0.0, 532.9130249023438, 728.5040283203125)  in depth 0
detected: (86.81417986479259, 308.774906829834, 477.3190644226074, 629.5060257025825) 	 at (0.0, 0.0, 532.9130249023438, 728.5040283203125)  in depth 0
detected in p.24 :	 2  tables
detected: (86.56176062011718, 505.78191442871093, 412.76593713378907, 603.2972643253102) 	 at (0.0, 0.0, 532.9130249023438, 728.5040283203125)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.25 :	 1  tables
detected: (116.68246527099609, 155.3880850830078, 201.0474443533761, 293.9609414660884) 	 at (0.0, 0.0, 532.9130249023438, 728.5040283203125)  in depth 0
detected: (204.3290373938424, 251.44119396972656, 444.2421382010324, 293.9609414660884) 	 at (201.0474443533761, 155.3880850830078, 532.9130249023438, 293.9609414660884)  in depth 1


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)
  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.26 :	 2  tables
detected: (82.56176062011718, 374.78442704264324, 413.1329415283203, 519.2952896584903) 	 at (0.0, 0.0, 532.9130249023438, 728.5040283203125)  in depth 0
detected in p.27 :	 1  tables
detected: (86.69518041992187, 228.21907873535156, 477.39993981933594, 509.4581520996094) 	 at (0.0, 0.0, 532.9130249023438, 728.5040283203125)  in depth 0
detected in p.28 :	 1  tables
detected: (84.95906134033203, 533.8895930979659, 405.23912235514325, 598.1117318115234) 	 at (0.0, 0.0, 532.9130249023438, 728.5040283203125)  in depth 0
detected: (405.23912235514325, 533.8895930979659, 505.1429989420573, 598.1117318115234) 	 at (405.23912235514325, 533.8895930979659, 532.9130249023438, 598.1117318115234)  in depth 1
exception :  The identified boxes have significant overlap: 46.60% of area is overlapping (Max is 30.00%)
error :  (405.23912235514325, 533.8895930979659, 505.1429989420573, 598.1117318115234)
detected in p.29 :	 1  tables
detected: (86.77854350789387, 115.5419721

  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.42 :	 1  tables
detected: (64.26052923583984, 283.56914587402343, 447.39017419433594, 638.8456185302734) 	 at (0.0, 0.0, 532.9130249023438, 728.5040283203125)  in depth 0
detected in p.43 :	 1  tables
detected: (86.76661749267578, 272.5380103149414, 477.398980211046, 446.60396575131625) 	 at (0.0, 0.0, 532.9130249023438, 728.5040283203125)  in depth 0
detected in p.44 :	 1  tables
detected: (65.2324683227539, 242.21624060058593, 449.0791085205078, 649.4780648193359) 	 at (0.0, 0.0, 532.9130249023438, 728.5040283203125)  in depth 0
exception :  The identified boxes have significant overlap: 31.46% of area is overlapping (Max is 30.00%)
error :  (65.2324683227539, 242.21624060058593, 449.0791085205078, 649.4780648193359)
detected: (86.71782924860173, 115.54203609212239, 477.3989525756836, 497.74187866741676) 	 at (0.0, 0.0, 532.9130249023438, 728.5040283203125)  in depth 0
detected in p.46 :	 1  tables
detected: (82.72686071777343, 246.3190161743164, 476.9711678466797, 640.

  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)
  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.48 :	 2  tables
detected: (294.7580496826172, 0.0, 511.41495446777344, 728.5040283203125) 	 at (0.0, 0.0, 532.9130249023438, 728.5040283203125)  in depth 0
table does not sound


Unnamed: 0,Unnamed: 1,Unnamed: 2,Unnamed: 3


exception :  table does not sound
errs :  1
detected: (121.38392960611978, 143.7048321126302, 445.21195153808594, 343.3505623779297) 	 at (0.0, 0.0, 532.9130249023438, 728.5040283203125)  in depth 0
exception :  The identified boxes have significant overlap: 42.48% of area is overlapping (Max is 30.00%)
error :  (121.38392960611978, 143.7048321126302, 445.21195153808594, 343.3505623779297)
detected: (125.41835852050781, 143.78440669759115, 441.38324670410157, 344.8255684814453) 	 at (0.0, 0.0, 532.9130249023438, 728.5040283203125)  in depth 0
detected: (95.43646307373047, 143.78440669759115, 125.19946984863282, 344.8255684814453) 	 at (0.0, 143.78440669759115, 125.41835852050781, 344.8255684814453)  in depth 1
exception :  The identified boxes have significant overlap: 78.35% of area is overlapping (Max is 30.00%)
error :  (125.41835852050781, 143.78440669759115, 441.38324670410157, 344.8255684814453)
detected in p.54 :	 1  tables
detected: (51.54576177978515, 53.89635725402832, 445.71

  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected: (82.72686071777343, 55.1060514351981, 476.8989327392578, 666.8209603271484) 	 at (0.0, 0.0, 532.9130249023438, 728.5040283203125)  in depth 0
detected in p.58 :	 1  tables
detected: (301.4200675048828, 0.0, 511.41495446777344, 728.5040283203125) 	 at (0.0, 0.0, 532.9130249023438, 728.5040283203125)  in depth 0
detected: (201.2456290283203, 283.4943778076172, 301.4200675048828, 641.9825814208984) 	 at (0.0, 0.0, 301.4200675048828, 728.5040283203125)  in depth 1


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.60 :	 2  tables
detected: (125.76265783691406, 144.76399843052454, 442.6185372314453, 345.98754561360676) 	 at (0.0, 0.0, 532.9130249023438, 728.5040283203125)  in depth 0
exception :  The identified boxes have significant overlap: 32.73% of area is overlapping (Max is 30.00%)
error :  (125.76265783691406, 144.76399843052454, 442.6185372314453, 345.98754561360676)
detected: (120.82196366373698, 141.5180899658203, 449.80704431152344, 343.6492553536551) 	 at (0.0, 0.0, 532.9130249023438, 728.5040283203125)  in depth 0
exception :  The identified boxes have significant overlap: 75.77% of area is overlapping (Max is 30.00%)
error :  (120.82196366373698, 141.5180899658203, 449.80704431152344, 343.6492553536551)
detected: (125.78889532470703, 130.78487463378906, 444.51727990722657, 313.9483178100586) 	 at (0.0, 0.0, 532.9130249023438, 728.5040283203125)  in depth 0
detected: (117.89304418945312, 423.20659704589843, 415.4634116868239, 583.4069462343414) 	 at (0.0, 0.0, 532.91302

  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected: (82.72686071777343, 55.10591192626953, 476.8989327392578, 476.89896325683594) 	 at (0.0, 0.0, 532.9130249023438, 728.5040283203125)  in depth 0
detected: (499.70723724365234, 87.96025219726562, 518.9058633728027, 179.66551971435547) 	 at (476.8989327392578, 55.10591192626953, 532.9130249023438, 476.89896325683594)  in depth 1
detected: (498.1454563140869, 89.14211026000976, 517.4653726501465, 343.22664642333984) 	 at (476.8989327392578, 55.10591192626953, 532.9130249023438, 476.89896325683594)  in depth 1
exception :  list index out of range
error :  (498.1454563140869, 87.96025219726562, 518.9058633728027, 343.22664642333984)
detected in p.70 :	 1  tables
detected: (93.85305853271484, 413.9352957763672, 415.67584924316407, 582.5073693237305) 	 at (0.0, 0.0, 532.9130249023438, 728.5040283203125)  in depth 0
detected: (61.65586157226562, 319.25225134277343, 439.00592736816407, 402.20722131347657) 	 at (0.0, 0.0, 532.9130249023438, 413.9352957763672)  in depth 1
detected: (56.4

  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected: (125.99056311035156, 141.0287100830078, 443.9056466064453, 409.64325443522137) 	 at (0.0, 0.0, 532.9130249023438, 728.5040283203125)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.80 :	 1  tables
detected: (90.24012569173176, 165.6120841064453, 411.765548034668, 389.93082578822543) 	 at (0.0, 0.0, 532.9130249023438, 728.5040283203125)  in depth 0
detected: (56.955742553710934, 66.92252416992187, 446.6170723876953, 122.95757989501953) 	 at (0.0, 0.0, 532.9130249023438, 165.6120841064453)  in depth 1


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.81 :	 2  tables
detected: (51.54576177978515, 53.89635725402832, 445.71793298339844, 666.8209603271484) 	 at (0.0, 0.0, 532.9130249023438, 728.5040283203125)  in depth 0
Filling in gap at top of table
detected in p.83 :	 1  tables


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected: (192.995064453125, 93.19710989379882, 371.0999520263672, 115.90110330200196) 	 at (0.0, 0.0, 532.9130249023438, 728.5040283203125)  in depth 0
detected in p.84 :	 1  tables
detected: (124.97435827636718, 136.0627677001953, 441.8875496826172, 403.6314984828404) 	 at (0.0, 0.0, 532.9130249023438, 728.5040283203125)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.86 :	 1  tables
detected: (69.13806982421875, 272.0812003173828, 446.9137032470703, 657.8796761474609) 	 at (0.0, 0.0, 532.9130249023438, 728.5040283203125)  in depth 0
detected in p.87 :	 1  tables
detected: (124.932564453125, 132.6117178955078, 441.1574471435547, 253.8975616280692) 	 at (0.0, 0.0, 532.9130249023438, 728.5040283203125)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.88 :	 1  tables
detected: (90.59972448730468, 435.5628897705078, 407.30155114746094, 576.9866664167132) 	 at (0.0, 0.0, 532.9130249023438, 728.5040283203125)  in depth 0
detected: (407.30155114746094, 445.49705067443847, 428.168149559021, 576.9866664167132) 	 at (407.30155114746094, 435.5628897705078, 532.9130249023438, 576.9866664167132)  in depth 1
exception :  The identified boxes have significant overlap: 34.09% of area is overlapping (Max is 30.00%)
error :  (90.59972448730468, 435.5628897705078, 407.30155114746094, 576.9866664167132)
detected in p.89 :	 1  tables
detected: (89.90142635091145, 136.12069006347656, 410.4682381591797, 407.55092117745534) 	 at (0.0, 0.0, 532.9130249023438, 728.5040283203125)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.91 :	 1  tables
detected: (124.80376501464843, 226.40691442871093, 434.2396639404297, 409.4546875915527) 	 at (0.0, 0.0, 532.9130249023438, 728.5040283203125)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.92 :	 1  tables
detected: (68.70386572265625, 265.8577811279297, 447.41376428222657, 651.4045174560547) 	 at (0.0, 0.0, 532.9130249023438, 728.5040283203125)  in depth 0
exception :  The identified boxes have significant overlap: 32.96% of area is overlapping (Max is 30.00%)
error :  (68.70386572265625, 265.8577811279297, 447.41376428222657, 651.4045174560547)
detected: (121.7002432861328, 132.7206046142578, 439.24262170410157, 247.55882087053573) 	 at (0.0, 0.0, 532.9130249023438, 728.5040283203125)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.94 :	 1  tables
detected: (125.54195471191406, 131.82161779785156, 441.5528328857422, 247.21233736746652) 	 at (0.0, 0.0, 532.9130249023438, 728.5040283203125)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.96 :	 1  tables
detected: (51.54576177978515, 53.89635725402832, 445.71793298339844, 666.8209603271484) 	 at (0.0, 0.0, 532.9130249023438, 728.5040283203125)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.97 :	 1  tables
detected: (223.3190619506836, 93.19710989379882, 338.9742684326172, 115.90110330200196) 	 at (0.0, 0.0, 532.9130249023438, 728.5040283203125)  in depth 0
detected: (343.12519760131835, 99.46333542633056, 459.85194567871093, 115.90110330200196) 	 at (338.9742684326172, 93.19710989379882, 532.9130249023438, 115.90110330200196)  in depth 1
detected: (122.31066198730468, 120.47896643066406, 440.40262536621094, 291.8727464773996) 	 at (0.0, 0.0, 532.9130249023438, 728.5040283203125)  in depth 0
detected: (184.70909948730468, 388.0998312988281, 380.5539315185547, 412.8302071533203) 	 at (0.0, 0.0, 532.9130249023438, 728.5040283203125)  in depth 0
detected: (122.31066198730468, 120.47894668579102, 440.40262536621094, 291.87272673252653) 	 at (0.0, 115.90110330200196, 532.9130249023438, 388.0998312988281)  in depth 1


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)
  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.98 :	 4  tables
detected: (86.62815924072265, 564.6398551025391, 179.23454217529297, 650.3196785888672) 	 at (0.0, 0.0, 532.9130249023438, 728.5040283203125)  in depth 0
detected: (197.9261589050293, 564.6398551025391, 350.83849200439454, 650.3196785888672) 	 at (179.23454217529297, 564.6398551025391, 532.9130249023438, 650.3196785888672)  in depth 1
detected: (197.92616339111328, 564.6398551025391, 290.5325234375, 650.3196785888672) 	 at (0.0, 0.0, 532.9130249023438, 728.5040283203125)  in depth 0
detected: (86.62815924072265, 564.6398786621094, 179.23454217529297, 650.3196785888672) 	 at (0.0, 564.6398551025391, 197.92616339111328, 650.3196785888672)  in depth 1
detected: (308.4528923034668, 564.6398786621094, 401.0592676086426, 650.3196785888672) 	 at (290.5325234375, 564.6398551025391, 532.9130249023438, 650.3196785888672)  in depth 1
detected: (308.4529044189453, 564.6398551025391, 401.05927209472657, 650.3196785888672) 	 at (0.0, 0.0, 532.9130249023438, 728.50402832

  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


Unnamed: 0,보통세,목적세


exception :  table does not sound
errs :  1
detected in p.99 :	 1  tables
detected: (122.57431351725259, 130.17879553222656, 440.659232421875, 251.11785494559152) 	 at (0.0, 0.0, 532.9130249023438, 728.5040283203125)  in depth 0
detected: (122.57431351725259, 362.00183426920574, 283.6524727783203, 461.57279138183594) 	 at (0.0, 0.0, 532.9130249023438, 728.5040283203125)  in depth 0
detected: (283.6524727783203, 432.7205391642253, 438.2440477294922, 461.57279138183594) 	 at (283.6524727783203, 362.00183426920574, 532.9130249023438, 461.57279138183594)  in depth 1


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)
  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.100 :	 3  tables
detected: (125.27919592285156, 122.64840002441406, 451.54434899902344, 364.03709683566626) 	 at (0.0, 0.0, 532.9130249023438, 728.5040283203125)  in depth 0
exception :  The identified boxes have significant overlap: 74.83% of area is overlapping (Max is 30.00%)
error :  (125.27919592285156, 122.64840002441406, 451.54434899902344, 364.03709683566626)
detected: (51.54576177978515, 53.89635725402832, 445.71793298339844, 666.8209603271484) 	 at (0.0, 0.0, 532.9130249023438, 728.5040283203125)  in depth 0
Filling in gap at top of table
detected in p.103 :	 1  tables
detected: (125.9646689453125, 119.5820522664388, 441.92902307128907, 358.5626540963309) 	 at (0.0, 0.0, 532.9130249023438, 728.5040283203125)  in depth 0
detected: (231.26206274414062, 93.19710989379882, 332.83194665527344, 115.90110330200196) 	 at (0.0, 0.0, 532.9130249023438, 119.5820522664388)  in depth 1
detected: (335.76438331604004, 97.24639740753173, 459.2378221435547, 115.90110330200196) 	

  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)
  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)
  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.104 :	 3  tables
detected: (125.37637396240234, 143.6330262075571, 444.29193811035157, 364.3959284532335) 	 at (0.0, 0.0, 532.9130249023438, 728.5040283203125)  in depth 0
exception :  The identified boxes have significant overlap: 64.36% of area is overlapping (Max is 30.00%)
error :  (125.37637396240234, 143.6330262075571, 444.29193811035157, 364.3959284532335)
detected: (124.73935766601562, 142.0165030517578, 443.6396248779297, 404.4771253138951) 	 at (0.0, 0.0, 532.9130249023438, 728.5040283203125)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.110 :	 1  tables
detected: (95.22525854492187, 224.63060827636718, 406.3679411214193, 406.7625802001953) 	 at (0.0, 0.0, 532.9130249023438, 728.5040283203125)  in depth 0
detected: (59.449822143554684, 80.17792578125, 446.93476037597657, 128.33807305908203) 	 at (0.0, 0.0, 532.9130249023438, 224.63060827636718)  in depth 1


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.111 :	 2  tables
detected: (120.8936977742513, 167.10918493652343, 447.87572920735676, 404.4999227294922) 	 at (0.0, 0.0, 532.9130249023438, 728.5040283203125)  in depth 0
exception :  The identified boxes have significant overlap: 49.13% of area is overlapping (Max is 30.00%)
error :  (120.8936977742513, 167.10918493652343, 447.87572920735676, 404.4999227294922)
detected: (121.92432979329428, 150.13387365722656, 443.9051278076172, 347.92486178152905) 	 at (0.0, 0.0, 532.9130249023438, 728.5040283203125)  in depth 0
detected: (443.9051278076172, 150.13387365722656, 466.2058807296753, 347.92486178152905) 	 at (443.9051278076172, 150.13387365722656, 532.9130249023438, 347.92486178152905)  in depth 1
detected: (82.36082525634765, 70.46102972412109, 478.50736169433594, 123.410208984375) 	 at (0.0, 0.0, 532.9130249023438, 150.13387365722656)  in depth 1


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)
  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.116 :	 3  tables
detected: (95.28775854492187, 172.89608068847656, 409.61600896747296, 412.5474399937221) 	 at (0.0, 0.0, 532.9130249023438, 728.5040283203125)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.117 :	 1  tables
detected: (51.54576177978515, 53.89635725402832, 445.71793298339844, 666.8209603271484) 	 at (0.0, 0.0, 532.9130249023438, 728.5040283203125)  in depth 0
detected in p.119 :	 1  tables
detected: (225.87606115722656, 93.19710989379882, 340.32600717163086, 115.90110330200196) 	 at (0.0, 0.0, 532.9130249023438, 728.5040283203125)  in depth 0
detected: (121.73296423339843, 120.06478186035156, 442.98795251464844, 374.7341657918294) 	 at (0.0, 0.0, 532.9130249023438, 728.5040283203125)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)
  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.120 :	 2  tables
detected: (171.90364468819755, 91.17073889160156, 331.4149239501953, 115.90110330200196) 	 at (0.0, 0.0, 532.9130249023438, 728.5040283203125)  in depth 0
detected: (94.45306463623047, 118.27211828613281, 420.03913049316407, 386.28461134847004) 	 at (0.0, 0.0, 532.9130249023438, 728.5040283203125)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.121 :	 2  tables
detected: (209.28035803222656, 93.19710989379882, 355.71280603027344, 115.90110330200196) 	 at (0.0, 0.0, 532.9130249023438, 728.5040283203125)  in depth 0
detected: (125.66016455078125, 119.91506262207031, 451.70954064941407, 386.47036908224356) 	 at (0.0, 0.0, 532.9130249023438, 728.5040283203125)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)
  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.122 :	 2  tables
detected: (143.9548575439453, 434.8349844970703, 364.27423791503907, 460.3908455810547) 	 at (0.0, 0.0, 532.9130249023438, 728.5040283203125)  in depth 0
detected: (77.83676214599609, 495.51058264160156, 299.5731270751953, 521.0664437255859) 	 at (0.0, 0.0, 532.9130249023438, 728.5040283203125)  in depth 0
detected: (62.849487976074215, 258.9370047607422, 446.15665368652344, 434.8349844970703) 	 at (0.0, 0.0, 532.9130249023438, 434.8349844970703)  in depth 1


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.127 :	 3  tables
detected: (132.73325923665365, 186.8267466583252, 440.9046678641183, 388.2925815952846) 	 at (0.0, 0.0, 532.9130249023438, 728.5040283203125)  in depth 0
detected: (28.184914306640625, 186.8267466583252, 132.73325923665365, 388.2925815952846) 	 at (0.0, 186.8267466583252, 132.73325923665365, 388.2925815952846)  in depth 1


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)
  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.128 :	 2  tables
detected: (118.06266088867187, 139.52840490722656, 447.0268319091797, 405.14183958217075) 	 at (0.0, 0.0, 532.9130249023438, 728.5040283203125)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.130 :	 1  tables
detected: (101.55213772837321, 170.21949835205078, 405.48216333007815, 298.69786985560825) 	 at (0.0, 0.0, 532.9130249023438, 728.5040283203125)  in depth 0
detected: (12.86509390258789, 170.21949835205078, 97.98421411132813, 298.69786985560825) 	 at (0.0, 170.21949835205078, 101.55213772837321, 298.69786985560825)  in depth 1
detected: (60.1396948852539, 74.51168890380859, 445.3864510498047, 123.00078515625) 	 at (0.0, 0.0, 532.9130249023438, 170.21949835205078)  in depth 1


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)
  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)
  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.131 :	 3  tables
detected: (116.99655981445312, 136.03218908691406, 444.9614327392578, 407.1934578857422) 	 at (0.0, 0.0, 532.9130249023438, 728.5040283203125)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.134 :	 1  tables
detected: (118.48416204833984, 137.76400061035156, 443.11945275878907, 405.93205303083147) 	 at (0.0, 0.0, 532.9130249023438, 728.5040283203125)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.138 :	 1  tables
detected: (114.93216009521484, 137.3055655517578, 446.89554528808594, 405.848430507115) 	 at (0.0, 0.0, 532.9130249023438, 728.5040283203125)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.142 :	 1  tables
detected: (96.120064453125, 431.04610510253906, 384.52293583170575, 592.9175863037109) 	 at (0.0, 0.0, 532.9130249023438, 728.5040283203125)  in depth 0
detected: (385.36080144700554, 515.2179884033203, 415.0740692698161, 592.9175863037109) 	 at (384.52293583170575, 431.04610510253906, 532.9130249023438, 592.9175863037109)  in depth 1
detected: (385.3608067278181, 549.6989913872612, 410.28095178222657, 592.9175863037109) 	 at (0.0, 0.0, 532.9130249023438, 728.5040283203125)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.145 :	 2  tables
detected: (86.69511390955307, 113.35958166503906, 477.39993981933594, 500.28897790527344) 	 at (0.0, 0.0, 532.9130249023438, 728.5040283203125)  in depth 0
detected in p.146 :	 1  tables
detected: (114.49926062011718, 191.8978507080078, 433.4029610595703, 405.93265466308594) 	 at (0.0, 0.0, 532.9130249023438, 728.5040283203125)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.148 :	 1  tables
detected: (116.13416357421875, 162.04841571916853, 443.4459450683594, 405.4726154413638) 	 at (0.0, 0.0, 532.9130249023438, 728.5040283203125)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.150 :	 1  tables
detected: (51.54576177978515, 53.89635725402832, 445.71793298339844, 666.8209603271484) 	 at (0.0, 0.0, 532.9130249023438, 728.5040283203125)  in depth 0
detected in p.151 :	 1  tables
detected: (219.93649816894532, 381.72188635253906, 353.336234375, 406.45226220703125) 	 at (0.0, 0.0, 532.9130249023438, 728.5040283203125)  in depth 0
detected: (339.6403586425781, 490.2543875732422, 440.5486621500651, 537.9662850341797) 	 at (0.0, 0.0, 532.9130249023438, 728.5040283203125)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.152 :	 2  tables
detected: (91.2812252400716, 516.0149313964844, 404.60260705566407, 612.8584562581381) 	 at (0.0, 0.0, 532.9130249023438, 728.5040283203125)  in depth 0
detected: (67.84755773925781, 245.49898596191406, 446.2419197998047, 490.61673669433594) 	 at (0.0, 0.0, 532.9130249023438, 728.5040283203125)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.153 :	 2  tables
detected: (130.3527277760225, 129.56485815429687, 427.45414893567886, 225.28956791178385) 	 at (0.0, 0.0, 532.9130249023438, 728.5040283203125)  in depth 0
detected: (116.03736181640625, 351.72992357288706, 446.14923791503907, 490.9012004300631) 	 at (0.0, 0.0, 532.9130249023438, 728.5040283203125)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


exception :  The identified boxes have significant overlap: 30.45% of area is overlapping (Max is 30.00%)
error :  (116.03736181640625, 351.72992357288706, 446.14923791503907, 490.9012004300631)
detected in p.154 :	 1  tables
detected: (56.2769133605957, 114.85047979736328, 446.6025155029297, 641.9490120849609) 	 at (0.0, 0.0, 532.9130249023438, 728.5040283203125)  in depth 0
detected in p.155 :	 1  tables
detected: (87.25640173339843, 107.22008581542968, 475.91327600097657, 643.9664681396484) 	 at (0.0, 0.0, 532.9130249023438, 728.5040283203125)  in depth 0
detected in p.156 :	 1  tables
detected: (56.03360815429687, 89.54754705810547, 445.3783028564453, 644.4747078857422) 	 at (0.0, 0.0, 532.9130249023438, 728.5040283203125)  in depth 0
detected in p.157 :	 1  tables
detected: (88.24754400634765, 87.75048895263672, 475.1918709716797, 605.6476204833984) 	 at (0.0, 0.0, 532.9130249023438, 728.5040283203125)  in depth 0
detected in p.158 :	 1  tables
detected: (57.91359014892578, 82.080

  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.159 :	 2  tables
detected: (89.41627569580078, 80.5084692993164, 350.12982873535157, 602.4159920654297) 	 at (0.0, 0.0, 532.9130249023438, 728.5040283203125)  in depth 0
exception :  The identified boxes have significant overlap: 33.00% of area is overlapping (Max is 30.00%)
error :  (89.41627569580078, 80.5084692993164, 350.12982873535157, 602.4159920654297)
130 90 130
./train_source/국토교통부_전세임대(융자).pdf /content/src/
Processing 국토교통부_전세임대(융자).pdf...
/content/src/train_source/국토교통부_전세임대(융자).pdf /content/drive/MyDrive/kdt-EST-AI/project/dacon_fis/src/processed/./train_source/국토교통부_전세임대(융자).pdf


  0%|          | 0/4 [00:00<?, ?it/s]

detected: (51.49688489467076, 188.08062294921876, 544.377118528054, 259.749394140625) 	 at (0.0, 0.0, 595.0, 841.0)  in depth 0
detected: (51.49688489467076, 254.12762001953126, 544.377118528054, 325.79742880859374) 	 at (0.0, 0.0, 595.0, 841.0)  in depth 0
detected: (51.49688489467076, 536.7785599609375, 544.377118528054, 668.6223799804687) 	 at (0.0, 0.0, 595.0, 841.0)  in depth 0
detected: (49.63839653930664, 327.24761513671876, 544.377118528054, 431.138066015625) 	 at (0.0, 0.0, 595.0, 841.0)  in depth 0
detected: (51.49688489467076, 449.5146134277344, 544.377118528054, 511.5764510253906) 	 at (0.0, 0.0, 595.0, 841.0)  in depth 0
detected: (71.76915291748047, 97.67351845703125, 407.20053549804686, 165.0793501953125) 	 at (0.0, 0.0, 595.0, 188.08062294921876)  in depth 1
detected: (51.54259931844076, 259.749394140625, 544.3370609700521, 325.7974333007812) 	 at (0.0, 259.749394140625, 595.0, 327.24761513671876)  in depth 1


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)
  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)
  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)
  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)
  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.0 :	 6  tables
detected: (50.10360014241537, 218.40661171875, 550.8133996744791, 479.5893904785156) 	 at (0.0, 0.0, 595.0, 841.0)  in depth 0
detected: (50.10360014241537, 97.37163857421875, 543.8574060221354, 172.60541220703124) 	 at (0.0, 0.0, 595.0, 841.0)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)
  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.1 :	 2  tables
detected: (73.3706009765625, 650.1746170898438, 544.2170658528646, 703.315800390625) 	 at (0.0, 0.0, 595.0, 841.0)  in depth 0
detected in p.2 :	 1  tables
detected: (79.84759835205078, 187.480586328125, 544.8163903971354, 276.77142783203124) 	 at (0.0, 0.0, 595.0, 841.0)  in depth 0
detected: (53.28945855102539, 137.9684250732422, 501.16891928710936, 187.480586328125) 	 at (0.0, 0.0, 595.0, 187.480586328125)  in depth 1
detected in p.3 :	 2  tables
11 4 11
./train_source/고용노동부_청년일자리창출지원.pdf /content/src/
Processing 고용노동부_청년일자리창출지원.pdf...
/content/src/train_source/고용노동부_청년일자리창출지원.pdf /content/drive/MyDrive/kdt-EST-AI/project/dacon_fis/src/processed/./train_source/고용노동부_청년일자리창출지원.pdf


  0%|          | 0/3 [00:00<?, ?it/s]

detected: (51.182799361165365, 453.709590234375, 544.2250695963542, 542.76038046875) 	 at (0.0, 0.0, 595.0, 841.0)  in depth 0
detected: (51.182799361165365, 346.4266007324219, 544.2250695963542, 402.83752890625) 	 at (0.0, 0.0, 595.0, 841.0)  in depth 0
detected: (51.182799361165365, 169.260615625, 544.2250695963542, 314.8894087890625) 	 at (0.0, 0.0, 595.0, 841.0)  in depth 0
detected: (76.70550850830078, 75.2702820678711, 411.8840376953125, 145.758671484375) 	 at (0.0, 0.0, 595.0, 169.260615625)  in depth 1
Filling in gap at top of table
detected in p.0 :	 4  tables


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)
  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)
  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected: (73.3706009765625, 386.1036026855469, 544.2170658528646, 454.1774031738281) 	 at (0.0, 0.0, 595.0, 841.0)  in depth 0
detected: (52.30293587646484, 70.36753395996094, 546.1914107421875, 386.1036026855469) 	 at (0.0, 0.0, 595.0, 386.1036026855469)  in depth 1
detected in p.1 :	 2  tables
detected: (259.3010208984375, 366.30279091796876, 530.4326827148437, 469.14252158203124) 	 at (0.0, 0.0, 595.0, 841.0)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.2 :	 1  tables
7 3 7
./train_source/고용노동부_내일배움카드(일반).pdf /content/src/
Processing 고용노동부_내일배움카드(일반).pdf...
/content/src/train_source/고용노동부_내일배움카드(일반).pdf /content/drive/MyDrive/kdt-EST-AI/project/dacon_fis/src/processed/./train_source/고용노동부_내일배움카드(일반).pdf


  0%|          | 0/4 [00:00<?, ?it/s]

detected: (51.182799361165365, 176.6926224609375, 544.2250695963542, 251.8383833984375) 	 at (0.0, 0.0, 595.0, 841.0)  in depth 0
detected: (51.182799361165365, 249.0936234375, 544.2250695963542, 324.239384375) 	 at (0.0, 0.0, 595.0, 841.0)  in depth 0
detected: (46.7745430847168, 353.2596085449219, 544.2250695963542, 408.62738486328124) 	 at (0.0, 0.0, 595.0, 841.0)  in depth 0
detected: (48.71673897705078, 460.4226029296875, 544.2250695963542, 536.167362890625) 	 at (0.0, 0.0, 595.0, 841.0)  in depth 0
detected: (70.99149636230469, 79.8102753540039, 411.76840659179686, 138.70912619628905) 	 at (0.0, 0.0, 595.0, 176.6926224609375)  in depth 1
detected: (51.54259931844076, 251.8383833984375, 543.8574060221354, 324.2393888671875) 	 at (0.0, 251.8383833984375, 595.0, 353.2596085449219)  in depth 1


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)
  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)
  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.0 :	 5  tables
detected: (54.03471306762695, 142.36985329589845, 537.5889327148437, 784.9164473632812) 	 at (0.0, 0.0, 595.0, 841.0)  in depth 0
exception :  The identified boxes have significant overlap: 34.89% of area is overlapping (Max is 30.00%)
error :  (54.03471306762695, 142.36985329589845, 537.5889327148437, 784.9164473632812)
detected: (73.3706009765625, 636.1495926757813, 544.628593359375, 688.970585546875) 	 at (0.0, 0.0, 595.0, 841.0)  in depth 0
detected in p.2 :	 1  tables
6 2 6
./train_source/보건복지부_노인일자리 및 사회활동지원.pdf /content/src/
Processing 보건복지부_노인일자리 및 사회활동지원.pdf...
/content/src/train_source/보건복지부_노인일자리 및 사회활동지원.pdf /content/drive/MyDrive/kdt-EST-AI/project/dacon_fis/src/processed/./train_source/보건복지부_노인일자리 및 사회활동지원.pdf


  0%|          | 0/5 [00:00<?, ?it/s]

detected: (51.182799361165365, 529.1076004882813, 544.2250695963542, 626.3093916992187) 	 at (0.0, 0.0, 595.0, 841.0)  in depth 0
detected: (51.182799361165365, 400.24859169921876, 544.2250695963542, 479.66467734375) 	 at (0.0, 0.0, 595.0, 841.0)  in depth 0
detected: (51.182799361165365, 188.67962197265626, 544.2250695963542, 365.9534041503906) 	 at (0.0, 0.0, 595.0, 841.0)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)
  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.0 :	 3  tables
detected: (79.84759835205078, 273.187617578125, 544.8163903971354, 501.8854109863281) 	 at (0.0, 0.0, 595.0, 841.0)  in depth 0
detected in p.2 :	 1  tables
4 2 4
./train_source/중소벤처기업부_창업사업화지원.pdf /content/src/
Processing 중소벤처기업부_창업사업화지원.pdf...
/content/src/train_source/중소벤처기업부_창업사업화지원.pdf /content/drive/MyDrive/kdt-EST-AI/project/dacon_fis/src/processed/./train_source/중소벤처기업부_창업사업화지원.pdf


  0%|          | 0/2 [00:00<?, ?it/s]

detected: (51.54259931844076, 492.5476029296875, 544.5284021158855, 569.7304122070312) 	 at (0.0, 0.0, 595.0, 841.0)  in depth 0
detected: (51.54259931844076, 383.46660927734376, 544.5284021158855, 440.82333823242186) 	 at (0.0, 0.0, 595.0, 841.0)  in depth 0
detected: (51.54259931844076, 161.708613671875, 544.5284021158855, 344.25738974609374) 	 at (0.0, 0.0, 595.0, 841.0)  in depth 0
detected: (60.78220681152344, 65.04171303710937, 387.03577109375, 121.30766745605469) 	 at (0.0, 0.0, 595.0, 161.708613671875)  in depth 1


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)
  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.0 :	 4  tables
detected: (63.6423294921875, 484.3955948730469, 526.3464196940104, 542.3242232421875) 	 at (0.0, 0.0, 595.0, 841.0)  in depth 0
detected in p.1 :	 1  tables
5 2 5
./train_source/보건복지부_생계급여.pdf /content/src/
Processing 보건복지부_생계급여.pdf...
/content/src/train_source/보건복지부_생계급여.pdf /content/drive/MyDrive/kdt-EST-AI/project/dacon_fis/src/processed/./train_source/보건복지부_생계급여.pdf


  0%|          | 0/4 [00:00<?, ?it/s]

detected: (51.182799361165365, 492.4276078125, 544.2250695963542, 582.3173873046875) 	 at (0.0, 0.0, 595.0, 841.0)  in depth 0
detected: (51.182799361165365, 369.20159462890626, 544.2250695963542, 440.77801962890624) 	 at (0.0, 0.0, 595.0, 841.0)  in depth 0
detected: (51.182799361165365, 183.04558876953126, 544.2250695963542, 340.541386328125) 	 at (0.0, 0.0, 595.0, 841.0)  in depth 0
detected: (64.28278664550781, 91.2607148071289, 386.4445845703125, 161.31125327148436) 	 at (0.0, 0.0, 595.0, 183.04558876953126)  in depth 1


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)
  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)
  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.0 :	 4  tables
detected: (67.69804696044922, 99.7136185546875, 543.2658736328125, 650.7962081054687) 	 at (0.0, 0.0, 595.0, 841.0)  in depth 0
exception :  The identified boxes have significant overlap: 41.01% of area is overlapping (Max is 30.00%)
error :  (67.69804696044922, 99.7136185546875, 543.2658736328125, 650.7962081054687)
detected: (82.72559924723308, 292.36559609375, 544.8163903971354, 381.6564070800781) 	 at (0.0, 0.0, 595.0, 841.0)  in depth 0
detected: (52.01249627075195, 97.49961403808594, 529.571171484375, 225.50758811035155) 	 at (0.0, 0.0, 595.0, 292.36559609375)  in depth 1


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.3 :	 2  tables
6 2 6
./train_source/국토교통부_소규모주택정비사업.pdf /content/src/
Processing 국토교통부_소규모주택정비사업.pdf...
/content/src/train_source/국토교통부_소규모주택정비사업.pdf /content/drive/MyDrive/kdt-EST-AI/project/dacon_fis/src/processed/./train_source/국토교통부_소규모주택정비사업.pdf


  0%|          | 0/4 [00:00<?, ?it/s]

detected: (51.48259921671549, 188.08062294921876, 543.9239614691841, 259.749394140625) 	 at (0.0, 0.0, 595.0, 841.0)  in depth 0
detected: (51.48259921671549, 254.12762001953126, 543.9239614691841, 325.79742880859374) 	 at (0.0, 0.0, 595.0, 841.0)  in depth 0
detected: (51.48259921671549, 531.1445877929688, 543.9239614691841, 633.2614180664062) 	 at (0.0, 0.0, 595.0, 841.0)  in depth 0
detected: (50.011344232177734, 347.0266068359375, 543.9239614691841, 427.41623374023436) 	 at (0.0, 0.0, 595.0, 841.0)  in depth 0
detected: (51.48259921671549, 452.3905899902344, 543.9239614691841, 513.3579451660156) 	 at (0.0, 0.0, 595.0, 841.0)  in depth 0
detected: (76.80162362060547, 99.54028634033203, 420.48874350585936, 160.41835471191405) 	 at (0.0, 0.0, 595.0, 188.08062294921876)  in depth 1
detected: (51.54259931844076, 259.749394140625, 543.8574060221354, 325.7974333007812) 	 at (0.0, 259.749394140625, 595.0, 347.0266068359375)  in depth 1


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)
  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)
  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


exception :  The identified boxes have significant overlap: 30.10% of area is overlapping (Max is 30.00%)
error :  (50.011344232177734, 347.0266068359375, 543.9239614691841, 427.41623374023436)
detected in p.0 :	 5  tables


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected: (50.10360014241537, 232.9116166015625, 544.4568933268229, 760.3223921875) 	 at (0.0, 0.0, 595.0, 841.0)  in depth 0
detected: (50.10360014241537, 105.9696, 544.4568933268229, 181.11639853515624) 	 at (0.0, 0.0, 595.0, 841.0)  in depth 0
Filling in gap at top of table
detected in p.1 :	 2  tables


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected: (74.41010216674805, 466.05559853515626, 544.6699467122396, 517.3113448242187) 	 at (0.0, 0.0, 595.0, 841.0)  in depth 0
detected: (68.57359881998698, 598.8706009765625, 544.6699467122396, 771.427616796875) 	 at (0.0, 0.0, 595.0, 841.0)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.3 :	 2  tables
9 3 9
./train_source/국토교통부_민간임대(융자).pdf /content/src/
Processing 국토교통부_민간임대(융자).pdf...
/content/src/train_source/국토교통부_민간임대(융자).pdf /content/drive/MyDrive/kdt-EST-AI/project/dacon_fis/src/processed/./train_source/국토교통부_민간임대(융자).pdf


  0%|          | 0/3 [00:00<?, ?it/s]

detected: (51.49688489467076, 188.08062294921876, 544.377118528054, 259.749394140625) 	 at (0.0, 0.0, 595.0, 841.0)  in depth 0
detected: (51.49688489467076, 254.12762001953126, 544.377118528054, 325.79742880859374) 	 at (0.0, 0.0, 595.0, 841.0)  in depth 0
detected: (51.49688489467076, 327.24761513671876, 544.377118528054, 423.73041220703124) 	 at (0.0, 0.0, 595.0, 841.0)  in depth 0
detected: (51.49688489467076, 449.5146134277344, 544.377118528054, 498.88840170898436) 	 at (0.0, 0.0, 595.0, 841.0)  in depth 0
detected: (51.49688489467076, 527.5485794921875, 544.377118528054, 652.3203780273437) 	 at (0.0, 0.0, 595.0, 841.0)  in depth 0
detected: (71.33357552490234, 97.72579506835937, 406.6133162597656, 164.6401412109375) 	 at (0.0, 0.0, 595.0, 188.08062294921876)  in depth 1
detected: (51.54259931844076, 259.749394140625, 544.3370609700521, 325.7974333007812) 	 at (0.0, 259.749394140625, 595.0, 327.24761513671876)  in depth 1


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)
  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)
  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)
  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)
  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.0 :	 6  tables
detected: (50.10360014241537, 254.00762490234376, 544.4568933268229, 448.78339072265624) 	 at (0.0, 0.0, 595.0, 841.0)  in depth 0
detected: (50.10360014241537, 113.4016068359375, 544.4568933268229, 194.30139609375) 	 at (0.0, 0.0, 595.0, 841.0)  in depth 0
Filling in gap at top of table
Filling in gap at top of table
detected in p.1 :	 2  tables
detected: (79.96760109863281, 640.9445755859375, 544.516728125, 765.7163741210937) 	 at (0.0, 0.0, 595.0, 841.0)  in depth 0
detected: (73.3706009765625, 489.8299515625, 544.516728125, 539.5759932617187) 	 at (0.0, 0.0, 595.0, 841.0)  in depth 0
exception :  The identified boxes have significant overlap: 34.89% of area is overlapping (Max is 30.00%)
error :  (79.96760109863281, 640.9445755859375, 544.516728125, 765.7163741210937)
detected in p.2 :	 1  tables
9 3 9


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


./train_source/고용노동부_조기재취업수당.pdf /content/src/
Processing 고용노동부_조기재취업수당.pdf...
/content/src/train_source/고용노동부_조기재취업수당.pdf /content/drive/MyDrive/kdt-EST-AI/project/dacon_fis/src/processed/./train_source/고용노동부_조기재취업수당.pdf


  0%|          | 0/3 [00:00<?, ?it/s]

detected: (51.182799361165365, 163.147578515625, 544.3211267252605, 234.8173873046875) 	 at (0.0, 0.0, 595.0, 841.0)  in depth 0
detected: (51.182799361165365, 229.19561318359376, 544.3211267252605, 300.864384375) 	 at (0.0, 0.0, 595.0, 841.0)  in depth 0
detected: (51.182799361165365, 326.64858559570314, 544.3211267252605, 374.1044051269531) 	 at (0.0, 0.0, 595.0, 841.0)  in depth 0
detected: (51.182799361165365, 425.7806046386719, 544.3211267252605, 506.67938681640624) 	 at (0.0, 0.0, 595.0, 841.0)  in depth 0
detected: (67.34893349609375, 73.73018959960937, 405.9071395019531, 147.03888388671874) 	 at (0.0, 0.0, 595.0, 163.147578515625)  in depth 1
detected: (51.54259931844076, 234.8173873046875, 544.3370609700521, 300.8643888671875) 	 at (0.0, 234.8173873046875, 595.0, 326.64858559570314)  in depth 1


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)
  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


exception :  The identified boxes have significant overlap: 33.04% of area is overlapping (Max is 30.00%)
error :  (51.182799361165365, 326.64858559570314, 544.3211267252605, 374.1044051269531)
Filling in gap at top of table
detected in p.0 :	 4  tables
detected: (73.3706009765625, 715.3836014648438, 544.2170658528646, 768.1134078125) 	 at (0.0, 0.0, 595.0, 841.0)  in depth 0
detected: (46.93140725097656, 523.421016015625, 525.2799727539062, 661.965702734375) 	 at (0.0, 0.0, 595.0, 715.3836014648438)  in depth 1


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.1 :	 2  tables
detected: (71.01259926757812, 214.69060830078126, 392.41840048828124, 267.541386328125) 	 at (0.0, 0.0, 595.0, 841.0)  in depth 0
detected: (71.01259926757812, 266.2346146484375, 392.41840048828124, 319.08539267578124) 	 at (0.0, 0.0, 595.0, 841.0)  in depth 0
detected: (397.0156888961792, 289.0297570983887, 519.1040357788086, 314.70911407470703) 	 at (392.41840048828124, 266.2346146484375, 595.0, 319.08539267578124)  in depth 1
detected: (410.71862411499023, 267.7572551528931, 504.0382703979492, 309.0699005126953) 	 at (392.41840048828124, 266.2346146484375, 595.0, 319.08539267578124)  in depth 1
detected: (59.97760323486328, 388.980586328125, 392.41840048828124, 456.57440634765624) 	 at (0.0, 0.0, 595.0, 841.0)  in depth 0
detected: (407.071008682251, 394.6643586914063, 499.99138776855466, 441.9763298034668) 	 at (392.41840048828124, 388.980586328125, 595.0, 456.57440634765624)  in depth 1
detected: (59.97760323486328, 452.3905899902344, 392.4184004882812

  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)
  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)
  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)
  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.2 :	 7  tables
13 3 13
./train_source/2024년도 성과계획서(총괄편).pdf /content/src/
Processing 2024년도 성과계획서(총괄편).pdf...
/content/src/train_source/2024년도 성과계획서(총괄편).pdf /content/drive/MyDrive/kdt-EST-AI/project/dacon_fis/src/processed/./train_source/2024년도 성과계획서(총괄편).pdf


  0%|          | 0/345 [00:00<?, ?it/s]

detected: (83.05689656982422, 160.45252645263673, 445.2271915649414, 666.0303531860352) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.4 :	 1  tables
detected: (81.77570760498047, 198.98861348876954, 435.6418644165039, 656.1751285766602) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
exception :  The identified boxes have significant overlap: 56.38% of area is overlapping (Max is 30.00%)
error :  (81.77570760498047, 198.98861348876954, 435.6418644165039, 656.1751285766602)
detected: (87.92208516845703, 196.04656636962892, 441.75593912353514, 671.4420963500977) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.6 :	 1  tables
detected: (87.23733175048828, 189.10343587646486, 443.04512369384764, 656.0701481079102) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.7 :	 1  tables
detected: (70.82962453613281, 289.630810144043, 473.1908451293945, 668.0730777954102) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.9 :	 1  tables
detected: (108.38285482177734, 58.45486104736328, 473.6386905883789, 675.1153751586914) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.10 :	 1  tables
detected: (99.55118978271484, 58.44947469482422, 472.69334757080077, 673.2177311157227) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.11 :	 1  tables
detected: (101.26425588378906, 58.23222005615234, 472.49397623291014, 669.6285587524414) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.12 :	 1  tables
detected: (64.9965175415039, 59.009113671875, 477.64802896728514, 661.385760900879) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.13 :	 1  tables
detected: (100.73766744384766, 60.30257833251953, 475.6848331665039, 288.75938760986327) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.14 :	 1  tables
detected: (62.27412641296387, 445.0425838256836, 477.3642536376953, 642.3543277954102) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.17 :	 1  tables
detected: (62.27412641296387, 428.91459310302736, 478.7754627441406,

  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.18 :	 1  tables
detected: (60.05652272949219, 75.30814779052734, 476.55784189453124, 271.0070987915039) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.19 :	 1  tables
detected: (61.42292609727647, 261.3849910522461, 478.5033085083008, 373.0167423461914) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (61.42292609727647, 422.4633601928711, 478.5033085083008, 679.0454899047852) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
Filling in gap at top of table
detected in p.20 :	 2  tables
detected: (61.13172567138672, 55.55137288818359, 478.4226769978841, 674.0055118774414) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
Filling in gap at top of table
detected in p.21 :	 1  tables
detected: (61.13172567138672, 55.55137288818359, 478.4226769978841, 483.49351846923827) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.22 :	 1  tables
detected: (61.06452596435547, 270.25538289794923, 477.28026163330077, 398.82152139892577) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (61.06452596435547, 454.71937215576173, 481.2450646931966, 676.021503088379) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
Filling in gap at top of table
detected in p.23 :	 2  tables
detected: (61.03092611083984, 54.70579183349609, 481.2450646931966, 660.9015079711914) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.24 :	 1  tables
detected: (61.03092611083984, 54.825767877197265, 481.2450646931966, 649.6118961547852) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
Filling in gap at top of table
detected in p.25 :	 1  tables


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected: (61.03092611083984, 90.67472493896484, 478.74522135009767, 681.2631412719727) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.26 :	 1  tables
detected: (61.03092611083984, 54.99087941894531, 478.74522135009767, 612.315936682129) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
Filling in gap at top of table
detected in p.27 :	 1  tables


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected: (62.15892609776088, 198.85165059814454, 478.2613956665039, 303.0615421508789) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.28 :	 1  tables
detected: (61.03092611083984, 86.21071279296875, 478.4226769978841, 681.2631412719727) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.29 :	 1  tables
detected: (61.03092611083984, 54.70172155151367, 478.4226769978841, 681.2631412719727) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.30 :	 1  tables
detected: (61.03092611083984, 54.69263875732422, 478.4226769978841, 579.455096838379) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
Filling in gap at top of table
detected in p.31 :	 1  tables


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected: (62.33827280162464, 208.76737630615236, 475.02425838971817, 490.54951822509764) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.32 :	 1  tables
detected: (63.375725425066264, 157.15775716552736, 477.8985050415039, 258.5079532836914) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (63.375725425066264, 301.503368737793, 477.8985050415039, 475.02632486572264) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.33 :	 2  tables
detected: (62.330127228461365, 110.18497884521484, 478.01946180148656, 233.30794107666014) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.34 :	 1  tables
detected: (62.31732704206195, 148.0857967163086, 475.02425838971817, 430.0695072387695) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.35 :	 1  tables
detected: (63.375725425066264, 157.15775716552736, 477.8985050415039, 263.34630167236327) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (63.375725425066264, 306.3417781616211, 477.8985050415039, 454.86632120361327) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.36 :	 2  tables
detected: (62.330127228461365, 110.18497884521484, 478.01946180148656, 239.55754434814452) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.37 :	 1  tables
detected: (62.31732704206195, 148.0857967163086, 475.02425838971817, 430.0695072387695) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.38 :	 1  tables
detected: (63.375725425066264, 157.15775716552736, 477.8985050415039, 249.0327335571289) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (63.375725425066264, 292.0281795288086, 477.8985050415039, 673.602313635254) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.39 :	 2  tables
detected: (62.330127228461365, 110.18497884521484, 478.01946180148656, 611.9127384399414) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.40 :	 1  tables
detected: (62.31732704206195, 148.0857967163086, 475.02425838971817, 430.0695072387695) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.41 :	 1  tables
detected: (63.375725425066264, 306.3417781616211, 477.8985050415039, 535.7079044555664) 	 at 

  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected: (62.31732704206195, 148.0857967163086, 475.02425838971817, 430.0695072387695) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.47 :	 1  tables
detected: (63.375725425066264, 306.3417781616211, 477.8985050415039, 504.6615177368164) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (63.375725425066264, 157.15775716552736, 477.8985050415039, 263.34630167236327) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.48 :	 2  tables
detected: (62.330127228461365, 110.18497884521484, 478.01946180148656, 340.559131262207) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.49 :	 1  tables
detected: (62.31732704206195, 148.0857967163086, 475.02425838971817, 430.0695072387695) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.50 :	 1  tables
detected: (63.375725425066264, 157.15775716552736, 477.8985050415039, 258.5079532836914) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (63.375725425066264, 301.503368737793, 477.8985050415039, 537.119159338379) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.51 :	 2  tables
detected: (62.330127228461365, 110.18497884521484, 478.01946180148656, 330.0759464477539) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.52 :	 1  tables
detected: (62.31732704206195, 148.0857967163086, 475.02425838971817, 430.0695072387695) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.53 :	 1  tables
detected: (63.375725425066264, 306.3417781616211, 477.8985050415039, 482.888721105957) 	 at (0

  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.55 :	 1  tables
detected: (62.31732704206195, 148.0857967163086, 475.02425838971817, 430.0695072387695) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (66.8439754272461, 92.60607564697266, 463.487598059082, 136.37854349365233) 	 at (0.0, 0.0, 538.5830078125, 148.0857967163086)  in depth 1
detected in p.56 :	 2  tables
detected: (63.375725425066264, 306.3417781616211, 477.8985050415039, 479.86470377197264) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (63.375725425066264, 157.15775716552736, 477.8985050415039, 263.34630167236327) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.57 :	 2  tables
detected: (62.330127228461365, 110.18497884521484, 478.01946180148656, 304.87593424072264) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.58 :	 1  tables
detected: (62.31732704206195, 148.0857967163086, 475.02425838971817, 430.0695072387695) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.59 :	 1  tables
detected: (63.375725425066264, 301.503368737793, 477.8985050415039, 677.0294986938477) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (63.375725425066264, 157.15775716552736, 477.8985050415039, 258.5079532836914) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.60 :	 2  tables
detected: (62.66052409261067, 56.76096761474609, 477.5658680175781, 671.3847233032227) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.61 :	 1  tables
detected: (62.66052409261067, 56.76096761474609, 477.5658680175781, 180.48872720947264) 	 at 

  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.62 :	 1  tables
detected: (62.32452714691162, 110.18497884521484, 478.0698620056152, 678.2390934204102) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.63 :	 1  tables
detected: (62.29428670654296, 57.97056234130859, 477.7372257446289, 678.2390934204102) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
Filling in gap at top of table
detected in p.64 :	 1  tables
detected: (62.307726902262374, 57.97056234130859, 477.7170648152669, 167.9895206665039) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.65 :	 1  tables
detected: (62.31732704206195, 148.0857967163086, 475.02425838971817, 430.0695072387695) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.66 :	 1  tables
detected: (63.375725425066264, 301.503368737793, 477.8985050415039, 668.5623356079102) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (63.375725425066264, 157.15775716552736, 477.8985050415039, 258.5079532836914) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.67 :	 2  tables
detected: (62.66052409261067, 56.76096761474609, 477.5658680175781, 679.247089025879) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.68 :	 1  tables
detected: (62.66052409261067, 56.76096761474609, 477.5658680175781, 472.60713541259764) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.69 :	 1  tables
detected: (62.32452714691162, 110.18497884521484, 478.0698620056152, 678.2390934204102) 	 at (0.0

  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.74 :	 1  tables
detected: (62.31732704206195, 148.0857967163086, 475.02425838971817, 430.0695072387695) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.75 :	 1  tables
detected: (63.375725425066264, 157.15775716552736, 477.8985050415039, 258.5079532836914) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (63.375725425066264, 301.503368737793, 477.8985050415039, 667.5543400024414) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.76 :	 2  tables
detected: (62.66052409261067, 56.76096761474609, 477.5658680175781, 676.827899572754) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.77 :	 1  tables
detected: (62.66052409261067, 56.76096761474609, 477.5658680175781, 677.4326969360352) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.78 :	 1  tables
detected: (62.66052409261067, 56.76096761474609, 477.5658680175781, 673.8039127563477) 	 at (0.0, 

  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.80 :	 1  tables
detected: (62.32452714691162, 110.18497884521484, 478.0698620056152, 671.3847233032227) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
Filling in gap at top of table
detected in p.81 :	 1  tables
detected: (62.29428670654296, 57.97056234130859, 477.7372257446289, 678.2390934204102) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
Filling in gap at top of table
detected in p.82 :	 1  tables
detected: (62.29428670654296, 57.97056234130859, 477.7372257446289, 676.627948400879) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
Filling in gap at top of table
detected in p.83 :	 1  tables
detected: (62.29428670654296, 57.97056234130859, 477.7372257446289, 681.1510196899414) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
Filling in gap at top of table
detected in p.84 :	 1  tables
detected: (62.29428670654296, 57.97056234130859, 477.7372257446289, 677.8358951782227) 	 at (0.0, 0.0, 538.5830078125, 737.007

  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.89 :	 1  tables
detected: (62.31732704206195, 148.0857967163086, 475.02425838971817, 430.0695072387695) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.90 :	 1  tables
detected: (63.375725425066264, 301.503368737793, 477.8985050415039, 679.247089025879) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (63.375725425066264, 157.15775716552736, 477.8985050415039, 258.5079532836914) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.91 :	 2  tables
detected: (62.66052409261067, 56.76096761474609, 477.5658680175781, 678.2390934204102) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.92 :	 1  tables
detected: (62.66052409261067, 56.76096761474609, 477.5658680175781, 262.33833658447264) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.93 :	 1  tables
detected: (62.32452714691162, 110.18497884521484, 478.0698620056152, 678.2390934204102) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.94 :	 1  tables
detected: (62.307726902262374, 57.97056234130859, 477.7170648152669, 616.1463199829102) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.95 :	 1  tables
detected: (62.31732704206195, 148.0857967163086, 475.02425838971817, 430.0695072387695) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.96 :	 1  tables
detected: (63.375725425066264, 296.664989831543, 477.8985050415039, 677.4326969360352) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (63.375725425066264, 157.15775716552736, 477.8985050415039, 253.66954385986327) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.97 :	 2  tables
detected: (62.66052409261067, 56.76096761474609, 477.5658680175781, 282.0951114868164) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.98 :	 1  tables
detected: (62.32452714691162, 110.18497884521484, 478.0698620056152, 672.1911197875977) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.99 :	 1  tables
detected: (62.307726902262374, 57.97056234130859, 477.7170648152669, 162.94954263916014) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.100 :	 1  tables
detected: (62.31732704206195, 148.0857967163086, 475.02425838971817, 430.0695072387695) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.101 :	 1  tables
detected: (63.375725425066264, 157.15775716552736, 477.8985050415039, 258.5079532836914) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (63.375725425066264, 301.503368737793, 477.8985050415039, 679.247089025879) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.102 :	 2  tables
detected: (62.66052409261067, 56.76096761474609, 477.5658680175781, 354.67113077392577) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.103 :	 1  tables
detected: (62.32452714691162, 110.18497884521484, 478.0698620056152, 678.2390934204102) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.104 :	 1  tables
detected: (62.307726902262374, 57.97056234130859, 477.7170648152669, 426.23912393798827) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.105 :	 1  tables
detected: (62.31732704206195, 148.0857967163086, 475.02425838971817, 430.0695072387695) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.106 :	 1  tables
detected: (63.375725425066264, 301.503368737793, 477.8985050415039, 667.5543400024414) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (63.375725425066264, 157.15775716552736, 477.8985050415039, 258.5079532836914) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.107 :	 2  tables
detected: (62.66052409261067, 56.76096761474609, 477.5658680175781, 525.2247501586914) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.108 :	 1  tables
detected: (62.32452714691162, 110.18497884521484, 478.0698620056152, 678.2390934204102) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.109 :	 1  tables
detected: (62.29428670654296, 57.97056234130859, 477.7372257446289, 678.2390934204102) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
Filling in gap at top of table
detected in p.110 :	 1  tables
detected: (62.307726902262374, 57.97056234130859, 477.7170648152669, 472.0023380493164) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.111 :	 1  tables
detected: (62.31732704206195, 148.0857967163086, 475.02425838971817, 430.0695072387695) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.112 :	 1  tables
detected: (63.375725425066264, 301.503368737793, 477.8985050415039, 669.570331213379) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (63.375725425066264, 157.15775716552736, 477.8985050

  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected: (62.29428670654296, 57.97056234130859, 477.7372257446289, 678.2390934204102) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
Filling in gap at top of table
detected in p.119 :	 1  tables
detected: (62.307726902262374, 57.97056234130859, 477.7170648152669, 223.02632486572264) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.120 :	 1  tables
detected: (62.31732704206195, 148.0857967163086, 475.02425838971817, 430.0695072387695) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.121 :	 1  tables
detected: (63.375725425066264, 301.503368737793, 477.8985050415039, 678.2390934204102) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (63.375725425066264, 157.15775716552736, 477.8985050415039, 258.5079532836914) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.122 :	 2  tables
detected: (62.66052409261067, 56.76096761474609, 477.5658680175781, 410.5143009399414) 	 a

  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.127 :	 2  tables
detected: (62.66052409261067, 56.76096761474609, 477.5658680175781, 674.408710119629) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.128 :	 1  tables
detected: (62.66052409261067, 56.76096761474609, 477.5658680175781, 673.1991153930664) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.129 :	 1  tables
detected: (62.66052409261067, 56.76096761474609, 477.5658680175781, 521.5959049438477) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.130 :	 1  tables
detected: (62.32452714691162, 110.18497884521484, 478.0698620056152, 677.2310978149414) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.131 :	 1  tables
detected: (62.29428670654296, 57.97056234130859, 477.7372257446289, 675.215106604004) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
Filling in gap at top of table
detected in p.132 :	 1  tables
detected: (62.29428670654296, 57.97056234130859, 477.7372257446289, 678.2390934204102) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
Filling in gap at top of table
detected in p.133 :	 1  tables
detected: (62.29428670654296, 57.97056234130859, 477.7372257446289, 677.0294986938477) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
Filling in gap at top of table
detected in p.134 :	 1  tables


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected: (62.29428670654296, 57.97056234130859, 477.7372257446289, 678.2390934204102) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
Filling in gap at top of table
detected in p.135 :	 1  tables
detected: (62.307726902262374, 57.97056234130859, 477.7170648152669, 240.96873819580077) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.136 :	 1  tables
detected: (62.31732704206195, 148.0857967163086, 475.02425838971817, 430.0695072387695) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.137 :	 1  tables
detected: (63.375725425066264, 157.15775716552736, 477.8985050415039, 258.5079532836914) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (63.375725425066264, 301.503368737793, 477.8985050415039, 676.4247013305664) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.138 :	 2  tables
detected: (62.66052409261067, 56.76096761474609, 477.5658680175781, 679.0454899047852) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.139 :	 1  tables
detected: (62.66052409261067, 56.76096761474609, 477.5658680175781, 327.455127355957) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.140 :	 1  tables
detected: (62.32452714691162, 110.18497884521484, 478.0698620056152, 678.1339908813477) 	 at 

  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.146 :	 1  tables
detected: (62.31732704206195, 148.0857967163086, 475.02425838971817, 430.0695072387695) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.147 :	 1  tables
detected: (63.375725425066264, 301.503368737793, 477.8985050415039, 679.247089025879) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (63.375725425066264, 157.15775716552736, 477.8985050415039, 258.5079532836914) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.148 :	 2  tables
detected: (62.66052409261067, 56.76096761474609, 477.5658680175781, 667.151141760254) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.149 :	 1  tables
detected: (62.66052409261067, 56.76096761474609, 477.5658680175781, 669.3687320922852) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.150 :	 1  tables
detected: (62.66052409261067, 56.76096761474609, 477.5658680175781, 233.30794107666014) 	 at (

  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.151 :	 1  tables
detected: (62.32452714691162, 110.18497884521484, 478.0698620056152, 678.2390934204102) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.152 :	 1  tables
detected: (62.29428670654296, 57.97056234130859, 477.7372257446289, 678.2390934204102) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
Filling in gap at top of table
detected in p.153 :	 1  tables
detected: (62.29428670654296, 57.97056234130859, 477.7372257446289, 670.1751285766602) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
Filling in gap at top of table
detected in p.154 :	 1  tables
detected: (62.29428670654296, 57.97056234130859, 477.7372257446289, 674.408710119629) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
Filling in gap at top of table
detected in p.155 :	 1  tables
detected: (62.29428670654296, 57.97056234130859, 477.7372257446289, 677.634296057129) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
Fil

  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected: (62.29428670654296, 57.97056234130859, 477.7372257446289, 678.2390934204102) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
Filling in gap at top of table
detected in p.166 :	 1  tables
detected: (62.29428670654296, 57.97056234130859, 477.7372257446289, 678.2390934204102) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
Filling in gap at top of table
detected in p.167 :	 1  tables
detected: (62.29428670654296, 57.97056234130859, 477.7372257446289, 678.2390934204102) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
Filling in gap at top of table
detected in p.168 :	 1  tables
detected: (62.307726902262374, 57.97056234130859, 477.7170648152669, 469.3815189575195) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.169 :	 1  tables
detected: (62.31732704206195, 148.0857967163086, 475.02425838971817, 430.0695072387695) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.170 :	 1  tables


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected: (62.29428670654296, 57.97056234130859, 477.7372257446289, 678.2390934204102) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
Filling in gap at top of table
detected in p.178 :	 1  tables
detected: (62.307726902262374, 57.97056234130859, 477.7170648152669, 441.963916418457) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.179 :	 1  tables
detected: (62.31732704206195, 148.0857967163086, 475.02425838971817, 430.0695072387695) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.180 :	 1  tables
detected: (63.375725425066264, 301.503368737793, 477.8985050415039, 677.4326969360352) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (63.375725425066264, 157.15775716552736, 477.8985050415039, 258.5079532836914) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.181 :	 2  tables
detected: (62.66052409261067, 56.76096761474609, 477.5658680175781, 350.63911783447264) 	 at

  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.187 :	 1  tables
detected: (62.31732704206195, 148.0857967163086, 475.02425838971817, 430.0695072387695) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.188 :	 1  tables
detected: (63.375725425066264, 301.503368737793, 477.8985050415039, 673.1991153930664) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (63.375725425066264, 157.15775716552736, 477.8985050415039, 258.5079532836914) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.189 :	 2  tables
detected: (62.66052409261067, 56.76096761474609, 477.5658680175781, 208.5111271118164) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.190 :	 1  tables
detected: (62.32452714691162, 110.18497884521484, 478.0698620056152, 678.2390934204102) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.191 :	 1  tables
detected: (62.307726902262374, 57.97056234130859, 477.7170648152669, 432.6903263305664) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.192 :	 1  tables
detected: (62.31732704206195, 148.0857967163086, 475.02425838971817, 430.0695072387695) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.193 :	 1  tables
detected: (63.375725425066264, 301.503368737793, 477.8985050415039, 671.5863224243164) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (63.375725425066264, 157.15775716552736, 477.8985050415039, 258.5079532836914) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.194 :	 2  tables
detected: (62.66052409261067, 56.76096761474609, 477.5658680175781, 674.408710119629) 	 at

  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected: (62.307726902262374, 57.97056234130859, 477.7170648152669, 379.871142980957) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.214 :	 1  tables
detected: (62.31732704206195, 148.0857967163086, 475.02425838971817, 430.0695072387695) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.215 :	 1  tables
detected: (63.375725425066264, 301.503368737793, 477.8985050415039, 679.247089025879) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (63.375725425066264, 157.15775716552736, 477.8985050415039, 258.5079532836914) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.216 :	 2  tables
detected: (62.66052409261067, 56.76096761474609, 477.5658680175781, 573.003924963379) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.217 :	 1  tables
detected: (62.32452714691162, 110.18497884521484, 478.0698620056152, 678.2390934204102) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.218 :	 1  tables
detected: (62.29428670654296, 57.97056234130859, 477.7372257446289, 678.2390934204102) 	 at (

  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.221 :	 1  tables
detected: (62.31732704206195, 148.0857967163086, 475.02425838971817, 430.0695072387695) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.222 :	 1  tables
detected: (63.375725425066264, 301.503368737793, 477.8985050415039, 679.247089025879) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (63.375725425066264, 157.15775716552736, 477.8985050415039, 258.5079532836914) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.223 :	 2  tables
detected: (62.66052409261067, 56.76096761474609, 477.5658680175781, 319.189502355957) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.224 :	 1  tables
detected: (62.32452714691162, 110.18497884521484, 478.0698620056152, 678.2390934204102) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.225 :	 1  tables
detected: (62.307726902262374, 57.97056234130859, 477.7170648152669, 224.0343204711914) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.226 :	 1  tables
detected: (62.31732704206195, 148.0857967163086, 475.02425838971817, 430.0695072387695) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.227 :	 1  tables
detected: (63.375725425066264, 306.3417781616211, 477.8985050415039, 485.9127384399414) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (63.375725425066264, 157.15775716552736, 477.8985050415039, 263.34630167236327) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.228 :	 2  tables
detected: (62.330127228461365, 110.18497884521484, 478.01946180148656, 299.634326574707) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.229 :	 1  tables
detected: (62.31732704206195, 148.0857967163086, 475.02425838971817, 430.0695072387695) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.230 :	 1  tables
detected: (63.375725425066264, 306.3417781616211, 477.8985050415039, 668.5623356079102) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (63.375725425066264, 157.15775716552736, 477.8985050415039, 263.34630167236327) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.231 :	 2  tables
detected: (62.66052409261067, 56.76096761474609, 477.5658680175781, 528.4503360961914) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.232 :	 1  tables
detected: (62.32452714691162, 110.18497884521484, 478.0698620056152, 679.7220646118164) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.233 :	 1  tables
detected: (62.307726902262374, 57.97056234130859, 477.7170648152669, 311.32713663330077) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.234 :	 1  tables
detected: (62.31732704206195, 148.0857967163086, 475.02425838971817, 430.0695072387695) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.235 :	 1  tables
detected: (63.375725425066264, 157.15775716552736, 477.8985050415039, 263.34630167236327) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (63.375725425066264, 306.3417781616211, 477.8985050415039, 458.0919376586914) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.236 :	 2  tables
detected: (62.330127228461365, 110.18497884521484, 478.01946180148656, 275.2407413696289) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.237 :	 1  tables
detected: (62.31732704206195, 148.0857967163086, 475.02425838971817, 430.0695072387695) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.238 :	 1  tables
detected: (63.375725425066264, 157.15775716552736, 477.8985050415039, 258.5079532836914) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (63.375725425066264, 301.503368737793, 477.8985050415039, 482.28389322509764) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.239 :	 2  tables
detected: (62.330127228461365, 110.18497884521484, 478.01946180148656, 329.067920324707) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.240 :	 1  tables
detected: (62.31732704206195, 148.0857967163086, 475.02425838971817, 430.0695072387695

  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)
  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.245 :	 2  tables
detected: (62.32452714691162, 110.18497884521484, 478.0698620056152, 677.634296057129) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.246 :	 1  tables
detected: (62.307726902262374, 57.97056234130859, 477.7170648152669, 233.71113931884764) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.247 :	 1  tables
detected: (62.31732704206195, 148.0857967163086, 475.02425838971817, 430.0695072387695) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.248 :	 1  tables
detected: (63.375725425066264, 157.15775716552736, 477.8985050415039, 263.34630167236327) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (63.375725425066264, 306.3417781616211, 477.8985050415039, 653.8455387329102) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.249 :	 2  tables
detected: (62.330127228461365, 110.18497884521484, 478.01946180148656, 526.0311466430664) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
Filling in gap at top of table
detected in p.250 :	 1  tables
detected: (62.31732704206195, 148.0857967163086, 475.02425838971817, 430.0695072387695) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.251 :	 1  tables
detected: (63.375725425066264, 301.503368737793, 477.8985050415039, 679.247089025879) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (63.375725425066264, 157.15775716552736, 477.8985050415039, 258.5079532836914) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.252 :	 2  tables
detected: (62.66052409261067, 56.76096761474609, 477.5658680175781, 541.957538244629) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.253 :	 1  tables
detected: (62.32452714691162, 110.18497884521484, 478.069862

  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.266 :	 2  tables
detected: (62.66052409261067, 56.76096761474609, 477.5658680175781, 304.87593424072264) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.267 :	 1  tables
detected: (62.32452714691162, 110.18497884521484, 478.0698620056152, 673.4007145141602) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
Filling in gap at top of table
detected in p.268 :	 1  tables
detected: (62.307726902262374, 57.97056234130859, 477.7170648152669, 187.14155924072264) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.269 :	 1  tables
detected: (62.31732704206195, 148.0857967163086, 475.02425838971817, 430.0695072387695) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.270 :	 1  tables
detected: (63.375725425066264, 301.503368737793, 477.8985050415039, 679.247089025879) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (63.375725425066264, 157.15775716552736, 477.8985050415039, 258.5079532836914) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.271 :	 2  tables
detected: (62.66052409261067, 56.76096761474609, 477.565868

  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.279 :	 1  tables
detected: (62.330127228461365, 110.18497884521484, 478.01946180148656, 604.050311682129) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.280 :	 1  tables
detected: (62.31732704206195, 148.0857967163086, 475.02425838971817, 430.0695072387695) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.281 :	 1  tables
detected: (63.375725425066264, 301.503368737793, 477.8985050415039, 526.232684729004) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (63.375725425066264, 157.15775716552736, 477.8985050415039, 258.5079532836914) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.282 :	 2  tables
detected: (62.330127228461365, 110.18497884521484, 478.01946180148656, 401.64390909423827) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.283 :	 1  tables
detected: (62.31732704206195, 148.0857967163086, 475.02425838971817, 430.0695072387695)

  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.289 :	 1  tables
detected: (62.330127228461365, 110.18497884521484, 478.01946180148656, 615.3399234985352) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.290 :	 1  tables
detected: (62.31732704206195, 148.0857967163086, 475.02425838971817, 430.0695072387695) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.291 :	 1  tables
detected: (63.375725425066264, 306.3417781616211, 477.8985050415039, 679.247089025879) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (63.375725425066264, 157.15775716552736, 477.8985050415039, 263.34630167236327) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.292 :	 2  tables
detected: (62.66052409261067, 56.76096761474609, 477.5658680175781, 496.799121496582) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.293 :	 1  tables
detected: (62.32452714691162, 110.18497884521484, 478.0698620056152, 678.2390934204102) 	

  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.295 :	 1  tables
detected: (62.31732704206195, 148.0857967163086, 475.02425838971817, 430.0695072387695) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.296 :	 1  tables
detected: (63.375725425066264, 157.15775716552736, 477.8985050415039, 263.34630167236327) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (63.375725425066264, 306.3417781616211, 477.8985050415039, 454.86632120361327) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.297 :	 2  tables
detected: (62.330127228461365, 110.18497884521484, 478.01946180148656, 424.827930090332) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
Filling in gap at top of table
detected in p.298 :	 1  tables
detected: (62.31732704206195, 148.0857967163086, 475.02425838971817, 430.0695072387695) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.299 :	 1  tables
detected: (63.375725425066264, 301.503368737793, 477.8985050415039, 679.247089025879) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (63.375725425066264, 157.15775716552736, 477.8985050415039, 258.5079532836914) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
exception :  The identified boxes have significant overlap: 34.39% of area is overlapping (Max is 30.00%)
error :  (63.375725425066264, 301.503368737793, 477.8985050415039, 679.247089025879)
detected in p.300 :	 1  tables
detected: (62.66052409261067, 56.76096761474609, 4

  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.301 :	 1  tables
detected: (62.32452714691162, 110.18497884521484, 478.0698620056152, 678.2390934204102) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.302 :	 1  tables
detected: (62.307726902262374, 57.97056234130859, 477.7170648152669, 325.035937902832) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.303 :	 1  tables
detected: (62.31732704206195, 148.0857967163086, 475.02425838971817, 430.0695072387695) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.304 :	 1  tables
detected: (63.375725425066264, 301.503368737793, 477.8985050415039, 679.247089025879) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (63.375725425066264, 157.15775716552736, 477.8985050415039, 258.5079532836914) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.305 :	 2  tables
detected: (62.66052409261067, 56.76096761474609, 477.5658680175781, 230.28392374267577) 	 at

  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.306 :	 1  tables
detected: (62.32452714691162, 110.18497884521484, 478.0698620056152, 678.2390934204102) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.307 :	 1  tables
detected: (62.307726902262374, 57.97056234130859, 477.7170648152669, 347.41353189697264) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.308 :	 1  tables
detected: (62.31732704206195, 148.0857967163086, 475.02425838971817, 430.0695072387695) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.309 :	 1  tables
detected: (63.375725425066264, 157.15775716552736, 477.8985050415039, 258.5079532836914) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (63.375725425066264, 301.503368737793, 477.8985050415039, 674.0055118774414) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.310 :	 2  tables
detected: (62.66052409261067, 56.76096761474609, 477.5658680175781, 345.1959415649414) 	 

  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.316 :	 2  tables
detected: (62.330127228461365, 110.18497884521484, 478.01946180148656, 600.4215275024414) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
Filling in gap at top of table
detected in p.317 :	 1  tables
detected: (62.31732704206195, 148.0857967163086, 475.02425838971817, 430.0695072387695) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.318 :	 1  tables
detected: (63.375725425066264, 301.503368737793, 477.8985050415039, 677.0294986938477) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (63.375725425066264, 157.15775716552736, 477.8985050415039, 258.5079532836914) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.319 :	 2  tables
detected: (62.66052409261067, 56.76096761474609, 477.5658680175781, 379.4679142211914) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.320 :	 1  tables
detected: (62.32452714691162, 110.18497884521484, 478.0698

  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.325 :	 2  tables
detected: (62.66052409261067, 56.76096761474609, 477.5658680175781, 287.13512003173827) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.326 :	 1  tables
detected: (62.32452714691162, 110.18497884521484, 478.0698620056152, 675.4167057250977) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.327 :	 1  tables
detected: (62.307726902262374, 57.97056234130859, 477.7170648152669, 180.89192545166014) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.328 :	 1  tables
detected: (62.31732704206195, 148.0857967163086, 475.02425838971817, 430.0695072387695) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.329 :	 1  tables
detected: (63.375725425066264, 157.15775716552736, 477.8985050415039, 258.5079532836914) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (63.375725425066264, 301.503368737793, 477.8985050415039, 503.0487247680664) 	

  0%|          | 0/9 [00:00<?, ?it/s]

detected: (0.0, 0.0, 595.2760009765625, 841.8900146484375) 	 at (0.0, 0.0, 595.2760009765625, 841.8900146484375)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.0 :	 1  tables
detected: (640.4362094482422, 79.11028171386718, 1141.6215298095703, 111.94580959472657) 	 at (0.0, 0.0, 1190.550048828125, 841.8900146484375)  in depth 0
detected: (41.976233251953126, 318.5662143310547, 540.6600836832682, 350.88531358235673) 	 at (0.0, 0.0, 1190.550048828125, 841.8900146484375)  in depth 0
detected: (48.779157514105904, 379.9449985107422, 103.76186489257813, 408.44876979980467) 	 at (0.0, 0.0, 1190.550048828125, 841.8900146484375)  in depth 0
detected: (48.779157514105904, 394.70137668457033, 558.9932339111328, 759.3649990478516) 	 at (0.0, 0.0, 1190.550048828125, 841.8900146484375)  in depth 0
detected: (646.8011474609375, 494.1814170043946, 1142.2224297363282, 710.6767578125) 	 at (558.9932339111328, 394.70137668457033, 1190.550048828125, 759.3649990478516)  in depth 1
detected: (88.93021274414062, 160.45389938354492, 517.3770229736328, 221.45774315185548) 	 at (0.0, 111.94580959472657, 1190.550048828125, 318.5662143310547)  in depth 1


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.1 :	 3  tables
detected: (41.26723361816406, 74.14928317871093, 540.6768276611328, 106.98481105957032) 	 at (0.0, 0.0, 1190.550048828125, 841.8900146484375)  in depth 0
detected: (225.33073306884765, 646.3049228271484, 385.48877834472654, 667.170266381836) 	 at (0.0, 0.0, 1190.550048828125, 841.8900146484375)  in depth 0
detected: (48.804165228271486, 678.7022006591797, 540.6768276611328, 754.0571987548828) 	 at (0.0, 0.0, 1190.550048828125, 841.8900146484375)  in depth 0
detected: (629.8217074951172, 466.3999189615886, 1141.7717983642578, 588.0652553955078) 	 at (0.0, 0.0, 1190.550048828125, 841.8900146484375)  in depth 0
detected: (440.33248020019533, 145.54169082641602, 1131.4844448486328, 463.4336418945312) 	 at (0.0, 106.98481105957032, 1190.550048828125, 466.3999189615886)  in depth 1
Filling in gap at top of table
exception :  The identified boxes have significant overlap: 168.55% of area is overlapping (Max is 30.00%)
error :  (41.26723361816406, 74.14928317871093

  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)
  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.2 :	 4  tables
detected: (632.2678744873047, 232.89423506266274, 1141.7758266845703, 602.4850552001953) 	 at (0.0, 0.0, 1190.550048828125, 841.8900146484375)  in depth 0
detected: (48.779185064697266, 284.05514467569986, 540.8274420817057, 602.4850552001953) 	 at (0.0, 232.89423506266274, 632.2678744873047, 602.4850552001953)  in depth 1
detected: (48.779147871398926, 284.05513645019533, 540.8274420817057, 729.8982632080078) 	 at (0.0, 0.0, 1190.550048828125, 841.8900146484375)  in depth 0
detected: (649.7280069986979, 284.05513645019533, 1141.7759372233072, 602.485107421875) 	 at (540.8274420817057, 284.05513645019533, 1190.550048828125, 729.8982632080078)  in depth 1


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.3 :	 2  tables
detected: (48.779147871398926, 92.71623874511718, 546.7646572509766, 550.7486660400391) 	 at (0.0, 0.0, 1190.550048828125, 841.8900146484375)  in depth 0
detected: (649.7241516113281, 92.71623874511718, 1142.480425341797, 550.7486660400391) 	 at (546.7646572509766, 92.71623874511718, 1190.550048828125, 550.7486660400391)  in depth 1
detected: (633.8916537841797, 92.71623874511718, 1142.4804165283203, 626.7171718994141) 	 at (0.0, 0.0, 1190.550048828125, 841.8900146484375)  in depth 0
detected: (48.77915531005859, 92.71623874511718, 541.5358161051432, 550.7486572265625) 	 at (0.0, 92.71623874511718, 633.8916537841797, 626.7171718994141)  in depth 1


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.4 :	 2  tables
detected: (48.835962672008165, 203.36717563476563, 540.8267910400391, 482.8163845458984) 	 at (0.0, 0.0, 1190.550048828125, 841.8900146484375)  in depth 0
detected: (649.7240234375, 245.3671515625, 1141.7727837402344, 374.5801666259766) 	 at (540.8267910400391, 203.36717563476563, 1190.550048828125, 482.8163845458984)  in depth 1
detected: (649.8582960367838, 245.36716342773437, 1141.7727749267578, 374.5801784912109) 	 at (0.0, 0.0, 1190.550048828125, 841.8900146484375)  in depth 0
detected: (48.904185064697266, 245.36716342773437, 540.8268215576172, 374.5801784912109) 	 at (0.0, 245.36716342773437, 649.8582960367838, 374.5801784912109)  in depth 1
detected: (70.66310764160156, 68.22891879882812, 1142.7768032470703, 203.36717563476563) 	 at (0.0, 0.0, 1190.550048828125, 203.36717563476563)  in depth 1


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.5 :	 2  tables
detected: (649.7281406005859, 72.87404895629882, 1141.7758266845703, 727.7890102783203) 	 at (0.0, 0.0, 1190.550048828125, 841.8900146484375)  in depth 0
detected: (48.779507926802204, 448.24013663940434, 540.8283169189453, 727.7890102783203) 	 at (0.0, 72.87404895629882, 649.7281406005859, 727.7890102783203)  in depth 1
detected: (48.779507926802204, 447.8007876953125, 545.7222988525391, 772.9190761962891) 	 at (0.0, 0.0, 1190.550048828125, 841.8900146484375)  in depth 0
detected: (690.2151031494141, 447.8007876953125, 1141.775835498047, 727.7889862060547) 	 at (545.7222988525391, 447.8007876953125, 1190.550048828125, 772.9190761962891)  in depth 1
detected in p.6 :	 2  tables
detected: (41.26723361816406, 76.68126560058593, 540.3266689697266, 115.08277248535157) 	 at (0.0, 0.0, 1190.550048828125, 841.8900146484375)  in depth 0
detected: (47.10306049194336, 172.24065280761718, 548.9706509033203, 672.3597500244141) 	 at (0.0, 0.0, 1190.550048828125, 841.890

  0%|          | 0/9 [00:00<?, ?it/s]

detected: (0.0, 0.0, 595.2760009765625, 841.8900146484375) 	 at (0.0, 0.0, 595.2760009765625, 841.8900146484375)  in depth 0
exception :  The identified boxes have significant overlap: 34.65% of area is overlapping (Max is 30.00%)
error :  (0.0, 0.0, 595.2760009765625, 841.8900146484375)
detected: (640.4362094482422, 79.11028171386718, 1141.6215298095703, 111.94580959472657) 	 at (0.0, 0.0, 1190.550048828125, 841.8900146484375)  in depth 0
detected: (41.976233251953126, 321.40123630371096, 540.4667243082682, 353.72022365722654) 	 at (0.0, 0.0, 1190.550048828125, 841.8900146484375)  in depth 0
detected: (610.0922153076172, 156.12734104003906, 1146.5685512939453, 712.8389980712891) 	 at (0.0, 0.0, 1190.550048828125, 841.8900146484375)  in depth 0
detected: (41.976233251953126, 321.4012580078125, 540.4667243082682, 353.7202453613281) 	 at (0.0, 156.12734104003906, 610.0922153076172, 712.8389980712891)  in depth 1
detected: (48.74037509765625, 382.77927924804686, 103.76186489257813, 411.28

  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)
  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.2 :	 6  tables
detected: (493.2639682373047, 117.88960147705077, 1156.1040737548828, 385.1126186767578) 	 at (0.0, 0.0, 1190.550048828125, 841.8900146484375)  in depth 0
detected: (60.28058505859375, 167.5259566467285, 493.2639682373047, 375.5717544555664) 	 at (0.0, 117.88960147705077, 493.2639682373047, 385.1126186767578)  in depth 1
exception :  The identified boxes have significant overlap: 35.92% of area is overlapping (Max is 30.00%)
error :  (493.2639682373047, 117.88960147705077, 1156.1040737548828, 385.1126186767578)
detected in p.3 :	 1  tables
detected: (41.26723361816406, 74.14928317871093, 540.6772182861329, 106.98481105957032) 	 at (0.0, 0.0, 1190.550048828125, 841.8900146484375)  in depth 0
detected: (83.50323367919921, 421.81824375, 261.6277553955078, 515.781614851888) 	 at (0.0, 0.0, 1190.550048828125, 841.8900146484375)  in depth 0
detected: (323.6266098022461, 421.81824375, 1130.935991748047, 515.781614851888) 	 at (261.6277553955078, 421.81824375, 1190

  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


exception :  list index out of range
error :  (261.6277553955078, 447.60850932617194, 273.1904182434082, 513.5750987609864)
detected in p.4 :	 5  tables
detected: (615.3488681396484, 216.84328342285156, 1145.5576870361328, 500.13004421386717) 	 at (0.0, 0.0, 1190.550048828125, 841.8900146484375)  in depth 0
detected in p.5 :	 1  tables
detected: (668.3992221435547, 176.33824802246093, 1123.0967495361328, 336.7329800048828) 	 at (0.0, 0.0, 1190.550048828125, 841.8900146484375)  in depth 0
detected: (42.684233435058594, 470.73323703613283, 540.3266689697266, 509.13477443847654) 	 at (0.0, 0.0, 1190.550048828125, 841.8900146484375)  in depth 0
detected: (666.1593017578125, 483.4439864318848, 1052.9785942871094, 509.13477443847654) 	 at (540.3266689697266, 470.73323703613283, 1190.550048828125, 509.13477443847654)  in depth 1
detected: (673.9952304443359, 484.98068118896487, 1111.8317348876953, 711.1896450439453) 	 at (0.0, 0.0, 1190.550048828125, 841.8900146484375)  in depth 0
detected: (

Unnamed: 0,85.0,2017,2018,Unnamed: 4,2019,2020,2021,Unnamed: 8


exception :  table does not sound
errs :  1
detected in p.6 :	 2  tables
detected: (643.6292636474609, 74.85826755371093, 1141.2715542236328, 107.69379543457032) 	 at (0.0, 0.0, 1190.550048828125, 841.8900146484375)  in depth 0
exception :  The identified boxes have significant overlap: 259.27% of area is overlapping (Max is 30.00%)
error :  (643.6292636474609, 74.85826755371093, 1141.2715542236328, 107.69379543457032)
detected: (46.41688419189453, 117.54052615966796, 394.8655483642578, 423.4892361083984) 	 at (0.0, 0.0, 595.2760009765625, 841.8900146484375)  in depth 0
detected in p.8 :	 1  tables
17 7 17
./train_source/월간 나라재정 2023년 12월호.pdf /content/src/
Processing 월간 나라재정 2023년 12월호.pdf...
/content/src/train_source/월간 나라재정 2023년 12월호.pdf /content/drive/MyDrive/kdt-EST-AI/project/dacon_fis/src/processed/./train_source/월간 나라재정 2023년 12월호.pdf


  0%|          | 0/68 [00:00<?, ?it/s]

detected: (35.02728307495117, 518.5687068725586, 197.43199503173827, 690.026385900879) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.1 :	 1  tables
detected: (88.32394063720703, 137.88334310302736, 249.8691593383789, 613.6314274047852) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (311.75870513916016, 146.6003797103882, 492.10239338378904, 598.8033752441406) 	 at (249.8691593383789, 137.88334310302736, 538.5830078125, 613.6314274047852)  in depth 1
detected: (47.57795369873047, 30.387466790771484, 504.57506143798827, 116.55561411132813) 	 at (0.0, 0.0, 538.5830078125, 137.88334310302736)  in depth 1


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


exception :  The identified boxes have significant overlap: 74.05% of area is overlapping (Max is 30.00%)
error :  (47.57795369873047, 30.387466790771484, 504.57506143798827, 116.55561411132813)
detected in p.2 :	 2  tables
detected: (65.64953267822266, 596.2608455444336, 447.0150638793945, 678.6172672485352) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.3 :	 1  tables
detected: (93.55581319580078, 452.18653524169923, 460.531299230957, 667.9518009399414) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (93.9244197631836, 135.38346517333986, 461.6356388305664, 360.30134927978514) 	 at (0.0, 0.0, 538.5830078125, 452.18653524169923)  in depth 1
detected in p.4 :	 2  tables
detected: (78.11112630615234, 509.8281062866211, 446.2978702758789, 592.8019596313477) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.5 :	 1  tables
detected: (92.28512227783203, 424.7891353393555, 460.22188150634764, 525.7272526000977) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.6 :	 1  tables
detected: (110.57269195963542, 274.9412196899414, 454.81872904052733, 354.0137668823242) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (84.13443410644531, 89.91088521728516, 461.2667423461914, 274.18199503173827) 	 at (0.0, 0.0, 538.5830078125, 274.9412196899414)  in depth 1


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.10 :	 2  tables
detected: (73.67007863769531, 152.76388204345704, 444.69371378173827, 676.122821447754) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
exception :  The identified boxes have significant overlap: 40.34% of area is overlapping (Max is 30.00%)
error :  (73.67007863769531, 152.76388204345704, 444.69371378173827, 676.122821447754)
detected: (97.14513814697266, 121.3719295288086, 461.88142740478514, 638.5955997680664) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (20.701694848632812, 107.36940419921875, 485.20906412353514, 121.3719295288086) 	 at (0.0, 0.0, 538.5830078125, 121.3719295288086)  in depth 1
exception :  The identified boxes have significant overlap: 41.74% of area is overlapping (Max is 30.00%)
error :  (97.14513814697266, 121.3719295288086, 461.88142740478514, 638.5955997680664)
exception :  The identified boxes have significant overlap: 34.70% of area is overlapping (Max is 30.00%)
error :  (20.7016948486

  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.25 :	 3  tables
detected: (54.65306890258789, 73.1447109008789, 503.91478311767577, 585.6825748657227) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.26 :	 1  tables
detected: (51.3595250869751, 70.17990529785156, 504.200244543457, 610.4679142211914) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.28 :	 1  tables
detected: (39.69974553833008, 70.88701284179687, 489.15135538330077, 687.9178043579102) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.29 :	 1  tables
detected: (63.25106466064453, 100.41054952392578, 504.29137003173827, 585.6036564086914) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.30 :	 1  tables
detected: (50.47379211832682, 79.19588125, 505.92723428955077, 634.8582950805664) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.32 :	 1  tables
detected: (269.0477107788086, 90.8807185913086, 490.29002725830077, 678.12

  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.35 :	 2  tables
detected: (64.34244954833984, 102.87173116455078, 505.36876260986327, 621.8499332641602) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.36 :	 1  tables
detected: (50.47379211832682, 108.6906322265625, 504.1600528930664, 589.5273014282227) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (45.44882619628906, 94.75456655273437, 503.96724283447264, 108.6906322265625) 	 at (0.0, 0.0, 538.5830078125, 108.6906322265625)  in depth 1


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


exception :  The identified boxes have significant overlap: 184.85% of area is overlapping (Max is 30.00%)
error :  (45.44882619628906, 94.75456655273437, 503.96724283447264, 108.6906322265625)
detected in p.38 :	 1  tables
detected: (75.49112355957031, 518.472637536621, 447.0092045043945, 694.5885807250977) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.39 :	 1  tables
detected: (95.3794978881836, 104.9876521850586, 462.3404728149414, 624.6617008422852) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
exception :  The identified boxes have significant overlap: 37.72% of area is overlapping (Max is 30.00%)
error :  (95.3794978881836, 104.9876521850586, 462.3404728149414, 624.6617008422852)
detected: (77.4546932006836, 98.23254049072266, 446.034961340332, 593.8412662719727) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
Filling in gap at top of table
exception :  The identified boxes have significant overlap: 34.86% of area is 

  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.45 :	 1  tables
detected: (34.76388204345703, 83.2430232788086, 233.38811075439452, 380.58751260986327) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.47 :	 1  tables
detected: (50.50052297363281, 110.13720357666016, 258.0267521118164, 426.502490637207) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.48 :	 1  tables
detected: (35.33351552734375, 84.69482839355469, 238.87466776123046, 375.79646646728514) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected in p.49 :	 1  tables
detected: (48.4029010559082, 445.45206868896486, 250.09556925048827, 623.5872379516602) 	 at (0.0, 0.0, 538.5830078125, 737.0079956054688)  in depth 0
detected: (298.9723892211914, 451.82672477264407, 503.3966896484375, 614.9078521728516) 	 at (250.09556925048827, 445.45206868896486, 538.5830078125, 623.5872379516602)  in depth 1
detected: (53.193111779785156, 373.69877279052736, 273.1751896118164, 445.45206868896486) 	 at 

Unnamed: 0,Unnamed: 1
0,다. 그런데 그 보조금을 포기하시겠다는 말씀에 ‘얼마나 답
1,답하셨으면 그런 말씀을 하셨을까? 이분이 원격 프로그램
2,연결이 되지 않는다고 상담을 그냥 마치게 되면 다시 우리
3,에게 상담할 기회는 있는 걸까?’라는 생각에 마음이 먹먹해
4,졌다.


|                                                         |
|:--------------------------------------------------------|
| 다. 그런데 그 보조금을 포기하시겠다는 말씀에 ‘얼마나 답 |
| 답하셨으면 그런 말씀을 하셨을까? 이분이 원격 프로그램   |
| 연결이 되지 않는다고 상담을 그냥 마치게 되면 다시 우리  |
| 에게 상담할 기회는 있는 걸까?’라는 생각에 마음이 먹먹해 |
| 졌다.                                                   |
46 36 46


Processing PDFs:   0%|          | 0/9 [00:00<?, ?it/s]

./test_source/중소벤처기업부_혁신창업사업화자금(융자).pdf /content/src/
Processing 중소벤처기업부_혁신창업사업화자금(융자).pdf...
/content/src/test_source/중소벤처기업부_혁신창업사업화자금(융자).pdf /content/drive/MyDrive/kdt-EST-AI/project/dacon_fis/src/processed/./test_source/중소벤처기업부_혁신창업사업화자금(융자).pdf


  0%|          | 0/3 [00:00<?, ?it/s]

detected: (50.36739993693034, 459.3435929199219, 543.0411950846354, 547.435429296875) 	 at (0.0, 0.0, 595.0, 841.0)  in depth 0
detected: (47.14997414550781, 360.9316056152344, 543.0411950846354, 411.99054404296874) 	 at (0.0, 0.0, 595.0, 841.0)  in depth 0
detected: (50.36739993693034, 174.17559365234376, 543.0411950846354, 324.5993697265625) 	 at (0.0, 0.0, 595.0, 841.0)  in depth 0
detected: (76.55198220214844, 66.39969185791016, 418.3935591796875, 130.9975936035156) 	 at (0.0, 0.0, 595.0, 174.17559365234376)  in depth 1


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)
  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)
  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.0 :	 4  tables
detected: (62.93660286865234, 125.38860634765625, 544.8163903971354, 300.74542685546874) 	 at (0.0, 0.0, 595.0, 841.0)  in depth 0
detected in p.2 :	 1  tables
5 2 5
./test_source/보건복지부_부모급여(영아수당) 지원.pdf /content/src/
Processing 보건복지부_부모급여(영아수당) 지원.pdf...
/content/src/test_source/보건복지부_부모급여(영아수당) 지원.pdf /content/drive/MyDrive/kdt-EST-AI/project/dacon_fis/src/processed/./test_source/보건복지부_부모급여(영아수당) 지원.pdf


  0%|          | 0/3 [00:00<?, ?it/s]

detected: (51.182799361165365, 468.6935990234375, 548.0550785481771, 558.583378515625) 	 at (0.0, 0.0, 595.0, 841.0)  in depth 0
detected: (51.182799361165365, 343.66961220703126, 543.9572391927084, 411.40903159179686) 	 at (0.0, 0.0, 595.0, 841.0)  in depth 0
detected: (51.182799361165365, 183.04558876953126, 543.9572391927084, 320.76337119140624) 	 at (0.0, 0.0, 595.0, 841.0)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.0 :	 3  tables
detected: (79.84759835205078, 177.8915970703125, 544.8163903971354, 280.84637900390624) 	 at (0.0, 0.0, 595.0, 841.0)  in depth 0
detected: (63.52735451660156, 59.96496895751953, 507.008183203125, 177.8915970703125) 	 at (0.0, 0.0, 595.0, 177.8915970703125)  in depth 1
detected in p.2 :	 2  tables
5 2 5
./test_source/보건복지부_노인장기요양보험 사업운영.pdf /content/src/
Processing 보건복지부_노인장기요양보험 사업운영.pdf...
/content/src/test_source/보건복지부_노인장기요양보험 사업운영.pdf /content/drive/MyDrive/kdt-EST-AI/project/dacon_fis/src/processed/./test_source/보건복지부_노인장기요양보험 사업운영.pdf


  0%|          | 0/4 [00:00<?, ?it/s]

detected: (51.182799361165365, 343.66961220703126, 544.2250695963542, 396.7604109863281) 	 at (0.0, 0.0, 595.0, 841.0)  in depth 0
detected: (51.182799361165365, 451.1925919433594, 544.2250695963542, 551.1513716796875) 	 at (0.0, 0.0, 595.0, 841.0)  in depth 0
detected: (51.182799361165365, 183.04558876953126, 544.2250695963542, 320.76337119140624) 	 at (0.0, 0.0, 595.0, 841.0)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)
  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.0 :	 3  tables
detected: (69.14724282226562, 305.23830727539064, 529.5009810546875, 759.9174239257812) 	 at (0.0, 0.0, 595.0, 841.0)  in depth 0
detected: (54.02691201171875, 77.77754525146484, 545.4689986328125, 268.9314314941406) 	 at (0.0, 0.0, 595.0, 305.23830727539064)  in depth 1
exception :  The identified boxes have significant overlap: 32.55% of area is overlapping (Max is 30.00%)
error :  (69.14724282226562, 305.23830727539064, 529.5009810546875, 759.9174239257812)
detected in p.1 :	 1  tables
detected: (66.82345131835937, 120.62259987792969, 543.6897017578125, 767.13342734375) 	 at (0.0, 0.0, 595.0, 841.0)  in depth 0
detected in p.2 :	 1  tables
detected: (79.84759835205078, 168.30157021484376, 544.8763981282552, 273.6544234375) 	 at (0.0, 0.0, 595.0, 841.0)  in depth 0
detected in p.3 :	 1  tables
6 4 6
./test_source/산업통상자원부_에너지바우처.pdf /content/src/
Processing 산업통상자원부_에너지바우처.pdf...
/content/src/test_source/산업통상자원부_에너지바우처.pd

  0%|          | 0/11 [00:00<?, ?it/s]

detected: (51.49688489467076, 183.04558876953126, 543.8973156161221, 267.541386328125) 	 at (0.0, 0.0, 595.0, 841.0)  in depth 0
detected: (51.49688489467076, 261.79961708984376, 543.8973156161221, 333.58942099609374) 	 at (0.0, 0.0, 595.0, 841.0)  in depth 0
detected: (51.49688489467076, 460.0625870605469, 543.8973156161221, 513.1524092773437) 	 at (0.0, 0.0, 595.0, 841.0)  in depth 0
detected: (51.49688489467076, 538.936641015625, 543.8973156161221, 758.1911666015625) 	 at (0.0, 0.0, 595.0, 841.0)  in depth 0
detected: (49.963500299072265, 335.03960732421876, 543.8973156161221, 431.52240439453124) 	 at (0.0, 0.0, 595.0, 841.0)  in depth 0
detected: (61.71473244628906, 91.78479317626953, 395.3478133300781, 162.17738791503905) 	 at (0.0, 0.0, 595.0, 183.04558876953126)  in depth 1
detected: (51.54259931844076, 267.541386328125, 543.8574060221354, 333.5894254882812) 	 at (0.0, 267.541386328125, 595.0, 335.03960732421876)  in depth 1


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)
  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)
  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)
  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.0 :	 6  tables
detected: (50.10360014241537, 260.96062783203126, 545.176385921224, 758.4044234375) 	 at (0.0, 0.0, 595.0, 841.0)  in depth 0
detected: (50.10360014241537, 107.72803048095703, 545.176385921224, 204.85040732421874) 	 at (0.0, 0.0, 595.0, 841.0)  in depth 0
Filling in gap at top of table
detected in p.1 :	 2  tables
detected: (50.10360014241537, 70.00957802734375, 545.0563806315104, 379.3793990234375) 	 at (0.0, 0.0, 595.0, 841.0)  in depth 0
detected in p.2 :	 1  tables
detected: (73.79060041707356, 608.2205765625, 544.516728125, 737.9063765625) 	 at (0.0, 0.0, 595.0, 841.0)  in depth 0
detected: (73.79060041707356, 318.13762978515626, 544.516728125, 370.97516318359374) 	 at (0.0, 0.0, 595.0, 841.0)  in depth 0
detected: (66.79326943359375, 60.06196145019531, 538.3943526367187, 299.8015181640625) 	 at (0.0, 0.0, 595.0, 318.13762978515626)  in depth 1


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)
  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.4 :	 3  tables
detected: (59.97760323486328, 447.35659340820314, 536.2213790039062, 742.2224166015625) 	 at (0.0, 0.0, 595.0, 841.0)  in depth 0
detected: (56.10096291503906, 130.66675881347658, 537.2121016601562, 384.337040625) 	 at (0.0, 0.0, 595.0, 447.35659340820314)  in depth 1


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.5 :	 2  tables
detected: (61.07759916585286, 149.00262001953126, 544.8163903971354, 320.76337119140624) 	 at (0.0, 0.0, 595.0, 841.0)  in depth 0
detected: (61.07759916585286, 360.45159462890626, 544.8163903971354, 660.7114302734375) 	 at (0.0, 0.0, 595.0, 841.0)  in depth 0
detected in p.6 :	 2  tables
detected: (50.10360014241537, 113.7615921875, 546.1359094401041, 204.49042197265624) 	 at (0.0, 0.0, 595.0, 841.0)  in depth 0
detected in p.8 :	 1  tables
detected: (50.10360014241537, 195.3925736328125, 547.6954187174479, 329.0344283203125) 	 at (0.0, 0.0, 595.0, 841.0)  in depth 0
detected: (50.10360014241537, 381.42861489257814, 542.6573938151041, 549.1134078125) 	 at (0.0, 0.0, 595.0, 841.0)  in depth 0
detected: (48.454459466552734, 82.44220284423828, 537.8236739257812, 181.8843581298828) 	 at (0.0, 0.0, 595.0, 195.3925736328125)  in depth 1
Filling in gap at top of table
exception :  The identified boxes have significant overlap: 32.37% of area is overlapping (Max i

  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.10 :	 2  tables
21 9 21
./test_source/국토교통부_행복주택출자.pdf /content/src/
Processing 국토교통부_행복주택출자.pdf...
/content/src/test_source/국토교통부_행복주택출자.pdf /content/drive/MyDrive/kdt-EST-AI/project/dacon_fis/src/processed/./test_source/국토교통부_행복주택출자.pdf


  0%|          | 0/3 [00:00<?, ?it/s]

detected: (51.32272430063883, 183.04558876953126, 544.3147530273437, 254.71539755859374) 	 at (0.0, 0.0, 595.0, 841.0)  in depth 0
detected: (51.32272430063883, 249.0936234375, 544.3147530273437, 320.76337119140624) 	 at (0.0, 0.0, 595.0, 841.0)  in depth 0
detected: (51.32272430063883, 520.4765580078125, 544.3147530273437, 611.3253829101562) 	 at (0.0, 0.0, 595.0, 841.0)  in depth 0
detected: (51.32272430063883, 670.07262734375, 544.3147530273437, 750.9724166015625) 	 at (0.0, 0.0, 595.0, 841.0)  in depth 0
detected: (49.76991585693359, 322.2135880371094, 544.3147530273437, 425.5713240722656) 	 at (0.0, 0.0, 595.0, 841.0)  in depth 0
detected: (50.33692864379883, 444.479609765625, 544.3147530273437, 497.60214682617186) 	 at (0.0, 0.0, 595.0, 841.0)  in depth 0
detected: (64.52566842041016, 91.70276955566406, 398.33783408203124, 161.4123122314453) 	 at (0.0, 0.0, 595.0, 183.04558876953126)  in depth 1
detected: (51.54259931844076, 254.71539755859374, 544.3370609700521, 320.763375683593

  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)
  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)
  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)
  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)
  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.0 :	 7  tables
detected: (50.10360014241537, 100.81560830078125, 545.0563806315104, 455.25641318359374) 	 at (0.0, 0.0, 595.0, 841.0)  in depth 0
detected in p.1 :	 1  tables
detected: (79.84759835205078, 657.0075638671875, 544.516728125, 767.5144087890625) 	 at (0.0, 0.0, 595.0, 841.0)  in depth 0
detected: (73.3706009765625, 471.09060219726564, 544.516728125, 526.9122359375) 	 at (0.0, 0.0, 595.0, 841.0)  in depth 0
detected: (71.36094979248047, 104.38117531738281, 543.5512740234375, 384.24640341796874) 	 at (0.0, 0.0, 595.0, 471.09060219726564)  in depth 1
table does not sound


Unnamed: 0,연도 \n사업비,"사업비(예산액기준, \n1,267,123","사업비(예산액기준, \n2020 \n1,267,123","사업비(예산액기준, \n1,267,123.1","사업비(예산액기준, \n1,105,291","사업비(예산액기준, \n2021 \n1,105,291 \n등","사업비(예산액기준, \n1,105,291.1",Unnamed: 8,Unnamed: 9,추경편성한 \n계속사업 추진,"2022 \n775,293",연도에는 추경포함),"연도에는 추경포함) \n2023 \n684,607","연도에는 추경포함) \n2024 \n528,783"


exception :  table does not sound
errs :  1
detected in p.2 :	 2  tables
10 3 10
./test_source/「FIS 이슈 & 포커스」 22-4호 《중앙-지방 간 재정조정제도》.pdf /content/src/
Processing 「FIS 이슈 & 포커스」 22-4호 《중앙-지방 간 재정조정제도》.pdf...
/content/src/test_source/「FIS 이슈 & 포커스」 22-4호 《중앙-지방 간 재정조정제도》.pdf /content/drive/MyDrive/kdt-EST-AI/project/dacon_fis/src/processed/./test_source/「FIS 이슈 & 포커스」 22-4호 《중앙-지방 간 재정조정제도》.pdf


  0%|          | 0/9 [00:00<?, ?it/s]

detected: (0.0, 0.0, 595.2760009765625, 841.8900146484375) 	 at (0.0, 0.0, 595.2760009765625, 841.8900146484375)  in depth 0
detected in p.0 :	 1  tables
detected: (41.976233251953126, 319.63951755371096, 540.4667243082682, 372.0141842285156) 	 at (0.0, 0.0, 1190.550048828125, 841.8900146484375)  in depth 0
detected: (650.1707952969638, 456.69310642089846, 1137.9115230957032, 710.6308681884766) 	 at (0.0, 0.0, 1190.550048828125, 841.8900146484375)  in depth 0
detected: (48.77916026916504, 640.4492468994141, 109.3260662475586, 668.9529724121094) 	 at (0.0, 456.69310642089846, 650.1707952969638, 710.6308681884766)  in depth 1
detected: (53.400568731689454, 456.69310642089846, 539.4053432861328, 649.5002899169922) 	 at (0.0, 456.69310642089846, 650.1707952969638, 710.6308681884766)  in depth 1
detected: (640.4362094482422, 79.21865726318359, 717.7757656494141, 111.94580959472657) 	 at (0.0, 0.0, 1190.550048828125, 319.63951755371096)  in depth 1
detected: (73.00881839599609, 216.019873388

  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.3 :	 2  tables
detected: (649.7200534423828, 514.6562411865234, 1139.1693610270183, 643.8690884033203) 	 at (0.0, 0.0, 1190.550048828125, 841.8900146484375)  in depth 0
detected: (631.3367831787109, 175.9061286529541, 1139.1693610270183, 298.70321818560427) 	 at (0.0, 0.0, 1190.550048828125, 841.8900146484375)  in depth 0
detected: (50.225306098865325, 282.8756859102137, 540.6809780517578, 495.04789089355467) 	 at (0.0, 0.0, 1190.550048828125, 841.8900146484375)  in depth 0
detected: (652.2243194580078, 282.8756859102137, 1131.658647998047, 294.6398954110987) 	 at (540.6809780517578, 282.8756859102137, 1190.550048828125, 495.04789089355467)  in depth 1
Filling in gap at top of table
detected in p.4 :	 3  tables


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected: (42.684233435058594, 297.0736606201172, 540.4255255777995, 349.01849777080827) 	 at (0.0, 0.0, 1190.550048828125, 841.8900146484375)  in depth 0
detected: (649.724131266276, 297.0736606201172, 1141.7718783854166, 349.01849777080827) 	 at (540.4255255777995, 297.0736606201172, 1190.550048828125, 349.01849777080827)  in depth 1
detected: (51.10606060462365, 92.71623346322866, 540.4255255777995, 185.07935451660157) 	 at (0.0, 0.0, 1190.550048828125, 841.8900146484375)  in depth 0
detected: (648.7638461669922, 272.0072543701172, 1137.1971828857422, 388.4651882568359) 	 at (0.0, 0.0, 1190.550048828125, 841.8900146484375)  in depth 0
detected: (42.684233435058594, 297.07364417724614, 540.3266689697266, 350.3291778564453) 	 at (0.0, 272.0072543701172, 648.7638461669922, 388.4651882568359)  in depth 1


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.5 :	 3  tables
detected: (649.7415140814887, 90.89851260986327, 1140.882475447591, 187.91401027832032) 	 at (0.0, 0.0, 1190.550048828125, 841.8900146484375)  in depth 0
detected: (50.88535337148813, 418.13326145019533, 540.8289272705078, 533.1738979736328) 	 at (0.0, 0.0, 1190.550048828125, 841.8900146484375)  in depth 0
detected: (645.2016143798828, 418.13326145019533, 1139.2622246582032, 524.0775146484375) 	 at (540.8289272705078, 418.13326145019533, 1190.550048828125, 533.1738979736328)  in depth 1
detected: (53.84312320556641, 187.91401027832032, 1141.7986538330078, 407.21408317871095) 	 at (0.0, 187.91401027832032, 1190.550048828125, 418.13326145019533)  in depth 1
detected in p.6 :	 4  tables
detected: (65.72615759379069, 562.4172885498047, 523.8187954345703, 592.4177944580078) 	 at (0.0, 0.0, 1190.550048828125, 841.8900146484375)  in depth 0
detected: (66.26201511230468, 145.21005893554687, 527.2978603759766, 562.2171718994141) 	 at (0.0, 0.0, 1190.550048828125, 56

  0%|          | 0/11 [00:00<?, ?it/s]

detected: (0.0, 0.0, 595.2760009765625, 841.8900146484375) 	 at (0.0, 0.0, 595.2760009765625, 841.8900146484375)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.0 :	 1  tables
detected: (640.4362094482422, 79.11028171386718, 1141.6215298095703, 111.94580959472657) 	 at (0.0, 0.0, 1190.550048828125, 841.8900146484375)  in depth 0
detected: (41.976233251953126, 373.89122653808596, 540.6600836832682, 410.6947719970703) 	 at (0.0, 0.0, 1190.550048828125, 841.8900146484375)  in depth 0
detected: (48.77916026916504, 427.54524875488283, 103.76186489257813, 456.0488216796875) 	 at (0.0, 0.0, 1190.550048828125, 841.8900146484375)  in depth 0
detected: (48.77916026916504, 463.29561496582033, 555.1835415283203, 768.8075039306641) 	 at (0.0, 0.0, 1190.550048828125, 841.8900146484375)  in depth 0
detected: (654.22216796875, 474.8810134094239, 1141.1421074707032, 727.83251953125) 	 at (555.1835415283203, 463.29561496582033, 1190.550048828125, 768.8075039306641)  in depth 1
detected: (76.8774478515625, 193.5495147705078, 517.4390346923828, 284.34775017089845) 	 at (0.0, 111.94580959472657, 1190.550048828125, 373.89122653808596)  in depth 1
dete

  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected: (541.8133456787109, 136.15453220214843, 1140.2162563720703, 383.4853298583984) 	 at (0.0, 0.0, 1190.550048828125, 841.8900146484375)  in depth 0
detected: (53.01486469116211, 136.15453220214843, 541.1344082275391, 382.51593017578125) 	 at (0.0, 136.15453220214843, 541.8133456787109, 383.4853298583984)  in depth 1


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.2 :	 2  tables
detected: (41.14223361816406, 73.869269140625, 540.2016689697266, 106.98481105957032) 	 at (0.0, 0.0, 1190.550048828125, 841.8900146484375)  in depth 0
detected: (48.6540626475941, 137.85802341308593, 110.61536525878907, 171.26094936523438) 	 at (0.0, 0.0, 1190.550048828125, 841.8900146484375)  in depth 0
exception :  The identified boxes have significant overlap: 98.55% of area is overlapping (Max is 30.00%)
error :  (41.14223361816406, 73.869269140625, 540.2016689697266, 106.98481105957032)
exception :  list index out of range
error :  (48.6540626475941, 137.85802341308593, 110.61536525878907, 171.26094936523438)
detected: (48.48532367553711, 123.48969150390624, 1166.5161831298828, 445.9509975830078) 	 at (0.0, 0.0, 1190.550048828125, 841.8900146484375)  in depth 0
detected in p.4 :	 1  tables
detected: (650.0852184592507, 74.1189283610026, 712.0589687744141, 106.27625647379558) 	 at (0.0, 0.0, 1190.550048828125, 841.8900146484375)  in depth 0
detected: (

Unnamed: 0,Unnamed: 1,| 지역 내 취창업자 수 |,"(단위: 명, 플랫폼당)"


exception :  table does not sound
errs :  1
detected in p.5 :	 3  tables
detected: (545.5513827880859, 137.40880466308593, 1134.8986294189453, 508.3971645751953) 	 at (0.0, 0.0, 1190.550048828125, 841.8900146484375)  in depth 0
detected: (203.51947666015624, 191.54507972412108, 539.5894253173828, 493.37646484375) 	 at (0.0, 137.40880466308593, 545.5513827880859, 508.3971645751953)  in depth 1
detected: (67.91590762939452, 446.19036510009767, 203.51947666015624, 488.2271869262695) 	 at (0.0, 191.54507972412108, 203.51947666015624, 493.37646484375)  in depth 2


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)
  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.6 :	 3  tables
detected: (649.7239069802024, 73.45001865234374, 711.6851284423828, 106.27625647379558) 	 at (0.0, 0.0, 1190.550048828125, 841.8900146484375)  in depth 0
detected: (668.8794345458984, 525.2554368787977, 890.0855343261719, 630.7846462646485) 	 at (0.0, 0.0, 1190.550048828125, 841.8900146484375)  in depth 0
detected: (904.255558013916, 525.2554368787977, 1122.239962084961, 630.0366880628798) 	 at (890.0855343261719, 525.2554368787977, 1190.550048828125, 630.7846462646485)  in depth 1
detected: (45.70799899902344, 136.67014194335937, 557.1850674072266, 502.35181545410154) 	 at (0.0, 0.0, 1190.550048828125, 841.8900146484375)  in depth 0
detected: (650.4291458129883, 154.13024284057616, 1141.3626274902344, 458.41880798339844) 	 at (557.1850674072266, 136.67014194335937, 1190.550048828125, 502.35181545410154)  in depth 1
detected: (580.8270175537109, 126.299444921875, 1136.8986294189453, 486.58036159667967) 	 at (0.0, 0.0, 1190.550048828125, 841.8900146484375)  

  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)
  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


exception :  The identified boxes have significant overlap: 44.46% of area is overlapping (Max is 30.00%)
error :  (45.70799899902344, 136.67014194335937, 557.1850674072266, 502.35181545410154)
detected in p.7 :	 3  tables
detected: (643.6292636474609, 74.47775913085937, 1141.2715542236328, 106.99001939290365) 	 at (0.0, 0.0, 1190.550048828125, 841.8900146484375)  in depth 0
detected: (357.5242221435547, 573.6383426269531, 384.6897671142578, 632.5056240478516) 	 at (0.0, 0.0, 1190.550048828125, 841.8900146484375)  in depth 0
detected: (288.3272006591797, 573.6383426269531, 357.5242221435547, 632.5056240478516) 	 at (0.0, 573.6383426269531, 357.5242221435547, 632.5056240478516)  in depth 1
detected: (384.6897671142578, 573.6383426269531, 410.75776909179683, 632.5056240478516) 	 at (384.6897671142578, 573.6383426269531, 1190.550048828125, 632.5056240478516)  in depth 1
detected: (410.75776909179683, 573.6383426269531, 436.8257710693359, 632.5056240478516) 	 at (410.75776909179683, 573.63

  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


exception :  No rows or columns detected
error :  (383.59224582519533, 573.6383426269531, 410.7577602783203, 632.5056240478516)
exception :  No rows or columns detected
error :  (409.66023898925783, 573.6383426269531, 436.82579277343746, 632.5055914984808)
detected in p.8 :	 1  tables
detected: (49.71914935913086, 77.94411350097656, 1144.9597866455078, 727.2830898681641) 	 at (0.0, 0.0, 1190.550048828125, 841.8900146484375)  in depth 0
detected in p.9 :	 1  tables
detected: (47.80824924316406, 101.45551944580077, 538.9302456298828, 346.2957241455078) 	 at (0.0, 0.0, 595.2760009765625, 841.8900146484375)  in depth 0
exception :  The identified boxes have significant overlap: 37.00% of area is overlapping (Max is 30.00%)
error :  (47.80824924316406, 101.45551944580077, 538.9302456298828, 346.2957241455078)
17 9 17
./test_source/「FIS 이슈&포커스」 22-2호 《재정성과관리제도》.pdf /content/src/
Processing 「FIS 이슈&포커스」 22-2호 《재정성과관리제도》.pdf...
/content/src/test_source/「FIS 이슈&포커스」 22-2호

  0%|          | 0/9 [00:00<?, ?it/s]

detected: (0.0, 0.0, 595.2760009765625, 841.8900146484375) 	 at (0.0, 0.0, 595.2760009765625, 841.8900146484375)  in depth 0
exception :  The identified boxes have significant overlap: 37.05% of area is overlapping (Max is 30.00%)
error :  (0.0, 0.0, 595.2760009765625, 841.8900146484375)
detected: (640.4362094482422, 79.11028171386718, 1141.6215298095703, 111.94580959472657) 	 at (0.0, 0.0, 1190.550048828125, 841.8900146484375)  in depth 0
detected: (41.976233251953126, 300.4252231201172, 540.4667243082682, 332.7440680582682) 	 at (0.0, 0.0, 1190.550048828125, 841.8900146484375)  in depth 0
detected: (48.68223453369141, 361.8031985839844, 103.76186489257813, 390.30677150878904) 	 at (0.0, 0.0, 1190.550048828125, 841.8900146484375)  in depth 0
detected: (47.60848499145508, 385.6074740966797, 551.4464809814453, 697.8247158447266) 	 at (0.0, 0.0, 1190.550048828125, 841.8900146484375)  in depth 0
detected: (81.25430179443359, 202.4924774169922, 528.9308559814453, 242.1450981933594) 	 at (0

Unnamed: 0,재정성과관리 Performance Management \n성과계획서와 보고서,재정성과관리 Performance Management \n연간 성과지표와 목표치 관리


exception :  table does not sound
Filling in gap at top of table
errs :  1
detected in p.2 :	 2  tables
detected: (42.684233435058594, 75.56627536621093, 540.6784389892579, 108.40180324707032) 	 at (0.0, 0.0, 1190.550048828125, 841.8900146484375)  in depth 0
detected: (48.84373355712891, 660.299202961077, 540.6784389892579, 755.3661871988933) 	 at (0.0, 0.0, 1190.550048828125, 841.8900146484375)  in depth 0
detected: (649.4402099609375, 660.299202961077, 1142.0557061035156, 755.3661871988933) 	 at (540.6784389892579, 660.299202961077, 1190.550048828125, 755.3661871988933)  in depth 1
detected: (639.2204501708984, 450.29024387207033, 1141.9138271728516, 772.2595913330078) 	 at (0.0, 0.0, 1190.550048828125, 841.8900146484375)  in depth 0
detected: (48.84373355712891, 660.2991941476005, 540.7663814941407, 755.4970703125) 	 at (0.0, 450.29024387207033, 639.2204501708984, 772.2595913330078)  in depth 1
detected: (40.50149799194336, 365.1898498535156, 552.5491421142578, 449.6146416503906) 	 

  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


Filling in gap at top of table
detected in p.4 :	 5  tables
detected: (643.6292636474609, 80.08330417480468, 1141.2715542236328, 118.48481105957032) 	 at (0.0, 0.0, 1190.550048828125, 841.8900146484375)  in depth 0
detected: (48.779185064697266, 378.3822543701172, 540.8269131103516, 720.1939175048828) 	 at (0.0, 0.0, 1190.550048828125, 841.8900146484375)  in depth 0
detected: (658.0279769897461, 378.3822543701172, 1140.9316582519532, 652.3133544921875) 	 at (540.8269131103516, 378.3822543701172, 1190.550048828125, 720.1939175048828)  in depth 1
detected: (51.85368610229492, 120.91467952728271, 1141.3512661376953, 378.38225437011727) 	 at (0.0, 118.48481105957032, 1190.550048828125, 378.3822543701172)  in depth 1
Filling in gap at top of table
exception :  The identified boxes have significant overlap: 158.16% of area is overlapping (Max is 30.00%)
error :  (643.6292636474609, 80.08330417480468, 1141.2715542236328, 118.48481105957032)
detected in p.5 :	 3  tables
detected: (642.75365329

  0%|          | 0/16 [00:00<?, ?it/s]

detected: (0.0, 0.0, 595.2760009765625, 841.8900146484375) 	 at (0.0, 0.0, 595.2760009765625, 841.8900146484375)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.0 :	 1  tables
detected: (41.976233251953126, 281.14623142089846, 540.6600836832682, 313.9817593017578) 	 at (0.0, 0.0, 595.2760009765625, 841.8900146484375)  in depth 0
detected: (48.77923465576172, 340.08125949707033, 103.76186489257813, 368.5849392333984) 	 at (0.0, 0.0, 595.2760009765625, 841.8900146484375)  in depth 0
detected: (48.5129458984375, 373.93416477050783, 539.2971279541016, 772.9277431884766) 	 at (0.0, 0.0, 595.2760009765625, 841.8900146484375)  in depth 0
exception :  The identified boxes have significant overlap: 236.16% of area is overlapping (Max is 30.00%)
error :  (41.976233251953126, 281.14623142089846, 540.6600836832682, 313.9817593017578)
table does not sound


Unnamed: 0,Unnamed: 1


exception :  table does not sound
exception :  The identified boxes have significant overlap: 37.48% of area is overlapping (Max is 30.00%)
error :  (48.5129458984375, 373.93416477050783, 539.2971279541016, 772.9277431884766)
errs :  1
detected: (44.81023288574219, 79.10924411621093, 545.9518317301432, 111.94477199707032) 	 at (0.0, 0.0, 595.2760009765625, 841.8900146484375)  in depth 0
detected: (67.99853397216796, 405.66414523925783, 545.9518317301432, 540.5468227783203) 	 at (0.0, 0.0, 595.2760009765625, 841.8900146484375)  in depth 0
detected: (63.659094580078126, 184.6033935546875, 547.7153408447266, 368.3145470458984) 	 at (0.0, 111.94477199707032, 595.2760009765625, 405.66414523925783)  in depth 1
exception :  The identified boxes have significant overlap: 91.90% of area is overlapping (Max is 30.00%)
error :  (44.81023288574219, 79.10924411621093, 545.9518317301432, 111.94477199707032)
detected in p.2 :	 2  tables
detected: (48.8415337483724, 473.5321261962891, 540.858773461914

  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.8 :	 3  tables
detected: (42.85623432006836, 75.56627536621093, 540.4986660400391, 108.40180324707032) 	 at (0.0, 0.0, 595.2760009765625, 841.8900146484375)  in depth 0
Filling in gap at top of table
exception :  The identified boxes have significant overlap: 131.72% of area is overlapping (Max is 30.00%)
error :  (42.85623432006836, 75.56627536621093, 540.4986660400391, 108.40180324707032)
detected: (54.57323337402344, 92.84123874511718, 546.6217739501953, 755.9483730712891) 	 at (0.0, 0.0, 595.2760009765625, 841.8900146484375)  in depth 0


  df_temp = df.replace(to_replace=[None], value=0).fillna('').astype(str)


detected in p.10 :	 1  tables
detected: (48.77920986022949, 196.98920322265624, 540.8268346365793, 744.3124853384165) 	 at (0.0, 0.0, 595.2760009765625, 841.8900146484375)  in depth 0
detected in p.13 :	 1  tables
detected: (48.35423541870117, 237.19322849121093, 545.9959804931641, 270.0287563720703) 	 at (0.0, 0.0, 595.2760009765625, 841.8900146484375)  in depth 0
detected: (57.463042028808594, 74.41676975097656, 545.8958828369141, 180.10271572265626) 	 at (0.0, 0.0, 595.2760009765625, 237.19322849121093)  in depth 1
exception :  The identified boxes have significant overlap: 258.21% of area is overlapping (Max is 30.00%)
error :  (48.35423541870117, 237.19322849121093, 545.9959804931641, 270.0287563720703)
detected in p.14 :	 1  tables
detected: (47.82839084472656, 129.74704624023437, 453.05619167480467, 439.14811062011717) 	 at (0.0, 0.0, 595.2760009765625, 841.8900146484375)  in depth 0
detected in p.15 :	 1  tables
19 12 19


# Dataset Config

In [19]:
tab_ver = 'tab_v3.1'
model_dict = {
'large':"intfloat/multilingual-e5-large",
'base':"intfloat/multilingual-e5-base",
}

if tab_ver == 'tab_v0' : file_dir =BASE_DIR
else : file_dir = os.path.join(BASE_DIR,'processed',tab_ver)

model_option = 'large'
#model_option = 'base'
model_path = model_dict[model_option]
chunk_size = 512 #256
split_ver = 'split.2'

In [20]:
aug_type = "AugGPT"

In [21]:
aug_type= 'AugAEDA'

In [22]:
aug_type= 'GPTOnly'

In [23]:
aug_type= 'NoAug'

In [24]:
db_config = {
    'model' : model_option,
    'tab_process' : tab_ver,
    'aug' : aug_type,
    'chunck_size' : chunk_size,
    'split_ver' : split_ver
}

In [25]:
db_name = "{model}-ensemble-{tab_process}.{split_ver}-{chunck_size}".format(**db_config)

## Split train/valid

In [26]:
data_df = pd.read_csv(os.path.join(BASE_DIR,'train.csv'))
test_df = pd.read_csv(os.path.join(BASE_DIR,'test.csv'))

In [27]:
import matplotlib as mpl
from matplotlib import pyplot as plt
import seaborn as sns
import numpy as np

#fig,ax=plt.subplots()
#sns.histplot(data_df['Source'],ax=ax)
#a = len(data_df['Source'].unique())
#ax.plot(range(a),len(data_df)*0.2*np.ones(a),color='#112155')
#ax.plot(range(a),len(data_df)*0.225*np.ones(a),color='#115175')
#ax.plot(range(a),len(data_df)*0.175*np.ones(a),color='#115175')
#pass

In [28]:
import numpy as np

def check_split_possible(dist,ratio,err_ths):
  n_all = dist.sum()
  if dist.iloc[-1] > n_all * (ratio+err_ths) : return False
  cond = dist > n_all * (ratio+err_ths)
  if dist[cond].sum() > n_all*(1-ratio+err_ths) : return False
  else : return True

def split_by_files(data:pd.DataFrame,col,test_size,random_state,err_ths=0.025):
  if col not in data.columns : return False
  nprnd = np.random.RandomState(random_state)
  n_all = len(data)
  dist = data[col].value_counts().sort_values(ascending=False)
  if not check_split_possible(dist,test_size,err_ths): return False
  train_cond = dist > n_all*(test_size-err_ths)
  while dist[train_cond].sum() < n_all*(1-test_size-err_ths):
    cand = nprnd.choice(dist[~train_cond].index,1)[0]
    if dist[train_cond].sum()+dist.loc[cand] > n_all*(1-test_size+err_ths) : continue
    else : train_cond = (dist.index == cand) | train_cond
    # have to make hedge for infinite loop
  train_files,test_files = dist[train_cond].index, dist[~train_cond].index
  print(f'size : {dist[train_cond].sum()}, ratio : {dist[train_cond].sum()/n_all:.4f}, n_files : {len(train_files)}')
  print(f'size : {dist[~train_cond].sum()}, ratio : {dist[~train_cond].sum()/n_all:.4f}, n_files : {len(test_files)}')
  train_cond = data[col].isin(train_files)
  return data[train_cond], data[~train_cond]

In [29]:
from sklearn.model_selection import train_test_split

if split_ver == 'split.0' : train_df, valid_df = data_df, pd.DataFrame(columns=data_df.columns)
elif split_ver == 'split.1' :train_df,valid_df = train_test_split(data_df,test_size=0.2,stratify=data_df.Source,random_state=801)
elif split_ver == 'split.2' :train_df,valid_df = split_by_files(data_df,'Source',test_size=0.2,random_state=801)

size : 396, ratio : 0.7984, n_files : 11
size : 100, ratio : 0.2016, n_files : 5


## Apply Augmentation

In [30]:
ls {BASE_DIR}

241008_csv_checker.ipynb             [0m[01;34mgemma2_financeQA-finetune[0m/  [01;34mtest_source[0m/
combined_train_aug_v3.5_editted.csv  [01;34mprocessed[0m/                  train.csv
combined_train_aug_v3.csv            sample_submission.csv       [01;34mtrain_source[0m/
combined_train_aug_v3_editted.csv    [01;34msub[0m/                        Untitled0.ipynb
[01;34mdata[0m/                                [01;34mtemp[0m/
[01;34meval[0m/                                test.csv


In [31]:
refine_option = False
filter_option = False

file_config = {
    'refined' : refine_option,
    'filter' : filter_option,
}

In [32]:
if refine_option : aug_file,sep = 'combined_train_aug_v3.5_editted.csv','\tab'
else  : aug_file,sep = 'combined_train_aug_v3_editted.csv','|'
#aug_file = 'combined_train_aug_v3.csv'
aug_path = os.path.join(BASE_DIR,aug_file)

In [33]:
ques_dict={
    'NoAug' : 'Question',
    'AugGPT' : 'Question_aug_GPT',
    'AugAEDA' : 'AEDA_Question',
    'GPTOnly' : 'Question_aug_GPT',
}
ans_dict = {
    'NoAug' : 'Answer',
    'AugGPT' : 'Answer',
    'AugAEDA' : 'Answer',
    'GPTOnly' : 'Answer',
}

In [34]:
key_col = 'SAMPLE_ID'
info_col = ['Source', 'Source_path']
ques_base = ques_dict['NoAug']
ans_base = ans_dict['NoAug']
ques_col = ques_dict[aug_type]
ans_col = ans_dict[aug_type]

In [35]:
filter_list = ['TRAIN_451', 'TRAIN_452', 'TRAIN_453', 'TRAIN_454', 'TRAIN_455', 'TRAIN_456']

In [36]:
aug_df = pd.read_csv(aug_path,sep=sep)
aug_df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 496 entries, 0 to 495
Data columns (total 9 columns):
 #   Column            Non-Null Count  Dtype 
---  ------            --------------  ----- 
 0   SAMPLE_ID         496 non-null    object
 1   Source            496 non-null    object
 2   Source_path       496 non-null    object
 3   Question          496 non-null    object
 4   Answer            496 non-null    object
 5   Question_aug_GPT  496 non-null    object
 6   Answer_aug_GPT    496 non-null    object
 7   AEDA_Question     496 non-null    object
 8   AEDA_Answer       496 non-null    object
dtypes: object(9)
memory usage: 35.0+ KB


In [37]:
import numpy as np
train_id = train_df[key_col].values
#print(pd.Series(filter_list).isin(train_id))
cond = (aug_df[key_col].isin(train_id))
if filter_option : cond = cond & (~(aug_df[key_col].isin(filter_list)))
display(aug_df.columns), len(train_id), np.sum(cond)

Index(['SAMPLE_ID', 'Source', 'Source_path', 'Question', 'Answer',
       'Question_aug_GPT', 'Answer_aug_GPT', 'AEDA_Question', 'AEDA_Answer'],
      dtype='object')

(None, 396, 396)

In [38]:
col_list = [key_col]+info_col+[ques_base,ans_base]
train_adjst = aug_df.loc[cond,col_list]
train_df= train_adjst.rename(columns = {ques_col : "Question", ans_col : "Answer"})

In [39]:
if aug_type != 'NoAug':
  col_list = [key_col]+info_col+[ques_col,ans_col]
  aug_train = aug_df.loc[cond,col_list]
  aug_train = aug_train.rename(columns = {ques_col : "Question", ans_col : "Answer"})
  if 'Only' in aug_type : train_augged=aug_train
  else : train_augged= pd.concat([train_df,aug_train])
  train_augged.info()

In [40]:
train_df.info()

<class 'pandas.core.frame.DataFrame'>
Index: 396 entries, 50 to 478
Data columns (total 5 columns):
 #   Column       Non-Null Count  Dtype 
---  ------       --------------  ----- 
 0   SAMPLE_ID    396 non-null    object
 1   Source       396 non-null    object
 2   Source_path  396 non-null    object
 3   Question     396 non-null    object
 4   Answer       396 non-null    object
dtypes: object(5)
memory usage: 18.6+ KB


In [41]:
valid_id = valid_df[key_col].values
cond = (aug_df[key_col].isin(valid_id))
if filter_option : cond = cond & (~(aug_df[key_col].isin(filter_list)))
col_list = [key_col]+info_col+[ques_base,ans_base]
valid_adjst = aug_df.loc[cond,col_list]
valid_df= valid_adjst.rename(columns = {ques_col : "Question", ans_col : "Answer"})

In [42]:
free_cuda()

freed :  30
freed :  0


# DB 생성

In [43]:
temp_path = '/content/processed/src/'
file_dir, os.listdir(file_dir)

('/content/drive/MyDrive/kdt-EST-AI/project/dacon_fis/src/processed/tab_v3.1',
 ['train_source', 'tables', 'test_source'])

In [44]:
src_dirs = list(filter(lambda x : x != 'pdf_db',os.listdir(file_dir)))
file_path = ' '.join([os.path.join(file_dir,sub) for sub in src_dirs])
if not os.path.exists(temp_path) : os.makedirs(temp_path)

In [49]:
!rsync -rvzh {file_path} {temp_path} --bwlimit 4096000000000000 --progress

sending incremental file list
tables/
tables/tab_trn.pkl
          3.21M 100%  168.29MB/s    0:00:00 (xfr#1, to-chk=76/80)
tables/tab_tst.pkl
        250.27K 100%  253.27kB/s    0:00:00 (xfr#2, to-chk=75/80)
test_source/
test_source/err_국토교통부_행복주택출자.pkl
          1.09K 100%    0.00kB/s    0:00:00 (xfr#3, to-chk=74/80)
test_source/err_보건복지부_노인장기요양보험 사업운영.pkl
              5 100%    0.01kB/s    0:00:00 (xfr#4, to-chk=73/80)
test_source/err_보건복지부_부모급여(영아수당) 지원.pkl
              5 100%    0.00kB/s    0:00:00 (xfr#5, to-chk=72/80)
test_source/err_산업통상자원부_에너지바우처.pkl
              5 100%    0.00kB/s    0:00:01 (xfr#6, to-chk=71/80)
test_source/err_중소벤처기업부_혁신창업사업화자금(융자).pkl
              5 100%    0.00kB/s    0:00:00 (xfr#7, to-chk=70/80)
test_source/err_「FIS 이슈 & 포커스」 22-4호 《중앙-지방 간 재정조정제도》.pkl
              5 100%    0.00kB/s    0:00:01 (xfr#8, to-chk=69/80)
te

In [50]:
#use it when table ver < 2
def process_pdf(file_path, chunk_size=256, chunk_overlap=32):
    """PDF 텍스트 추출 후 chunk 단위로 나누기"""
    # PDF 파일 열기
    doc = pymupdf4llm.to_markdown(file_path)

    headers_to_split_on = [
        ("#","Header 1"),
        ("##","Header 2"),
        ("###","Header 3"),
    ]

    md_splitter = MarkdownHeaderTextSplitter(headers_to_split_on=headers_to_split_on, strip_headers=False)
    md_chunks = md_splitter.split_text(doc)

    splitter = RecursiveCharacterTextSplitter(
        chunk_size=chunk_size,
        chunk_overlap=chunk_overlap
    )
    chunks = splitter.split_documents(md_chunks)

    return chunks


def create_vector_db(chunks, model_path="intfloat/multilingual-e5-small"):
    """FAISS DB 생성"""
    # 임베딩 모델 설정
    model_kwargs = {'device': 'cuda'}
    encode_kwargs = {'normalize_embeddings': True}
    embeddings = HuggingFaceEmbeddings(
        model_name=model_path,
        model_kwargs=model_kwargs,
        encode_kwargs=encode_kwargs
    )
    # FAISS DB 생성 및 반환
    db = FAISS.from_documents(chunks, embedding=embeddings)
    return db


#앙상블
def process_pdfs_from_dataframe(df, base_dir, chunk_size=256, model_path = "intfloat/multilingual-e5-small"):
    """딕셔너리에 pdf명을 키로해서 DB, retriever 저장"""
    pdf_databases = {}
    unique_paths = df['Source_path'].unique()

    for file_path in tqdm(unique_paths, desc="Processing PDFs"):
        # 경로 정규화 및 절대 경로 생성
        full_path = process_path(base_dir,file_path)
        full_path = processed_path_matcher(base_dir,full_path)
        pdf_title = os.path.basename(full_path)
        print(f"Processing {pdf_title}...")

        # PDF 처리 및 벡터 DB 생성
        chunks = process_pdf(full_path,chunk_size)
        db = create_vector_db(chunks, model_path=model_path)

        kiwi_bm25_retriever = KiwiBM25Retriever.from_documents(chunks)
        faiss_retriever = db.as_retriever()
        retriever = EnsembleRetriever(
            retrievers=[kiwi_bm25_retriever, faiss_retriever],
            weights=[0.5, 0.5],
            search_type="mmr",
        )

        # 결과 저장
        pdf_databases[pdf_title] = {
                'db': db,
                'retriever': retriever
        }
    return pdf_databases


In [45]:
if float(tab_ver[5:]) >= 2:
  pkl_dir = os.path.join(temp_path,'tables')
  tab_dict_trn = load_pkl(os.path.join(pkl_dir,'tab_trn.pkl'))
  tab_dict_tst= load_pkl(os.path.join(pkl_dir,'tab_tst.pkl'))

In [46]:
tab_word, tab_ver

('!표{0}!', 'tab_v3.1')

In [53]:
if float(tab_ver[5:]) >= 2:
  train_db, detect_rate = process_pdfs_from_df(train_df, temp_path, tab_dict_trn, tab_word, chunk_size=chunk_size, model_path=model_path)
else : train_db = process_pdfs_from_dataframe(train_df, temp_path, chunk_size=chunk_size, model_path=model_path)

Processing PDFs:   0%|          | 0/11 [00:00<?, ?it/s]

Processing 2024 나라살림 예산개요.pdf...
Processing /content/processed/src/train_source/2024 나라살림 예산개요.pdf...
/content/processed/src/train_source/2024 나라살림 예산개요.pdf
table mark detect rate : 0.43074


  embeddings = HuggingFaceEmbeddings(
The secret `HF_TOKEN` does not exist in your Colab secrets.
To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.
You will be able to reuse this secret in all of your notebooks.
Please note that authentication is recommended but still optional to access public models or datasets.


modules.json:   0%|          | 0.00/387 [00:00<?, ?B/s]

README.md:   0%|          | 0.00/160k [00:00<?, ?B/s]

sentence_bert_config.json:   0%|          | 0.00/57.0 [00:00<?, ?B/s]

config.json:   0%|          | 0.00/690 [00:00<?, ?B/s]

model.safetensors:   0%|          | 0.00/2.24G [00:00<?, ?B/s]

tokenizer_config.json:   0%|          | 0.00/418 [00:00<?, ?B/s]

sentencepiece.bpe.model:   0%|          | 0.00/5.07M [00:00<?, ?B/s]

tokenizer.json:   0%|          | 0.00/17.1M [00:00<?, ?B/s]

special_tokens_map.json:   0%|          | 0.00/280 [00:00<?, ?B/s]

1_Pooling/config.json:   0%|          | 0.00/201 [00:00<?, ?B/s]

Processing 재정통계해설.pdf...
Processing /content/processed/src/train_source/재정통계해설.pdf...
/content/processed/src/train_source/재정통계해설.pdf
table mark detect rate : 0.82308
Processing 국토교통부_전세임대(융자).pdf...
Processing /content/processed/src/train_source/국토교통부_전세임대(융자).pdf...
/content/processed/src/train_source/국토교통부_전세임대(융자).pdf
table mark detect rate : 0.63636
Processing 고용노동부_청년일자리창출지원.pdf...
Processing /content/processed/src/train_source/고용노동부_청년일자리창출지원.pdf...
/content/processed/src/train_source/고용노동부_청년일자리창출지원.pdf
table mark detect rate : 0.71429
Processing 보건복지부_노인일자리 및 사회활동지원.pdf...
Processing /content/processed/src/train_source/보건복지부_노인일자리 및 사회활동지원.pdf...
/content/processed/src/train_source/보건복지부_노인일자리 및 사회활동지원.pdf
table mark detect rate : 1.00000
Processing 국토교통부_소ᄀ

In [54]:
free_cuda()

freed :  20
freed :  0


In [55]:
if split_ver == 'split.2':
  if float(tab_ver[5:]) >= 2:
    valid_db, detect_rate = process_pdfs_from_df(valid_df, temp_path, tab_dict_trn, tab_word, chunk_size=chunk_size, model_path=model_path)
  else : valid_db = process_pdfs_from_dataframe(valid_df, temp_path, chunk_size=chunk_size, model_path=model_path)
else : valid_db = train_db

Processing PDFs:   0%|          | 0/5 [00:00<?, ?it/s]

Processing 1-1 2024 주요 재정통계 1권.pdf...
Processing /content/processed/src/train_source/1-1 2024 주요 재정통계 1권.pdf...
/content/processed/src/train_source/1-1 2024 주요 재정통계 1권.pdf
table mark detect rate : 0.82564
Processing 고용노동부_내일배움카드(일반).pdf...
Processing /content/processed/src/train_source/고용노동부_내일배움카드(일반).pdf...
/content/processed/src/train_source/고용노동부_내일배움카드(일반).pdf
table mark detect rate : 0.83333
Processing 중소벤처기업부_창업사업화지원.pdf...
Processing /content/processed/src/train_source/중소벤처기업부_창업사업화지원.pdf...
/content/processed/src/train_source/중소벤처기업부_창업사업화지원.pdf
table mark detect rate : 0.80000
Processing 보건복지부_생계급여.pdf...
Processing /content/processed/src/train_source/보건복지부_생계급여.pdf...
/content/processed/src/train_source/보건복지부_생계급여.pdf
table mark detect rate : 0.50000
Processing 월간 나라재정 2023년 12ᄋ

In [47]:
aug_type

'NoAug'

In [48]:
if float(tab_ver[5:]) >= 2:
  test_db, detect_rate = process_pdfs_from_df(test_df, temp_path, tab_dict_tst, tab_word[1:-1], chunk_size=chunk_size, model_path=model_path)
else : test_db  = process_pdfs_from_dataframe(test_df, temp_path, chunk_size=chunk_size, model_path=model_path)

Processing PDFs:   0%|          | 0/9 [00:00<?, ?it/s]

Processing 중소벤처기업부_혁신창업사업화자금(융자).pdf...
Processing /content/processed/src/test_source/중소벤처기업부_혁신창업사업화자금(융자).pdf...
/content/processed/src/test_source/중소벤처기업부_혁신창업사업화자금(융자).pdf
table mark detect rate : 0.60000


  embeddings = HuggingFaceEmbeddings(
The secret `HF_TOKEN` does not exist in your Colab secrets.
To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.
You will be able to reuse this secret in all of your notebooks.
Please note that authentication is recommended but still optional to access public models or datasets.


Processing 보건복지부_부모급여(영아수당) 지원.pdf...
Processing /content/processed/src/test_source/보건복지부_부모급여(영아수당) 지원.pdf...
/content/processed/src/test_source/보건복지부_부모급여(영아수당) 지원.pdf
table mark detect rate : 0.60000
Processing 보건복지부_노인장기요양보험 사업운영.pdf...
Processing /content/processed/src/test_source/보건복지부_노인장기요양보험 사업운영.pdf...
/content/processed/src/test_source/보건복지부_노인장기요양보험 사업운영.pdf
table mark detect rate : 0.83333
Processing 산업통상자원부_에너지바우처.pdf...
Processing /content/processed/src/test_source/산업통상자원부_에너지바우처.pdf...
/content/processed/src/test_source/산업통상자원부_에너지바우처.pdf
table mark detect rate : 0.76190
Processing 국토교통부_행복주택출자.pdf...
Processing /content/processed/src/test_source/국토교통부_행복주택출자.pdf...
/content/processed/src/test_source/국토교통부_행복주택출자.pdf
table mark 

In [None]:
file_dir

In [None]:
db_name

In [None]:
#db_path = os.path.join('/content','pdf_db')
db_path = os.path.join(file_dir,'pdf_db')
#save_pkl(db_path, f'{db_name}_train.dat',train_db)
#save_pkl(db_path, f'{db_name}_test.dat',test_db)

In [None]:
os.listdir(db_path)

In [None]:
train_db_name = 'large-ensemble-tab_v1.7-256_train.dat'
test_db_name = 'large-ensemble-tab_v1.7-256_test.dat'
train_db_path = os.path.join(db_path,train_db_name)
test_db_path = os.path.join(db_path,test_db_name)

In [None]:
file_path = ' '.join([#train_db_path,
            test_db_path
                      ])
temp_path = '/content/pdf_db'
if not os.path.exists(temp_path) : os.makedirs(temp_path)

In [None]:
!rsync -vzh {file_path} {temp_path} --bwlimit 4096000000000000 --progress

In [None]:
#train_db = load_pkl(os.path.join(temp_path,train_db_name))
test_db = load_pkl(os.path.join(temp_path,test_db_name))

# Create Dataset

In [49]:
def normalize_string(s):
    """유니코드 정규화"""
    return unicodedata.normalize('NFC', s)

def format_docs(docs):
    """검색된 문서들을 하나의 문자열로 포맷팅"""
    context = ""
    for doc in docs:
        context += doc.page_content
        context += '\n'
    return context

def make_dataset(df, pdf_databases):
    dataset = dict()
    dataset['context'] = list()
    dataset['question'] = list()
    dataset['answer'] = list()
    normalized_keys = {normalize_string(k): v for k, v in pdf_databases.items()}

    for _, row in tqdm(df.iterrows(), total=len(df), desc="Making"):
        # 소스 문자열 정규화
        source = normalize_string(row['Source'])+'.pdf'
        question = row['Question']
        dataset['question'].append(question)
        if 'Answer' in df.columns:
          dataset['answer'].append(row['Answer'])
        else: dataset['answer'].append('')

        # 정규화된 키로 데이터베이스 검색
        retriever = normalized_keys[source]['retriever']
        context = format_docs(retriever.invoke(question))
        dataset['context'].append(context)
    return dataset


# Dataset

In [50]:
if aug_type != 'NoAug':
  train_df = train_augged

In [51]:
if (refine_option, filter_option) == (False, False) : prefix = ''
elif (refine_option, filter_option) == (False, True) : prefix = 'filtered'
elif (refine_option, filter_option) == (True, False) : prefix = 'refined0'
elif (refine_option, filter_option) == (True, True) : prefix = 'refined'

In [52]:
dataset_name = "kdt3/DACON-QA-{model}-ensemble-{tab_process}.{split_ver}-{prefix}{aug}-{chunck_size}".format(prefix=prefix,**db_config)
train_name = "kdt3/DACON-QA-{model}-ensemble-{tab_process}.{split_ver}-{prefix}{aug}-{chunck_size}".format(prefix=prefix,**db_config)
#fname = "gemma2_large_ensemble_markdown_256_5epoch_reprocessed_result.csv"

push_url = dataset_name
push_url

'kdt3/DACON-QA-large-ensemble-tab_v3.1.split.2-NoAug-512'

In [None]:
## 만약 데이터셋을 분할해서 업로드해줘야할 경우 합치는 방법 참조 코드
from datasets import load_dataset, concatenate_datasets
from datasets import Dataset

train_dataset = load_dataset(dataset_name)['train']

train_dataset = concatenate_datasets([train_dataset, Dataset.from_dict(make_dataset(train_df.iloc[296:], train_db))])
train_dataset.push_to_hub(dataset_name, private=True, split='train')


## Train 데이터 생성 & 업로드

In [60]:
from datasets import Dataset
train_dataset = make_dataset(train_df, train_db)
train_dataset = Dataset.from_dict(train_dataset)
train_dataset.push_to_hub(push_url, private=True, split='train')


Making:   0%|          | 0/396 [00:00<?, ?it/s]

Uploading the dataset shards:   0%|          | 0/1 [00:00<?, ?it/s]

Creating parquet from Arrow format:   0%|          | 0/1 [00:00<?, ?ba/s]

CommitInfo(commit_url='https://huggingface.co/datasets/kdt3/DACON-QA-large-ensemble-tab_v3.1.split.2-NoAug-512/commit/0ae152851d854fffa6c71e373e564c2133bab448', commit_message='Upload dataset', commit_description='', oid='0ae152851d854fffa6c71e373e564c2133bab448', pr_url=None, repo_url=RepoUrl('https://huggingface.co/datasets/kdt3/DACON-QA-large-ensemble-tab_v3.1.split.2-NoAug-512', endpoint='https://huggingface.co', repo_type='dataset', repo_id='kdt3/DACON-QA-large-ensemble-tab_v3.1.split.2-NoAug-512'), pr_revision=None, pr_num=None)

## Valid 데이터 생성 & 업로드

In [61]:
from datasets import Dataset
valid_dataset = make_dataset(valid_df, valid_db)
valid_dataset = Dataset.from_dict(valid_dataset)
valid_dataset.push_to_hub(push_url, private=True, split='valid')

Making:   0%|          | 0/100 [00:00<?, ?it/s]

Uploading the dataset shards:   0%|          | 0/1 [00:00<?, ?it/s]

Creating parquet from Arrow format:   0%|          | 0/1 [00:00<?, ?ba/s]

README.md:   0%|          | 0.00/347 [00:00<?, ?B/s]

CommitInfo(commit_url='https://huggingface.co/datasets/kdt3/DACON-QA-large-ensemble-tab_v3.1.split.2-NoAug-512/commit/738c5c0364d5c23148ab98f61c5d1da9f9f2f510', commit_message='Upload dataset', commit_description='', oid='738c5c0364d5c23148ab98f61c5d1da9f9f2f510', pr_url=None, repo_url=RepoUrl('https://huggingface.co/datasets/kdt3/DACON-QA-large-ensemble-tab_v3.1.split.2-NoAug-512', endpoint='https://huggingface.co', repo_type='dataset', repo_id='kdt3/DACON-QA-large-ensemble-tab_v3.1.split.2-NoAug-512'), pr_revision=None, pr_num=None)

## Test 데이터 생성 & 업로드

In [53]:
from datasets import Dataset
test_dataset = make_dataset(test_df, test_db)
test_dataset = Dataset.from_dict(test_dataset)
test_dataset.push_to_hub(push_url, private=True, split='test')

Making:   0%|          | 0/98 [00:00<?, ?it/s]

Uploading the dataset shards:   0%|          | 0/1 [00:00<?, ?it/s]

Creating parquet from Arrow format:   0%|          | 0/1 [00:00<?, ?ba/s]

README.md:   0%|          | 0.00/447 [00:00<?, ?B/s]

CommitInfo(commit_url='https://huggingface.co/datasets/kdt3/DACON-QA-large-ensemble-tab_v3.1.split.2-NoAug-512/commit/23b951ff8468e25232cd3005ca5cad78a254ce26', commit_message='Upload dataset', commit_description='', oid='23b951ff8468e25232cd3005ca5cad78a254ce26', pr_url=None, repo_url=RepoUrl('https://huggingface.co/datasets/kdt3/DACON-QA-large-ensemble-tab_v3.1.split.2-NoAug-512', endpoint='https://huggingface.co', repo_type='dataset', repo_id='kdt3/DACON-QA-large-ensemble-tab_v3.1.split.2-NoAug-512'), pr_revision=None, pr_num=None)

# debug
-

In [None]:
def get_tables_from_pdf(full_path,tab_word='[[TABLE_{0}]]'):
    pdf = pymupdf.open(full_path)
    chunks, tables_dict, cnt= list(), defaultdict(list),0
    err_dict = dict()
    if pdf is None : return None, tables_dict
    for pnum, page in enumerate(tqdm(pdf)):
      if pnum < 7 : continue
      return page
      page_area = tuple(page.mediabox)
      try:
        tab_rslt = search_page(page,page_area,get_ths(page_area))
        bboxes, errors = tab_rslt['tabs'],tab_rslt['errs']
        tables,doc =list(), PyMuPDFPage(page)
        formatter = define_formatter()
        for box in bboxes:
          table = make_table(box['bbox'],doc,page_area,formatter,get_ths(page_area))
          if table is None :
            print('error : ',box['bbox'])
            continue
          if type(table) is not dict : errors.append(table)
          else :
            table['depth'] = box['depth']
            tables.append(table)

      except:
        save_pkl(ERRORDIR,'page_doc_error,pkl',[full_path,pnum,page_area])
        raise Exception()
      if len(errors)>0 :
        err_dict[pnum] = errors
        print('errs : ',len(errors))
      if len(tables) == 0 : continue
      print(f'detected in p.{pnum} :\t',len(tables),' tables')
      tables = sorted(tables,key=lambda x : (x['bbox'][0],x['bbox'][1]))
      for idx,tab in enumerate(tables):
        tab_mark = tab_word.format(cnt+idx)
        table_md = clean_table(tab['content'].to_markdown(index=False))
        tables_dict[pnum].append((tab_mark,table_md + f"\n{tab['caption']}"))

        try :
          area = tab['bbox']
          page.add_redact_annot(area)
          page.apply_redactions()
          page.draw_rect(area,color=(.0,0,0),fill=(.99,.99,.99))
          rc = page.insert_htmlbox(area,tab_mark,scale_low=0)
        except :
          print(page.mediabox)
          print(page.cropbox)
          print(tab['bbox'])
          display(tab['content'])
          print(table_md)

      cnt+=len(tables)

    print(cnt, len(tables_dict), sum([len(tables) for tables in tables_dict.values()]))
    return pdf, tables_dict,err_dict




In [None]:
s = '''
index,,
0,"02 재정지출","104"
1,"1. 국가별","104"
2,"2. 기능별 재정지출","106"
3,"3. 공공 사회 복지 지출","107"
4,"03 재정건전성","108"
5,"1. 재정수지","108"
6,"(1) 재정수지","108"
7,"(2) 기초 재정수지","109"
8,"2. 부채","110"
9,"(1) 부채","110"
10,"(2) 1인당 부채","111"
11,"주요재정통계","113"
12,"01 총수입･국세수입","116"
13,"02 조세부담률 및 국민부담률","118"
14,"03 총지출","120"
15,"04 분야별 재정지출","122"
16,"05 의무지출･재량지출 통합재정수지･국가채무","124"
17,"06","126"
18,"07 지방재정조정","126"
'''
import sys
if sys.version_info[0] < 3:
    from StringIO import StringIO
else:
    from io import StringIO

content = StringIO(s)
df = pd.read_table(content, sep=",",index_col=0)
display(df.info())
check_table_df_soundness(df)

In [None]:
df,base_path,save_dir = train_df, temp_path,PROCESSEDDIR
path = './train_source/1-1 2024 주요 재정통계 1권.pdf'
tab_rslt,err_rslt=extract_table_and_pdf(path,base_path,save_dir)

In [None]:
base_dir = temp_path
file_path = './train_source/1-1 2024 주요 재정통계 1권.pdf'
full_path = process_path(base_dir,file_path)
full_path = processed_path_matcher(base_dir,full_path)

page = get_tables_from_pdf(full_path)

In [None]:
def search_page(page,area,ths=0,depth=0):
  if not larger_v_ths(area,ths) : return defaultdict(list)
  if depth >=10 : raise Exception(depth)
#  print('depth : ',depth,'area : ',area)
  page.set_cropbox(area) #area : page에서 절대적 위치. cropbox를 하게 되면 상대적 위치로 바뀜
  rslt,searched = defaultdict(list),list()
#  set_trace()
  detected = find_tables(page,get_page_size(area),ths)
#  return detected
  if len(detected)==0 : return rslt
  for target in detected:
    if target is None : continue
    if type(target) is not dict : rslt['errs'].append(target)
    rect = infer_bbox_pos(area,target['bbox'])
    bbox = {'bbox':rect,'depth':depth}
    rslt['tabs'].append(bbox)
    print('detected:',rect,'\t at',area,f' in depth {depth}')

    left_area = get_area(rect,area,True)
    right_area = get_area(rect,area,False)
    if depth > 5 : print(f'left {left_area}\tright{right_area}')
    left = search_page(page,left_area,ths,depth+1)
    right = search_page(page,right_area,ths,depth+1)
    searched.append((area[0],rect[1],area[2],rect[3]))
    if depth > 5 : print(searched)
    extend_list_dict(rslt,left)
    extend_list_dict(rslt,right)

  if len(rslt)==0 : return rslt
  searched = organize_box(searched,area,ths)
  blanks = get_blank(searched,area,ths)
  if depth >5 : print(f'in {area}\n\t',blanks)
  for row in blanks:
    detected = search_page(page,row,ths,depth+1)
    extend_list_dict(rslt,detected)

#  rslt= organize_box(rslt,area,ths) : can't apply directly like this
#  print('in search page, err : ',len(rslt['errs']))
  page.set_cropbox(area)
  return rslt

In [None]:
search_page(page,page.mediabox)

In [None]:
#check_table_df_soundness(detected[0])
df_temp = detected[0].replace(to_replace=[None], value=0).astype(str)
df_temp = df_temp.fillna('').astype(str)
val = '0'
cols =range(len(df_temp.columns))
df_temp.columns =cols
cond = df_temp == val
for col in cols:
  display(np.sum(cond[col]))
#target = list(filter(lambda col : np.sum(cond[col]) != len(cond),cols))
#erase_constant_rowcol(df_temp,'0')

In [None]:
ls {PROCESSEDDIR}/train_source

In [None]:
errs = load_pkl(os.path.join(PROCESSEDDIR,'train_source','err_1-1 2024 주요 재정통계 1권.pkl'))
len(errs)

In [None]:
errs[6]

In [None]:
def check_table_df_soundness(df,ths=0.5):
  if len(df) < 1 : return False
  df_temp = df.replace(to_replace=None, value=0).astype(str)
  if maybe_numeric_table(df_temp) : return True
  df_temp = df.replace(to_replace=None, value='PD_NONE')
  cols = list(df.columns.astype(np.string_))
  cols = list(map(lambda x : '' if x is None else str(x),cols))
  cols = list(map(lambda x : re.sub(r'[\s]*','',x),cols))
  null_col = list(filter(lambda x: len(x)<1,cols))
  if len(null_col) > len(cols) * ths : return False
  if np.sum(df_temp=='PD_NONE') > len(df)*len(cols)*ths : return False
  return True


def maybe_numeric_table(df,ths=0.35):
  df_temp = df.fillna('').astype(str)
  df_temp = df.replace(r'(?:(\d+?)),(\d+?)',r'\1\2',regex=True)
  df_temp = df_temp[list(df_temp.columns)].apply(pd.to_numeric,errors='coerce')
  rslt = ~df_temp.isna()
  rowwise = rslt.apply(sum,axis=1)
  colwise = rslt.apply(sum,axis=0)
  if np.sum(rowwise == len(rslt)) > 0 : return True
  if np.sum(colwise== len(rslt.columns)) > 0 : return True
  if np.sum(rslt.values) > len(rslt)*len(rslt.columns)*ths : return True
  return False

In [None]:
bad = [{"index":0,"":"2018년부터 발간한 ｢주요 재정통계｣는 디지털예산회계시스템의 재정정보를"},{"index":1,"":"계에 따른 시계열 통계로 구성･제공하고 있습니다. 아울러 중앙-지방정부 등 여러"},{"index":2,"":"산재되어 있는 재정통계를 수록하여 재정분석의 기초자료로 활용될 수 있도록"},{"index":3,"":"다. 또한 단순 정보전달에 그치지 않고, 일반 국민도 쉽게 이해할 수 있도록 재정통계의 가독성을 높이고자 노력하였습니다."},{"index":4,"":"특히, 올해부터는 한권으로 제공되던 ｢주요 재정통계｣를 Ⅰ, Ⅱ 권으로 분권하여,"},{"index":5,"":"편의성을 제고하고자 하였습니다. <제Ⅰ권>에서 우리나라의 재정체계, 주요 재정지표"},{"index":6,"":"재정 전반에 대한 이해를 돕기 위해 주요 재정 제도의 설명과 함께 관련 통계를 수록하였고,"},{"index":7,"":"OECD, IMF 회원국 간 재정 동향을 비교･분석할 수 있는 통계로 구성하였습니다."},{"index":8,"":"<제Ⅱ권>에서는 예산체계에 따른 16대 분야별 재정 구조와 추이, 사업유형별 주요사업 정보를 담아 각 분야별 재정지출에 대한 이해도를 높일 수 있도록 구성하였습니다. 이와 함께, 부록에서는 국내 핵심 재정 통계를 선정하여 10년 이상의 장기 시계열"},{"index":9,"":"표를 추가 제공하였습니다."},{"index":10,"":"한국재정정보원은 ｢2024 주요 재정통계｣가 국민들의 재정에 대한 이해도를"},{"index":11,"":"재정당국에게는 재정정책 수립의 기초자료로 활용되길 바랍니다. 앞으로 시의적절함과"},{"index":12,"":"동시에 정합성을 담보한 다양한 재정자료를 지속적으로 제공할 수 있도록 최선을 습니다."},{"index":13,"":"2024년"}]
df = pd.DataFrame(bad)
df_temp = df.replace(to_replace=[None], value=0).astype(str)
df_temp = df_temp.fillna('').astype(str)
df_temp = df.replace(r'(?:(\d+?)),(\d+?)',r'\1\2',regex=True)
df_temp = df_temp[list(df_temp.columns)].apply(pd.to_numeric,errors='coerce')
rslt = ~df_temp.isna()
rowwise = rslt.apply(sum,axis=1)
colwise = rslt.apply(sum,axis=0)

In [None]:
np.sum(rowwise)

In [None]:
cols

In [None]:
rslt

In [None]:
np.sum(rslt)

In [None]:
df_temp = df.replace(to_replace=[None], value=0).astype(str)
if maybe_numeric_table(df_temp) : print(True)

In [None]:
file_path