# Document Conversion - Custom settings

## Getting started

The [Deep Search Toolkit](https://ds4sd.github.io/deepsearch-toolkit/) converts documents in a few lines of code. For more information or a step-by-step guide:
- Visit https://ds4sd.github.io/deepsearch-toolkit/guide/convert_doc/
- Follow this example notebook

### Set notebook parameters

In [1]:
from dsnotebooks.settings import ProjectNotebookSettings

# notebook settings auto-loaded from .env / env vars
notebook_settings = ProjectNotebookSettings()

PROFILE_NAME = notebook_settings.profile  # the profile to use
PROJ_KEY = notebook_settings.proj_key     # the project to use
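
The settings object above reads its values from a `.env` file or from environment variables. As a rough illustration of that pattern only, the sketch below resolves two values from the environment with plain `os.environ`. The variable names `EXAMPLE_PROFILE` and `EXAMPLE_PROJ_KEY`, the `ExampleSettings` class, and the `load_example_settings` helper are all hypothetical; they are not the names `dsnotebooks` uses.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass
class ExampleSettings:
    """Hypothetical stand-in for a settings object; not the dsnotebooks class."""

    profile: Optional[str]  # None means "no explicit profile was configured"
    proj_key: str


def load_example_settings(env: Mapping[str, str] = os.environ) -> ExampleSettings:
    """Build settings from a mapping of environment variables (names are assumed)."""
    proj_key = env.get("EXAMPLE_PROJ_KEY")
    if not proj_key:
        # Fail early with a clear message instead of passing an empty key along
        raise ValueError("EXAMPLE_PROJ_KEY must be set")
    # Treat an empty or missing profile as "not configured"
    profile = env.get("EXAMPLE_PROFILE") or None
    return ExampleSettings(profile=profile, proj_key=proj_key)
```

Passing the mapping as a parameter keeps the helper easy to exercise without touching the real environment, for example `load_example_settings({"EXAMPLE_PROJ_KEY": "abc"})`.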

### Import example dependencies

In [2]:
import deepsearch as ds
from deepsearch.documents.core.models import (
    ConversionSettings,
    DefaultConversionModel,
    OCRSettings,
    ProjectConversionModel,
)

### Connect to Deep Search

In [3]:
api = ds.CpsApi.from_env(profile_name=PROFILE_NAME)

### Convert with custom settings

In [4]:
# Modify conversion pipeline
cs = ConversionSettings.from_project(api, proj_key=PROJ_KEY)

# OCR
cs.ocr.enabled = True  # enable or disable OCR
# cs.ocr.merge_mode = "prioritize-ocr"  # pick how OCR cells are treated when mixed with programmatic content

# backends = OCRSettings.get_backends(api)  # list OCR backends
cs.ocr.backend = "alpine-ocr"  # pick OCR backend

documents = ds.convert_documents(
    api=api,
    proj_key=PROJ_KEY,
    source_path="../../data/samples/2206.01062.pdf",
    conversion_settings=cs,
    progress_bar=True
)
documents.download_all(result_dir="./converted_docs")
info = documents.generate_report(result_dir="./converted_docs")
print(info)

Processing input:     : 100%|██████████████████████████████| 1/1 [00:00<00:00, 40.40it/s]
Submitting input:     : 100%|██████████████████████████████| 1/1 [00:15<00:00, 15.28s/it]
Converting input:     : 100%|██████████████████████████████| 1/1 [00:41<00:00, 41.05s/it]

{'Total documents': 1, 'Successfully converted documents': 1}
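
To look at what was downloaded, the sketch below lists the JSON files inside each zip archive under the result directory. It assumes that `download_all` leaves one or more `.zip` archives in `./converted_docs` and that each archive holds the converted documents as `.json` files. The helper name `list_converted_json` is hypothetical and is not part of the toolkit.

```python
from pathlib import Path
from typing import Dict, List
from zipfile import ZipFile


def list_converted_json(result_dir: str) -> Dict[str, List[str]]:
    """Map each zip archive under result_dir to the JSON member names it contains."""
    found: Dict[str, List[str]] = {}
    # Search recursively, in case archives are placed in subdirectories
    for archive_path in sorted(Path(result_dir).rglob("*.zip")):
        with ZipFile(archive_path) as archive:
            found[archive_path.name] = sorted(
                name for name in archive.namelist() if name.endswith(".json")
            )
    return found
```

Calling `list_converted_json("./converted_docs")` after the conversion cell would return a dictionary keyed by archive file name. An individual member can then be read with `json.load(archive.open(name))` from the standard library.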
