# Making Neural Search Queries Accessible to Everyone with Gradio — Deploying Haystack’s Semantic Document Search with Hugging Face models in Gradio in Three Easy Steps

This is the code accompanying this Medium blog [article](https://medium.com/@duerr.sebastian/making-neural-search-queries-accessible-to-everyone-with-gradio-haystack-726e77aca047).

In [4]:
import gradio as gr

from haystack.nodes import FARMReader, PreProcessor, PDFToTextConverter, DensePassageRetriever
from haystack.nodes import ElasticsearchRetriever
from haystack.document_stores import ElasticsearchDocumentStore
from haystack.utils import launch_es
from haystack.pipelines import ExtractiveQAPipeline

In [2]:
preprocessor = PreProcessor(
    clean_empty_lines=True,
    clean_whitespace=True,
    clean_header_footer=True,
    split_by="word",
    split_length=100,
    split_respect_sentence_boundary=True,
    split_overlap=3
)

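def print_answers(results):
    """Return (not print) the answer text and score of each prediction as a list of dicts."""
    fields = ["answer", "score"]  # add "context" to also return the surrounding passage
    answers = results["answers"]
    filtered_answers = []

    for ans in answers:
        # Keep only the selected fields that are actually set on this answer.
        filtered_ans = {
            field: getattr(ans, field)
            for field in fields
            if getattr(ans, field) is not None
        }
        filtered_answers.append(filtered_ans)

    return filtered_answers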

def run_once(f):
    """Decorator: run f on the first call only; later calls do nothing and return None."""
    def wrapper(*args, **kwargs):
        if not wrapper.has_run:
            wrapper.has_run = True
            return f(*args, **kwargs)
    wrapper.has_run = False
    return wrapper

reader = FARMReader(model_name_or_path="deepset/roberta-base-squad2")

INFO - haystack.modeling.utils -  Using devices: CPU
INFO - haystack.modeling.utils -  Number of GPUs: 0
INFO - haystack.modeling.model.language_model -  LOADING MODEL
INFO - haystack.modeling.model.language_model -  Could not find deepset/roberta-base-squad2 locally.
INFO - haystack.modeling.model.language_model -  Looking on Transformers Model Hub (in local cache and online)...
INFO - haystack.modeling.model.language_model -  Loaded deepset/roberta-base-squad2
INFO - haystack.modeling.logger -  ML Logging is turned off. No parameters, metrics or artifacts will be logged to MLFlow.
INFO - haystack.modeling.utils -  Using devices: CPU
INFO - haystack.modeling.utils -  Number of GPUs: 0
INFO - haystack.modeling.infer -  Got ya 9 parallel workers to do inference ...
INFO - haystack.modeling.infer -   0     0     0     0     0     0     0     0     0  
INFO - haystack.modeling.infer -  /w\   /w\   /w\   /w\   /w\   /w\   /w\   /|\  /w\ 
INFO - haystack.modeling.infer -  /'\   / \   /'\   
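To see what `print_answers` hands back to Gradio without loading a model, the sketch below runs the same filtering logic on stand-in objects. `FakeAnswer`, `filter_answers`, and the sample values are hypothetical and exist only for illustration; Haystack's real answer objects carry more fields.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FakeAnswer:
    """Hypothetical stand-in for a Haystack answer object (illustration only)."""
    answer: Optional[str]
    score: Optional[float]
    context: Optional[str] = None


def filter_answers(results, fields=("answer", "score")):
    """Same logic as print_answers: keep only the requested fields that are not None."""
    return [
        {f: getattr(ans, f) for f in fields if getattr(ans, f) is not None}
        for ans in results["answers"]
    ]


# Made-up predictions; the second one mimics a result with no answer text.
fake_results = {
    "answers": [
        FakeAnswer(answer="expand into new markets", score=0.91, context="..."),
        FakeAnswer(answer=None, score=0.12),
    ]
}

filtered = filter_answers(fake_results)
print(filtered)
```

Fields that are `None` are dropped entirely, so a prediction without answer text shows up as a dict containing only its score.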

# Sparse Passage Retriever
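
Sparse retrieval ranks passages by term overlap with the query rather than by learned embeddings. Elasticsearch scores with BM25 by default; the sketch below is a simplified, pure-Python version of that scoring idea, not Elasticsearch's or Haystack's implementation. The toy passages, the whitespace tokenizer, the function name `bm25_scores`, and the parameter values `k1=1.2` and `b=0.75` are assumptions for illustration.

```python
import math
from collections import Counter


def bm25_scores(query, passages, k1=1.2, b=0.75):
    """Score each passage against the query with a simplified BM25 formula."""
    tokenized = [p.lower().split() for p in passages]
    n = len(tokenized)
    avg_len = sum(len(tokens) for tokens in tokenized) / n
    # Document frequency: in how many passages does each term occur?
    df = Counter(term for tokens in tokenized for term in set(tokens))

    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue  # terms absent from the passage contribute nothing
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            # Length normalization: long passages need more matches to score well.
            norm = tf[term] + k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


passages = [
    "our strategic initiatives focus on digital growth",
    "the weather was mild in the second quarter",
    "revenue increased due to strategic pricing",
]
scores = bm25_scores("strategic initiatives", passages)
best = max(range(len(passages)), key=scores.__getitem__)
print(passages[best])
```

A passage that shares no terms with the query scores zero, which is the main limitation of sparse retrieval compared with dense, embedding-based retrievers.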

In [6]:
launch_es()
document_store = ElasticsearchDocumentStore(host="localhost", username="", password="", index="document")

In [7]:
retriever_es = ElasticsearchRetriever(document_store=document_store)

pipe = ExtractiveQAPipeline(reader, retriever_es)

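@run_once
def written_document(pdf_file):
    """Convert the uploaded PDF to text, split it, and index it in Elasticsearch.

    Because of @run_once, only the first uploaded file is indexed while the process runs.
    """
    converter = PDFToTextConverter(remove_numeric_tables=True, valid_languages=["en"])
    document = [converter.convert(file_path=pdf_file.name, meta=None)[0]]
    preprocessed_docs = preprocessor.process(document)
    document_store.write_documents(preprocessed_docs)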
    

def predict(question, pdf_file):
    written_document(pdf_file)
    result = pipe.run(query=question, params={
        "Retriever": {"top_k": 20}, "Reader": {"top_k": 5}})
    answers = print_answers(result)
    return answers


title = "Search PDF Business Reports with Sparse Passage Retrieval"
description = """
<center>Sample Questions: What are strategic initiatives? </center>
"""

iface = gr.Interface(
    fn=predict,
    inputs=[
        gr.inputs.Textbox(lines=3, label="Ask an open question!"),
        gr.inputs.File(file_count="single", type="file", label="Upload a pdf"),
    ],
    outputs="text",
    title=title,
    description=description,
    flagging_options=["top", "medium", "bad"],
    interpretation="default",
    theme="dark-grass",  # "default", "huggingface", "dark-grass", "peach"
)

iface.launch(
    # share=True,
    # auth=("admin", "pass1234"),
    # enable_queue=True # cannot be enabled with auth enabled
)

Running on local URL:  http://127.0.0.1:7860/

To create a public link, set `share=True` in `launch()`.


(<fastapi.applications.FastAPI at 0x106e74eb0>, 'http://127.0.0.1:7860/', None)
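
One consequence of wrapping `written_document` in `run_once` is worth seeing in isolation: only the first call does any work, so a second uploaded PDF is never indexed. The sketch below reproduces the decorator from the notebook and applies it to a hypothetical `index_file` function; the `indexed` list and the file names are stand-ins for the document store and real uploads.

```python
def run_once(f):
    """Same decorator as in the notebook: call f on the first invocation only."""
    def wrapper(*args, **kwargs):
        if not wrapper.has_run:
            wrapper.has_run = True
            return f(*args, **kwargs)
    wrapper.has_run = False
    return wrapper


indexed = []  # stand-in for the document store (illustration only)


@run_once
def index_file(name):
    """Hypothetical replacement for written_document."""
    indexed.append(name)
    return name


first = index_file("report_2020.pdf")   # runs and indexes the file
second = index_file("report_2021.pdf")  # skipped: returns None, nothing is indexed
print(first, second, indexed)
```

If every upload should be searchable, one option is to remember which file names have already been indexed instead of using a single flag.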

pdftotext version 4.03 [www.xpdfreader.com]
Copyright 1996-2021 Glyph & Cog, LLC
100%|██████████| 1/1 [00:00<00:00,  6.36docs/s]
INFO - haystack.modeling.model.optimization -  apex not found, won't use it. See https://nvidia.github.io/apex/

huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
	- Avoid using `tokenizers` before the fork if possible
	- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
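
The warning above names its own remedy: set `TOKENIZERS_PARALLELISM` explicitly. A minimal sketch, assuming it runs in the first notebook cell, before `haystack` or `transformers` are imported; choosing `false` is an assumption that trades tokenizer parallelism for a quieter log.

```python
import os

# Set this before any library that uses `tokenizers` is imported or forks workers.
os.environ["TOKENIZERS_PARALLELISM"] = "false"

print(os.environ["TOKENIZERS_PARALLELISM"])
```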

  start_indices = flat_sorted_indices // max_seq_len
Inferencing Samples: 100%|██████████| 1/1 [00:00<00:00,  2.73 Batches/s]