# Simple Search Examples

## Preliminaries

In [1]:
# only needed when developing code
%load_ext autoreload
%autoreload 2

In [2]:
from IPython.display import display, HTML
# enable horizontal scrolling in notebook
display(HTML("<style>pre { white-space: pre !important; }</style>"))

## Search function

In [3]:
from leat.search import Search

In [None]:
search = Search(predefined_configuration='BasicSearch', doc_store="../../tests/data/docset1")

In [5]:
search.all_concepts()

['Test', 'Performance Metrics', 'Ethical Principles', 'Data Ethics']

In [6]:
list(search.search_documents())

[<leat.search.result.result.DocResult at 0x114f42630>]

In [7]:
doc_results = [doc_result for doc_result in search]
doc_results

[<leat.search.result.result.DocResult at 0x1120a74d0>]

Note that iterating over `search` calls `search.search_documents()`, so the two cells above produce equivalent results.

The results can be viewed in a simple text format:
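
One common way to get this behavior is for `__iter__` to delegate to the search method. The sketch below shows that pattern with a hypothetical `ToySearch` class; it is not the `leat` implementation, and the substring matching is only a placeholder.

```python
class ToySearch:
    """Hypothetical stand-in for illustration; not the leat Search class."""

    def __init__(self, documents, term):
        self.documents = documents  # mapping of document name -> text
        self.term = term

    def search_documents(self):
        # Lazily yield (name, match offsets) for each document
        for name, text in self.documents.items():
            offsets = [i for i in range(len(text)) if text.startswith(self.term, i)]
            yield name, offsets

    def __iter__(self):
        # Iterating over the object simply runs a fresh search
        return self.search_documents()


toy = ToySearch({"a.txt": "recall and recall"}, "recall")
results_via_iter = list(toy)
results_via_method = list(toy.search_documents())
```

With this arrangement, `for r in toy` and `for r in toy.search_documents()` visit the same results, and each loop starts a new search.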

In [8]:
for r in doc_results:
    print(r.astext())

../../tests/data/docset1/simple-document-1.txt
le Document Dec 18, 2022  There are tradeoffs between precision and recall as well as sensitivity and specificity.  Recall and sensitivity are mathematically equivalent.   Thus, the two tradeoffs differ, and the choice between them may introduce bias.  Fairness typically refers to equality and/or equity as a type of distributive justice. Utilitarianism is a different theory of justice.  
                    2022[Test]
                                                      precision[Performance Metrics]
                                                                    recall[Performance Metrics]
                                                                                      sensitivity[Performance Metrics]
                                                                                                      specificity[Performance Metrics]
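
In the output above, each match appears on its own line, indented to the column where it occurs in the text and followed by its concept in brackets. The sketch below reproduces that layout for a short string. The `annotate` helper and the `(start, end, concept)` tuples are assumptions made for illustration and may not reflect how `astext()` works internally.

```python
def annotate(text, matches):
    """Return the text followed by one aligned line per match.

    matches: list of (start, end, concept) tuples (hypothetical structure).
    """
    lines = [text]
    for start, end, concept in sorted(matches):
        # Pad with spaces so the match sits under its position in the text
        lines.append(" " * start + text[start:end] + "[" + concept + "]")
    return "\n".join(lines)


sample = "precision and recall"
annotated = annotate(
    sample,
    [(14, 20, "Performance Metrics"), (0, 9, "Performance Metrics")],
)
print(annotated)
```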
                                                                                                

## Save or display the results

### Write results to a file

In [9]:
from leat.search.writer import TextWriter, HTMLWriter

In [10]:
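def write_search_result_text(search, output_file):
    """Write every document result from `search` to `output_file` as plain text."""
    with open(output_file, "w") as ofp:
        w = TextWriter(stream=ofp)
        # iterating over `search` runs the search and yields one result per document
        for doc_result in search:
            w.write_doc_result(doc_result)

def write_search_result_html(search, output_file):
    """Write every document result from `search` to `output_file` as HTML."""
    with open(output_file, "w") as ofp:
        w = HTMLWriter(stream=ofp)
        for doc_result in search:
            w.write_doc_result(doc_result)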

In [11]:
write_search_result_text(search, 'temp.txt')

`TextWriter` generates the same simple text format shown above.
`HTMLWriter` generates HTML.

In [12]:
write_search_result_html(search, 'temp.html')

### Write results to a string

If no stream is specified, then the Writer returns a string.
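
As an illustration of the stream-or-string pattern, the sketch below uses a hypothetical `ToyWriter` class. Its `write_line` method is an assumption made for the example and does not describe the `leat` writer classes.

```python
import io


class ToyWriter:
    """Hypothetical writer; mirrors the stream-or-string idea only."""

    def __init__(self, stream=None):
        self.stream = stream

    def write_line(self, line):
        text = line + "\n"
        if self.stream is None:
            return text          # no stream: hand the text back to the caller
        self.stream.write(text)  # stream given: write to it instead
        return None


as_string = ToyWriter().write_line("hello")
buffer = io.StringIO()
to_stream = ToyWriter(stream=buffer).write_line("hello")
```

Here `as_string` holds the text directly, while `to_stream` is `None` because the text went to `buffer`.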

In [13]:
html_writer = HTMLWriter()
HTML(html_writer.get_doc_result_html(doc_results[0]))

The HTML uses summary/details elements for the results, so click the summary line to show or hide the details.
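
For readers unfamiliar with collapsible HTML, the sketch below builds a small `<details>`/`<summary>` fragment. The `details_block` helper is hypothetical, and the markup that `HTMLWriter` actually emits may differ.

```python
from html import escape


def details_block(summary, body):
    """Build a collapsible HTML fragment (hypothetical helper, not leat's markup)."""
    return (
        "<details>"
        f"<summary>{escape(summary)}</summary>"  # the always-visible, clickable line
        f"<pre>{escape(body)}</pre>"             # content shown when expanded
        "</details>"
    )


fragment = details_block("simple-document-1.txt", "precision & recall")
```

In a notebook, passing `fragment` to `HTML(...)` from `IPython.display` should render a line that expands when clicked.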

### Color Scheme

The color used for each concept can be changed by passing a color scheme:

In [13]:
COLOR_SCHEME = {
    "concept_colors": {
        "Test": "#088F8F",  # blue green
        "Performance Metrics": "#00FF00",  # green
        "Ethical Principles": "#0096FF",  # bright blue
        "Data Ethics": "#87CEEB",  # sky blue
        "Be Verbs": "#800000",  # maroon
        "Conjunctions": "#800000",  # maroon
        "Articles": "#800000",  # maroon
        "Clause": "#ADD8E6",  # light blue
        "First 7": "#ADD8E6",  # light blue
        "Second 5": "#FF0000",  # red
    }
}

In [14]:
html_writer = HTMLWriter(scheme=COLOR_SCHEME)
HTML(html_writer.get_doc_result_html(doc_results[0]))
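
To show how a scheme of this shape could drive highlighting, the sketch below looks up a concept's color and wraps a match in a styled `<span>`. The `highlight` helper, the fallback color, and the generated markup are all assumptions for illustration; they are not taken from `leat`.

```python
from html import escape

DEFAULT_COLOR = "#FFFF00"  # assumed fallback for concepts missing from the scheme


def highlight(match_text, concept, scheme):
    """Wrap a match in a span colored by its concept (hypothetical helper)."""
    color = scheme.get("concept_colors", {}).get(concept, DEFAULT_COLOR)
    return (
        f'<span style="background-color: {color}" title="{escape(concept)}">'
        f"{escape(match_text)}</span>"
    )


scheme = {"concept_colors": {"Performance Metrics": "#00FF00"}}
known = highlight("recall", "Performance Metrics", scheme)
unknown = highlight("2022", "Test", scheme)  # "Test" is not in this small scheme
```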