# Signals Boosting

NOTE: This notebook depends on the Retrotech dataset. If you have any issues, please rerun the [Setting up the Retrotech Dataset](1.setting-up-the-retrotech-dataset.ipynb) notebook.

In [40]:
%load_ext run_cell_extension
%run_cell ch04/1.setting-up-the-retrotech-dataset.ipynb 1
%run_cell ch04/1.setting-up-the-retrotech-dataset.ipynb 12

The run_cell_extension extension is already loaded. To reload it, use:
  %reload_ext run_cell_extension
{'query': 'query', 'fields': ['upc', 'name', 'manufacturer', 'score'], 'limit': 5, 'params': {'qf': 'name manufacturer longDescription', 'defType': 'edismax', 'indent': 'true', 'sort': 'score desc, upc asc'}}


## Keyword Search with No Signals Boosting

### Listing 4.5

In [43]:
query = "ipad"
collection = "products"
request = product_search_request(query)
response = engine.search(collection, request)
display_product_search(query, engine.docs(response))
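The `product_search_request` helper is defined in the setup notebook and is not shown here. Judging only from the request dictionary printed by the setup cell at the top of this notebook, it plausibly resembles the sketch below. The name `sketch_product_search_request` and its `limit` argument are hypothetical, and the real helper may differ.

```python
def sketch_product_search_request(query, limit=5):
    """Hypothetical stand-in for product_search_request, inferred only from
    the request dictionary printed by the setup cell; the real helper may differ."""
    return {
        "query": query,
        "fields": ["upc", "name", "manufacturer", "score"],
        "limit": limit,
        "params": {
            "qf": "name manufacturer longDescription",  # fields the keywords are matched against
            "defType": "edismax",
            "indent": "true",
            "sort": "score desc, upc asc",  # ties on score are broken by upc
        },
    }

request = sketch_product_search_request("ipad")
print(request["params"]["qf"])
```

At this point the ranking depends only on keyword relevance across the `qf` fields; no user behavior is involved yet.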

## Create Signals Boosts (Signals Aggregation)

### Listing 4.6

In [21]:
products_collection="products"
signals_collection="signals"
signals_boosting_collection="signals_boosting"

engine.create_collection(signals_boosting_collection)

print("Aggregating Signals to Create Signals Boosts...")
signals_opts={"zkhost": "aips-zk", "collection": signals_collection}
df = spark.read.format("solr").options(**signals_opts).load()
df.createOrReplaceTempView("signals")

signals_aggregation_query = """
select q.target as query, c.target as doc, count(c.target) as boost
  from signals c left join signals q on c.query_id = q.query_id
  where c.type = 'click' AND q.type = 'query'
  group by query, doc
  order by boost desc
"""

signals_boosting_opts={"zkhost": "aips-zk", "collection": signals_boosting_collection, 
                       "gen_uniq_key": "true", "commit_within": "5000"}
spark.sql(signals_aggregation_query).write.format("solr").options(**signals_boosting_opts).mode("overwrite").save()
print("Signals Aggregation Completed!")

Wiping 'signals_boosting' collection
[('action', 'CREATE'), ('name', 'signals_boosting'), ('numShards', 1), ('replicationFactor', 1)]
Creating 'signals_boosting' collection
Status: Success
Aggregating Signals to Create Signals Boosts...
Signals Aggregation Completed!
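
To make the aggregation easier to follow, here is a minimal plain-Python sketch of what the SQL in listing 4.6 computes: each click is joined to the query that shares its `query_id`, and clicks are counted per (query, doc) pair. The sample signals and the function name `aggregate_signals_boosts` are made up for illustration, and the sketch assumes each `query_id` has exactly one query signal.

```python
from collections import Counter

# Made-up signals for illustration only; not taken from the Retrotech dataset.
signals = [
    {"query_id": "q1", "type": "query", "target": "ipad"},
    {"query_id": "q1", "type": "click", "target": "doc_a"},
    {"query_id": "q2", "type": "query", "target": "ipad"},
    {"query_id": "q2", "type": "click", "target": "doc_a"},
    {"query_id": "q2", "type": "click", "target": "doc_b"},
    {"query_id": "q3", "type": "query", "target": "kindle"},
    {"query_id": "q3", "type": "click", "target": "doc_c"},
]

def aggregate_signals_boosts(signals):
    """Count clicks per (query text, clicked doc) pair."""
    # Map each query_id to its keyword text (assumes one query signal per query_id).
    query_text = {s["query_id"]: s["target"] for s in signals if s["type"] == "query"}
    counts = Counter()
    for s in signals:
        # Only clicks whose query_id has a matching query signal are counted.
        if s["type"] == "click" and s["query_id"] in query_text:
            counts[(query_text[s["query_id"]], s["target"])] += 1
    # most_common() returns pairs ordered by count, highest first.
    return [{"query": q, "doc": d, "boost": n} for (q, d), n in counts.most_common()]

boosts = aggregate_signals_boosts(signals)
print(boosts[0])
# {'query': 'ipad', 'doc': 'doc_a', 'boost': 2}
```

Each resulting row has the same shape as a document in the `signals_boosting` collection: a query, a doc, and a boost equal to the click count.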


## Search with Signals Boosts Applied

### Listing 4.7

In [23]:
query = "ipad"
signals_boosts_query = {
    "query": query,
    "fields": ["doc", "boost"],
    "limit": 10,
    "params": {
        "defType": "edismax",
        "qf": "query",
        "sort": "boost desc"
    }
}
response = engine.search(signals_boosting_collection, signals_boosts_query)
print(f"Boost Documents: \n{engine.docs(response)}")
# Iterate over the parsed documents, not the raw response object.
product_boosts = " ".join(
    [f'"{entry["doc"]}"^{entry["boost"]}' 
     for entry in engine.docs(response)])
print(f"\nBoost Query: \n{product_boosts}")

request = product_search_request(query)
request["params"]["boost"] = "sum(1,query({! df=upc v=$signals_boosting}))"
request["params"]["signals_boosting"] = product_boosts
response = engine.search(products_collection, request)
display_product_search(query, engine.docs(response))

Boost Documents: 
[{'doc': '885909457588', 'boost': 966}, {'doc': '885909457595', 'boost': 205}, {'doc': '885909471812', 'boost': 202}, {'doc': '886111287055', 'boost': 109}, {'doc': '843404073153', 'boost': 73}, {'doc': '635753493559', 'boost': 62}, {'doc': '885909457601', 'boost': 62}, {'doc': '885909472376', 'boost': 61}, {'doc': '610839379408', 'boost': 29}, {'doc': '884962753071', 'boost': 28}]
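
The boost query that listing 4.7 builds can be reproduced offline from the first three rows of the "Boost Documents" output. The sketch below also illustrates, in simplified form, why the `boost` parameter wraps the subquery in `sum(1, ...)`: the edismax `boost` is multiplicative, and `query(...)` evaluates to 0 for products that do not match the boost query, so adding 1 leaves their relevance score unchanged. The helper names and the numeric scores passed to `boosted_score` are hypothetical; the actual subquery score is computed by the search engine and is not necessarily equal to the raw click count.

```python
# First three rows of the "Boost Documents" output above.
boost_docs = [
    {"doc": "885909457588", "boost": 966},
    {"doc": "885909457595", "boost": 205},
    {"doc": "885909471812", "boost": 202},
]

def build_boost_query(boost_docs):
    """Join each doc and its weight into a "upc"^weight boost query string."""
    return " ".join(f'"{entry["doc"]}"^{entry["boost"]}' for entry in boost_docs)

def boosted_score(keyword_score, boost_query_score):
    """Simplified model of boost=sum(1,query(...)): multiply the keyword
    relevance score by one plus the boost subquery's score for the document."""
    return keyword_score * (1 + boost_query_score)

product_boosts = build_boost_query(boost_docs)
print(product_boosts)
# "885909457588"^966 "885909457595"^205 "885909471812"^202

# Hypothetical scores: a product with no signals keeps its keyword score.
print(boosted_score(2.0, 0.0))
# 2.0
```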

## Success!

You have now implemented your first AI-powered search algorithm: signals boosting. This implementation is deliberately simplistic (chapter 8 covers signals boosting improvements in much more depth), but it demonstrates the power of reflected intelligence well. Later chapters cover other reflected intelligence techniques, such as collaborative filtering (chapter 9, Personalized Search) and machine-learned ranking (chapter 10, Learning to Rank).

Up next: Chapter 5 - [Knowledge Graph Learning](../ch05/1.open-information-extraction.ipynb)

