# Implementing Scout
In this notebook, I try out the Scout search engine I have created to see how well it can search for podcasts, with the hope that it can help users find new podcasts that match their interests.

In [1]:
from scout_client import Scout

scout = Scout('http://localhost:8000')

results = scout.get_documents(q='food', page=1)
# print(results)
print(results.keys())
print(results["document_count"])

dict_keys(['document_count', 'documents', 'filtered_count', 'filters', 'ordering', 'page', 'pages', 'ranking', 'search_term'])
6573105


As the output above shows, there are over 6.5 million documents in the database.

In [15]:
# getting page results based on some topic
page_results=[]
for page_no in range(1,5):
    results = scout.get_documents(q='food', page=page_no)
    doc_results = results["documents"]
    for doc_result in doc_results:
        page_results.append(doc_result)
    print(len(page_results))

50
100
150
200


In this example, I search for podcasts about food and collect the first four pages of results, 50 results per page, for 200 results in total.
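The paging loop above hard-codes four pages. As a minimal sketch, the same idea can be wrapped in a helper that stops early when the response reports fewer pages. The `documents` and `pages` keys come from the response shown earlier; `collect_documents` and `fetch_page` are hypothetical names introduced only for this illustration, and they are not part of `scout_client`.

```python
from typing import Callable, Dict, List


def collect_documents(fetch_page: Callable[[int], Dict], max_pages: int = 4) -> List[Dict]:
    """Collect documents from consecutive result pages.

    fetch_page is a hypothetical callable that takes a 1-based page number
    and returns a response dict with 'documents' and 'pages' keys.
    """
    collected: List[Dict] = []
    page_no = 1
    while page_no <= max_pages:
        response = fetch_page(page_no)
        collected.extend(response["documents"])
        # Stop early if the server reports that this was the last page.
        if page_no >= response.get("pages", max_pages):
            break
        page_no += 1
    return collected
```

With the client from this notebook, the callable could be `lambda page: scout.get_documents(q='food', page=page)`, assuming the server is running as in the first cell.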

In [20]:
# last result in food podcast search
page_results[-1]['metadata']['episode_id']

'4kHK1QPnx70zU4KNwOOJ1X'

In [23]:
epi_id = []
for page_result in page_results:
    # collect the episode ID of every search hit
    episode_id = page_result['metadata']['episode_id']
    epi_id.append(episode_id)
print(epi_id)

['6vEHAUCjTy94rkMAWgotoq', '3c1YD96eBOCjdDqE0jYO3z', '4GkHcdlSmA4be0oCt63RqG', '1Br3sOn0SY7oFtgQ6vL2Yy', '27NP9zsDmOj24rWxnUFdvP', '2p5rzo9djahFeptR8jJA0H', '4YXplCf4bXeGS9A9q1zK0Y', '7yEtdOjvzW3vaxOTTHQZUT', '3k8xFD4t31ommXlJLcP20d', '235AiGsKWukOAYrMLmAmEe', '4RmePfcXXXHJhoAHrnoRmu', '0x72i2GlvFSnrjrfXH3G2X', '2EfynHGA4hPxMjkAUuTHiN', '1UnThtDdLH7sbjpGedOeCL', '045uI9EkPpK4EpDnBMdEZa', '0x72i2GlvFSnrjrfXH3G2X', '7wHWk3PlezuZg5lfIj5UDL', '2Hi8V4e6qpHBKFD310Px0x', '77rgXZ9qetfZ2ikPAEL1Hd', '6CeWINlHaGCmG4pvbr3htK', '77oDgWzp4yck7UN2hroHLM', '0XGBlGw6dcCRDkSmN6Gcia', '79biNkExDpnEZJPqxqQCon', '5Q7aBl67Xp97kl20uOYXEq', '4854Bp3xzwP0pLDzTtcYJT', '7Bbq3ibeGT60hSGFhxR9C0', '12eYX8VpLZD4gNddxYxo7x', '3mG3PEroeT7pgccsqMt4YY', '6ZcO15xfsamI9OCiSobapA', '5amAa3AJcd557NwMoPPbF3', '6lVa8L4SlywQ5hcnRUrwwS', '0rLnRBHXWhPbqZe1FnRHwj', '15Asa7lA1AEUmvGKDi6Ssz', '1TP25beI3VUYMsPKKiU4kQ', '0YEUgnnZK39lCAZ1YQf2Rb', '2xSq9YCCPcdxcFPk1Wrx6v', '29sWx53ANjcxjlMQSqgRIx', '1gV4aLTtN2ICNMJ9U6BInC', '29M4wIpuh3

Here we can see the episode IDs of the 200 results that mention food or are about food. Some IDs appear more than once, because several results can come from the same episode.

In [24]:
# searching for the episodes with the most mentions of food
from collections import Counter
epi_counter = Counter(epi_id)
epi_counter.most_common()

[('4GkHcdlSmA4be0oCt63RqG', 4),
 ('7yEtdOjvzW3vaxOTTHQZUT', 3),
 ('29M4wIpuh3VSOCBodzbMsc', 3),
 ('7mwVOm9NeJnu0BkZt5AJjt', 3),
 ('0x72i2GlvFSnrjrfXH3G2X', 2),
 ('4854Bp3xzwP0pLDzTtcYJT', 2),
 ('2aeAtoeq9LCAhdeateZ1J9', 2),
 ('03sIbB9HMXjvNA8FHIX0ZK', 2),
 ('1jbr7MB9WedWuwC6QFFACN', 2),
 ('1PZdHO9V64vH4OBBc3gRTd', 2),
 ('3DSira4N03wdJNLSJJcxI1', 2),
 ('26AWlhiNu6tSM1INBXsQOl', 2),
 ('1rJAxJszAe7Ufu5JmYp0zP', 2),
 ('3f1dfmnYNbVwQpuVRPHcDT', 2),
 ('6vEHAUCjTy94rkMAWgotoq', 1),
 ('3c1YD96eBOCjdDqE0jYO3z', 1),
 ('1Br3sOn0SY7oFtgQ6vL2Yy', 1),
 ('27NP9zsDmOj24rWxnUFdvP', 1),
 ('2p5rzo9djahFeptR8jJA0H', 1),
 ('4YXplCf4bXeGS9A9q1zK0Y', 1),
 ('3k8xFD4t31ommXlJLcP20d', 1),
 ('235AiGsKWukOAYrMLmAmEe', 1),
 ('4RmePfcXXXHJhoAHrnoRmu', 1),
 ('2EfynHGA4hPxMjkAUuTHiN', 1),
 ('1UnThtDdLH7sbjpGedOeCL', 1),
 ('045uI9EkPpK4EpDnBMdEZa', 1),
 ('7wHWk3PlezuZg5lfIj5UDL', 1),
 ('2Hi8V4e6qpHBKFD310Px0x', 1),
 ('77rgXZ9qetfZ2ikPAEL1Hd', 1),
 ('6CeWINlHaGCmG4pvbr3htK', 1),
 ('77oDgWzp4yck7UN2hroHLM', 1),
 ('0XGBl

In [41]:
relevant_episode = []
for page_result in page_results:
    if page_result['metadata']['episode_id'] == '7mwVOm9NeJnu0BkZt5AJjt':
        relevant_episode.append(page_result['content'])
print(relevant_episode)

[" It's really a question of how your relationship is with food to start with. So maybe change the name that you have for food, maybe start calling it nourishment know it you're like the French do we don't have a word like food in French Ivo, D. We have no word that equals food in French. We don't have a short word for food food is called New richer or Margie. So eat.", ' So in France food is sacred language language of the word food in France says the nutrition no heat your comes from new he comes from nourishing. So we actually call Food nourishment in friends, like eat your food most I know feature each your nourishment. Can you imagine have right off the bat the real relationship with food is so different than in the United States choose from', ' And yeah, I guess health comes first looking at it that way and starting to call you food nourishment as opposed to food could be the way to go food for thought.']


Now that we have the episodes with the most mentions of food, we need to find the most relevant episode pertaining to food. As we can see from the output above, this episode talks about people's relationship with food and how it differs by nationality or home country. Next, I will use the metadata to get a more accurate picture of what this specific podcast is about.
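The counting and filtering steps above can also be combined into one pass. The following is a minimal sketch that groups the matching passages by episode, assuming each search hit has the `metadata.episode_id` and `content` fields used earlier; `group_passages_by_episode` and `rank_episodes` are hypothetical helper names, and the sample hits are made up.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def group_passages_by_episode(hits: List[Dict]) -> Dict[str, List[str]]:
    """Map each episode ID to the list of matching passages from that episode."""
    grouped = defaultdict(list)
    for hit in hits:
        grouped[hit["metadata"]["episode_id"]].append(hit["content"])
    return dict(grouped)


def rank_episodes(grouped: Dict[str, List[str]]) -> List[Tuple[str, List[str]]]:
    """Order episodes by number of matching passages, most matches first."""
    return sorted(grouped.items(), key=lambda item: len(item[1]), reverse=True)


# Made-up hits that mimic the structure of the search results
sample_hits = [
    {"metadata": {"episode_id": "ep_a"}, "content": "first passage"},
    {"metadata": {"episode_id": "ep_b"}, "content": "second passage"},
    {"metadata": {"episode_id": "ep_a"}, "content": "third passage"},
]
ranked = rank_episodes(group_passages_by_episode(sample_hits))
```

Passing `page_results` instead of `sample_hits` would give the passages for every episode at once, so no episode ID needs to be typed in by hand.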

In [33]:
import pandas as pd
metadata = pd.read_csv('metadata.tsv', sep ='\t')

In [34]:
metadata.head()

Unnamed: 0,show_uri,show_name,show_description,publisher,language,rss_link,episode_uri,episode_name,episode_description,duration,show_filename_prefix,episode_filename_prefix
0,spotify:show:2NYtxEZyYelR6RMKmjfPLB,Kream in your Koffee,A 20-something blunt female takes on the world...,Katie Houle,['en'],https://anchor.fm/s/11b84b68/podcast/rss,spotify:episode:000A9sRBYdVh66csG2qEdj,1: It’s Christmas Time!,On the first ever episode of Kream in your Kof...,12.700133,show_2NYtxEZyYelR6RMKmjfPLB,000A9sRBYdVh66csG2qEdj
1,spotify:show:15iWCbU7QoO23EndPEO6aN,Morning Cup Of Murder,Ever wonder what murder took place on today in...,Morning Cup Of Murder,['en'],https://anchor.fm/s/b07181c/podcast/rss,spotify:episode:000HP8n3hNIfglT2wSI2cA,The Goleta Postal Facility shootings- January ...,"See something, say something. It’s a mantra ma...",6.019383,show_15iWCbU7QoO23EndPEO6aN,000HP8n3hNIfglT2wSI2cA
2,spotify:show:6vZRgUFTYwbAA79UNCADr4,Inside The 18 : A Podcast for Goalkeepers by G...,Inside the 18 is your source for all things Go...,Inside the 18 GK Media,['en'],https://anchor.fm/s/81a072c/podcast/rss,spotify:episode:001UfOruzkA3Bn1SPjcdfa,Ep.36 - Incorporating a Singular Goalkeeping C...,Today’s episode is a sit down Michael and Omar...,43.616333,show_6vZRgUFTYwbAA79UNCADr4,001UfOruzkA3Bn1SPjcdfa
3,spotify:show:5BvKEjaMSuvUsGROGi2S7s,Arrowhead Live!,Your favorite podcast for everything @Chiefs! ...,Arrowhead Live!,['en-US'],https://anchor.fm/s/917dba4/podcast/rss,spotify:episode:001i89SvIQgDuuyC53hfBm,Episode 1: Arrowhead Live! Debut,Join us as we take a look at all current Chief...,58.1892,show_5BvKEjaMSuvUsGROGi2S7s,001i89SvIQgDuuyC53hfBm
4,spotify:show:7w3h3umpH74veEJcbE6xf4,FBoL,"The comedy podcast about toxic characters, wri...",Emily Edwards,['en'],https://www.fuckboisoflit.com/episodes?format=rss,spotify:episode:0025RWNwe2lnp6HcnfzwzG,"The Lion, The Witch, And The Wardrobe - Ashley...",The modern morality tail of how to stay good f...,51.78205,show_7w3h3umpH74veEJcbE6xf4,0025RWNwe2lnp6HcnfzwzG


In [42]:
metadata[metadata['episode_filename_prefix'] == '7mwVOm9NeJnu0BkZt5AJjt']['episode_description'].iloc[0]

'Ingrid De La Mare-Kenny gets real on content theft from blogs like goop or instagram copy cows and copy-paster. She stomps her foot and reclaims her philosophy that French etiquette and not eating with your mouthful can impact your waistline. She discusses American faux-Pas that can be fixed by french-etiquette to benefit your digestion, gut health and keep you as skinny as a wine drinking dessert eating french woman. She discusses her stand on whey protein powder and protein powders in general and why she decided to bring one on the market. She reconciles your relationship with food reminding you that in French the word food is called «\xa0nourriture\xa0» nourishment and perhaps that’s the key to the positive connotation on food and why food should be a happy place rather than a dreaded one that causes indecisiveness that makes French waiters so impatient when waiting on American patrons ... a lot of loose ends are tied together into a beautiful French now of gangster chicness. Show 
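The lookup above handles one episode at a time. As a rough sketch, several episode IDs can be looked up in a single pass over the metadata file. This version uses the standard-library `csv` module rather than pandas so that it runs without extra dependencies. The column names come from the `metadata.head()` output; `lookup_episodes` is a hypothetical helper, and the sample rows are made up.

```python
import csv
import io
from typing import Dict, Iterable, TextIO, Tuple


def lookup_episodes(tsv_file: TextIO, episode_ids: Iterable[str]) -> Dict[str, Tuple[str, str]]:
    """Return {episode_id: (show_name, episode_name)} for the requested IDs."""
    wanted = set(episode_ids)
    found = {}
    for row in csv.DictReader(tsv_file, delimiter="\t"):
        key = row["episode_filename_prefix"]
        if key in wanted:
            found[key] = (row["show_name"], row["episode_name"])
    return found


# Made-up two-row sample standing in for metadata.tsv
sample = io.StringIO(
    "show_name\tepisode_name\tepisode_filename_prefix\n"
    "Show A\tEpisode 1\tid_a\n"
    "Show B\tEpisode 2\tid_b\n"
)
print(lookup_episodes(sample, ["id_b"]))
```

For the real file, the same function could be called with `open('metadata.tsv', newline='', encoding='utf-8')` and the top episode IDs from the counter.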

From the episode description above, we find an interesting podcast about food and etiquette. Overall, the Scout search engine does a good job of helping us search for specific podcasts, but it is not perfect. Perhaps, instead of using the transcripts, we could find a way to use the podcast audio to improve our search.