# Answer Questions Using YouTube

This notebook shows how to give long-form answers to questions using information found on YouTube.

Inspired by [Abstractive Question Answering](https://docs.pinecone.io/docs/abstractive-question-answering) and [Long Form Question Answering in Haystack](https://www.pinecone.io/learn/haystack-lfqa/).

This tutorial builds on the previous tutorial, Create an Atlas service[todo add link], which showed how to index YouTube videos and return the sections of a video that match a search term.

This tutorial covers how to take those matching sections and combine them into a long-form answer.

At a high level, it is a two-step process:

1. Find the video sections that are relevant to the question

2. Combine those sections into a coherent answer
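
The two steps can be sketched in plain Python. This is only an illustration of the control flow, not the method used in this notebook: `find_relevant_sections` and `combine_sections` are hypothetical names, the word-overlap scoring stands in for the vector search used later, and the simple join stands in for a text-generation model.

```python
from typing import List


def find_relevant_sections(question: str, sections: List[str], top_k: int = 2) -> List[str]:
    """Step 1 (toy stand-in): rank sections by how many question words they share."""
    question_words = set(question.lower().split())
    scored = [(len(question_words & set(s.lower().split())), s) for s in sections]
    # Drop sections with no overlap, then sort by overlap count, highest first
    ranked = sorted((pair for pair in scored if pair[0] > 0),
                    key=lambda pair: pair[0], reverse=True)
    return [section for _, section in ranked[:top_k]]


def combine_sections(sections: List[str]) -> str:
    """Step 2 (toy stand-in): a real pipeline would pass these to a generator model."""
    return " ".join(sections)


# Made-up transcript snippets, for illustration only
toy_sections = [
    "strength training is one of the best habits for muscle mass",
    "the best exercises for longevity build cardiorespiratory fitness",
    "this video is sponsored by a mattress company",
]

relevant = find_relevant_sections("best exercises for longevity", toy_sections)
answer = combine_sections(relevant)
print(answer)
```

The rest of this notebook replaces the first stand-in with a query against a vector index.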

## Get Relevant Context

First, we send the query "best exercises for longevity". The index returns the five video sections that best match the topics of exercise and longevity.

In [4]:
# pinecone.init(), used below, was removed in pinecone-client 3.x, so pin to the 2.x series
%pip install "pinecone-client<3" requests

Note: you may need to restart the kernel to use updated packages.


## Get API Keys

1. You will need a [Pinecone API key (free)](https://app.pinecone.io/).
2. Deploy [this model](https://huggingface.co/tomiwa1a/openai-whisper-endpoint) as a Hugging Face inference endpoint. You will need the endpoint URL and a Hugging Face API key to call it.

In [2]:
from getpass import getpass
# getpass tip: https://stackoverflow.com/a/54577734/5405197
PINECONE_API_KEY = getpass('Enter PINECONE_API_KEY')
HUGGING_FACE_API_KEY = getpass('Enter HUGGING_FACE_API_KEY')
# replace this with your HUGGING_FACE_ENDPOINT_URL
HUGGING_FACE_ENDPOINT_URL = "https://rl2hxotyspedkt19.us-east-1.aws.endpoints.huggingface.cloud"

In [5]:
import requests
import pinecone
import json
from typing import Union

pinecone_index_id = "youtube-search"

pinecone.init(
    api_key=PINECONE_API_KEY,
    environment="us-west1-gcp"
)

def send_encoding_request(query: Union[str, list]) -> dict:
    """Encode a query (or list of queries) with the Hugging Face inference endpoint."""
    payload = json.dumps({
        "inputs": "",  # inputs key is not used but our endpoint expects it
        "query": query,
    })
    headers = {
        'Authorization': f'Bearer {HUGGING_FACE_API_KEY}',
        'Content-Type': 'application/json'
    }

    response = requests.post(HUGGING_FACE_ENDPOINT_URL, headers=headers, data=payload)
    response.raise_for_status()  # fail early on authentication or endpoint errors
    return response.json()


pinecone_index = pinecone.Index(pinecone_index_id)


def query_model(query, video_id=""):
    """Return the top 5 matching video sections, optionally restricted to one video."""
    encoded_query = send_encoding_request(query)
    # Only filter by video when a video_id is given
    metadata_filter = {"video_id": {"$eq": video_id}} if video_id else None
    vectors = encoded_query['encoded_segments'][0]['vectors']
    return pinecone_index.query(vectors, top_k=5,
                                include_metadata=True,
                                filter=metadata_filter).to_dict()
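
The optional `video_id` argument of `query_model` restricts the search to a single video through a Pinecone metadata filter. Isolating that logic makes the shape of the filter easier to see. `build_metadata_filter` is a hypothetical helper introduced here for illustration; it is not part of the Pinecone client.

```python
from typing import Optional


def build_metadata_filter(video_id: str = "") -> Optional[dict]:
    """Return a Pinecone-style metadata filter, or None to search every video."""
    if not video_id:
        return None
    # "$eq" keeps only vectors whose metadata field equals the given value
    return {"video_id": {"$eq": video_id}}


print(build_metadata_filter())
print(build_metadata_filter("92kYDVjX0G0"))
```

Passing `None` as the filter leaves the query unrestricted, which is why an empty `video_id` searches across all indexed videos.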

In [8]:
results = query_model("best exercises for longevity")
results['matches'][3]

{'id': '92kYDVjX0G0-t11',
 'score': 24.5990486,
 'values': [],
 'sparseValues': {},
 'metadata': {'duration': 13.0,
  'end': 24.0,
  'id': '92kYDVjX0G0-t11',
  'length': 371.0,
  'start': 11.0,
  'text': "living longer what do I need to do it's it's like a super well crafted exercise program that is geared towards strength muscle mass and cardiorespiratory Fitness so it's all",
  'thumbnail': 'https://i.ytimg.com/vi/92kYDVjX0G0/sddefault.jpg',
  'title': 'Peter Attia on The Best Exercises for Longevity',
  'url': 'https://www.youtube.com/watch?v=92kYDVjX0G0&t=11',
  'video_id': '92kYDVjX0G0',
  'views': 2955787.0}}
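
Each match carries the transcript text and a timestamped URL in its metadata, which is what the second step needs. As a minimal sketch, the matches could be flattened into a single context string as follows. `build_context` and its `min_score` parameter are hypothetical, and `sample_results` is a trimmed copy of the match above (with shortened text) so the block runs without API keys.

```python
def build_context(results: dict, min_score: float = 0.0) -> str:
    """Join the transcript text of each sufficiently high-scoring match, one per line."""
    lines = []
    for match in results.get("matches", []):
        if match["score"] < min_score:
            continue  # skip weak matches
        metadata = match["metadata"]
        # Keep the source URL next to the text so the answer can be traced back
        lines.append(f"{metadata['text'].strip()} (source: {metadata['url']})")
    return "\n".join(lines)


# Trimmed copy of the match shown above; the text is shortened for readability
sample_results = {
    "matches": [
        {
            "id": "92kYDVjX0G0-t11",
            "score": 24.5990486,
            "metadata": {
                "text": "living longer what do I need to do",
                "url": "https://www.youtube.com/watch?v=92kYDVjX0G0&t=11",
            },
        }
    ]
}

context = build_context(sample_results)
print(context)
```

With real data, `build_context(query_model("best exercises for longevity"))` would produce one line per returned section.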