<a href="https://colab.research.google.com/github/atilatech/atila-core-service/blob/add_long_form_answering/atlas/notebooks/question_answering_youtube.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Answer Questions using YouTube

This notebook shows how to generate long-form answers to questions using YouTube videos.

Inspired by [Abstractive Question Answering](https://docs.pinecone.io/docs/abstractive-question-answering) and [Long Form Question Answering in Haystack](https://www.pinecone.io/learn/haystack-lfqa/).

This tutorial builds on the previous tutorial, Create an Atlas service[todo add link], which showed how to index YouTube videos and return the sections of a video that match a search term.

This tutorial covers how to take those matching sections and combine them into a long-form answer.

At a high level, it is a two-step process:

1. Find the video sections that are relevant to the question

2. Combine those sections into a coherent answer
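
The two steps above can be sketched as a small retrieve-then-generate function. This is only an illustration: `answer_question`, `toy_retrieve`, and `toy_generate` are hypothetical names, and the toy stand-ins use keyword matching and plain concatenation in place of the vector search and generator model used later in this notebook.

```python
from typing import Callable, List


def answer_question(
    query: str,
    retrieve: Callable[[str], List[str]],
    generate: Callable[[str, List[str]], str],
) -> str:
    """Step 1: fetch relevant passages. Step 2: turn them into one answer."""
    passages = retrieve(query)
    return generate(query, passages)


# Toy stand-ins (assumptions) so the sketch runs without API keys or models.
def toy_retrieve(query: str) -> List[str]:
    corpus = [
        "EGCG is a polyphenol found in green tea.",
        "Rapamycin stimulates hair regrowth.",
    ]
    words = query.lower().split()
    # keep passages that contain at least one word of the query
    return [p for p in corpus if any(w in p.lower() for w in words)]


def toy_generate(query: str, passages: List[str]) -> str:
    # a real generator would rewrite the passages; here we just join them
    return " ".join(passages)


print(answer_question("egcg", toy_retrieve, toy_generate))
```

Passing the retriever and generator in as functions keeps the two steps independent, so either one can be swapped out without touching the other.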

## Get Relevant Context

First, we send a query such as "best exercises for longevity", and the index returns the video sections most closely related to exercise and longevity.

In [1]:
%pip install pinecone-client requests

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting pinecone-client
  Downloading pinecone_client-2.1.0-py3-none-any.whl (170 kB)
Collecting loguru>=0.5.0
  Downloading loguru-0.6.0-py3-none-any.whl (58 kB)
Installing collected packages: loguru, pinecone-client
Successfully installed loguru-0.6.0 pinecone-client-2.1.0


## Get API Keys

1. You will need a [Pinecone API key (free)](https://app.pinecone.io/).
2. Deploy [this model](https://huggingface.co/tomiwa1a/openai-whisper-endpoint) as an inference endpoint.

In [2]:
from getpass import getpass
# getpass tip: https://stackoverflow.com/a/54577734/5405197
PINECONE_API_KEY = getpass('Enter PINECONE_API_KEY')
HUGGING_FACE_API_KEY = getpass('Enter HUGGING_FACE_API_KEY')
# replace this with your HUGGING_FACE_ENDPOINT_URL
HUGGING_FACE_ENDPOINT_URL = "https://rl2hxotyspedkt19.us-east-1.aws.endpoints.huggingface.cloud"

Enter PINECONE_API_KEY··········
Enter HUGGING_FACE_API_KEY··········
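
If you rerun the notebook often, typing the keys each time gets tedious. A minimal sketch of one alternative, assuming you are able to set environment variables in your runtime: check the environment first and only prompt when the variable is missing. `read_secret` is a hypothetical helper, not part of the original notebook.

```python
import os
from getpass import getpass


def read_secret(name: str) -> str:
    """Return the environment variable `name`, or prompt for it if unset."""
    value = os.environ.get(name)
    if value:
        return value
    # fall back to the same interactive prompt used in the cell above
    return getpass(f"Enter {name}")


# Usage (commented out so the sketch does not prompt when run as-is):
# PINECONE_API_KEY = read_secret("PINECONE_API_KEY")
# HUGGING_FACE_API_KEY = read_secret("HUGGING_FACE_API_KEY")
```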


In [3]:
import requests
import pinecone
import json
from typing import Union

pinecone_index_id = "youtube-search"

pinecone.init(
    api_key=PINECONE_API_KEY,
    environment="us-west1-gcp"
)

def send_encoding_request(query: Union[str, list]):
    """Send the query to the Hugging Face endpoint and return its JSON response."""
    payload = json.dumps({
        "inputs": "",  # inputs key is not used but our endpoint expects it
        "query": query,
    })
    headers = {
        'Authorization': f'Bearer {HUGGING_FACE_API_KEY}',
        'Content-Type': 'application/json'
    }

    response = requests.post(HUGGING_FACE_ENDPOINT_URL, headers=headers, data=payload)
    response.raise_for_status()  # fail early on authentication or endpoint errors
    return response.json()

pinecone_index = pinecone.Index(pinecone_index_id)

def query_model(query, video_id=""):
    """Return the top 5 indexed video sections for the query, optionally limited to one video."""
    encoded_query = send_encoding_request(query)
    # only apply a metadata filter when a specific video is requested
    metadata_filter = {"video_id": {"$eq": video_id}} if video_id else None
    vectors = encoded_query['encoded_segments'][0]['vectors']
    return pinecone_index.query(vectors, top_k=5,
                                include_metadata=True,
                                filter=metadata_filter).to_dict()

In [None]:
query = "best exercises for longevity"
results = query_model(query)
results['matches'][3]
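
Each match is a dictionary with a `score` and a `metadata` entry holding the section text, title, and a timestamped URL. To get a quick overview of all matches instead of inspecting them one at a time, a small helper like the one below can be used. This is a sketch: `summarize_matches` is a hypothetical name, and `sample_results` is a hand-trimmed copy of a match from this notebook so the example runs without querying the index.

```python
def summarize_matches(results: dict) -> list:
    """Return (score, title, url) for each match, in the order received."""
    rows = []
    for match in results.get("matches", []):
        meta = match.get("metadata", {})
        rows.append((round(match["score"], 2), meta.get("title", ""), meta.get("url", "")))
    return rows


# Trimmed sample in the shape returned by query_model (assumption: other
# metadata fields are omitted here for brevity).
sample_results = {
    "matches": [
        {
            "id": "GK5YNAJrRWc-t38",
            "score": 26.6459885,
            "metadata": {
                "title": '"Longevity Molecules" Preserve Hair & Hearing in Mice',
                "url": "https://youtu.be/GK5YNAJrRWc?t=38",
                "video_id": "GK5YNAJrRWc",
            },
        }
    ]
}

for score, title, url in summarize_matches(sample_results):
    print(score, title, url)
```

With live results you would call `summarize_matches(results)` on the dictionary returned by `query_model`.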

## Create Generator Model

Next, we create our generator, which takes the retrieved passages and combines them into an answer.

> Generators are sequence-to-sequence (Seq2Seq) models that take the query and retrieved contexts as input and use them to generate an output, the answer.

[Long-Form Question-Answering](https://www.pinecone.io/learn/haystack-lfqa/#:~:text=Generators%20are%20sequence%2Dto%2Dsequence%20(Seq2Seq)%20models%20that%20take%20the%20query%20and%20retrieved%20contexts%20as%20input%20and%20use%20them%20to%20generate%20an%20output%2C%20the%20answer.)

You can think of it as a model that takes a piece of text, transforms it, and generates another piece of text. We will use the [bart_lfqa model](https://towardsdatascience.com/long-form-qa-beyond-eli5-an-updated-dataset-and-approach-319cb841aabb), which [can be found on Hugging Face](https://huggingface.co/vblagoje/bart_lfqa).
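
Because the generator takes a single piece of text, the query and the retrieved contexts have to be packed into one string. The sketch below shows the format this notebook's `generate_answer` function uses: the question first, then each context prefixed with a `<P>` marker. `build_generator_input` is a hypothetical helper name used only for illustration.

```python
def build_generator_input(query: str, documents: list) -> str:
    """Pack the question and its supporting passages into one input string."""
    conditioned_doc = "<P> " + " <P> ".join(documents)
    return "question: {} context: {}".format(query, conditioned_doc)


example = build_generator_input(
    "what is egcg",
    ["EGCG is a polyphenol found in green tea.", "It is a potent antioxidant."],
)
print(example)
```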

In [None]:
%pip install -U transformers torch

In [27]:
import torch
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_name = "vblagoje/bart_lfqa"
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)
model = model.to(device)


In [35]:
def generate_answer(query, documents):
    """Generate a long-form answer to the query from the retrieved documents."""

    # concatenate question and support documents into BART input
    conditioned_doc = "<P> " + " <P> ".join(documents)
    query_and_docs = "question: {} context: {}".format(query, conditioned_doc)

    # truncate to the tokenizer's maximum input length so long contexts do not overflow the model
    model_input = tokenizer(query_and_docs, truncation=True, padding=True, return_tensors="pt")

    # deterministic beam search (no sampling) for an answer of 64 to 256 tokens
    generated_answers_encoded = model.generate(input_ids=model_input["input_ids"].to(device),
                                               attention_mask=model_input["attention_mask"].to(device),
                                               min_length=64,
                                               max_length=256,
                                               do_sample=False,
                                               early_stopping=True,
                                               num_beams=8,
                                               temperature=1.0,
                                               top_k=None,
                                               top_p=None,
                                               eos_token_id=tokenizer.eos_token_id,
                                               no_repeat_ngram_size=3,
                                               num_return_sequences=1)
    # decode the token ids back to text; returns a list with one answer string
    answer = tokenizer.batch_decode(generated_answers_encoded, skip_special_tokens=True, clean_up_tokenization_spaces=True)
    return answer

In [36]:
query = "what is egcg"
context_results = query_model(query)

answer_context = [sentence['metadata']['text'] for sentence in context_results['matches']]

generate_answer(query, answer_context)

query_and_docs question: what is egcg context: <P>  grain. EGCG is a polyphenol found in green tea and a potent antioxidant that has shown effectiveness  against various conditions, including androgenic alopecia. Combating hair loss is not just about looks,  understanding the mechanisms of senescent alopecia and ways to reverse it can provide insights into  other aspects of aging. In this new study, the researchers used an emerging micro needle technology  to deliver drugs directly to the inner layers of the skin. Cone like micro needles were loaded  with nanoparticles containing rapamycin, EGCG, or a combination. The micro needles were applied to <P>  using dissolvable micro needles loaded with brappa mice in and epi-galocatican galate or EGCG  and active ingredients in green tea. Studies have found that rapamycin, one of the most promising  general protective drugs, not only stimulates hair regrow, but can also partially reverse hair  grain. EGCG is a polyphenol found in green tea an

['Epi-Galocatican Galate or EGCG is a polyphenol found in green tea and a potent antioxidant that has shown effectiveness  against various conditions, including androgenic alopecia. In a study, the researchers used an emerging micro needle technology  to deliver drugs directly to the inner layers of the skin. The micro needles were applied to a dissolvable micro needles loaded with brappa mice in and epi-galocatican galate  and active ingredients in Green tea. The results were dose-dependent, with moderate doses of rapamycin being the  most effective. The researchers also confirmed that the treatment resulted in increased  autophagy in follicular regions, and promoting Autophagy is currently thought to be the central mechanism of action. This study reiterates the health potential of two  molecules popular in the longevity field. Additionally, this micro needle-based  drug delivery method could potentially be used to treat various other skin conditions.']

In [39]:
context_results['matches'][0]

{'id': 'GK5YNAJrRWc-t38',
 'score': 26.6459885,
 'values': [],
 'sparseValues': {},
 'metadata': {'end': 45.0,
  'id': 'GK5YNAJrRWc-t38',
  'length': 252.0,
  'start': 38.0,
  'text': ' grain. EGCG is a polyphenol found in green tea and a potent antioxidant that has shown effectiveness  against various conditions, including androgenic alopecia. Combating hair loss is not just about looks,  understanding the mechanisms of senescent alopecia and ways to reverse it can provide insights into  other aspects of aging. In this new study, the researchers used an emerging micro needle technology  to deliver drugs directly to the inner layers of the skin. Cone like micro needles were loaded  with nanoparticles containing rapamycin, EGCG, or a combination. The micro needles were applied to',
  'thumbnail': 'https://i.ytimg.com/vi/GK5YNAJrRWc/sddefault.jpg',
  'title': '"Longevity Molecules" Preserve Hair & Hearing in Mice',
  'url': 'https://youtu.be/GK5YNAJrRWc?t=38',
  'video_id': 'GK5YNAJrRWc'