### DESCRIPTION:
    This example shows how to summarize YouTube video transcripts using Azure OpenAI and LangChain chains.

### REQUIREMENTS:
    Create a .env file with your OpenAI API key and save it in the root directory of this project.

### For more information about LangChain agent toolkits, see:
  TBD - blog post here


In [1]:
from langchain.document_loaders import YoutubeLoader
# https://www.youtube.com/watch?v=FaV0tIaWWEg
loader = YoutubeLoader.from_youtube_url(
    "https://www.youtube.com/watch?v=FaV0tIaWWEg",
    add_video_info=False
)
video_transcripts = loader.load()

In [2]:
# Access the content and metadata of each document - let's see what it looks like
for document in video_transcripts:
    content = document.page_content
    metadata = document.metadata
    print(content)

foreign have you built systems on top of Microsoft azure's data and analytics services are you wondering what Microsoft fabric means for your existing Investments and skills I recently spoke to someone who is uniquely well placed to talk about these things in fact he had so much to say about fabric though we split this recording into three parts if you want to hear the other two please make sure that you subscribed to this channel so with me today I have Tom peplow who is a principal at milliman and we also have engine's very own ad Freeman who is a senior data engineer now I'm really excited to be able to talk to Tom today about fabric because milliman have been Pioneers in the world of high performance cloud-based computation and analytics they've been Building Systems in this way since 2010 so they know a great deal about how to do this so Tom don't you already have everything you need does fabric bring anything to to the party for you yeah um we do have a lot right and we've built 
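
A transcript like the one above can run to many thousands of words, which may not fit into a single prompt. The block below is a minimal sketch of capping the transcript before it is passed to a chain, assuming a plain word budget is an acceptable stand-in for the model's token limit. `truncate_words` is a hypothetical helper written for this illustration; it is not part of LangChain.

```python
def truncate_words(text: str, max_words: int) -> str:
    """Return at most `max_words` whitespace-separated words from `text`."""
    if max_words < 0:
        raise ValueError("max_words must be non-negative")
    words = text.split()
    # Slicing past the end of the list is safe, so short texts come back unchanged.
    return " ".join(words[:max_words])


# A short excerpt from the transcript printed above
sample = "foreign have you built systems on top of Microsoft azure's data and analytics services"
print(truncate_words(sample, 5))  # foreign have you built systems
```

In this notebook the helper could be applied to `video_transcripts[0].page_content` before running the chain; the right budget depends on the model behind `utils.init_llm()`, which is not shown here.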

In [7]:
# let's use the Wikipedia tool and create a LangChain agent that uses it
from langchain.agents import initialize_agent
from langchain.agents import AgentType
from langchain.agents import load_tools
import utils

llm = utils.init_llm()
tools = load_tools(["wikipedia"], llm=llm)
agent = initialize_agent(tools,
                         llm,
                         agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION,
                         verbose=False)

# let's use the agent to get information about Microsoft
agent.run("Microsoft history")

'Microsoft was founded in 1975 by Bill Gates and Paul Allen. Its current best-selling products are the Microsoft Windows operating system, Microsoft Office, Xbox, Bing, and Microsoft Azure. Microsoft Office has gone through various versions since its first release in 1989, with Word being a part of the suite. The first version of Word was released in 1983 and has gone through various updates and revisions over the years.'

In [14]:
# let's use a sequence of prompts to create a chain
from langchain.prompts import PromptTemplate
from langchain.chains import LLMChain, SequentialChain
from langchain.memory import SimpleMemory

# chain 1 - get background info about Microsoft
ch1 = agent

# chain 2 - summarize the video transcript from YouTube
# The agent (ch1) returns its answer under the key "output", so the prompt reads
# {output} to receive the background information gathered in step 1.
template = """Write an article summarizing the text below. Be concise and use bullets to explain. Use no more than {article_length} words.

Company information:
{output}

Video transcript:
{video_transcript}

Create an article:"""

prompt_template = PromptTemplate(input_variables=["output", "article_length", "video_transcript"], template=template)
ch2 = LLMChain(llm=llm, prompt=prompt_template)

In [10]:
# check the input variables that the agent's prompt expects
print(ch1.agent.llm_chain.prompt.input_variables)

['input', 'agent_scratchpad']
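
A similar check can be made on a raw template string before it is wrapped in a `PromptTemplate`. The sketch below uses only the standard library and assumes the template uses `str.format`-style placeholders, as the one in this notebook does. `template_fields` and `example_template` are hypothetical names introduced for illustration.

```python
from string import Formatter
from typing import List


def template_fields(template: str) -> List[str]:
    """List the placeholder names in a str.format-style template, in order of first use."""
    seen: List[str] = []
    for _literal, field_name, _spec, _conversion in Formatter().parse(template):
        # field_name is None for trailing literal text and "" for positional "{}"
        if field_name and field_name not in seen:
            seen.append(field_name)
    return seen


# A shortened stand-in for the summarization prompt
example_template = (
    "Use no more than {article_length} words.\n\n"
    "Video transcript:\n{video_transcript}\n"
)
print(template_fields(example_template))  # ['article_length', 'video_transcript']
```

Comparing this list with the `input_variables` passed to `PromptTemplate` catches a misspelled or missing placeholder before the chain is run.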


In [17]:
seq_chain = SequentialChain(
                input_variables=["input", "video_transcript"],
                memory=SimpleMemory(memories={"article_length": "500"}),
                chains=[ch1, ch2],
                verbose=False)
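
The way values move through a sequential chain can be pictured with a small stand-in. The sketch below is a simplified model, not LangChain's implementation: each step is a plain function that reads from a shared dictionary and returns new keys to merge into it. The stub steps and the output key names `output` and `text` are assumptions made for this illustration.

```python
from typing import Callable, Dict, List

Values = Dict[str, str]
Step = Callable[[Values], Values]


def run_sequence(steps: List[Step], inputs: Values, memory: Values) -> Values:
    """Run steps in order, merging each step's outputs into the shared values."""
    known: Values = {**memory, **inputs}  # memory supplies fixed values such as article_length
    for step in steps:
        known.update(step(known))
    return known


def fake_agent(values: Values) -> Values:
    # Stand-in for ch1: reads "input" and writes its answer under "output"
    return {"output": "background on: " + values["input"]}


def fake_summarizer(values: Values) -> Values:
    # Stand-in for ch2: reads the memory value and the transcript, writes "text"
    return {"text": "summary (" + values["article_length"] + " words max) of "
                    + values["video_transcript"]}


result = run_sequence(
    [fake_agent, fake_summarizer],
    inputs={"input": "The history of Microsoft", "video_transcript": "transcript text"},
    memory={"article_length": "500"},
)
print(sorted(result))  # ['article_length', 'input', 'output', 'text', 'video_transcript']
```

Note that `input` still holds the original query after the first step; the first step's result is available to later steps only under its own output key.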

In [18]:
seq_chain.run(input="The history of Microsoft", video_transcript=video_transcripts[0].page_content)

"Microsoft CEO Satya Nadella kicked off the company's Build developer conference with a keynote address highlighting the importance of innovation and technology in driving economic growth and human development. Nadella discussed the company's history, tracing its roots back to the early days of computing and highlighting the role of AI in the current era. He also discussed Microsoft's latest initiatives, including the launch of Microsoft Fabric, a unified data analytics platform that combines compute and storage. Nadella also highlighted the company's efforts to promote AI safety, emphasizing the importance of building safety into the everyday toolchain. Finally, he showcased the potential of AI to transform people's lives, sharing a video about Jugalbandi, a chatbot that provides access to information about government schemes and services for people living in media-dark areas in India. Overall, Nadella's keynote emphasized the importance of innovation and technology in driving economi