In [2]:
import warnings
warnings.filterwarnings("ignore")

In [4]:
!pip install python-dotenv
!pip install --upgrade --quiet langchain
!pip install --upgrade --quiet langchain-core
!pip install --upgrade --quiet langchain-community
!pip install --upgrade --quiet langchain-together



In [13]:
import os
from dotenv import load_dotenv
from langchain_together import ChatTogether

load_dotenv(override=True)

# Create a chat model client backed by Together AI.
# temperature=0.0 keeps the output as deterministic as the model allows.
llm = ChatTogether(
    api_key=os.getenv("TOGETHER_API_KEY"),
    temperature=0.0,
    model="meta-llama/Meta-Llama-3.1-8B-Instruct-Turbo",
)
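
Note that `os.getenv` returns `None` when the variable is not set, so a missing key may only show up later as a less obvious error. A minimal sketch of failing early instead; `require_env` is a hypothetical helper introduced here for illustration, not part of `python-dotenv` or LangChain:

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable, or raise if it is unset or empty.

    Hypothetical helper for illustration only.
    """
    value = os.getenv(name)
    if not value:
        raise RuntimeError(
            f"Environment variable {name!r} is not set; "
            "add it to your .env file or export it in the shell."
        )
    return value
```

With this in place, `api_key=require_env("TOGETHER_API_KEY")` could replace the `os.getenv` call above, assuming `load_dotenv` has already run.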

In [5]:
!pip install youtube_transcript_api
!pip install pytube



In [16]:
from langchain_community.document_loaders import YoutubeLoader

# get data from youtube url using YoutubeLoader
video_url = "https://www.youtube.com/watch?v=5k1zkYCuF-8"
loader = YoutubeLoader.from_youtube_url(video_url, add_video_info=False)
data = loader.load()
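
The loader works from the video ID embedded in the URL. As a rough illustration of that step, here is a standard-library sketch that pulls the ID out of a watch URL. `extract_video_id` is a hypothetical helper written for this notebook; it is not the loader's actual implementation and only covers the two common URL shapes.

```python
from typing import Optional
from urllib.parse import parse_qs, urlparse


def extract_video_id(url: str) -> Optional[str]:
    """Return the YouTube video ID from a URL, or None if none is found.

    Hypothetical helper: handles youtube.com/watch?v=<id> and youtu.be/<id> only.
    """
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()

    # Short links carry the ID as the path: https://youtu.be/<id>
    if host == "youtu.be":
        return parsed.path.lstrip("/") or None

    # Regular links carry the ID in the "v" query parameter.
    if host == "youtube.com" or host.endswith(".youtube.com"):
        ids = parse_qs(parsed.query).get("v")
        return ids[0] if ids else None

    return None


video_id = extract_video_id("https://www.youtube.com/watch?v=5k1zkYCuF-8")
print(video_id)
```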

In [17]:
# Show the transcript text of the loaded video
data[0].page_content

"prompt engineering makes complete sense you tell the AI precisely what you want to do in order to squeeze out every last bit of its performance until you start to get some unreasonable improvements by telling the chat Bots to read the question again or simply giving them a tip maybe because they're born and trained in the US and look at that they even made prompt engineering into a real high-paying job and just to talk to chat Bots all day how ridiculous but this seemingly basic practice has been in evolving into something extremely complex and arguably one of the most over engineered aspects of large language modeling like we got all these methods to choose from with some of the latest research still competing against each other about what is the best prompting method but at the end of the day do all these really matter because beneath all these variations they are all ultimately focused on enhancing the autor regressive process within llm that is the process of predicting the next t

In [18]:
# Static approach: pass a fixed prompt together with the video transcript to the LLM.
# The "system" and "human" messages are defined as a list of (role, content) tuples.

messages = [
    (
        "system",
        """
        Read through the entire transcript carefully.
        Provide a concise summary of the video's main topic and purpose.
        Extract and list the five most interesting or important points from the transcript.
        For each point: State the key idea in a clear and concise manner.

        - Ensure your summary and key points capture the essence of the video without including unnecessary details.
        - Use clear, engaging language that is accessible to a general audience.
        - If the transcript includes any statistical data, expert opinions, or unique insights, prioritize including these in your summary or key points.
        """
    ),
    ("human", data[0].page_content)
]

llm_msg = llm.invoke(messages)
print(llm_msg)

content='**Summary:**\n\nThe video discusses the concept of prompt engineering, a technique used to optimize the performance of large language models (LLMs) by providing them with the right input to elicit the desired output. The speaker explains that prompt engineering is not a magic solution, but rather a way to tap into the capabilities that the model has already learned during its pre-training stage. The speaker also discusses the concept of Chain of Thought, a method that involves having the model articulate intermediate reasoning steps when solving complex problems. The video also touches on the idea of test-time compute, a new concept that may be the next big breakthrough in LLMs.\n\n**Five Key Points:**\n\n1. **Prompt Engineering is Not a Magic Solution**: Prompt engineering is not a way to boost a model\'s performance, but rather a way to tap into the capabilities that the model has already learned during its pre-training stage.\n\n2. **Chain of Thought is Not a New Concept**:

In [19]:
# Dynamic approach: build the prompt from a template and run it with LLMChain

from langchain.prompts import PromptTemplate
from langchain.chains import LLMChain

# Specify the summarizer template; {video_transcript} is filled in at call time
summarizer_template = PromptTemplate(
    input_variables=["video_transcript"],
    template="""
    Read through the entire transcript carefully.
    Provide a concise summary of the video's main topic and purpose.
    Extract and list the five most interesting or important points from the transcript.
    For each point: State the key idea in a clear and concise manner.

    - Ensure your summary and key points capture the essence of the video without including unnecessary details.
    - Use clear, engaging language that is accessible to a general audience.
    - If the transcript includes any statistical data, expert opinions, or unique insights, prioritize including these in your summary or key points.
    
    Video transcript: {video_transcript}
    """
)
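
To make the placeholder mechanics concrete: with its default settings, `PromptTemplate` fills `{video_transcript}` in much the same way Python's `str.format` does. The sketch below mimics that substitution with the standard library only. `render_prompt` and the shortened template are assumptions made for illustration; this is not LangChain's code.

```python
# Shortened stand-in for the summarizer template above (illustration only).
template = """
Provide a concise summary of the video's main topic and purpose.

Video transcript: {video_transcript}
"""


def render_prompt(template: str, **variables: str) -> str:
    """Fill named placeholders in the template; raises KeyError if one is missing."""
    return template.format(**variables)


prompt_text = render_prompt(
    template,
    video_transcript="prompt engineering makes complete sense ...",
)
print(prompt_text)
```

The same template can be reused for any transcript, which is the point of the dynamic approach: only the variable changes between calls.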

In [20]:
# Build the chain, then invoke it with the transcript
chain = LLMChain(llm=llm, prompt=summarizer_template)

summary = chain.invoke({
    "video_transcript": data[0].page_content
})



In [21]:
from IPython.display import Markdown, display
display(Markdown(summary["text"]))

**Summary:**

The video discusses the concept of prompt engineering, a technique used to optimize the performance of large language models (LLMs) by providing them with specific instructions or examples to improve their understanding and generation of text. The speaker explains that prompt engineering is not a magic solution, but rather a way to tap into the capabilities that the model has already learned during its pre-training stage. The video also touches on the concept of Chain of Thought, a method that allows the model to articulate intermediate reasoning steps when solving complex problems, and how it can be implemented in different ways, such as through special tokens or human-augmented synthetic data.

**Five Key Points:**

1. **Prompt Engineering is not a magic solution**: Prompt engineering is not a way to boost a model's performance, but rather a way to tap into its existing capabilities by providing it with specific instructions or examples.

2. **Chain of Thought is not a single method**: Chain of Thought is a concept that can be implemented in different ways, such as through special tokens or human-augmented synthetic data, and its effectiveness depends on how it is implemented.

3. **LLMs are sensitive to prompt variations**: Large language models are highly sensitive to prompt variations, such as word order, spacing, and capitalization, which can drastically impact their performance.

4. **Self-reflection and self-verification can improve performance**: Self-reflection and self-verification, which allow the model to iteratively check its answers, can significantly improve its performance on reasoning and generation tasks.

5. **Prompt engineering may be an illusion**: The speaker suggests that prompt engineering methods may be an illusion, and that the model's performance improvements may be due to coincidence or the model's ability to generate more tokens, rather than the specific method used.
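
If the key points are needed as data rather than rendered Markdown, they can be pulled out of the summary text with a regular expression. This is a minimal sketch that assumes the model keeps the `N. **Title**: explanation` layout seen above, which is not guaranteed from run to run. `parse_key_points` and `sample_summary` are hypothetical names introduced for illustration.

```python
import re
from typing import List, Tuple

# Matches lines such as: 1. **Some title**: Some explanation.
KEY_POINT_PATTERN = re.compile(r"^(\d+)\.\s+\*\*(.+?)\*\*:\s*(.+)$", re.MULTILINE)


def parse_key_points(summary_text: str) -> List[Tuple[int, str, str]]:
    """Return (number, title, explanation) tuples for each numbered key point."""
    return [
        (int(number), title.strip(), body.strip())
        for number, title, body in KEY_POINT_PATTERN.findall(summary_text)
    ]


# Two points copied from the output above, used as sample input.
sample_summary = """**Five Key Points:**

1. **Prompt Engineering is not a magic solution**: Prompt engineering is not a way to boost a model's performance, but rather a way to tap into its existing capabilities by providing it with specific instructions or examples.

2. **Chain of Thought is not a single method**: Chain of Thought is a concept that can be implemented in different ways, such as through special tokens or human-augmented synthetic data, and its effectiveness depends on how it is implemented.
"""

for number, title, body in parse_key_points(sample_summary):
    print(number, title)
```

In the notebook, `parse_key_points(summary["text"])` would apply the same parsing to the chain's output.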