# 1. Model - abstracts over the LLM API
# 2. Prompt Template - abstracts over the prompts sent to the LLM
# 3. Output Parser - transforms raw output into workable formats (strings, JSON, etc.)

# Other components: Document Loaders, Retrievers, Vector Stores, Agents, Tools, and more

In [1]:
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-4o-mini")

llm.invoke("What is LangChain?")

AIMessage(content='LangChain is a framework designed for developing applications that leverage large language models (LLMs). It provides a set of tools and components that make it easier to build applications that require natural language processing and understanding capabilities. LangChain allows developers to create workflows that incorporate LLMs, enabling various functionalities such as text generation, question answering, summarization, and more.\n\nKey features of LangChain include:\n\n1. **Modular Components**: LangChain provides reusable components for different tasks, such as prompt management, memory handling, and chaining together multiple operations.\n\n2. **Integration with APIs**: The framework supports integration with various LLMs and APIs, allowing developers to connect their applications to models from providers like OpenAI, Hugging Face, and others.\n\n3. **Chaining**: LangChain allows for the creation of complex workflows by chaining together multiple calls to LLMs 

In [2]:
output = llm.invoke("What is LangChain?")

output

AIMessage(content='LangChain is an open-source framework designed for building applications that leverage large language models (LLMs). It provides a set of tools and components that make it easier for developers to integrate LLMs into their applications, allowing for functionalities such as natural language understanding, text generation, and conversational interfaces.\n\nThe key features of LangChain include:\n\n1. **Modularity**: LangChain is designed with modular components that can be easily combined to create different types of applications. This modularity allows developers to customize and extend the functionality of their applications as needed.\n\n2. **Integration with LLMs**: LangChain supports various LLMs and APIs, enabling developers to connect and utilize different models for their specific use cases.\n\n3. **Memory Management**: The framework provides tools for managing conversation history and context, which is essential for creating applications that require stateful 

In [3]:
type(output)

langchain_core.messages.ai.AIMessage

In [4]:
output.response_metadata

{'token_usage': {'completion_tokens': 337,
  'prompt_tokens': 12,
  'total_tokens': 349,
  'completion_tokens_details': {'accepted_prediction_tokens': 0,
   'audio_tokens': 0,
   'reasoning_tokens': 0,
   'rejected_prediction_tokens': 0},
  'prompt_tokens_details': {'audio_tokens': 0, 'cached_tokens': 0}},
 'model_name': 'gpt-4o-mini-2024-07-18',
 'system_fingerprint': 'fp_79129002ea',
 'finish_reason': 'stop',
 'logprobs': None}
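
The metadata above is a plain dictionary, so usage accounting is a matter of reading a few keys. Here is a minimal sketch, assuming the dictionary has the shape shown in the output; `summarize_usage` is a hypothetical helper written for this example, not a LangChain function, and other providers may report different keys.

```python
# Hypothetical helper (not a LangChain API): pull the headline numbers
# out of a response_metadata dict shaped like the output above.
def summarize_usage(response_metadata: dict) -> dict:
    usage = response_metadata.get("token_usage", {})
    prompt = usage.get("prompt_tokens", 0)
    completion = usage.get("completion_tokens", 0)
    return {
        "model": response_metadata.get("model_name"),
        "prompt_tokens": prompt,
        "completion_tokens": completion,
        # Fall back to the sum if total_tokens is not reported.
        "total_tokens": usage.get("total_tokens", prompt + completion),
        "finish_reason": response_metadata.get("finish_reason"),
    }


# Values copied from the output above.
example_metadata = {
    "token_usage": {"completion_tokens": 337, "prompt_tokens": 12, "total_tokens": 349},
    "model_name": "gpt-4o-mini-2024-07-18",
    "finish_reason": "stop",
}

print(summarize_usage(example_metadata))
```

In the notebook, the same helper could be called as `summarize_usage(output.response_metadata)`.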

In [5]:
from langchain_core.output_parsers import StrOutputParser

output_parser = StrOutputParser()

output_parser.invoke(output)

'LangChain is an open-source framework designed for building applications that leverage large language models (LLMs). It provides a set of tools and components that make it easier for developers to integrate LLMs into their applications, allowing for functionalities such as natural language understanding, text generation, and conversational interfaces.\n\nThe key features of LangChain include:\n\n1. **Modularity**: LangChain is designed with modular components that can be easily combined to create different types of applications. This modularity allows developers to customize and extend the functionality of their applications as needed.\n\n2. **Integration with LLMs**: LangChain supports various LLMs and APIs, enabling developers to connect and utilize different models for their specific use cases.\n\n3. **Memory Management**: The framework provides tools for managing conversation history and context, which is essential for creating applications that require stateful interactions, such

# LCEL! - LangChain Expression Language
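LCEL is a declarative syntax for combining LangChain components (Lego pieces!) into chains: reusable building blocks. Components are joined with the `|` operator, and the output of each one becomes the input of the next.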
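
To build intuition for the `|` syntax, here is a toy sketch of how a pipe operator can compose steps in plain Python through `__or__`. The `Step` class and the three fake components are hypothetical and exist only for this illustration; this is not LangChain's implementation, which also handles things like batching, streaming, and async.

```python
class Step:
    """Toy stand-in for a chain component: wraps a function and supports `|`."""

    def __init__(self, func):
        self.func = func

    def invoke(self, value):
        return self.func(value)

    def __or__(self, other):
        # `a | b` builds a new Step that runs a, then feeds its result to b.
        return Step(lambda value: other.invoke(self.invoke(value)))


# Three fake components mirroring prompt -> model -> output parser.
fake_prompt = Step(lambda inputs: f"What is {inputs['topic']}?")
fake_model = Step(lambda text: {"content": f"(answer to: {text})"})
fake_parser = Step(lambda message: message["content"])

toy_chain = fake_prompt | fake_model | fake_parser
print(toy_chain.invoke({"topic": "LangChain"}))
```

The point of the sketch is the data flow: each step only needs an `invoke` method, and the pipe wires the output of one step into the next.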

In [6]:
chain = llm | output_parser

chain.invoke("What is LangChain?")

'LangChain is a framework designed for building applications that utilize large language models (LLMs). It provides tools and components to simplify the development process by allowing developers to create robust applications that can interact with LLMs in various ways. LangChain is particularly useful for tasks such as:\n\n1. **Prompt Management**: It helps manage prompts and responses effectively, enabling the creation of dynamic and context-aware interactions with the LLMs.\n\n2. **Chaining**: The framework allows developers to create chains of calls to LLMs or other tools, enabling more complex workflows that can involve multiple steps or components.\n\n3. **Integration**: LangChain can integrate with various data sources, APIs, and tools, making it easier to build applications that require external data or functionality.\n\n4. **Memory**: It can manage conversational context and memory, allowing applications to maintain state over interactions, which is crucial for building chatbo

In [7]:
from langchain_core.prompts import ChatPromptTemplate

prompt = ChatPromptTemplate.from_template("What is {ai_python_framework}")

chain = prompt | llm | output_parser

chain.invoke({"ai_python_framework": "LangChain"})

'LangChain is a framework designed to facilitate the development of applications that incorporate large language models (LLMs). It provides a structured way to build applications that leverage the capabilities of LLMs for various tasks, such as natural language processing, data retrieval, and conversational AI.\n\nKey features of LangChain include:\n\n1. **Modularity**: LangChain is built around the concept of modular components, which allows developers to easily integrate different functionalities, such as input/output handling, memory management, and API interaction.\n\n2. **Chains**: The framework allows developers to create "chains" of operations, where the output of one module can serve as the input to another. This enables complex workflows and interactions with LLMs.\n\n3. **Agents**: LangChain supports the creation of agents, which can make decisions based on the inputs they receive and can dynamically choose which actions to take.\n\n4. **Memory**: It provides capabilities for

In [8]:
chain.invoke({"ai_python_framework": "Pydantic?"})

"Pydantic is a data validation and settings management library for Python, primarily used for defining and validating data structures through Python type annotations. It is built on top of Python's type hints and provides a way to enforce type constraints, validate data, and serialize/deserialize data to and from various formats, such as JSON.\n\nKey features of Pydantic include:\n\n1. **Data Validation**: Pydantic checks that the data conforms to the types and constraints defined in your models. If the data does not match the expected types, Pydantic raises errors, making it easier to catch issues early.\n\n2. **Type Annotations**: Pydantic leverages Python's type hints, allowing you to define models using standard Python types, such as `str`, `int`, `float`, `List`, `Dict`, etc. This makes your code more readable and maintainable.\n\n3. **Nested Models**: Pydantic supports nesting of models, allowing you to create complex data structures that can also be validated according to the sa
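
The placeholder name in the template decides which key the chain expects in its input dictionary. The sketch below uses only the standard library to list the placeholders in a template string and fill them in; `template_variables` is a hypothetical helper, and LangChain's own templating is not reproduced here.

```python
from string import Formatter


def template_variables(template: str) -> list[str]:
    """Return the placeholder names found in a str.format-style template."""
    return [name for _, name, _, _ in Formatter().parse(template) if name]


template = "What is {ai_python_framework}"

print(template_variables(template))
print(template.format(ai_python_framework="LangChain"))

# A missing key fails loudly instead of silently producing a broken prompt.
try:
    template.format()
except KeyError as err:
    print(f"missing variable: {err}")
```

Note that the template has no trailing question mark, which is why cell 8 passes `"Pydantic?"` with the punctuation included in the input.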

In [11]:
from langchain_ollama import ChatOllama

llm = ChatOllama(model="llama3.2")

chain = prompt | llm | output_parser

print(chain.invoke({"ai_python_framework": "LangChain"}))
print(chain.invoke({"ai_python_framework": "Pydantic?"}))

LangChain is an open-source, decentralized protocol for automating data fetching and processing. It's built on top of Web3 and utilizes the Polkadot network to enable seamless interactions between web applications and blockchain-based data sources.

LangChain was founded by Maxime Lagarde in 2021 with the goal of making it easier for developers to integrate blockchain data into their web applications. The protocol uses a novel approach to data fetching, called "data chaining," which allows users to chain together multiple data sources and automate complex workflows.

Key Features of LangChain:

1. **Data Chaining**: LangChain enables users to link together multiple data sources, such as blockchain networks, external APIs, and databases, into a single workflow.
2. **Web3 Integration**: LangChain integrates seamlessly with Web3 frameworks like Solidity, Rust, and JavaScript, making it easy for developers to access blockchain data from their web applications.
3. **Polkadot Network**: Lang

In [12]:
def reusable_chain(llm):
    # Reuses the `prompt` and `output_parser` defined in earlier cells; only the model changes.
    chain = prompt | llm | output_parser
    return chain

llm_openai = ChatOpenAI(model="gpt-4o-mini")
llm_ollama = ChatOllama(model="llama3.2")

chain1 = reusable_chain(llm_openai)
chain2 = reusable_chain(llm_ollama)

chain1.invoke({"ai_python_framework": "LangChain"})
chain2.invoke({"ai_python_framework": "Pydantic?"})

"Pydantic is a popular Python library used for building robust, fast, and scalable data models. It provides a powerful way to define data structures, validate input data, and generate code for serialization and deserialization.\n\nPydantic is often compared to other libraries like Marshmallow or Django's built-in validation mechanisms. However, Pydantic has some key advantages that make it a favorite among developers:\n\n1. **Type Safety**: Pydantic uses type hinting to ensure that the data being validated matches the expected structure.\n2. **Validation**: Pydantic can validate data against a set of rules defined in the model, ensuring that the data conforms to the expected format.\n3. **Serialization and Deserialization**: Pydantic provides built-in support for serializing data to JSON or other formats, and deserializing data from these formats.\n4. **Automatic Generation of Code**: Pydantic can automatically generate code for serialization and deserialization based on the model defi
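
Because `reusable_chain` takes the model as a parameter, the same question can be sent to several models and the answers collected side by side. This is a minimal sketch with stand-in objects so it runs offline; `compare_chains` and `FakeChain` are hypothetical names introduced for this example.

```python
def compare_chains(chains: dict, inputs: dict) -> dict:
    """Invoke every chain with the same inputs and collect answers by name."""
    return {name: chain.invoke(inputs) for name, chain in chains.items()}


class FakeChain:
    """Stand-in with the same .invoke() shape as a chain, for offline runs."""

    def __init__(self, label):
        self.label = label

    def invoke(self, inputs):
        return f"[{self.label}] answer about {inputs['ai_python_framework']}"


answers = compare_chains(
    {"openai": FakeChain("openai"), "ollama": FakeChain("ollama")},
    {"ai_python_framework": "LangChain"},
)
for name, answer in answers.items():
    print(f"{name}: {answer}")
```

In the notebook, the stand-ins would be replaced by the real chains, for example `compare_chains({"openai": chain1, "ollama": chain2}, {"ai_python_framework": "LangChain"})`.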

# Processing PDFs and organizing them into a table

In [14]:
from langchain_community.document_loaders import PyPDFLoader

loader = PyPDFLoader("./assets-resources/attention-paper.pdf")

docs = loader.load()

docs[0]

Document(metadata={'source': './assets-resources/attention-paper.pdf', 'page': 0}, page_content='Provided proper attribution is provided, Google hereby grants permission to\nreproduce the tables and figures in this paper solely for use in journalistic or\nscholarly works.\nAttention Is All You Need\nAshish Vaswani∗\nGoogle Brain\navaswani@google.com\nNoam Shazeer∗\nGoogle Brain\nnoam@google.com\nNiki Parmar∗\nGoogle Research\nnikip@google.com\nJakob Uszkoreit∗\nGoogle Research\nusz@google.com\nLlion Jones∗\nGoogle Research\nllion@google.com\nAidan N. Gomez∗ †\nUniversity of Toronto\naidan@cs.toronto.edu\nŁukasz Kaiser∗\nGoogle Brain\nlukaszkaiser@google.com\nIllia Polosukhin∗ ‡\nillia.polosukhin@gmail.com\nAbstract\nThe dominant sequence transduction models are based on complex recurrent or\nconvolutional neural networks that include an encoder and a decoder. The best\nperforming models also connect the encoder and decoder through an attention\nmechanism. We propose a new simple networ

In [17]:
from IPython.display import Markdown

Markdown(str(docs[0]))

page_content='Provided proper attribution is provided, Google hereby grants permission to
reproduce the tables and figures in this paper solely for use in journalistic or
scholarly works.
Attention Is All You Need
Ashish Vaswani∗
Google Brain
avaswani@google.com
Noam Shazeer∗
Google Brain
noam@google.com
Niki Parmar∗
Google Research
nikip@google.com
Jakob Uszkoreit∗
Google Research
usz@google.com
Llion Jones∗
Google Research
llion@google.com
Aidan N. Gomez∗ †
University of Toronto
aidan@cs.toronto.edu
Łukasz Kaiser∗
Google Brain
lukaszkaiser@google.com
Illia Polosukhin∗ ‡
illia.polosukhin@gmail.com
Abstract
The dominant sequence transduction models are based on complex recurrent or
convolutional neural networks that include an encoder and a decoder. The best
performing models also connect the encoder and decoder through an attention
mechanism. We propose a new simple network architecture, the Transformer,
based solely on attention mechanisms, dispensing with recurrence and convolutions
entirely. Experiments on two machine translation tasks show these models to
be superior in quality while being more parallelizable and requiring significantly
less time to train. Our model achieves 28.4 BLEU on the WMT 2014 English-
to-German translation task, improving over the existing best results, including
ensembles, by over 2 BLEU. On the WMT 2014 English-to-French translation task,
our model establishes a new single-model state-of-the-art BLEU score of 41.8 after
training for 3.5 days on eight GPUs, a small fraction of the training costs of the
best models from the literature. We show that the Transformer generalizes well to
other tasks by applying it successfully to English constituency parsing both with
large and limited training data.
∗Equal contribution. Listing order is random. Jakob proposed replacing RNNs with self-attention and started
the effort to evaluate this idea. Ashish, with Illia, designed and implemented the first Transformer models and
has been crucially involved in every aspect of this work. Noam proposed scaled dot-product attention, multi-head
attention and the parameter-free position representation and became the other person involved in nearly every
detail. Niki designed, implemented, tuned and evaluated countless model variants in our original codebase and
tensor2tensor. Llion also experimented with novel model variants, was responsible for our initial codebase, and
efficient inference and visualizations. Lukasz and Aidan spent countless long days designing various parts of and
implementing tensor2tensor, replacing our earlier codebase, greatly improving results and massively accelerating
our research.
†Work performed while at Google Brain.
‡Work performed while at Google Research.
31st Conference on Neural Information Processing Systems (NIPS 2017), Long Beach, CA, USA.
arXiv:1706.03762v7  [cs.CL]  2 Aug 2023' metadata={'source': './assets-resources/attention-paper.pdf', 'page': 0}

In [18]:
docs[0].metadata

{'source': './assets-resources/attention-paper.pdf', 'page': 0}

In [19]:
def load_docs_to_string(docs):
    return "\n".join([doc.page_content for doc in docs])

docs_string = load_docs_to_string(docs)

docs_string

'Provided proper attribution is provided, Google hereby grants permission to\nreproduce the tables and figures in this paper solely for use in journalistic or\nscholarly works.\nAttention Is All You Need\nAshish Vaswani∗\nGoogle Brain\navaswani@google.com\nNoam Shazeer∗\nGoogle Brain\nnoam@google.com\nNiki Parmar∗\nGoogle Research\nnikip@google.com\nJakob Uszkoreit∗\nGoogle Research\nusz@google.com\nLlion Jones∗\nGoogle Research\nllion@google.com\nAidan N. Gomez∗ †\nUniversity of Toronto\naidan@cs.toronto.edu\nŁukasz Kaiser∗\nGoogle Brain\nlukaszkaiser@google.com\nIllia Polosukhin∗ ‡\nillia.polosukhin@gmail.com\nAbstract\nThe dominant sequence transduction models are based on complex recurrent or\nconvolutional neural networks that include an encoder and a decoder. The best\nperforming models also connect the encoder and decoder through an attention\nmechanism. We propose a new simple network architecture, the Transformer,\nbased solely on attention mechanisms, dispensing with recurren
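
`docs_string` holds the text of every page, so it can be worth checking its size before placing it in a prompt. The sketch below estimates a token count with a rough rule of thumb of about 4 characters per token. That ratio is an assumption, not a real tokenizer, and `estimate_tokens` and `truncate_to_budget` are hypothetical helpers.

```python
CHARS_PER_TOKEN = 4  # Assumption: rough rule of thumb for English text, not a tokenizer.


def estimate_tokens(text: str) -> int:
    """Very rough token estimate based on character count."""
    return len(text) // CHARS_PER_TOKEN


def truncate_to_budget(text: str, max_tokens: int) -> str:
    """Cut the text so its estimated token count fits within max_tokens."""
    return text[: max_tokens * CHARS_PER_TOKEN]


sample = "word " * 1000  # 5,000 characters of placeholder text
print(estimate_tokens(sample))
print(len(truncate_to_budget(sample, max_tokens=100)))
```

In the notebook, `estimate_tokens(docs_string)` would give a ballpark figure; an exact count would need the model's actual tokenizer.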

In [21]:
llm = ChatOpenAI(model="gpt-4o-mini")

In [22]:
def pdf_summarizer(docs_string):
    """Summarize the full document text as bullet points, using the global `llm` and `output_parser`."""
    prompt = ChatPromptTemplate.from_template("Summarize the following document: {document} as bullet points:")
    chain_summarizer = prompt | llm | output_parser
    return chain_summarizer.invoke({"document": docs_string})


summary_pdf = pdf_summarizer(docs_string)

Markdown(summary_pdf)

- **Title**: "Attention Is All You Need"
- **Authors**: Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Łukasz Kaiser, Illia Polosukhin.
- **Abstract**: Introduction of the Transformer model, which uses only attention mechanisms for sequence transduction, eliminating the need for recurrent or convolutional neural networks. The model shows superior performance in machine translation tasks, achieving state-of-the-art BLEU scores for English-to-German and English-to-French translations while being more parallelizable and requiring less training time.

- **Introduction**:
  - Traditional models use recurrent or convolutional architectures.
  - Attention mechanisms enhance these models but are typically used with recurrence.
  - The Transformer model is proposed to use self-attention exclusively, allowing for more parallel processing.

- **Model Architecture**:
  - Comprises an encoder-decoder structure with stacked layers.
  - Each encoder layer includes multi-head self-attention and feed-forward networks.
  - The decoder has an additional layer for encoder-decoder attention.

- **Attention Mechanism**:
  - Introduces "Scaled Dot-Product Attention" and "Multi-Head Attention" to capture dependencies across input sequences.
  - Attention can handle long-range dependencies more effectively than recurrent models.

- **Positional Encoding**:
  - Since the model lacks recurrence and convolution, positional encodings are added to input embeddings to retain sequence information.

- **Training Details**:
  - Trained on WMT 2014 English-German and English-French datasets.
  - Uses the Adam optimizer with a learning rate schedule and regularization techniques like dropout and label smoothing.

- **Results**:
  - Transformer outperforms previous models in BLEU scores for both translation tasks.
  - Demonstrates effectiveness in English constituency parsing, achieving competitive results with limited data.

- **Conclusion**:
  - The Transformer architecture introduces a significant advancement in sequence transduction, achieving faster training times and better performance than traditional models.
  - Future work may include application to various tasks beyond text and exploration of local attention mechanisms.

- **Code Availability**: The implementation is available on GitHub (tensorflow/tensor2tensor).

In [24]:
import pandas as pd

def update_summary_dataframe(docs, summary, df=None):
    """
    Updates a pandas DataFrame with document metadata and summary.
    Creates a new DataFrame if none is provided.
    
    Args:
        docs: List of documents with metadata
        summary: Summary text of the documents
        df: Optional existing DataFrame to update
        
    Returns:
        Updated pandas DataFrame
    """
    # Create a dictionary with the new data
    new_data = {
        'paper_path': [docs[0].metadata['source']],
        'summary': [summary]
    }
    
    # Create new DataFrame with the data
    new_df = pd.DataFrame(new_data)
    
    # If existing df provided, concatenate with new data
    if df is not None:
        return pd.concat([df, new_df], ignore_index=True)
    
    return new_df

# Create initial DataFrame
df = update_summary_dataframe(docs, summary_pdf)

# Display the DataFrame 
df

| | paper_path | summary |
|---|---|---|
| 0 | ./assets-resources/attention-paper.pdf | `- **Title**: "Attention Is All You Need"\n- **...` |


In [25]:
pdf_path2 = "./assets-resources/llm_paper_know_dont_know.pdf"

loader2 = PyPDFLoader(pdf_path2)

docs2 = loader2.load()

docs2_string = load_docs_to_string(docs2)

summary_pdf2 = pdf_summarizer(docs2_string)

Markdown(summary_pdf2)

The document "Do Large Language Models Know What They Don’t Know?" explores the self-knowledge capabilities of large language models (LLMs) regarding their understanding of unanswered or unknowable questions. Despite their impressive performance in various natural language processing tasks, LLMs have limitations in retaining and comprehending information. The authors emphasize the importance of self-knowledge, defined as the ability of models to recognize their limitations and convey uncertainty when faced with unanswerable questions.

To evaluate this self-knowledge, the authors introduce a novel dataset called SelfAware, which includes 1,032 unanswerable questions across five categories and 2,337 corresponding answerable questions. They propose an automated methodology to assess uncertainty in model responses using text similarity algorithms and quantify self-knowledge via F1 scores.

Through experiments involving 20 LLMs, including GPT-3 and LLaMA, the authors find that while these models exhibit some degree of self-knowledge, they still lag behind human proficiency. The best-performing model, GPT-4, achieved a self-knowledge score of 75.47%, compared to a human benchmark of 84.93%. The study shows that in-context learning and instruction tuning can enhance LLM self-knowledge, but significant gaps remain between model and human performance.

The key contributions of the study include the development of the SelfAware dataset, an innovative evaluation technique for measuring uncertainty in model outputs, and a comprehensive analysis of the self-knowledge capabilities of various LLMs. The findings highlight the need for further research to improve the self-awareness of LLMs, which is crucial for their reliable application in various fields.

In [26]:
df2 = update_summary_dataframe(docs2, summary_pdf2, df)

df2

| | paper_path | summary |
|---|---|---|
| 0 | ./assets-resources/attention-paper.pdf | `- **Title**: "Attention Is All You Need"\n- **...` |
| 1 | ./assets-resources/llm_paper_know_dont_know.pdf | `The document "Do Large Language Models Know Wh...` |
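
The two-paper workflow above repeats the same steps for each file: load the PDF, join its pages, summarize, and add a row. Below is a sketch of wrapping those steps in a loop that builds one row per path. The loader and summarizer are passed in as functions so the example can run with stand-ins; `summarize_papers` and the fake inputs are hypothetical.

```python
def summarize_papers(paths, load_text, summarize):
    """Build one {'paper_path', 'summary'} row per path.

    load_text and summarize are caller-supplied functions, so this sketch
    does not depend on any particular loader or model.
    """
    rows = []
    for path in paths:
        text = load_text(path)
        rows.append({"paper_path": path, "summary": summarize(text)})
    return rows


# Stand-ins so the example runs without PDFs or an API key.
fake_texts = {"a.pdf": "first paper text", "b.pdf": "second paper text"}
rows = summarize_papers(
    paths=["a.pdf", "b.pdf"],
    load_text=lambda path: fake_texts[path],
    summarize=lambda text: text.upper(),
)
print(rows)
```

In the notebook, `load_text` could be `lambda p: load_docs_to_string(PyPDFLoader(p).load())` and `summarize` could be `pdf_summarizer`; passing the resulting list to `pd.DataFrame(rows)` gives the same two columns as the table above.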
