# Research on Large Language Models

This investigation evaluates several large language models (LLMs) to determine how well they suit a retrieval-augmented generation (RAG) application. We assessed FLAN-T5, BLOOM, Mistral 7B, GPT-4o, text-bison, and Gemini Pro on accuracy, content generation capabilities, and response quality. The aim is to identify the model that best meets our requirements for precise and engaging outputs in a RAG context.

In [None]:
pip install python-dotenv==1.0.1 langchain==0.2.1 langchain-community==0.2.1

In [1]:
import os
import time
from dotenv import load_dotenv

In [2]:
load_dotenv()

True

In [3]:
questions = ["who won the FIFA World Cup in the year 1998?", 
             "what is the capital of Jamaica?", 
             "who is the lead singer in Aerosmith?",
             "how many NBA titles did Michael Jordan win?",
             "write me a four line poem about computers"]

In [4]:
def llm_test(llm, prompt_list):
    print("-----")
    for prompt in prompt_list:
        print(f"{prompt}\n {llm.predict(prompt)}\n")
        print("-----")
        time.sleep(2)
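
In LangChain 0.2, `predict` is deprecated in favor of `invoke`. The following is a minimal sketch of an equivalent helper, assuming the model object exposes `invoke`. Completion models return a plain string and chat models return a message object with a `.content` attribute, so the sketch handles both. The names `llm_test_invoke` and `EchoLLM` and the `delay` parameter are illustrative and not part of the original notebook.

```python
import time


def llm_test_invoke(llm, prompt_list, delay=2.0):
    """Print each prompt with the model's reply and return the replies."""
    replies = []
    print("-----")
    for prompt in prompt_list:
        result = llm.invoke(prompt)
        # Chat models return a message object; completion models return a str.
        text = getattr(result, "content", result)
        print(f"{prompt}\n {text}\n")
        print("-----")
        replies.append(text)
        time.sleep(delay)  # pause between calls, as in the original helper
    return replies


class EchoLLM:
    """Hypothetical stand-in for a real model, used only to exercise the helper."""

    def invoke(self, prompt):
        return f"echo: {prompt}"


replies = llm_test_invoke(EchoLLM(), ["hello"], delay=0)
```

Returning the replies, rather than only printing them, makes it possible to inspect or score the answers afterwards.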

<br>
<br>

# Hugging Face

Hugging Face is a company and open-source community known for its contributions to natural language processing (NLP) and machine learning. It provides a platform that hosts a vast collection of pre-trained models, datasets, and tools, making it easier for developers and researchers to implement state-of-the-art AI in their projects.

To use Hugging Face:
  * Create an access token here: https://huggingface.co/settings/tokens
  * Create an environment variable called: HUGGINGFACEHUB_API_TOKEN
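
Since every provider in this notebook reads its credentials from environment variables, it can help to check that they are set before calling a model. The sketch below is one possible check, not part of the original notebook. The helper name `missing_env_vars` is hypothetical, and the example uses an explicit dictionary with a dummy value instead of the real environment.

```python
import os


def missing_env_vars(names, env=None):
    """Return the names that are unset or empty in the given environment."""
    env = os.environ if env is None else env
    return [name for name in names if not env.get(name)]


# Example with an explicit mapping instead of the real environment.
example_env = {"HUGGINGFACEHUB_API_TOKEN": "hf_example"}
missing = missing_env_vars(
    ["HUGGINGFACEHUB_API_TOKEN", "GOOGLE_API_KEY"], env=example_env
)
print(missing)
```

In the notebook, the same function could be called without the `env` argument after `load_dotenv()` to check the real environment.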

In [None]:
pip install langchain-huggingface==0.0.3

In [60]:
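# HuggingFaceHub lives in langchain_community; importing it from the langchain root is deprecated.
from langchain_community.llms import HuggingFaceHub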
from langchain_huggingface.llms import HuggingFaceEndpoint

## FLAN

- https://huggingface.co/google/flan-t5-large

Flan-T5 is an open-source LLM that’s available for commercial usage. Published by Google researchers, Flan-T5 is an encoder-decoder model pre-trained on a variety of language tasks. The model has been trained on supervised and unsupervised datasets with the goal of learning mappings between sequences of text.

In [61]:
flan = HuggingFaceHub(repo_id="google/flan-t5-large", model_kwargs={"temperature": 0.5})

llm_test(flan, questions)

-----
who won the FIFA World Cup in the year 1998?
 argentina

-----
what is the capital of Jamaica?
 Kingston

-----
who is the lead singer in Aerosmith?
 Steven Tyler

-----
how many NBA titles did Michael Jordan win?
 three

-----
write me a four line poem about computers
 i have a computer i use it all the time i can do anything 

-----


Accuracy of responses:
- The model correctly identified the capital of Jamaica and the lead singer of Aerosmith.
- However, it provided incorrect answers for the FIFA World Cup winner in 1998 (it was France, not Argentina) and the number of NBA titles won by Michael Jordan (he won six, not three).
- The model provides some correct answers but also makes significant errors, particularly in factual questions. This inconsistency suggests it may not be reliable for a RAG application where accuracy is critical.

Quality of generated content:
- The poem generated by the model is simplistic and lacks the expected structure or creativity typically found in poetry.
- The model's ability to generate content is very basic and might not meet the expectations for creative or complex tasks.

> Fit for purpose: NO ❌
>
> Given the errors and simplistic content generation, the FLAN model is not considered fit for the RAG application.
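
The manual accuracy check above could be automated with a simple keyword match. The sketch below is only an illustration under stated assumptions: the function `score_answers` and the accepted keywords are hypothetical, and substring matching is crude (for example, `"6"` would also match `"1996"`), so it would need tightening for longer answers. The answers are copied from the FLAN output above.

```python
def score_answers(answers, expected):
    """Count answers that contain at least one accepted keyword (case-insensitive)."""
    correct = 0
    for question, accepted in expected.items():
        answer = answers.get(question, "").lower()
        if any(keyword in answer for keyword in accepted):
            correct += 1
    return correct, len(expected)


# Accepted keywords per factual question (assumed for illustration).
expected = {
    "who won the FIFA World Cup in the year 1998?": ["france"],
    "what is the capital of Jamaica?": ["kingston"],
    "who is the lead singer in Aerosmith?": ["steven tyler"],
    "how many NBA titles did Michael Jordan win?": ["six", "6"],
}

# FLAN's answers, copied from the output above.
flan_answers = {
    "who won the FIFA World Cup in the year 1998?": "argentina",
    "what is the capital of Jamaica?": "Kingston",
    "who is the lead singer in Aerosmith?": "Steven Tyler",
    "how many NBA titles did Michael Jordan win?": "three",
}

correct, total = score_answers(flan_answers, expected)
print(f"{correct}/{total} correct")
```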

## BLOOM

- https://huggingface.co/bigscience/bloom

BLOOM is an autoregressive Large Language Model (LLM), trained to continue text from a prompt on vast amounts of text data using industrial-scale computational resources.

In [62]:
bloom = HuggingFaceHub(repo_id="bigscience/bloom", model_kwargs={"temperature": 0.5})

llm_test(bloom, questions)

-----
who won the FIFA World Cup in the year 1998?
 who won the FIFA World Cup in the year 1998?

-----
what is the capital of Jamaica?
 what is the capital of Jamaica?"

"Port Royal."

"How far is it from Kingston?"

"About

-----
who is the lead singer in Aerosmith?
 who is the lead singer in Aerosmith?” The response is Steven Tyler.

-----
how many NBA titles did Michael Jordan win?
 how many NBA titles did Michael Jordan win?”
If you want to be able to answer this question, you need to understand what the question

-----
write me a four line poem about computers
 write me a four line poem about computers and I will send you a free copy of my book."
Here's a link to a sample

-----


- The model shows significant issues in processing and responding to queries. The repetition of questions and incomplete answers make it less suitable for tasks where clear and direct responses are needed.

> Fit for purpose: NO ❌
>
> Given the model's current behavior, it doesn't seem fit for a RAG application. It fails to provide reliable, accurate, or even coherent responses, which are essential for such a use case.
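
Because BLOOM is trained to continue text, its completions start by echoing the prompt. The sketch below shows one way to strip that echo before reading an answer. The helper `strip_echo` is hypothetical and only cleans up the echoed prefix; it does not fix the incoherent continuations seen above.

```python
def strip_echo(prompt, completion):
    """Remove a leading copy of the prompt from a completion, if present."""
    text = completion.lstrip()
    if text.lower().startswith(prompt.lower()):
        text = text[len(prompt):]
    # Drop quote characters and whitespace left over around the echo.
    return text.strip(' \n\t"“”')


prompt = "who is the lead singer in Aerosmith?"
completion = " who is the lead singer in Aerosmith?” The response is Steven Tyler."
cleaned = strip_echo(prompt, completion)
print(cleaned)
```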

## Mistral 7B

- https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2

Mistral 7B is a 7-billion-parameter language model released by Mistral AI, designed to combine efficiency with high performance for real-world applications. Its efficiency makes it suitable for real-time applications where quick responses are essential. At the time of its release, Mistral 7B outperformed the best open-source 13B model (Llama 2) on all evaluated benchmarks.

In [77]:
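# Pass the API token through the dedicated parameter rather than as a model kwarg.
mistral = HuggingFaceEndpoint(repo_id="mistralai/Mistral-7B-Instruct-v0.2",
                           temperature=0.5,
                           huggingfacehub_api_token=os.getenv("HUGGINGFACEHUB_API_TOKEN"),
                           model_kwargs={"max_length": 64})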

llm_test(mistral, questions)

The token has not been saved to the git credentials helper. Pass `add_to_git_credential=True` in this function directly or `--add-to-git-credential` if using via `huggingface-cli` if you want to set the git credential as well.
Token is valid (permission: read).
Your token has been saved to C:\Users\masso\.cache\huggingface\token
Login successful
-----
who won the FIFA World Cup in the year 1998?
 

The FIFA World Cup in the year 1998 was won by the French national football team. They defeated Brazil 3–0 in the final match held in Paris, France. This was the second World Cup title for France, and they became the first European team to win the tournament on home soil. The French team was led by star players such as Zinedine Zidane, Didier Deschamps, and Marcel Desailly.

-----
what is the capital of Jamaica?
 

Kingston is the capital city of Jamaica. It is located on the southeastern coast of the island and is the largest city in Jamaica. Kingston is known for its vibrant culture, beaut

Accuracy of responses:
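- The model correctly identifies France as the winner of the 1998 FIFA World Cup and gives the right score (3–0 against Brazil), but some of the added detail is wrong: this was France's first World Cup title, not its second, and France was not the first European team to win on home soil (Italy, England, and West Germany had done so earlier).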
- It correctly names Kingston as the capital of Jamaica and provides additional relevant information.
- Steven Tyler is correctly identified as the lead singer of Aerosmith, with accurate biographical details.
- The model accurately states that Michael Jordan won six NBA championships, providing context around his career.

Quality of generated content:
- The poem about computers is creative, well-structured, and reflects a poetic quality that was missing in the other models.

> Fit for purpose: YES ✅
>
> Based on these results, the Mistral 7B model appears to be well-suited for a RAG application. It answers every factual question correctly and generates quality content, making it a strong candidate for our needs, although the extra detail it volunteers should be verified.


<br>
<br>

# Azure OpenAI

Microsoft Azure, often referred to simply as Azure, is Microsoft's cloud computing platform. It offers access to, management of, and development of applications and services through global data centers. It provides a range of capabilities, including software as a service (SaaS), platform as a service (PaaS), and infrastructure as a service (IaaS). Azure supports many programming languages, tools, and frameworks, including Microsoft-specific and third-party software and systems.

Azure OpenAI is an Azure service that provides OpenAI's language models, including the GPT, Codex, and Embeddings model series, for content generation, summarization, semantic search, and natural language to code translation.

To access Azure OpenAI models, you'll need to:
- Create an Azure account
- Create a deployment of an Azure OpenAI model
- Get the name and endpoint for your deployment
- Create two environment variables: AZURE_OPENAI_ENDPOINT and AZURE_OPENAI_API_KEY

## GPT-4o

- https://platform.openai.com/docs/models/gpt-4o

At the time of writing, GPT-4o is OpenAI's most advanced model. It is multimodal (accepting text or image inputs and outputting text) and has the same high intelligence as GPT-4 Turbo, but it is much more efficient: it generates text 2x faster and is 50% cheaper.

In [None]:
pip install langchain-openai==0.1.22

In [10]:
from langchain_openai import AzureChatOpenAI

In [12]:
gpt = AzureChatOpenAI(
    deployment_name="gpt-4o",
    api_version="2023-06-01-preview"
)

llm_test(gpt, questions)

-----
who won the FIFA World Cup in the year 1998?
 The FIFA World Cup in 1998 was won by France. The tournament was held in France, and the French national team defeated Brazil 3-0 in the final match held on July 12, 1998, at the Stade de France in Saint-Denis. This victory marked France's first World Cup title.

-----
what is the capital of Jamaica?
 The capital of Jamaica is Kingston.

-----
who is the lead singer in Aerosmith?
 The lead singer of Aerosmith is Steven Tyler. He is known for his distinctive voice, energetic stage presence, and flamboyant style. Tyler has been with the band since its formation in 1970 and has played a significant role in its success over the decades.

-----
how many NBA titles did Michael Jordan win?
 Michael Jordan won a total of six NBA titles during his career. He achieved this with the Chicago Bulls, securing championships in the years 1991, 1992, 1993, 1996, 1997, and 1998.

-----
write me a four line poem about computers
 In circuits vast, where 

Accuracy of responses:
- The model correctly identifies France as the winner of the 1998 FIFA World Cup, providing precise details about the event.
- It accurately states that Kingston is the capital of Jamaica, delivering a concise response.
- Steven Tyler is correctly identified as the lead singer of Aerosmith, with additional relevant context about his career.
- The model correctly states that Michael Jordan won six NBA titles and provides the specific years in which these victories occurred.

Quality of generated content:
- The poem generated by the model is well-crafted, with a good sense of rhythm and imagery. It reflects creativity and an understanding of poetic structure.

> Fit for purpose: YES ✅
>
> Based on these results, the GPT-4o model appears to be very well-suited for a RAG application. It consistently provides accurate and detailed information, along with high-quality generative text.

<br>
<br>

# Google AI

To use Google AI:
  * Create an API key here: https://aistudio.google.com/app/apikey
  * Create an environment variable called: GOOGLE_API_KEY

In [None]:
pip install langchain-google-genai==1.0.6

In [36]:
from langchain_google_genai import GoogleGenerativeAI, ChatGoogleGenerativeAI

## Bison

- https://console.cloud.google.com/vertex-ai/publishers/google/model-garden/text-bison

text-bison is a PaLM 2 large language model that understands and generates language. It is a foundation model that performs well at a variety of natural language tasks such as sentiment analysis, entity extraction, and content creation. The content that text-bison can create includes document summaries, answers to questions, and labels that classify content.

In [81]:
bison = GoogleGenerativeAI(model="models/text-bison-001", 
                            temperature=0.5,
                            google_api_key=os.getenv("GOOGLE_API_KEY"))

llm_test(bison, questions)

-----
who won the FIFA World Cup in the year 1998?
 France

-----
what is the capital of Jamaica?
 Kingston

-----
who is the lead singer in Aerosmith?
 Steven Tyler

-----
how many NBA titles did Michael Jordan win?
 6

-----
write me a four line poem about computers
 **Computers**

Tools of the modern age,
Bringing information to the masses,
A constant companion,
A window to the world.

-----


Accuracy of responses:
- The model provides correct and concise answers to factual questions.

Quality of generated content:
- The poem is simple and direct. It lacks the creative flair or complexity seen in the poems generated by the Mistral 7B and GPT-4o models.

> Fit for purpose: NO ❌
>
> The model delivers accurate answers, but its responses are very brief and its generated content is plain. Brevity can be an advantage where concise responses are preferred, but it is limiting for our goal of detailed, engaging outputs.

## Gemini Pro

- https://ai.google.dev/gemini-api/docs

Gemini is a family of generative AI models that lets developers generate content and solve problems. These models are designed and trained to handle both text and images as input. Gemini 1.0 Pro is optimized for natural language tasks, multi-turn text and code chat, and code generation; the test below uses the newer `gemini-1.5-pro` model.

In [75]:
gemini = ChatGoogleGenerativeAI(model="gemini-1.5-pro", 
                            temperature=0.5,
                            google_api_key=os.getenv("GOOGLE_API_KEY"))

llm_test(gemini, questions)

-----
who won the FIFA World Cup in the year 1998?
 France won the FIFA World Cup in 1998. 


-----
what is the capital of Jamaica?
 The capital of Jamaica is **Kingston**. 


-----
who is the lead singer in Aerosmith?
 The lead singer of Aerosmith is **Steven Tyler**. 


-----
how many NBA titles did Michael Jordan win?
 Michael Jordan won **6** NBA titles, all with the Chicago Bulls. 


-----
write me a four line poem about computers
 Silicon heart, a screen's soft glow,
A universe of knowledge, fast or slow.
From bits and bytes, worlds come alive,
With every click, a new you can thrive. 


-----


Accuracy of responses:
- The model delivers accurate responses with a slight emphasis on key details, like mentioning that Michael Jordan won all six titles with the Chicago Bulls. This extra layer of context enhances the quality of its factual responses.

Quality of generated content:
- The poem is creative and well-crafted.

> Fit for purpose: YES ✅
>
> The Gemini Pro model appears to be highly suitable for a RAG application, offering a good balance between accuracy, detail, and creative generative capabilities.
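
The poem prompt asks for exactly four lines, which is one part of the quality judgment that can be checked mechanically. The sketch below counts the non-empty lines in a generated poem. The helper `count_poem_lines` is hypothetical; the two poems are copied from the FLAN and Gemini outputs above, on the assumption that the line breaks shown there match the raw model output.

```python
def count_poem_lines(text):
    """Count the non-empty lines in a generated poem."""
    return sum(1 for line in text.splitlines() if line.strip())


flan_poem = "i have a computer i use it all the time i can do anything "
gemini_poem = (
    "Silicon heart, a screen's soft glow,\n"
    "A universe of knowledge, fast or slow.\n"
    "From bits and bytes, worlds come alive,\n"
    "With every click, a new you can thrive. \n"
)

flan_lines = count_poem_lines(flan_poem)
gemini_lines = count_poem_lines(gemini_poem)
print(flan_lines, gemini_lines)
```

A line count says nothing about rhythm or imagery, so it would complement the manual review rather than replace it.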