# <center>Generative AI: From History to Practical Applications</center>
<div style="text-align: right">INFO 7390 Advances Data Sciences and Architecture SEC 03 Spring 2024</div>
<div style="text-align: right">Creating Data using Generative AI</div>
<div style="text-align: right">Aditi A. Deodhar, NUID: 002279575</div>

![image.png](attachment:image.png)

The roots of **Generative AI** can be traced back to the early days of artificial neural networks and computational creativity. In the 1950s and 1960s, pioneers like Alan Turing and Christopher Strachey laid the groundwork for computer-generated art and music, showcasing the potential of machines to produce creative outputs.

However, it wasn't until the late 20th century that significant strides were made in generative modeling techniques. The emergence of Markov models, _Hidden Markov Models (HMMs)_, and _Probabilistic Context-Free Grammars (PCFGs)_ paved the way for probabilistic approaches to data generation and text synthesis.

The advent of _Generative Adversarial Networks (GANs)_ in 2014 marked a watershed moment in the field of Generative AI. Proposed by Ian Goodfellow and his colleagues, GANs introduced a novel framework for training generative models by pitting two neural networks, the Generator and the Discriminator, against each other in a game-like scenario. This adversarial training approach revolutionized data generation by enabling the creation of high-fidelity, realistic outputs across various domains, including images, text, and audio.

In parallel, the rise of _recurrent neural networks (RNNs)_, particularly variants like _Long Short-Term Memory (LSTM)_ networks, fueled advancements in sequence generation tasks, such as text generation, language translation, and dialogue systems. These models demonstrated the capacity to capture complex sequential dependencies and generate coherent, contextually relevant outputs.

The past decade has witnessed a proliferation of _large-scale language models (LLMs)_, exemplified by OpenAI's GPT series, which leverage transformer architectures to process and generate natural language text at scale. These models have pushed the boundaries of language understanding and generation, enabling applications ranging from content creation and language translation to question answering and dialogue systems.

As we delve into the realm of Generative AI in this Jupyter Notebook, we embark on a journey through the rich tapestry of techniques, models, and applications that define this burgeoning field. From the foundational concepts to the latest advancements, we explore the intricate interplay between creativity, technology, and human ingenuity, shaping the future of AI-driven content generation and beyond. Welcome to the world of Generative AI – where imagination knows no bounds.

![image.png](attachment:image.png)

### What is Generative AI?

Generative AI encompasses a class of artificial intelligence algorithms designed to create new, synthetic data that closely resembles existing datasets. Unlike traditional AI, which focuses on solving specific tasks such as classification or prediction, generative AI aims to learn the underlying patterns and structure of the data it is trained on. It then draws on statistical models, neural networks, or other mathematical frameworks to generate new data points that follow those patterns.

### Relevance of Data Generation in Data Science

Data generation is crucial in data science as it addresses challenges related to data scarcity and enhances the diversity of available datasets. This is particularly important for training machine learning models effectively, ensuring they generalize well to new, unseen data. <br>

Generative AI techniques such as Generative Adversarial Networks (GANs) and Variational Autoencoders (VAEs) operate on principles rooted in probability theory and neural network architecture. For instance, GANs consist of a generator and a discriminator network engaged in an adversarial training process, while VAEs leverage probabilistic models for encoding and decoding data.

Let's look at GANs in more detail:

![image.png](attachment:image.png)

Imagine you're at a party with two friends: an artist and a detective. The artist's job is to create fake paintings that look just like real ones, while the detective's job is to figure out which paintings are fake and which are real.

**The Artist**: This friend is super talented and can paint anything. But they're not interested in just any painting; they want to make copies of famous artworks. So, they start by painting something random and show it to the detective.

**The Detective**: This friend has a keen eye for detail. They carefully examine the painting and try to guess if it's real or fake. If they think it's fake, they tell the artist what gave it away.

Here's where it gets interesting:

 - Competition Begins: The artist tries again, this time making small changes based on the detective's feedback. They show the new painting to the detective, who tries to guess if it's real or fake again.

 - Learning from Mistakes: Every time the detective catches a fake painting, the artist learns from their mistakes and tries to make the next one even better. It's like a game of cat and mouse, with the artist always trying to outsmart the detective.

 - Getting Better Together: Over time, the artist becomes so good at copying famous paintings that the detective can't tell the difference anymore. They're both amazed at how realistic the fake paintings look!

 - Perfecting the Art: Eventually, the artist becomes a master at creating fake paintings that are almost indistinguishable from the real ones. The detective is left scratching their head, unable to spot the fakes anymore.

And that's basically how a Generative Adversarial Network (GAN) works! It's like a friendly competition between an artist and a detective, where they both push each other to be better at their jobs.
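To connect the analogy to the math, here is a minimal sketch of the two losses that the artist (generator) and detective (discriminator) try to minimize. The scores below are made-up illustrative numbers, not outputs of a trained network, and the generator loss shown is the commonly used non-saturating variant.

```python
import math

def discriminator_loss(d_real, d_fake):
    """Binary cross-entropy for the detective: real should score 1, fake 0."""
    real_term = -sum(math.log(p) for p in d_real) / len(d_real)
    fake_term = -sum(math.log(1.0 - p) for p in d_fake) / len(d_fake)
    return real_term + fake_term

def generator_loss(d_fake):
    """Non-saturating loss for the artist: it wants fakes to be scored as real."""
    return -sum(math.log(p) for p in d_fake) / len(d_fake)

# Hypothetical detective scores: the probability that a painting is real.
early_real, early_fake = [0.9, 0.8], [0.1, 0.2]  # fakes are easy to spot
late_real, late_fake = [0.6, 0.5], [0.5, 0.4]    # fakes are nearly convincing

early_d = discriminator_loss(early_real, early_fake)
early_g = generator_loss(early_fake)
late_d = discriminator_loss(late_real, late_fake)
late_g = generator_loss(late_fake)

print(f"early training: D loss = {early_d:.3f}, G loss = {early_g:.3f}")
print(f"late training:  D loss = {late_d:.3f}, G loss = {late_g:.3f}")
```

As the fakes improve, the generator's loss falls while the discriminator's loss rises, which is the "cat and mouse" dynamic described above.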

Now, let's fast forward to the present day, where we find a different kind of magic happening in the world of AI.


Imagine if our artist and detective were replaced by two super-powered wizards who could understand and create language like never before. That's where ***transformers*** come in! Instead of relying on traditional methods like remembering past words or looking at small parts of a sentence, transformers use a special spell called ***attention***. This attention spell allows them to understand the entire context of a sentence and draw connections between words, making them incredibly powerful for tasks like translating languages, summarizing text, and even generating new stories!

![image.png](attachment:image.png)

***Transformers*** are a type of neural network architecture introduced in the paper "Attention is All You Need" by Vaswani et al. in 2017. Unlike traditional recurrent or convolutional architectures, transformers rely solely on an attention mechanism to draw global dependencies between input and output data. This architecture has proven highly effective for various natural language processing tasks, including machine translation, text summarization, and text generation.
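The core of the attention mechanism can be sketched in a few lines of plain Python. This is a simplified illustration of scaled dot-product attention: the three 2-dimensional token vectors are made up, and the learned query, key, and value projections that a real transformer applies are omitted.

```python
import math

def softmax(xs):
    """Turn raw scores into weights that sum to 1."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attention(queries, keys, values):
    """Scaled dot-product attention over lists of vectors."""
    d_k = len(keys[0])
    outputs, all_weights = [], []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in keys]
        weights = softmax(scores)
        # Output is the attention-weighted average of the value vectors.
        out = [sum(w * v[j] for w, v in zip(weights, values))
               for j in range(len(values[0]))]
        outputs.append(out)
        all_weights.append(weights)
    return outputs, all_weights

# Hypothetical token vectors; self-attention uses them as Q, K, and V.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = attention(tokens, tokens, tokens)
for row in weights:
    print([round(w, 3) for w in row])
```

Each printed row shows how much one token attends to every token in the sequence, including itself, which is how the model draws global dependencies in a single step.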

Now, imagine those super-powered wizards joining forces to create something truly extraordinary – ***large language models*** (LLMs). These are like transformers on steroids, with millions or even billions of parameters that allow them to learn from massive amounts of text data. Take, for example, OpenAI's GPT series – they're like the wise sages of the language world, pre-trained on vast libraries of books, articles, and conversations. When you ask them a question or give them a prompt, they can weave together words and phrases to form human-like responses, answer complex questions, or even engage in meaningful conversation. It's like having a wise mentor who knows everything about language and can help you with anything you need!

***Large language models***, such as OpenAI's GPT series, are transformers scaled to very large numbers of parameters, enabling them to learn complex patterns and generate coherent text. LLMs are typically pre-trained on large corpora of text data and fine-tuned for specific downstream tasks. They have demonstrated remarkable capabilities in generating human-like text, answering questions, and even engaging in conversation.

![image-2.png](attachment:image-2.png)

### LLMs Fundamentals:

1. **Pre-training**: Imagine a model learning the basics of language by reading lots of books without any specific task in mind. It's like learning the alphabet and basic grammar rules before diving into specific topics like writing stories or answering questions.

2. **Fine-tuning**: After learning the basics, the model gets some extra training on specific tasks, like writing essays or answering trivia questions. This fine-tuning helps it become better at these specific tasks while still using what it learned during the initial reading.

3. **Attention Mechanism**: Think of attention as focusing on important words in a sentence. Just like you pay more attention to key words when reading, this mechanism helps the model focus on important parts of the text when generating responses.

4. **Tokenization**: Tokenization breaks text into smaller pieces called tokens, which may be whole words or parts of words. For example, "cats are cute" might become "cats", "are", and "cute", or, with a subword tokenizer, "cat", "s", "are", and "cute".

5. **Positional Encoding**: A transformer processes all the tokens of a sequence in parallel, so on its own it has no notion of word order. Positional encoding adds position information to each token. It's like putting numbers on each word to remember where they belong in the sentence.

6. **Beam Search**: When the model generates text, beam search keeps several of the most promising partial sentences at each step instead of committing to the single most likely next word. It's like exploring different paths to find the best way to complete a sentence, like a choose-your-own-adventure story.

7. **Evaluation Metrics**: To see how well the model is doing, we use metrics like perplexity and BLEU. Perplexity measures how "surprised" the model is by the actual next words (lower is better), and BLEU measures how much the model's output overlaps, in words and short phrases, with human-written reference text.
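To make beam search concrete, here is a toy sketch. The next-token probability table is invented for illustration and stands in for a real language model; `beam_width` controls how many partial sentences are kept at each step.

```python
import math

# Hypothetical next-token probabilities standing in for a language model.
NEXT_TOKEN_PROBS = {
    "<s>": {"the": 0.6, "a": 0.4},
    "the": {"cat": 0.55, "dog": 0.45},
    "a": {"cat": 0.9, "dog": 0.1},
    "cat": {"</s>": 1.0},
    "dog": {"</s>": 1.0},
}

def beam_search(beam_width, max_steps=3):
    """Return the best sequences found, as (tokens, log_probability) pairs."""
    beams = [(["<s>"], 0.0)]
    for _ in range(max_steps):
        candidates = []
        for tokens, score in beams:
            last = tokens[-1]
            if last == "</s>":
                # Finished sequences are carried forward unchanged.
                candidates.append((tokens, score))
                continue
            for tok, p in NEXT_TOKEN_PROBS[last].items():
                candidates.append((tokens + [tok], score + math.log(p)))
        # Keep only the highest-scoring partial sequences.
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = candidates[:beam_width]
    return beams

greedy_tokens, greedy_score = beam_search(beam_width=1)[0]
beam_tokens, beam_score = beam_search(beam_width=2)[0]
print("greedy:", " ".join(greedy_tokens), round(math.exp(greedy_score), 2))
print("beam-2:", " ".join(beam_tokens), round(math.exp(beam_score), 2))
```

With a beam width of 1 the search is greedy and commits to "the" because it is the likeliest first word. With a width of 2 it also keeps "a" alive and discovers that "a cat" is the more probable sentence overall.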

These concepts help us understand how Large Language Models learn and generate text, making it easier for them to communicate with us in a way that feels natural and human-like.
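As a small worked example of one of these metrics, perplexity can be computed from the probabilities a model assigned to the tokens that actually occurred. The probability lists below are made up for illustration.

```python
import math

def perplexity(token_probs):
    """Exponential of the average negative log-probability per token."""
    avg_nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(avg_nll)

# Hypothetical probabilities assigned to each correct next token.
confident = perplexity([0.9, 0.8, 0.95])
unsure = perplexity([0.25, 0.25, 0.25])

print(f"confident model: {confident:.2f}")
print(f"unsure model:    {unsure:.2f}")
```

A model that assigns probability 0.25 to every correct token has a perplexity of 4, as if it were choosing uniformly among four options at each step. Lower values mean the model is less "confused".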

### Training LLMs
Training LLMs is a complex process that involves instructing the model to comprehend and produce human-like text. Here's a simplified breakdown of how LLM training works:

1. **Providing Input Text:** LLMs are initially exposed to extensive text data drawn from sources such as books, articles, and websites. The model's task during training is to predict the next word or token in a sequence from the preceding context. In doing so, it learns patterns and relationships within the text data.
2. **Optimizing Model Weights:** The model's parameters are numerical weights that determine how strongly each learned feature influences a prediction. Throughout training, these weights are adjusted to minimize the prediction error. The objective is to enhance the model's accuracy in predicting the next word.
3. **Iteratively Refining Parameter Values:** LLMs continuously adjust parameter values based on the error feedback received from their predictions. By repeating this adjustment over many examples, the model refines its grasp of language and becomes more accurate at predicting subsequent tokens.

The training process may vary depending on the specific type of LLM being developed, such as those optimized for continuous text or dialogue.
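The steps above can be sketched with a deliberately tiny model. This is not how a real LLM is built: the "model" here is just a table of logits indexed by the previous word (a bigram model), and the corpus and learning rate are invented for illustration. The training loop, however, follows the same recipe: predict the next token, measure the error, and nudge the weights.

```python
import math

corpus = "the cat sat on the mat the cat ate".split()
vocab = sorted(set(corpus))
index = {tok: i for i, tok in enumerate(vocab)}
# Training examples: (context token id, next token id).
pairs = [(index[a], index[b]) for a, b in zip(corpus, corpus[1:])]

# The model's weights: one row of next-token logits per context token.
logits = [[0.0] * len(vocab) for _ in vocab]

def softmax(row):
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def average_loss():
    """Mean cross-entropy of the true next token over the corpus."""
    return -sum(math.log(softmax(logits[c])[t]) for c, t in pairs) / len(pairs)

initial_loss = average_loss()
learning_rate = 0.5  # assumed value for this toy example
for epoch in range(200):
    for context, target in pairs:
        probs = softmax(logits[context])
        for j in range(len(vocab)):
            # Gradient of cross-entropy w.r.t. logit j is (prob - one_hot).
            grad = probs[j] - (1.0 if j == target else 0.0)
            logits[context][j] -= learning_rate * grad
final_loss = average_loss()

row = logits[index["sat"]]
predicted_after_sat = vocab[row.index(max(row))]
print(f"loss: {initial_loss:.3f} -> {final_loss:.3f}")
print("most likely word after 'sat':", predicted_after_sat)
```

The loss does not reach zero because some contexts are genuinely ambiguous in the corpus ("the" is followed by both "cat" and "mat"), which mirrors why real models output probabilities rather than single answers.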

LLM performance is heavily influenced by two key factors:

1. **Model Architecture:** The design and intricacy of the LLM architecture impact its ability to capture language nuances.
2. **Dataset:** The quality and diversity of the dataset utilized for training are crucial in shaping the model's language understanding.

After the initial training, LLMs can be easily customized for various tasks using relatively small sets of supervised data, a procedure referred to as fine-tuning.

There are three prevalent ways of adapting a base model to a task:

1. **Zero-shot learning:** The base LLM can handle a wide range of requests from a prompt alone, without any task-specific training, though the accuracy of its responses may vary.
2. **Few-shot learning:** Providing a small number of pertinent examples, typically within the prompt itself, significantly improves the base model's performance in a specific domain.
3. **Domain Adaptation:** This goes a step beyond few-shot learning: practitioners further train a base model so that its parameters adjust to additional data relevant to the particular application or domain.
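The difference between the first two approaches is easy to see in the prompt itself. The sketch below assumes few-shot examples are supplied in the prompt (in-context learning); the helper name `build_prompt`, the task wording, and the reviews are all hypothetical.

```python
def build_prompt(query, examples=()):
    """Build a sentiment prompt; passing examples makes it few-shot."""
    lines = ["Classify the sentiment of each review as positive or negative.", ""]
    for text, label in examples:
        lines += [f"Review: {text}", f"Sentiment: {label}", ""]
    lines += [f"Review: {query}", "Sentiment:"]
    return "\n".join(lines)

query = "The battery died after a day."

# Zero-shot: instruction and query only.
zero_shot = build_prompt(query)

# Few-shot: the same prompt preceded by a couple of labeled examples.
few_shot = build_prompt(
    query,
    examples=[
        ("Arrived quickly and works perfectly.", "positive"),
        ("The screen cracked within a week.", "negative"),
    ],
)

print(zero_shot)
print("-" * 40)
print(few_shot)
```

Domain adaptation differs from both in that it changes the model's weights through further training rather than changing only the prompt.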

### Popular LLM use-cases

![image-2.png](attachment:image-2.png)

***1. Content Generation:***
LLMs excel in content generation by understanding context and generating coherent and contextually relevant text. They can be employed to automatically generate creative content for marketing, social media posts, and other communication materials, ensuring a high level of quality and relevance. Example applications: marketing platforms, social media management tools, content creation platforms, and advertising agencies.

***2. Language Translation:***
LLMs can significantly improve language translation tasks by understanding the nuances of different languages. They can provide accurate and context-aware translations, making them valuable tools for businesses operating in multilingual environments. This can enhance global communication and outreach. Example applications: translation services, global communication platforms, and international business applications.

***3. Text Summarization:***
LLMs are adept at summarizing lengthy documents by identifying key information and maintaining the core message. This capability is valuable for content creators, researchers, and businesses looking to quickly extract essential insights from large volumes of text, improving efficiency in information consumption. Example applications: research tools, news aggregators, and content curation platforms.

***4. Question Answering and Chatbots:***
LLMs can be employed for question answering tasks, where they comprehend the context of a question and generate relevant and accurate responses. They also enable chatbots and virtual assistants to engage in more natural, context-aware conversations with users. Example applications: customer support systems, chatbots, virtual assistants, and educational platforms.

***5. Content Moderation:***
LLMs can be utilized for content moderation by analyzing text and identifying potentially inappropriate or harmful content. This helps in maintaining a safe and respectful online environment by automatically flagging or filtering out content that violates guidelines, ensuring user safety. Example applications: social media platforms, online forums, and community management tools.

***6. Information Retrieval:***
LLMs can enhance information retrieval systems by understanding user queries and retrieving relevant information from large datasets. This is particularly useful in search engines, databases, and knowledge management systems, where LLMs can improve the accuracy of search results. Example applications: search engines, database systems, and knowledge management platforms.

***7. Educational Tools:***
LLMs contribute to educational tools by providing natural language interfaces for learning platforms. They can assist students in generating summaries, answering questions, and engaging in interactive learning conversations. This facilitates personalized and efficient learning experiences. Example applications: e-learning platforms, educational chatbots, and interactive learning applications.

### Ethical Concerns surrounding Gen AI:

![image.png](attachment:image.png)

1. ***Bias and Fairness:*** LLMs trained on biased datasets can perpetuate and amplify societal biases, affecting outcomes in critical areas like hiring and legal contexts. Microsoft's chatbot Tay, which began posting racist and sexist messages after learning from its interactions with users, highlights the consequences of unchecked bias in AI systems.

2. ***Misinformation and Disinformation:*** LLMs have the potential to generate convincing fake news and propaganda, raising concerns about their role in spreading disinformation and undermining trust in credible sources of information.

3. ***Dependency and Deskilling:*** Overreliance on LLMs may lead to a decline in human skills and critical thinking abilities, as individuals become dependent on AI-generated solutions without understanding the underlying rationale.

4. ***Privacy and Security Threats:*** LLMs pose significant privacy and security risks, including the potential for leaking sensitive information, profiling individuals, and facilitating cyberattacks through the generation of malicious content.

5. ***Lack of Accountability:*** The difficulty in assigning responsibility for AI-generated content complicates legal and ethical matters, leading to challenges in legal proceedings and concerns about the responsible use of AI.

6. ***Filter Bubbles and Echo Chambers:*** LLMs contribute to filter bubbles and echo chambers by reinforcing users' existing beliefs and limiting exposure to diverse perspectives, which can hinder healthy public discourse and shared understanding in society.

### What is Prompt Engineering? Why is it important?

![image.png](attachment:image.png)

Large language models are trained through self-supervised learning (often loosely called unsupervised learning) on vast amounts of diverse text data. During training, the model learns to predict the next word in a sentence based on the context provided by the preceding words. This process allows the model to capture grammar, facts, some reasoning ability, and even some aspects of common sense.

**Prompting** is a crucial aspect of using these models effectively. Here's why prompting LLMs the right way is essential:

***1. Contextual Understanding:*** LLMs are trained to understand context and generate responses based on the patterns learned from diverse text data. When you provide a prompt, it's crucial to structure it in a way that aligns with the context the model is familiar with. This helps the model make relevant associations and produce coherent responses.<br>
***2. Training Data Patterns:*** During training, the model learns from a wide range of text, capturing the linguistic nuances and patterns present in the data. Effective prompts leverage this training by incorporating similar language and structures that the model has encountered in its training data. This enables the model to generate responses that are consistent with its learned patterns.<br>
***3. Transfer Learning:*** LLMs utilize transfer learning. The knowledge gained during training on diverse datasets is transferred to the task at hand when prompted. A well-crafted prompt acts as a bridge, connecting the general knowledge acquired during training to the specific information or action desired by the user.<br>
***4. Contextual Prompts for Contextual Responses***: By using prompts that resemble the language and context the model was trained on, users tap into the model's ability to understand and generate content within similar contexts. This leads to more accurate and contextually appropriate responses.<br>
***5. Mitigating Bias:*** The model may inherit biases present in its training data. Thoughtful prompts can help mitigate bias by providing additional context or framing questions in a way that encourages unbiased responses. This is crucial for aligning model outputs with ethical standards.<br>

### Basics of Prompting 

The basic principles of prompting involve the inclusion of specific elements tailored to the task at hand. These elements include:

**Instruction:** Clearly specify the task or action you want the model to perform. This sets the context for the model's response and guides its behavior.<br>
**Context:** Provide external information or additional context that helps the model better understand the task and generate more accurate responses. Context can be crucial in steering the model towards the desired outcome.<br>
**Input Data:** Include the input or question for which you seek a response. This is the information on which you want the model to act or provide insights.<br>
**Output Indicator:** Define the type or format of the desired output. This guides the model in presenting the information in a way that aligns with your expectations.<br>

Here's an example prompt for a text classification task:

Prompt:
```
Classify the text into neutral, negative, or positive
Text: I think the food was okay.
Sentiment:
```
In this example:

 - Instruction: "Classify the text into neutral, negative, or positive"
 - Input Data: "I think the food was okay."
 - Output Indicator: "Sentiment:"

Note that this example doesn't explicitly use context, but context can also be incorporated into the prompt to provide additional information that aids the model in understanding the task better.

It's important to highlight that not all four elements are always necessary for a prompt, and the format can vary based on the specific task. The key is to structure prompts in a way that effectively communicates the user's intent and guides the model to produce relevant and accurate responses.
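As an illustration, the four elements can be assembled programmatically. The helper `assemble_prompt` below is a hypothetical convenience function, not part of any library, and the context string is invented. It reproduces the classification prompt shown above when no context is given.

```python
def assemble_prompt(instruction, input_data, output_indicator, context=None):
    """Join the prompt elements in order, skipping context if absent."""
    parts = [instruction]
    if context:
        parts.append(f"Context: {context}")
    parts.append(f"Text: {input_data}")
    parts.append(output_indicator)
    return "\n".join(parts)

basic_prompt = assemble_prompt(
    instruction="Classify the text into neutral, negative, or positive",
    input_data="I think the food was okay.",
    output_indicator="Sentiment:",
)

# Same task, with an optional (made-up) piece of context added.
prompt_with_context = assemble_prompt(
    instruction="Classify the text into neutral, negative, or positive",
    input_data="I think the food was okay.",
    output_indicator="Sentiment:",
    context="The text is a customer review of a restaurant.",
)

print(basic_prompt)
print("-" * 40)
print(prompt_with_context)
```

Keeping the elements as separate arguments makes it easy to vary one of them, for example the context, while holding the rest of the prompt fixed.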

### What is Fine-Tuning? Why do we need it?

While large language models are indeed trained on a diverse set of tasks, the need for fine-tuning arises because these large generic models are designed to perform reasonably well across various applications, but not necessarily excel in a specific task. The optimization of generic models is aimed at achieving decent performance across a range of tasks, making them versatile but not specialized.

**Fine-tuning** becomes essential to ensure that a model attains exceptional proficiency in a particular task or domain of interest. The emphasis shifts from achieving general competence to achieving mastery in a specific application. This is particularly crucial when the model is intended for a focused use case, and overall general performance is not the primary concern.

In essence, generic large language models can be considered as being proficient in multiple tasks but not reaching the level of mastery in any. Fine-tuned models, on the other hand, undergo a tailored optimization process to become masters of a specific task or domain. Therefore, the decision to fine-tune models is driven by the necessity to achieve superior performance in targeted applications, making them highly effective specialists in their designated areas.

Fine-tuning a model for tasks in a new domain is valuable for several reasons:

***1. Domain-Specific Adaptation:*** Pre-trained LLMs may not be optimized for specific tasks or domains. Fine-tuning allows adaptation to the nuances and characteristics of a new domain, enhancing performance in domain-specific tasks. For instance, large generic LLMs might not be sufficiently trained on tasks like document analysis in the legal domain. Fine-tuning can allow the model to understand legal terminology and nuances for tasks like contract review.<br>
***2. Shifts in Data Distribution:*** Models trained on one dataset may not generalize well to out-of-distribution examples. Fine-tuning helps align the model with the distribution of new data, addressing shifts in data characteristics and improving performance on specific tasks. For example: Fine-tuning a sentiment analysis model for social media comments. The distribution of language and sentiments on social media may differ significantly from the original training data, requiring adaptation for accurate sentiment classification.<br>
***3. Cost and Resource Efficiency:*** Training a model from scratch on a new task often requires a large labeled dataset, which can be costly and time-consuming. Fine-tuning allows leveraging a pre-trained model's knowledge and adapting it to the new task with a smaller dataset, making the process more efficient. For example: Adapting a pre-trained model for a small e-commerce platform to recommend products based on user preferences. Fine-tuning is more resource-efficient than training a model from scratch with a limited dataset.<br>
***4. Out-of-Distribution Data Handling:*** Fine-tuning mitigates the suboptimal performance of pre-trained models when dealing with out-of-distribution examples. Instead of starting training anew, fine-tuning allows building upon the existing model's foundation with a relatively modest dataset. For example: Fine-tuning a speech recognition model for a new regional accent. The model can be adapted to recognize speech patterns specific to the new accent without extensive retraining.<br>
***5. Knowledge Transfer:*** Pre-trained models capture general patterns and knowledge from vast amounts of data during pre-training. Fine-tuning facilitates the transfer of this general knowledge to specific tasks, making it a valuable tool for leveraging pre-existing knowledge in new applications. For example: Transferring medical knowledge from a pre-trained model to a new healthcare chatbot. Fine-tuning with medical literature enables the model to provide accurate and contextually relevant responses in healthcare conversations.<br>
***6. Task-Specific Optimization:*** Fine-tuning enables the optimization of model parameters for task-specific objectives, just as fine-tuning an LLM on medical literature can enhance its performance in medical applications. For example: Optimizing a pre-trained model for code generation in a software development environment. Fine-tuning with code examples allows the model to better understand and generate code snippets.<br>
***7. Adaptation to User Preferences:*** Fine-tuning allows adapting the model to user preferences and specific task requirements. It enables the model to generate more contextually relevant and task-specific responses. For example: Fine-tuning a virtual assistant model to align with user preferences in language and tone. This ensures that the assistant generates responses that match the user's communication style.<br>
***8. Continual Learning:*** Fine-tuning supports continual learning by allowing models to adapt to evolving data and user requirements over time. It enables models to stay relevant and effective in dynamic environments. For instance: Continually updating a news summarization model to adapt to evolving news topics and user preferences. Fine-tuning enables the model to stay relevant and provide timely summaries.<br>

Fine-tuning methods for language models can be broadly categorized into two main approaches: supervised and unsupervised. Here's a summary of each type:

**Unsupervised Fine-Tuning Methods:**
1. **Unsupervised Full Fine-Tuning:** This method involves updating the knowledge base of a language model without modifying its existing behavior. It utilizes unstructured datasets relevant to the target domain, enabling the model to refine its understanding without labeled examples.
   
2. **Contrastive Learning:** Contrastive learning focuses on training the model to discern between similar and dissimilar examples in the latent space. By encouraging the model to distinguish subtle nuances and patterns, contrastive learning enhances its ability to capture complex relationships in the data.

**Supervised Fine-Tuning Methods:**
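1. **Parameter-Efficient Fine-Tuning (PEFT):** PEFT aims to reduce computational expense by updating only a small subset of parameters, or a small number of newly added ones, while keeping most of the pre-trained weights frozen. LoRA, for example, learns a pair of low-rank matrices whose product is added to a frozen weight matrix. This approach minimizes resource requirements while adapting the model to specific tasks or domains.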

2. **Supervised Full Fine-Tuning:** In contrast to PEFT, supervised full fine-tuning involves updating all parameters of the language model during training. While resource-intensive, this comprehensive approach ensures thorough adaptation to the target task or domain.

3. **Instruction Fine-Tuning:** Instruction fine-tuning augments input-output examples with explicit instructions, guiding the model to generalize effectively to new tasks. By providing clear task-specific instructions, this method enhances the model's adaptability and performance.

4. **Reinforcement Learning from Human Feedback (RLHF):** RLHF incorporates human evaluators' feedback to guide the model's training process. By optimizing model parameters based on human preferences, RLHF aligns the model's behavior with desirable outcomes.

Each fine-tuning method offers unique advantages and is suited to different scenarios based on factors such as computational resources, task complexity, and the availability of labeled data. By understanding and leveraging these methods, practitioners can effectively adapt language models to diverse applications and domains.
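To give a feel for why parameter-efficient methods save resources, here is a small sketch of the low-rank idea behind LoRA, one popular PEFT technique. The sizes `d` and `r` and the matrix values are arbitrary illustrative choices; a real implementation would use a tensor library and train `A` and `B` by gradient descent.

```python
def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

d, r = 64, 2  # hypothetical layer width and adapter rank

# Frozen pre-trained weight matrix (identity here, purely for illustration).
W = [[1.0 if i == j else 0.0 for j in range(d)] for i in range(d)]

# Trainable low-rank factors. B starts at zero so the adapter initially
# leaves the model's behavior unchanged.
B = [[0.0] * r for _ in range(d)]                           # d x r
A = [[0.01 * (i + j) for j in range(d)] for i in range(r)]  # r x d

def adapted_weight(W, B, A):
    """Effective weight used at inference: W + B @ A."""
    delta = matmul(B, A)
    return [[W[i][j] + delta[i][j] for j in range(d)] for i in range(d)]

full_params = d * d          # parameters updated by full fine-tuning
lora_params = d * r + r * d  # parameters updated by the low-rank adapter
unchanged_at_start = adapted_weight(W, B, A) == W

print(f"full fine-tuning: {full_params} trainable parameters")
print(f"low-rank adapter: {lora_params} trainable parameters")
print("model unchanged before training:", unchanged_at_start)
```

The savings grow with layer size: the adapter's parameter count scales with `d * r` rather than `d * d`, and `r` is typically kept very small.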

### Retrieval Augmented Generation (RAG)

Retrieval Augmented Generation (RAG) is an AI framework designed to enhance the quality of responses generated by Large Language Models (LLMs) by integrating up-to-date and contextually relevant information from external sources. This framework addresses the limitations of LLMs, such as inconsistency and lack of domain-specific knowledge, thereby reducing the likelihood of generating incorrect or irrelevant responses. 


![image.png](attachment:image.png)

The diagram above outlines the fundamental RAG pipeline, consisting of three key components:

- **Ingestion:** Documents are split into chunks, embeddings are generated from these chunks, and the embeddings are stored in an index. Chunking makes it possible to pinpoint the passages relevant to a given query, as in a standard retrieval system.
- **Retrieval:** When a query is received, the system uses the index of embeddings to retrieve the top-k chunks most similar to the query.
- **Synthesis:** Using the retrieved chunks as contextual information, the LLM formulates a response grounded in that knowledge.

Overall, the RAG framework enhances accuracy, allows for source verification, and reduces the need for continuous model retraining.
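The three stages can be sketched end to end in plain Python. This is a toy illustration only: the "embedding" is a bag-of-words count rather than a neural encoder, the chunks are invented, and the final LLM call is left out, so the sketch stops at the prompt that would be sent to the model.

```python
import math
from collections import Counter

def embed(text):
    """Toy bag-of-words 'embedding'; a real system would use a neural encoder."""
    cleaned = text.lower().replace(".", "").replace("?", "")
    return Counter(cleaned.split())

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

# Ingestion: embed each (made-up) chunk and store it in an index.
chunks = [
    "Influenza vaccines are recommended every year.",
    "Common cold symptoms include a runny nose and sneezing.",
    "Regular hand washing reduces the spread of infections.",
]
index = [(chunk, embed(chunk)) for chunk in chunks]

def retrieve(query, k=2):
    """Retrieval: return the k chunks most similar to the query."""
    q = embed(query)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:k]]

def build_rag_prompt(query, k=2):
    """Synthesis input: the retrieved chunks become context for the LLM."""
    context = "\n".join(f"- {c}" for c in retrieve(query, k))
    return (f"Answer using only the context below.\n"
            f"Context:\n{context}\nQuestion: {query}\nAnswer:")

question = "What are the symptoms of a common cold?"
top_chunk = retrieve(question, k=1)[0]
rag_prompt = build_rag_prompt(question)
print(rag_prompt)
```

Because the answer is grounded in retrieved text, updating the system's knowledge only requires re-indexing documents, not retraining the model.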

-------------------------------------------------------------------------------------------------------------------------------

### My Data Generation Task

Now that we've got a good handle on generative AI and fine-tuning methods, let's roll up our sleeves and see how all this theory translates into real-world action with MediPedia. With a clever blend of open source technologies like Sentence Transformers, Faiss CPU, and the Llama 2 large language model, MediPedia is your go-to resource for medical knowledge and assistance.


This is the architecture diagram of the application -

![image.png](attachment:image.png)

Unfortunately, due to the nature of MediPedia as a Chainlit application that runs locally, the code cannot be directly included in this Jupyter Notebook. However, you can access the complete codebase on the GitHub repository provided here and also watch the code walkthrough and demo of the application.

### **Check out the following link to see what I have implemented:** https://github.com/deodharaditi/MediPedia-Your-Go-To-Resource-For-Medical-Knowledge-And-Assistance/tree/main

### **Demo :** https://youtu.be/Ry03k8ODI0g (Watch it in 2x for best experience)

To run MediPedia on your own machine, simply follow the instructions outlined in the README file of the repository.

The input to MediPedia consists of medical queries or prompts entered by the user, which can range from questions about symptoms, conditions, treatments, to general health-related inquiries. The output comprises contextually relevant responses generated by the Llama 2 large language model, tailored to address the user's query effectively.

Below are some examples of input queries and corresponding output responses:

``Input: "What are the symptoms of COVID-19?"``

``Output: "Common symptoms of COVID-19 include fever, cough, and shortness of breath. However, it's important to note that symptoms may vary from person to person."``

``Input: "How can I prevent the flu?"``

``Output: "To prevent the flu, it's recommended to get vaccinated annually, practice good hand hygiene, avoid close contact with sick individuals, and maintain a healthy lifestyle."``

Below are the two main scripts required to run the application: `ingest.py` and `model.py`. Together, they ingest medical documents, set up a question-answering system, and handle the interaction with users.


**NOTE** : **These scripts are included for reference only and will not run in a Jupyter Notebook.**

In [None]:
# ingest.py

from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_community.document_loaders import PyPDFLoader, DirectoryLoader
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.vectorstores import FAISS

DATA_PATH = "data/"
DB_FAISS_PATH = "vectorstores/db_faiss"

# create vector database
def create_vector_db():
    loader = DirectoryLoader(DATA_PATH, glob='*.pdf', loader_cls=PyPDFLoader)
    documents = loader.load()
    text_splitter = RecursiveCharacterTextSplitter(chunk_size = 500, chunk_overlap = 50)
    texts = text_splitter.split_documents(documents)

    embeddings = HuggingFaceEmbeddings(model_name = 'sentence-transformers/all-MiniLM-L6-v2', 
                                       model_kwargs = {'device' : 'cpu'})

    db = FAISS.from_documents(texts, embeddings)
    db.save_local(DB_FAISS_PATH)


if __name__ == '__main__':
    create_vector_db()

The `ingest.py` script loads the PDF documents from the `data/` directory, splits their text into chunks of up to 500 characters with a 50-character overlap, and generates an embedding for each chunk using the pre-trained Sentence Transformers model `all-MiniLM-L6-v2`. It then builds a FAISS vector database from the embeddings and saves it to `vectorstores/db_faiss`.
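To make the chunking step more concrete, here is a minimal sketch of splitting text into overlapping windows. It is a deliberate simplification: `RecursiveCharacterTextSplitter` prefers to break on separators such as paragraphs and sentences, whereas the hypothetical `chunk_text` helper below cuts at fixed character offsets. It only illustrates what `chunk_size` and `chunk_overlap` mean.

```python
def chunk_text(text, chunk_size=500, chunk_overlap=50):
    """Split text into fixed-size character windows that overlap.

    Simplified illustration only; this is not how
    RecursiveCharacterTextSplitter chooses its split points.
    """
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")

    step = chunk_size - chunk_overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks


# A 1200-character dummy document (assumed input, for illustration)
sample = "".join(chr(97 + i % 26) for i in range(1200))
chunks = chunk_text(sample)
print([len(c) for c in chunks])
```

The overlap means the last 50 characters of one chunk reappear at the start of the next, so a sentence that straddles a boundary is still seen whole in at least one chunk.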

In [None]:
# model.py

from langchain_core.prompts import PromptTemplate
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.vectorstores import FAISS 
from langchain_community.llms import CTransformers 
from langchain.chains import RetrievalQA 
import chainlit as cl 

DB_FAISS_PATH = "vectorstores/db_faiss"

custom_prompt_template = """Use the following pieces of information to answer the user's question. If you don't know the answer, just say that you don't know, do not try to make up the answer.

Context : {context}
Question : {question}

Only return the helpful answer below and nothing else.
Helpful answer:
"""

def set_custom_prompt():
    """
    Build the prompt template used for QA retrieval over the vector store
    """

    prompt = PromptTemplate(template=custom_prompt_template, 
                            input_variables=['context', 'question'])

    return prompt

def retrieval_qa_chain(llm, prompt, db):
    qa_chain = RetrievalQA.from_chain_type(
        llm = llm,
        chain_type = 'stuff',
        retriever = db.as_retriever(search_kwargs={'k':2}),
        return_source_documents = True,
        chain_type_kwargs =  {'prompt' : prompt}
    )

    return qa_chain

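def load_llm():
    # Load the quantized Llama 2 chat model through the ctransformers backend.
    # Generation settings are passed through the `config` dictionary.
    llm = CTransformers(
       model = "TheBloke/Llama-2-7B-Chat-GGML",
       model_type = "llama",
       config = {'max_new_tokens': 512, 'temperature': 0.5}
    )

    return llm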

def qa_bot():
    embeddings = HuggingFaceEmbeddings(model_name = 'sentence-transformers/all-MiniLM-L6-v2', 
                                       model_kwargs = {'device' : 'cpu'},
                                       )
    db = FAISS.load_local(DB_FAISS_PATH, embeddings, allow_dangerous_deserialization=True)
    llm = load_llm()
    qa_prompt = set_custom_prompt()
    qa = retrieval_qa_chain(llm, qa_prompt, db)

    return qa

def final_result(query):
    qa_result = qa_bot()
    response =  qa_result({'query' : query})

    return response


# Chainlit UI

@cl.on_chat_start
async def start():
    chain = qa_bot()
    msg = cl.Message(content= "Starting the systems...")
    await msg.send()
    msg.content = "Hi, welcome to MediPedia, how can I help you today?"
    await msg.update()
    cl.user_session.set("chain", chain)

@cl.on_message
async def main(message: cl.Message):
    chain = cl.user_session.get("chain")
    cb = cl.AsyncLangchainCallbackHandler(
       stream_final_answer= True, answer_prefix_tokens= ["FINAL", "ANSWER"]
    )
    cb.answer_reached = True
    res = await chain.acall(message.content, callbacks=[cb])
    answer = res["result"]
    sources = res["source_documents"]

    if sources:
        answer += "\nSources: " + str(sources)
    else:
        answer += "\nNo sources found"

    await cl.Message(content=answer).send()

The `model.py` script defines functions to set up a question-answering (QA) system using a combination of language models and vector stores. Here's a breakdown of what it does:

1. **Setting up Custom Prompt Template**: Defines a custom prompt template for the QA retrieval process. The template includes placeholders for context and question, which are filled in during the retrieval process.

2. **Creating QA Chain**: Sets up a question-answering chain using the `RetrievalQA` class from LangChain. The chain pairs the language model (LLM) with a retriever that searches the vector database and returns the two most similar chunks (`k = 2`); the `'stuff'` chain type inserts those chunks directly into the prompt as context.

3. **Loading Language Model (LLM)**: Loads the pre-trained Llama 2 7B chat model (GGML format) through LangChain's `CTransformers` wrapper. This model generates the responses to user queries.

4. **Setting Up QA Bot**: Loads the embedding model and the saved FAISS index, then combines them with the language model and the prompt to create the question-answering bot.

5. **Defining Function for Generating Response**: Defines a function `final_result()` that takes a user query as input, runs it through the QA bot, and returns the response generated by the bot.

6. **Chainlit UI Configuration**: Configures the Chainlit user interface for interacting with the QA bot. It sets up message handlers for starting the QA system and processing user queries, including sending and receiving messages from the user.
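The retrieval and prompt-filling steps described above can be sketched without any libraries. The example below is an illustration under stated assumptions, not the LangChain implementation: the chunk texts and their 3-dimensional vectors are made up (real `all-MiniLM-L6-v2` embeddings have 384 dimensions), the helper names `squared_l2` and `top_k` are hypothetical, and ranking by squared Euclidean distance and joining chunks with a blank line are assumed to mirror the defaults of the FAISS store and the `'stuff'` chain.

```python
PROMPT = """Use the following pieces of information to answer the user's question.

Context : {context}
Question : {question}

Helpful answer:
"""


def squared_l2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def top_k(query_vec, store, k=2):
    """Return the k chunk texts whose vectors lie closest to the query."""
    ranked = sorted(store, key=lambda item: squared_l2(query_vec, item[1]))
    return [text for text, _ in ranked[:k]]


# Made-up chunks and toy vectors standing in for the saved vector database
store = [
    ("Influenza vaccines are offered annually.", [0.9, 0.1, 0.0]),
    ("Hand washing lowers the spread of respiratory viruses.", [0.7, 0.3, 0.1]),
    ("A fractured wrist is usually immobilised in a cast.", [0.0, 0.2, 0.9]),
]

question = "How can I prevent the flu?"
query_vec = [0.8, 0.2, 0.0]  # pretend embedding of the question

# "Stuff" the retrieved chunks into the prompt as one context string
retrieved = top_k(query_vec, store, k=2)
prompt = PROMPT.format(context="\n\n".join(retrieved), question=question)
print(prompt)
```

Only the two nearest chunks reach the language model, which keeps the prompt short enough for a CPU-hosted model while still grounding the answer in the ingested documents.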

Below are a few screenshots of the interactive chatbot :)



![image-5.png](attachment:image-5.png)

↪️ Welcome page

![image-6.png](attachment:image-6.png)

↪️ Here, we can see the chain of steps running in the background to answer the question we asked.

![image-7.png](attachment:image-7.png)

↪️ This is the generated response. The source of the information is shown along with the answer.

![image-8.png](attachment:image-8.png)

↪️ This response shows that the chatbot does not mislead the user: it is set up to be transparent about what it does not know and does not act as a doctor.
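In the screenshots, the sources appear as the raw string form of the retrieved documents, because `model.py` appends `str(sources)` to the answer. As a possible refinement, the sketch below builds a shorter source list. It is an assumption-laden illustration: `format_sources` is a hypothetical helper, the file names are invented, and plain dictionaries stand in for each document's `metadata`, which is assumed to carry `source` and `page` keys as documents loaded with `PyPDFLoader` typically do.

```python
def format_sources(metadatas):
    """Turn a list of metadata dicts into a short, de-duplicated source list."""
    labels = []
    for meta in metadatas:
        label = f"{meta.get('source', 'unknown')} (page {meta.get('page', '?')})"
        if label not in labels:  # keep the first occurrence, preserve order
            labels.append(label)

    if not labels:
        return "No sources found"
    return "Sources:\n" + "\n".join(f"- {label}" for label in labels)


# Invented metadata for two retrieved chunks from the same page
example = [
    {"source": "data/medical_reference.pdf", "page": 12},
    {"source": "data/medical_reference.pdf", "page": 12},
]
print(format_sources(example))
```

In `model.py`, such a helper could be fed `[doc.metadata for doc in sources]` instead of converting the whole list to a string.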

#### What are some advantages of this?

- **Open-Source Architecture:** Leveraging open-source technologies, MediPedia fosters collaboration, innovation, and continuous improvement within the healthcare community.

- **Reliable Medical Information:** Answers are grounded in the medical documents ingested into the vector database and are returned together with their sources, so users can check where the information comes from.

- **Ease of Access:** MediPedia's user-friendly interface makes it accessible to a wide range of users, including patients, caregivers, and healthcare providers, facilitating seamless interaction and knowledge sharing.

#### Drawbacks?

- **Limited Language Coverage:** While robust, MediPedia's language proficiency may be limited to specific languages, potentially hindering its accessibility to users who speak other languages.

- **Performance Constraints:** Because MediPedia runs on CPU, response generation is slower than on GPU-based systems, which lengthens response times and affects the user experience.


#### Applications:

By combining generative AI with retrieval over medical documents, the platform delivers context-aware responses and supports a set of valuable applications:

1. **Health Education**: Medical practitioners and educators can utilize MediPedia as an educational tool to disseminate accurate information about diseases, medications, and preventive healthcare measures. It can serve as a supplementary resource in medical training programs and patient education initiatives.

2. **Public Health Initiatives**: Governments and public health agencies can deploy MediPedia as part of public health campaigns to raise awareness about prevalent diseases, vaccination programs, and lifestyle modifications. It can facilitate the dissemination of critical health information to the general population, promoting disease prevention and early intervention.

### Conclusion

In conclusion, our journey through this Jupyter Notebook has been a deep dive into the realm of generative AI and its practical applications. We've explored the fundamentals of generative models, fine-tuning techniques, and real-world implementations like MediPedia. From content generation to language translation and beyond, we've witnessed the transformative potential of AI in reshaping industries and enhancing human experiences.

Yet, alongside the promise of AI innovation, we've also confronted ethical considerations, urging responsible development and deployment. Addressing biases, misinformation, and privacy concerns remains paramount as we navigate the evolving landscape of AI technology.

As we conclude, let us carry forward the knowledge gained, championing responsible AI practices and striving for positive societal impact. With innovation as our compass, let's continue to push boundaries, shape the future, and empower humanity to thrive in a rapidly advancing digital era. 🌟🤖

### References

1. Generative Adversarial Network - Retrieved from https://paperswithcode.com/method/gan
2. Generative AI for dummies(1/3)💡 - Retrieved from https://million-rare.medium.com/generative-ai-for-dummies-1-3-7ff6abc72850
3. Attention is all you need - Retrieved from https://arxiv.org/abs/1706.03762
4. Rise of Gen AI Image source - Retrieved from https://informationisbeautiful.net/visualizations/the-rise-of-generative-ai-large-language-models-llms-like-chatgpt/
5. Awesome Gen AI guide - Retrieved from https://github.com/aishwaryanr/awesome-generative-ai-guide/tree/main
6. Cracks in the Facade: Flaws of Large Language Models - Retrieved from https://datasciencedojo.com/blog/challenges-of-large-language-models/

### Additional Resources:

Here are some additional resources to further explore the topics covered in this notebook, deepen your understanding, and stay updated on the latest developments in AI and generative modeling:

1. **Books**:
   - "GPT-3: The Next Evolution in AI" by OpenAI
   - "Deep Learning" by Ian Goodfellow, Yoshua Bengio, and Aaron Courville
   - "Natural Language Processing with Python" by Steven Bird, Ewan Klein, and Edward Loper <br>
<br>   
2. **Online Courses**:
   - Coursera: "Natural Language Processing" by University of Michigan
   - Coursera: "Deep Learning Specialization" by Andrew Ng
   - edX: "Deep Learning Explained" by Microsoft <br>
<br>  
3. **Research Papers**:
   - "Attention Is All You Need" by Vaswani et al. (Introducing Transformer architecture)
   - "BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding" by Devlin et al. (Introduction to BERT)
   - "Language Models are Few-Shot Learners" by Brown et al. (GPT-3 paper) <br>
 <br>  
4. **Websites and Blogs**:
   - OpenAI Blog: Offers insights into the latest advancements in AI research and development.
   - Towards Data Science: A platform for sharing data science and machine learning articles, tutorials, and resources.
   - Medium: Various authors and publications provide articles on AI, NLP, and machine learning topics.<br>
 <br>  
5. **GitHub Repositories**:
   - Hugging Face Transformers: A repository containing implementations of state-of-the-art models and pre-trained checkpoints.
   - AllenNLP: An open-source NLP research library built on PyTorch, providing various tools and pre-trained models.<br>
   <br>
6. **YouTube Channels**:
   - Two Minute Papers: Provides concise summaries and insights into various research papers in AI and machine learning.
   - CodeEmporium: Offers tutorials on implementing machine learning models and concepts. <br>
  <br> 
7. **Community Forums**:
   - Reddit's r/MachineLearning: A forum for discussions, questions, and news related to machine learning.
   - Stack Overflow: A platform where you can ask questions and find answers related to programming, including AI and machine learning.
   <br>

                                                             --------x---------

### License

Copyright 2024 Aditi Deodhar

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the “Software”), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.