# AutoAssistAI

In [4]:
# Imports
import os
import json
import textwrap
import chromadb
import langchain
import sqlalchemy
import langchain_openai
from langchain_openai import OpenAI
from langchain.prompts import PromptTemplate
from langchain.chains import LLMChain
from langchain.chains import SimpleSequentialChain
from langchain.chains import SequentialChain
from langchain.memory import ConversationBufferMemory
from langchain.chains import ConversationChain
from langchain.memory import ConversationBufferWindowMemory
from langchain.document_loaders import WebBaseLoader
from langchain.indexes import VectorstoreIndexCreator
from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores import Chroma
from langchain.chains import RetrievalQA
from langchain.chat_models import ChatOpenAI
import warnings
warnings.filterwarnings('ignore')

## **Getting to Know LangChain**

[Official documentation](https://python.langchain.com/docs/get_started/introduction)

LangChain is a powerful framework designed to simplify the development of applications that use language models (LLMs). Its modular structure and versatility allow developers to build a wide range of solutions, from simple automation tasks to complex systems like chatbots, question-and-answer platforms, and more.

---

**What is LangChain?**  
LangChain is an open-source library that bridges the gap between LLMs and real-world applications by enabling seamless integration with various tools, data sources, and workflows. Its goal is to simplify the development process while offering robust capabilities for building intelligent applications.

---

**Core Features of LangChain**

1. **Modularity and Customization**  
   LangChain's modular design allows developers to integrate components like LLMs, prompt templates, memory, and agents. Each component can be customized to meet specific requirements, making the framework flexible and versatile.

2. **Integration with External Data**  
   One of LangChain's key features is Retrieval-Augmented Generation (RAG), enabling applications to retrieve and use external data sources like web content, documents, or APIs to provide more accurate and context-aware responses.

3. **Memory Management**  
   LangChain provides several types of memory, such as buffer memory (the full conversation history) and summary memory (a condensed version of it), allowing applications to maintain context across the turns of a conversation.

4. **Agent Framework**  
   LangChain supports agents capable of dynamically deciding which tools or APIs to use based on user inputs, adding a layer of intelligence to your applications.

5. **Wide Compatibility**  
   It works seamlessly with a variety of LLMs, such as OpenAI's GPT models, Hugging Face transformers, and custom fine-tuned models, ensuring flexibility in choosing the best model for your use case.

---

**Applications of LangChain**

- **Chatbots**: Create intelligent and context-aware conversational agents.  
- **Question-Answer Systems**: Build systems capable of answering domain-specific questions using RAG and external data.  
- **Automated Processes**: Develop tools for summarizing, translating, or analyzing text data.  
- **Custom LLM Solutions**: Tailor a language model's behavior with prompts, retrieval, and tools to address unique business problems.

---

**Why Use LangChain?**

LangChain simplifies the integration of language models with external tools and data sources, accelerating the development of sophisticated AI-driven applications. Whether you're building a chatbot, a data-powered assistant, or a customized LLM, LangChain offers the tools and flexibility to bring your ideas to life.


---

### **Diving Deeper into LangChain Components**

LangChain’s architecture is built around several core components, each designed to perform a specific function that simplifies the integration and application of large language models (LLMs). Below, we’ll explore these components in detail:

---

**1. Models**  
The **model** is the heart of LangChain. It interacts with the language model (LLM) to generate predictions, completions, or responses.

- **Supported Models:**  
  LangChain supports a wide range of LLMs, including:
  - OpenAI's GPT (e.g., GPT-3.5, GPT-4).
  - Hugging Face Transformers.
  - Open-source models (e.g., Llama, BLOOM, Falcon).
  
- **Customization:**  
  Developers can adjust generation parameters such as temperature and plug in fine-tuned or specialized pre-trained models for domain-specific tasks.

---

**2. Prompts**  
Prompts define how input is structured and presented to the LLM. Crafting effective prompts is crucial for achieving accurate and relevant responses.

- **Prompt Templates:**  
  LangChain provides tools for creating reusable templates with placeholders for dynamic inputs.  
  Example:  
  ```python
  from langchain.prompts import PromptTemplate

  prompt = PromptTemplate(
      input_variables=["context", "question"],
      template="Use the following context to answer the question:\n\n{context}\n\nQuestion: {question}"
  )
  ```
  
- **Prompt Optimization:**  
  LangChain facilitates testing and iteration of prompts to maximize model performance.

---

**3. Memory**  
Memory allows the system to retain information between interactions, making applications context-aware.

- **Types of Memory:**  
  - **ConversationBufferMemory:** Stores the entire conversation history.  
  - **ConversationSummaryMemory:** Summarizes past interactions to maintain context efficiently.  
  - **VectorStoreRetrieverMemory:** Uses embeddings to retrieve relevant context dynamically.

- **Use Case:**  
  For chatbots, memory ensures that the bot understands and maintains context throughout a conversation.
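
The difference between keeping the whole history and keeping only recent turns can be sketched in plain Python. This is an illustration of the idea only, not LangChain's implementation; the class names `BufferMemory` and `WindowMemory` and the sample dialogue are hypothetical.

```python
from collections import deque


class BufferMemory:
    """Keeps every exchange (hypothetical stand-in for a full conversation buffer)."""

    def __init__(self):
        self.turns = []

    def save(self, user, ai):
        self.turns.append((user, ai))

    def as_context(self):
        # Render the stored turns as text that could be prepended to a prompt
        return "\n".join(f"Human: {u}\nAI: {a}" for u, a in self.turns)


class WindowMemory(BufferMemory):
    """Keeps only the last k exchanges; older ones are dropped automatically."""

    def __init__(self, k):
        self.turns = deque(maxlen=k)


full = BufferMemory()
window = WindowMemory(k=2)
dialogue = [("Hi", "Hello!"), ("I like sushi", "Noted."), ("Suggest a dish", "Try nigiri.")]
for user, ai in dialogue:
    full.save(user, ai)
    window.save(user, ai)

print(len(full.turns))    # 3
print(len(window.turns))  # 2
print(window.as_context())
```

A window keeps prompts short at the cost of forgetting early turns, which is the trade-off a summary-based memory tries to soften.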

---

**4. Chains**  
Chains are sequences of operations that transform inputs into outputs. LangChain allows developers to build complex workflows by chaining multiple components together.

- **LLMChain:**  
  The simplest type of chain, consisting of a prompt and an LLM.  
  Example:  
  ```python
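  from langchain.chains import LLMChain
  from langchain_openai import ChatOpenAI

  # gpt-4 is a chat model, so it is loaded through ChatOpenAI rather than the completion-style OpenAI class
  llm = ChatOpenAI(model="gpt-4")
  # `prompt` is the PromptTemplate defined in the Prompts section above
  chain = LLMChain(llm=llm, prompt=prompt)
  response = chain.invoke({"context": "AI is transforming industries.", "question": "How is it used in healthcare?"})
  print(response["text"])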
  ```
  
- **Sequential Chains:**  
  Combine multiple chains to perform more complex tasks, such as summarization followed by question-answering.
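
The idea of feeding one step's output into the next can be sketched without any library. The snippet below is a plain-Python illustration under stated assumptions: `fake_llm` is a hypothetical stand-in that returns canned text instead of calling a model, and `make_step` and `run_sequence` are illustrative helpers, not LangChain functions.

```python
def fake_llm(prompt):
    """Stand-in for a real model call; returns canned text (assumption)."""
    if prompt.startswith("Summarize"):
        return "AI helps doctors read scans faster."
    return f"Answer based on: {prompt.splitlines()[-1]}"


def make_step(template, llm):
    """Build a step that fills the template and sends it to the model."""
    def step(text):
        return llm(template.format(text=text))
    return step


def run_sequence(steps, first_input):
    """Pass each step's output to the next step as its input."""
    value = first_input
    for step in steps:
        value = step(value)
    return value


summarize = make_step("Summarize this:\n{text}", fake_llm)
answer = make_step("Using this summary, how is AI used in healthcare?\n{text}", fake_llm)

result = run_sequence([summarize, answer], "Long article about AI in hospitals...")
print(result)  # Answer based on: AI helps doctors read scans faster.
```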

---

**5. Tools and Agents**  
Agents are decision-makers that dynamically decide which tools to use based on user input. Tools provide external capabilities, such as searching the web or accessing APIs.

- **Tools:**  
  Common tools include:
  - **Web Search:** Retrieve real-time information.
  - **Calculators:** Perform mathematical computations.
  - **Databases:** Query structured or unstructured data.

- **Agents:**  
  Agents use prompts to decide which tool to invoke and how to handle responses.  
  Example: An agent might search the web for information if a question cannot be answered using the LLM alone.
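
A rough sketch of the decide-then-dispatch loop is shown below. It is not how LangChain agents are implemented: in a real agent the LLM makes the decision, whereas here `choose_tool` is a hypothetical rule-based stand-in, and the two tools are toy functions.

```python
import re


def calculator(expression):
    """Toy tool: add or multiply two integers written like '3 + 4' or '6 * 7'."""
    left, op, right = re.fullmatch(r"\s*(\d+)\s*([+*])\s*(\d+)\s*", expression).groups()
    result = int(left) + int(right) if op == "+" else int(left) * int(right)
    return str(result)


def lookup(term):
    """Toy tool: look a term up in a tiny in-memory dictionary."""
    facts = {"langchain": "LangChain is an open-source framework for LLM applications."}
    return facts.get(term.lower(), "No entry found.")


TOOLS = {"calculator": calculator, "lookup": lookup}


def choose_tool(question):
    """Stand-in for the LLM's decision step (assumption: a simple regex rule)."""
    match = re.search(r"\d+\s*[+*]\s*\d+", question)
    if match:
        return "calculator", match.group()
    return "lookup", question.rstrip("?").split()[-1]


def run_agent(question):
    tool_name, tool_input = choose_tool(question)
    return tool_name, TOOLS[tool_name](tool_input)


print(run_agent("What is 6 * 7?"))      # ('calculator', '42')
print(run_agent("What is LangChain?"))
```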

---

**6. Data Connectors**  
LangChain supports **Retrieval-Augmented Generation (RAG)** by integrating with external data sources. This makes LLMs more powerful and capable of providing accurate, context-specific answers.

- **Data Sources:**  
  - **Vector Databases:** Pinecone, Weaviate, FAISS.  
  - **Document Loaders:** PDFs, Excel files, web scraping.  
  - **APIs:** Integrate third-party APIs for live data retrieval.

- **Embedding Models:**  
  LangChain allows embeddings to be generated for indexing and searching data. This ensures relevant information is retrieved efficiently.
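
To make embedding-based retrieval concrete, here is a minimal sketch that ranks documents by cosine similarity. The three-dimensional vectors are made-up numbers rather than output from a real embedding model, and `index` and `retrieve` are hypothetical names; a vector database performs the same kind of comparison at scale.

```python
import math


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy "embeddings": invented values, not produced by a real model
index = {
    "Sushi is a Japanese dish made with vinegared rice.": [0.9, 0.1, 0.0],
    "Tacos are a traditional Mexican food.": [0.1, 0.9, 0.0],
    "LangChain connects LLMs to external data.": [0.0, 0.1, 0.9],
}


def retrieve(query_vector, k=1):
    """Return the k documents whose vectors are closest to the query vector."""
    ranked = sorted(
        index,
        key=lambda doc: cosine_similarity(query_vector, index[doc]),
        reverse=True,
    )
    return ranked[:k]


query_vector = [0.8, 0.2, 0.1]  # pretend embedding of "What is sushi?"
print(retrieve(query_vector))
```

In a RAG pipeline, the retrieved text would then be placed into the prompt as context for the LLM.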

---

**7. Evaluation**  
LangChain includes tools for evaluating and debugging applications to ensure they meet performance requirements.

- **Human-in-the-Loop (HITL):**  
  Involve human evaluators to assess the quality of responses.  
- **Automated Evaluation:**  
  Use metrics like BLEU, ROUGE, or accuracy to measure performance.
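
As an illustration of automated evaluation, the sketch below computes exact-match accuracy and a simplified ROUGE-1 recall (unigram overlap on whitespace-split, lowercased words). It is a hand-rolled approximation for teaching purposes, not a LangChain API; real evaluations typically rely on an established metrics package.

```python
from collections import Counter


def exact_match_accuracy(predictions, references):
    """Fraction of predictions that equal their reference, ignoring case and outer spaces."""
    hits = sum(p.strip().lower() == r.strip().lower() for p, r in zip(predictions, references))
    return hits / len(references)


def rouge1_recall(prediction, reference):
    """Share of reference words that also appear in the prediction (clipped counts)."""
    pred_counts = Counter(prediction.lower().split())
    ref_counts = Counter(reference.lower().split())
    overlap = sum(min(count, pred_counts[word]) for word, count in ref_counts.items())
    return overlap / sum(ref_counts.values())


print(exact_match_accuracy(["Paris", "Rome"], ["paris", "Madrid"]))  # 0.5
print(rouge1_recall("the cat sat on the mat", "the cat is on the mat"))
```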

---

**8. Deployment**  
LangChain applications can be deployed on various platforms, making them scalable and production-ready.

- **Cloud Platforms:** AWS, GCP, Azure.  
- **Dockerization:** Containerize LangChain apps for easy deployment.  
- **Integration with APIs:** Expose the functionality as RESTful APIs for external use.

---

**9. Advanced Features**  
- **Streaming:** LangChain supports streaming responses for real-time applications like live chat interfaces.  
- **Callbacks:** Monitor and log the internal workflow of chains and agents for debugging or tracking.
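
Streaming and callbacks can be pictured with a plain Python generator. This is only a conceptual sketch: `stream_tokens` is a hypothetical function that splits a fixed string, whereas a real streaming response arrives piece by piece from the model.

```python
def stream_tokens(text, on_token=None):
    """Yield a reply word by word, calling an optional callback for each piece."""
    for token in text.split():
        if on_token is not None:
            on_token(token)  # e.g. log the token or update a UI
        yield token


log = []
received = []
for token in stream_tokens("Silk and Spice Thai Bistro", on_token=log.append):
    received.append(token)  # a chat interface would display each piece as it arrives

print(received)
```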

---

**Why These Components Matter**  
Each component is modular and can be independently configured, allowing developers to:
- Customize solutions for specific use cases.
- Scale applications without overhauling existing structures.
- Ensure high performance and efficiency by leveraging the best tools and integrations.

---


## Defining the LLM

In [11]:
# Adding the API key
with open('../ignore/secret_key.json') as f:
    os.environ['OPENAI_API_KEY'] = json.load(f)['secret_key']
    

# Defines the LLM
# Creates an instance of a Large Language Model (LLM), specifically one provided by OpenAI
llm = OpenAI(temperature=0.9)

Temperature is a sampling parameter that controls the randomness of the responses generated by the model. The OpenAI API accepts values from 0 to 2. Higher values, such as the 0.9 used here, promote more creative and varied responses, while values close to 0 make the output more deterministic and predictable.
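
One way to see the effect of temperature is to apply it to a softmax over a few scores. The sketch below assumes the common formulation in which scores are divided by the temperature before the softmax; the `logits` values are invented, and the function requires a temperature greater than 0.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities; temperature must be greater than 0."""
    scaled = [value / temperature for value in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(value - peak) for value in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.5]  # made-up scores for three candidate tokens
for temperature in (0.2, 0.9, 2.0):
    probs = softmax_with_temperature(logits, temperature)
    print(temperature, [round(p, 3) for p in probs])
```

At low temperature almost all probability goes to the top-scoring token; at high temperature the distribution flattens, so less likely tokens are sampled more often.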

In [13]:
# Send the prompt to LLM and capture the response
nome = llm.invoke("I want to open a Japanese food restaurant. Suggest a fancy name for it.")
print(nome)



"Sakura Garden Dining" 


In this context, the string “I want to open a Japanese food restaurant. Suggest a fancy name for it.” serves as the prompt or input to the language model. It describes the task the user wants the model to perform: creatively generating a name for a new Japanese food restaurant. The model will use its natural language training and prior knowledge to generate a response that meets this request.

---

## Using Prompt Templates

Prompt Templates in the context of LangChain refer to structured ways of formatting input to large language models (LLMs) to improve their performance and adherence to desired behaviors.

A prompt template defines a template sequence with placeholder variables that can be populated dynamically. This allows you to construct prompts in a consistent and programmatic manner, rather than hard-coding full prompts.

Prompt templates in LangChain provide a structured and extensible way to interface with LLMs, making it easy to explore and optimize prompting strategies to improve language model performance on specific tasks or domains.

In [14]:
# Set the prompt template
prompt_template_name = PromptTemplate(
    input_variables = ['cuisine'],
    template = "I want to open a {cuisine} restaurant. Suggest a fancy name for it."
)

The code above defines a PromptTemplate, a class that lets you create dynamic prompts for use with Large Language Models (LLMs). This approach is particularly useful when you want to generate custom prompts from specific variables or reuse one prompt format with different values.

**input_variables = ['cuisine']**: Defines a list of variables that can be used to populate the template. In this case, there is a single variable called 'cuisine'. This variable acts as a placeholder that will be replaced with a specific value when the template is used.
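
The relationship between the curly-brace placeholders and the declared variables can be checked with the standard library alone. This sketch uses Python's `string.Formatter` rather than LangChain itself; `declared` is a hypothetical name for the list passed as `input_variables`.

```python
from string import Formatter

template = "I want to open a {cuisine} restaurant. Suggest a fancy name for it."

# Collect the placeholder names that appear inside curly braces
placeholders = [name for _, name, _, _ in Formatter().parse(template) if name]
print(placeholders)  # ['cuisine']

# Compare them with the variables the template declares
declared = ["cuisine"]
missing = set(placeholders) - set(declared)
print(missing)  # set()

# Filling the placeholder works like ordinary string formatting
print(template.format(cuisine="Mexican"))
```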

In [16]:
# Use the previously defined template to generate a specific prompt,
# inserting the value "Mexican" in place of the cuisine variable
p = prompt_template_name.format(cuisine = "Mexican")
print(p)

I want to open a Mexican restaurant. Suggest a fancy name for it.


## Operation Sequences with LLMChain

Chains in LangChain are sequences of operations that can process inputs and generate outputs by combining multiple components, including large language models (LLMs), other chains, and specialized tools or utilities.

An LLMChain is a type of chain that allows you to interact with a large language model (LLM) in a structured way. It provides a simple interface for passing inputs to the LLM and retrieving its outputs.

The LLMChain serves as a building block for many other constructs in LangChain, such as agents, tools, and more advanced chain types. By encapsulating the LLM interaction logic in a reusable and extensible component, LLMChain simplifies the process of building applications that leverage large language models.

In [18]:
# Create the chain and activate verbose
chain = LLMChain(llm = llm, prompt = prompt_template_name, verbose = True)

# Invoke the chain by passing a parameter to the prompt
chain.invoke("Brazilian")



> Entering new LLMChain chain...
Prompt after formatting:
I want to open a Brazilian restaurant. Suggest a fancy name for it.

> Finished chain.


{'cuisine': 'Brazilian',
 'text': '\n\n"Sabor do Brasil" which means "Taste of Brazil" in Portuguese.'}

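The code above creates an instance of LLMChain, a class that connects a prompt template to an LLM. The instance is configured with the language model and prompt template defined earlier. Calling `invoke` with a single value fills the template's only variable, and the result is a dictionary containing the input (`cuisine`) and the generated `text`.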
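
The shape of that result can be mimicked with a small plain-Python class. `TinyChain` and `canned_llm` are hypothetical stand-ins written for this illustration: they are not LangChain's implementation, and no model is called.

```python
class TinyChain:
    """Stand-in that formats a prompt, calls a model, and returns a dict."""

    def __init__(self, llm, template, input_key, verbose=False):
        self.llm = llm
        self.template = template
        self.input_key = input_key
        self.verbose = verbose

    def invoke(self, value):
        prompt = self.template.format(**{self.input_key: value})
        if self.verbose:
            print(f"Prompt after formatting:\n{prompt}")
        # Return the input alongside the generated text
        return {self.input_key: value, "text": self.llm(prompt)}


def canned_llm(prompt):
    # Stand-in for a real model call; always returns the same reply (assumption)
    return '"Example Bistro"'


tiny_chain = TinyChain(
    llm=canned_llm,
    template="I want to open a {cuisine} restaurant. Suggest a fancy name for it.",
    input_key="cuisine",
    verbose=True,
)
result = tiny_chain.invoke("Brazilian")
print(result)
```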

In [19]:
# Create the chain and activate verbose
chain = LLMChain(llm = llm, prompt = prompt_template_name, verbose = True)

# Invoke the chain by passing a parameter to the prompt
chain.invoke("Thai")



> Entering new LLMChain chain...
Prompt after formatting:
I want to open a Thai restaurant. Suggest a fancy name for it.

> Finished chain.


{'cuisine': 'Thai', 'text': '\n\n"Silk & Spice Thai Bistro"'}