# Advanced Conversational AI System Using Hugging Face Custom Models and Llama-3.1

This notebook defines a conversational AI system for a smart car assistant, built on Hugging Face models and Llama-3.1. The assistant handles a range of tasks (weather, music, navigation, climate control, search, trip planning, and car-manual lookups) through JSON-based tool calls orchestrated with LangChain.


In [None]:
!pip install -U langchain-google-genai
!pip install -U langchain
!pip install -U langchain_community
!pip install -U huggingface_hub
!pip install -U transformers
!pip install -U accelerate
!pip install -U google-generativeai
!pip install -U fuzzywuzzy
!pip install -U duckduckgo-search
!pip install -U sentence-transformers
!pip install -U pypdf
!pip install -U chromadb
!pip install -U bitsandbytes
!pip install -U faiss-cpu

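Accessing restricted resources: some models and datasets on Hugging Face are private or gated and require authentication. The Llama-3.1 weights used in this notebook are gated, so accept Meta's license on the model page first; logging in then gives the notebook the permissions it needs to download them.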

In [None]:
from huggingface_hub import notebook_login
notebook_login()
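
Outside an interactive notebook (scripts, scheduled jobs), the widget shown by `notebook_login()` is not available. A minimal sketch of a token-based alternative, assuming the access token is stored in an environment variable; the helper name `login_from_env` is hypothetical, and `HF_TOKEN` is only the assumed variable name:

```python
import os


def login_from_env(var_name: str = "HF_TOKEN") -> bool:
    """Log in to the Hugging Face Hub if a token is set; return whether login was attempted."""
    token = os.environ.get(var_name)
    if not token:
        # No token configured: skip the login instead of failing.
        return False
    # Imported lazily so the helper can be defined without the package installed.
    from huggingface_hub import login

    login(token=token)
    return True
```

Calling `login_from_env()` before loading the model keeps the token out of the notebook itself.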

In [None]:
import os
import logging
from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline
import bitsandbytes as bnb
from langchain.llms import HuggingFacePipeline
from langchain.chains import LLMChain
from langchain.prompts import PromptTemplate, BaseChatPromptTemplate, ChatPromptTemplate
from langchain.memory import ConversationBufferWindowMemory
from langchain.agents import Tool, AgentExecutor, LLMSingleActionAgent
from langchain.schema import AgentAction, AgentFinish, HumanMessage, SystemMessage
from langchain.agents.agent import AgentOutputParser
from typing import Union, List
from langchain.agents.conversational_chat.prompt import FORMAT_INSTRUCTIONS
import json
import re

In [None]:
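from transformers import BitsAndBytesConfig

model_name = os.getenv("LLAMA_MODEL_NAME", "meta-llama/Meta-Llama-3.1-8B-Instruct")
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Load the weights in 8-bit via bitsandbytes; layers that do not fit on the GPU
# are offloaded to the CPU in fp32.
quantization_config = BitsAndBytesConfig(
    load_in_8bit=True,
    llm_int8_enable_fp32_cpu_offload=True,
)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    device_map="auto",
    quantization_config=quantization_config,
)

# Wrap the model in a text-generation pipeline that LangChain can call
text_generation_pipeline = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    max_new_tokens=5000,
    temperature=0.7,
)
llm = HuggingFacePipeline(pipeline=text_generation_pipeline)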

# Tools Description

In [None]:
logging.basicConfig(level=logging.INFO)

# Define tool functions (placeholder implementations)
def get_weather(city):
    return f"Weather in {city}: Sunny, 75°F, and rain is expected"

def play_music(genre):
    return f"Playing {genre} music from your favourite playlist on Spotify"

def navigate(destination):
    return f"Navigating to {destination} using the shortest route possible"

def adjust_climate(settings):
    return f"Climate adjusted to {settings}"

def search(query):
    return f"Search results for '{query}'"

def plan_trip(details):
    return f"Trip planned: {details}"

def car_manual_query(query):
    return f"Car manual info: {query}"

# Define the tools
tools = [
    Tool(name="Weather", func=get_weather, description="Get the current weather in a given city."),
    Tool(name="Music", func=play_music, description="Play music of a specified genre."),
    Tool(name="Navigation", func=navigate, description="Navigate to a specified destination."),
    Tool(name="Climate", func=adjust_climate, description="Adjust the car's climate settings."),
    Tool(name="Search", func=search, description="Search the internet for general information."),
    Tool(name="TripPlanner", func=plan_trip, description="Plan a trip with destinations, duration, and activities."),
    Tool(name="CarManual", func=car_manual_query, description="Get information from the car manual.")
]

tool_names = ", ".join([tool.name for tool in tools])
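
The agent ultimately maps the `"action"` name in the model's JSON reply to one of these tool functions. A minimal sketch of that lookup in plain Python, without LangChain; `TOOL_REGISTRY` and `dispatch` are hypothetical names, and the two tool functions are shortened stand-ins for the placeholders above:

```python
from typing import Callable, Dict


# Shortened stand-ins for two of the placeholder tools.
def get_weather(city: str) -> str:
    return f"Weather in {city}: Sunny, 75°F"


def play_music(genre: str) -> str:
    return f"Playing {genre} music"


# Hypothetical registry: tool name -> callable taking a single string input.
TOOL_REGISTRY: Dict[str, Callable[[str], str]] = {
    "Weather": get_weather,
    "Music": play_music,
}


def dispatch(action: str, action_input: str) -> str:
    """Run the tool named by `action`, or report that no such tool exists."""
    tool = TOOL_REGISTRY.get(action)
    if tool is None:
        return f"Unknown tool: {action}. Available tools: {', '.join(TOOL_REGISTRY)}"
    return tool(action_input)


print(dispatch("Music", "rock"))
```

Returning a message for unknown names, rather than raising, gives the model an observation it can react to on the next step.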

# Using the Prompt Template Format from the Meta Docs  
[Llama 3.1 documentation](https://llama.meta.com/docs/model-cards-and-prompt-formats/llama3_1/)

####<|begin_of_text|><|start_header_id|>system<|end_header_id|>
####Cutting Knowledge Date: December 2023
####Today Date: 23 Jul 2024
####You are a helpful assistant<|eot_id|><|start_header_id|>user<|end_header_id|>
####What is the capital for France?<|eot_id|><|start_header_id|>assistant<|end_header_id|>
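
The format above can be assembled programmatically. A minimal sketch, assuming each header is followed by a blank line and each message ends with `<|eot_id|>`; the helper name `build_llama31_prompt` is hypothetical:

```python
from typing import List, Tuple


def build_llama31_prompt(system: str, turns: List[Tuple[str, str]]) -> str:
    """Assemble a Llama 3.1 chat prompt from a system message and (role, text) turns."""
    parts = ["<|begin_of_text|>"]
    parts.append(f"<|start_header_id|>system<|end_header_id|>\n\n{system}<|eot_id|>")
    for role, text in turns:
        parts.append(f"<|start_header_id|>{role}<|end_header_id|>\n\n{text}<|eot_id|>")
    # End with an open assistant header so the model writes the next reply.
    parts.append("<|start_header_id|>assistant<|end_header_id|>\n\n")
    return "".join(parts)


prompt = build_llama31_prompt(
    "You are a helpful assistant",
    [("user", "What is the capital for France?")],
)
print(prompt)
```

Each message is closed by exactly one `<|eot_id|>`, which is worth checking whenever the prompt is built by string concatenation.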


In [None]:
    B_SYS = "<|begin_of_text|><|start_header_id|>system<|end_header_id|>"
    E_SYS = "<|eot_id|>"

    sys_msg = B_SYS + """Assistant is an expert AI car assistant created by Anubhav.
    Assistant MUST ALWAYS respond using JSON strings that contain "action" and "action_input" parameters.
    All of Assistant's communication MUST be performed using this JSON format, even for final answers.
    Assistant is able to respond to the User and use tools using JSON strings that contain "action" and "action_input" parameters.
    Assistant can also use tools by responding to the user with tool use instructions in the same "action" and "action_input" JSON format.

    Important: When a user request involves multiple actions, use ALL relevant tools in the appropriate order.

    Tools available to Assistant are:
    - "Weather": Get the current weather in a given city.
    - "Music": Play music of a specified genre.
    - "Navigation": Navigate to a specified destination.
    - "Climate": Adjust the car's climate settings.
    - "Search": Search the internet for general information.
    - "TripPlanner": Plan a trip with destinations, duration, and activities.
    - "CarManual": Get information from the car manual.

    To use a tool or provide a final answer, the output should be a markdown code snippet formatted in the
    following schema, including the leading "```json" and trailing "```":

    ```json
    {{
     "action": "Weather",
     "action_input": "current_location"
    }}
    ```

    Here "action" is the name of the tool to use and "action_input" is the input passed to that tool.
    For the final answer, use the same format with "action" set to "Final Answer" and "action_input"
    set to a summary of the actions taken and the final response to the user.


    Here are some previous conversations between the Assistant and User:

    User: Hey how are you today?
    Assistant:
    ```json
    {{"action": "Final Answer",
     "action_input": "I'm good thanks, how are you?"}}
    ```

    User: Hey, can you play some rock music?
    Assistant:
    ```json
    {{"action": "Music",
    "action_input": "rock"}}
    ```
    User: Playing rock music from your favourite playlist on Spotify
    Assistant:
    ```json
    {{"action": "Final Answer",
     "action_input": "playing rock music for you right now "}}
    ```

    User: Hey, what's the weather like today?
    Assistant:
    ```json
    {{"action": "Weather",
    "action_input": "current_location"}}
    ```
    User: Sunny, 75°F
    Assistant:
    ```json
    {{"action": "Final Answer",
     "action_input": "The weather today is sunny with a temperature of 75°F. It's a beautiful day!"}}
    ```

    Always REMEMBER to provide a Final Answer once every relevant tool has been used. Do NOT call the same tool more than once in succession.

    """ + E_SYS

    agent_template = sys_msg + """
    <|start_header_id|>user<|end_header_id|>
    {history}
    {input}<|eot_id|><|start_header_id|>assistant<|end_header_id|>
    {agent_scratchpad}
    """

# Brief Description of Code Components:
### CustomPromptTemplate Class:

- Customizes the prompt format for the AI assistant.
- Formats messages with conversation history and tool information.

### OutputParser Class:

- Parses the assistant's output to extract actions and their inputs.
- Handles JSON formatting and errors, distinguishing between different types of responses.

### Agent Initialization:

- LLMSingleActionAgent: Combines the language model (llm) with the prompt and output parser.
- AgentExecutor: Manages the agent’s execution with tools, memory, and iteration limits.
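
The parsing step can be illustrated in isolation. The sketch below uses only the standard library and is not the notebook's `OutputParser`; `extract_actions` and `first_final_answer` are hypothetical helpers that collect every well-formed fenced JSON block and skip malformed ones:

```python
import json
import re
from typing import Dict, List, Optional

FENCE = "`" * 3  # three backticks, built this way to keep the example readable
FENCE_RE = re.compile(FENCE + r"(?:json)?\s*(.*?)" + FENCE, re.DOTALL)


def extract_actions(text: str) -> List[Dict[str, str]]:
    """Return every well-formed block with "action" and "action_input", in order."""
    actions = []
    for raw in FENCE_RE.findall(text):
        try:
            block = json.loads(raw)
        except json.JSONDecodeError:
            continue  # skip malformed blocks instead of failing the whole parse
        if isinstance(block, dict) and {"action", "action_input"} <= block.keys():
            actions.append(block)
    return actions


def first_final_answer(text: str) -> Optional[str]:
    """Return the input of the first "Final Answer" block, if there is one."""
    for block in extract_actions(text):
        if block["action"] == "Final Answer":
            return block["action_input"]
    return None


sample = (
    f'{FENCE}json\n{{"action": "Final Answer", "action_input": "Rock music is now playing."}}\n{FENCE}\n'
    f'{FENCE}json\n{{"action": "Music", "action_input": "genre"}}\n{FENCE}'
)
print(extract_actions(sample))
print(first_final_answer(sample))
```

Whether to act on the first or the last block is a design choice: taking the last block follows any tool call the model appends after an answer, while stopping at the first "Final Answer" ends the run sooner.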

In [None]:
k = 1  # Number of parser calls that still strip "Final Answer" blocks before parsing
try:

    class CustomPromptTemplate(BaseChatPromptTemplate):
        template: str
        tools: List[Tool]

        def format_messages(self, **kwargs) -> List[HumanMessage]:
            intermediate_steps = kwargs.pop("intermediate_steps", [])
            print(intermediate_steps)
            thoughts = ""
            for action, observation in intermediate_steps:
                thoughts += action.log
                thoughts += f"\nObservation: {observation}\nThought: Consider if any other tools are needed. "
            kwargs["agent_scratchpad"] = thoughts
            kwargs["tools"] = "\n".join([f"{tool.name}: {tool.description}" for tool in self.tools])
            kwargs["tool_names"] = ", ".join([tool.name for tool in self.tools])
            formatted = self.template.format(**kwargs)
            return [HumanMessage(content=formatted)]

    prompt = CustomPromptTemplate(
        template=agent_template,
        tools=tools,
        input_variables=["input", "intermediate_steps", "history"]
    )

    memory = ConversationBufferWindowMemory(k=10)

    # Define the output parser
    class OutputParser(AgentOutputParser):

        def parse(self, text: str) -> Union[AgentAction, AgentFinish]:
            global k  # Declare that we want to use the global variable
            try:
                assistant_response = re.search(r'<\|start_header_id\|>assistant<\|end_header_id\|>\s*(.*?)(?:<\|eot_id\|>|$)', text, re.DOTALL)
                if assistant_response:
                    assistant_text = assistant_response.group(1).strip()
                else:
                    assistant_text = text.strip()
                print("THIS IS THE ASSISTANT TEXT" + assistant_text)

                if k > 0:
                    # For the first k parses, remove any "Final Answer" block from the text
                    assistant_text = re.sub(r'```(?:json)?\s*\{[^{}]*"action"\s*:\s*"Final Answer"[^{}]*\}\s*```|\{[^{}]*"action"\s*:\s*"Final Answer"[^{}]*\}', '', assistant_text, flags=re.DOTALL)
                    k = k - 1
                # Find all JSON blocks in the text
                json_blocks = re.findall(r'```(?:json)?\s*(.*?)```', assistant_text, re.DOTALL)
                print(json_blocks)

                if json_blocks:
                    # Get the last JSON block
                    last_json = json_blocks[-1]
                    print("THIS IS THE LAST JSON BLOCK" + last_json)
                    response = json.loads(last_json)
                    print(response)
                    action, action_input = response["action"], response["action_input"]

                    if action == "Final Answer":
                        return AgentFinish(
                            {"output": action_input},
                            f"Final Answer: {action_input}"
                        )
                    else:
                        # If it's not a Final Answer, return the last action
                        return AgentAction(action, action_input, assistant_text)
                else:
                    # If no JSON blocks found, check for a Final Answer in the text
                    final_answer_match = re.search(r'Final Answer:?\s*(.*)', assistant_text, re.DOTALL)
                    if final_answer_match:
                        return AgentFinish(
                            {"output": final_answer_match.group(1).strip()},
                            f"Final Answer: {final_answer_match.group(1).strip()}"
                        )
                    else:
                        raise ValueError("No valid JSON or Final Answer found in the output")
            except Exception as e:
                print(f"Error parsing output: {e}")
                # If parsing fails, return a generic finish action
                return AgentFinish(
                    {"output": "I apologize, but I encountered an error in processing. Could you please rephrase your request?"},
                    text
                )

        @property
        def _type(self) -> str:
            return "conversational_chat"


    # Initialize the agent
    agent = LLMSingleActionAgent(
        llm_chain=LLMChain(llm=llm, prompt=prompt),
        output_parser=OutputParser(),
        stop=["\nObservation:", "<|eot_id|>"],
        allowed_tools=[tool.name for tool in tools]
    )

    # Initialize the agent executor
    agent_executor = AgentExecutor.from_agent_and_tools(
        agent=agent,
        tools=tools,
        memory=memory,
        verbose=True,
        max_iterations=10
    )

    # Function to manage conversation
    def manage_conversation(user_input: str) -> str:
        if not user_input.strip():
            return "Please provide a valid input."
        try:
            response = agent_executor.run(
                {"input": user_input}
            )
            if isinstance(response, dict):
                if 'output' in response:
                    return response['output']
                elif 'intermediate_steps' in response:
                    final_output = "Here's what I found based on your request:\n\n"
                    for step in response['intermediate_steps']:
                        action, result = step
                        final_output += f"Action: {action.tool}\n"
                        final_output += f"Result: {result}\n\n"
                    return final_output
                else:
                    return str(response)
            elif isinstance(response, str):
                return response
            else:
                return "I'm sorry, I couldn't process that request properly. Can you please try again?"
        except Exception as e:
            logging.error(f"An error occurred during execution: {str(e)}")
            return f"I apologize, but I encountered an error while processing your request: {str(e)}. How else can I assist you?"

except Exception as e:
    logging.error(f"An error occurred: {str(e)}")
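
The control flow that `AgentExecutor` manages (call the model, run the chosen tool, feed the observation back, stop on a final answer or at the iteration limit) can be sketched without LangChain. Everything here is a simplified assumption for illustration: `run_agent_loop` is a hypothetical helper, the model is replaced by a scripted function, and replies are bare JSON rather than fenced blocks:

```python
import json
from typing import Callable, Dict, List


def run_agent_loop(
    llm: Callable[[str], str],
    tools: Dict[str, Callable[[str], str]],
    user_input: str,
    max_iterations: int = 10,
) -> str:
    """Call `llm` until it returns a Final Answer or the iteration limit is reached."""
    scratchpad = ""
    for _ in range(max_iterations):
        reply = json.loads(llm(user_input + scratchpad))
        action, action_input = reply["action"], reply["action_input"]
        if action == "Final Answer":
            return action_input
        # Run the tool and append the result so the next call can see it.
        observation = tools[action](action_input)
        scratchpad += f"\nAction: {action}\nObservation: {observation}"
    return "Agent stopped: iteration limit reached."


# Scripted stand-in for the model: one tool call, then a final answer.
scripted: List[str] = [
    '{"action": "Music", "action_input": "rock"}',
    '{"action": "Final Answer", "action_input": "Rock music is now playing."}',
]
replies = iter(scripted)


def fake_llm(prompt: str) -> str:
    return next(replies)


result = run_agent_loop(
    fake_llm,
    {"Music": lambda genre: f"Playing {genre} music"},
    "play some music for me",
)
print(result)
```

The iteration limit is what keeps a model that keeps requesting the same tool from looping indefinitely.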

In [None]:
response = manage_conversation("play some music for me")
print(f"\033[94mAssistant: {response}\033[0m")

Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.

> Entering new AgentExecutor chain...

Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.


THIS IS THE ASSISTANT TEXT before ```json
    {"action": "Music",
    "action_input": "rock"}
    ```
k is greater than 
THIS IS THE ASSISTANT TEXT after ```json
    {"action": "Music",
    "action_input": "rock"}
    ```
['{"action": "Music",\n    "action_input": "rock"}\n    ']
THIS IS THE LAST JSON BLOCK{"action": "Music",
    "action_input": "rock"}
    
{'action': 'Music', 'action_input': 'rock'}
```json
    {"action": "Music",
    "action_input": "rock"}
    ```

H: Playing rock music from your favourite playlist on Spotify


Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.


THIS IS THE ASSISTANT TEXT before Action: Music
Action Input: rock
Observation: Playing rock music from your favourite playlist on Spotify
Thought: Let's analyze this result and consider if any other tools are needed. 
     ```json
     {"action": "Final Answer",
     "action_input": "Rock music is now playing for you, enjoying it??"}
     ```
     I will now check the music genre to see if any other tools are needed.
     ```json
     {"action": "Music",
     "action_input": "genre"}
     ```
     Response: The current music genre is rock.
['{"action": "Final Answer",\n     "action_input": "Rock music is now playing for you, enjoying it??"}\n     ', '{"action": "Music",\n     "action_input": "genre"}\n     ']
THIS IS THE LAST JSON BLOCK{"action": "Music",
     "action_input": "genre"}
     
{'action': 'Music', 'action_input': 'genre'}
Action: Music
Action Input: rock
Observation: Playing rock music from your favourite playlist on Spotify
Thought: Let's analyze this result