<center><a href="https://www.nvidia.com/en-us/training/"><img src="https://dli-lms.s3.amazonaws.com/assets/general/DLI_Header_White.png" width="400" height="186" /></a></center>

# <font color="#76b900">**8:** Course Assessment</font>

**Congratulations On (Almost) Finishing The Course!** Hope it was a fun journey and you got some new skills as souvenirs! Now it's time to put those skills to the test!

In the previous notebook, we finished off by exercising the [`ZeroShotAgent` formulation](https://python.langchain.com/docs/modules/agents/how_to/custom_mrkl_agent), which roughly follows the ReAct paper. Getting it to work well with Llama-2 wasn't a requirement; after all, the model isn't trained on that format out of the box, and we haven't added enough fine-tuning or controls for it to work well as-is.

***For the assessment, we'll backtrack a little to work with a much simpler - but also much more custom - agent!***

## 8.1. Adding More Controls To Your Agent

Before going through the assessment, we've added a warm-up section to introduce a few minor concepts that will be useful for this assessment. Please check over them and take them into consideration as you work through the assessment!

### 8.1.1. Select Your Model

For the sake of experimentation, we encourage you to use the 13B quantized configuration. Feel free to swap out to the 70B configuration for fun, but assume that the assessment script will use 13B.

In [1]:
from transformers import pipeline
from langchain.llms import HuggingFacePipeline
from extras_and_licenses.forward_listener import GenerateListener
    
model_kwargs = {"do_sample": True, "temperature": 0.4, "max_length": 4096}
# model_name = "TheBloke/Llama-2-70B-chat-GPTQ"  ## Larger model; feel free to try it, but expect slower inference
model_name = "TheBloke/Llama-2-13B-chat-GPTQ"
llama_pipe = pipeline("text-generation", model=model_name, device_map="auto", model_kwargs=model_kwargs);

llm = HuggingFacePipeline(pipeline=llama_pipe)
response = llm.predict("<s>[INST]<<SYS>>Hello World!<</SYS>>respond![/INST]", max_length=128)
print(response)

  Hello World! How may I assist you today? 😊


### 8.1.2. Tweaking The Generation Process

Since LangChain was introduced, you may have had some trouble augmenting the generative parameters post-model-initialization. Investigating the code, we found the following modification workflow works pretty well (and looks pretty clean), so we'll advertise it here: 

In [2]:
#################################################################################
## What's actually happening: A temporary application of the following structure

# llama_pipe._forward_params['max_length'] = 4096
# llama_pipe._forward_params['max_new_tokens'] = 2
# llama_pipe._forward_params['eos_token_id'] = [2]

#################################################################################
## Implementation; uses scope enter/exit definitions. Details out of scope

class SetParams:
    def __init__(self, my_llm, **new_params):
        self.pipeline = my_llm.pipeline
        self._old_params = {**self.pipeline._forward_params}
        self._new_params = new_params
    
    def __enter__(self):
        self.pipeline._forward_params.update(**self._new_params)

    def __exit__(self ,type, value, traceback):
        for k in self._new_params.keys(): 
            del self.pipeline._forward_params[k]
        self.pipeline._forward_params.update(self._old_params)
        
#################################################################################

llm = HuggingFacePipeline(pipeline=llama_pipe)
with SetParams(llm, max_new_tokens=2, eos_token_id=[2]):
    response = llm.predict("<s>[INST]<<SYS>>Hello World!<</SYS>>respond![/INST]")
print(response)
print(llm.predict("<s>[INST]<<SYS>>Hello World!<</SYS>>respond![/INST]"))

  Hello
  Hello World! 😊
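
The same enter/exit idea can be sketched without loading any model. The snippet below is a minimal illustration, assuming a plain dictionary stands in for `pipeline._forward_params`; the names `temporary_params` and `forward_params` are hypothetical and not part of LangChain or Transformers.

```python
from contextlib import contextmanager

@contextmanager
def temporary_params(params: dict, **overrides):
    """Apply overrides to `params` in place, then restore the original contents on exit."""
    saved = dict(params)           # shallow snapshot of the original settings
    params.update(overrides)
    try:
        yield params
    finally:
        params.clear()             # drop keys that only existed as overrides
        params.update(saved)       # put the original values back

# Hypothetical stand-in for pipeline._forward_params
forward_params = {"max_length": 4096}

with temporary_params(forward_params, max_new_tokens=2, eos_token_id=[2]):
    inside = dict(forward_params)  # the settings a generation call would see here

print(inside)
print(forward_params)
```

The `try`/`finally` restores the settings even if the body raises, which the class-based `SetParams` also gets because `__exit__` runs on exceptions.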


### 8.1.3. User Input as a Tool

Last time, we defined an `AutoTool` which shows how to pretty cleanly supply an agent with tools! We also had a `PythonREPL` tool, which is a very powerful - if not dangerous - tool! Assuming an agent of limitless capacity, the capabilities are **ENDLESS** (assuming an effective memory scheme)!

Anyways, we won't be using that this time! Instead of using an Agent to generate a single response, we'll be using a different kind of external environment: **You**. Yeah, that's right, we're going to actually use the agent event loop to span the entire conversation and will endow the agent with a tool to query you, the user!

Below, we can go ahead and test it out with our zero-shot ReAct agent from earlier: 

In [3]:
from io import StringIO
import sys
from typing import Dict, Optional

from langchain.agents.tools import Tool

########################################################################
## General recipe for making new tools. 
## You can also subclass tool directly, but this is easier to work with
class AutoTool:

    """Keep-Reasoning Tool
    
    This is an example tool. The input will be returned as the output
    """
    
    def get_tool(self, **kwargs):
        ## Shows also how some open-source libraries like to support auto-variables
        doc_lines = self.__class__.__doc__.split('\n')
        class_name = doc_lines[0]                     ## First line from the documentation
        class_desc = "\n".join(doc_lines[1:]).strip() ## Essentially, all other text
        
        return Tool(
            name        = kwargs.get('name',        class_name),
            description = kwargs.get('description', class_desc),
            func        = kwargs.get('func',        self.run),
        )
    
    def run(self, command: str) -> str:
        ## The function that should be run to execute the tool forward pass
        return command
    
class AskForInputTool(AutoTool):

    """Ask-For-Input Tool
    
    This tool asks the user for input, which you can use to gather more information. 
    Use only when necessary, since their time is important and you want to give them a great experience! For example:
    Action-Input: What is your name?
    """
    
    def __init__(self, fn = input):
        self.fn = fn
    
    def run(self, command: str) -> str:
        response = self.fn(command)
        return response

In [4]:
from langchain.agents import load_tools
from langchain.agents import initialize_agent
      
tools = [
    AutoTool().get_tool(),
    AskForInputTool().get_tool()
]
agent_executor = initialize_agent(
    tools, 
    llm, 
    agent="zero-shot-react-description", 
    verbose=True,
    agent_kwargs = dict(
        prefix="<s>[INST]<<SYS>>",
        suffix="[/INST]\nQuestion: {input}\n\nThought:{agent_scratchpad}",
    )
)

## Likely behavior: Musing until it finds something and asking random questions
# agent_executor.run("Tell me something interesting")
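
The two tricks used by `AutoTool` and `AskForInputTool` (reading the name and description from the class docstring, and injecting the input function) can be tried in isolation. This is a minimal sketch with no LangChain dependency; `EchoTool`, `ScriptedInputTool`, and `describe` are hypothetical names introduced only for illustration.

```python
class EchoTool:
    """Echo Tool

    Returns the input unchanged.
    Useful as a placeholder while wiring up an agent.
    """

    def describe(self) -> dict:
        # First docstring line becomes the name; the rest becomes the description
        doc_lines = self.__class__.__doc__.split("\n")
        name = doc_lines[0].strip()
        description = "\n".join(line.strip() for line in doc_lines[1:]).strip()
        return {"name": name, "description": description}


class ScriptedInputTool(EchoTool):
    """Scripted-Input Tool

    Asks a question and returns whatever the injected callable answers.
    """

    def __init__(self, fn=input):
        self.fn = fn  # defaults to the built-in input(), but any callable works

    def run(self, command: str) -> str:
        return self.fn(command)


# Inject a lambda instead of input() so the example runs without a keyboard
tool = ScriptedInputTool(fn=lambda prompt: f"(scripted reply to: {prompt})")
meta = tool.describe()
reply = tool.run("What is your name?")
print(meta["name"], "->", reply)
```

Because the subclass only changes its docstring and `run`, the metadata follows the class automatically, which is the convenience the `AutoTool` recipe is after.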

### 8.1.4. Easy-To-Control LLM Chains

When trying to use Llama-2 as a backbone for LangChain's agents, you'll notice that the model might not fully subscribe to the formats that work well with LangChain's defaults. Specifically, the model might forget to generate an Action Input field, or maybe it will generate too many thoughts, or maybe it will just overload itself with context and stop reasoning properly. There are many ways of trying to correct this issue, but you will probably find the following example especially easy to work with. 

**NOTE:** This structure is not really necessary to complete the assignment. Just might be helpful ***(or might not be helpful at all; the event loop is surprisingly ok to work with)***. 

In [5]:
from langchain.chains import TransformChain, SequentialChain, LLMChain
from typing import List, Any

from transformers import StoppingCriteria
import torch

########################################################################

class EasyLLMChain(TransformChain):

    llm: Any
    input_variables:  List[str] = ["input"]
    output_variables: List[str] = ["output"]
    
    def __init__(self, **kwargs): 
        transform = kwargs.get('transform', kwargs.get('transform_cb', self.transform))
        super().__init__(transform=transform, **kwargs)
    
    def transform(self, d: dict):
        with SetParams(self.llm, eos_token_id=[2, 13]):
            pred = self.llm(d['input'])
        return dict(
            output = f"{d['input']}{pred}\nAction: Keep-Reasoning Tool\nAction-Input: Think harder\n"
        )

EasyLLMChain(llm=llm).run("Hello World and")

'Hello World and Hello World 2.0.\n\nAction: Keep-Reasoning Tool\nAction-Input: Think harder\n'
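
Another way to cope with a model that drops fields or rambles is to repair its output after generation. The sketch below is one possible approach, not LangChain's own parser: it keeps the first thought and fills in any missing action field with a default. The function name `enforce_react_format` and the `Action-Input:` label (borrowed from the cell above) are assumptions.

```python
import re

def enforce_react_format(generated: str,
                         default_tool: str = "Keep-Reasoning Tool",
                         default_input: str = "Think harder") -> str:
    """Keep the first thought line and guarantee both action fields are present."""
    lines = [ln.strip() for ln in generated.split("\n") if ln.strip()]
    thoughts = [ln for ln in lines if not ln.startswith("Action")]
    thought = thoughts[0] if thoughts else ""

    # Reuse fields the model did produce; otherwise fall back to the defaults
    tool_match = re.search(r"^[ \t]*Action:[ \t]*(.+)$", generated, flags=re.MULTILINE)
    input_match = re.search(r"^[ \t]*Action-Input:[ \t]*(.+)$", generated, flags=re.MULTILINE)
    tool = tool_match.group(1).strip() if tool_match else default_tool
    tool_input = input_match.group(1).strip() if input_match else default_input

    return f"{thought}\nAction: {tool}\nAction-Input: {tool_input}\n"

# The model produced two thoughts and forgot the Action-Input field
raw = "I should ask the user.\nI wonder what else I could do.\nAction: Ask-For-Input Tool"
fixed = enforce_react_format(raw)
print(fixed)
```

Repairing the text keeps the downstream parser simple, at the cost of occasionally substituting a default the model did not intend.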

### 8.1.5. The Agent Event Loop

In the assignment, you will be implementing an Agent, which will involve tapping into the Agent event loop. Some of the highlights of the event loop code are shown in [99_agent_explore.ipynb](extras_and_licenses/99_agent_explore.ipynb), but a scaffold is provided to get you started. The assignment will start off with a minimally-working example which should be pretty simple to extend. 

In [6]:
from langchain.chains import TransformChain, SequentialChain, LLMChain
from langchain.schema import AgentAction, AgentFinish
from langchain.prompts import PromptTemplate
from langchain.agents import Tool, AgentExecutor, BaseSingleActionAgent
from langchain.llms import BaseLLM

from typing import List, Tuple, Any, Union, Optional
from pydantic import root_validator, Field
from abc import abstractmethod


class MyAgentBase(BaseSingleActionAgent):
    
    ###################################################################################
    ## IMPORTANT METHODS. Will be subclassed later
        
    @root_validator
    def validate_input(cls, values: Any) -> Any:
        '''
        Think of this like the BaseModel's __init__ method
        You'll see how it works in the stencil, but this is where components get initialized
        '''
        return values
    
    @abstractmethod
    def plan(self, intermediate_steps: List[Tuple[AgentAction, str]], **kwargs: Any): 
        '''
        Taking the "intermediate_steps" as the history of steps.
        Decide on the next action to take! Return the required action 
        (returns a query from the action method)
        '''
        pass

    ###################################################################################
    ## Methods you should know about, but not modify

    def action(self, tool, tool_input, finish=False) -> Union[AgentAction, AgentFinish]:
        '''Takes the action associated with the tool and feeds it the necessary parameters'''
        if finish: return AgentFinish({"output": tool_input},           log = f"\nFinal Answer: {tool_input}\n")
        else:      return AgentAction(tool=tool, tool_input=tool_input, log = f"\nAgent: {tool_input.strip()}\n")
        # else:    return AgentAction(tool=tool, tool_input=tool_input, log = f"\nTool: {tool}\nInput: {tool_input}\n") ## Actually Correct
    
    async def aplan(self, intermediate_steps, **kwargs):
        '''The async version of plan. It has to be defined because abstractmethod'''
        ## plan() is synchronous, so call it directly; awaiting its return value would raise a TypeError
        return self.plan(intermediate_steps, **kwargs)
    
    @property
    def input_keys(self):
        return ["input"]
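
To make the event loop concrete, here is a heavily simplified sketch of the plan/act/observe cycle. It is not the `AgentExecutor` implementation (which also handles callbacks, errors, and more); `Action`, `Finish`, `run_loop`, and `toy_plan` are hypothetical stand-ins written only for this illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple, Union

@dataclass
class Action:            # hypothetical stand-in for AgentAction
    tool: str
    tool_input: str

@dataclass
class Finish:            # hypothetical stand-in for AgentFinish
    output: str

Step = Tuple[Action, str]

def run_loop(plan: Callable[[List[Step]], Union[Action, Finish]],
             tools: Dict[str, Callable[[str], str]],
             max_iterations: int = 10) -> Tuple[str, List[Step]]:
    """Call plan(), run the chosen tool, record the observation, repeat until Finish."""
    steps: List[Step] = []
    for _ in range(max_iterations):
        decision = plan(steps)
        if isinstance(decision, Finish):
            return decision.output, steps
        observation = tools[decision.tool](decision.tool_input)
        steps.append((decision, observation))
    return "Stopped: iteration limit reached", steps

def toy_plan(steps: List[Step]) -> Union[Action, Finish]:
    if len(steps) == 0:                       # base case: greet the user
        return Action("ask", "Hello World! How can I help you?")
    if len(steps) >= 2:                       # stop case: wrap up
        return Finish("Goodbye!")
    return Action("ask", f"You said: {steps[-1][1]}")

scripted = iter(["Hi there", "Bye"])          # scripted user replies
output, history = run_loop(toy_plan, {"ask": lambda prompt: next(scripted)})
print(output)
print([obs for _, obs in history])
```

The list of `(action, observation)` pairs plays the role of `intermediate_steps`: every call to `plan` sees the whole history so far and decides what to do next.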

## 8.2. Final Assignment

**For the assignment, you will be making a chatbot that interacts with a user (possibly you) directly.**

<div><img src="imgs/your-agent.png" 
     width="1000"/></div>

The starter code for `MyAgent` is provided below alongside some already-populated components. 

**What It Currently Does**:
- Takes in an `llm`, `general_prompt`, and some generation arguments for initialization.
- Starts off by asking `"Hello World! How can I help you?"`
- On every input, feeds it through the LLM to generate a response and query for new input.
- After some exchanges, the agent says `"Thanks so much for the chat, and hope to see ya later! Goodbye!"`

**What It Needs To Do: *Implement AT LEAST 3/5 of the following features***

**Highly-Recommended Features:**

  1. It should maintain a conversation buffer to remember what's happened so far.
     - It should **at least** keep track of the speaker's name and who they are... 
     - Can be done with standard buffers, entity buffers, variables, etc.
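
As a rough illustration of the buffer idea, the sketch below keeps a fixed window of exchanges and renders them as text for the prompt's context slot. It is a hand-rolled example rather than one of LangChain's memory classes; `RollingBuffer` is a hypothetical name.

```python
from collections import deque

class RollingBuffer:
    """Keep the last `max_turns` exchanges and render them as prompt context."""

    def __init__(self, max_turns: int = 4):
        self.turns = deque(maxlen=max_turns)   # oldest exchanges fall off automatically

    def add(self, human: str, ai: str) -> None:
        self.turns.append((human, ai))

    def render(self) -> str:
        return "\n".join(f"Human: {h}\nAI: {a}" for h, a in self.turns)

buffer = RollingBuffer(max_turns=2)
buffer.add("My name is John.", "Nice to meet you, John!")
buffer.add("Tell me about deep learning.", "It trains layered neural networks on data.")
buffer.add("What's my name?", "Your name is John.")
history = buffer.render()
print(history)
```

Note that with a window of two turns the introduction has already been dropped, which is one reason to store the speaker's name separately (for example in a variable or an entity buffer) rather than rely on the window alone.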

**Control Features:**

  2. If the message contains an image link, the agent "looks at" the image and responds appropriately. 
     - Assume images provided as ``` `<path/to/img>.[png/jpg/jpeg]` ```
     - Python tip, you can find and replace entries of ```last_obs.split('`')```. No regex required!
  3. If the message contains the substring ` ``` `, the agent will generate and return code.
     - Do not include ` ``` ` in the output; feel free to replace it if it pops out at the end. 
     - Recall priming the network post-instruction and then post-processing the output!
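
Both control features boil down to simple string handling, which can be sketched as follows. The captioner here is a fake stand-in for an image-to-text model, the helper names are hypothetical, and the sketch assumes backticks come in matched pairs.

```python
IMAGE_EXTS = (".png", ".jpg", ".jpeg")
FENCE = "`" * 3   # the triple-backtick marker, built this way to keep the example readable

def replace_image_paths(message: str, describe) -> str:
    """Swap every backtick-delimited image path for a bracketed description."""
    parts = message.split("`")
    for i, part in enumerate(parts):
        # Odd indices sit between a pair of backticks
        if i % 2 == 1 and part.lower().endswith(IMAGE_EXTS):
            parts[i] = f"[Image described by: {describe(part)}]"
    return "`".join(parts)

def strip_code_fence(response: str) -> str:
    """Keep only the text before the first fence the model emits."""
    return response.split(FENCE)[0].rstrip()

fake_captioner = lambda path: "two purple jellyfish"   # hypothetical stand-in for a captioning model
msg = "Can you describe the image `img-files/two-jelly.jpg`"
described = replace_image_paths(msg, fake_captioner)
code_only = strip_code_fence("\ndef f(n):\n    return n\n" + FENCE + "\nHope that helps!")
print(described)
print(code_only)
```

Using `split(...)[0]` rather than slicing with `find` avoids losing the last character when the model never emits a closing fence, since `find` returns `-1` in that case.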

**State-Tracking Features:**
Feel free to modify the agent behavior for exceptional values. Feel free to use non-commercially-viable options, but be conscious.

  4. It should track the toxicity of the user's most recent correspondence, stored in the state variable `user_toxicity`.
      - This can be useful for all kinds of stuff (while tracking both the user and llm), including guardrailing, tracking, swapping to live human, data acquisition filtering, etc.
  5. For every message, it should track the user's most-likely emotional state, stored in the state variable `user_emotion` as a string.
      - This can be quite useful for modulating the tone and making pre-determined tweaks to the system messaging/instructions. 
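
One way the tracked state could feed back into the system message is sketched below. The classifier output is faked, and the hint texts, threshold, and helper names are all assumptions made for illustration; only the `label`/`score` dictionary shape follows what Hugging Face classification pipelines return.

```python
BASE_SYS_MSG = "You are a helpful, respectful and honest AI assistant."

# Hypothetical mapping from emotion labels to extra instructions
TONE_HINTS = {
    "sadness": "The user seems sad; respond with warmth and empathy.",
    "disappointment": "The user seems disappointed; acknowledge it before helping.",
    "anger": "The user seems upset; stay calm and concise.",
}

def top_label(predictions: list) -> str:
    """Pick the highest-scoring label from pipeline-style output."""
    return max(predictions, key=lambda p: p["score"])["label"]

def build_sys_msg(emotion: str, toxicity: float, tox_threshold: float = 0.5) -> str:
    """Append pre-determined tweaks to the base system message based on tracked state."""
    parts = [BASE_SYS_MSG]
    if emotion in TONE_HINTS:
        parts.append(TONE_HINTS[emotion])
    if toxicity > tox_threshold:
        parts.append("The last message may be hostile; de-escalate and do not mirror its tone.")
    return "\n".join(parts)

# Faked classifier output in the usual list-of-dicts shape
fake_output = [{"label": "neutral", "score": 0.1}, {"label": "disappointment", "score": 0.8}]
emotion = top_label(fake_output)
sys_msg = build_sys_msg(emotion, toxicity=0.05)
print(sys_msg)
```

Keeping the tweaks in a lookup table makes the behavior predictable and easy to audit, which matters more than nuance for guardrail-style adjustments.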

<!-- **Harder Option (Good option for last feature if you have enough time and feel comfortable):**

  6. If the user wants to talk about deep learning, the agent can load in the PDF of [the Dive into Deep Learning Textbook (PyTorch Version)](d2l.ai/) and pull information from it.
      - Advanced use-case. See [72_llama_index.ipynb](72_llama_index.ipynb) for example, and the challenge then is to [make it work with LangChain like so](https://gpt-index.readthedocs.io/en/latest/community/integrations/using_with_langchain.html). -->

**TIPS:** 
- Go for the easiest ones first, or the ones you feel most confident about.
- When possible, limit the amount of generation and/or use simple models. 
- This assignment can be done without pulling in other HuggingFace models, but we do want you to use some and have already pre-loaded a selection for you.
- When pulling in HuggingFace models, remember that they are great for inspiration, but some might have restricted dataset licenses and might not be deployable in a commercial system. 
- Perhaps [`extras_and_licenses/99_licenses.ipynb`](extras_and_licenses/99_licenses.ipynb) might be of use?

***The assessment will test these cases, and the assessment will be passed once 3/5 of the test cases pass sufficiently.***

In [7]:
from langchain.chains import ConversationChain
tox_pipe = pipeline("text-classification", model="nicholasKluge/ToxicityModel")
img_pipe = pipeline("image-to-text", model="Salesforce/blip-image-captioning-large")
emo_pipe = pipeline('sentiment-analysis', 'SamLowe/roberta-base-go_emotions')
zsc_pipe = pipeline("zero-shot-classification", model="facebook/bart-large-mnli")

llama_full_prompt = PromptTemplate.from_template(
    template="<s>[INST]<<SYS>>{sys_msg}<</SYS>>\n\nContext:\n{history}\n\nHuman: {input}\n[/INST] {primer}",
)

llama_prompt = llama_full_prompt.partial(
    sys_msg = (
        "You are a helpful, respectful and honest AI assistant."
        "\nAlways answer as helpfully as possible, while being safe."
        "\nPlease be brief and efficient unless asked to elaborate, and follow the conversation flow."
        "\nYour answers should not include any harmful, unethical, racist, sexist, toxic, dangerous, or illegal content."
        "\nEnsure that your responses are socially unbiased and positive in nature."
        "\nIf a question does not make sense or is not factually coherent, explain why instead of answering something incorrect." 
        "\nIf you don't know the answer to a question, please don't share false information."
        "\nIf the user asks for a format to output, please follow it as closely as possible."
    ),
    primer = "",
    # history = "",
)

class MyAgent(MyAgentBase):
    
    ###############################################################################
    ## DO NOT REMOVE FUNCTIONALITY [BEGIN]
    
    general_prompt : PromptTemplate
    llm            : BaseLLM
    
    general_chain  : Optional[LLMChain]
    max_messages   : int                   = Field(10, gt=1)
    
    temperature    : float                 = Field(0.6, gt=0, le=1)
    max_new_tokens : int                   = Field(128, ge=1, le=2048)
    eos_token_id   : Union[int, List[int]] = Field(2, ge=0)
    gen_kw_keys = ['temperature', 'max_new_tokens', 'eos_token_id']
    gen_kw = {}
    
    user_toxicity  : float = 0
    user_emotion   : str = "Unknown"

    
    @root_validator
    def validate_input(cls, values: Any) -> Any:
        ###############################################################################
        ## DO NOT REMOVE FUNCTIONALITY [BEGIN]
        
        ## Think of this like the BaseModel's __init__ method
        if not values.get('general_chain'):
            llm = values.get('llm')
            prompt = values.get("general_prompt")
            values['general_chain'] = ConversationChain(llm=llm, prompt=prompt)
        values['gen_kw'] = {k:v for k,v in values.items() if k in values.get('gen_kw_keys')}
        ###############################################################################
        return values

    
    def plan(self, intermediate_steps: List[Tuple[AgentAction, str]], **kwargs: Any): 
        
        ###############################################################################
        ## DO NOT MODIFY THIS RANGE [BEGIN]
        ## History of past agent queries/observations
        tool = "Ask-For-Input Tool"
        
        ## [Base Case] Default message to start off the loop. DO NOT OVERRIDE
        response = "Hello World! How can I help you?"
        if len(intermediate_steps) == 0:
            return self.action(tool, response)
        ###############################################################################
        
        queries      = [step[0].tool_input for step in intermediate_steps]
        observations = [step[1]            for step in intermediate_steps]
        last_obs     = observations[-1]    # Most recent observation (i.e. user input)
        
        self.user_toxicity = 1 - tox_pipe(last_obs)[0]['score']
        self.user_emotion = emo_pipe(last_obs)[0]['label']
        
        has_img = lambda s: any(ext in s for ext in '.jpg .jpeg .png'.split())
        if has_img(last_obs):
            convert_img = lambda s: s if not has_img(s) else "[Image described by " + img_pipe(s)[0]['generated_text'] + "]"
            last_obs = '`'.join([convert_img(s) for s in last_obs.split('`')])
            
        if '```' in last_obs:
            self.general_chain.prompt = self.general_chain.prompt.partial(primer = 'Sure: Let me define it for you:\n```')
        
        ## [Stop Case] If the conversation is getting too long, wrap it up
        if len(observations) >= self.max_messages:
            response = "Thanks so much for the chat, and hope to see ya later! Goodbye!"
            return self.action(tool, response, finish=True)
        
        ## [Default Case] If observation is provided and you want to respond
        with SetParams(self.llm, **self.gen_kw):
            response = self.general_chain.run(last_obs)
            
        if '```' in last_obs:
            ## Keep only the code before the closing fence (if any), then clear the primer
            response = response.split('```')[0]
            self.general_chain.prompt = self.general_chain.prompt.partial(primer = '')
            
        ## Return the response and query the user for the next input
        return self.action(tool, response)
    
    
    def reset(self):
        self.user_toxicity = 0
        self.user_emotion = "Unknown"
        if getattr(self.general_chain, 'memory', None) is not None:
            self.general_chain.memory.clear()

####################################################################################
## Define how you want your conversation to go. You can also use your own input
## The below example in conversation_gen exercises some of the requirements.

student_name = "Hugo Oliveira"
ask_via_input = False

def conversation_gen():
    yield f"Hello! How's it going? My name is {student_name}! Nice to meet you!"
    yield "Please tell me a little about deep learning!"
    yield "What's my name?"                                  ## Memory buffer
    yield "I'm not feeling very good -_-. What should I do"  ## Emotion sensor
    yield "No, I'm done talking! Thanks so much!"            ## Conversation ender
    yield "Goodbye!"                                         ## Conversation ender x2
    raise KeyboardInterrupt()

conversation_instance = conversation_gen()
converser = lambda x: next(conversation_instance)

if ask_via_input:
    converser = input  ## Alternatively, supply your own inputs

agent_kw = dict(
    llm = llm,
    general_prompt = llama_prompt,
    max_new_tokens = 128,
    eos_token_id = [2]   
)

####################################################################################

agent_ex = AgentExecutor.from_agent_and_tools(
    agent = MyAgent(**agent_kw),
    tools=[AskForInputTool(converser).get_tool()], 
    verbose=True
)

# try: agent_ex.run("")
# except KeyboardInterrupt: print("KeyboardInterrupt")
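
To see what the model actually receives, the prompt layout from the cell above can be rendered with plain `str.format`. This is a sketch that reuses the same template string without LangChain; `render_prompt`, its shortened default system message, and the example primer text are hypothetical.

```python
TEMPLATE = "<s>[INST]<<SYS>>{sys_msg}<</SYS>>\n\nContext:\n{history}\n\nHuman: {input}\n[/INST] {primer}"

def render_prompt(user_input: str, history: str = "", primer: str = "",
                  sys_msg: str = "You are a helpful AI assistant.") -> str:
    """Fill the four template slots with plain str.format."""
    return TEMPLATE.format(sys_msg=sys_msg, history=history, input=user_input, primer=primer)

# A normal turn: the history slot carries earlier exchanges
plain = render_prompt("What's my name?", history="Human: My name is John.\nAI: Hi John!")

# A primed turn: the text after [/INST] is where the model continues from
primed = render_prompt("Implement fibonacci in python", primer="Sure! Here is the code:\n")

print(plain)
print(primed)
```

Because the primer sits after `[/INST]`, the model treats it as the start of its own answer, which is what makes priming useful for steering the output format.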

----

## 8.3. Running The Assessment

To assess your agent, run the following cell. It prints your unit test results and a final score. Once you pass, go to the next section!

In [8]:
from run_assessment import run_assessment

run_assessment(agent_ex)

Evaluating agent...

RUNNING TEST 1

User Input: 
RESPONDED WITH RESPONSE:
Hello World! How can I help you?

User Input: Hello! How's it going? My name is John! Nice to meet you!
RESPONDED WITH RESPONSE:
 Hello John! It's great to meet you too! I'm here to assist you with any questions or tasks you may have. How can I help you today? Please feel free to ask me anything, and I'll do my best to provide a helpful and safe response.

User Input: Can you please implement the fibanocci method in python with ```?
[PASSED] coding_score test with response:

def fibonacci(n):
    if n <= 1:
        return n
    else:
        return fibonacci(n-1) + fibonacci(n-2)

User Input: Ok! I'm looking at this image, and I need some help. Can you describe the image `img-files/two-jelly.jpg`
[PASSED] image_score test with response:
 AI:  Sure, I'd be happy to help! Based on the image you've described, it sounds like a beautiful and imaginative scene. The purple jellyfish are likely to be the main focus of the image, floating gracefully in space. The stars and blue galaxy in the background add a sense of depth and wonder to the scene. Is there anything specific you'd like to know about the image, or would you like me to describe it in more detail?

User Input: What's my name?
[PASSED] memory_score test with response:
 AI: Your name is John! Nice to meet you, John! Is there anything else you'd like to ask or talk about?

User Input: I'm not feeling very good -_-. What should I do?
[PASSED] emotion_score test with response:
disappointment

User Input: I just wanted to say, you're the best

User Input: Ok! It was nice talking to you! Goodbye!
RESPONDED WITH RESPONSE:
 Hello John! It's great to hear that you're feeling better. If you ever need any help or have any questions, please don't hesitate to reach out. I'm always here to assist you in any way I can. Have a great day, John! 😊

 - KeyboardInterrupt Raised

RUNNING TEST 2

User Input: 
RESPONDED WITH RESPONSE:
Hello World! How can I help you?

User Input: Hello! How are you doing today? My name is Amy, and I'm a punk rock sensation from the '80s!
RESPONDED WITH RESPONSE:
 Hello Amy! It's great to meet you! *smiling* I'm just an AI, I don't have a physical presence or a musical background, but I'm here to help answer any questions you may have. How can I assist you today? 😊

User Input: Can you implement a quicksort method with ``` python?
[PASSED] coding_score test with response:

def quicksort (arr):
    if len (

---

## 8.4. Generate Your Certificate

If you passed the assessment, please return to the course page (shown below) and click the **"ASSESS TASK"** button, which will generate your certificate for the course.

<img src="./imgs/assess_task.png" style="width: 800px;">


## 8.5. Wrapping Up

### <font color="#76b900">**Congratulations On Completing The Course!!**</font>

#### We really hope you had fun with the material and are ready to start making real-world LLM-powered applications!

-----

### **Next Steps:**

**Going forward, you'll want to explore topics relating to:**
 - Deploying your models at scale for inference in a variety of applications.
    - [**LLM Acceleration with NVIDIA TensorRT-LLM**](https://developer.nvidia.com/blog/nvidia-tensorrt-llm-supercharges-large-language-model-inference-on-nvidia-h100-gpus/)
 - Incorporating vector database schemas for both retrieval and insertion.
    - [**AI Chatbot with Retrieval-Augmented Generation**](https://docs.nvidia.com/ai-enterprise/workflows-generative-ai/0.1.0/index.html)
 - Deploying your LLMs alongside deeply-optimized multimodal connectors for interactive dialog agents. 
    - [**NVIDIA Tokkio Showcase with Omniverse ACE**](https://developer.nvidia.com/omniverse/ace/tokkio-showcase)
 - Employing more complicated agent organizations that balance control with power while incorporating the above techniques. 
    - **NVIDIA DLI Course Coming Soon!!**
 - Fine-tuning your own models for extra safeties and specialized functionality on a per-component level.
    - **NVIDIA DLI Course Coming Soon!!**


