# EX0. Hello World

Welcome to the **Hello World** example for the BRAD Python package! This tutorial is designed to help you get started with BRAD by demonstrating its core functionality. You'll learn how to:

1. Import the BRAD package.
2. Create and configure a BRAD `Agent` instance.
3. Select and configure different large language models (LLMs).
4. Integrate BRAD with other LangChain workflows.


## 1. Import Module

To begin, ensure you have installed the BRAD package. It can be installed either directly from the project's [GitHub repository](https://github.com/Jpickard1/BRAD/) or from the [Python Package Index (PyPI)](https://pypi.org/project/BRAD-Chat/) using `pip`:

```
!pip install -U BRAD-Chat
```

See the software manual distribution section [here](https://brad-bioinformatics-retrieval-augmented-data.readthedocs.io/en/latest/distribution.html) for additional ways to install the package.

### Quickstart  

Getting started with BRAD is simple and efficient. Once the package is installed, follow these steps for the quickest way to begin:

1. **Import the Chat Module**  
   Import the chat module with:  
   `from BRAD import chat`

2. **Start Chat**  
   To start the chat, simply call the `chat()` method. By default, this will prompt you to:

   - Enter your OpenAI API key
   - Select a Retrieval-Augmented Generation (RAG) database (you can enter `N` to skip this step)

3. **Welcome Message**  
   Once you've entered the required information, a welcome message will be displayed. This message will:

   - Provide the location of the output directory, where information, data, or downloads from the chat will be stored.
   - Prompt you to enter an input and begin chatting with the model.

In later sections, we will show how to change these configurations to customize your BRAD experience.

The cell below runs this process. The user asks the chat session who it is and what it can do, and then closes the conversation with the keyword `exit`.

---




In [2]:
from BRAD import chat
chat.chat()

Enter your Open AI API key:  ········



Would you like to use a database with BRAD [Y/N]?


 N


2024-11-21 14:29:47 INFO semantic_router.utils.logger local


Welcome to RAG! The chat log from this conversation will be saved to /home/jpic/C:\Users\jpic\Documents\BRAD/November 21, 2024 at 02:29:34 PM/log.json. How can I help?


Input >>  who are you and what can you do?


BRAD >> 1: 


2024-11-21 14:35:30,384 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"


I am BRAD, a Bioinformatic Retrieval Augmented Data chatbot. I specialize in biology, bioinformatics, genetics, and data science. I can provide information, answer questions, search databases, run pipelines, analyze codes, and more to assist with various biological and data-related inquiries. Feel free to ask me anything related to my expertise!


Input >>  exit


Thanks for chatting today! I hope to talk soon, and don't forget that a record of this conversation is available at: /home/jpic/C:\Users\jpic\Documents\BRAD/November 21, 2024 at 02:29:34 PM/log.json


---

As with importing the `chat` module in this Jupyter notebook, the same code can be run from the command line. From the root of the project repository, the following command opens a similar chat session:

`python BRAD/chat.py`

### `Agent` Module  

When using the `chat` module above, the `Agent` class is used under the hood to create an instance of an LLM-powered agent, which has access to different tools. Specifically, the `chat` module is a wrapper that executes the command: `Agent().chat()`

The `Agent` class is the primary way to configure BRAD. It allows you to set different options such as:

- Selecting a different `LLM` (Large Language Model)
- Choosing the correct RAG database
- Restarting previous chat sessions
- And more...

In the following sections, we will introduce important configurations as needed.



## 2. Create an `Agent`

This section will show how to:
1. create an `Agent`,
2. interface with the `Agent`,
3. and set different configurations.

To import the `Agent` class, we use the command:

`from BRAD import agent`

---

In [3]:
from BRAD import agent

---

### `Agent` Constructor

After importing the `Agent` class, we can use the following constructor to create an `Agent`. This prompts the user for the same information as the `chat` module.

---


In [2]:
bot = agent.Agent()

Enter your Open AI API key:  ········



Would you like to use a database with BRAD [Y/N]?


 N


2024-11-20 15:43:15 INFO semantic_router.utils.logger local


Welcome to RAG! The chat log from this conversation will be saved to /home/jpic/C:\Users\jpic\Documents\BRAD/November 20, 2024 at 03:42:41 PM/log.json. How can I help?


---

Here again, an API key is requested. By default, the `Agent` class uses LLMs from OpenAI. When the user enters an API key at the prompt, it is saved in:

`os.environ["OPENAI_API_KEY"]`

Alternatively, when the user selects models from NVIDIA, as we will do in the following sections, the API keys are stored in:

`os.environ["NVIDIA_API_KEY"]`
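If you would rather not type the key at the prompt, one option is to set the environment variable yourself before constructing the `Agent`. The sketch below uses only the standard library. `ensure_api_key` is a hypothetical helper, not part of BRAD, and whether BRAD skips its prompt when the variable is already set is an assumption to verify against your installed version.

```python
import getpass
import os


def ensure_api_key(var_name="OPENAI_API_KEY", prompt_fn=getpass.getpass):
    """Hypothetical helper (not part of BRAD): make sure `var_name` is set.

    If the environment variable is missing or empty, ask for it once without
    echoing it and store it for the rest of this Python session.
    """
    if not os.environ.get(var_name):
        os.environ[var_name] = prompt_fn(f"Enter a value for {var_name}: ")
    return os.environ[var_name]
```

Calling `ensure_api_key("OPENAI_API_KEY")` or `ensure_api_key("NVIDIA_API_KEY")` stores the key under the variable names described above.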


### Chat with the `Agent`

Once the `Agent` has been created, there are two methods to interact with it:

- `chat()`: this method opens an interactive chat session, similar to the one produced by the `chat` module.
- `invoke()`: this method responds to a single user query.

---

In [7]:
response = bot.invoke("Who are you and what can you do?")

BRAD >> 2: 


2024-11-21 15:07:35,339 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"


I am BRAD, a Bioinformatic Retrieval Augmented Data chatbot specializing in biology, bioinformatics, genetics, and data science. I can provide information on various topics within these fields, search for relevant articles, analyze data using pipelines and codes, and access databases like Gene Ontology and Enrichr. Feel free to ask me any questions you have related to biology and data science!


In [8]:
bot.chat()



Input >>  what types of problems can you solve?


BRAD >> 3: 


2024-11-21 15:08:03,452 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"


BRAD: I can help solve a wide range of problems related to biology, bioinformatics, genetics, and data science. This includes analyzing genetic data, identifying patterns in biological datasets, predicting protein structures, exploring gene functions, and much more. Whether you need assistance with data analysis, literature search, pipeline creation, or code optimization, I am here to help with various problems in these fields. Feel free to ask me any specific questions you have!


Input >>  exit


Thanks for chatting today! I hope to talk soon, and don't forget that a record of this conversation is available at: /home/jpic/C:\Users\jpic\Documents\BRAD/November 21, 2024 at 03:07:16 PM/log.json


---

### Agent State

The purpose of the `Agent` class is to maintain the state, or memory, of the system. For example, the `Agent` can query the LLM, but to give the LLM access to earlier parts of the conversation, the `Agent` tracks the conversation history so that it can be resupplied to the LLM. The state also tracks the configuration, input, output, the process of using different tool modules, and more. It has the following schema:

```
>>> Agent.state = {
>>> 'config'            : {
>>>     <configuration variables>
>>> },
>>> 'prompt'            : <user input>,
>>> 'output'            : <streaming output of Agent>,
>>> 'memory'            : <agent memory>,
>>> 'process'           : {
>>>     'MODULE'        : <Tool module used to respond to user input>,
>>>     <module specific information>: {
>>>         ...
>>>     }
>>> },
>>> 'llm-api-calls'     : <number of LLM calls used by Agent>,
>>> ...
>>> }
```
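To make the idea of resupplying conversation history concrete, here is a toy sketch. It is not BRAD's implementation: the helper names `make_state` and `respond` are hypothetical, the `memory` value is assumed to be a plain list of `(speaker, text)` pairs, and `llm` stands for any callable that maps a prompt string to a reply string.

```python
def make_state():
    """Return a toy state dictionary with a few of the keys listed above."""
    return {"prompt": None, "output": None, "memory": [], "llm-api-calls": 0}


def respond(state, user_input, llm):
    """Answer `user_input`, resupplying earlier turns to `llm` as context."""
    # Flatten the stored turns into text (assumed format, one turn per line).
    history = "\n".join(f"{speaker}: {text}" for speaker, text in state["memory"])
    if history:
        full_prompt = f"{history}\nUser: {user_input}"
    else:
        full_prompt = f"User: {user_input}"

    state["prompt"] = user_input
    state["output"] = llm(full_prompt)  # `llm` is any callable: str -> str
    state["llm-api-calls"] += 1

    # Record both sides of the exchange so the next call can see them.
    state["memory"] += [("User", user_input), ("BRAD", state["output"])]
    return state["output"]
```

Because every call prepends the stored turns, the second query sees the first exchange even though the underlying model itself is stateless.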


In addition to maintaining a state in memory, a unique output directory is assigned to each `Agent`. This directory is where all information, data, or downloads related to the `Agent` are stored. It allows information or memory to persist even when the `Agent` is turned off or deleted, so that the `Agent` can be restarted from the same position. Within the output directory, the `log.json` file records each step or action taken by the `Agent`, including every query of the LLM. The log file has the following schema:



```
>>> [
>>>     0: {
>>>         'TIME'   : <time stamp>,
>>>         'PROMPT' : <input from the user or preplanned prompt>,
>>>         'OUTPUT' : <output displayed to the user>,
>>>         'STATUS' : {
>>>                 'LLM' : <primary large language model being used>,
>>>                 'Databases' : {
>>>                         'RAG' : <primary data base>,
>>>                         <other databases>: <more databases can be added>,
>>>                         ...
>>>                     },
>>>                 'config' : {
>>>                         'debug' : True,
>>>                         'output-directory' : <path to output directory>,
>>>                         ...
>>>                     }
>>>             },
>>>         'PROCESS' : {
>>>                 'MODULE' : <name of module i.e. RAG, DATABASE, CODER, etc.>,
>>>                 'STEPS' : [
>>>                         # information for a particular step involved in executing the module. Some examples
>>>                         {
>>>                            'LLM' : <language model>,
>>>                            'prompt template' : <prompt template>,
>>>                            'model input' : <input to model>,
>>>                            'model output' : <output of model>
>>>                         },
>>>                         {
>>>                            'func' : 'rag.retreival',
>>>                            'num articles retrieved' : 10,
>>>                            'multiquery' : True,
>>>                            'compression' : True,
>>>                         }
>>>                     ]
>>>             },
>>>         'PLANNED' : [
>>>                 <Next prompt in queue>,
>>>                 ...
>>>             ]
>>>     },
>>> 1 : {
>>>         'TIME'   : <time stamp>,
>>>         'PROMPT' : <second prompt>,
>>>         ...
>>>     },
>>> ...
>>> ]
```
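Since the log is plain JSON, it can be inspected after a session ends. The following is a minimal sketch, assuming the file holds either a JSON list of entries or a JSON object keyed by step index (the schema above leaves this ambiguous), and that each entry carries the `PROMPT` and `PROCESS` keys shown. `summarize_log` is a hypothetical helper, not part of BRAD.

```python
import json
from pathlib import Path


def summarize_log(log_path):
    """Return (step, prompt, module) tuples from a BRAD-style log.json.

    Assumption: the file holds either a JSON list of entries or a JSON object
    keyed by step index, with the 'PROMPT' and 'PROCESS' keys shown above.
    """
    data = json.loads(Path(log_path).read_text())
    if isinstance(data, dict):
        # Keys such as "0", "1", ... are sorted numerically, not as strings.
        items = sorted(data.items(), key=lambda kv: int(kv[0]))
    else:
        items = list(enumerate(data))

    summary = []
    for step, entry in items:
        module = entry.get("PROCESS", {}).get("MODULE")
        summary.append((int(step), entry.get("PROMPT"), module))
    return summary
```

Pass it the `log.json` path printed in the welcome message to list which prompt was handled by which module.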

### Configurations

To configure an `Agent` for different tasks, an `Agent` can be supplied `.json` files during construction to specify information such as:
1. where to place the output directory
2. how to use different tools
3. debugging vs. non-debugging mode
4. and more

A default configuration file is built into the Python package and is available [here](https://github.com/Jpickard1/BRAD/blob/main/BRAD/config/config.json). The default values can be overwritten by providing similarly structured files as input when constructing an `Agent`. We will demonstrate this in the subsequent example notebooks.
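The general idea of overriding defaults with a similarly structured file can be sketched as a recursive dictionary merge. This is an illustration only, not BRAD's actual loading code. `merge_config` is a hypothetical helper, the `debug` and `output-directory` keys are borrowed from the log schema above, and the nested `RAG` entries are made up for the example, so check the real `config.json` for the actual key names.

```python
import copy


def merge_config(defaults, overrides):
    """Return a copy of `defaults` with `overrides` applied recursively."""
    merged = copy.deepcopy(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are dictionaries: merge them key by key.
            merged[key] = merge_config(merged[key], value)
        else:
            # Otherwise the override replaces the default outright.
            merged[key] = copy.deepcopy(value)
    return merged


# Illustrative values only; these are not BRAD's real defaults.
defaults = {
    "debug": False,
    "output-directory": "<default path>",
    "RAG": {"multiquery": False, "compression": False},
}
overrides = {"debug": True, "RAG": {"multiquery": True}}
config = merge_config(defaults, overrides)
```

Only the keys present in `overrides` change; everything else keeps its default, which is why a custom file needs to list just the settings that differ.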

## 3. LLM Selection

The BRAD package is interoperable with different LLMs. Any [LangChain](https://www.langchain.com/) compatible LLM can be passed as the `llm` argument, `Agent(llm=<here>)`, when constructing the `Agent`. BRAD contains code to load LLMs using:
1. [OpenAI's API](https://openai.com/)
2. [NVIDIA's API](https://build.nvidia.com/explore/discover)
3. [LlamaCpp](https://github.com/ggerganov/llama.cpp)

When using the APIs, the user is responsible for obtaining API keys for the respective services. When running an LLM locally with LlamaCpp, the user must provide a path to the `.gguf` file and ensure the hardware is adequate for inference. To load LLMs from each of these three sources, the following module and methods can be used:

In [4]:
from BRAD import llms

# Load a model through OpenAI's API
model = llms.load_openai(
    model_name='gpt-3.5-turbo-0125',
    api_key='XXXXX',
    temperature=1.0,
)

# Load a model through NVIDIA's API
model = llms.load_nvidia(
    model_name='meta/llama3-70b-instruct',
    api_key='XXXXX',
    temperature=1.0,
)

# Load a local model with LlamaCpp
model = llms.load_llama(
    model_path='<path to .gguf file>',
    n_ctx=4096,
    max_tokens=1000,
)

These models can then be input as the LLM of an `Agent` as follows:

In [None]:
bot = agent.Agent(
    llm = model
)

## 4. Integration to LangChain

BRAD is built atop the [LangChain](https://www.langchain.com/) environment, which is a comprehensive framework for developing LLM-powered applications. Within each method or tool of BRAD are several LangChain chains that allow the LLM to make decisions about how to use the tools.

To keep BRAD consistent with LangChain, an `Agent` can itself be used as a LangChain component. To do this, the `Agent.to_langchain` method provides an interface that ports an `Agent` to be compatible with `langchain_core.language_models.llms`. This works as follows:

---


In [9]:
llmBot = bot.to_langchain()


---

Then `llmBot` can be placed into any LangChain environment, while retaining full access to BRAD's functionality.

Thank you for exploring the Hello World notebook! Please proceed to the next notebooks to dive deeper.
