# Advanced Config

# `config_list_from_json`

Daisy-chain several configs together: if an API call fails with one config, the next one in the list is tried as a fallback.

In [1]:
from IPython.display import Image, display

import autogen
from autogen.coding import LocalCommandLineCodeExecutor

config_list = autogen.config_list_from_json(
    "OAI_CONFIG_LIST",
    file_location=".",
    # filter_dict={"model": ["gpt-4", "gpt-3.5-turbo"]},  # comment out to get all
)
print('config_list:', config_list)

config_list: [{'model': 'gpt-4', 'api_key': '<REDACTED>'}, {'model': 'gpt-3.5-turbo', 'api_key': '<REDACTED>'}]
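
The fallback idea can be pictured with a small stand-alone sketch. This is not AutoGen's internal code: `first_successful` and `fake_call` are hypothetical names, and the "API call" is simulated so the example runs without a key.

```python
from typing import Callable


def first_successful(config_list: list[dict], call: Callable[[dict], str]) -> str:
    """Try each config in order and return the first successful result."""
    errors = []
    for config in config_list:
        try:
            return call(config)
        except Exception as exc:  # a real client would catch narrower API errors
            errors.append(f"{config.get('model')}: {exc}")
    raise RuntimeError("all configs failed: " + "; ".join(errors))


def fake_call(config: dict) -> str:
    """Hypothetical stand-in for an API call; the first model is 'down'."""
    if config["model"] == "gpt-4":
        raise ConnectionError("simulated outage")
    return f"answered by {config['model']}"


demo_configs = [{"model": "gpt-4"}, {"model": "gpt-3.5-turbo"}]
result = first_successful(demo_configs, fake_call)
print(result)
```

The order of the list is the order of preference, so the preferred model goes first.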


In [None]:
user_proxy = autogen.UserProxyAgent(
    name='user_proxy',
    system_message='Human manager',
    code_execution_config={'last_n_messages': 2, 'work_dir': 'groupchat'},
    human_input_mode='TERMINATE'
)

Note that the `user_proxy` is created without an `llm_config`, so it does not use the `config_list`.

In [3]:
assistant = autogen.AssistantAgent(
    name="assistant",
    llm_config={'config_list': config_list, 'seed': 42}
)

`'seed': 42` creates a `.cache/42` folder containing a `cache.db`; responses are stored there and reused for repeated queries. (Newer AutoGen 0.2 releases name this key `cache_seed`.)

# [Enhanced Inference](https://microsoft.github.io/autogen/docs/Use-Cases/enhanced_inference/)

* control inference on the LLM API through tuning
* enhance the LLM for a given task
* tuning is done with:
  * validation data
  * evaluation function
  * metric to optimise
  * search space
  * budgets

### [Tuning](https://microsoft.github.io/autogen/docs/Use-Cases/enhanced_inference/#perform-tuning)


```
import autogen

config, analysis = autogen.Completion.tune(
    data=tune_data,
    metric="success",
    mode="max",
    eval_func=eval_func,
    inference_budget=0.05,
    optimization_budget=3,
    num_samples=-1,
)
```
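
The call above leaves `tune_data` and `eval_func` undefined. A minimal sketch of what they might look like follows, assuming each data instance is a dict whose fields are passed to the evaluation function as keyword arguments and that the function returns a dict containing the metric named in `metric="success"`. The field names `problem` and `solution` are assumptions chosen for illustration.

```python
def eval_func(responses, solution, **kwargs):
    """Return {"success": 1} if any response contains the expected solution."""
    success = any(solution.strip() in response for response in responses)
    return {"success": int(success)}


# Hypothetical validation data: one dict per instance.
tune_data = [
    {"problem": "What is 2 + 3?", "solution": "5"},
    {"problem": "What is 7 * 6?", "solution": "42"},
]

# Evaluate a hand-written response against the first instance.
metrics = eval_func(["The answer is 5."], **tune_data[0])
print(metrics)
```

Substring matching is a deliberately crude success criterion; a real evaluation function would parse the answer out of the response.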

### [API unification](https://microsoft.github.io/autogen/docs/Use-Cases/enhanced_inference/#api-unification)


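* `autogen.OpenAIWrapper.create()` serves both chat and non-chat models, on OpenAI or Azure OpenAI, so the calling code does not need to change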

### [Templating](https://microsoft.github.io/autogen/docs/Use-Cases/enhanced_inference/#templating)

```
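# `client` wraps the endpoints in `config_list`; `config` is the tuned config from above
client = autogen.OpenAIWrapper(config_list=config_list)

response = client.create(
    # values in `context` are substituted into the `{...}` placeholders of the prompt
    context={"problem": "How many positive integers, not exceeding 100, are multiples of 2 or 3 but not 4?"},
    prompt="{problem} Explain your reasoning step-by-step",
    allow_format_str_template=True,
    **config
)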
```
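
The substitution itself can be previewed without calling any API. The sketch below assumes the template is filled from `context` in the same way as Python's `str.format`; it only illustrates the resulting prompt string.

```python
context = {
    "problem": "How many positive integers, not exceeding 100, are multiples of 2 or 3 but not 4?"
}
prompt_template = "{problem} Explain your reasoning step-by-step"

# Fill the {problem} placeholder with the value from the context dict.
prompt = prompt_template.format(**context)
print(prompt)
```

Keeping the template separate from the context lets the same prompt be reused across many problem instances.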

### [Logging](https://microsoft.github.io/autogen/docs/Use-Cases/enhanced_inference/#logging)

```
import autogen.runtime_logging

autogen.runtime_logging.start(logger_type="sqlite", config={"dbname": "YOUR_DB_NAME"})
```

```
autogen.runtime_logging.stop()
```
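
Once logging has stopped, the SQLite file can be inspected with the standard library. The sketch below makes no assumption about which tables AutoGen writes: it lists whatever tables exist along with their row counts. It runs on an in-memory database with a hypothetical `demo_events` table; for a real log, connect to the file named in `dbname` instead.

```python
import sqlite3


def summarize_tables(con: sqlite3.Connection) -> dict[str, int]:
    """Map each table name in the database to its row count."""
    names = [
        row[0]
        for row in con.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        )
    ]
    return {
        name: con.execute(f'SELECT COUNT(*) FROM "{name}"').fetchone()[0]
        for name in names
    }


# Demo on an in-memory database with a hypothetical table.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE demo_events (id INTEGER PRIMARY KEY, payload TEXT)")
con.executemany("INSERT INTO demo_events (payload) VALUES (?)", [("a",), ("b",)])
summary = summarize_tables(con)
print(summary)
con.close()
```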

# [RAG Applications](https://microsoft.github.io/autogen/blog/2023/10/18/RetrieveChat)

* An external embedding database lets agents retrieve knowledge on demand

In [2]:
import autogen
from autogen.agentchat.contrib.retrieve_assistant_agent import RetrieveAssistantAgent
from autogen.agentchat.contrib.retrieve_user_proxy_agent import RetrieveUserProxyAgent

The `rag_proxy_agent` automatically builds a vector database from the documents at `docs_path`.
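
To make the retrieval step concrete, here is a toy sketch of the idea: embed the chunks, embed the question, and return the closest chunks. It is not what `RetrieveUserProxyAgent` does internally. Word counts stand in for a neural embedding model, and `embed`, `cosine` and `retrieve` are hypothetical helper names.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy 'embedding': lowercase word counts (a real setup uses a neural model)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(question: str, chunks: list[str], k: int = 1) -> list[str]:
    """Return the k chunks most similar to the question."""
    q = embed(question)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]


chunks = [
    "autogen is a framework for multi-agent conversation",
    "bananas are rich in potassium",
]
top = retrieve("what is autogen", chunks)
print(top)
```

The retrieved chunks are then placed into the prompt as context, as the chat transcript further down shows.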

In [3]:
assistant = RetrieveAssistantAgent(
    name="assistant",
    system_message="You are a helpful assistant.",
    llm_config={'config_list': config_list, 'seed': 42},
)

rag_proxy_agent = RetrieveUserProxyAgent(
    name="ragproxyagent",
    retrieve_config={
        "task": "qa",
        "docs_path": "https://raw.githubusercontent.com/microsoft/autogen/main/README.md",
    },
)

In [4]:
assistant.reset()
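# use the variable defined above (`rag_proxy_agent`); "ragproxyagent" is only the agent's display name
rag_proxy_agent.initiate_chat(
    assistant, message=rag_proxy_agent.message_generator, problem="What is autogen?")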

Trying to create collection.


2024-08-03 15:40:07,269 - autogen.agentchat.contrib.retrieve_user_proxy_agent - INFO - Found 2 chunks.
2024-08-03 15:40:07,275 - autogen.agentchat.contrib.vectordb.chromadb - INFO - No content embedding is provided. Will use the VectorDB's embedding function to generate the content embedding.
Number of requested results 20 is greater than number of elements in index 2, updating n_results = 2


VectorDB returns doc_ids:  [['5e6501a8', '47b63d39']]
Adding content of doc 5e6501a8 to context.
Adding content of doc 47b63d39 to context.
ragproxyagent (to assistant):

You're a retrieve augmented chatbot. You answer user's questions based on your own knowledge and the
context provided by the user.
If you can't answer the question with or without the current context, you should reply exactly `UPDATE CONTEXT`.
You must give as short an answer as possible.

User's question is: What is autogen?

Context is: Autogen enables the next-gen LLM applications with a generic [multi-agent conversation](https://microsoft.github.io/autogen/docs/Use-Cases/agent_chat) framework. It offers customizable and conversable agents that integrate LLMs, tools, and humans.
By automating chat among multiple capable agents, one can easily make them collectively perform tasks autonomously or with human feedback, including tasks that require using tools via code.

Features of this use c


[33massistant[0m (to ragproxyagent):

Autogen is a generic multi-agent conversation framework that automates chat among multiple capable agents for tasks that may require using tools via code, with or without human feedback. It allows customization of agents and integrates Large Language Models (LLMs), tools, and humans. The framework also enables complex applications by facilitating communication between multiple agents. Autogen offers features for customization, human participation, and enhanced LLM inferences. It can be applied with diverse conversation patterns for complex workflows and provides examples of working systems across various domains and complexities.

--------------------------------------------------------------------------------
>>>>>>>> NO HUMAN INPUT RECEIVED.


ChatResult(chat_id=None, chat_history=[{'content': 'You\'re a retrieve augmented chatbot. You answer user\'s questions based on your own knowledge and the\ncontext provided by the user.\nIf you can\'t answer the question with or without the current context, you should reply exactly `UPDATE CONTEXT`.\nYou must give as short an answer as possible.\n\nUser\'s question is: What is autogen?\n\nContext is: Autogen enables the next-gen LLM applications with a generic [multi-agent conversation](https://microsoft.github.io/autogen/docs/Use-Cases/agent_chat) framework. It offers customizable and conversable agents that integrate LLMs, tools, and humans.\nBy automating chat among multiple capable agents, one can easily make them collectively perform tasks autonomously or with human feedback, including tasks that require using tools via code.\n\nFeatures of this use case include:\n\n- **Multi-agent conversations**: AutoGen agents can communicate with each other to solve tasks. This allows for more

### [Customising Embedding Functions](https://microsoft.github.io/autogen/blog/2023/10/18/RetrieveChat#customizing-embedding-function)