A code-first agent framework for seamlessly planning and executing data analytics tasks. The framework interprets user requests through code snippets and coordinates a variety of plugins, in the form of functions, to execute data analytics tasks.
Highlighted Features
- Rich data structure - TaskWeaver allows you to work with rich data structures in Python, such as DataFrames, instead of having to work with text strings.
- Customized algorithms - TaskWeaver allows you to encapsulate your own algorithms into plugins (in the form of Python functions), and orchestrate them to achieve complex tasks.
- Incorporating domain-specific knowledge - TaskWeaver is designed to easily incorporate domain-specific knowledge, such as knowledge of the execution flow, to improve the reliability of the AI copilot.
- Stateful conversation - TaskWeaver is designed to support stateful conversation. It can remember the context of the conversation and leverage it to improve the user experience.
- Code verification - TaskWeaver is designed to verify the generated code before execution. It can detect potential issues in the generated code and provide suggestions to fix them.
- Easy to use - TaskWeaver is designed to be easy to use. We provide a set of sample plugins and a tutorial to help you get started, and you can create your own plugins based on the sample plugins. TaskWeaver offers an out-of-the-box experience, allowing users to run a service immediately after installation.
- Easy to debug - TaskWeaver is designed to be easy to debug. Detailed logs help you understand what happens during LLM calls, code generation, and code execution.
- Security consideration - TaskWeaver supports basic session management to keep different users' data separate. Code execution is separated into different processes so that sessions do not interfere with each other.
- Easy extension - TaskWeaver is designed to be easily extended to accomplish more complex tasks. You can create multiple AI copilots to act in different roles, and orchestrate them to achieve complex tasks.
Prerequisites
- Python 3.10 or above
- OpenAI (or Azure OpenAI) access with GPT-3.5 or later models. GPT-4 is strongly recommended because it is more stable.
- Other requirements can be found in the requirements.txt file.
The OpenAI API had a major update from 0.xx to 1.xx in November 2023. Make sure you are not using an old version, because the API is not backward compatible.
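The two version requirements above can be checked before installing anything else. The following is a minimal sketch using only the standard library; the helper names are hypothetical and are not part of TaskWeaver, and the check assumes the OpenAI client is installed from the `openai` package.

```python
import sys
from importlib import metadata


def python_is_supported(version_info=sys.version_info, minimum=(3, 10)):
    """Return True if the interpreter meets the Python 3.10+ requirement."""
    return tuple(version_info[:2]) >= minimum


def openai_is_v1(version_string):
    """Return True if an openai package version string is 1.x or later."""
    major = version_string.split(".")[0]
    return major.isdigit() and int(major) >= 1


def installed_openai_version():
    """Return the installed openai version string, or None if it is missing."""
    try:
        return metadata.version("openai")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    print("Python OK:", python_is_supported())
    found = installed_openai_version()
    if found is None:
        print("openai is not installed")
    else:
        print(f"openai {found} OK:", openai_is_v1(found))
```

Running the script prints one line per check; a `False` on either line means the environment should be upgraded first.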
You can install TaskWeaver by running the following command:
git clone https://github.com/microsoft/TaskWeaver.git
cd TaskWeaver
# install the requirements
pip install -r requirements.txt
TaskWeaver runs as a process. You need to create a project directory to store plugins and configuration files.
We provide a sample project directory in the project folder. You can copy the project folder to your workspace.
A project directory typically contains the following files and folders:
📦project
┣ 📜taskweaver_config.json # the configuration file for TaskWeaver
┣ 📂plugins # the folder to store plugins
┣ 📂planner_examples # the folder to store planner examples
┣ 📂codeinterpreter_examples # the folder to store code interpreter examples
┣ 📂sample_data # the folder to store sample data used for evaluations
┣ 📂logs # the folder to store logs, will be generated after program starts
┗ 📂workspace # the directory that stores session data, will be generated after program starts
   ┗ 📂session_id
      ┣ 📂ces # the folder used by the code execution service
      ┗ 📂cwd # the current working directory to run the generated code
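After copying the sample folder, a quick check can confirm that the copy is complete. This is a sketch only: `missing_entries` is a hypothetical helper, and treating the configuration file and the plugins folder as the mandatory entries is an assumption made here, not a rule stated by TaskWeaver.

```python
import tempfile
from pathlib import Path

# Names come from the directory listing above. Treating these two as
# mandatory is an assumption of this sketch; logs and workspace are
# generated after the program starts, so they are not checked.
EXPECTED_ENTRIES = ("taskweaver_config.json", "plugins")


def missing_entries(project_dir, expected=EXPECTED_ENTRIES):
    """Return the expected names that do not exist under project_dir."""
    root = Path(project_dir)
    return [name for name in expected if not (root / name).exists()]


def demo():
    """Build a throwaway project folder and check it before and after."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "plugins").mkdir()
        before = missing_entries(root)
        (root / "taskweaver_config.json").write_text("{}")
        after = missing_entries(root)
        return before, after


if __name__ == "__main__":
    print(demo())
```

An empty list means both entries were found; any names returned are the ones still to be created or copied.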
Before running TaskWeaver, you need to provide your OpenAI API key and other necessary information.
You can do this by editing the taskweaver_config.json file.
If you are using Azure OpenAI, you need to set the following parameters in the taskweaver_config.json file:
{
"llm.api_base": "https://xxx.openai.azure.com/",
"llm.api_key": "the api key",
"llm.api_type": "azure",
"llm.api_version": "the api version",
"llm.model": "the model name, e.g., gpt-4"
}
If you are using OpenAI, you need to set the following parameters in the taskweaver_config.json file:
{
"llm.api_key": "the api key",
"llm.model": "the model name, e.g., gpt-4"
}
💡 Only the latest OpenAI API supports the json_object response format. If you are using an older version of the OpenAI API, you need to set llm.response_format to null.
More configuration options can be found in the configuration documentation.
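If you prefer to generate the configuration rather than edit it by hand, the settings above can be assembled in Python. This is a sketch under stated assumptions: the function and parameter names are hypothetical, and only the `llm.*` keys mentioned in this section are used.

```python
import json


def build_llm_config(api_key, model, azure_base=None, azure_api_version=None,
                     disable_json_response=False):
    """Build the llm.* settings described above as a plain dict."""
    config = {"llm.api_key": api_key, "llm.model": model}
    if azure_base is not None:
        if azure_api_version is None:
            raise ValueError("azure_api_version is required when azure_base is set")
        config["llm.api_base"] = azure_base
        config["llm.api_type"] = "azure"
        config["llm.api_version"] = azure_api_version
    if disable_json_response:
        # None is serialized as JSON null, the value suggested for
        # older versions of the OpenAI API.
        config["llm.response_format"] = None
    return config


def render_config(config):
    """Serialize the settings as the text of taskweaver_config.json."""
    return json.dumps(config, indent=2)


if __name__ == "__main__":
    print(render_config(build_llm_config("the api key", "gpt-4")))
```

The rendered text can then be written to taskweaver_config.json in the project directory. Avoid committing a file that contains a real API key.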
# assume you are in the taskweaver folder
# -p is the path to the project directory
python -m taskweaver -p ./project/
This will start the TaskWeaver process and you can interact with it through the command line interface. If everything goes well, you will see the following prompt:
=========================================================
_____ _ _ __
|_ _|_ _ ___| | _ | | / /__ ____ __ _____ _____
| |/ _` / __| |/ /| | /| / / _ \/ __ `/ | / / _ \/ ___/
| | (_| \__ \ < | |/ |/ / __/ /_/ /| |/ / __/ /
|_|\__,_|___/_|\_\|__/|__/\___/\__,_/ |___/\___/_/
=========================================================
TaskWeaver: I am TaskWeaver, an AI assistant. To get started, could you please enter your request?
Human: ___
In this example, we will show you how to use TaskWeaver to pull data from a database and apply an anomaly detection algorithm.
If you want to follow this example, you need to configure the sql_pull_data plugin in the project/plugins/sql_pull_data.yaml file.
You need to provide the following information:
api_type: azure or openai
api_base: ...
api_key: ...
api_version: ...
deployment_name: ...
sqlite_db_path: sqlite:///../../../sample_data/anomaly_detection.db
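The sqlite_db_path value uses the `sqlite:///` URL form, where three slashes are followed by a relative path. Before running the example, it can help to confirm that the path points at a real database. The sketch below uses only the standard library; both helpers are hypothetical, and the directory that the relative path is resolved against is an assumption you need to supply as `base_dir`.

```python
import sqlite3
import tempfile
from pathlib import Path

PREFIX = "sqlite:///"


def sqlite_url_to_path(url, base_dir="."):
    """Convert a sqlite:/// URL to a filesystem path under base_dir."""
    if not url.startswith(PREFIX):
        raise ValueError(f"not a sqlite URL: {url}")
    path = Path(url[len(PREFIX):])
    return path if path.is_absolute() else Path(base_dir) / path


def list_tables(db_path):
    """Return the table names in a SQLite file, sorted by name.

    sqlite3.connect creates an empty file if db_path does not exist,
    so check Path(db_path).exists() first when that matters.
    """
    conn = sqlite3.connect(db_path)
    try:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    finally:
        conn.close()
    return [name for (name,) in rows]


def demo():
    """Create a throwaway database and list its tables through the URL form."""
    with tempfile.TemporaryDirectory() as tmp:
        conn = sqlite3.connect(Path(tmp) / "demo.db")
        conn.execute("CREATE TABLE readings (ts TEXT, value REAL)")
        conn.commit()
        conn.close()
        return list_tables(sqlite_url_to_path("sqlite:///demo.db", base_dir=tmp))


if __name__ == "__main__":
    print(demo())
```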
The sql_pull_data plugin pulls data from a database. It takes a natural language request as input and returns a DataFrame as output.
This plugin is implemented based on LangChain. If you want to follow this example, you need to install the langchain and tabulate packages:
pip install langchain
pip install tabulate
In this example, we will show you how to use TaskWeaver to forecast QQQ's price in the next week using the ARIMA algorithm.
If you want to follow this example, you need to have two requirements installed:
pip install yfinance
pip install statsmodels
For more examples, please refer to our paper.
If you want to use TaskWeaver as a library, you can refer to the following code example:
from taskweaver.app.app import TaskWeaverApp
app_dir = "/path/to/project/"
app = TaskWeaverApp(app_dir=app_dir)
session = app.get_session()
user_query = "hello, what can you do?"
response_round = session.send_message(user_query,
                                      event_handler=lambda x, y: print(f"{x}:\n{y}"))
print(response_round.to_dict())
Note:
- event_handler: a callback function used to display the response from TaskWeaver step by step. It takes two arguments: the message type (e.g., plan) and the message content.
- response_round: the response from TaskWeaver, which is an object of the Round class. An example of the Round object is shown below:
{
"id": "round-20231201-043134-218a2681",
"user_query": "hello, what can you do?",
"state": "finished",
"post_list": [
{
"id": "post-20231201-043134-10eedcca",
"message": "hello, what can you do?",
"send_from": "User",
"send_to": "Planner",
"attachment_list": []
},
{
"id": "post-20231201-043141-86a2aaff",
"message": "I can help you with various tasks, such as counting rows in a data file, detecting anomalies in a dataset, searching for products on Klarna, summarizing research papers, and pulling data from a SQL database. Please provide more information about the task you want to accomplish, and I'll guide you through the process.",
"send_from": "Planner",
"send_to": "User",
"attachment_list": [
{
"id": "atta-20231201-043141-6bc4da86",
"type": "init_plan",
"content": "1. list the available functions"
},
{
"id": "atta-20231201-043141-6f29f6c9",
"type": "plan",
"content": "1. list the available functions"
},
{
"id": "atta-20231201-043141-76186c7a",
"type": "current_plan_step",
"content": "1. list the available functions"
}
]
}
]
}
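Once you have the dictionary form of a round, it is plain nested data and can be processed with ordinary Python. The sketch below shows one possible way to collect events and to flatten a round into readable lines. Both helpers are hypothetical, and `sample_round` is a trimmed copy of the example above rather than real output.

```python
def make_event_collector():
    """Return (handler, events); handler takes a message type and its content."""
    events = []

    def handler(message_type, content):
        events.append((message_type, content))

    return handler, events


def summarize_round(round_dict):
    """Return one line per post, followed by one indented line per attachment."""
    lines = []
    for post in round_dict.get("post_list", []):
        lines.append(f'{post["send_from"]} -> {post["send_to"]}: {post["message"]}')
        for attachment in post.get("attachment_list", []):
            lines.append(f'  [{attachment["type"]}] {attachment["content"]}')
    return lines


# Trimmed copy of the Round example, kept short for illustration.
sample_round = {
    "user_query": "hello, what can you do?",
    "post_list": [
        {
            "message": "hello, what can you do?",
            "send_from": "User",
            "send_to": "Planner",
            "attachment_list": [],
        },
        {
            "message": "I can help you with various tasks.",
            "send_from": "Planner",
            "send_to": "User",
            "attachment_list": [
                {"type": "plan", "content": "1. list the available functions"},
            ],
        },
    ],
}

if __name__ == "__main__":
    print("\n".join(summarize_round(sample_round)))
```

The handler returned by `make_event_collector` has the same two-argument shape as the lambda in the library example, so it could be passed as `event_handler` to keep the events instead of printing them.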
There are two ways to customize TaskWeaver: creating plugins and creating examples.
TaskWeaver can already perform some basic tasks, and you can create plugins to extend its capabilities. A plugin is a Python function that takes a set of arguments and returns a set of results.
Typically, you only need to write a plugin in the following example scenarios:
- You want to encapsulate your own algorithm into a plugin.
- You want to import a Python package that is not supported by TaskWeaver.
- You want to connect to an external data source to pull data.
- You want to query a web API.
Refer to the plugin documentation for more details. Otherwise, you can leverage TaskWeaver's code generation capability to perform tasks.
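To make the first scenario concrete, here is the kind of self-contained function you might later wrap as a plugin. It is a plain Python sketch and is not registered with TaskWeaver; the registration steps are covered by the plugin documentation. The z-score rule is an illustrative choice made here, not the algorithm used in the anomaly detection demo.

```python
from statistics import mean, pstdev


def detect_anomalies(values, threshold=3.0):
    """Return the indices of values whose z-score magnitude exceeds threshold."""
    if len(values) < 2:
        return []
    center = mean(values)
    spread = pstdev(values)
    if spread == 0:
        # All values are identical, so nothing can stand out.
        return []
    return [
        index
        for index, value in enumerate(values)
        if abs(value - center) / spread > threshold
    ]


if __name__ == "__main__":
    readings = [10.0] * 20 + [100.0]
    print(detect_anomalies(readings))
```

The function takes arguments and returns results without touching global state, which is the shape the plugin description above calls for.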
The purpose of examples is to help LLMs understand how to perform tasks, especially when the tasks are complex and need domain-specific knowledge.
There are two types of examples: (1) planning examples and (2) code interpreter examples. Planning examples demonstrate how to use TaskWeaver to plan for a specific task. Code interpreter examples demonstrate how to generate code or orchestrate plugins to perform a specific task.
Refer to the example documentation for more details.
Our paper can be found here. If you use TaskWeaver in your research, please cite our paper:
@article{taskweaver,
title={TaskWeaver: A Code-First Agent Framework},
author={Bo Qiao and Liqun Li and Xu Zhang and Shilin He and Yu Kang and Chaoyun Zhang and Fangkai Yang and Hang Dong and Jue Zhang and Lu Wang and Minghua Ma and Pu Zhao and Si Qin and Xiaoting Qin and Chao Du and Yong Xu and Qingwei Lin and Saravan Rajmohan and Dongmei Zhang},
journal={arXiv preprint arXiv:2311.17541},
year={2023}
}
This project may contain trademarks or logos for projects, products, or services. Authorized use of Microsoft trademarks or logos is subject to and must follow Microsoft's Trademark & Brand Guidelines. Use of Microsoft trademarks or logos in modified versions of this project must not cause confusion or imply Microsoft sponsorship. Any use of third-party trademarks or logos are subject to those third-party's policies.