The Robotics Task-sequencer System Framework enables developers to build more complex robot systems in which a robot can change its operation patterns on-the-fly based on user instructions. The layered framework structure allows combining different development areas, such as prompt engineering, robotic skill development, and hardware integration, while ensuring that the entire system runs as a whole.
The framework enables generating a sequence of robot actions (a behavior tree) from text input by a human operator using Azure OpenAI, and then connecting those actions to the robot controller. The framework is designed to adapt to hardware of different forms, including but not limited to dual-arm robots, dexterous hands, industrial manipulators, and mobile robots.
The above figure illustrates the overview of the framework. Dotted arrows indicate data flow while solid arrows indicate component loading. Components in a dotted blue outline may indicate dependency on an external service. Components in an orange outline or coloring may require implementation/replacement depending on the usage scenario.
Please note that the term "user(s)" below refers to the developers, engineers, etc. who will be directly using this repository and does not refer to the end-users.
What is/are the Task-sequencer System Framework’s intended use(s)?
The framework should be used to develop robot systems that turn human language into actions for robots. The repository containing the framework includes sample code showing how to adapt the framework; however, it is up to the users to implement the actual system and hardware integration for their purpose.
How was the Task-sequencer System Framework evaluated?
The framework has been tested with several robots, ranging from a humanoid to a four-legged robot. Instructions have been tested on a range of scenarios including daily chores, inspection, and assembly. These varying hardware platforms and scenarios were used to test the applicability of the framework. Please note that the tested scenarios are not part of the repository; however, a user could apply the framework to such scenarios by customizing the prompts, skills, and hardware connections.
What are the limitations of the Task-sequencer System Framework? How can users minimize the impact of the limitations when using the framework?
The framework is not designed to be used as-is for a robotics operation. Users must provide the appropriate prompts, skills, and hardware connections. The repository provides example prompts for a daily-chore scenario with a default skill set including find, look, grasp, pick, bring, place, release, and navigation. While this sample prompt and default skill set may be used as-is for some use cases, users should build upon the samples rather than rely on the sample implementations as if they were ready for a robotics operation.
Is the Task-sequencer System Framework an end-to-end language-action model?
The framework is not an end-to-end language-action model but rather a layered framework that achieves a language-to-action pipeline. Instead of directly outputting low-level motor commands, the language input is converted into a sequence of high-level action commands (tasks) in the behavior tree format. This allows a human-in-the-loop workflow in which a human operator can examine the generated sequence before low-level commands are sent to the robot.
Does the Task-sequencer System Framework ensure safe executions?
The framework does not detect whether the generated sequence is safe to perform. It is highly recommended that users check and ensure that the generated sequence is safe by examining the generated behavior tree. Users should be prepared to emergency-stop the robot if it is behaving unexpectedly.
What operational factors and settings allow for effective and responsible use of the Task-sequencer System Framework?
To increase the reliability of the sequence generation, it is recommended that users edit the prompts. To achieve the hardware connections, a user with a good understanding of their hardware should implement the connection, paying extra attention to the robot's control speed, and preferably use simulators until a reliable end-to-end system flow is obtained.
Does the repository provide examples for real hardware connection?
The sample code does not provide examples that connect to specific hardware. However, if using a ROS-compatible robot, an example of connecting to a ROS controller is provided in the comments of the sim_robot_controller.py sample code.
How can custom modules be integrated with the Task-sequencer System Framework?
The framework relies on loading Python modules for the prompt-set definitions, skills, and hardware connections. A user can load their own prompt-set definitions, skills, and hardware adapters by creating their own configuration files and by passing the right arguments when running server.py and core.py (please see the Developing section for further details). It is highly recommended that users package their own modules and configurations in a separate repository/folder and avoid mixing their code with the code provided by this repository.
Are the skills in the default skill set machine-learned / trained using datasets? What are the requirements for using the default skill set?
All the skills in the default skill set are engineered (non-trained) and require integrating sensors/controllers in the hardware connection layer to successfully generate a trajectory. While these skills require the hardware adapters to be correctly set up, they do not require additional data collection as they are not trained on data. However, data collection could be a requirement for the controller/sensor module the user decides to integrate (e.g., a sensor integrated with Cognitive Services for object detection).
What are the skill parameters and how are they obtained in the system?
In the generated behavior tree, skill parameters are indicated using naming conventions starting with an "@" mark. For the default skill set, these parameters change the behavior of the trajectory generation. In the sample code and for the default skill set, a fixed value is loaded from an external JSON file. The repository does not provide an automated way of obtaining these parameters, and it is recommended that users develop an automated procedure for capturing these parameters based on the state of the environment. However, a tool using GPT-4o to automate skill parameter collection for the default skill set may be released as a separate repository in the future.
Please install the code and required packages in a Python environment. Below is an example of installing the code and packages in a new venv environment (this is only an example; please follow the appropriate install procedures for your Python environment).
py -3 -m venv .venv
.venv/Scripts/activate
pip install -r requirements.txt
pip install .
If using Visual Studio Code, don't forget to set the Python interpreter to the venv with Ctrl + Shift + P, then "Select Interpreter".
The recommended Python version for this repository is 3.12.
To run server.py, please also make sure you have an Azure OpenAI resource ready under your subscription. The recommended model is gpt-4o. You may also need to set the rate limits to a high value.
The framework includes two running Python applications: the server and the core. The server generates a sequence of robot actions in the behavior tree format. The core runs the generated sequence to control the robot.
To run the server, run the following command (this is a one-line command):
python ./src/tasqsym_encoder/server.py --credentials <CREDENTIALS_FILE> --aoai --aioutput --aimodel tasqsym_samples.aimodel_samples.model.PickPlaceScenario --config ./src/tasqsym_samples/encoder_sample_settings.json --connection file
For the <CREDENTIALS_FILE>, please create a file with the appropriate access settings for your Azure OpenAI resource:
AZURE_OPENAI_ENDPOINT=
AZURE_OPENAI_KEY=
AZURE_OPENAI_DEPLOYMENT_NAME_CHATGPT=
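For reference, a filled-in credentials file might look like the following (the values below are placeholders; substitute your own resource endpoint, API key, and chat-model deployment name):
AZURE_OPENAI_ENDPOINT=https://<your-resource-name>.openai.azure.com/
AZURE_OPENAI_KEY=<your-api-key>
AZURE_OPENAI_DEPLOYMENT_NAME_CHATGPT=<your-gpt-4o-deployment-name>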
Once the server begins (shows INFO: Uvicorn running on http://localhost:9100 in the terminal), open a web browser and connect to localhost:9100.
In the web browser UI, enter "throw away the empty bottle" in the text box and you should see a behavior tree corresponding to the text instruction generated in a few seconds. After confirming the generated content, enter "Y" in the text box and you should see the generated content saved to a file (tasqsym_encoder_output.json).
To run the core, run the following command (this is a one-line command):
python ./src/tasqsym/core.py --config ./src/tasqsym_samples/sim_robot_sample_settings.json --btfile ./src/tasqsym_samples/generated_sequence_samples/throw_away_the_trash.json --connection standalone
Once you run the core, you should see messages like runNode {'Node': 'FIND', '@target_description': 'empty_bottle', '@context': 'find the empty_bottle on the right'} and sending joint angles to the sim robot controller ..., indicating that the behavior tree nodes are being parsed and that commands are (as-if) being sent to the robot controller.
You may change the --btfile option to point to the file generated by the server by running with --btfile ./tasqsym_encoder_output.json.
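For example, the full command for running the previously generated file would then look like the following (this is a one-line command):
python ./src/tasqsym/core.py --config ./src/tasqsym_samples/sim_robot_sample_settings.json --btfile ./tasqsym_encoder_output.json --connection standalone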
Instead of using generated files, the core can receive the generated sequence directly from the server by passing it through MQTT. This can be done by changing the --connection option to --connection mqtt for both the server and the core. You will need an MQTT broker set up to run with this option.
To run using MQTT, you will also need to edit the <CREDENTIALS_FILE> and include the following content:
MQTT_HOST_NAME=
MQTT_USERNAME=
MQTT_CLIENT_ID=
MQTT_CERT_FILE=
MQTT_KEY_FILE=
MQTT_TCP_PORT=
When running the core with MQTT, make sure to remove the --btfile option and instead pass a credentials file for the core using --credentials <CREDENTIALS_FILE_CORE>.
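As a reference, the two commands might then look like the following (a sketch based on the sample configurations above; each command is a one-line command, and <CREDENTIALS_FILE> / <CREDENTIALS_FILE_CORE> must contain the MQTT settings listed above):
python ./src/tasqsym_encoder/server.py --credentials <CREDENTIALS_FILE> --aoai --aioutput --aimodel tasqsym_samples.aimodel_samples.model.PickPlaceScenario --config ./src/tasqsym_samples/encoder_sample_settings.json --connection mqtt
python ./src/tasqsym/core.py --credentials <CREDENTIALS_FILE_CORE> --config ./src/tasqsym_samples/sim_robot_sample_settings.json --connection mqtt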
The framework is designed so that users can develop and replace parts of the code, such as the prompts, skills, and hardware connections, with their own custom modules. It is important to note that custom modules/configurations should be their own set of code, kept separate from this repository, as shown in the following diagram.
└── <task-sequencer-framework>
└── <custom_prompt_set>
├── setup.py
└── src
└── <custom_prompt_set>
├── __init__.py
├── my_model.py
├── action_definitions.json
├── sample_environment.json # environment for testing prompts
├── prompt
│ ├── action_prompt.txt # often used as-is
│ ├── environment_prompt.txt
│ ├── example_prompt.txt
│ ├── output_prompt.txt
│ └── role_prompt.txt # often used as-is
├── query
│ └── query.txt # often used as-is
├── system
│ └── system.txt # often used as-is
└── my_settings_for_server.json
└── <custom_skill_library>
├── setup.py
└── src
└── <custom_skill_library>
├── __init__.py
└── library
├── __init__.py
├── extended_library.py
└── <custom_skill_name>
├── __init__.py
├── <custom_skill_name>.py
└── <custom_skill_name>.md
└── <custom_hardware_adapter>
├── setup.py
└── src
└── <custom_hardware_adapter>
├── __init__.py
├── my_combiner.py
├── my_robot_config.json
├── include
│ ├── __init__.py
│ ├── <robot_name>_model.py
│ └── <robot_name>_controller.py
└── my_settings_for_core.json
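Since each custom package includes a setup.py, one way to make the custom modules importable by the framework (a suggested workflow, not a requirement imposed by the repository; depending on your settings, path-based loading may also suffice) is to install them into the same Python environment used for the framework, for example from the parent directory shown in the diagram:
pip install -e ./<custom_prompt_set>
pip install -e ./<custom_skill_library>
pip install -e ./<custom_hardware_adapter>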
To properly set up the custom modules for the server, please create a settings.json with the right settings and pass the JSON file using the --config argument. If you are using your own set of prompts, please specify the corresponding prompt-set description module (my_model.py) using the --aimodel argument.
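For example, assuming the custom prompt set is importable as a package named custom_prompt_set and my_model.py defines a scenario class named MyScenario (both names are hypothetical), the server could be launched as follows (this is a one-line command):
python ./src/tasqsym_encoder/server.py --credentials <CREDENTIALS_FILE> --aoai --aioutput --aimodel custom_prompt_set.my_model.MyScenario --config ./path/to/my_settings_for_server.json --connection file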
To properly set up the custom modules for the core, please create a settings.json with the right settings and pass the JSON file using the --config argument. If you are using your own skill set, please make sure the path to the library list is correctly set in the settings.json. If you are using your own hardware connections (adapter implementations), please make sure the paths to the adapters are correctly set in the settings.json.
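Similarly, the core could then be launched with the custom settings file (the settings file path below is hypothetical and should point to your my_settings_for_core.json; the behavior tree file is whichever generated sequence you want to run):
python ./src/tasqsym/core.py --config ./path/to/my_settings_for_core.json --btfile ./tasqsym_encoder_output.json --connection standalone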
The framework includes a comprehensive test suite using pytest to ensure reliability and maintainability.
To run all tests:
pytest tests/
To run tests with verbose output:
pytest tests/ -v
To run a specific test file:
pytest tests/test_config_loader.py -v
To run tests with code coverage analysis:
pytest tests/ --cov=src --cov-report=term-missing
To generate an HTML coverage report:
pytest tests/ --cov=src --cov-report=html
The HTML report will be generated in the htmlcov/ directory. Open htmlcov/index.html in a web browser to view detailed coverage information.
The test suite is organized into several categories:
tests/test_core_structs.py - Tests for core data structures (Status, Point, Quaternion, Pose)
tests/test_config_loader.py - Tests for configuration loading and validation
tests/test_integration.py - Integration tests for main components and sample files
tests/test_error_handling.py - Tests for error scenarios and edge cases
tests/test_math_utils.py - Tests for mathematical utility functions
tests/test_action_formats.py - Tests for robot action format classes
tests/test_blackboard.py - Tests for blackboard functionality
tests/test_bt_decoder.py - Tests for behavior tree decoding
tests/test_world_format.py - Tests for world format data structures
tests/test_engines.py - Tests for engine functionality
The test suite requires the following additional dependencies:
pip install pytest>=8.2 pytest-asyncio>=1.0.0 pytest-cov>=6.0.0
These are included in the requirements.txt file.
This project welcomes contributions and suggestions. Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us the rights to use your contribution. For details, visit https://cla.opensource.microsoft.com.
When you submit a pull request, a CLA bot will automatically determine whether you need to provide a CLA and decorate the PR appropriately (e.g., status check, comment). Simply follow the instructions provided by the bot. You will only need to do this once across all repos using our CLA.
This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact opencode@microsoft.com with any additional questions or comments.
This project may contain trademarks or logos for projects, products, or services. Authorized use of Microsoft trademarks or logos is subject to and must follow Microsoft's Trademark & Brand Guidelines. Use of Microsoft trademarks or logos in modified versions of this project must not cause confusion or imply Microsoft sponsorship. Any use of third-party trademarks or logos are subject to those third-party's policies.