This repo has been populated by an initial template to help get you started. Please make sure to update the content to build a great experience for community-building.
As the maintainer of this project, please make a few updates:
- Improving this README.MD file to provide a great experience
- Updating SUPPORT.MD with content about this project's support experience
- Understanding the security reporting process in SECURITY.MD
- Remove this section from the README
R&D Agent: Focusing on automating the most core and valuable part of the industrial R&D process.
Core method: Evolving;
TODO: importance justification
In this project, we are aiming to build a Data-Centric R&D Agent that can
-
Read real-world material (reports, papers, etc.) and extract key formulas, descriptions of interested features, factors and models.
-
Implement the extracted formulas, features, factors and models in runnable codes.
- Due the limited ability for LLM in implementing in once, evolving the agent to be able to extend abilities by learn from feedback and knowledge and improve the agent's ability to implement more complex models.
-
Further propose new ideas based on current knowledge and observations.
In this section, we will briefly introduce the roadmap/technical type of this project.
-
Backbone LLM: We use GPT series as main backbone of the agent.
.env
file uis used to config settings (such as APIkey, APIEndpoint and etc) in the environment variables way. Check this Readme for environment set up. -
KnowledgeGraph based evolving: We do not do any further pertain or fine-tune on the LLM model. Instead, we modify prompts like RAG, but use knowledge graph query information to evolve the agent's ability to implement more complex models.
- Typically, we build a knowledge consisted with
Error
,Component
(you can think of it as a numeric operation or function),Trail
and etc. We add nodes of these types to the knowledge graph with relationship while the agent tries to implement a model. For each attempts, the agent will query the knowledge graph to get the information of current status as prompt input. The agent will also update the knowledge graph with the new information after the attempt.
- Typically, we build a knowledge consisted with
Example: code standard, design. Lint
-
Set up the development environment.
make dev
-
Run linting and formatting.
make lint
-
Backbone/APIBackend of LLm are encapsulated in src/finco/llm.py. All chat completion request are managed by this file.
-
All frequently modified codes under tense development are included in the src/scripts folder.
- The most important task is to improve the agent's performance in the benchmark of factor implementation.
-
Currently, factor implementation is the main task. We define basic class of factor implementation in [src/scripts/factor_implementation/share_modules] and implementation strategies in [src/scripts/factor_implementation/baselines].
Currently, the code structure is unstable and will frequently change for quick updates. The code will be refactored before a standard release. Please try to align with the following principles when developing to minimize the effort required for future refactoring.
๐ src
โฅ ๐ <project name>: avoid namespace
โฅ ๐ core
โฅ ๐ component A
โฅ ๐ component B
โฅ ๐ component C
โฅ ๐ app
โฅ ๐ scenario1
โฅ ๐ scenario2
โฅ ๐ scripts
Folder Name | Description |
---|---|
๐ core | The core framework of the system. All classes should be abstract and usually can't be used directly. |
๐ component X | Useful components that can be used by others(e.g. scenario). Many subclasses of core classes are located here. |
๐ app | Applications for specific scenarios (usually built based on components). Removing any of them does not affect the system's completeness or other scenarios. |
๐ scripts | Quick and dirty things. These are candidates for core, components, and apps. |
You can manually source the .env
file in your shell before running the Python script:
Most of the workflow are controlled by the environment variables.
# Export each variable in the .env file; Please note that it is different from `source .env` without export
export $(grep -v '^#' .env | xargs)
# Run the Python script
python your_script.py
Name | Description |
---|---|
conf.py |
The configuration for the module & app & project |
This project welcomes contributions and suggestions.
You can find issues in the issues list or simply running grep -r "TODO:"
.
Making contributions is not a hard thing. Solving an issue(maybe just answering a question raised in issues list ), fixing/issuing a bug, improving the documents and even fixing a typo are important contributions to RDAgent.
This project welcomes contributions and suggestions. Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us the rights to use your contribution. For details, visit https://cla.opensource.microsoft.com.
When you submit a pull request, a CLA bot will automatically determine whether you need to provide a CLA and decorate the PR appropriately (e.g., status check, comment). Simply follow the instructions provided by the bot. You will only need to do this once across all repos using our CLA.
This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact opencode@microsoft.com with any additional questions or comments.
This project may contain trademarks or logos for projects, products, or services. Authorized use of Microsoft trademarks or logos is subject to and must follow Microsoft's Trademark & Brand Guidelines. Use of Microsoft trademarks or logos in modified versions of this project must not cause confusion or imply Microsoft sponsorship. Any use of third-party trademarks or logos are subject to those third-party's policies.