NeMoEval: A Benchmark Tool for Natural Language-based Network Management

This is a benchmark tool to evaluate natural language-based network management using LLM-generated code.

The benchmark consists of two applications:

  1. Traffic Analysis: Analyzing traffic using communication graphs, in which nodes represent network components like routers, switches, or devices, and edges symbolize the connections or paths between these components.

  2. Network Lifecycle Management: Managing the entire lifecycle of a network entails various phases, including capacity planning, network topology design, deployment planning, and diagnostic operations.
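As a rough illustration of the communication-graph idea used in Traffic Analysis, the sketch below stores a graph as a list of weighted edges and answers one simple query. The node names, the byte-count weight, and the helper `bytes_sent_per_node` are hypothetical and are not the benchmark's actual data schema.

```python
from collections import defaultdict

# Hypothetical communication graph: each edge is (source, destination, bytes).
# Node names and the byte-count weight are illustrative assumptions only.
edges = [
    ("router-a", "switch-1", 1200),
    ("switch-1", "host-x", 800),
    ("router-a", "host-x", 300),
]


def bytes_sent_per_node(edge_list):
    """Sum the outgoing byte counts for every source node."""
    totals = defaultdict(int)
    for src, _dst, nbytes in edge_list:
        totals[src] += nbytes
    return dict(totals)


totals = bytes_sent_per_node(edges)
print(totals)
```

A natural-language query such as "how many bytes did each node send?" would be expected to produce code of roughly this shape, which is then compared against a golden answer.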

Pre-requisites

Requires Python 3.10 or later.

(Upgrade instructions for Ubuntu: https://www.itsupportwale.com/blog/how-to-upgrade-to-python-3-10-on-ubuntu-18-04-and-20-04-lts/)

  1. Create and activate a virtual environment (with venv or conda):
python3.11 -m venv venvname
source venvname/bin/activate
  2. Install the packages listed in requirements.txt with pip:
pip install -r requirements.txt
  3. Set the secret key. In both the app_traffic_analysis/baseline/ and app_lifecycle_management/baseline/ folders:

Rename .env.template to .env

Set OPENAI_API_KEY and OPENAI_API_BASE in the .env file.

Make sure that the OPENAI_API_KEY has credits.

By default we use an Azure OpenAI key. Please include OPENAI_API_BASE=https://your-resource-name.openai.azure.com in your .env file.
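Before running the benchmark, it can help to confirm that the .env file defines both settings. The following is a minimal sketch using only the standard library; the helper names (`parse_env_text`, `missing_keys`, `check_env_file`) are hypothetical and are not part of this repository, and the sample key value is a placeholder.

```python
from pathlib import Path

REQUIRED_KEYS = ("OPENAI_API_KEY", "OPENAI_API_BASE")


def parse_env_text(text):
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        # Split on the first "=" only, so values may contain "=".
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(values, required=REQUIRED_KEYS):
    """Return the required keys that are absent or empty."""
    return [key for key in required if not values.get(key)]


def check_env_file(path=".env"):
    """Report missing settings; a missing file means every key is missing."""
    env_path = Path(path)
    if not env_path.exists():
        return list(REQUIRED_KEYS)
    return missing_keys(parse_env_text(env_path.read_text(encoding="utf-8")))


# Placeholder values for demonstration only.
sample = (
    "OPENAI_API_KEY=sk-placeholder\n"
    "OPENAI_API_BASE=https://your-resource-name.openai.azure.com\n"
)
print(missing_keys(parse_env_text(sample)))
```

Calling `check_env_file()` from inside either baseline/ folder would list any setting that still needs a value.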

If you want to use a non-Azure key, select the following in baseline/ai_model.py:

# Without Azure key
llm = OpenAI(
    model_name='text-davinci-003',
    temperature=0,
    max_tokens=2048,
    openai_api_key=OPENAI_API_KEY
)

Run applications

(1) Traffic Analysis

  1. To run with the existing queries:
cd app_traffic_analysis/baseline
python test_with_golden.py

Results are logged in baseline/logs/

  2. To add your own query with golden answer code: add your {prompt, answer} pair in the following code. Note that the order of the prompts and answers must match.
cd app_traffic_analysis/golden_answer_generator
python write_new_pair_to_df.py
  3. Example of generating a new network graph. Please check the meaning of the parameters in mock_graph_data.py.
cd app_traffic_analysis/baseline
python mock_graph_data.py --n=5 --v=5 --c=0.05 --o=data/graph_data/node5.json
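For the {prompt, answer} pairs in step 2, the requirement that prompts and answers stay in the same order can be checked before anything is written. This is a sketch under assumptions: the function `pair_prompts_with_answers` and the sample strings are hypothetical and do not reflect how write_new_pair_to_df.py is actually implemented.

```python
def pair_prompts_with_answers(prompts, answers):
    """Pair prompts and answers by position, refusing mismatched lengths."""
    if len(prompts) != len(answers):
        raise ValueError(
            f"{len(prompts)} prompts but {len(answers)} answers; "
            "each prompt needs exactly one golden answer"
        )
    return [{"prompt": p, "answer": a} for p, a in zip(prompts, answers)]


# Hypothetical example data: answer i is the golden code for prompt i.
prompts = ["Count the nodes in the graph.", "Count the edges in the graph."]
answers = ["golden code for the node count", "golden code for the edge count"]

pairs = pair_prompts_with_answers(prompts, answers)
print(pairs[0]["prompt"])
```

A length check only catches missing entries; a swapped order still has to be verified by reading the pairs.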

If you want to load the new graph, change the following global variables in baseline/test_with_golden.py to point to the graph you want.

OUTPUT_JSONL_PATH = 'logs/node10_log.jsonl'
GRAPH_PATH = "../data/graph_data/node10.json"
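The .jsonl extension of the log path suggests JSON Lines, that is, one JSON object per line. Assuming that format, a log can be loaded for inspection as sketched below; the helper `read_jsonl` and the field names in the demo records are made up for illustration and may not match the real log contents.

```python
import json
import tempfile
from pathlib import Path


def read_jsonl(path):
    """Return one parsed object per non-empty line of a JSON Lines file."""
    records = []
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:
                records.append(json.loads(line))
    return records


# Demo on a temporary file; the "query" and "passed" fields are hypothetical.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "demo_log.jsonl"
    demo_path.write_text(
        '{"query": "count nodes", "passed": true}\n'
        '{"query": "list edges", "passed": false}\n',
        encoding="utf-8",
    )
    records = read_jsonl(demo_path)

print(len(records))
```

Pointing `read_jsonl` at the file named by OUTPUT_JSONL_PATH would return the logged records as a list of dictionaries.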

(2) Lifecycle management

We use Google MALT data as an example of application use. The full data is available in the original git repo.

For the baseline showcase, we extract only a small subset of the original data because of LLM token limits. The sampled data is stored in app_lifecycle_management/data/malt-example-sample.txt.

To run:

cd app_lifecycle_management/baseline
python test_with_golden.py

Reference Paper

If you use our benchmark tool in your work, we would appreciate a reference to the following paper:

Sathiya Kumaran Mani, Yajie Zhou, Kevin Hsieh, Santiago Segarra, Ranveer Chandra, and Srikanth Kandula. Enhancing Network Management Using Code Generated by Large Language Models. ACM Workshop on Hot Topics in Networks (HotNets), 2023.

Contributing

This project welcomes contributions and suggestions. Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us the rights to use your contribution. For details, visit https://cla.opensource.microsoft.com.

When you submit a pull request, a CLA bot will automatically determine whether you need to provide a CLA and decorate the PR appropriately (e.g., status check, comment). Simply follow the instructions provided by the bot. You will only need to do this once across all repos using our CLA.

This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact opencode@microsoft.com with any additional questions or comments.

Trademarks

This project may contain trademarks or logos for projects, products, or services. Authorized use of Microsoft trademarks or logos is subject to and must follow Microsoft's Trademark & Brand Guidelines. Use of Microsoft trademarks or logos in modified versions of this project must not cause confusion or imply Microsoft sponsorship. Any use of third-party trademarks or logos are subject to those third-party's policies.
