We aim to showcase how machine learning delivers the best benchmark performance for NP-hard and NP-complete problems.
In the meantime, this repo also includes our code and tricks for nonconvex and nonlinear optimization problems.
Benchmark for combinatorial optimization problems.
For deep reinforcement learning algorithms, we use ElegantRL and OpenAI Gym.
The following two key technologies are under active development:
- Massively parallel simulations of gym environments on GPU, using thousands of CUDA cores and tensor cores.
- Podracer scheduling on a GPU cloud, e.g., DGX-2 SuperPod.
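As a rough illustration of the first technology, the sketch below evaluates thousands of candidate maxcut solutions in one batched PyTorch operation, which runs on CUDA cores when a GPU is available. The function and variable names here are illustrative assumptions, not the actual ElegantRL_Solver API.

```python
import torch

def batched_cut_value(adjacency: torch.Tensor, solutions: torch.Tensor) -> torch.Tensor:
    """adjacency: (N, N) symmetric 0/1 matrix; solutions: (B, N) in {0, 1}.
    Returns the cut value of each of the B candidate solutions."""
    # An edge (i, j) is cut when solutions[:, i] != solutions[:, j].
    diff = solutions.unsqueeze(2) != solutions.unsqueeze(1)  # (B, N, N)
    # The symmetric adjacency matrix counts each cut edge twice, so halve the sum.
    return (diff * adjacency).sum(dim=(1, 2)) / 2

device = "cuda" if torch.cuda.is_available() else "cpu"
num_envs, num_nodes = 4096, 20

# Ring graph: node i connects to node (i + 1) mod N.
adj = torch.zeros(num_nodes, num_nodes, device=device)
idx = torch.arange(num_nodes, device=device)
adj[idx, (idx + 1) % num_nodes] = 1
adj = adj + adj.T

# 4096 random solutions scored by a single batched kernel launch.
sols = torch.randint(0, 2, (num_envs, num_nodes), device=device)
cuts = batched_cut_value(adj, sols)  # shape: (4096,)
```

Because every environment is just one row of a tensor, adding more environments only grows the batch dimension rather than spawning more processes.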
Key references:
- Mazyavkina, Nina, et al. "Reinforcement learning for combinatorial optimization: A survey." Computers & Operations Research 134 (2021): 105400.
- Bengio, Yoshua, Andrea Lodi, and Antoine Prouvost. "Machine learning for combinatorial optimization: a methodological tour d’horizon." European Journal of Operational Research 290.2 (2021): 405-421.
- Peng, Yun, Byron Choi, and Jianliang Xu. "Graph learning for combinatorial optimization: a survey of state-of-the-art." Data Science and Engineering 6.2 (2021): 119-141.
- Nair, Vinod, et al. "Solving mixed integer programs using neural networks." arXiv preprint arXiv:2012.13349 (2020).
- Makoviychuk, Viktor, et al. "Isaac Gym: High performance GPU based physics simulation for robot learning." Thirty-fifth Conference on Neural Information Processing Systems Datasets and Benchmarks Track (Round 2). 2021.
ElegantRL_Solver follows these principles:
- Gym-style environments (not exactly the same, since some problems have no action or step).
- Massively parallel environments.
Important functions:
- reset(): Initialize the variables.
- obj(): Calculate the objective function, i.e., the Hamiltonian.
- reward(mu1: Tensor, mu2: Tensor): Calculate the Hamiltonian change from the graph mu1 to another graph mu2.
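A minimal sketch of how these three functions could fit together for maxcut, assuming a PyTorch backend with massively parallel environments stored as one batched tensor. The class and attribute names are our own assumptions, not the repo's actual implementation.

```python
import torch
from torch import Tensor

class MaxcutEnvSketch:
    """Hypothetical gym-style maxcut environment with reset/obj/reward."""

    def __init__(self, adjacency: Tensor, num_envs: int = 1024):
        self.adjacency = adjacency      # (N, N) symmetric edge-weight matrix
        self.num_nodes = adjacency.shape[0]
        self.num_envs = num_envs
        self.x = None                   # (num_envs, N) node assignments in {0, 1}

    def reset(self) -> Tensor:
        # Initialize every parallel environment with a random cut.
        self.x = torch.randint(0, 2, (self.num_envs, self.num_nodes)).float()
        return self.x

    def obj(self, x: Tensor) -> Tensor:
        # Ising-style Hamiltonian of a cut: H(x) = -(1/2) * sum_ij A_ij * 1[x_i != x_j],
        # so lower H means a larger cut.
        diff = (x.unsqueeze(2) - x.unsqueeze(1)).abs()   # (B, N, N)
        return -(diff * self.adjacency).sum(dim=(1, 2)) / 2

    def reward(self, mu1: Tensor, mu2: Tensor) -> Tensor:
        # Hamiltonian decrease when moving from state mu1 to state mu2.
        return self.obj(mu1) - self.obj(mu2)
```

All three methods are batched over the first dimension, so one call scores every parallel environment at once.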
- Learn to branch
  - code 2023 AAAI: Reinforcement Learning for Branch-and-Bound Optimisation using Retrospective Trajectories
  - code 2021 AAAI: Parameterizing Branch-and-Bound Search Trees to Learn Branching Policies
- Learn to cut
  - code 2020 ICML: Reinforcement learning for integer programming: Learning to cut
- ML/RL + algorithm/heuristic
  - code 2021 AAAI: Combining reinforcement learning with Lin-Kernighan-Helsgaun algorithm for the traveling salesman problem
- Classical methods
  - Random walk
  - Greedy
  - $\epsilon$-greedy
  - Simulated annealing
  - Local search
  - Beam search
  - Branch-and-bound
  - Cutting plane
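To make two of these classical baselines concrete, here is a pure-Python sketch of greedy local search and simulated annealing for maxcut. The function names and the linear cooling schedule are our own illustrative choices, not the repo's code.

```python
import math
import random

def cut_value(edges, x):
    """Number of edges whose endpoints land on different sides of the cut."""
    return sum(1 for i, j in edges if x[i] != x[j])

def greedy_maxcut(edges, num_nodes):
    """Greedy: flip any node that increases the cut, until no single flip helps."""
    x = [0] * num_nodes
    best = cut_value(edges, x)
    improved = True
    while improved:
        improved = False
        for v in range(num_nodes):
            x[v] ^= 1                       # tentatively flip node v
            new = cut_value(edges, x)
            if new > best:
                best, improved = new, True  # keep the improving flip
            else:
                x[v] ^= 1                   # revert
    return x, best

def simulated_annealing_maxcut(edges, num_nodes, steps=5000, t0=2.0, seed=0):
    """Simulated annealing: accept a worse cut with probability exp(delta / T)."""
    rng = random.Random(seed)
    x = [rng.randint(0, 1) for _ in range(num_nodes)]
    cur = cut_value(edges, x)
    best, best_x = cur, x[:]
    for step in range(steps):
        t = t0 * (1 - step / steps) + 1e-9  # linear cooling toward T ~ 0
        v = rng.randrange(num_nodes)
        x[v] ^= 1                           # propose a single-node flip
        new = cut_value(edges, x)
        if new >= cur or rng.random() < math.exp((new - cur) / t):
            cur = new                       # accept the move
            if cur > best:
                best, best_x = cur, x[:]
        else:
            x[v] ^= 1                       # reject and revert
    return best_x, best
```

Greedy is a local-search method and can get stuck in local optima; simulated annealing escapes some of them by occasionally accepting worse moves at high temperature.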
- Gurobi download/install manual (the state-of-the-art solver)
- SCIP download/install manual (an open-source solver; its simplex is commonly used in "learn to branch/cut")
- BiqMac (only for binary quadratic or maxcut problems)
- Xpress download/install manual
The performance of ElegantRL_Solver compared with other methods and solvers is presented here, covering problems such as maxcut and TSP, along with instructions for running the code.
RLSolver
└──helloworld
|    └──maxcut.py
|    └──maxcut_env.py
└──opt_methods
└──readme
|    └──graph_partitioning.md
|    └──maxcut.md
|    └──tsp.md
└──rlsolver (main folder)
     └──data (datasets for problems)
     └──envs
     └──result (store output files)
     └──rlsolver_learn2opt
     |    └──mimo
     |    └──tensor_train
     └──graph_partitioning.py
     └──graph_partitioning_gurobi.py
     └──maxcut.py
     └──maxcut_H2O.py
     └──maxcut_gurobi.py
     └──tsp.py
     └──tsp_gurobi.py
     └──utils.py
- MIMO
- Maxcut
- TNCO
- TSP
- MILP
- portfolio_management
- quantum_circuits