
David-deng-yeah/Shenzhen-cup-mathematical-modeling-challenge


Shenzhen-cup-mathematical-modeling-challenge

Solution framework

image

Relevant links

Shenzhen-cup-mathematical-modeling-challenge

blog

Introduction

Pursuit-evasion differential games play an increasingly important role in modern military science and technology, and they also serve as a useful test case for machine learning algorithms. Drawing on kinematics and optimization theory, this paper designs a reinforcement learning algorithm to solve the sheep-dog pursuit-evasion game.

For problem 1, we establish an optimal containment strategy model for the dog. The dog's goal is to encircle the sheep, and the smaller the distance between the dog and the sheep, the more likely the encirclement is to succeed. We therefore write the kinematic equations of the dog and the sheep in polar coordinates and derive an expression for the distance between them. From this expression we formulate a programming model whose objective is to minimize the relative distance.
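The distance expression and a distance-reducing move for the dog can be sketched as follows. This is a minimal illustration, not the paper's model: it assumes the dog runs along a circle of radius `R` at angle `phi` while the sheep sits at polar position `(r, theta)`, and all function and parameter names are hypothetical. Under that assumption the law of cosines gives d = sqrt(r^2 + R^2 - 2 r R cos(theta - phi)), so for a fixed sheep position the dog reduces d by shrinking the angular gap.

```python
import math


def sheep_dog_distance(r, theta, R, phi):
    """Distance between the sheep at polar (r, theta) and the dog at (R, phi).

    Law of cosines; assumes both positions share the same polar origin.
    """
    return math.sqrt(r * r + R * R - 2.0 * r * R * math.cos(theta - phi))


def dog_step(theta, phi, R, dog_speed, dt):
    """Advance the dog's angle for one time step (hypothetical greedy rule).

    The dog turns toward the sheep's angular position, limited by its
    maximum angular velocity dog_speed / R. Returns the new dog angle.
    """
    # Signed angular gap from dog to sheep, wrapped into (-pi, pi].
    gap = math.atan2(math.sin(theta - phi), math.cos(theta - phi))
    max_turn = dog_speed / R * dt
    turn = max(-max_turn, min(max_turn, gap))
    return phi + turn
```

Because the distance grows monotonically with the absolute angular gap on [0, pi], each call to `dog_step` either leaves the distance unchanged (gap already zero) or reduces it for a stationary sheep.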

For problem 2, we explore the conditions under which the sheep can escape when the dog follows the optimal containment strategy. First, from the kinematic equations we obtain the sheep's escape angle, its distance to the escape point, and its escape time. The dog's containment time then follows from the arc length between the dog and the escape point and the dog's angular velocity. Combining the two gives the condition for a successful escape: the sheep's escape time must be less than the dog's containment time. We then extend the one-dog, one-sheep case to multiple dogs and one sheep. According to the kinematic equations, the sheep has two optimal escape strategies: straight-line motion or broken-line (piecewise-linear) motion.
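The escape condition for the straight-line case can be illustrated with a short sketch. It assumes a circular boundary of radius `R`, constant speeds `v_sheep` and `v_dog`, and a dog that takes the shorter arc to the escape point; these symbols and function names are assumptions made for illustration and do not come from the paper.

```python
import math


def escape_margin(r, theta, phi, alpha, R, v_sheep, v_dog):
    """Dog containment time minus sheep escape time for escape angle alpha.

    The sheep runs in a straight line from polar (r, theta) to the boundary
    point (R, alpha); the dog runs along the circle from angle phi to alpha.
    A positive margin means the sheep arrives first and escapes.
    """
    # Straight-line distance from the sheep to the escape point.
    path = math.sqrt(r * r + R * R - 2.0 * r * R * math.cos(alpha - theta))
    t_sheep = path / v_sheep
    # Shorter arc (as an angle) between the dog and the escape point.
    arc_angle = abs(math.atan2(math.sin(alpha - phi), math.cos(alpha - phi)))
    t_dog = R * arc_angle / v_dog
    return t_dog - t_sheep


def best_escape_angle(r, theta, phi, R, v_sheep, v_dog, samples=3600):
    """Grid search for the boundary angle with the largest escape margin."""
    candidates = [2.0 * math.pi * k / samples for k in range(samples)]
    return max(
        candidates,
        key=lambda a: escape_margin(r, theta, phi, a, R, v_sheep, v_dog),
    )
```

For a sheep at the centre and a dog of equal speed, the margin at the point opposite the dog is positive, so a straight run succeeds. With a dog four times faster the same run fails, which is the kind of case where a broken-line strategy becomes relevant.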

For problem 3, we use machine learning to train a sheep that has no knowledge of kinematics or optimization theory to escape successfully. Based on the DDPG (deep deterministic policy gradient) reinforcement learning algorithm, this paper designs an intelligent algorithm for training the sheep to escape. First, the initial environment state is set according to the dog's equations of motion from problem 1. As the intelligent sheep interacts with the environment, it gathers escape experience, which is stored in memory. Using Python's gym library, we ran 200 training rounds, and the algorithm converged after the 40th round, by which point the sheep had learned to make decisions that lead to a successful escape.
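The experience memory mentioned above can be sketched as a fixed-size replay buffer, the structure DDPG-style agents typically sample minibatches from. This is a minimal standard-library illustration; the class name, method names, and transition layout are hypothetical and are not taken from the repository's code.

```python
import random
from collections import deque


class ReplayMemory:
    """Fixed-capacity store of (state, action, reward, next_state, done)."""

    def __init__(self, capacity):
        # A deque with maxlen silently drops the oldest transition when full.
        self.buffer = deque(maxlen=capacity)

    def store(self, state, action, reward, next_state, done):
        """Record one interaction between the sheep and the environment."""
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        """Draw a random minibatch and return it column by column."""
        batch = random.sample(list(self.buffer), batch_size)
        states, actions, rewards, next_states, dones = zip(*batch)
        return states, actions, rewards, next_states, dones

    def __len__(self):
        return len(self.buffer)
```

In a training loop, each environment step would call `store`, and once the memory holds at least one batch the agent would call `sample` to update its actor and critic networks. Sampling at random breaks the correlation between consecutive steps of one chase.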

For problem 4, we evaluate the algorithm from problem 3 quantitatively. Three evaluation indicators are used: the number of convergence steps, the pursuit-evasion trajectories of the sheep and the dog, and the average score of the algorithm. Once training has converged, the sheep has learned to make successful escape decisions, which reflects the practicality of the algorithm; if convergence is reached in a small number of steps, the algorithm also has strong learning ability. From the trajectories after convergence, each decision of the sheep can be examined and checked for realism, which serves to assess the scientific soundness of the algorithm. The average score, computed over randomly selected initial states, reflects the effectiveness of the algorithm. In the test of the sheep escape training algorithm, the number of convergence steps is 40, so the sheep reached the training goal after few training rounds, which reflects the practicality and strong learning ability of the algorithm. The pursuit-evasion trajectories show that the sheep's movements and decisions are realistic, which reflects the scientific soundness of the algorithm.
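Two of these indicators can be computed from logged training data, as in the sketch below. The text does not state the exact convergence criterion, so the moving-average rule here (window size and tolerance included) is an assumption, and the function names and the `run_episode` callback are hypothetical.

```python
import random


def convergence_episode(returns, window=10, tol=0.05):
    """Return the 1-based episode from which the moving average stays settled.

    Assumed criterion: every window mean from that episode onward lies
    within a relative tolerance `tol` of the final window mean.
    Returns None if there are fewer than `window` episodes.
    """
    final = sum(returns[-window:]) / max(len(returns[-window:]), 1)
    band = tol * max(abs(final), 1e-12)
    result = None
    for start in range(len(returns) - window + 1):
        mean = sum(returns[start:start + window]) / window
        if abs(mean - final) <= band:
            if result is None:
                result = start + 1  # first episode of the settled stretch
        else:
            result = None  # left the band, so restart the search
    return result


def average_score(run_episode, n_trials, seed=0):
    """Mean score over episodes started from randomly drawn initial states.

    `run_episode(rng)` is a caller-supplied function that draws an initial
    state from `rng`, runs the trained policy, and returns the score.
    """
    rng = random.Random(seed)
    scores = [run_episode(rng) for _ in range(n_trials)]
    return sum(scores) / n_trials
```

A fixed seed makes the average score reproducible across evaluation runs, which helps when comparing different reward designs.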

For problem 5, we extend the one-on-one pursuit from problem 3 and propose an intelligent escape algorithm for a sheep pursued by multiple dogs. We adjusted the return function of the problem 3 algorithm accordingly, and training began to converge after about 125 rounds, which indicates that the algorithm learns quickly and is practical. To test and evaluate the algorithm further, we trained and tested the intelligent sheep four times with different combinations of initial positions for two dogs. In the four sessions, the sheep needed 27, 33, 43, and 28 steps respectively to escape. The pursuit-evasion trajectories show that the movements and strategies of the intelligent sheep are realistic and can be interpreted, which reflects the scientific soundness of the algorithm. Finally, the advantages and disadvantages of the model and its possible extensions are discussed.
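The text does not give the adjusted return function, so the following is only one plausible shape for a multi-dog reward, not the authors' function. It assumes dogs on a circular boundary of radius `R` at angles `dog_angles`, a sheep at polar `(r, theta)`, and a hypothetical `catch_angle` within which a dog at the boundary counts as catching the sheep.

```python
import math


def multi_dog_reward(r, theta, dog_angles, R, catch_angle=0.1):
    """Illustrative per-step reward for one sheep against several dogs.

    Terminal: +100 for reaching the boundary away from every dog,
    -100 for reaching it within catch_angle of a dog.
    Otherwise: a shaping term in [-1, 1) that rewards moving outward and
    keeping a large angular gap to the nearest dog.
    """
    # Absolute angular gap to each dog, wrapped into [0, pi].
    gaps = [
        abs(math.atan2(math.sin(theta - p), math.cos(theta - p)))
        for p in dog_angles
    ]
    nearest = min(gaps)
    if r >= R:
        return 100.0 if nearest > catch_angle else -100.0
    return r / R + nearest / math.pi - 1.0
```

Using the nearest dog rather than the average gap keeps the one-dog case unchanged and penalises running toward any single pursuer.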
