TSP_20 task reward function #10

Closed
wasd12345 opened this issue Nov 19, 2018 · 1 comment

Comments

@wasd12345

Hi, thanks for sharing your code.

In the file "tsp_task.py", I see the TSP reward defined in the usual way as the sum of distances between consecutive vertices in the cycle. But it looks like you also started to generalize this, or maybe had to manually tune things a bit for tsp_20? I see this:

# For TSP_20 - map to a number between 0 and 1
# min_len = 3.5
# max_len = 10.
# TODO: generalize this for any TSP size
#tour_len = -0.1538*tour_len + 1.538 
#tour_len[tour_len < 0.] = 0.
return tour_len

So for the results you showed for the tsp_20 and tsp_50 tasks, did you use the usual sum of distances, which is what the current uncommented code returns as is, or did you find it helped to apply the affine-then-clip operations you have commented out? Are the min_len and max_len values derived theoretically or empirically?

Any other thoughts here in terms of generalizing the reward to work with arbitrary tour length?
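
For reference, the commented-out constants are consistent with a linear map that sends min_len = 3.5 to 1 and max_len = 10 to 0, since 1 / (10 - 3.5) ≈ 0.1538 and 10 / (10 - 3.5) ≈ 1.538. A minimal sketch of how that could be parameterized for other TSP sizes is below. The function name `normalized_tour_reward` and its signature are hypothetical, not from the repository, and the bounds for any other problem size would still have to be chosen somehow.

```python
def normalized_tour_reward(tour_len, min_len=3.5, max_len=10.0):
    """Affine-then-clip mapping of a tour length (hypothetical helper).

    Sends min_len -> 1 and max_len -> 0, which is algebraically the same as
    -tour_len / (max_len - min_len) + max_len / (max_len - min_len).
    With the TSP_20 defaults this is about -0.1538 * tour_len + 1.538.
    """
    if max_len <= min_len:
        raise ValueError("max_len must be greater than min_len")
    scaled = (max_len - tour_len) / (max_len - min_len)
    # Like the commented-out snippet, clip only from below: tours longer than
    # max_len get 0, while tours shorter than min_len can exceed 1.
    return max(scaled, 0.0)


if __name__ == "__main__":
    for length in (3.5, 6.75, 10.0, 12.0):
        print(length, normalized_tour_reward(length))
```

Note that this only rescales the raw tour length. Whether it helps training is a separate, empirical question.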

Thanks!

@pemami4911
Owner

I used the usual sum of distances for the results. I don't think the commented-out lines worked better, but it's been a while, sorry.
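
As an illustration of "the usual sum of distances", here is a minimal plain-Python sketch, not the code from tsp_task.py. The name `tour_length` and its arguments are hypothetical, and it assumes 2D Euclidean coordinates and a closed tour that returns from the last city to the first.

```python
import math


def tour_length(coords, tour):
    """Sum of Euclidean distances between consecutive cities in a closed tour.

    coords: list of (x, y) tuples, one per city (assumed 2D).
    tour:   list of city indices giving the visiting order.
    """
    total = 0.0
    n = len(tour)
    for i in range(n):
        x1, y1 = coords[tour[i]]
        # Wrap around so the last city connects back to the first.
        x2, y2 = coords[tour[(i + 1) % n]]
        total += math.hypot(x2 - x1, y2 - y1)
    return total


if __name__ == "__main__":
    square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
    print(tour_length(square, [0, 1, 2, 3]))  # perimeter of the unit square
    print(tour_length(square, [0, 2, 1, 3]))  # crossing tour, longer
```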
