Attacking Towers #33

Merged (12 commits) on Feb 7, 2019


Nostrademous
Collaborator

This is a start on improving #32.

We can now attack towers.

It also adds a reward based on comparing the health of the friendly and enemy mid tier 1 towers.

Also adds more reward to a win/loss.

Minor fixes here and there.

@Nostrademous
Collaborator Author

Rebased on top of your latest code.

@Nostrademous
Collaborator Author

Added a static method that splits the full unit list provided in the protobuf into per-unit-type lists in a single pass, so that the unit_matrix() static method no longer has to iterate over the full list multiple times (for-loops in Python tend not to be very efficient).
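As a rough illustration of the single-pass split described here, a minimal sketch follows. The `Unit` class, its field names, and the unit-type labels are hypothetical stand-ins, not the actual protobuf schema or the code in this PR.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Unit:
    """Hypothetical stand-in for one unit entry in the world-state protobuf."""
    handle: int
    unit_type: str  # assumed labels such as "HERO", "LANE_CREEP", "TOWER"
    health: int


def group_units_by_type(units):
    """Split the full unit list into per-type lists in one pass."""
    groups = defaultdict(list)
    for unit in units:
        groups[unit.unit_type].append(unit)
    return dict(groups)


units = [
    Unit(1, "HERO", 600),
    Unit(2, "LANE_CREEP", 550),
    Unit(3, "TOWER", 1800),
    Unit(4, "LANE_CREEP", 300),
]
groups = group_units_by_type(units)

# Each per-type list can now be handed to its own feature builder
# without rescanning the full unit list.
for unit_type, members in groups.items():
    print(unit_type, [u.handle for u in members])
```

With this layout, adding a new group later only requires a new type label; the loop itself does not change.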

@Nostrademous
Collaborator Author

Since we now use a zero-sum game, and enemy rewards are reflected inversely in our rewards, I fixed the tower health reward so that it is not zero-sum in itself; otherwise we were doubling its effect. Also, I believe I had the reward value reversed by accident.
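To make the doubling concrete, here is a small sketch. It assumes the final reward is "own reward minus enemy reward" and that the one-sided tower term is simply each team's own tower health; both are illustrative assumptions, and the function names are hypothetical rather than taken from the PR.

```python
def zero_sum(own_reward, enemy_reward):
    """Final reward under the zero-sum scheme: ours minus theirs (assumed form)."""
    return own_reward - enemy_reward


def tower_term_zero_sum(own_tower_hp, enemy_tower_hp):
    """Tower term that is already a difference, i.e. zero-sum in itself."""
    return own_tower_hp - enemy_tower_hp


def tower_term_one_sided(own_tower_hp):
    """Tower term that only looks at one team's own tower (assumed fix)."""
    return own_tower_hp


radiant_hp, dire_hp = 1800.0, 1200.0

# A difference-based term gets differenced again, so the gap counts twice.
doubled = zero_sum(
    tower_term_zero_sum(radiant_hp, dire_hp),
    tower_term_zero_sum(dire_hp, radiant_hp),
)

# A one-sided term is differenced only once by the zero-sum step.
single = zero_sum(
    tower_term_one_sided(radiant_hp),
    tower_term_one_sided(dire_hp),
)

print(doubled, single)  # 1200.0 600.0
```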

@Nostrademous
Collaborator Author

Added a parameter for tracking whether a unit "is attacking me" (normalized to [-0.5, 0.5]) and removed the facing_sin and facing_cos parameters, as they do not affect any of the actions we have enabled thus far (if we attack, it will just turn and attack...).

If we bring them back in the future, we should quantize them better, as they add a lot of state-space to search the way they were implemented.
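A minimal sketch of both ideas follows. The mapping of True/False to +0.5/-0.5 and the choice of 8 equal facing sectors are assumptions made for illustration; neither the function names nor the bin count come from the PR.

```python
def normalized_flag(is_attacking_me: bool) -> float:
    """Map a boolean to the [-0.5, 0.5] range (assumed: True -> 0.5, False -> -0.5)."""
    return 0.5 if is_attacking_me else -0.5


def quantize_facing(angle_deg: float, n_bins: int = 8) -> int:
    """Map a facing angle in degrees to one of n_bins equal sectors.

    The angle is wrapped into [0, 360) first, so negative angles and
    angles of 360 or more land in the expected sector.
    """
    bin_width = 360.0 / n_bins
    return int((angle_deg % 360.0) // bin_width)


print(normalized_flag(True), normalized_flag(False))
print([quantize_facing(a) for a in (0.0, 45.0, 359.0, -10.0, 360.0)])
```

A coarse sector index gives the policy a handful of discrete facing states instead of two continuous values, which is one way to limit the extra state space mentioned above.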

@Nostrademous
Collaborator Author

Apparently I always have access to all the buildings/towers of both allies and enemies in the unit-list protobuf. The data might not be valid, but the entries are present, so I added a way to filter them out.

@TimZaman
Owner

TimZaman commented Feb 4, 2019

I had some trouble ingesting this. Let me know when you've cleaned it up / it's RFR.

@Nostrademous
Collaborator Author

Sure, what is the ingest trouble? Sorry, I don't understand whether you mean "merging" or "understanding" or ???

@TimZaman
Owner

TimZaman commented Feb 4, 2019 via email

@Nostrademous
Collaborator Author

Nostrademous commented Feb 4, 2019

Okay, I added a few more comments, but I fear it will still be a lot. Here is what the code does:

  1. Created a new static function that separates the unit list embedded in the world-state protobuf into unit-type-specific groups. That way we can add more groups later (wards, courier, jungle creeps, etc.) without worrying about them in other calculations. This "should" also speed up execution: the 'unit_matrix()' calls no longer have to iterate over the entire unit list in the protobuf to build the basic unit info tensors, only over the list for the unit type in question (we have separate embeddings for each).

  2. It adds the 'TOWER' unit-type handles and list to the policy for evaluation and reasoning, so we have an understanding of tower health and other basic unit information. This was not the easiest thing to do: testing showed that the world-state protobuf includes information about all friendly and enemy towers, even enemy towers I can't see, which I was not expecting (I think it reports either the max value or the last seen value, but I need to test more). So I had to add code to target only the tier 1 mid tower for now, so that it stops trying to target the other towers, which it can't hit anyway because they are invulnerable. For the friendly tier 1 tower, I added code that prevents it from being a valid handle (by not including it in the unit handle list) if the tower is above 10% health.

  3. I modified the basic unit parameters by adding an 'is_targeting_me' parameter, a normalized boolean for whether that unit is currently right-clicking/attacking me. This, IMHO, should help the bots learn that they are being targeted/attacked, and by whom. Hopefully this info, along with the distance to each unit, helps them learn over time not to die to towers or creep attacks. I removed the facing sin/cos parameters as I don't think they add anything (but please let me know if I'm wrong... I can be really dull on some things at times).

  4. Everything else is just minor cleanup. I normalized the tower reward to [0.0, 3.0] by changing the divisor from 500. to 600. (since a tier 1 tower has 1800 health). I changed the win/loss rewards to 10. and -10., etc.
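The tower handling in items 2 and 4 can be sketched roughly as below. The `Tower` class, its fields, and the rule that only mid tier 1 towers are ever considered on either side are assumptions for illustration; only the 10% threshold, the 600. divisor, and the 1800 health figure come from the description above.

```python
from dataclasses import dataclass

TOWER_REWARD_DIVISOR = 600.0      # divisor stated in item 4
FRIENDLY_TARGET_THRESHOLD = 0.10  # friendly tower is targetable at or below 10% health


@dataclass
class Tower:
    """Hypothetical stand-in for a tower entry in the unit list."""
    handle: int
    is_enemy: bool
    lane: str
    tier: int
    health: float
    health_max: float = 1800.0  # tier 1 tower health per item 4


def is_valid_tower_target(tower: Tower) -> bool:
    """Decide whether a tower's handle should be offered to the policy."""
    if tower.tier != 1 or tower.lane != "mid":
        return False  # assumption: ignore everything but mid tier 1 towers
    if tower.is_enemy:
        return True
    # Friendly tower: only a valid handle once it is at or below 10% health.
    return tower.health <= FRIENDLY_TARGET_THRESHOLD * tower.health_max


def tower_reward(health: float) -> float:
    """Scale tower health so a full tier 1 tower maps to 3.0."""
    return health / TOWER_REWARD_DIVISOR


towers = [
    Tower(10, True, "mid", 1, 1800.0),
    Tower(11, True, "top", 1, 1800.0),
    Tower(20, False, "mid", 1, 1500.0),
    Tower(21, False, "mid", 1, 150.0),
]
valid_handles = [t.handle for t in towers if is_valid_tower_target(t)]
print(valid_handles)
print(tower_reward(1800.0))
```

Leaving a handle out of the list, rather than masking it later, means the policy never sees the invalid target as an option.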

@TimZaman
Owner

TimZaman commented Feb 4, 2019

btw did you rebase?

@Nostrademous
Collaborator Author

I did 23 hrs ago, and you haven’t made any commits since, so it is sitting on top of your HEAD.

@Nostrademous
Collaborator Author

Well... I'm going to have to rebase again and do some merge conflict cleanup. Probably in 1-2 hrs when I have time.

@Nostrademous
Collaborator Author

@TimZaman Okay, rebased on your latest commits. Hopefully I have explained it enough.

@TimZaman
Owner

TimZaman commented Feb 7, 2019

Hmm, so 90% is great, 5% is up for discussion, and 5% I disagree with. I think I'll merge it and then patch up. Or just patch up the MR directly. I guess I'll do that.

@TimZaman TimZaman merged commit 2b6f04d into TimZaman:master Feb 7, 2019
@Nostrademous Nostrademous deleted the improved_attention branch February 7, 2019 14:39