New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Occasional test failures in ranged_test.cpp #22064

Closed
kevingranade opened this Issue Oct 2, 2017 · 2 comments

Comments

Projects
None yet
2 participants
@kevingranade
Member

kevingranade commented Oct 2, 2017

Game version: Everything since 41e8209

Operating system: All that run tests.

Tiles or curses: N/A

Mods active: N/A

Expected behavior

All tests pass consistently.

Actual behavior

Very occasional (so far 3 / 270 executions in travis) failures in either the "unskilled shooter with an inaccurate pistol" or "unskilled shooter with an inaccurate smg" tests.

Steps to reproduce the behavior

Run ranged unit tests repeatedly.

Failures

https://travis-ci.org/CleverRaven/Cataclysm-DDA/jobs/281157304
https://travis-ci.org/CleverRaven/Cataclysm-DDA/jobs/282020818
https://travis-ci.org/CleverRaven/Cataclysm-DDA/jobs/280275187

@kevingranade

This comment has been minimized.

Show comment
Hide comment
@kevingranade

kevingranade Oct 2, 2017

Member

Most likely fix is to back out or adjust the penultimate balance tweak from #21468 (07b94a2), I had run the tests many times before that adjustment with no failures, and I probably just cut it too close.

Essentially I dropped the multiplier on the DEX penalty a tiny bit and increased the multiplier on the PER penalty a similarly small amount, just reversing this will probably fix the immediate problem, but some compromise value may also work.

Quick outline of the feedback from the test:

unskilled_shooter_accuracy
  an unskilled shooter with an inaccurate smg
-------------------------------------------------------------------------------
ranged_balance.cpp:218
...............................................................................

This is the name of the test section that failed, we spawn an "unskilled shooter" and give them an "inaccurate smg", then have them fire at a target at a specified range.

ranged_balance.cpp:195: FAILED:
  CHECK( fast_stats_upper[1].avg() < hit_rate_cap )
with expansion:
  0.72222f < 0.6f

This is the actual test results, we expect the average of "fast_stats_upper" to be lower than 0.6, but it isn't. This test is checking the hit rate of a "quickdraw", but it's failing because it's hitting too often.
"fast_stats" holds the results of a number of test shots at the target, specifically whether they hit or not (this excludes "grazing" hits).

with messages:
  Normal: [ 131 ]
  Linear: [ 30 27 20 606.708 ]
  Range: 3

This gives the basic outline of the test, Normal: [131] is the dispersion from the gun and ammo, Linear: [30 27 20 606.708 are the dispersions from marksmanship, Dex, hand encumbrance, and recoil. Range is obviously the engagement range.

  Max aim speed: 84.5
  Min aim speed: 5.82146

Max aim speed and min aim speed give feedback about how fast the player is aiming, if min aim speed is above 0 it means they haven't aimed fully, so they are constrained by the time provided to aim.

  shooter.weapon.gun_skill().str() := "smg"
  shooter.get_skill_level( shooter.weapon.gun_skill() ) := 0
  shooter.get_dex() := 8
  to_milliliter( shooter.weapon.volume() ) := 2750 (0xabe)

Some misc stats for troubleshooting, none are particularly pertinent here.

  fast_stats[1].n() := 100
  fast_stats[1].adj_wald_error() := 0.15319f
  fast_stats_upper[1].n() := 144
  fast_stats_upper[1].adj_wald_error() := 0.12042f

This is detail about the test results, it sampled 100 shots and estimates that there is 0.15 error for the lower test bound, and it sampled 144 times and estimates that there is 0.12 error for the upper bound (the one that failed).

What we can see from this is that the adjustment to PER had no effect on this test, since it isn't even close to full aim. Reverting the DEX multiplier back to its previous value would result in that penalty being 30 instead of 27 (not a particularly large value, that might not be enough TBH). Alternately some other change that increases the effective dispersion of this scenario without breaking other tests would resolve the issue.

As a last resort, we can bless the results and loosen the tests, but I don't think that's a good option.

Member

kevingranade commented Oct 2, 2017

Most likely fix is to back out or adjust the penultimate balance tweak from #21468 (07b94a2), I had run the tests many times before that adjustment with no failures, and I probably just cut it too close.

Essentially I dropped the multiplier on the DEX penalty a tiny bit and increased the multiplier on the PER penalty a similarly small amount, just reversing this will probably fix the immediate problem, but some compromise value may also work.

Quick outline of the feedback from the test:

unskilled_shooter_accuracy
  an unskilled shooter with an inaccurate smg
-------------------------------------------------------------------------------
ranged_balance.cpp:218
...............................................................................

This is the name of the test section that failed, we spawn an "unskilled shooter" and give them an "inaccurate smg", then have them fire at a target at a specified range.

ranged_balance.cpp:195: FAILED:
  CHECK( fast_stats_upper[1].avg() < hit_rate_cap )
with expansion:
  0.72222f < 0.6f

This is the actual test results, we expect the average of "fast_stats_upper" to be lower than 0.6, but it isn't. This test is checking the hit rate of a "quickdraw", but it's failing because it's hitting too often.
"fast_stats" holds the results of a number of test shots at the target, specifically whether they hit or not (this excludes "grazing" hits).

with messages:
  Normal: [ 131 ]
  Linear: [ 30 27 20 606.708 ]
  Range: 3

This gives the basic outline of the test, Normal: [131] is the dispersion from the gun and ammo, Linear: [30 27 20 606.708 are the dispersions from marksmanship, Dex, hand encumbrance, and recoil. Range is obviously the engagement range.

  Max aim speed: 84.5
  Min aim speed: 5.82146

Max aim speed and min aim speed give feedback about how fast the player is aiming, if min aim speed is above 0 it means they haven't aimed fully, so they are constrained by the time provided to aim.

  shooter.weapon.gun_skill().str() := "smg"
  shooter.get_skill_level( shooter.weapon.gun_skill() ) := 0
  shooter.get_dex() := 8
  to_milliliter( shooter.weapon.volume() ) := 2750 (0xabe)

Some misc stats for troubleshooting, none are particularly pertinent here.

  fast_stats[1].n() := 100
  fast_stats[1].adj_wald_error() := 0.15319f
  fast_stats_upper[1].n() := 144
  fast_stats_upper[1].adj_wald_error() := 0.12042f

This is detail about the test results, it sampled 100 shots and estimates that there is 0.15 error for the lower test bound, and it sampled 144 times and estimates that there is 0.12 error for the upper bound (the one that failed).

What we can see from this is that the adjustment to PER had no effect on this test, since it isn't even close to full aim. Reverting the DEX multiplier back to its previous value would result in that penalty being 30 instead of 27 (not a particularly large value, that might not be enough TBH). Alternately some other change that increases the effective dispersion of this scenario without breaking other tests would resolve the issue.

As a last resort, we can bless the results and loosen the tests, but I don't think that's a good option.

@Coolthulhu

This comment has been minimized.

Show comment
Hide comment
@Coolthulhu

Coolthulhu Oct 5, 2017

Contributor

Going for fully normal distribution could allow getting rid of RNG altogether.
Sum of two independent, normally distributed variables is a normally distributed variable with known mean and variance.
Most natural process results are better approximated with natural distribution than uniform one. For example, error in offset when holding a gun pointing at a target, barrel vibration, imperfections in production of gun or ammo and so on. A sum of those would approximate length of sum of 2D vector offsets (ie. sum of errors to aiming) than the current uniform distribution sum does.
Extreme outliers on any of the above are rare, but can happen. Small errors - on any individual "error type" - are more common than large ones.
By treating the MoA offset as a variable with folded normal distribution (0 mean), it could be analyzed with good precision, with relatively simple formulas that take nanoseconds to compute, instead of giant brute force synthetic tests that treat everything as a total black box.

Natural distributions are a bit too realistic at times and could make aiming at distant targets too easy and aiming at targets close up too hard, but those aren't huge errors. Plus, they would indirectly fix current lack of sniping at lower levels that way.

Contributor

Coolthulhu commented Oct 5, 2017

Going for fully normal distribution could allow getting rid of RNG altogether.
Sum of two independent, normally distributed variables is a normally distributed variable with known mean and variance.
Most natural process results are better approximated with natural distribution than uniform one. For example, error in offset when holding a gun pointing at a target, barrel vibration, imperfections in production of gun or ammo and so on. A sum of those would approximate length of sum of 2D vector offsets (ie. sum of errors to aiming) than the current uniform distribution sum does.
Extreme outliers on any of the above are rare, but can happen. Small errors - on any individual "error type" - are more common than large ones.
By treating the MoA offset as a variable with folded normal distribution (0 mean), it could be analyzed with good precision, with relatively simple formulas that take nanoseconds to compute, instead of giant brute force synthetic tests that treat everything as a total black box.

Natural distributions are a bit too realistic at times and could make aiming at distant targets too easy and aiming at targets close up too hard, but those aren't huge errors. Plus, they would indirectly fix current lack of sniping at lower levels that way.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment