Feature Request: means of troubleshooting intermittent failures that do not depend on sequence #3030

brandondrew · 2023-04-14T20:44:23Z

Subject of the issue

Feature Request: means of gathering statistics on intermittent tests that do not depend on sequence

Your environment

Ruby version: 2.7.8
rspec-core version: 3.9.3

Steps to reproduce

I'm aware of rspec --bisect for isolating specs that fail intermittently based on sequence of examples run. But I have a case in a legacy app (I've never worked in this codebase before) where there are specs failing intermittently even when run in isolation—so sequence is not a factor.

In other words, if I run

rspec ./spec/services/locate_missing_approver_service_spec.rb:5

it will fail perhaps 1 time in 10. That 10% figure is a guess based on my experience of running this, but it might be useful if I had a way of accurately (and quickly, and not manually) gathering statistics like that, so I could use git bisect to determine where the intermittent failures begin (or if they were always present since the beginning of that spec).

At the moment I'm manually running a function to check whether the failures occur in any given commit or with any temporarily changed code, and I could extend it to gather stats, but I'm not sure if that is the best path to go down.

function test-flaky-spec() {
  while true; do
    rspec ./spec/services/locate_missing_approver_service_spec.rb:5
    if [[ $? -ne 0 ]]; then
      break
    fi
    sleep 0.1
  done
}

If gathering statistics is the best means of troubleshooting that we can hope for, it would be nice to have some options built into RSpec, such as those illustrated here, specifying how many times to run the spec and where to save the data:

rspec --loop=100 --stats=missing_approver.csv ./spec/services/locate_missing_approver_service_spec.rb:5
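
Until something like that exists, the same idea can be approximated outside RSpec. The following is only a rough sketch, not an RSpec feature: the method name `gather_flaky_stats`, its parameters, and the CSV columns are all made up for illustration. It runs a command a fixed number of times, writes one row per run, and returns a pass/fail tally.

```ruby
# Hypothetical helper (not part of RSpec): runs `command` (an argv-style
# array) `runs` times, writes one CSV row per run to `csv_path`, and
# returns a tally such as { passed: 91, failed: 9 }.
def gather_flaky_stats(command, runs:, csv_path:)
  tally = { passed: 0, failed: 0 }
  File.open(csv_path, "w") do |csv|
    csv.puts "run,status,seconds"
    runs.times do |i|
      started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
      # system returns true on exit status 0, false on a non-zero status,
      # and nil if the command could not be started; anything but true
      # is counted as a failure here.
      ok = system(*command, out: File::NULL, err: File::NULL)
      elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
      status = ok ? :passed : :failed
      tally[status] += 1
      csv.puts [i + 1, status, elapsed.round(3)].join(",")
    end
  end
  tally
end

# Example call (spec path taken from the report above):
#   gather_flaky_stats(
#     ["rspec", "./spec/services/locate_missing_approver_service_spec.rb:5"],
#     runs: 100, csv_path: "missing_approver.csv"
#   )
```

Since `git bisect run` judges a commit by the exit status of the command it is given, a small wrapper that exits non-zero whenever `tally[:failed]` is positive could probably drive the bisect, with the caveat that a low failure rate needs enough runs per commit to avoid marking a bad commit as good.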

But perhaps there is something better than such statistics, and those with more experience diagnosing this sort of problem can offer better ways to support such troubleshooting—I'm sorry that this feature request is still somewhat abstract.

Expected behavior

Some means of diagnosing intermittent failures (those which don't depend on sequence of examples, that is).

Actual behavior

As far as I can see, we're pretty much on our own in this situation. (But maybe I'm overlooking something! I checked rspec --help on the latest version and I don't see anything there either, so I don't think I'm overlooking anything.)

JonRowe · 2023-04-15T07:19:00Z

There are a couple of gems that offer functionality in this area. rspec-retry (https://github.com/NoRedInk/rspec-retry) is the closest to what you want (there are also rspec-rerun and respec), and it looks like you could build what you want using it.

brandondrew · 2023-04-15T17:44:48Z

@JonRowe Thanks for the response!

If I understand correctly,

- those projects are only (or mainly) focused just on rerunning the specs, not on tracking stats or anything else
- they seem to assume that if a spec passes even once, then the failure is due to problems with the spec and not the code it's describing???
- re-running specs and gathering stats is probably the necessary first step to figuring out why they're failing, and there's no other generally accepted practice for solving such problems?

JonRowe · 2023-04-17T07:52:49Z

> those projects are only (or mainly) focused just on rerunning the specs, not on tracking stats or anything else

Correct, but the one I linked to gives you the hooks required to gather stats yourself.

> they seem to assume that if a spec passes even once, then the failure is due to problems with the spec and not the code it's describing???

I'm not sure what you mean by this. They were created to handle intermittent failures (also called flaky tests or specs) where the failure is considered a transitory artifact: the test usually passes and can reasonably be believed to be correct, but it occasionally fails for other reasons such as page load, service unavailability, or race conditions.

> re-running specs and gathering stats is probably the necessary first step to figuring out why they're failing, and there's no other generally accepted practice for solving such problems?

To my knowledge, nothing does exactly what you want. A CI service (Buildkite) does offer a test analytics add-on that does something similar.

RSpec does record the status of the examples it runs as part of its example persistence feature (which facilitates manual re-runs of failing specs), but that's aimed at easy iteration over failed specs. You could probably hook into that to record statistics by processing the file after each spec run.
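
As a rough sketch of that last suggestion: the persistence file is enabled with `config.example_status_persistence_file_path`, and the code below assumes it is a pipe-delimited text table with `example_id`, `status`, and `run_time` columns. Treat that layout as an assumption and check it against the file your RSpec version writes. The helper names `parse_statuses` and `accumulate` are made up for illustration.

```ruby
# Hypothetical helpers (not part of RSpec). Assumes the persistence file is
# a pipe-delimited table whose first two columns are example_id and status.

# Parses the file's text into { example_id => status }.
def parse_statuses(text)
  text.each_line.with_object({}) do |line, statuses|
    id, status = line.split("|").map(&:strip)
    # Skip blank lines, the header row, and the dashed separator row.
    next if status.nil? || id.empty? || id == "example_id" || id.match?(/\A-+\z/)
    statuses[id] = status
  end
end

# Folds one run's statuses into a running history of the form
# { example_id => { status => count } } and returns the history.
def accumulate(history, statuses)
  statuses.each do |id, status|
    history[id] ||= Hash.new(0)
    history[id][status] += 1
  end
  history
end

# After each spec run, something like:
#   history = accumulate(history, parse_statuses(File.read("spec/examples.txt")))
```

One thing to keep in mind is that the file keeps the last known status of examples that were not part of the latest run, so the counts are only meaningful for examples that actually ran each time.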

benoittgt · 2023-06-21T22:45:46Z

As @JonRowe mentioned, there is also Test Insight from CircleCI, which offers something quite similar: stats on flaky tests.

Maybe you could also record the state of your data whenever it looks wrong. This basic code has helped me a few times.

RSpec.configure do |config|
  config.around(:each) do |example|
    # Log any example that starts with User records left over from elsewhere.
    unless User.count.zero?
      File.open('tmp/leaky.txt', 'a') do |f|
        f << "Location is: #{example.location} with #{User.count} and ids: #{User.ids}\n"
      end
    end

    example.run
  end
end

Another option may be to use register_listener, as mentioned.

class ExampleListener
  # Appends one semicolon-separated line per finished example:
  # description;location;status
  def example_finished(notification)
    example = notification.example
    File.open('tmp/stats.csv', 'a') do |f|
      f << "#{example.description};#{example.location};#{example.execution_result.status}\n"
    end
  end
end

RSpec.configure do |c|
  c.reporter.register_listener ExampleListener.new, :example_finished
end

RSpec.describe 'random specs' do
  it "passes" do
    expect(true).to eq(true)
  end

  it "passes too" do
    expect(true).to eq(true)
  end

  it "fails" do
    expect(true).to eq(false)
  end
end

More examples in the spec file.
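
Once the listener has appended lines over many runs, the file can be reduced to a failure rate per example. This is only a sketch that assumes the `description;location;status` line format written by the listener above; the method name `failure_rates` is made up for illustration.

```ruby
# Hypothetical helper: turns lines of "description;location;status" into
# { location => fraction of runs that failed }.
def failure_rates(lines)
  counts = Hash.new { |hash, key| hash[key] = { runs: 0, failed: 0 } }
  lines.each do |line|
    # Split from the right so a semicolon inside the description is harmless.
    rest, _, status = line.chomp.rpartition(";")
    _description, _, location = rest.rpartition(";")
    next if location.empty? # blank or malformed line
    counts[location][:runs] += 1
    counts[location][:failed] += 1 if status == "failed"
  end
  counts.transform_values { |c| c[:failed].fdiv(c[:runs]) }
end

# Example call:
#   failure_rates(File.foreach("tmp/stats.csv"))
```

Keying on location rather than description avoids merging examples that happen to share a description, though locations shift when the spec file is edited between runs.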
