How to deal with flaky tests #15

jwoodrow · 2021-09-03T08:02:41Z

When in presence of a flaky test spec-tracer will completely negate the information it gives (it's flakiness) because once the test is a success it will not be run until one of the dependent files is run again.

I know flaky tests is a difficult subject to deal with because if the flaky test just works on the first pass it will remain hidden but at least if it fails on the first run I think it is possible to detect this as a flaky test and keep running it (and maybe even inform the user that there are flaky tests in the test suite)

I wrote this quickly without fully testing my reproduction steps because this is happening in my work project and I cannot simply copy private company code as an example but it should work out the same in theory.

Another way to fix this more simply would be to add the possibility to configure a "always run these tests" option which would allow us to run critical tests every time no matter what the cache says.

Expected behavior

If a test fails and then succeeds on a rerun with no change in the dependencies then the test should always be run.

Actual behavior

Once the flaky test succeeds it will not be rerun

Steps to reproduce the problem

The model

class FlakyModel < ApplicationRecord
	def flaky_method
		Random.rand(2)
	end

	def predictable_method
		1
	end
end

The spec

require 'rails_helper'

RSpec.describe FlakyModel, type: :model do
	let(:instance) { described_class.new }

	it 'is flaky' do
		expect(instance.flaky_method).to eq(0)
	end

	it 'is not flaky' do
		expect(instance.predictable_method).to eq(1)
	end
end

Run specs until failure on flaky test with RSPEC_TRACER_NO_SKIP=true
Rerun until flaky test succeeds without RSPEC_TRACER_NO_SKIP=true
Rerun again and notice nothing will run

RSpec Tracer version

ruby version 3.0.1
rspec-core version 3.10.1
rspec-tracer version 0.4.0

The text was updated successfully, but these errors were encountered:

avmnu-sng · 2021-09-03T14:16:30Z

@jwoodrow It makes sense to not skip flaky tests. I'll work on it to provide support for the flaky tests.

jwoodrow · 2021-09-03T14:24:42Z

Thanks !

I'll keep on eye on the situation. On another note I've set this up in the CI environment my company uses and we've cut down from 1h30 to run tests on travis down to 20 minutes 👌

Just one question though, is it normal for the initial load to take upwards of 10 minutes before running any tests ? My guess is that because of the sheer size of our project (and the number of files) it takes a long time to check the rspec-tracer cache ?

avmnu-sng · 2021-09-03T14:29:42Z

Awesome. That's a significant drop. 🎉

I noticed that I was downloading all the available cache from S3 for a commit SHA. Working on a release #16 to fix. Let's check the timing with this fix. Otherwise, I am not able to think of what might have increased the initial load time to 10 minutes (It's not normal), unless the cache files size is huge.

If it doesn't help, I'll dig deeper.

jwoodrow · 2021-09-03T14:35:41Z

We aren't using S3 for the build cache but the built in cache from travis

cache:
  yarn: true
  bundler: true
  directories:
    - node_modules
    - vendor/bundle
    - public/assets
    - public/packs-test
    - rspec_tracer_cache

it works just fine like this but it's the initial load that takes upwards of 10 minutes even locally (I should have said so earlier). I don't know if there's a way for you to optimise this though 🤷‍♂️

unless the cache files size is huge

Well... It does take up 66MB 🙃

avmnu-sng · 2021-09-03T14:42:56Z

It would need some digging then:

Do you mean bundle exec rspec before starting the execution takes around 10 minutes?
Is it a collective 66MB for all the files under rspec_tracer_cache?
If you're using rspec_tracer_cache on Travis, how would it find the best cache for you?

jwoodrow · 2021-09-03T15:00:48Z

Do you mean bundle exec rspec before starting the execution takes around 10 minutes?

Basically it takes ~10 minutes to go from this line to the next

RSpec tracer loaded cache from /home/travis/build/path_to_project/rspec_tracer_cache/6783bc6a970fee584760b32d3498355b

RSpec tracer is running xx examples (actual: xxxx, skipped: xxxx)

but when running things locally it takes much less time to do this same step (3m30 basically) so maybe this is also linked to the limited performance available using CI tools

Is it a collective 66MB for all the files under rspec_tracer_cache?

Yes the whole folder is 66MB

If you're using rspec_tracer_cache on Travis, how would it find the best cache for you?

It chooses the cache linked to the branch AFAIK so it should always choose the best cache, no ? And since this is currently only on a single branche there is only one cache as it would be the case locally

Travis CI can cache content that does not often change, to speed up your build process. To use the caching feature, in your repository settings, set Build pushed branches to ON.

Travis CI fetches the cache for every build, including branches and pull requests.

If a branch does not have its own cache, Travis CI fetches the cache of the repository’s default branch.

There is one cache per branch and language version/ compiler version/ JDK version/ Gemfile location/ etc.

Only modifications made to the cached directories from normal pushes are stored.

avmnu-sng · 2021-09-03T15:13:25Z

Basically, it takes ~10 minutes to go from this line to the next

We are talking about https://github.com/avmnu-sng/rspec-tracer/blob/main/lib/rspec_tracer/runner.rb#L148 taking ~10 minutes, which doesn't seem right. It is iterating on the dependencies and finding if anything changed. I'll see if something can be optimized here.

Is the cache size ~ 66 MB on local as well? Or is it growing on the CI only? Also, can you check if the cache only has the files from the project and not vendor stuff? Ideally, it should not.

jwoodrow · 2021-09-03T17:21:04Z

Is the cache size ~ 66 MB on local as well? Or is it growing on the CI only? Also, can you check if the cache only has the files from the project and not vendor stuff? Ideally, it should not.

The cache is 66MB locally, can't really check in details for travis but it should be the same size since travis cache is really just re-adding the folder where it was as it was and when I run multiple times the specs locally the cache doesn't get bigger so it should be the same for travis.

Also, can you check if the cache only has the files from the project and not vendor stuff? Ideally, it should not.

I've checked briefly by searching the files all_files.json (389KB) and dependency.json (12.1MB) and I can't find anything not relating to our app's code 🤔

jwoodrow · 2021-09-04T09:18:47Z

Hi @avmnu-sng

I've looked more into my travis issues and noticed that on travis vendor was being added to the examples_coverage.json file which made it take up nearly a whole Gig uncompressed. I've changed the add filters to avoir coverage of these files so hopefully this addresses this situation. Maybe it should be stressed somewhere in the documentation that it is recommended not to exclude vendor from the coverage for rspec-tracer and that behaviour can differ between CI and local when it comes to these folders.

Another thing is the load could be greatly increased I think by introducing a breaking change and moving away from json and using messagepack instead here is an old benchmark I found when it came to sereliazing content and it is also faster when it comes to deserializing. It would also decrease the cache size a significant amount.

Finally would it be possible to add a "single cache history" mode ? Because currently it seems like it acumulates caches for each commit, no ? I think most usages especially when using CI can use a single cache solution since the worst case scenario would be that a few tests would be rerun but even then I don't think this would happen 🤔

avmnu-sng · 2021-09-04T10:05:37Z

@jwoodrow It's a miss on my end, I forgot to add /vendor/bundle in the add_coverage_filter callback https://github.com/avmnu-sng/rspec-tracer/blob/main/lib/rspec_tracer/defaults.rb#L4. I'll add it.

Finally would it be possible to add a "single cache history" mode ? Because currently it seems like it acumulates caches for each commit, no ?

The caching is based on test_suites and doesn't care about commits. Basically, if you're running all the examples and there are no parallel build jobs then only one cache file set is created and it keeps it updated to make sure none of the examples are missed from the current suite. In case, you have multiple CI builds, the gem will only maintain a cache for respective suites.

https://github.com/avmnu-sng/rspec-tracer#test_suite_id

Another thing is the load could be greatly increased I think by introducing a breaking change and moving away from json and using messagepack instead

Thanks for the suggestion. I'll explore it and see how much difference it makes in the future. For now, I am focusing on making the bug-free and providing support for the common use cases.

avmnu-sng · 2021-09-04T10:44:38Z

@jwoodrow Can you try it out using gem 'rspec-tracer', github: 'avmnu-sng/rspec-tracer' before I release? Please delete the existing Travis caches. It has fixes for:

flaky tests
load time
vendor files in coverage

jwoodrow · 2021-09-04T11:33:20Z

The caching is based on test_suites and doesn't care about commits. Basically, if you're running all the examples and there are no parallel build jobs then only one cache file set is created and it keeps it updated to make sure none of the examples are missed from the current suite. In case, you have multiple CI builds, the gem will only maintain a cache for respective suites.

All the tests are run as one test suite with no test_suite_id being set which is why I'm confused as to why I'm getting two subfolders in the rspec_tracer_cache folder 🤔 I could set this manually to be a constant ID maybe ?

@jwoodrow Can you try it out using gem 'rspec-tracer', github: 'avmnu-sng/rspec-tracer' before I release? Please delete the existing Travis caches. It has fixes for:

I'll try it out soon and let you know

avmnu-sng · 2021-09-04T11:54:16Z

@jwoodrow Can you share the folder structure? Probably run tree rspec_tracer_cache. Also the TEST_SUITE_ID is optional.

jwoodrow · 2021-09-04T11:59:38Z

Can you share the folder structure? Probably run tree rspec_tracer_cache.

I just deleted the folder both locally and on travis to try out the main branch, but basically it was something like this

rspec_tracer_cache
- hash
  - all_examples.json
  - examples_coverage.json
  - dependency.json
  - etc
- different_hash
  - all_examples.json
  - examples_coverage.json
  - dependency.json
  - etc
- last_run.json

avmnu-sng · 2021-09-04T12:33:22Z

It's fine. It won't accumulate these two hashes. The hash is basically combined md5 of all the examples run. It only uses the hash in the last_run.json file.

avmnu-sng self-assigned this Sep 3, 2021

avmnu-sng mentioned this issue Sep 3, 2021

Significant Increase in the Load Time #17

Closed

avmnu-sng linked a pull request Sep 4, 2021 that will close this issue

Add Support for Flaky Tests Detection #19

Merged

avmnu-sng closed this as completed in #19 Sep 4, 2021

avmnu-sng mentioned this issue Sep 4, 2021

Examples coverage has vendor files #20

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

How to deal with flaky tests #15

How to deal with flaky tests #15

jwoodrow commented Sep 3, 2021 •

edited

Loading

avmnu-sng commented Sep 3, 2021 •

edited

Loading

jwoodrow commented Sep 3, 2021

avmnu-sng commented Sep 3, 2021

jwoodrow commented Sep 3, 2021 •

edited

Loading

avmnu-sng commented Sep 3, 2021 •

edited

Loading

jwoodrow commented Sep 3, 2021 •

edited

Loading

avmnu-sng commented Sep 3, 2021 •

edited

Loading

jwoodrow commented Sep 3, 2021

jwoodrow commented Sep 4, 2021 •

edited

Loading

avmnu-sng commented Sep 4, 2021 •

edited

Loading

avmnu-sng commented Sep 4, 2021

jwoodrow commented Sep 4, 2021

avmnu-sng commented Sep 4, 2021 •

edited

Loading

jwoodrow commented Sep 4, 2021 •

edited

Loading

avmnu-sng commented Sep 4, 2021

How to deal with flaky tests #15

How to deal with flaky tests #15

Comments

jwoodrow commented Sep 3, 2021 • edited Loading

Expected behavior

Actual behavior

Steps to reproduce the problem

RSpec Tracer version

avmnu-sng commented Sep 3, 2021 • edited Loading

jwoodrow commented Sep 3, 2021

avmnu-sng commented Sep 3, 2021

jwoodrow commented Sep 3, 2021 • edited Loading

avmnu-sng commented Sep 3, 2021 • edited Loading

jwoodrow commented Sep 3, 2021 • edited Loading

avmnu-sng commented Sep 3, 2021 • edited Loading

jwoodrow commented Sep 3, 2021

jwoodrow commented Sep 4, 2021 • edited Loading

avmnu-sng commented Sep 4, 2021 • edited Loading

avmnu-sng commented Sep 4, 2021

jwoodrow commented Sep 4, 2021

avmnu-sng commented Sep 4, 2021 • edited Loading

jwoodrow commented Sep 4, 2021 • edited Loading

avmnu-sng commented Sep 4, 2021

jwoodrow commented Sep 3, 2021 •

edited

Loading

avmnu-sng commented Sep 3, 2021 •

edited

Loading

jwoodrow commented Sep 3, 2021 •

edited

Loading

avmnu-sng commented Sep 3, 2021 •

edited

Loading

jwoodrow commented Sep 3, 2021 •

edited

Loading

avmnu-sng commented Sep 3, 2021 •

edited

Loading

jwoodrow commented Sep 4, 2021 •

edited

Loading

avmnu-sng commented Sep 4, 2021 •

edited

Loading

avmnu-sng commented Sep 4, 2021 •

edited

Loading

jwoodrow commented Sep 4, 2021 •

edited

Loading