Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

How to deal with flaky tests #15

Closed
jwoodrow opened this issue Sep 3, 2021 · 15 comments · Fixed by #19
Closed

How to deal with flaky tests #15

jwoodrow opened this issue Sep 3, 2021 · 15 comments · Fixed by #19
Assignees

Comments

@jwoodrow
Copy link

jwoodrow commented Sep 3, 2021

When in presence of a flaky test spec-tracer will completely negate the information it gives (it's flakiness) because once the test is a success it will not be run until one of the dependent files is run again.

I know flaky tests is a difficult subject to deal with because if the flaky test just works on the first pass it will remain hidden but at least if it fails on the first run I think it is possible to detect this as a flaky test and keep running it (and maybe even inform the user that there are flaky tests in the test suite)

I wrote this quickly without fully testing my reproduction steps because this is happening in my work project and I cannot simply copy private company code as an example but it should work out the same in theory.

Another way to fix this more simply would be to add the possibility to configure a "always run these tests" option which would allow us to run critical tests every time no matter what the cache says.


Expected behavior

If a test fails and then succeeds on a rerun with no change in the dependencies then the test should always be run.

Actual behavior

Once the flaky test succeeds it will not be rerun

Steps to reproduce the problem

  1. The model
class FlakyModel < ApplicationRecord
	def flaky_method
		Random.rand(2)
	end

	def predictable_method
		1
	end
end
  1. The spec
require 'rails_helper'

RSpec.describe FlakyModel, type: :model do
	let(:instance) { described_class.new }

	it 'is flaky' do
		expect(instance.flaky_method).to eq(0)
	end

	it 'is not flaky' do
		expect(instance.predictable_method).to eq(1)
	end
end
  1. Run specs until failure on flaky test with RSPEC_TRACER_NO_SKIP=true
  2. Rerun until flaky test succeeds without RSPEC_TRACER_NO_SKIP=true
  3. Rerun again and notice nothing will run

RSpec Tracer version

ruby version 3.0.1
rspec-core version 3.10.1
rspec-tracer version 0.4.0

@avmnu-sng avmnu-sng self-assigned this Sep 3, 2021
@avmnu-sng
Copy link
Owner

avmnu-sng commented Sep 3, 2021

@jwoodrow It makes sense to not skip flaky tests. I'll work on it to provide support for the flaky tests.

@jwoodrow
Copy link
Author

jwoodrow commented Sep 3, 2021

Thanks !

I'll keep on eye on the situation. On another note I've set this up in the CI environment my company uses and we've cut down from 1h30 to run tests on travis down to 20 minutes 👌

Just one question though, is it normal for the initial load to take upwards of 10 minutes before running any tests ? My guess is that because of the sheer size of our project (and the number of files) it takes a long time to check the rspec-tracer cache ?

@avmnu-sng
Copy link
Owner

Awesome. That's a significant drop. 🎉

I noticed that I was downloading all the available cache from S3 for a commit SHA. Working on a release #16 to fix. Let's check the timing with this fix. Otherwise, I am not able to think of what might have increased the initial load time to 10 minutes (It's not normal), unless the cache files size is huge.

If it doesn't help, I'll dig deeper.

@jwoodrow
Copy link
Author

jwoodrow commented Sep 3, 2021

We aren't using S3 for the build cache but the built in cache from travis

cache:
  yarn: true
  bundler: true
  directories:
    - node_modules
    - vendor/bundle
    - public/assets
    - public/packs-test
    - rspec_tracer_cache

it works just fine like this but it's the initial load that takes upwards of 10 minutes even locally (I should have said so earlier). I don't know if there's a way for you to optimise this though 🤷‍♂️

unless the cache files size is huge

Well... It does take up 66MB 🙃

@avmnu-sng
Copy link
Owner

avmnu-sng commented Sep 3, 2021

It would need some digging then:

  • Do you mean bundle exec rspec before starting the execution takes around 10 minutes?
  • Is it a collective 66MB for all the files under rspec_tracer_cache?
  • If you're using rspec_tracer_cache on Travis, how would it find the best cache for you?

@jwoodrow
Copy link
Author

jwoodrow commented Sep 3, 2021

  • Do you mean bundle exec rspec before starting the execution takes around 10 minutes?

Basically it takes ~10 minutes to go from this line to the next

RSpec tracer loaded cache from /home/travis/build/path_to_project/rspec_tracer_cache/6783bc6a970fee584760b32d3498355b

RSpec tracer is running xx examples (actual: xxxx, skipped: xxxx)

but when running things locally it takes much less time to do this same step (3m30 basically) so maybe this is also linked to the limited performance available using CI tools

  • Is it a collective 66MB for all the files under rspec_tracer_cache?

Yes the whole folder is 66MB

  • If you're using rspec_tracer_cache on Travis, how would it find the best cache for you?

It chooses the cache linked to the branch AFAIK so it should always choose the best cache, no ? And since this is currently only on a single branche there is only one cache as it would be the case locally

Travis CI can cache content that does not often change, to speed up your build process. To use the caching feature, in your repository settings, set Build pushed branches to ON.

  • Travis CI fetches the cache for every build, including branches and pull requests.
  • If a branch does not have its own cache, Travis CI fetches the cache of the repository’s default branch.
  • There is one cache per branch and language version/ compiler version/ JDK version/ Gemfile location/ etc.
  • Only modifications made to the cached directories from normal pushes are stored.

@avmnu-sng
Copy link
Owner

avmnu-sng commented Sep 3, 2021

Basically, it takes ~10 minutes to go from this line to the next

We are talking about https://github.com/avmnu-sng/rspec-tracer/blob/main/lib/rspec_tracer/runner.rb#L148 taking ~10 minutes, which doesn't seem right. It is iterating on the dependencies and finding if anything changed. I'll see if something can be optimized here.

Is the cache size ~ 66 MB on local as well? Or is it growing on the CI only? Also, can you check if the cache only has the files from the project and not vendor stuff? Ideally, it should not.

@jwoodrow
Copy link
Author

jwoodrow commented Sep 3, 2021

Is the cache size ~ 66 MB on local as well? Or is it growing on the CI only? Also, can you check if the cache only has the files from the project and not vendor stuff? Ideally, it should not.

The cache is 66MB locally, can't really check in details for travis but it should be the same size since travis cache is really just re-adding the folder where it was as it was and when I run multiple times the specs locally the cache doesn't get bigger so it should be the same for travis.

Also, can you check if the cache only has the files from the project and not vendor stuff? Ideally, it should not.

I've checked briefly by searching the files all_files.json (389KB) and dependency.json (12.1MB) and I can't find anything not relating to our app's code 🤔

@avmnu-sng avmnu-sng linked a pull request Sep 4, 2021 that will close this issue
@jwoodrow
Copy link
Author

jwoodrow commented Sep 4, 2021

Hi @avmnu-sng

I've looked more into my travis issues and noticed that on travis vendor was being added to the examples_coverage.json file which made it take up nearly a whole Gig uncompressed. I've changed the add filters to avoir coverage of these files so hopefully this addresses this situation. Maybe it should be stressed somewhere in the documentation that it is recommended not to exclude vendor from the coverage for rspec-tracer and that behaviour can differ between CI and local when it comes to these folders.

Another thing is the load could be greatly increased I think by introducing a breaking change and moving away from json and using messagepack instead here is an old benchmark I found when it came to sereliazing content and it is also faster when it comes to deserializing. It would also decrease the cache size a significant amount.

Finally would it be possible to add a "single cache history" mode ? Because currently it seems like it acumulates caches for each commit, no ? I think most usages especially when using CI can use a single cache solution since the worst case scenario would be that a few tests would be rerun but even then I don't think this would happen 🤔

@avmnu-sng
Copy link
Owner

avmnu-sng commented Sep 4, 2021

@jwoodrow It's a miss on my end, I forgot to add /vendor/bundle in the add_coverage_filter callback https://github.com/avmnu-sng/rspec-tracer/blob/main/lib/rspec_tracer/defaults.rb#L4. I'll add it.

Finally would it be possible to add a "single cache history" mode ? Because currently it seems like it acumulates caches for each commit, no ?

The caching is based on test_suites and doesn't care about commits. Basically, if you're running all the examples and there are no parallel build jobs then only one cache file set is created and it keeps it updated to make sure none of the examples are missed from the current suite. In case, you have multiple CI builds, the gem will only maintain a cache for respective suites.

https://github.com/avmnu-sng/rspec-tracer#test_suite_id

Another thing is the load could be greatly increased I think by introducing a breaking change and moving away from json and using messagepack instead

Thanks for the suggestion. I'll explore it and see how much difference it makes in the future. For now, I am focusing on making the bug-free and providing support for the common use cases.

@avmnu-sng
Copy link
Owner

@jwoodrow Can you try it out using gem 'rspec-tracer', github: 'avmnu-sng/rspec-tracer' before I release? Please delete the existing Travis caches. It has fixes for:

  • flaky tests
  • load time
  • vendor files in coverage

@jwoodrow
Copy link
Author

jwoodrow commented Sep 4, 2021

The caching is based on test_suites and doesn't care about commits. Basically, if you're running all the examples and there are no parallel build jobs then only one cache file set is created and it keeps it updated to make sure none of the examples are missed from the current suite. In case, you have multiple CI builds, the gem will only maintain a cache for respective suites.

All the tests are run as one test suite with no test_suite_id being set which is why I'm confused as to why I'm getting two subfolders in the rspec_tracer_cache folder 🤔 I could set this manually to be a constant ID maybe ?

@jwoodrow Can you try it out using gem 'rspec-tracer', github: 'avmnu-sng/rspec-tracer' before I release? Please delete the existing Travis caches. It has fixes for:

I'll try it out soon and let you know

@avmnu-sng
Copy link
Owner

avmnu-sng commented Sep 4, 2021

@jwoodrow Can you share the folder structure? Probably run tree rspec_tracer_cache. Also the TEST_SUITE_ID is optional.

@jwoodrow
Copy link
Author

jwoodrow commented Sep 4, 2021

Can you share the folder structure? Probably run tree rspec_tracer_cache.

I just deleted the folder both locally and on travis to try out the main branch, but basically it was something like this

  • rspec_tracer_cache
    • hash
      • all_examples.json
      • examples_coverage.json
      • dependency.json
      • etc
    • different_hash
      • all_examples.json
      • examples_coverage.json
      • dependency.json
      • etc
    • last_run.json

@avmnu-sng
Copy link
Owner

It's fine. It won't accumulate these two hashes. The hash is basically combined md5 of all the examples run. It only uses the hash in the last_run.json file.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants