Running the tests currently takes 6-10 minutes per configuration (OS + Python version), mostly because some tests run actual jobs and wait for them to complete. We should separate these end-to-end (e2e) tests from the rest and run them on at most one configuration.
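
One possible way to do the separation, sketched with only the standard library: gate the e2e tests behind an environment variable so they are skipped unless a configuration opts in. The variable name `RUN_E2E`, the `e2e` decorator, and the test classes below are all hypothetical and not taken from this project. The same idea could be expressed with the test runner's own marker or tagging mechanism.

```python
import os
import unittest


def e2e_enabled(environ=None):
    """Return True when the (hypothetical) RUN_E2E variable is set to "1"."""
    environ = os.environ if environ is None else environ
    return environ.get("RUN_E2E") == "1"


# Hypothetical decorator: marks a test class as e2e and skips it unless opted in.
e2e = unittest.skipUnless(e2e_enabled(), "e2e tests run only when RUN_E2E=1")


class FastTests(unittest.TestCase):
    """Stand-in for the quick unit tests that run on every configuration."""

    def test_addition(self):
        self.assertEqual(1 + 1, 2)


@e2e
class JobEndToEndTests(unittest.TestCase):
    """Stand-in for the slow tests that submit a job and wait for it."""

    def test_job_completes(self):
        # Placeholder body: a real test would submit a job and poll for completion.
        self.assertTrue(True)


def run_suite():
    """Run both classes quietly and return the unittest result object."""
    loader = unittest.TestLoader()
    suite = unittest.TestSuite()
    suite.addTests(loader.loadTestsFromTestCase(FastTests))
    suite.addTests(loader.loadTestsFromTestCase(JobEndToEndTests))
    with open(os.devnull, "w") as sink:
        return unittest.TextTestRunner(stream=sink).run(suite)


if __name__ == "__main__":
    result = run_suite()
    print(f"ran={result.testsRun} skipped={len(result.skipped)}")
```

With this layout, every CI configuration would run the fast tests, and only the single configuration that exports `RUN_E2E=1` would pay the cost of the job-running tests.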