CI: Run tests on GHA #139
Conversation
```yaml
- pip:
  - herbie-data
```
I've used the `environment-test.yml` file, because it was already pretty close to what was needed for bootstrapping a Conda environment on CI, modulo this change, and adding `pytest`.

I see that you might have been using it differently, because it previously installed the `herbie-data` package from PyPI. However, on CI, when running code from a pull request, the code at this very Git ref of the repository should be used instead of installing the package from PyPI.

Let me know if you can adjust your workflow, or if you have different suggestions.
.github/workflows/tests-python.yml
Outdated
```yaml
strategy:
  fail-fast: false
  matrix:
    os: ["ubuntu-latest"]
    python-version: ["3.9", "3.10", "3.11"]
    # In order to save resources, only run particular
    # matrix slots on other OS than Linux.
    include:
      - os: "macos-latest"
        python-version: "3.11"
      #- os: "windows-latest"
      #  python-version: "3.11"
```
Let me know if you want to see any adjustments to the test matrix configuration.

I've not added Python 3.8 here, in order to save resources, and, in a similar spirit, the recipe only runs a single version of Python on macOS.

During regular operation, you would toggle to `fail-fast: true` to save further resources: it cancels the remaining jobs as soon as a test job fails on one matrix slot, because it does not make sense to continue running them.

Using `fail-fast: false` makes sense when working on the workflow recipe itself, in order to get the big picture, e.g. when specific tests fail only on Windows and you don't want them to cancel the jobs running on Linux.
With 1ff1f36, I've configured `fail-fast: true` to save resources in the default case, where it does not make sense to run the full test matrix: the output from a single failing instance is usually enough to find out about the root cause of the error.
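For reference, a sketch of how the relevant `strategy` block looks with the toggle applied, assuming the matrix layout from the diff above:

```yaml
strategy:
  # Cancel the remaining matrix jobs as soon as one fails.
  # Flip to `false` while debugging platform-specific test failures.
  fail-fast: true
  matrix:
    os: ["ubuntu-latest"]
    python-version: ["3.9", "3.10", "3.11"]
```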
```diff
-- Python 3.8, 3.9, or **3.10** (recommended)
+- Python 3.8 to 3.11
```
I see that you designated Python 3.10 as the "default" version supported by Herbie, but I think it is safe to bump that to Python 3.11.

- Meta: Explicitly list supported Python versions and operating systems
- README: Add Python 3.11 to list of supported Python versions
- README: Stop designating a specific Python version
Force-pushed from 899fb5f to 4d186ea.
After rebasing this patch, the first official test run on GHA succeeded. 🚀

Usually, when a single job fails, it does not make sense to continue running the other jobs in the same context. When debugging issues that only manifest on specific versions of Python, or on specific operating system environments, it is easy to toggle that option back to `fail-fast: false`.
Force-pushed from 1ff1f36 to 05dd3cb.
Force-pushed from dc04e5b to 092ed18.
```python
@pytest.mark.skipif(
    is_time_between("00:00", "02:30"),
    reason="Test hibernated between 00:00 and 02:30 UTC, "
    "because upstream data not available or incomplete at night.",
)
def test_rap_aws():
```
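The actual `is_time_between` helper lives in Herbie's test suite; as a minimal sketch, such a helper might compare the current UTC wall-clock time against an inclusive window, also handling windows that cross midnight:

```python
from datetime import datetime, time, timezone
from typing import Optional

def is_time_between(start: str, end: str, now: Optional[time] = None) -> bool:
    """Return True if `now` (default: current UTC time) lies within [start, end]."""
    if now is None:
        now = datetime.now(timezone.utc).time()
    start_t = time.fromisoformat(start)
    end_t = time.fromisoformat(end)
    if start_t <= end_t:
        return start_t <= now <= end_t
    # Window crosses midnight, e.g. "23:00"–"01:00".
    return now >= start_t or now <= end_t
```

Because the expression is evaluated at collection time, the skip decision is made once per test session, which is sufficient for a nightly data-availability gap.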
This works. The underlying issue with the test is that Herbie attempts to look for data at `today = pd.to_datetime("today").floor("1D")` at a time when the model hasn't had a chance to finish running. Setting `today = pd.to_datetime("today").floor("1D") - pd.Timedelta("12h")` wouldn't run into lags with the real-time dissemination.
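As a sketch of the suggested adjustment (variable names as quoted above; the exact integration into Herbie's tests may differ):

```python
import pandas as pd

# Midnight (00:00) of the current day: the most recent model run for
# this date may not have been disseminated yet.
today = pd.to_datetime("today").floor("1D")

# Backing off by 12 hours targets a run that is safely behind any
# real-time dissemination lag.
lagged = today - pd.Timedelta("12h")
```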
💔 Did not find ┊ model=rap ┊ product=awp130pgrb ┊ 2023-Jan-11 00:00 UTC F00
Force-pushed from 092ed18 to b7aaa70.
I believe this is ready to merge. Do you have any objections?

I think it is ready to go in. Enjoy, and thanks!
Thanks @amotl for your effort putting this workflow together! I really appreciate you taking the time. I have learned so much from seeing this get put together. It definitely helps streamline development and makes Herbie feel a lot more "grown up."

I am happy you see such value in the CI configuration. Thank you for merging.
Dear Brian,
this is the second iteration on GH-100, using a more native approach than GH-138. On Linux and macOS, it uses a pure-Python environment, while on Windows, it uses a Conda environment to set up the test sandbox.

I think this patch strikes the best balance of platform/environment coverage vs. resource usage, with a total run time of ~15 minutes for all test matrix slots on Linux and macOS combined. Unfortunately, Windows/Conda weighs in with another ~15 minutes, but c'est la vie.
With kind regards,
Andreas.