
Codecov upload failing with 'Too many uploads to this commit.' #38

Closed
altendky opened this issue Feb 22, 2021 · 24 comments


@altendky
Owner

A couple days ago the nightlies started failing. I haven't seen this on other repositories I run. I started trying to set up an action to help with collecting coverage reports and making a single upload over in https://github.com/altendky/coverage-artifact, but ran into the limitation that actions can't use other actions (altendky/coverage-artifact#1). I've already got this going in pymodbus-dev/pymodbus#592, but I don't particularly feel like copying it around at the moment. So, I'll take the excuse that maybe this is a Codecov bug that they'll fix soon enough. Maybe.

https://github.com/altendky/ssst/runs/1948470755?check_suite_focus=true

[ErrorDetail(string='Too many uploads to this commit.', code='invalid')]

Already reported to Codecov a couple times.

https://community.codecov.io/t/too-many-uploads-to-this-commit/2574
https://community.codecov.io/t/ci-failure-due-to-too-many-uploads-to-this-commit/2587

@thomasrockhu

Hi @altendky, Tom from Codecov here. We have rolled out a fix for this issue. If you have any issues, please respond on the community boards.

@altendky
Owner Author

@thomasrockhu, thanks for the effort and for going out of your way to follow up here. I just retriggered last night's nightly scheduled workflow and it has at least one failure so far. Let me know if there's anything I can do to help debug this.

https://github.com/altendky/ssst/runs/1962754257?check_suite_focus=true

[6392] D:\a\ssst\ssst$ 'C:\Program Files\Git\bin\bash.EXE' codecov.sh -Z -n 'Test - Windows PySide2 CPython 3.9 x64'

  _____          _
 / ____|        | |
| |     ___   __| | ___  ___ _____   __
| |    / _ \ / _` |/ _ \/ __/ _ \ \ / /
| |___| (_) | (_| |  __/ (_| (_) \ V /
 \_____\___/ \__,_|\___|\___\___/ \_/
                              Bash-20210129-7c25fce


==> git version 2.30.1.windows.1 found
==> curl 7.75.0 (x86_64-w64-mingw32) libcurl/7.75.0 OpenSSL/1.1.1i (Schannel) zlib/1.2.11 brotli/1.0.9 zstd/1.4.8 libidn2/2.3.0 libssh2/1.9.0 nghttp2/1.41.0
Release-Date: 2021-02-03
Protocols: dict file ftp ftps gopher gophers http https imap imaps ldap ldaps mqtt pop3 pop3s rtsp scp sftp smb smbs smtp smtps telnet tftp 
Features: alt-svc AsynchDNS brotli HTTP2 HTTPS-proxy IDN IPv6 Kerberos Largefile libz Metalink MultiSSL NTLM SPNEGO SSL SSPI TLS-SRP zstd
==> GitHub Actions detected.
    project root: D:/a/ssst/ssst
    Yaml not found, that's ok! Learn more at http://docs.codecov.io/docs/codecov-yaml
==> Running gcov in D:/a/ssst/ssst (disable via -X gcov)
==> Searching for coverage reports in:
    + D:/a/ssst/ssst
    -> Found 3 reports
==> Detecting git/mercurial file structure
==> Reading reports
    + D:/a/ssst/ssst/coverage.xml bytes=28527
    + D:/a/ssst/ssst/coverage_reports/coverage.Test - Windows PySide2 CPython 3.9 x64 bytes=94208
    + D:/a/ssst/ssst/coverage_reports/coverage.Test - Windows PySide2 CPython 3.9 x64.xml bytes=28527
==> Appending adjustments
    docs.codecov.io/docs/fixing-reports
    -> No adjustments found
==> Gzipping contents
        28K	/tmp/codecov.l35jcV.gz
==> Uploading reports
    url: codecov.io
    query: branch=main&commit=163333325a76373f64db9da91e3e3f53cda8da81&build=591528434&build_url=http%3A%2F%2Fgithub.com%2Faltendky%2Fssst%2Factions%2Fruns%2F591528434&name=Test%20-%20Windows%20PySide2%20CPython%203.9%20x64&tag=&slug=altendky%2Fssst&service=github-actions&flags=&pr=&job=CI&cmd_args=Z,n
->  Pinging Codecov
codecov.io/upload/v4?package=bash-20210129-7c25fce&token=secret&branch=main&commit=163333325a76373f64db9da91e3e3f53cda8da81&build=591528434&build_url=http%3A%2F%2Fgithub.com%2Faltendky%2Fssst%2Factions%2Fruns%2F591528434&name=Test%20-%20Windows%20PySide2%20CPython%203.9%20x64&tag=&slug=altendky%2Fssst&service=github-actions&flags=&pr=&job=CI&cmd_args=Z,n
[ErrorDetail(string='Too many uploads to this commit.', code='invalid')]
400

@thomasrockhu

Thanks @altendky, we'll take a look into it

@thomasrockhu

@altendky, we rolled out a new fix for this, are you still experiencing the issue?

@altendky
Owner Author

Thanks for the continued effort and involvement here. There are different errors this time anyways.

https://github.com/altendky/ssst/runs/1975254271?check_suite_focus=true

{'detail': ErrorDetail(string='Actions workflow run is stale', code='not_found')}

https://github.com/altendky/ssst/runs/1975254317?check_suite_focus=true

{'detail': ErrorDetail(string='Actions workflow run is stale', code='not_found')}

@altendky
Owner Author

So, I'm not convinced we all know what's going on given the significant inconsistency across jobs within a build and across builds of different projects but... Over in #41 I disabled coverage uploading to Codecov for the scheduled nightly builds. This means that, as is, nightlies won't be getting coverage checked. To address this I'll add an in-build fail-if-coverage-level-not-satisfied check (anything less than 100% in my case).

On the one hand, this makes sense. On the other, I'm implementing the combining of coverage myself instead of being able to always depend on Codecov and I'm losing presentation of those nightly build coverage results in the Codecov UI. Though I guess we could argue that they were already not usable since each nightly run was combining with the previous for the same commit.

Anyways, I get it, but it is a hole in Codecov usability that might be worth looking into. Or... I just need to be notified of the relevant already existing features.

I'll follow up when I've done this and can conclude it is or isn't working.
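A minimal sketch of the in-build threshold check described above, assuming coverage.py (which appears to produce the coverage.xml seen in the CI logs); the step name and paths are illustrative:

```yaml
# Hypothetical workflow step: merge the per-job .coverage data files and let
# coverage.py's --fail-under flag fail the job below the threshold
- name: Enforce coverage threshold
  run: |
    coverage combine coverage_reports/
    coverage report --fail-under=100
```

`coverage report` exits non-zero when the total is below the `--fail-under` value, so the job fails on its own with no upload service involved.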

@thomasrockhu

@altendky this is a little bit of a limitation with tokenless uploading. I think you should NOT receive that error if you include the upload token.
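For reference, a sketch of passing the token to the bash uploader in a workflow step (step and secret names are illustrative; `-t` is the uploader's token option, and `-Z`/`-n` match the invocation shown in the log above):

```yaml
# Hypothetical GitHub Actions step; assumes the repository upload token is
# stored as the CODECOV_TOKEN secret
- name: Upload coverage
  env:
    CODECOV_TOKEN: ${{ secrets.CODECOV_TOKEN }}
  run: bash codecov.sh -Z -t "$CODECOV_TOKEN" -n "Test - Windows PySide2 CPython 3.9 x64"
```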

@altendky
Owner Author

Oh, hey, that's a cool tidbit. Thanks. I'll probably try that route first then. Is it possible to create separate ... 'entries' in Codecov for a single commit but separate builds? Such that I'm not piling multiple nightly build results into one commit in Codecov. That's not critical, but supporting nightlies well would be a nice extra.

@thomasrockhu

@altendky, I'm not totally sure I understand what you mean here. Would flags be helpful?

@altendky
Owner Author

Maybe flags could be used with a new one for each build.

The difference about nightlies is that you want a report per-build rather than a report per-commit. Let's say that you have a first nightly test run with 100% coverage and then a second nightly runs and gets some different dependency version that reduces coverage to 90%. The Codecov report will stay at 100% for that commit due to the coverage uploaded by the first nightly build. So for most browsing of results a commit based combining of results makes sense. For nightlies it is entirely about the build. Certainly, this gets harder to deal with if the nightlies are distributed across many CI services.
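If flags were bent to this purpose, one sketch would be a flag unique to each workflow run (assuming the bash uploader's `-F` option, and assuming flag names are limited to word characters, hence the run id rather than a punctuated timestamp):

```yaml
# Hypothetical: tag each nightly upload with a per-run flag so repeated builds
# of the same commit stay distinguishable in the Codecov UI
- name: Upload coverage
  run: bash codecov.sh -Z -F "nightly_${{ github.run_id }}"
```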

Anyways, this isn't a big deal for me. I just thought I'd comment on the different use case.

I'll follow up here after I get the token in place and see a few nightlies succeed (hopefully :]).

altendky added a commit that referenced this issue Feb 28, 2021
@altendky
Owner Author

Hmm... so apparently the bash script is finding three reports to upload. I'm poking at that in #44. Using the token (for in-repo builds) is being tried in #42.
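A sketch of that direction (assuming the bash uploader's `-f` option, which uploads the named file instead of auto-searching; the path matches the first report listed in the log above):

```yaml
# Hypothetical: name the one intended report explicitly so the uploader's
# search doesn't also pick up the copies under coverage_reports/
- name: Upload coverage
  run: bash codecov.sh -Z -f coverage.xml
```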

Still, what are the Actions workflow run is stale errors listed above about? (#38 (comment)) Is that what you described as the limitation of tokenless uploads? Or was that about the original 'too many uploads' issue?

@thomasrockhu

@altendky, it is a different issue, as our processor wasn't properly calculating a timeout. We discussed increasing that timeout internally and have extended it to be on par with previous behavior. Are you still seeing the issue without the token?

@altendky
Owner Author

altendky commented Mar 2, 2021

@thomasrockhu, I just reran #44 three times with no issues. Its only difference is avoiding the upload of the two duplicate reports that shouldn't have been uploaded anyways. As far as I can tell it is not using the token. Previous times there were multiple failures per run so it appears fixed.

While I don't really want to have to use the token, I am happy to apply the changes in #44. I hadn't thought about those extra duplicate reports getting uploaded. They shouldn't be. I appreciate your work on this and that I have this path forward.

That said, I restarted the main branch nightly job just to see, and it is still weird. I understand that you aren't claiming it should be fixed, but it doesn't seem like it's actually an upload limit. Several jobs got the too many uploads error, but then jobs after those failures ended up succeeding... I guess it could be a 'too many active upload connections at one point in time' issue? In which case introducing a stagger to my uploads would help, I guess? Is this intended behavior? Further, this commit has already been built multiple times, so it isn't an overall all-jobs-ever limit for a given commit; otherwise all rebuilds would fail every single upload now. The screenshots below seemingly show several jobs succeeding after one had failed.

Thanks for all your help with this.

For quick access, here's a link to the Codecov builds page.

https://codecov.io/gh/altendky/ssst/commit/f9c0c497943abd99a4468568596349554f62903c/build

(note that every job except black and mypy are uploading coverage)

Screenshots of early jobs failing before others succeed [images omitted]

@altendky
Owner Author

altendky commented Mar 2, 2021

I saw the comment about diagnostics and a maybe fix and went ahead and reran one of my troublesome builds again. Still got a couple stale errors. They were towards the end of the run but there were a few jobs after that succeeded.

https://github.com/altendky/ssst/runs/2016219031?check_suite_focus=true

@thomasrockhu

@altendky, ok, we have pushed the timeout a little bit. Thanks for everything you provided above. Would you be able to see if you still get the Action workflow run is stale error?

@altendky
Owner Author

altendky commented Mar 2, 2021

@thomasrockhu, I just started it again. Ended up with too many uploads again. I'm going to start a new branch and push empty commits to it instead of rerunning the job in GitHub Actions.

https://github.com/altendky/ssst/runs/2016219031?check_suite_focus=true

Here's the new branch with a fresh (empty) commit. It... just worked. :|

#46
https://github.com/altendky/ssst/actions/runs/615569097
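For anyone reproducing the fresh-commit trick, the core of it is `git commit --allow-empty`; here is a self-contained demo in a throwaway repository (all names illustrative):

```shell
set -e
# An empty commit records a new commit SHA with no file changes, which is
# enough to give CI something fresh to build
tmp=$(mktemp -d)
cd "$tmp"
git init -q
git -c user.email=ci@example.com -c user.name=ci commit --allow-empty -m "Trigger CI"
count=$(git rev-list --count HEAD)
echo "$count"
```

In the real workflow the branch would then be pushed (e.g. `git push -u origin <branch>`) to trigger Actions.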

What is the timeout? The job takes less than ten minutes, certainly not 30+ like some projects.

So, some extra context that may be relevant. I've had issues over in https://github.com/altendky/qtrio this week with status checks on the PR page being 'messed up'. It's a separate repository and it hasn't been having Codecov trouble, and this repo hasn't been having the messed-up status checks trouble... but I'd rather disclose it now than later, just in case. Just this afternoon GitHub suggested they had recently had issues that corrupted some workflow states. I tried creating new branches with new commits and new pull requests, but my issue there persists. I don't know which bits of GitHub might confuse which bits of Codecov, so I can't really guess whether this is relevant.

https://www.githubstatus.com/history

@altendky
Owner Author

altendky commented Mar 3, 2021

It feels like maybe there are situations where the uploads should be failing, but they are doing so inconsistently, resulting in confusion rather than "oh, I'm uploading several times to the same commit and therefore exceeding the limit". And other repos are entirely fine. For example, I can see that qtrio ran a nightly CI with 32 jobs uploading coverage 5 nights in a row without error and without any other builds in between (so it must have been the same commit being rebuilt at _least_ those nights, but I think many more, and thus well past 100, which I thought I saw someone claim was the limit).

https://github.com/altendky/qtrio/actions/runs/609493329

@thomasrockhu

@altendky, a few things here

  1. We ran some tests on our side and have bumped the limit to 150.
  2. Are you still seeing the Actions workflow run is stale issue? We made another change this weekend to address it.
  3. The qtrio run you shared actually does hit that limit.

@altendky
Owner Author

altendky commented Mar 9, 2021

Apparently I forgot that I was ignoring Codecov upload issues over in QTrio. My apologies for the misinformation there. I haven't had any issues on main since I corrected the triplicate report detection. I just reran two of the builds that I have been rerunning related to this thread and they both passed (three times each).

Is the limit actually per commit hash such that repeated builds, such as a scheduled nightly job would create, would add up and eventually pass that limit regardless? Or is it more of a per-build-run limit? Also, would a single run of the bash uploader count as 'one upload' even when there are multiple reports found? Or does each report count as a separate upload against the 150 limit? I'm comfortable with whatever, just trying to be aware so I can set CI up appropriately and know what to think when I see issues.

Still, the inconsistency and the fact that some jobs in a build would fail before other jobs that succeeded within the same build makes it seem to me like something more than just a limit is (was?) going on. But, I can move on happily for now as my builds seem to be working. I'm still certainly happy to do any further checking that interests you.

Now to remember to check up on the failures over in QTrio...

@thomasrockhu

@altendky the limit is per commit hash, so nightly jobs could add up and eventually surpass that limit. A single run of the bash uploader should count as a single build. So if you need to batch a few reports together in a single run, that should only count against the limit as one.
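So a batching sketch, for completeness (file names are illustrative; repeating `-f` hands the bash uploader several report files in a single run):

```yaml
# Hypothetical: after collecting each job's report as an artifact, upload them
# all in one uploader run so they count once against the per-commit limit
- name: Upload combined coverage
  run: bash codecov.sh -Z -f reports/job1.xml -f reports/job2.xml
```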

I'm glad things are working for you now. Just double-checking that I answered everything outstanding here?

@altendky
Owner Author

I think you've answered all the direct questions so I'll go ahead and close this. Thank you for all the time and effort.

@altendky
Owner Author

@thomasrockhu, welp, bad timing I guess on deciding this was done. I had missed that this morning the nightly build had reached the limit and started failing due to too many uploads. But, that has been explained. So I did up #49 to avoid nightly coverage uploads. That worked fine in the PR but once merged I got a single failure.

https://github.com/altendky/ssst/runs/2168273366?check_suite_focus=true#step:11:70
{'detail': ErrorDetail(string='Actions workflow run is stale', code='not_found')}

I saw elsewhere what seemed to be a reference to a 60 minute limit related to this. I don't know where to get GitHub Actions to report the overall build runtime, but right now it says the run was triggered 25 minutes ago, so it took less than 25 minutes.


@thomasrockhu

@altendky I pushed a fix up for the Actions workflow run is stale. Are you still seeing it?

@altendky
Owner Author

I tried rerunning the last linked build a few times and it's been good so far. Thanks again. I'll feed back here if I have more trouble soon.
