Infrastructure Status Updates #44317

Open
andrewbranch opened this issue Apr 29, 2020 · 13 comments

@andrewbranch
Member

andrewbranch commented Apr 29, 2020

This issue is an announcement board for known issues with DefinitelyTyped infrastructure, including CI, @typescript-bot automation, and npm publishing. To keep the noise down for those who want to subscribe to this issue, I’m locking the comments, but if you have questions or would like to report potential problems, please do so in the DefinitelyTyped Gitter.

Status badges

✅ Current status: Ok ✅

There are no known repository-wide infrastructure issues. If you’d like to report one, please do so in the DefinitelyTyped Gitter.

@DefinitelyTyped DefinitelyTyped locked and limited conversation to collaborators Apr 29, 2020
@andrewbranch
Member Author

andrewbranch commented Apr 29, 2020

✅ Resolved: typescript-bot incident

All PRs that had activity in the last 40 hours have been re-processed by the bot to ensure labels are accurate. If you think the state of your PR is being represented incorrectly, please ping me on the PR.

Expect continued tweaks to @typescript-bot’s behavior as we monitor the new system in production. If you have feedback or would like to report a problem, please ping me on the problematic PR or on the DefinitelyTyped Gitter.

@andrewbranch
Member Author

Missing and duplicate reviewer pings

For some of the time the new bot has been running, it has failed to ping owners for reviews. We’ve since fixed the problem and have been reprocessing PRs updated in the last four days to ensure owners are made aware of them.

Unfortunately, in some cases, this reprocessing failed to deduplicate previous pings that did go through, so some reviewers may have been pinged multiple times unnecessarily. This is an artifact of trying to fix up old PRs on the fly, and doesn’t reflect the typical behavior of the bot as it processes future PRs in real time. Nonetheless, we’re sorry for the spam.

@andrewbranch
Member Author

✅ Resolved: Duplicate typescript-bot comments.

A separate bug caused @typescript-bot to issue duplicate comments pinging reviewers to update their reviews if that reviewer left more than one comment on a given review or more than one review on a given commit. It has now been fixed.

We believe we have now accounted for all causes of duplicate comments. If you see evidence to the contrary, please ping me on the PR where it occurred.
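For readers curious what deduplication of this kind can look like, here is a minimal sketch. It is an illustration only, not @typescript-bot's actual code: the `ReviewEvent` shape and the `reviewersToPing` function are hypothetical names, and the sketch assumes one ping per reviewer per commit is the desired behavior.

```typescript
// Hypothetical shape of a review or review comment; not the bot's real data model.
interface ReviewEvent {
  reviewer: string;  // GitHub login of the reviewer
  commitSha: string; // commit the review or comment was left on
}

// Returns at most one ping target per (reviewer, commit) pair, in first-seen order.
function reviewersToPing(events: readonly ReviewEvent[]): ReviewEvent[] {
  const seen = new Set<string>();
  const result: ReviewEvent[] = [];
  for (const event of events) {
    const key = `${event.reviewer}@${event.commitSha}`;
    if (seen.has(key)) continue; // same reviewer already counted for this commit
    seen.add(key);
    result.push(event);
  }
  return result;
}

const pings = reviewersToPing([
  { reviewer: "alice", commitSha: "abc123" },
  { reviewer: "alice", commitSha: "abc123" }, // second comment on the same review
  { reviewer: "bob", commitSha: "abc123" },
]);
console.log(pings.map((p) => p.reviewer));
```

Keying on both reviewer and commit means a reviewer is still pinged again when a new commit arrives, which is an assumption about the intended behavior rather than something stated above.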

@andrewbranch
Member Author

andrewbranch commented Jun 1, 2020

✅ Resolved: Incorrect dtslint failures

The issue was caused by dtslint and tslint each using a different version of TypeScript. We’ve fixed the issue by using peerDependencies instead of dependencies in certain dependency paths, and are continuing to do so in additional dependency paths to prevent further issues.

@orta orta pinned this issue Jun 1, 2020
@sandersn
Contributor

sandersn commented Jun 3, 2020

✅ Resolved: CI and publishing down Wednesday 4 AM - 4 PM GMT.

(9 PM Tuesday - 9 AM Wednesday PDT).

A PR that failed CI was merged and blocked subsequent CI runs as well as publishing. The PR was correct but a broken suggested-change commit got merged right before the whole PR merged.

I added myself as the author in f64a42d. Broken PRs will need to merge from master to get this commit.

Incorrect parsing shouldn't block tests or publication, and normally doesn't. I filed a bug to track this failure.

@sandersn
Contributor

sandersn commented Jun 4, 2020

✅ Resolved: Publishing down Thursday 3 AM - 8 PM GMT.

(8 PM Wednesday - 1 PM Thursday PDT).

Publishing is broken because of a change to react-bootstrap, which currently has a v1 subdirectory but a top-level directory containing 0.32. Any change, such as #45183, will cause publishing to fail. The publisher gets confused when trying to get the newest version, returns undefined, and tries to read react-bootstrap/undefined/index.d.ts instead of react-bootstrap/v1/index.d.ts. I fixed the crash in microsoft/DefinitelyTyped-tools#30, but CI needs an additional fix to forbid PRs that create this inverted version structure.

I also have a PR that removes v1 (it's not really needed), but that fails CI for other reasons, which I have a PR to fix as well: microsoft/DefinitelyTyped-tools#24. But I haven't been able to finish that PR because Definitely Typed keeps failing.
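To make the "inverted version structure" concrete, here is a rough sketch of the kind of check CI could run. It is not the code from DefinitelyTyped-tools: `PackageLayout`, `parseVersion`, `compareVersions`, and `findInvertedSubdirs` are hypothetical names, and the sketch assumes versions can be compared by major and minor number alone.

```typescript
// Hypothetical description of one package folder; the real tooling's types differ.
interface PackageLayout {
  name: string;
  rootVersion: string;      // version typed by the top-level index.d.ts, e.g. "0.32"
  subdirVersions: string[]; // versioned subdirectories, e.g. ["v1"]
}

// Parses "v1" or "0.32" into [major, minor]; a missing minor counts as 0.
function parseVersion(text: string): [number, number] {
  const [major, minor = "0"] = text.replace(/^v/, "").split(".");
  return [Number(major), Number(minor)];
}

// Negative when a is older than b, positive when newer, 0 when equal.
function compareVersions(a: string, b: string): number {
  const [aMajor, aMinor] = parseVersion(a);
  const [bMajor, bMinor] = parseVersion(b);
  return aMajor !== bMajor ? aMajor - bMajor : aMinor - bMinor;
}

// Lists subdirectories that are not older than the top-level typings.
function findInvertedSubdirs(pkg: PackageLayout): string[] {
  return pkg.subdirVersions.filter(
    (dir) => compareVersions(dir, pkg.rootVersion) >= 0
  );
}

const inverted = findInvertedSubdirs({
  name: "react-bootstrap",
  rootVersion: "0.32",
  subdirVersions: ["v1"],
});
console.log(inverted); // v1 is newer than the top-level 0.32, so it is reported
```

A check along these lines would reject a PR before merge instead of letting the publisher discover the problem afterwards.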

@sandersn
Contributor

sandersn commented Mar 18, 2021

✅ Resolved: CI down 7 AM 2021/03/17 - 6 PM 03/18 UTC.

(12 AM Wednesday - 11 AM Thursday PDT).

CI and other dtslint-runner-based tests ran out of memory because of a nightly version of TypeScript that uses more memory to parse JSDoc comments. I mitigated the problem by pinning to Tuesday's nightly build of TypeScript and reverting the PR that caused the memory increase: microsoft/TypeScript#43302.

I'll revert DT's mitigation once TypeScript's new nightly build has published.

@sandersn
Contributor

sandersn commented Mar 23, 2021

✅ Resolved: Publishing down from around 1 PM 2021/03/23 UTC - 2021/03/25 2 AM UTC.

(6 AM 03/23 PDT - 7 PM 03/24).

A change in notNeededPackage.json's format incorrectly updated a retry case for deprecating packages. The first time that case occurred, about 12 hours after the change, publishing was blocked.

I shipped a mitigation (microsoft/DefinitelyTyped-tools#219) and a fix (microsoft/DefinitelyTyped-tools#220), but the publisher took several hours to come back up because Azure Web App deployments can be very slow.

@andrewbranch and I decided this was a good time to switch to Azure Functions. That is done, but packages with a lot of files can take a long time to publish, which can block publishing. We're still working on performance improvements, which should fix that problem too.

@sandersn
Contributor

sandersn commented Jul 8, 2021

✔️ Resolved: Publishing backlogged from around 10 PM 2021/07/07 UTC - 12 AM 2021/07/09.

(2 PM 07/07 PDT - 5 PM 07/08).

I merged a PR to upgrade thousands of packages to be backward compatible with TypeScript 4.4's new flag, exactOptionalPropertyTypes. This requires adding | undefined to every optional property. The publisher cleared the backlog at the start of 07/09.
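As a small illustration of what that change looks like in a declaration, here is a sketch. The `Options` interface and `describeOptions` function are made-up names, not taken from any package on Definitely Typed.

```typescript
// Before the upgrade, the property would have read:
//   title?: string;
// With exactOptionalPropertyTypes enabled, that form allows omitting `title`
// but rejects explicitly passing `title: undefined`.

// After the upgrade: undefined is spelled out, so callers that pass
// `title: undefined` compile whether or not the flag is enabled.
interface Options {
  title?: string | undefined;
}

function describeOptions(options: Options): string {
  return options.title ?? "(untitled)";
}

console.log(describeOptions({}));                   // prints "(untitled)"
console.log(describeOptions({ title: undefined })); // prints "(untitled)"
console.log(describeOptions({ title: "Hello" }));   // prints "Hello"
```

Spelling out the union keeps the typings usable by consumers on either setting of the flag, which is why the change touched so many packages at once.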

@peterblazejewicz peterblazejewicz unpinned this issue Oct 26, 2023