[Windows] mongo service will be disabled by default on August, 8th #5949

Closed
4 of 10 tasks
ddobranic opened this issue Jul 22, 2022 · 25 comments
Labels: Announcement, awaiting-deployment (Code complete; awaiting deployment and/or deployment in progress), OS: Windows

Comments

@ddobranic
Contributor

Breaking changes

The mongodb service on Windows Server, which is currently running by default, will be disabled by default.

Target date

The rollout starts on August 8th and will take 2-3 days.

The motivation for the changes

Currently, the mongodb service on Windows Server runs by default, but it should be disabled by default.

Possible impact

The mongodb service will be disabled, so it and all related processes will no longer run by default.

Platforms affected

  • [x] Azure DevOps
  • [x] GitHub Actions

Virtual environments affected

  • [ ] Ubuntu 18.04
  • [ ] Ubuntu 20.04
  • [ ] Ubuntu 22.04
  • [ ] macOS 10.15
  • [ ] macOS 11
  • [ ] macOS 12
  • [x] Windows Server 2019
  • [x] Windows Server 2022

Mitigation

Set-Service mongodb -StartupType Automatic
Start-Service -Name mongodb
@al-cheb al-cheb self-assigned this Aug 3, 2022
@actions actions deleted a comment from nikotinman Aug 5, 2022
@actions actions deleted a comment from nikotinman Aug 5, 2022
@actions actions deleted a comment from takkken Aug 8, 2022
@ddobranic ddobranic added the awaiting-deployment Code complete; awaiting deployment and/or deployment in progress label Aug 8, 2022
@dsfrederic

Hi @ddobranic,

I'm currently experiencing an error that might be related.

image

Context: I'm building this image in an Azure DevOps pipeline to store it in an Azure image gallery, based on https://github.com/YannickRe/azuredevops-buildagents

@al-cheb
Contributor

al-cheb commented Aug 9, 2022

@dsfrederic, pull the latest changes from main branch. It has been fixed in #6030

@dsfrederic

It seems to have gone past the MongoDB install. Now we're one step further and the scripts fail again. Can you help me along? Is there a way to run only verified scripts? In other words, what's the branching/tagging strategy?

image

@mikhailkoliada
Contributor

> It seems to have gone past the MongoDB install. Now we're one step further and the scripts fail again. Can you help me along? Is there a way to run only verified scripts? In other words, what's the branching/tagging strategy?
>
> image

We make a release once a week, when a new image version is released. For your specific error, it looks like the CodeQL archive is inconsistent; please try again.

@rkm

rkm commented Aug 10, 2022

Hi - I've updated my CI today with Start-Service -Name mongodb in order to handle this change, however we've already seen a few runs fail with:

Run Start-Service -Name mongodb
Start-Service: D:\a\_temp\5bb2a140-67d9-46e9-a0e4-447e48467acf.ps1:2
Line |
   2 |  Start-Service -Name mongodb
     |  ~~~~~~~~~~~~~~~~~~~~~~~~~~~
     | Service 'MongoDB Server (MongoDB) (mongodb)' cannot be started due to the following error: Cannot
     | start service 'mongodb' on computer '.'.

Error: Process completed with exit code 1.

(https://github.com/SMI/SmiServices/runs/7769010886?check_suite_focus=true#step:8:1)

Any ideas? Thanks.

@al-cheb
Contributor

al-cheb commented Aug 10, 2022

@rkm, it should be:

- run: |
        Set-Service mongodb -StartupType Automatic          
        Start-Service -Name mongodb

@rkm

rkm commented Aug 10, 2022

@al-cheb Have tried this now, but still encountering the issue

@al-cheb
Contributor

al-cheb commented Aug 10, 2022

> @al-cheb Have tried this now, but still encountering the issue

Please provide a link to the build.

@jas88

jas88 commented Aug 10, 2022

> > @al-cheb Have tried this now, but still encountering the issue
>
> Please provide a link to the build.

Here's our most recent instance https://github.com/SMI/SmiServices/runs/7771359161?check_suite_focus=true

@al-cheb
Contributor

al-cheb commented Aug 10, 2022

@jas88, just out of curiosity, what happens when you move this step right after the checkout step?

@jas88

jas88 commented Aug 10, 2022

@al-cheb Tried it; both runs completed OK this time, but that doesn't say much since most but not all of the last few runs worked...

@al-cheb
Contributor

al-cheb commented Aug 16, 2022

Both images have been deployed with a disabled mongo service by default.

@MattiSG

MattiSG commented Aug 23, 2022

As documented in https://github.com/orgs/community/discussions/30083, I encounter the same issue as @rkm described in #5949 (comment):

Run Set-Service mongodb -StartupType Automatic          
  Set-Service mongodb -StartupType Automatic          
  Start-Service -Name mongodb
  shell: C:\Program Files\PowerShell\7\pwsh.EXE -command ". '{0}'"
Start-Service: D:\a\_temp\d215d86b-60cc-42b6-9911-58c92d5c02a6.ps1:3
   3 |  Start-Service -Name mongodb
     |  ~~~~~~~~~~~~~~~~~~~~~~~~~~~
     | Service 'MongoDB Server (MongoDB) (mongodb)' cannot be started due to the following error: Cannot
     | start service 'mongodb' on computer '.'.
Error: Process completed with exit code 1.

This happens using the instructions provided by @al-cheb, with this step:

      - name: Start MongoDB (Windows)
        run: |
          Set-Service mongodb -StartupType Automatic          
          Start-Service -Name mongodb
        if: ${{ env.RUNNER_OS }} == 'Windows'

This yields exactly the same result as I had before finding this issue, when using only Start-Service -Name "MongoDB".

What can we do to bring Mongo back to Windows? Currently, the only option left seems to be to disable MongoDB-related tests in CI, which plays strongly against GitHub Actions.

Note on process

I am a bit confused about how I can get notified of such major, breaking changes. Our builds started failing on Windows with no clear reason, and it took us two half days to identify the root cause (MongoDB not being started anymore on Windows) since it coincided with a major MongoDB release and we did not suspect that GitHub Actions would change its infrastructure this way. There is no documentation on how to start the service in the image description nor in the Actions doc. This issue itself was referred to me by a colleague, and I did not manage to find it through different searches. In the end, the proposed workaround does not seem to work.

More generally, this issue was opened on 22/07 for deployment on 08/08, and the rationale for such a breaking change with no clear workaround at the time of writing is “mongodb service on Windows Server is running but it should be disabled by default”. This does not make it very clear why we have to pay such a strong adaptation cost on short notice.

What is the way we, as users, are expected to follow with such breaking changes? The volume of issues makes it impractical to watch the repository. Should we check this repository weekly for “Announcement”-tagged issues? 🙂

@al-cheb
Contributor

al-cheb commented Aug 23, 2022

@MattiSG, could you move the step that starts the mongodb service to right after checkout, or make it the first step?

MattiSG added a commit to OpenTermsArchive/engine that referenced this issue Aug 23, 2022
@MattiSG

MattiSG commented Aug 23, 2022

Thank you @al-cheb for your quick reply! I thought I got it to work by adjusting your instructions to use the case-sensitive version of the service name (with MongoDB and not mongodb as the service name). However, I now am in the same situation as described by @jas88, where one run passed and the next run failed 😕

> could you move the step that starts the mongodb service to right after checkout, or make it the first step?

Done in this run. It took two minutes of waiting before the step finally output “WARNING: Waiting for service 'MongoDB Server (MongoDB) (MongoDB)' to start...”, but it passed. However, I don't know how reliable a positive result can be considering my and @jas88’s experience with random results 😕 I will run a few more iterations and report.

@al-cheb Could you please clarify the rationale behind starting the service as early as possible in the process? I assume this is because it might take some time for the server to start, and our tests might fail earlier than that. However, in our case, dependencies are installed and a first series of tests are executed in-between the service start and the usage of MongoDB, averaging over a minute, which should be enough for the MongoDB server to be up.

I see that @ankane used:

sc config MongoDB start= auto
sc start MongoDB

Since I am not familiar with Windows services, is there any reason why this could work when Start-Service fails? 🙂

@MattiSG

MattiSG commented Aug 23, 2022

I confirm that on a second run, even as “first step after checkout”, the service start fails.

@al-cheb
Contributor

al-cheb commented Aug 23, 2022

@MattiSG, Could you check?

- run: |
    sc.exe config MongoDB start= auto
    sc.exe start MongoDB

@MattiSG

MattiSG commented Aug 23, 2022

With sc.exe instead of Start-Service, I had 3 successful attempts out of 4. So far, this is the most reliable way to start MongoDB on Windows. I will rebase the branch and add a few more refactors, which should trigger additional builds. I will report here if additional failures are encountered.

On a first run, the service did start and logged:

SERVICE_NAME: MongoDB 
        TYPE               : 10  WIN32_OWN_PROCESS  
        STATE              : 2  START_PENDING 
                                (NOT_STOPPABLE, NOT_PAUSABLE, IGNORES_SHUTDOWN)
        WIN32_EXIT_CODE    : 0  (0x0)
        SERVICE_EXIT_CODE  : 0  (0x0)
        CHECKPOINT         : 0x0
        WAIT_HINT          : 0x7d0
        PID                : 3884
        FLAGS              : 

However, all sorts of weird, never-seen-before errors appeared in the run. Those did not seem to be directly related to MongoDB.

On a second run, the service started and logged the exact same lines (except for the PID of course) and all tests passed.

On a third run, same as second.

On a fourth run, same as second.

MattiSG added a commit to OpenTermsArchive/engine that referenced this issue Aug 23, 2022
@MattiSG

MattiSG commented Aug 23, 2022

The sc.exe workaround seems to be the most reliable so far. However, since its introduction, we regularly get failures on non-MongoDB-related parts of the test suite. These are random and did not happen before.

The duration for starting up the MongoDB service is between 1 and 3 minutes, leading to a significant slowdown of our pipeline:

[Screenshot taken 2022-08-23 at 15:59:08]

Following this change, we are considering disabling MongoDB tests on Windows. This is disappointing and plays against GitHub Actions as a cross-platform CI engine.

MattiSG added a commit to OpenTermsArchive/engine that referenced this issue Aug 23, 2022
MattiSG added a commit to MattiSG/setup-mongodb that referenced this issue Aug 24, 2022
@MattiSG

MattiSG commented Aug 24, 2022

Stability report after running each syntax 10 times

The following syntaxes were run in GitHub Actions on windows-2022 10 times in a row. They all show about 90% success, tending to demonstrate there is no difference in failure rate depending on syntax.

The test suite itself is not inherently flaky, as each run was also run in ubuntu-latest and was successful every time.

In almost two years of running this test suite cross-platform, Windows tests had only one moment of diverging failure a few months ago. This observed increase to 10% failure rate seems to be a consequence of the change introduced here.

I will report in a few days or weeks if this data changes.

sc.exe : 9 successes out of 10 runs

sc.exe config MongoDB start= auto
sc.exe start MongoDB


Start-Service: 9 successes out of 10 runs

Set-Service MongoDB -StartupType Automatic          
Start-Service -Name MongoDB


sc over JavaScript: 8 successes out of 10 runs

run(`sc config MongoDB start= auto`);
run(`sc start MongoDB`);

[Screenshot taken 2022-08-24 at 12:04:40]

MattiSG added a commit to OpenTermsArchive/engine that referenced this issue Aug 24, 2022
Implement the GitHub Actions recommended solution following an update to their infrastructure. The expected result is to drop failure rate from 100% down to 10%. See actions/runner-images#5949 (comment) for details.
Upgrade CI Windows from Server 2019 to Server 2022.
Speed up Linux CI.
Test on MongoDB v5 instead of v4 on Linux.
@rkm

rkm commented Feb 17, 2023

This issue seems to have resurfaced for us unfortunately. Example run here.

Our workflow file is unchanged, and contains:

      - name: "[windows] start MongoDB service"
        if: ${{ matrix.os == 'windows' }}
        shell: pwsh
        run: |
          Set-Service mongodb -StartupType Automatic
          Start-Service -Name mongodb

@code-ape

This has also recently restarted for a project I'm working on.

@rkm

rkm commented Mar 16, 2023

@al-cheb is there any more information we could provide to help diagnose this issue?

@ddobranic
Contributor Author

@rkm , could you please check?

- run: |
    sc.exe config MongoDB start= auto
    sc.exe start MongoDB

@rkm

rkm commented Mar 30, 2023

@ddobranic We've changed to the sc.exe method, but it now occasionally fails with:

[SC] ChangeServiceConfig SUCCESS
[SC] StartService FAILED 1053:

The service did not respond to the start or control request in a timely fashion.

Example run: https://github.com/SMI/SmiServices/actions/runs/4563723915/jobs/8052574736?pr=1489#step:4:17
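
One possible way to cope with this intermittent failure, not confirmed by the maintainers in this thread, is to retry the start a few times before failing the job. A minimal sketch, assuming the service is named MongoDB as in the steps above; the step name, the attempt count and the 10-second delay are illustrative values, not taken from the runner images:

```yaml
- name: Start MongoDB (Windows, with retries)   # hypothetical step name
  if: runner.os == 'Windows'
  shell: pwsh
  run: |
    # Re-enable the service, since it is disabled by default on the image.
    Set-Service -Name MongoDB -StartupType Automatic

    $maxAttempts = 5   # illustrative value
    for ($attempt = 1; $attempt -le $maxAttempts; $attempt++) {
      try {
        # -ErrorAction Stop turns a failed start into a catchable error.
        Start-Service -Name MongoDB -ErrorAction Stop
      } catch {
        Write-Host "Attempt $attempt failed: $($_.Exception.Message)"
      }
      # Stop retrying as soon as the service reports Running.
      if ((Get-Service -Name MongoDB).Status -eq 'Running') { break }
      Start-Sleep -Seconds 10   # illustrative delay between attempts
    }

    # Fail the step explicitly if the service never came up.
    if ((Get-Service -Name MongoDB).Status -ne 'Running') {
      throw "MongoDB service did not start after $maxAttempts attempts"
    }
```

This only papers over the flakiness reported above; it does not address why the service sometimes takes minutes to start or times out with error 1053.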
