Cosmos DB Linux Emulator fails to start on some Intel chips #45

Open
milismsft opened this issue Feb 18, 2022 · 39 comments

@milismsft
Contributor

Related to: actions/runner-images#5036 (comment)

The Cosmos DB Linux Emulator fails to start on some Intel chips.

lscpu output:
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
Address sizes: 46 bits physical, 48 bits virtual
CPU(s): 2
On-line CPU(s) list: 0,1
Thread(s) per core: 1
Core(s) per socket: 2
Socket(s): 1
NUMA node(s): 1
Vendor ID: GenuineIntel
CPU family: 6
Model: 85
Model name: Intel(R) Xeon(R) Platinum 8272CL CPU @ 2.60GHz
Stepping: 7
CPU MHz: 2593.907
BogoMIPS: 5187.81
Hypervisor vendor: Microsoft
Virtualization type: full
L1d cache: 64 KiB
L1i cache: 64 KiB
L2 cache: 2 MiB
L3 cache: 35.8 MiB
NUMA node0 CPU(s): 0,1
Vulnerability Itlb multihit: KVM: Mitigation: VMX unsupported
Vulnerability L1tf: Mitigation; PTE Inversion
Vulnerability Mds: Mitigation; Clear CPU buffers; SMT Host state unknown
Vulnerability Meltdown: Mitigation; PTI
Vulnerability Spec store bypass: Vulnerable
Vulnerability Spectre v1: Mitigation; usercopy/swapgs barriers and __user pointer sanitization
Vulnerability Spectre v2: Mitigation; Full generic retpoline, STIBP disabled, RSB filling
Vulnerability Srbds: Not affected
Vulnerability Tsx async abort: Mitigation; Clear CPU buffers; SMT Host state unknown
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl xtopology cpuid pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch invpcid_single pti fsgsbase bmi1 hle avx2 smep bmi2 erms invpcid rtm mpx avx512f avx512dq rdseed adx smap clflushopt avx512cd avx512bw avx512vl xsaveopt xsavec xsaves md_clear
/proc/cpuinfo content:
/proc/cpuinfo


@aressler38

aressler38 commented Apr 26, 2022

Is there a way to add a constraint on the Azure Pipeline to use the CPU model that works? I am hitting this issue in Azure DevOps Pipelines, and I always get model 85, which always fails. I have tried specifying "ubuntu-latest", "ubuntu-20.04", and "ubuntu-18.04", but none have worked.

The below CPU also fails.

processor	: 0
vendor_id	: GenuineIntel
cpu family	: 6
model		: 85
model name	: Intel(R) Xeon(R) Platinum 8272CL CPU @ 2.60GHz
stepping	: 7
microcode	: 0xffffffff
cpu MHz		: 2593.905
cache size	: 36608 KB
physical id	: 0
siblings	: 2
core id		: 0
cpu cores	: 2
apicid		: 0
initial apicid	: 0
fpu		: yes
fpu_exception	: yes
cpuid level	: 21
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl xtopology cpuid pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch invpcid_single pti fsgsbase bmi1 hle avx2 smep bmi2 erms invpcid rtm mpx avx512f avx512dq rdseed adx smap clflushopt avx512cd avx512bw avx512vl xsaveopt xsavec xsaves md_clear
bugs		: cpu_meltdown spectre_v1 spectre_v2 spec_store_bypass l1tf mds swapgs taa itlb_multihit
bogomips	: 5187.81
clflush size	: 64
cache_alignment	: 64
address sizes	: 46 bits physical, 48 bits virtual
power management:

This one with ubuntu-18.04 did work:

processor	: 0
vendor_id	: GenuineIntel
cpu family	: 6
model		: 85
model name	: Intel(R) Xeon(R) Platinum 8171M CPU @ 2.60GHz
stepping	: 4
microcode	: 0xffffffff
cpu MHz		: 2095.078
cache size	: 36608 KB
physical id	: 0
siblings	: 2
core id		: 0
cpu cores	: 2
apicid		: 0
initial apicid	: 0
fpu		: yes
fpu_exception	: yes
cpuid level	: 21
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl xtopology cpuid pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch invpcid_single pti fsgsbase bmi1 hle avx2 smep bmi2 erms invpcid rtm mpx avx512f avx512dq rdseed adx smap clflushopt avx512cd avx512bw avx512vl xsaveopt xsavec xsaves md_clear
bugs		: cpu_meltdown spectre_v1 spectre_v2 spec_store_bypass l1tf mds swapgs taa itlb_multihit
bogomips	: 4190.15
clflush size	: 64
cache_alignment	: 64
address sizes	: 46 bits physical, 48 bits virtual
power management:
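
For anyone who wants to check which CPU a hosted agent landed on before starting the emulator, here is a minimal Python sketch that parses `/proc/cpuinfo` text and compares the model name and stepping against the reports above. The `REPORTED_FAILING` set is an assumption built only from this thread (8272CL stepping 7 failed, 8171M stepping 4 worked); it is not an official compatibility list, and all function names are hypothetical.

```python
from pathlib import Path

# Assumption: taken from the reports in this thread only, not an official list.
REPORTED_FAILING = {("Intel(R) Xeon(R) Platinum 8272CL CPU @ 2.60GHz", "7")}


def parse_cpuinfo(text):
    """Split /proc/cpuinfo text into one dict per logical processor."""
    processors = []
    current = {}
    for line in text.splitlines():
        if not line.strip():
            # A blank line ends the current processor block.
            if current:
                processors.append(current)
                current = {}
            continue
        key, _, value = line.partition(":")
        current[key.strip()] = value.strip()
    if current:
        processors.append(current)
    return processors


def cpu_signature(processor):
    """Return the (model name, stepping) pair used for the comparison."""
    return processor.get("model name", ""), processor.get("stepping", "")


def is_reported_failing(text):
    """True if any processor matches a CPU reported as failing in this thread."""
    return any(cpu_signature(p) in REPORTED_FAILING for p in parse_cpuinfo(text))


if __name__ == "__main__":
    cpuinfo = Path("/proc/cpuinfo")
    if cpuinfo.exists():
        text = cpuinfo.read_text()
        for name, stepping in sorted({cpu_signature(p) for p in parse_cpuinfo(text)}):
            print(f"{name} (stepping {stepping})")
        print("reported as failing in this thread:", is_reported_failing(text))
```

This only tells you what hardware the job got; it does not let you choose the hardware, so at best it can be used to skip or fail fast with a clearer message.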

@rrr-michael-aquilina

I am also facing this issue. Are there any workarounds, or has any progress been made?

@soenneker

@rrr-michael-aquilina The workaround that worked for me was moving to the Cosmos emulator (PowerShell) that's baked into the Windows pipeline.

High-traffic times can cause the emulator to start slowly (more than 5 minutes). I had to make some modifications to its timeout and such, but it's been pretty stable since.

Definitely better than not using the emulator at all in the DevOps pipeline.
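
The comment above doesn't show the actual timeout changes, so the following is only a sketch of the general idea: poll the emulator's endpoint until it accepts connections or a generous deadline passes. The function name and default interval are hypothetical, and a successful TCP connection only shows that the port is open, not that the emulator has finished initializing.

```python
import socket
import time


def wait_for_port(host, port, timeout_s, interval_s=5.0):
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout_s
    while True:
        try:
            # Each attempt is bounded so one hung connect cannot eat the whole budget.
            with socket.create_connection((host, port), timeout=interval_s):
                return True
        except OSError:
            pass  # Not listening yet (refused, unreachable, or timed out).
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            return False
        time.sleep(min(interval_s, remaining))
```

For example, `wait_for_port("localhost", 8081, timeout_s=600)` would allow ten minutes, assuming the emulator is exposed on its usual port 8081; pick a budget that fits the slow starts you actually observe.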

@christian-be

Also faced this issue using the Linux Docker image. Cost me a day of investigating network issues just to find out the container is immediately shutting down. Using ubuntu-18.04 as suggested in the other GitHub ticket worked for me, but a fix for 20.04 would be great.

@waszak

waszak commented Aug 9, 2022

The Ubuntu 18.04 agent is getting deprecated, so the issue needs to be fixed before that day.
actions/runner-images#6002

@mj-rittermann

We are seeing the exact same problem. It runs fine on Azure DevOps agents running ubuntu-18.04 but fails on ubuntu-20.04 and ubuntu-22.04.
Would someone please look into a fix for this, as the ubuntu-18.04 image is being deprecated on 4/1/2023, as @waszak mentioned: actions/runner-images#6002

@eddumelendez

I ran a test today and it works on 20.04, see https://github.com/eddumelendez/testcontainers-cosmodb-gha-test/actions/runs/3153248862/jobs/5129495371

Can someone else confirm?

@nogic1008

@eddumelendez
Not yet.
I'm using the Cosmos DB container as a service container on GitHub Actions, but the "Connection Refused" error still occurs.

ddradar/ddradar#1002
https://github.com/ddradar/ddradar/actions/runs/3155822253/jobs/5134896830

@eddumelendez

I think it is flaky. I ran it two more times; the first run failed but the last one succeeded.

@mmoayyed

Yes it's flaky. I continue to see random failures as well.
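
Since the failures reported here are intermittent, one stopgap some pipelines use is to retry the emulator start a few times. The sketch below is not a fix for the underlying issue and is not something anyone in this thread confirmed using; `start` is a hypothetical callable you supply that returns `True` when the emulator came up, and the defaults are arbitrary.

```python
import time


def start_with_retries(start, attempts=3, delay_s=10.0, sleep=time.sleep):
    """Call start() until it returns True.

    Returns the 1-based number of the attempt that succeeded, or 0 if every
    attempt failed. `sleep` is injectable so the helper can be tested quickly.
    """
    for attempt in range(1, attempts + 1):
        if start():
            return attempt
        if attempt < attempts:
            # Pause between attempts, but not after the final failure.
            sleep(delay_s)
    return 0
```

A caller would typically have `start` launch the container and wait for readiness, and tear the container down before returning `False` so that the next attempt begins from a clean state.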

@waszak

waszak commented Oct 4, 2022

@milismsft do you have any updates on this issue?

@dankarmyy

+1 for working on ubuntu 20/22

We run it as part of integration testing, and it only starts (sometimes) on Ubuntu 18. On anything higher it just hangs at the "Starting" message in the container logs forever.

Our devs use a docker-compose stack for local dependencies, which includes Cosmos DB, so we would like to just spin up the same stack in ADO pipelines.

Ubuntu 18 deprecation date was pushed back to April '23 so we have a bit more time...

@LevYas

LevYas commented Oct 14, 2022

Very relevant during the current unscheduled brownout for 18.04! 20.04 doesn't work.

@waszak

waszak commented Oct 25, 2022

This repo doesn't look active, so I posted a question here.

https://learn.microsoft.com/en-us/answers/questions/1057083/cosmos-db-linux-emulator-doesn39t-work-on-some-int.html

@Meandron

Meandron commented Dec 7, 2022

Any news on this?

@waszak

waszak commented Jan 26, 2023

Someone asked again today but all we got is the same answer.

> We don't have a public facing ETA we can share for now, but we will share on Azure updates when this will be available.

@kiview

kiview commented Feb 21, 2023

The next scheduled brown-out of the Ubuntu 18.04 GHA runners happened today, and we are getting closer to EOL for those runners. Any updates or workarounds, especially for GHA users?

@eli-fin

eli-fin commented Mar 20, 2023

We are blocked on this issue too. Any update?

@DSpirit

DSpirit commented Mar 20, 2023

> We are blocked on this issue too. Any update?

Our current workaround is to self-host agents with a different chipset. See my answer here: #56

@eli-fin

eli-fin commented Mar 20, 2023

> > We are blocked on this issue too. Any update?
>
> Our current workaround is to self-host agents with a different chipset. See my answer here: #56

Our org has strict policies regarding self-hosted agents, so it's not as straightforward. But thanks.

@asos-gurpreetsingh

> > > We are blocked on this issue too. Any update?
>
> > Our current workaround is to self-host agents with a different chipset. See my answer here: #56
>
> Our org has strict policies regarding self-hosted agents, so it's not as straightforward. But thanks.

We moved this job to a Windows agent and the rest are on Ubuntu to get around this issue.

@eli-fin

eli-fin commented Mar 20, 2023

> We moved this job to a Windows agent and the rest are on Ubuntu to get around this issue.

OK. How can your code running on Ubuntu access the DB running on the Windows agent?

@soenneker

@milismsft any update here? Ubuntu 18.04 isn't available anymore, so this essentially prevents us from using Linux agents.

@aomegax

aomegax commented Aug 23, 2023

any news?

@guibranco

Any updates on this?

@sajeetharan
Member

Hi, it is not supported yet, but we are actively exploring options to support this.

@tpischke

tpischke commented Dec 1, 2023

This is blocking our project from running integration tests on GitHub with Cosmos DB, so I hope this will be fixed soon. No update since Oct 12 is not particularly encouraging. Since this involves a very basic use case for two flagship products, I would hope this would get prompt attention.

@razvangoga

razvangoga commented Dec 4, 2023

@sajeetharan will the new version you mentioned in #79 also fix this issue? (We're trying to use the emulator as part of our integration tests in an Azure DevOps Pipeline.)

@rrr-michael-aquilina

@sajeetharan - Can you provide an update on your previous comment? It would be great to use this emulator for end-to-end integration tests via Azure DevOps!

@sajeetharan
Member

Hi @rrr-michael-aquilina @razvangoga @tpischke, just to confirm, are you referring to the macOS Intel chip support or #79?

@LevYas

LevYas commented Feb 7, 2024

I struggle the most with the fact that the Emulator doesn't work in GitHub Actions, as stated in the first post.

@razvangoga

razvangoga commented Feb 7, 2024 via email

@AButler

AButler commented Feb 7, 2024

> Hi @rrr-michael-aquilina @razvangoga @tpischke, just to confirm, are you referring to the macOS Intel chip support or #79?

Hi @sajeetharan,

I think what most people want is to be able to run the Cosmos emulator in a pipeline/action using Azure DevOps or GitHub. As the Linux runners that both GitHub and Azure DevOps use do not work with the emulator because of the issue described in the first post, we've had to either switch to a Windows runner or build a custom runner.

@rrr-michael-aquilina

rrr-michael-aquilina commented Feb 7, 2024

@sajeetharan - I was referring to the original post. For example, the ubuntu-latest Azure DevOps agent.

@joelsteventurner

Was really looking forward to getting some CI/CD integration tests going with a containerized Cosmos Emulator in our Azure DevOps pipelines, but it seems to be failing intermittently.

This has been great and works very consistently when running on my desktop in Visual Studio.

I have since included these tests as part of a CI/CD pipeline, and the tests are failing intermittently (roughly 50% of the time, I'd say).
This failure seems to be because the container is sometimes being stopped for some unknown reason.

The failed runs seem to happen quickly, e.g. within 3-5 seconds of the container starting. I see a Stop Docker container in the log messages, but there is no "reason" or error I can see for why the container has been stopped. Below are samples of failed and successful runs and my container configuration.

Discussion on Testcontainers
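
When a container disappears within seconds with no visible reason, it can help to capture its state and exit code before the test harness removes it. The sketch below shells out to `docker inspect` with a Go-template format string; the helper names are hypothetical, the container name is whatever your pipeline assigned, and this only surfaces what Docker recorded, it does not explain why the emulator exited.

```python
import subprocess

# Go template understood by `docker inspect`: prints e.g. "exited 1".
STATE_FORMAT = "{{.State.Status}} {{.State.ExitCode}}"


def parse_state(output):
    """Parse '<status> <exit code>' as produced by STATE_FORMAT."""
    status, _, code = output.strip().partition(" ")
    return status, int(code)


def container_state(name):
    """Ask Docker for the status and exit code of a container (requires the docker CLI)."""
    result = subprocess.run(
        ["docker", "inspect", "--format", STATE_FORMAT, name],
        check=True,
        capture_output=True,
        text=True,
    )
    return parse_state(result.stdout)


def stopped_early(status):
    """True for the Docker states that mean the container is no longer running."""
    return status in ("exited", "dead")
```

Calling `container_state(...)` a few seconds after start, and dumping `docker logs` when `stopped_early` is true, at least gives a failed run something concrete to attach to a report.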

@JonathanLydall

JonathanLydall commented Feb 13, 2024

@sajeetharan, it's essentially been two years now since this issue was opened, two years since people reported that it's blocking them from using Microsoft-hosted Linux runners on Azure DevOps Pipelines to run integration tests against Cosmos DB using the Docker emulator.

I feel really let down by Microsoft here. No explanation has been offered as to why this hasn't been fixed yet, nor has a practical workaround been offered. For example, none of the following options would work for us:

  • Switching to Windows agents: Other containers we use for other integration tests can't run under Windows agents, and it's important for us that our tests in general run on Linux to spot other potential cross-platform issues.
  • Self-hosted agent: The reason we pay for Microsoft-hosted agents is so that we don't have the burden of managing our own systems, updating software, etc.
  • Running against production Cosmos DB: We would need to alter our test suite for this to work and incur additional fees (on top of what we're already paying for Microsoft-hosted build agents!)

It feels like this blocking issue is not being prioritised appropriately by Microsoft. Between the Azure DevOps Pipelines and the Cosmos DB Emulator teams, can this please get the attention it deserves?

@jon-signal

Building on @JonathanLydall's excellent summary: perhaps it would be productive to change the title of this issue to "Cosmos DB does not work with Microsoft-hosted Linux runners on Azure DevOps Pipelines."
