Redocly CLI Hangs When Running In A Container #1592
Comments
Hi @its-hammer-time! Could you check whether it also hangs outside the docker container (when installed globally with |
Hey @tatomyr, it seems to work when I run it in a "Mac" environment. What I mean by that is that NPM and the Docker CLI both work, but as soon as I try to build it with the desktop-linux setup I described above, it hangs.

I was doing some testing yesterday with my own spec, and the hang seems to stem from a portion of our OpenAPI spec. Essentially, I commented out all of our paths and slowly uncommented them to see whether it was coming from a circular dependency that wouldn't resolve when the environment changed. Again, this is just a theory, but maybe the OS integrations differ so that Mac works and Linux doesn't, or something along those lines.

Anyway, I narrowed it down to a few endpoints, but what was interesting is that it would work if my paths were ordered a certain way and break if I moved an endpoint around. I'm still digging into this, but my theory is that the internal $ref resolution mechanism is breaking under a very unusual case. It would be great if there was a way to enable debug logs on the CLI, but for now I'll keep playing around with it.

For example, this does NOT work:
/endpoint/A
/endpoint/B
/endpoint/C
/endpoint/D
/endpoint/E

But this does?
/endpoint/A
/endpoint/B
/endpoint/D
/endpoint/E
/endpoint/C |
Okay, this is very interesting. I was able to isolate a failure case for two endpoints like the following:
I then iteratively removed pieces all the way down endpoint A to see if it was maybe a $ref issue like I explained above. However, what I found is that it comes down to a single property:

day:
  description: The day of the month to run the report schedule.
  examples:
    - 23
  type:
    - integer
    - "null"
  minimum: -1
  maximum: 31

I'm going to try and create an example spec that reproduces this issue so we have something to work off of together. I know it can be hard to debug these types of issues when you don't get an example spec. |
Okay, I'm starting to believe the

For example, with ~20% of the spec, I get build issues maybe 2/10 times, but if I filter it down to ~10% of my spec I don't get any build issues. With 100% of the spec, I get build issues every time.

For context, I'm just running the following command over and over after making a change in my openapi.yaml file. It's the same as in my original post, but I added the --no-cache flag:

docker buildx build --no-cache --platform linux/amd64 -t your-image-name .

With that said, it seems like this may be a race condition or a resource constraint. I noticed that there's a different bug ticket regarding CPU constraints, which you then linked to a memory issue. |
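For anyone who wants to repeat this kind of "run it N times and count" experiment, here is a minimal sketch of a helper that does the counting. The function name run_trials is made up for illustration, and it assumes GNU coreutils `timeout` is available (it usually is on Linux; on macOS it may need to be installed separately). A run that exceeds the time limit is counted as a failed build.

```shell
# Hypothetical helper: run a command several times and report how many
# runs finished successfully within the time limit.
# Usage: run_trials <trials> <timeout-seconds> <command...>
# The body runs in a subshell ( ... ) so its variables do not leak.
run_trials() (
  trials=$1
  limit=$2
  shift 2
  passed=0
  i=1
  while [ "$i" -le "$trials" ]; do
    # `timeout` stops the command and returns non-zero if it overruns,
    # so a hung build is simply counted as not passed.
    if timeout "$limit" "$@" >/dev/null 2>&1; then
      passed=$((passed + 1))
    fi
    i=$((i + 1))
  done
  echo "$passed/$trials builds passed"
)

# Example (not executed here), using the build command from this thread:
#   run_trials 10 600 docker buildx build --no-cache --platform linux/amd64 -t your-image-name .
```

The 600-second limit in the example is an arbitrary choice; pick something comfortably above a normal build time.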
I don't think the issue you've mentioned is related to your case. The memory issue crops up with the build-docs command (which runs the React renderer) while you're using bundle. I hope you'll be able to come up with some repro because it's indeed very hard to figure out what's wrong without it. |
Summary
PLEASE READ: Refer to "Don't Specify Platform To Docker Build" below, as I imagine this is the real culprit.

Okay, I was able to get a small working example from our existing spec "working"... in other words, it's broken 😄. I'll put my steps below on how I reproduce it as well, but what's funny is that

System Setup
Ensure you're using default as your builder
I'm fairly certain this builder is provided by Docker, so I imagine everyone has it? If not, let me know, but here's how to use it:
Note that I also tried using the

Build The Image
From now on, I essentially just re-run this command whenever I want to "test" the container build. Note that we're using the no-cache arg so we can ensure a full build every time.

Test Cases
I've also noticed that the following changes to the spec seem to resolve the issue. At this point, I'm honestly not sure why it's behaving like this. The "fixes" seem random to me, but hopefully you will have more insight as to what may be happening.

Don't Use -1 Minimum Value
As mentioned above, in
No Changes: 0% success rate (0/10 builds passed)
Using 0: 100% success rate (10/10 builds passed)
Commented Out: 100% success rate (10/10 builds passed)

Don't Include Campaign Schema
In
No Changes: 0% success rate (0/10 builds passed)
Campaign Commented Out: 100% success rate (10/10 builds passed)

Don't Specify Platform To Docker Build
I imagine this is actually the real culprit. The M1 Macs are based on ARM and I'm requesting a build for linux/amd64, so Docker has to perform some magic to get it working. If I don't specify the platform and let Docker use its default platform, then the CLI actually works. Assuming this is related to the platform, it's very strange that doing minor things like changing a

However, my company's build systems build for Linux since we deploy to K8s pods, which are linux/amd64.
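
To make the platform mismatch easier to spot before kicking off a build, here is a rough sketch that compares the host architecture with the requested --platform value. The function names docker_arch and needs_emulation are hypothetical, and the mapping only covers the two architectures discussed in this thread; it says nothing about why the CLI hangs, only whether the build would be cross-architecture.

```shell
# Hypothetical helper: map a `uname -m` value to Docker's architecture names.
docker_arch() {
  case "$1" in
    x86_64|amd64) echo amd64 ;;
    arm64|aarch64) echo arm64 ;;
    *) echo "$1" ;;   # unknown values are passed through unchanged
  esac
}

# Hypothetical helper: exits 0 when the target platform's architecture
# differs from the host's, i.e. the build is cross-architecture and
# Docker would typically have to emulate it.
# Usage: needs_emulation <uname -m value> <platform, e.g. linux/amd64>
needs_emulation() (
  host=$(docker_arch "$1")
  target=${2#*/}        # "linux/amd64" becomes "amd64"
  target=${target%%/*}  # "arm/v7" becomes "arm"
  [ "$host" != "$target" ]
)

# Example (not executed here):
#   needs_emulation "$(uname -m)" linux/amd64 && echo "cross-architecture build"
```

On an ARM Mac, `uname -m` reports arm64, so the example would print the message for linux/amd64 and stay silent for linux/arm64.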
|
Thank you! I'll review the example a bit later. |
Using the example.zip above, I was able to confirm that using https://github.com/Redocly/redocly-cli/releases/tag/%40redocly%2Fcli%401.13.0 |
Hey @tatomyr, any update on this? |
Describe the bug
We're using the Redocly CLI to bundle our OpenAPI specs, but for some reason it seems to hang randomly. We believe this started with Docker image redocly/cli:1.13.0, but we're not 100% certain. For now we've pinned ourselves to 1.12.0.
For context, the CLI works when I install it directly with NPM or use the docker pull ... && docker run ... commands found in your documentation. However, if I intentionally use the desktop-linux builder in Docker Hub, that's when it ends up hanging.
Unfortunately, it doesn't look like there's any way to enable debug logs with the CLI, so it's hard to determine whether it's pausing on something related to our OpenAPI spec or whether it's a genuine issue in the CLI.
To Reproduce
To reproduce this issue on my ARM Mac, I ran the following:
export DOCKER_BUILDKIT=1
docker buildx create --name linuxbuilder --use
docker buildx inspect linuxbuilder --bootstrap
docker buildx build --platform linux/amd64 -t your-image-name .
Expected behavior
The spec should bundle successfully or at least error out with some sort of reason why it failed.
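As a stopgap on the caller's side, a hang can at least be turned into an explicit failure by running the step under a deadline. This is only a sketch: run_with_deadline is a made-up name, the output path in the example is hypothetical, and it assumes GNU coreutils `timeout`, which exits with status 124 when the time limit is hit.

```shell
# Hypothetical wrapper: run a command with a deadline and print an error
# if it had to be stopped, instead of letting the step hang forever.
# Usage: run_with_deadline <timeout-seconds> <command...>
run_with_deadline() (
  limit=$1
  shift
  status=0
  timeout "$limit" "$@" || status=$?
  # GNU `timeout` uses exit status 124 to signal that the limit was reached.
  if [ "$status" -eq 124 ]; then
    echo "error: '$*' did not finish within ${limit}s" >&2
  fi
  exit "$status"
)

# Example (not executed here; the output path is hypothetical):
#   run_with_deadline 300 redocly bundle openapi.yaml --output dist/openapi.yaml
```

This does not explain the hang, but it keeps a CI job from running indefinitely and leaves a clear message in the log.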
Logs
After running the steps above, I can see this. Notice that this step has been running for 481 seconds already (and climbing)
OpenAPI description
We are using OpenAPI 3.1.0. I'm not sure if I can post my spec here so I will try to find a test example that reproduces this issue. If I do, I will post it as a comment below.
Redocly Version(s)
For the test scenario above, I'm pulling the latest which I believe is 1.16.0
Node.js Version(s)
Using the Docker image provided by you
Additional context
N/A