container terminal hangs (unpredictably) when (large amounts of) output is written to terminal(?) #199

Open
cons0l3 opened this Issue Nov 3, 2016 · 46 comments

cons0l3 commented Nov 3, 2016

I am trying to set up a CPython compile/build environment on Windows. I add Chocolatey and use choco to install the MSVC compiler, Mercurial, Subversion and the rest of the dependencies, then pull the CPython Mercurial repository (hg pull https://hg.python.org/cpython) successfully. None of this writes much output to the terminal.

While compiling Python, the build script requires a few Subversion pulls (aka exports). Below you will find a very reduced Dockerfile and a build.bat script to reproduce the problem.

If I run build.bat with the svn exports during docker build, it works. The terminal does not freeze.

If I run build.bat in a started container:

  1. docker run --rm -it {BUILD-IMAGE} powershell
  2. remove the folders created while building the image:
PS C:\> Remove-Item -Recurse -Force C:\openssl-1.0.2j
PS C:\> Remove-Item -Recurse -Force C:\bzip2-1.0.6
  3. PS C:\> build.bat

the output of Subversion freezes (sometimes earlier than other times), while the container stays in the running state. docker exec -it {CONTAINER} powershell still works. You will find a folder where Subversion started the export, and PS C:\> ps will show you that another powershell is still running.

=> I guess the terminal output is screwing things up.

Dockerfile

FROM microsoft/windowsservercore

# set powershell to default shell for the following RUN
SHELL ["powershell", "-Command", "$ErrorActionPreference = 'Stop';"]

# install chocolatey
RUN $env:chocolateyUseWindowsCompression = 'false'; iwr https://chocolatey.org/install.ps1 -UseBasicParsing | iex;

# install subversion
RUN choco install -y \
    svn

# actually from inside the checkout PCBuild/build.bat run with `build -d -e`,
# but here only the svn export
COPY build.bat C:\build.bat

# running during build...
RUN & C:\build.bat

CMD ["powershell"]

build.bat

@echo off
echo.executing svn export http://svn.python.org/projects/external/openssl-1.0.2j
svn export http://svn.python.org/projects/external/openssl-1.0.2j
echo.executing svn export http://svn.python.org/projects/external/bzip2-1.0.6
svn export http://svn.python.org/projects/external/bzip2-1.0.6

The svn exports print the exported files very quickly, and then the terminal sometimes hangs unpredictably. It might execute all of it, it might hang in the first export, sometimes in the second.

docker ps states the container as "running".

docker info

λ docker info                                                                      
Containers: 2                                                                      
 Running: 1                                                                        
 Paused: 0                                                                         
 Stopped: 1                                                                        
Images: 29                                                                         
Server Version: 1.12.2-cs2-ws-beta                                                 
Storage Driver: windowsfilter                                                      
 Windows:                                                                          
Logging Driver: json-file                                                          
Plugins:                                                                           
 Volume: local                                                                     
 Network: nat null overlay                                                         
Swarm: inactive                                                                    
Security Options:                                                                  
Kernel Version: 10.0 14393 (14393.351.amd64fre.rs1_release_inmarket.161014-1755)   
Operating System: Windows 10 Pro                                                   
OSType: windows                                                                    
Architecture: x86_64                                                               
CPUs: 8                                                                            
Total Memory: 11.99 GiB                                                            
Name: BLAUBEERE10                                                                  
ID: E2WR:C7YS:2BZ5:TNFX:NCJL:FZPY:6U6F:3MFC:CBX4:DJCP:LZFO:RZS3                    
Docker Root Dir: E:\Docker                                                         
Debug Mode (client): false                                                         
Debug Mode (server): false                                                         
Registry: https://index.docker.io/v1/                                              
Insecure Registries:                                                               
 127.0.0.0/8                                                                       

docker version

λ docker version
Client:
 Version:      1.12.3
 API version:  1.24
 Go version:   go1.6.3
 Git commit:   6b644ec
 Built:        Thu Oct 27 00:09:21 2016
 OS/Arch:      windows/amd64
 Experimental: true

Server:
 Version:      1.12.2-cs2-ws-beta
 API version:  1.25
 Go version:   go1.7.1
 Git commit:   050b611
 Built:        Tue Oct 11 02:35:40 2016
 OS/Arch:      windows/amd64

winver of host

version 1607 (build 14393.351)

cons0l3 commented Nov 4, 2016

Screenshot of powershell in the container. Excuse the errors at the top; that is just me trying to run a script in powershell. The actual problem is all the way at the bottom, where it hangs and will not continue...

image

friism commented Nov 6, 2016

Have you determined whether this is caused by terminal output or writing the files? Maybe try with the --quiet svn parameter. It would also be good to understand whether this also happens in a cmd.exe shell.

If this is a general problem, we should get an issue on docker/docker too.

panmanphil commented Nov 6, 2016

I've had this happen a couple of times on a container on a windows 10 box. The output in that case was from an http call and while the container did eventually stop, it was listed as Dead in docker ps and couldn't be deleted. Rare though

cons0l3 commented Nov 6, 2016

I redirected stdout and stderr of build.bat to the files c:\build.log and c:\build.err respectively. With that it works and compiles.

If I run PS C:\> type build.log from inside the container, the shell/terminal hangs. So I still believe the cause is sending a lot of output from the container terminal to the host terminal window.

If I run PS C:\> more build.log from inside the container and step through the log file, it works. So there are no wonky chars in the file that could mess it up. But if I hit space quickly, it again hangs the terminal.

If I run docker exec -it {CONTAINER_ID} cmd /c type c:\build.log, it prints all of build.log as expected.
If I run my compilation from outside the container, it works great.
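
A minimal sketch of the redirect workaround described in this comment, written in Python rather than the batch/PowerShell used in the thread. The helper name run_with_logs is hypothetical, and the sketch only shows the general idea of keeping a command's output off the terminal by sending stdout and stderr to files:

```python
import subprocess


def run_with_logs(command, stdout_path, stderr_path):
    """Run `command` with stdout and stderr written to files, not the terminal.

    `command` is an argument list such as ["cmd", "/c", "build.bat"].
    Returns the exit code of the command.
    """
    # Binary mode: the bytes are stored exactly as the child process wrote them.
    with open(stdout_path, "wb") as out, open(stderr_path, "wb") as err:
        completed = subprocess.run(command, stdout=out, stderr=err)
    return completed.returncode
```

Inside the container this could be called as run_with_logs(["cmd", "/c", r"C:\build.bat"], r"C:\build.log", r"C:\build.err"), assuming Python is installed in the image, which the Dockerfile above does not do.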

friism commented Nov 7, 2016

Pinging @lzybkr and @jhowardmsft - is this a ps-readline problem?

dgageot commented Dec 2, 2016

@cons0l3 Could you please give a try to Docker for Windows Beta 31. It contains an updated version of dockerd for Windows Containers.

cons0l3 commented Dec 2, 2016

@dgageot the retest with Beta 31 was not successful. It still hangs when outputting lots of lines, independent of which program is used, e.g. C:\> type of a large text file to the console hangs.

dgageot commented Dec 3, 2016

@friism @lzybkr @jhowardmsft is it a known Windows Containers issue?

Contributor

rn commented Jan 7, 2017

@cons0l3 Could you try the latest beta (Beta35) with docker engine 1.13.0-rc5? I believe there have been some console related changes since the beta you tried/reported on last.

With Beta35 I can do things like:

get-process | convertto-json

in a servercore container, which produces around 4000 lines of output.

cons0l3 commented Jan 8, 2017

The retest was not successful. It still hangs the terminal as described above. I retested all the different error variants on three different computers.

λ docker version
Client:
Version: 1.13.0-rc5
API version: 1.25
Go version: go1.7.3
Git commit: 43cc971
Built: Thu Jan 5 03:07:30 2017
OS/Arch: windows/amd64

Server:
Version: 1.13.0-rc5
API version: 1.25 (minimum version 1.24)
Go version: go1.7.3
Git commit: 43cc971
Built: Thu Jan 5 03:07:30 2017
OS/Arch: windows/amd64
Experimental: true

λ docker info
Containers: 3
Running: 1
Paused: 0
Stopped: 2
Images: 29
Server Version: 1.13.0-rc5
Storage Driver: windowsfilter
Windows:
Logging Driver: json-file
Plugins:
Volume: local
Network: l2bridge l2tunnel nat null overlay transparent
Swarm: inactive
Default Isolation: hyperv
Kernel Version: 10.0 14393 (14393.576.amd64fre.rs1_release_inmarket.161208-2252)
Operating System: Windows 10 Pro
OSType: windows
Architecture: x86_64
CPUs: 8
Total Memory: 11.99 GiB
Name: xxx
ID: E2WR:C7YS:2BZ5:TNFX:NCJL:FZPY:6U6F:3MFC:CBX4:DJCP:LZFO:RZS3
Docker Root Dir: E:\Docker
Debug Mode (client): false
Debug Mode (server): true
File Descriptors: -1
Goroutines: 33
System Time: 2017-01-08T15:12:23.8356675+01:00
EventsListeners: 1
Http Proxy:
Https Proxy:
No Proxy:
Registry: https://index.docker.io/v1/
Experimental: true
Insecure Registries:
127.0.0.0/8
Live Restore Enabled: false

jhowardmsft commented Jan 13, 2017

One question - is this using the Windows 10 legacy console, or the native console? (I assume the docker client is running locally on the Windows 10 box). You can go into properties for the console. Legacy will have the checkbox checked (and have been restarted).

The reason I ask is an important one - OSs prior to Windows 10, or on Windows 10 where legacy mode is enabled, we use a terminal emulator in the docker client itself. When not checked and on Windows 10, we use the VT processing capabilities in Windows itself. That would help me to figure out the next step for debugging this.

Contributor

rn commented Jan 20, 2017

I was now able to repro this with 1.13.0 on windows 10 pro (10.0.14393.693). The Legacy console tickbox is not ticked.
I just did a:

docker run --rm -ti microsoft/nanoserver powershell

followed by:

Get-ChildItem -Recurse

here is the docker info:

PS C:\Program Files> docker info
Containers: 0
 Running: 0
 Paused: 0
 Stopped: 0
Images: 1
Server Version: 1.13.0
Storage Driver: windowsfilter
 Windows:
Logging Driver: json-file
Plugins:
 Volume: local
 Network: l2bridge l2tunnel nat null overlay transparent
Swarm: inactive
Default Isolation: hyperv
Kernel Version: 10.0 14393 (14393.693.amd64fre.rs1_release.161220-1747)
Operating System: Windows 10 Pro
OSType: windows
Architecture: x86_64
CPUs: 4
Total Memory: 15.89 GiB
Name: win-nuc0
ID: 6JC7:H6ZU:QYOR:V4EZ:DDX3:WQ5B:MJSN:ROU7:5HZI:U5WO:ZW4O:MYOU
Docker Root Dir: C:\ProgramData\Docker
Debug Mode (client): false
Debug Mode (server): true
 File Descriptors: -1
 Goroutines: 18
 System Time: 2017-01-20T14:56:17.1046448Z
 EventsListeners: 0
Registry: https://index.docker.io/v1/
Experimental: true
Insecure Registries:
 127.0.0.0/8
Live Restore Enabled: false

@jhowardmsft would it be better to move this over to docker/docker?

jhowardmsft commented Jan 26, 2017

I'm investigating - looks like this only repros in Hyper-V containers, so is most likely a platform bug from what I've debugged so far. Let me get back to you once I know more and have looped in some folks internally here.

jhowardmsft commented Jan 26, 2017

Have verified this is a platform bug in RS1. It only affects Hyper-V containers as I mentioned above. The fix is in post-RS1 builds (I verified on build 15010 rs_container). I don't know whether a fix will be back-ported though. @PatrickLang for tracking.

@rneugeba There's probably little point in having an issue open here, as there's nothing that can be done in Docker for Windows or in any of the OSS code supporting Docker on Windows to fix it - it requires a platform fix.

Internal references: VM CL539473 (fix for 9710744, 9701376 and 8444699)

Contributor

rn commented Jan 26, 2017

@jhowardmsft thanks for digging into this. I'll verify tomorrow with the latest insider build and then may close this.

In the future which issue tracker is the best for this type of stuff? I'm happy to redirect users to the appropriate forum

jhowardmsft commented Jan 26, 2017

In the future which issue tracker is the best for this type of stuff? I'm happy to redirect users to the appropriate forum

I'm not sure, this type of issue (platform bug) has come up a few times. I'll leave that for @PatrickLang to determine 😄

Contributor

rn commented Jan 27, 2017

@jhowardmsft @PatrickLang I retested on insider build 15014.1000 and the Get-ChildItem -Recurse still hangs.

PatrickLang commented Feb 1, 2017

@rneugeba best thing is to raise it with your MS contacts outside of GitHub. Having the VSO IDs (like John added above) is a helpful breadcrumb so I can look up status but unfortunately there's no public database we can share for internally tracked issues & changes

Contributor

rn commented Feb 15, 2017

I still see the issue on insider build 15031 and docker 1.13.1

dheater commented Mar 9, 2017

I seem to be able to work around this for now by redirecting stdout to $null

jhowardmsft commented Mar 9, 2017

Update - it doesn't look like the fix is in 3B - it is slated for 5B (the May 2017 round of updates).

Internal Ref: VSO #11056625

donny-dont commented Mar 17, 2017

@jhowardmsft really sorry to hear that. I've been working on doing builds with VS in containers and currently the only way around this is to redirect to a file or null which makes it difficult for CI systems.

AArnott commented Apr 30, 2017

I see terminal hangs as well (with the lightweight process containers -- Hyper-V manager shows nothing when a VM is running). I'm on Windows 10 15063.rs2_release.170317-1834 and using Docker:

Version 17.03.1-ce-win5 (10743)
Channel: stable
b18e2a5

The terminal hangs are highly correlated with commands that show progress bars, such as "npm install", and powershell scripts that show the ASCII art progress box. Perhaps it's the escape codes they send to move the cursor around to print to other locations that's causing the issue.

I randomly have issues where TAB completion doesn't work in the terminal, left arrow erases characters on the command line, and Ctrl+V inserts ^V instead of pasting as it did in that same terminal window before invoking a docker exec command.

I also see terminal corruption, where after a "cls", I type "dir" and the directory listing is printed, along with a bunch of junk that had been printed previously. I can repeat these commands as many times as I wish and it still produces junk around the "dir".

AArnott commented May 1, 2017

BTW, the hangs seem to only repro when I use the -it switch. If I just use -i the terminal is much less useful (tabs don't auto-complete in cmd.exe, for example) but so far it doesn't seem to hang.
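
A rough sketch of how a wrapper script could act on this observation, written in Python. It builds a docker exec command line and only adds -t when a TTY is wanted. The function name docker_exec_args and the container name used below are hypothetical; -i and -t are the real docker exec flags discussed here. Whether dropping -t avoids the hang in every setup is not established in this thread.

```python
import sys


def docker_exec_args(container, command, allocate_tty=None):
    """Build the argument list for `docker exec`.

    If `allocate_tty` is None, a TTY (-t) is requested only when both stdin
    and stdout of this process are attached to a terminal, so that scripted
    or CI invocations fall back to plain -i.
    """
    if allocate_tty is None:
        allocate_tty = sys.stdin.isatty() and sys.stdout.isatty()
    flags = ["-i"]
    if allocate_tty:
        flags.append("-t")
    return ["docker", "exec", *flags, container, *command]
```

The returned list can be passed to subprocess.run, for example subprocess.run(docker_exec_args("my-container", ["cmd"], allocate_tty=False)).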

richardgavel commented May 2, 2017

Does anyone know if this is just a Windows 10 issue or does it also occur with Windows Server 2016 (Hyper-V or Lightweight) Containers?

jhowardmsft commented May 2, 2017

Both. It's a platform bug. See #199 (comment)

jhowardmsft commented May 2, 2017

Edit to ^^ - it's a problem in the GCS component, which is only in use in Hyper-V containers. Windows Server containers on Windows Server 2016 shouldn't be affected, but Hyper-V containers on both server and client will be affected.

richardgavel commented Jun 5, 2017

I don't seem to get this issue anymore, has it been confirmed as fixed?

jhowardmsft commented Jun 5, 2017

bpotvin commented Aug 8, 2017

I'm in a slightly different environment, but I'm consistently seeing console hang still today. I'm using 17.06.0-ce-win19 (12801) on Windows 10.0.15063 Enterprise and I'm running an Ubuntu 14.04 container. I'm using a batch to open the console with the relevant bits here:

set _FLAGS=-a stdin -a stdout -a stderr -i -t --rm --name console --privileged
docker run %_FLAGS% -v %_VOLUME% -e ADUSER=%USERNAME% -u %USERNAME% ubuntu-container

I have found that, if I run with new console features, ("Use legacy console" NOT ticked), then I get random, but consistent hangs, typically if I type very quickly with an occasional BS to correct a typo.

If I run with "Use legacy console" ticked, then I do not get any hangs. I do, however, see some odd behavior out of VI: the arrow keys don't work. The nano editor seems to work, (and there's always ed), so turning on the legacy console seems to be a solution for the time being.

Is this still an open (OS) bug, or has it been closed?

heaths commented Sep 2, 2017

I've got all the latest updates for Windows 10 and Docker for Windows and still see this bug both when outputting text (not even necessarily large amounts of it) or as @AArnott pointed out, tabbing for auto-completion (and 50/50 if it works). I keep having to docker exec -it to my running container leaving tons of cmd or powershell processes (whichever is the ENTRYPOINT) running and often have to re-init state (environment variables, etc.).

bpotvin commented Sep 2, 2017

Yeah, for me the hang is on input. Just the other day I accidentally discovered a 100% every-time repro: on a docker console where the command prompt has "Use legacy console" NOT checked, if I press two keys at the same time, for example my finger accidentally hits K and L at the same time, then console input hangs and I have to CTRL+BREAK out of it.

jasonbivins commented Dec 5, 2017

@bpotvin Thanks for reporting - we have this entered in as a bug, but I don't have a timeframe for the fix yet

mablae commented Dec 14, 2017

Facing the same issue when building/compiling a large gluon project with many layers of bash/make invocations (freifunk images).

When hitting an arrow key it seems to continue just fine.

Version 17.12.0-ce-rc2-win41 (14746)
Channel: edge
0f8a7d2

vMcJohn commented Dec 14, 2017

I frequently run into the same bug that @bpotvin reported. Use Legacy Mode in the console seems to work around the problem okay for me for now. @jasonbivins is there a bug over at MS that we can ping on? Thanks!

jasonbivins commented Jan 3, 2018

terminal output Issue reported here too #1471

michaelkschmidt commented Jan 28, 2018

@bpotvin @vMcJohn another workaround is to use "winpty" to launch docker exec:

winpty docker exec -it

This lets things like resizing and git colorized output work (both broke for me with "Legacy Mode")

Note you might have to escape any paths:
moby/moby#24029

pdefreitas commented Feb 11, 2018

I can confirm this. Whenever I build Windows images that produce large amounts of output, the process sometimes hangs, failing the build.

Collaborator

docker-for-desktop-robot commented May 14, 2018

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale comment.
Stale issues will be closed after an additional 30d of inactivity.

Prevent issues from auto-closing with an /lifecycle frozen comment.

If this issue is safe to close now please do so.

Send feedback to Docker Community Slack channels #docker-for-mac or #docker-for-windows.
/lifecycle stale

panmanphil commented May 14, 2018

Any updates on this? I'll be presenting on Docker for Windows at a large conference in August and the gotcha list is always a valuable part of it

heaths commented May 14, 2018

/lifecycle frozen

This continues to repro over the course of many updates. A status update would be nice.

mallyvai commented Jun 4, 2018

We saw this bug occur regularly on an all-Linux/BSD stack running a large Rails app.

In CI:
Debian (Jessie) Docker image (ruby:2.3.4)
running on an
Ubuntu EC2 instance.

Locally:
That same docker image running locally on a Mac laptop (fully patched High Sierra).

I was printing out large debugging lines (100k+ characters per line, for a legitimate CI use-case) and noticed the build timing out.
After examining the log output (and lack-of-log-output), I was able to cleanly reproduce this bug with the following test case:

  1. Generate a randomly-generated file, with 4 lines, and 100,000 characters per line
  2. COPY it into an image derived from the aforementioned ruby:2.3.4 base image as my_large_file.dat
  3. Run the image as a container with a bash prompt opened inside of the container
  4. Open up irb inside of this container's bash prompt.
  5. puts File.read('my_large_file.dat')
  6. Keep my hands off the keyboard after I hit enter
  7. Try about ~10-20 times
  8. ~5-10% of the time I'd see the output simply hang, and the container do nothing. To break this hang, I simply pressed any key on the keyboard.
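
Step 1 of the list above can be sketched as follows. This is Python rather than the Ruby used in the test case, and the function name write_long_line_file, the alphanumeric alphabet and the seed parameter are assumptions made for illustration; only the file name, the line count and the line length come from the steps.

```python
import random
import string


def write_long_line_file(path, lines=4, chars_per_line=100_000, seed=None):
    """Write `lines` lines of random alphanumeric text, each `chars_per_line` long."""
    rng = random.Random(seed)  # a fixed seed makes the file reproducible
    alphabet = string.ascii_letters + string.digits
    # newline="\n" keeps line endings identical on Windows and Linux hosts.
    with open(path, "w", encoding="ascii", newline="\n") as f:
        for _ in range(lines):
            f.write("".join(rng.choices(alphabet, k=chars_per_line)))
            f.write("\n")
```

Calling write_long_line_file("my_large_file.dat") produces a file of roughly 400 KB that can then be copied into the image as in step 2.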

This bug was originally causing a confusing series of flakey CI tests for us. I got around this by writing a SafeLogger that broke up any line after a few thousand characters, which alleviated the problem.
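
The SafeLogger itself is not shown in this comment. A minimal sketch of the same idea in Python (not the original Ruby code) might look like the following; the names split_long_line and safe_log and the 4000-character default are assumptions standing in for "a few thousand characters":

```python
import sys


def split_long_line(line, max_chars=4000):
    """Return `line` cut into consecutive pieces of at most `max_chars` characters."""
    if max_chars <= 0:
        raise ValueError("max_chars must be positive")
    if not line:
        return [""]
    return [line[i:i + max_chars] for i in range(0, len(line), max_chars)]


def safe_log(message, stream=None, max_chars=4000):
    """Write `message` to `stream`, breaking up any line longer than `max_chars`."""
    stream = stream if stream is not None else sys.stdout
    # An empty message still produces one (empty) output line.
    for line in message.splitlines() or [""]:
        for piece in split_long_line(line, max_chars):
            stream.write(piece + "\n")
    stream.flush()
```

This changes the log format, since one logical line becomes several physical lines, so anything parsing the output line by line has to tolerate that.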

This seems like a long-standing issue and it'd be great to get more clarity on what might be causing it!

tangkhaiphuong commented Jun 7, 2018

Same for me. I found that a large amount of output printed to the console inside the container can cause Docker to freeze unpredictably for a long time.

When I hit this issue, I used the attach command on the container to look at the problem. The backlog of console output was printed to the screen and then the container started running again.

My boss is angry and asks me why this issue occurs when I upgrade docker
from 17.03.1-ce (2017-03-27)
to: 18.03.0-ce (2018-03-21)
environment: CentOS 7.0

I tried disabling the log driver with --log-driver none, but the issue still happens :(

Anybody help?

tangkhaiphuong commented Jun 11, 2018

I found that printing many Unicode/UTF-8 strings to the console/stdout can cause a byte shift, which prints garbage characters (outside the ASCII range) and leads to random freezes (actually the console waiting for input), because the system detects some character, I don't know which, that makes it wait for input from the console.
My program is written in node.js and is single-threaded, so waiting for input blocks the whole event loop. When I press "Enter", the program gets the input and continues to run.

I no longer print logs to the console and the issue is gone. I am just sharing my case for tracing, in case anybody has the same problem.
