Haskell stack hangs during setup #2290

Closed
ghost opened this issue Jul 6, 2017 · 5 comments

@ghost

ghost commented Jul 6, 2017

Version: Microsoft Windows [Version 10.0.16232.1000]
Issue: Command stack setup hangs while trying to download a file:

 $> stack setup -v
Version 1.4.0, Git revision e714f1dd3fade19496d91bd6a017e435a96a6bcd (4640 commits) x86_64 hpack-0.17.0  
2017-07-05 21:12:23.898157: [debug] Checking for project config at: /home/sanghakchun/stack.yaml  
@(Stack/Config.hs:935:9)  
2017-07-05 21:12:23.899831: [debug] Checking for project config at: /home/stack.yaml  
@(Stack/Config.hs:935:9)  
2017-07-05 21:12:23.900575: [debug] Checking for project config at: /stack.yaml  
@(Stack/Config.hs:935:9)  
2017-07-05 21:12:23.901243: [debug] No project config file found, using defaults.  
@(Stack/Config.hs:964:13)  
2017-07-05 21:12:23.904633: [debug] Run from outside a project, using implicit global project config  
@(Stack/Config.hs:526:13)  
2017-07-05 21:12:23.905576: [info] Writing implicit global project config file to: /home/sanghakchun/.stack/global-project/stack.yaml  
@(Stack/Config.hs:554:20)  
2017-07-05 21:12:23.906408: [info] Note: You can change the snapshot via the resolver field there.  
@(Stack/Config.hs:555:20)  
2017-07-05 21:12:23.907175: [debug] Downloading snapshot versions file from https://s3.amazonaws.com/haddock.stackage.org/snapshots.json  
@(Stack/Config.hs:177:5)
^C

stack should progress beyond this step, but the command hangs here instead. On rare occasions it reaches the next stage, where it hangs trying to fetch another file:

2017-07-05 21:05:53.831107: [debug] Downloading snapshot versions file from https://s3.amazonaws.com/haddock.stackage.org/snapshots.json
@(Stack/Config.hs:177:5)
2017-07-05 21:05:58.023132: [debug] Done downloading and parsing snapshot versions file
@(Stack/Config.hs:179:5)
2017-07-05 21:05:58.023741: [info] Using latest snapshot resolver: lts-8.21
@(Stack/Config.hs:617:17)
2017-07-05 21:05:58.032268: [debug] Trying to decode /home/sanghakchun/.stack/build-plan-cache/x86_64-linux/lts-8.21.cache
@(Data/Store/VersionTagged.hs:68:5)
2017-07-05 21:05:58.033061: [debug] Exception ignored when attempting to load /home/sanghakchun/.stack/build-plan-cache/x86_64-linux/lts-8.21.cache: /home/sanghakchun/.stack/build-plan-cache/x86_64-linux/lts-8.21.cache: openBinaryFile: does not exist (No such file or directory)
@(Data/Store/VersionTagged.hs:86:9)
2017-07-05 21:05:58.033427: [debug] Failure decoding /home/sanghakchun/.stack/build-plan-cache/x86_64-linux/lts-8.21.cache
@(Data/Store/VersionTagged.hs:75:13)
2017-07-05 21:05:58.033939: [debug] Decoding build plan from: /home/sanghakchun/.stack/build-plan/lts-8.21.yaml
@(Stack/BuildPlan.hs:493:5)
2017-07-05 21:05:58.034598: [debug] Decoding build plan from file failed: InvalidYaml (Just (YamlException "Yaml file not found: /home/sanghakchun/.stack/build-plan/lts-8.21.yaml"))
@(Stack/BuildPlan.hs:498:13)
2017-07-05 21:05:58.035647: [debug] Downloading build plan from: https://raw.githubusercontent.com/fpco/lts-haskell/master//lts-8.21.yaml
@(Stack/BuildPlan.hs:503:13)
2017-07-05 21:05:58.036219: [debug] Downloading /fpco/lts-haskell/master//lts-8.21.yaml
@(Network/HTTP/Download.hs:78:5)

Offending strace portion:

socket(PF_INET, SOCK_STREAM, IPPROTO_TCP) = 10
fcntl(10, F_GETFL)                      = 0x2 (flags O_RDWR)
fcntl(10, F_SETFL, O_RDWR|O_NONBLOCK)   = 0
connect(10, {sa_family=AF_INET, sin_port=htons(443), sin_addr=inet_addr("54.231.113.208")}, 16) = ? ERESTARTSYS (To be restarted if SA_RESTART is set)
--- SIGVTALRM {si_signo=SIGVTALRM, si_code=SI_TIMER, si_timerid=0, si_overrun=0, si_value={int=0, ptr=0}} ---
rt_sigreturn({mask=[]})                 = 42
connect(10, {sa_family=AF_INET, sin_port=htons(443), sin_addr=inet_addr("54.231.113.208")}, 16) = ? ERESTARTSYS (To be restarted if SA_RESTART is set)
--- SIGVTALRM {si_signo=SIGVTALRM, si_code=SI_TIMER, si_timerid=0, si_overrun=0, si_value={int=0, ptr=0}} ---
rt_sigreturn({mask=[]})                 = 42
connect(10, {sa_family=AF_INET, sin_port=htons(443), sin_addr=inet_addr("54.231.113.208")}, 16) = ? ERESTARTSYS (To be restarted if SA_RESTART is set)
--- SIGVTALRM {si_signo=SIGVTALRM, si_code=SI_TIMER, si_timerid=0, si_overrun=0, si_value={int=0, ptr=0}} ---
rt_sigreturn({mask=[]})                 = 42
connect(10, {sa_family=AF_INET, sin_port=htons(443), sin_addr=inet_addr("54.231.113.208")}, 16) = ? ERESTARTSYS (To be restarted if SA_RESTART is set)
--- SIGVTALRM {si_signo=SIGVTALRM, si_code=SI_TIMER, si_timerid=0, si_overrun=0, si_value={int=0, ptr=0}} ---
rt_sigreturn({mask=[]})                 = 42
connect(10, {sa_family=AF_INET, sin_port=htons(443), sin_addr=inet_addr("54.231.113.208")}, 16) = ? ERESTARTSYS (To be restarted if SA_RESTART is set)
--- SIGVTALRM {si_signo=SIGVTALRM, si_code=SI_TIMER, si_timerid=0, si_overrun=0, si_value={int=0, ptr=0}} ---
rt_sigreturn({mask=[]})                 = 42
connect(10, {sa_family=AF_INET, sin_port=htons(443), sin_addr=inet_addr("54.231.113.208")}, 16) = ? ERESTARTSYS (To be restarted if SA_RESTART is set)
--- SIGVTALRM {si_signo=SIGVTALRM, si_code=SI_TIMER, si_timerid=0, si_overrun=0, si_value={int=0, ptr=0}} ---
rt_sigreturn({mask=[]})                 = 42

`... and it repeats forever`

Issue seems network-related

Full strace: https://gist.github.com/sanghakchun/46850f9407ca37b46470c58c40956660

@benhillis benhillis self-assigned this Jul 11, 2017
@benhillis
Member

I'll take a look at this and see if I can get a repro.

@benhillis
Member

benhillis commented Jul 26, 2017

@sanghakchun - Thank you very much for the excellent repro steps and debugging information. As you suspected this is network related. Here's the rundown:

The application sets non-blocking mode on the socket file descriptor and calls connect, expecting it to return EINPROGRESS. Since we don't support non-blocking connect, we do a normal blocking connect instead. The application also has a POSIX timer delivering a signal every 10000000 nanoseconds (10 ms), which causes each of these connect calls to be interrupted by the signal before the request can complete (it is going over the network, after all). The connect syscall then restarts and is interrupted again by the timer; rinse, repeat.
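For readers unfamiliar with the pattern, here is a minimal Python sketch of the non-blocking connect sequence that an application typically expects on Linux. It is an illustration only, not stack's or the GHC runtime's actual code; the function name `nonblocking_connect` and the `timeout` parameter are assumptions made for this example.

```python
import errno
import select
import socket


def nonblocking_connect(host, port, timeout=5.0):
    """Start a TCP connect without blocking, then wait for it with select()."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setblocking(False)  # same effect as fcntl(fd, F_SETFL, O_NONBLOCK)

    # connect_ex returns an errno value instead of raising an exception.
    # On Linux a non-blocking connect normally reports EINPROGRESS here.
    rc = sock.connect_ex((host, port))
    if rc not in (0, errno.EINPROGRESS):
        sock.close()
        raise OSError(rc, "connect failed immediately")

    # The socket becomes writable once the handshake finishes or fails.
    _, writable, _ = select.select([], [sock], [], timeout)
    if not writable:
        sock.close()
        raise TimeoutError("connect did not complete in time")

    # SO_ERROR holds the final result of the asynchronous connect.
    err = sock.getsockopt(socket.SOL_SOCKET, socket.SO_ERROR)
    if err != 0:
        sock.close()
        raise OSError(err, "connect failed")
    return sock
```

Because the caller returns from connect right away and waits in select, a periodic signal cannot force the connection attempt itself to start over.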

Non-blocking connect is on our backlog, but until that is implemented I've drafted a fix that will not allow connect requests that should be non-blocking to be interrupted by signals. With this fix stack setup is able to proceed successfully.
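To illustrate why the restart loop never finishes and why blocking signal interruption helps, here is a toy timing model in Python. It is only a sketch under simplifying assumptions (the connect restarts from scratch at every timer tick, and the latency figures are hypothetical); it does not reflect how the WSL fix is actually implemented, and the name `simulate_restarting_connect` is made up for this example.

```python
def simulate_restarting_connect(work_ns, timer_ns, start_ns=0,
                                max_restarts=1000, mask_signals=False):
    """Return how many restarts happen before connect completes, or None.

    work_ns      -- uninterrupted time the connect needs (hypothetical value)
    timer_ns     -- period of the signal-delivering timer
    mask_signals -- if True, the signal cannot interrupt the connect
    """
    now = start_ns
    restarts = 0
    while restarts <= max_restarts:
        # Time of the next timer tick strictly after `now`.
        next_tick = (now // timer_ns + 1) * timer_ns
        if mask_signals or now + work_ns <= next_tick:
            return restarts  # connect finished before being interrupted
        # Interrupted by the signal: the syscall restarts at the tick.
        now = next_tick
        restarts += 1
    return None  # never completed within max_restarts
```

With a 10 ms timer, a connect that needs 50 ms never completes in this model (`simulate_restarting_connect(50_000_000, 10_000_000)` returns `None`), while the same call with `mask_signals=True` returns `0`. A connect that needs less than one timer period can still get through after a restart.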

This change is in code review and will be available in an insider build (and the Fall Creators Update).

@ghost
Author

ghost commented Jul 26, 2017

@benhillis Thanks for the update and the upcoming fix! Looking forward to the build.

@sunilmut
Member

sunilmut commented Aug 2, 2017

@sanghakchun - Thanks for the post and the details. Thanks @benhillis for bringing this to my attention and for the workaround fix. The PR for non-blocking connect is under review and will go in soon. It unblocks this scenario. The fix will also be available in the Fall Creators Update.

@sunilmut sunilmut added fixed and removed fixinbound labels Oct 27, 2017
@sunilmut
Member

This is fixed in 16273.
