Mystery failure accessing local server web pages #110

Open
scripting opened this Issue Feb 27, 2019 · 35 comments

8 participants
@scripting
Owner

scripting commented Feb 27, 2019

I'm having a problem accessing services on the same server.

When a Node.js process makes a request to another process running on the same server, I get an ECONNREFUSED. This has happened on two servers. The code used to work. It's part of my serverMonitor app.

Then I tried accessing the server through the CURL command line utility and it also failed.

Yet if I access the same address from another computer, it works.

WTF is going on?
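
One way to reproduce this from Node without any app code in the way is a bare TCP probe. The sketch below is illustrative and is not the test app from this thread; the `probe` name and the 3-second timeout are assumptions. It resolves with `'open'` when the connection succeeds, or with the error code (for example `ECONNREFUSED`) when it does not.

```javascript
const net = require('net');

// Try a plain TCP connection and report what happened.
// Resolves with 'open' on success, otherwise with an error code
// such as 'ECONNREFUSED' or 'ETIMEDOUT'.
function probe(host, port, timeoutMs = 3000) {
  return new Promise((resolve) => {
    const socket = net.connect({ host, port });
    socket.setTimeout(timeoutMs);
    socket.once('connect', () => {
      socket.destroy();
      resolve('open');
    });
    socket.once('timeout', () => {
      socket.destroy();
      resolve('ETIMEDOUT');
    });
    socket.once('error', (err) => resolve(err.code));
  });
}

// Usage (hypothetical): probe('likes.scripting.com', 80).then(console.log);
```

Running the same probe from the server itself and from another machine shows whether the refusal depends on where the request originates.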

scripting changed the title from "Mystery accessing local server web pages" to "Mystery failure accessing local server web pages" on Feb 27, 2019

tedchoward commented Feb 27, 2019

Can you share some sample code?

scripting commented Feb 27, 2019

Ted, in this case I think the problem has more to do with the system than the language, because curl showed the same behavior as the Node app.

Here's a screen shot of a terminal session where I show the source of the test app, the result of running it, and the curl call.

image

scripting commented Feb 27, 2019

The server is on http://digitalocean.com

theboxfactory commented Feb 27, 2019

Dave, I think you're correct: it's a configuration issue, not code.

What does a 'ping likes.scripting.com' on the server command line return?

Do you have an entry for 'likes.scripting.com' in your server hosts file?

scripting commented Feb 27, 2019

image

scripting commented Feb 27, 2019

I don't modify my server hosts file.

Also that's just one example, all the other sites on the same server return ECONNREFUSED.

And when I run it on another server, it fails to connect with servers running on that server, and works fine with the apps running on this server.

tedchoward commented Feb 27, 2019

This is a tough one. I have a digital ocean server, where I run river5.

screen shot 2019-02-27 at 3 54 37 pm

As you see, I can curl that server from that server.

So, what is different between our two environments?

tedchoward commented Feb 27, 2019

How do you run your node apps on digital ocean? Are they running as a daemon? Do you use a process manager?

tedchoward commented Feb 27, 2019

I'm using pm2 to run river5 as a background process.

ted@river5 in ted/
› pm2 list
┌────────┬────┬─────────┬──────┬────────┬───┬──────┬─────────┐
│ Name   │ id │ version │ mode │ status │ ↺ │ cpu  │ memory  │
├────────┼────┼─────────┼──────┼────────┼───┼──────┼─────────┤
│ river5 │ 0  │ 0.5.15  │ fork │ online │ 1 │ 0.3% │ 72.0 MB │
└────────┴────┴─────────┴──────┴────────┴───┴──────┴─────────┘
 Use `pm2 show <id|name>` to get more details about an app

scripting commented Feb 27, 2019

brentashley commented Feb 28, 2019

if you do
netstat -an | grep LISTEN
you will see what address(es) your http(s) ports are bound to. If they are bound to the external address only, they will not be listening on localhost (127.0.0.1), but if they are bound to 0.0.0.0 they are listening on all interfaces.

also, using ping, dig and nslookup can give you an idea whether different machines see themselves and other machines by different addresses.
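
To make the bound-address check concrete, here is a rough Node sketch that reads `netstat -an` output and reports which addresses a given port is listening on. The function names and any sample lines you feed it are made up for illustration, and it assumes the Linux column layout where the local address is the fourth column.

```javascript
// Return the local addresses on which `port` is in LISTEN state,
// given the text printed by `netstat -an`.
function listeningAddresses(netstatOutput, port) {
  const addrs = [];
  for (const line of netstatOutput.split('\n')) {
    const cols = line.trim().split(/\s+/);
    // Only TCP sockets in LISTEN state are of interest here.
    if (!cols[0].startsWith('tcp') || cols[cols.length - 1] !== 'LISTEN') continue;
    const local = cols[3]; // e.g. "0.0.0.0:1339" or ":::22"
    const idx = local.lastIndexOf(':');
    if (Number(local.slice(idx + 1)) === port) addrs.push(local.slice(0, idx));
  }
  return addrs;
}

// Describe who can reach a socket bound to `addr`.
function reachableVia(addr) {
  if (addr === '0.0.0.0' || addr === '::') return 'all interfaces';
  if (addr.startsWith('127.') || addr === '::1') return 'loopback only';
  return 'only ' + addr;
}
```

If the port shows up only under the external address, requests to 127.0.0.1 will be refused even though outside clients connect fine.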

scripting commented Feb 28, 2019

brentashley commented Feb 28, 2019

If the machines use different DNS servers or have hosts entries, the external name could resolve differently. That is, if a machine's hosts file has the name mapped to 127.0.0.1, that is the address it will use rather than the external interface.
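
From Node, the two resolution paths can be compared directly: `dns.lookup` goes through the operating system's resolver (so it honors the hosts file), while `dns.resolve4` queries DNS itself. The sketch below is only an illustration; `compareResolution` and `isLoopback` are hypothetical names, not anything from the apps in this thread.

```javascript
const dns = require('dns').promises;

// True for IPv4 127.0.0.0/8 and the IPv6 loopback address.
function isLoopback(address) {
  return address.startsWith('127.') || address === '::1';
}

// Compare what the OS resolver returns (hosts file included) with
// what DNS itself returns for the same name.
async function compareResolution(hostname) {
  const viaSystem = (await dns.lookup(hostname, { all: true })).map((a) => a.address);
  let viaDns;
  try {
    viaDns = await dns.resolve4(hostname);
  } catch (err) {
    viaDns = []; // no A record, or DNS not reachable
  }
  return { viaSystem, viaDns, resolvesToLoopback: viaSystem.some(isLoopback) };
}

// Usage (hypothetical): compareResolution('likes.scripting.com').then(console.log);
```

If `viaSystem` and `viaDns` disagree, something on the machine (a hosts entry or a different resolver) is overriding public DNS.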

justchapman commented Feb 28, 2019

I'm wondering if this is a firewall issue on the host. Maybe it's blocking requests that originate from localhost to port 80?

theboxfactory commented Feb 28, 2019

Dave, can you please try adding an entry to your hosts file as follows:

127.0.0.1 likes.scripting.com

Please let me know how, if at all, this affects the curl call and your test.js script.

scripting commented Feb 28, 2019

I appreciate all the help, that's the first thing.

Second. I never ever under any circumstances change the hosts file. I move apps around a lot and they have to work wherever I put them. I use DNS.

I did a host command on the same server likes.scripting.com is running on. This is what it returned.

image

scripting commented Feb 28, 2019

Okay, here's something that's probably different about my systems from yours.

I'm running my own HTTP server, PagePark.

So the delegation is going through that app.

This is the code that's being executed on a delegated request.

https://github.com/scripting/pagePark/blob/master/pagepark.js#L539

I'm starting to chase down this thread. I'm guessing something must have changed somewhere or there's a bug that's been in there for years that's only showing up now for some reason.

scripting commented Feb 28, 2019

PagePark is running on port 1339.

That's mapped onto port 80 with this command.

sudo iptables -t nat -A PREROUTING -p tcp --dport 80 -j REDIRECT --to-port 1339

Same thing on all my servers which run PagePark, which is almost all of them.

papascott commented Feb 28, 2019

I see on DNS you're running feedbase.io on the same server. Does curl http://feedbase.io return the same error as curl http://likes.scripting.com?

scripting commented Feb 28, 2019

@papascott -- yes. that was in the initial spec of the problem. But I just re-verified it.

scripting commented Feb 28, 2019

The request coming from curl does not show up in the pagepark log.

That's pretty convincing evidence that the request is not being seen by pagepark.

justchapman commented Feb 28, 2019

I vaguely remember running into a similar issue with nginx to a NAT'd address. Try briefly stopping the firewall to test. That would help eliminate culprits.

mcenirm commented Feb 28, 2019

It's been a while since I delved that deeply into iptables, but PREROUTING doesn't apply to traffic from the machine, since no routing is actually needed, I guess. I think the trick is to add a duplicate rule in the OUTPUT chain. Be sure to specify the destination (-d) so it doesn't redirect on traffic to other systems.
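
A quick way to check this on a given box is to see which nat chains actually carry the redirect. The sketch below parses text in the rule-listing format printed by `sudo iptables -t nat -S`; the function names are illustrative assumptions, and any rules you pass in are your own, not output from the servers in this thread.

```javascript
// Return the set of nat chains that REDIRECT the given destination port,
// based on rule lines such as:
//   -A PREROUTING -p tcp -m tcp --dport 80 -j REDIRECT --to-ports 1339
function redirectChains(rulesText, port) {
  const chains = new Set();
  for (const line of rulesText.split('\n')) {
    const tokens = line.trim().split(/\s+/);
    if (tokens[0] !== '-A') continue; // skip policy lines and blanks
    const i = tokens.indexOf('--dport');
    if (i === -1 || Number(tokens[i + 1]) !== port) continue;
    if (tokens[tokens.indexOf('-j') + 1] === 'REDIRECT') chains.add(tokens[1]);
  }
  return chains;
}

// Locally generated packets pass through OUTPUT, not PREROUTING,
// so the redirect only applies to them if OUTPUT has a rule too.
function coversLocalTraffic(rulesText, port) {
  return redirectChains(rulesText, port).has('OUTPUT');
}
```

With only a PREROUTING rule, `coversLocalTraffic` returns `false`, which matches the symptom: outside clients reach port 80, while curl on the server itself is refused.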

scripting commented Feb 28, 2019

papascott commented Feb 28, 2019

Grasping at straws here... might there be a crashed or hung process blocking access to port 80 (but only from localhost)?

scripting commented Feb 28, 2019

brentashley commented Feb 28, 2019

You should be able to confirm or deny arrival at port 80 with tcpdump

I think you should also be able to confirm arrival at port 1339 with tcpdump after the iptables redirect

scripting commented Feb 28, 2019

brentashley commented Feb 28, 2019

find out which interface has the external address from ifconfig. let's say that interface is called eth0.

sudo tcpdump -i eth0 tcp port 80

then curl to port 80 from another ssh session and you should see that traffic. (You may see lots of other traffic, since it is a live server with other things on port 80 too, so you will have to look for the right source IP to confirm your traffic.)

then you can hit ctrl-c to stop tcpdump, and try again with port 1339

sudo tcpdump -i eth0 tcp port 1339

and see if your requests are getting that far.

jystervinou commented Feb 28, 2019

curl http://likes.scripting.com:1339 is also failing from the local server?

scripting commented Feb 28, 2019

@jystervinou -- interesting question. it does not fail.

image

jystervinou commented Feb 28, 2019

So the web server part is configured and listening correctly on the right ip/port.
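
Put as a decision rule, the reasoning so far looks roughly like the sketch below. The function name and the result strings (`'open'` or an error code such as `'ECONNREFUSED'`) are assumptions for illustration only.

```javascript
// Interpret the result of connecting to the public port (80) and to the
// port the server really listens on (1339), both from the server itself.
function diagnose(publicPortResult, backendPortResult) {
  if (backendPortResult !== 'open') {
    return 'backend unreachable: check that the server is running and which address it is bound to';
  }
  if (publicPortResult === 'open') {
    return 'both ports answer: nothing wrong at the TCP level';
  }
  return 'backend is up but the public port fails: suspect the port redirect or a firewall rule';
}

// The situation reported in this thread:
console.log(diagnose('ECONNREFUSED', 'open'));
```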

I'd dive into the iptables side, but I don't know it that well. @brentashley's suggestion to watch the TCP traffic arriving at port 80 with tcpdump is a good idea.

If it is a problem with iptables, that still does not explain why it would have changed in recent days.

Can someone explain how to see the iptables log of dropped packets? Is it in /var/log/messages?

justchapman commented Feb 28, 2019

Are you by chance running something like fail2ban or APF Firewall with BFD? It's possible it auto-added a block to your firewall. That would explain the sudden change.

Try flushing your iptables rules: iptables --flush

then test again. If it works now, it's a firewall rule.

Be sure to restart the iptables service to re-apply your ruleset after testing.

mcenirm commented Mar 1, 2019

I have no explanation for the change in behavior. Based on the iptables settings you described, it shouldn't have worked from the same machine at all.

The direct solution to your problem is to add the rule to the OUTPUT chain, in addition to the PREROUTING chain, but restrict the destination to the externally-visible IP address:

iptables -t nat -A PREROUTING -p tcp --dport 80 -j REDIRECT --to-port 1339
iptables -t nat -A OUTPUT -d 104.236.68.5 -p tcp --dport 80 -j REDIRECT --to-port 1339

Another approach is to have pagepark use port 80 directly, so that the iptables voodoo is unnecessary. If you're using a systemd-based Linux distribution, such as recent Ubuntu or CentOS 7, then socket activation is a good, stable way to keep things working consistently. See Ruben Vermeersch's "Deploying Node.js with systemd" for a nice how-to.

I tried the more direct approach, where pagepark listens on port 80 (ie, setting myPort to 80), using privbind (privbind -u worker forever start pagepark.js), but for some reason, it didn't work as expected, while something unrelated to nodejs worked just fine (privbind -u worker nc -v -l 999). This was on the nodejs droplet image: Ubuntu 18.04.1, node v8.10.0, privbind 1.2. There is also authbind, but it requires a bit more setup, so I haven't tested it yet.

You can also try a conventional reverse proxy, like nginx or digitalocean's load balancers, but I understand that would overlap pagepark's delegation feature.

Finally, if you're running pagepark as root, then all of the above is irrelevant, since root can bind port 80 without any trouble.
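
For the option of listening on port 80 directly, a minimal Node sketch follows. It is not PagePark's code; `start` and `explainListenError` are hypothetical names. It only shows where the failure surfaces when an unprivileged process tries to bind a low port.

```javascript
const http = require('http');

// Turn the error code from a failed listen() into a readable hint.
function explainListenError(code, port) {
  if (code === 'EACCES') {
    return `no permission to bind port ${port}; run as root or use a redirect or helper`;
  }
  if (code === 'EADDRINUSE') {
    return `port ${port} is already in use by another process`;
  }
  return `listen on port ${port} failed: ${code}`;
}

// Start a trivial HTTP server on the given port and log why it failed, if it did.
function start(port) {
  const server = http.createServer((req, res) => res.end('ok\n'));
  server.on('error', (err) => console.error(explainListenError(err.code, port)));
  server.listen(port);
  return server;
}

// Usage (hypothetical): start(80);
```

On a default Linux setup, ports below 1024 are privileged, so `start(80)` as a regular user is expected to report the `EACCES` case.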

scripting commented Mar 1, 2019

There's a lot of new info here, and it's hard for me to digest because I am only surface-level aware of the way the TCP functions in Linux work. I will try to figure out what you all are saying. ;-)

In the meantime, there is a possible source of the problem. One of the servers moved from EC2 in Amazon's cloud to Digital Ocean fairly recently. It's possible the problem surfaced during the move, and was only noticed now. The reason -- the software that's got the problem is the software that watches my servers to see if everything is working. ;-)

So what we're seeing here may be a difference between AWS and Digital Ocean.

Note also: The urgency is gone because I worked around the problem, by moving the serverMonitor app onto its own virtual server. So it doesn't have to watch any processes on the same server.
