Problems with EC2? #1185

ghost · 2018-11-03T14:28:30Z

I've searched before posting and see some similar symptoms, but didn't find an answer.

I'm running Algo on an EC2 t2.micro instance. The first one ran for a week or so, but seemed to slow down and eventually connection became impossible.

I terminated that instance and created a new one today. It worked well for a while, but began disconnecting and has now reached a state where I can't connect after multiple attempts (via macOS High Sierra Settings|Network).

During this time, I'm able to connect to the instance via SSH (top doesn't show anything unusual as far as I can tell), and the AWS Console shows status green/running state.

Of course, while writing this I was finally able to connect, but it again took multiple attempts, so I'm not sure how long it will last.

Has anybody else been seeing similar behavior? I guess it could be a connection problem on my end, though this isn't happening with anything else, or possibly even some provider intervention, but I'm not sure about that either.

If it continues to occur, I guess I could also try DigitalOcean, but I've read about its IPs being blocked by certain companies (likely relying on third-party options which may only proliferate).

Another update: I disconnected deliberately, but now can't reconnect--probably 8-10 attempts have failed.

Question: Do I need to wait a certain amount of time before attempting to reconnect?

Also, is there anything I can be looking at or for, such as particular processes running or not?

TC1977 · 2018-11-03T16:27:57Z

I wonder if this is related to the infamous #963. Connect via SSH and run sudo ip xfrm pol list to see whether you have a ton of stale policies that aren't being used. With that bug, usually only outgoing policies are retained. You can also run grep unable /var/log/syslog | grep charon and see whether you get a whole bunch of error messages like "unable to install policy, the reqid already exists".
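To make those two checks easier to read at a glance, here is a minimal sketch that summarizes them. The function names are made up for illustration, and it assumes that each policy in the ip xfrm pol list output carries a line of the form "dir in", "dir fwd" or "dir out"; check that against what your server actually prints.

```shell
#!/bin/sh
# Hypothetical helpers, not part of Algo or strongSwan.

# Count xfrm policies per direction.
# Reads the output of `ip xfrm pol list` on stdin.
# Assumes each policy has an indented "dir <direction> ..." line.
count_policy_dirs() {
    awk '$1 == "dir" { n[$2]++ }
         END { printf "in=%d fwd=%d out=%d\n", n["in"], n["fwd"], n["out"] }'
}

# Count charon log lines containing "unable" (same filter as the
# grep pipeline above). Reads syslog text on stdin.
count_policy_errors() {
    # grep -c prints 0 but exits non-zero when nothing matches,
    # so swallow the exit status.
    grep charon | grep -c unable || true
}
```

Possible usage on the server: sudo ip xfrm pol list | count_policy_dirs, then count_policy_errors < /var/log/syslog. A much larger "out" count than "in" and "fwd", together with a non-zero error count, would be consistent with the symptoms described for #963.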

See #963 for more discussion, and also #1178. The only solution we really have for now is to flush the policies and restart strongSwan, using sudo ip xfrm pol flush and then sudo ipsec restart. @davidemyers just posted a script that should do that automatically in #1178.
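For illustration only, the flush-and-restart workaround could be wrapped roughly like this. This is not the script from #1178; the function names and the DRY_RUN switch are assumptions made for the sketch. It only acts when the charon error check finds something, and it needs to run as root unless DRY_RUN=1 is set.

```shell
#!/bin/sh
# Hypothetical wrapper around the manual workaround; not the #1178 script.

# Run a command, or just print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# Flush xfrm policies and restart strongSwan, but only if charon has
# logged "unable ..." messages in the given log (default /var/log/syslog).
flush_if_stale() {
    log_file="${1:-/var/log/syslog}"
    errors=$(grep charon "$log_file" 2>/dev/null | grep -c unable || true)
    if [ "${errors:-0}" -eq 0 ]; then
        echo "no charon policy errors found; nothing to do"
        return 0
    fi
    echo "found $errors charon policy errors; flushing and restarting"
    run ip xfrm pol flush
    run ipsec restart
}
```

Try DRY_RUN=1 flush_if_stale first to see what it would do. One limitation of this sketch: old error lines stay in the log after a flush, so it will keep triggering until the log rotates.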

ghost · 2018-11-04T05:57:10Z

Hey @TC1977, thanks for the help and the quick response. I'll have to learn more about what exactly these policies do, but I don't think I'm seeing much there that wouldn't normally be there: six that I think are standard network addresses, and four each of 'any address,' in IPv4 and IPv6 versions. But I'm certainly not an expert on this (or many other things).

grep unable /var/log/syslog | grep charon came up empty.

I gather from other posts that disconnects aren't normal, but I notice that even when it's working there's still what seems like a timeout/disconnect that occurs, I think after a period of inactivity.

I'm fully flushed and restarted, and will see if I notice any difference now. Disconnect or inability to connect isn't reproducible on demand for me, as I was able to successfully connect again this morning, though again with an eventual disconnect/timeout as mentioned above.

By the way Team Algo, other than a hiccup with figuring out what and where to run after cloning the repo (I think the README is geared toward downloading the zipped version), this was one of the better "out of the box" open source experiences I've had. No extensive, separate pre-req installations, no multiple config files and steps, just follow the prompts to enter my key id and key and let Algo do its thing. Very polished and smoothed out (when we know that last 10-20% takes a lot of time and willpower to complete).

[Plus you didn't close the issue and tell me to go post on StackExchange, where questions go to die. :)]

TC1977 · 2018-11-04T13:54:45Z

Sure. To clarify, #963 refers specifically to a connecting/reconnecting loop that seems to be a bug in strongSwan. No real resolution yet. If you’re not seeing any “unable to install policy” errors, then you’ve got a different problem. “Stale policies” with this bug will show up in ip xfrm pol list as outgoing policies (“dir out”) without corresponding in or fwd policies.
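As a rough way to spot that pattern, the sketch below lists outgoing policies that have no in or fwd policy with the mirrored selector. The function name is made up, and it assumes the ip xfrm pol list layout of a "src A dst B" selector line followed by a "dir ..." line. Policies that legitimately exist in one direction only would show up as false positives, so treat the output as a hint rather than a verdict.

```shell
#!/bin/sh
# Hypothetical helper: report "dir out" policies that lack a mirrored
# "dir in" or "dir fwd" policy. Reads `ip xfrm pol list` output on stdin.
find_stale_out() {
    awk '
        # Selector line of a policy: "src A dst B ..."
        $1 == "src" && $3 == "dst" { src = $2; dst = $4 }
        # Direction line that follows the selector
        $1 == "dir" {
            if ($2 == "out")
                out[src " " dst] = 1
            else
                # Store in/fwd selectors reversed so they line up
                # with the key of the matching out policy.
                inbound[dst " " src] = 1
        }
        END {
            for (k in out)
                if (!(k in inbound))
                    print "stale out policy: " k
        }'
}
```

Possible usage: sudo ip xfrm pol list | find_stale_out. No output means every outgoing policy has a counterpart.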

Disconnects after a period of inactivity are normal, and if you have “Connect on demand” enabled on iOS or macOS, it will just reconnect when it needs to send traffic again.

Some other places to look for problems on the Algo server: run sudo ipsec statusall to confirm that you actually do connect and send traffic, and service dnscrypt-proxy status to make sure your DNS service is working.

ghost · 2018-11-07T10:04:27Z

Not sure on this one so will close for now--the timeouts are normal then, and if I'm unable to connect I'll try the commands again, or just wait a bit.

I know it does work fine when connected, because I've been checking my IP.

Thanks again for the help. 👍

ghost closed this as completed Nov 7, 2018
