
Countless retries and log spam for misconfigured port #31

Closed
simitt opened this issue Sep 18, 2017 · 27 comments


simitt commented Sep 18, 2017

I accidentally misconfigured the APM Server URL as serverUrl: 'https://localhost:82000'.
This led to countless retries of sending data to the APM Server, see logs:

logging error 6a9ff729-cec0-45d5-bc5e-a57e27f2a25d with Elastic APM
logging error 54c4eb0f-5a98-4b41-b07e-5b4f60a2ba5f with Elastic APM
logging error 8cbe19af-37fb-468b-b9bf-8070b5531ca4 with Elastic APM
logging error f9d805e8-488f-4a41-b4f6-20154d8101fe with Elastic APM
logging error 53df0b4b-2cd0-419c-8af4-27be7f36d716 with Elastic APM
logging error 80fd91b6-c09f-48cb-8787-b8ac0d1d1775 with Elastic APM
logging error c9858407-17a3-4968-8cce-297e0bd31f3d with Elastic APM
logging error b3a69e7b-8eec-4580-a219-e72eb4e5b637 with Elastic APM
logging error 6935b29d-9629-4dc2-8b62-9954e6559693 with Elastic APM
logging error 6e31e7cb-4a2d-44e1-bb1c-279bef550de5 with Elastic APM
logging error cf647462-65e1-48e3-a3c4-15a3779bce6f with Elastic APM
logging error 79e1590c-bdad-4c41-b97e-ccc43189bdbb with Elastic APM
...

When I misconfigure it to a different wrong port, e.g. 8201, I get a proper Connection Refused error instead.

simitt added the bug label Sep 18, 2017

watson commented Sep 19, 2017

For reference, this is the actual error that occurs. It's thrown by the agent and then caught again by the agent and so on:

net.js:1023
      throw new RangeError('"port" option should be >= 0 and < 65536: ' + port);
      ^

RangeError: "port" option should be >= 0 and < 65536: 82000
    at lookupAndConnect (net.js:1023:13)
    at TLSSocket.Socket.connect (net.js:995:5)
    at TLSSocket.connect (/Users/watson/code/node_modules/elastic-apm/lib/instrumentation/async-hooks.js:125:29)
    at Object.exports.connect (_tls_wrap.js:1070:12)
    at Agent.createConnection (https.js:111:22)
    at Agent.createSocket (_http_agent.js:214:26)
    at Agent.addRequest (_http_agent.js:184:10)
    at Agent.addRequest (/Users/watson/code/node_modules/elastic-apm/lib/instrumentation/async-hooks.js:144:23)
    at new ClientRequest (_http_client.js:272:16)
    at Object.request (http.js:39:10)
    at Object.request (/Users/watson/code/node_modules/elastic-apm/lib/instrumentation/http-shared.js:92:22)
    at Object.request (https.js:239:15)
    at /Users/watson/code/node_modules/elastic-apm/node_modules/elastic-apm-http-client/index.js:56:35
    at Gzip.onEnd (zlib.js:130:5)
    at emitNone (events.js:110:20)
    at Gzip.emit (events.js:207:7)
    at endReadableNT (_stream_readable.js:1045:12)
    at elasticAPMCallbackWrapper (/Users/watson/code/node_modules/elastic-apm/lib/instrumentation/index.js:109:27)
    at _combinedTickCallback (internal/process/next_tick.js:138:11)
    at process._tickCallback (internal/process/next_tick.js:180:9)


watson commented Sep 19, 2017

If the agent ever goes into an infinite loop like this, a trick is to add the config option captureExceptions: false. Then it will no longer catch uncaught exceptions.
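
For reference, a minimal sketch of what that could look like (the elastic-apm-node package name is the agent's current name, and the serviceName/serverUrl values are purely illustrative):

// Minimal sketch: start the agent with captureExceptions disabled so it
// does not install its own uncaughtException handler.
require('elastic-apm-node').start({
  serviceName: 'my-service',          // illustrative value
  serverUrl: 'http://localhost:8200', // illustrative value
  captureExceptions: false            // let the process crash instead of looping
})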


watson commented Sep 19, 2017

But obviously the agent shouldn't throw because of this. And it would be nice if we could in some way guard against these infinite call loops 🤔


watson commented Sep 19, 2017

After thinking about this, I'd say it's expected behaviour for the agent to throw in this case. So the actual issue here is the looping uncaught exception. I don't have a good way of detecting this at the moment. Suggestions much appreciated 😃


simitt commented Sep 19, 2017

Should the agent really be allowed to throw exceptions rather than just log them once per occurrence?
This worries me a bit, since it could lead to major overhead in the app if there is some misconfiguration.


watson commented Sep 19, 2017

The looping uncaught exception should obviously not happen, but throwing if you provide invalid input is a fairly normal behaviour in Node.js land. You could argue that this module is a special case though, not sure.


watson commented Dec 12, 2017

@Qard would love your input on this 😃


Qard commented Dec 12, 2017

I don't really have an opinion either way. Crashing is fairly standard in Node.js, but APM vendors generally try to have as few side effects as possible, so they tend to downgrade internal errors to warnings and just disable themselves.

If we want to just warn, we can catch the connection error internally and log that. If we want to throw, we could catch the connection error, put a special property on it and rethrow, then have our uncaughtException handler abort properly when it encounters an error with our special no-really-i-want-to-fail property.
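
A rough sketch of that second approach (the marker property and handler logic are made up for illustration, not the agent's actual implementation):

// Sketch only: mark errors we deliberately want to escape our own
// uncaughtException handler, using a made-up marker property.
const FATAL_MARKER = Symbol('elastic-apm-really-fail')

function rethrowAsFatal (err) {
  err[FATAL_MARKER] = true
  throw err
}

process.on('uncaughtException', function (err) {
  if (err[FATAL_MARKER]) {
    // The error came from us and we really do want to fail:
    // don't try to report it, just print and exit so we don't loop.
    console.error(err)
    process.exit(1)
    return
  }
  // ...otherwise capture and report the error as usual (agent logic omitted)
})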


Qard commented Feb 15, 2018

Do we have a decision for what to do about this? This issue has been open for a while...


watson commented Feb 16, 2018

No, but thanks for bumping it. We should find a good solution. Since this was opened, another related issue has been created as well: #232


watson commented Feb 16, 2018

The two options we have are:

  1. Throw on errors, but take care not to catch it ourselves as an uncaught exception
     i. One way to ensure we don't catch it ourselves is to add a property to the error that we can listen for
     ii. Another way is to analyze the stack trace to see if we're part of it
  2. Log errors to stderr instead of throwing

Only 1.ii handles the case where we throw by mistake (e.g. a coding mistake, or if someone passes us an unexpected argument that we try to access in a wrong way), so I'm leaning towards that solution. But I'm not sure how hard it would be to correctly detect these stack traces.
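
A very rough sketch of what 1.ii could look like (the package path and the crash-instead-of-report behaviour are assumptions; reliably detecting our own frames is exactly the hard part):

// Sketch only: inside the agent's uncaughtException handler, check whether
// the error's stack trace passes through the agent's own code.
const path = require('path')
const AGENT_DIR = path.join('node_modules', 'elastic-apm-node')

function thrownFromAgent (err) {
  return typeof err.stack === 'string' && err.stack.includes(AGENT_DIR)
}

process.on('uncaughtException', function (err) {
  if (thrownFromAgent(err)) {
    // Likely our own bug or invalid input to us: don't try to report it,
    // since that could throw again and loop. Log and crash instead.
    console.error('Elastic APM agent error:', err)
    process.exit(1)
    return
  }
  // ...normal capture-and-report path (omitted)
})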


Qard commented Oct 26, 2018

How have the changes in API v2 impacted this? Is this still relevant, or can we close this?


watson commented Nov 5, 2018

This is also an issue with v2 as far as I can see. However, the backoff algorithm should take care of it once we implement that.
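
For reference, a generic exponential-backoff sketch (purely illustrative; not the agent's actual algorithm or API):

// Generic capped exponential backoff between send attempts; illustrative only.
function backoffDelay (attempt, baseMs = 1000, maxMs = 60000) {
  return Math.min(baseMs * Math.pow(2, attempt), maxMs)
}

function scheduleRetry (attempt, sendFn) {
  setTimeout(sendFn, backoffDelay(attempt))
}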

@prashant-shahi

Are these issues resolved?

I am getting ECONNRESET about 20 seconds afterwards on the micro-service from where the transaction initiates, and ECONNREFUSED on the rest.


watson commented Dec 11, 2018

@coolboi567 No, this particular issue is not resolved. There's an in-progress PR: elastic/apm-nodejs-http-client#17

But are you sure that this is the issue you're experiencing? From reading your description, I'm not entirely sure what you're experiencing.

@prashant-shahi

> But are you sure that this is the issue you're experiencing?

Actually, I encountered this when I was checking the APM client's behavior (and the application's behavior) when the APM server is down or a different APM server address is configured.

The application's behavior was fine. All the native functionality seems to work fine.
It's just that the APM client was flooding the console with ECONNRESET and ECONNREFUSED.
Can I redirect only these APM client logs to a different file?


watson commented Dec 12, 2018

@coolboi567 Thanks for clarifying. It sounds like a slightly different but related issue. But the good news is that it will also be fixed by elastic/apm-nodejs-http-client#17.

Regarding redirecting the log output: By default the agent will log to stdout/stderr. But you can supply a custom logger using the logger config option. That will allow you to redirect the output to a specific file.
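
A minimal sketch of supplying a custom logger that appends to a file (the file path and the exact set of log methods the agent calls are assumptions; see the configuration docs for the interface it expects):

// Sketch: a bare-bones custom logger that appends agent output to a file.
const fs = require('fs')
const out = fs.createWriteStream('/var/log/my-app/apm-agent.log', { flags: 'a' })

function write (level) {
  return (...args) => out.write('[' + level + '] ' + args.join(' ') + '\n')
}

// The method names below are an assumption about what the agent calls.
const logger = {
  trace: write('trace'),
  debug: write('debug'),
  info: write('info'),
  warn: write('warn'),
  error: write('error'),
  fatal: write('fatal')
}

require('elastic-apm-node').start({
  logger: logger
})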


MHDMAM commented Dec 17, 2018

Well, I get something like this when I misconfigure something or am missing some npm modules.

However, I was checking my production server logs and found this for the first time (not quite sure if it's related):
logging error 81f41103-f2e8-404b-8624-426c1e2ac265 with Elastic APM
I was wondering how I can check what caused it.


watson commented Dec 17, 2018

@MHDMAM That sounds unrelated to this issue. Could I get you to ask on our Discuss forum instead please? 😃

@prashant-shahi

@watson Hey Thomas, any suggestion for a lightweight logger to separate the agent's output into a different file would be really helpful.

@vigneshshanmugam

@prashant-shahi

@vigneshshanmugam Hey Vignesh, can I achieve the same without using any logging module?
If so, how can I achieve it?
If not, can the same be achieved using winston, since we are already using it?


vigneshshanmugam commented Dec 31, 2018

@coolboi567 The default logger used in the agent only writes logs to STDOUT/STDERR based on the log level and does not have support for writing to files.

Winston does support it: https://github.com/winstonjs/winston/blob/master/docs/transports.md#file-transport
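
For example, a winston logger with a file transport could be passed as the agent's logger option (a sketch assuming winston 3.x; whether the agent accepts a winston logger directly depends on the log methods it actually calls):

// Sketch: route agent logs to a file via winston's File transport.
const winston = require('winston')

const winstonLogger = winston.createLogger({
  level: 'info',
  transports: [
    new winston.transports.File({ filename: 'apm-agent.log' })
  ]
})

require('elastic-apm-node').start({
  logger: winstonLogger
})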

@codebrain

Hi @vigneshshanmugam - did you mean to mention @coolboi567 instead?

@vigneshshanmugam

@codebrain Oops yeah sorry for the spam.. updated my comment :)


watson commented Jan 7, 2019

You can read how to change the logger here: https://www.elastic.co/guide/en/apm/agent/nodejs/current/configuration.html#logger

Qard pushed a commit to Qard/apm-agent-nodejs that referenced this issue Jan 17, 2019
Qard mentioned this issue Jan 17, 2019
Qard pushed a commit to Qard/apm-agent-nodejs that referenced this issue Jan 17, 2019
@alvarolobato

We have made some improvements to this situation (see the linked PRs and the comments about changing the logger). At this point we don't see any other clear improvement along these lines.
