cleanup discovery dependency logs #403

Merged
ilkinabdullayev merged 17 commits into master from us/cleanup_discovery_dependency_logs on Oct 21, 2019


ilkinabdullayev commented Oct 7, 2019

Hidden Messages

HIDE1:

2019-10-03 07:35:07.817 <ZWEADS1:main:13854> abdil01 INFO  (org.apache.coyote.http11.Http11NioProtocol) Initializing ProtocolHandler ["https-jsse-nio-127.0.0.1-10011"]
2019-10-03 07:36:27.940 <ZWEADS1:main:14186> abdil01 INFO  (org.eclipse.jetty.util.log) Logging initialized @5080ms to org.eclipse.jetty.util.log.Slf4jLog
2019-10-03 07:36:29.936 <ZWEADS1:main:14186> abdil01 INFO  (org.apache.tomcat.util.net.NioSelectorPool) Using a shared selector for servlet write/read

HIDE2:

Hidden because the log message "Tomcat started on port(s): 10011 (https) with context path ''" already reports the same information.

2019-10-03 13:03:24.659 <ZWEADS1:main:72754> abdil01 INFO  (org.springframework.boot.web.embedded.tomcat.TomcatWebServer) Tomcat initialized with port(s): 10011 (https)

HIDE3:

These are hidden because every service sends heartbeats to Eureka, and when a service is not registered yet, the heartbeat is answered with "register". These warnings will therefore always appear once for every service, at its first heartbeat.

2019-10-07 07:45:55.890 <ZWEADS1:https-jsse-nio-127.0.0.1-10011-exec-10:27748> abdil01 WARN  (com.netflix.eureka.registry.AbstractInstanceRegistry) DS: Registry: lease doesn't exist, registering resource: DISCOVERY - localhost:discovery:10011
2019-10-07 07:45:55.890 <ZWEADS1:https-jsse-nio-127.0.0.1-10011-exec-10:27748> abdil01 WARN  (com.netflix.eureka.resources.InstanceResource) Not Found (Renew): DISCOVERY - localhost:discovery:10011
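
To illustrate why these two warnings show up once per service, here is a minimal Java sketch of the heartbeat-then-register flow described above. The class and method names (`HeartbeatSketch`, `renew`, `register`, `heartbeat`) are hypothetical and do not mirror Eureka's real API; the sketch only models the order of events.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical model of the flow; not Eureka's actual classes or API.
class HeartbeatSketch {
    // Instance ids that currently hold a lease in the simulated registry.
    private final Set<String> leases = new HashSet<>();

    // Server side: a renew succeeds only if a lease already exists.
    boolean renew(String instanceId) {
        return leases.contains(instanceId);
    }

    // Server side: registering creates the lease.
    void register(String instanceId) {
        leases.add(instanceId);
    }

    // Client side: send a heartbeat and fall back to registration when the
    // registry does not know the instance yet. Returns true if a
    // registration was needed (the case that produces the warnings).
    boolean heartbeat(String instanceId) {
        if (renew(instanceId)) {
            return false;
        }
        register(instanceId);
        return true;
    }
}
```

In this model only the first `heartbeat` call for a given id needs a registration; every later call is a plain renew, which matches the warnings appearing only at first contact.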

NEEDS INVESTIGATION

PROBLEM1:

2019-10-03 13:29:12.924 <ZWEADS1:main:2914> at670475 WARN  (com.netflix.eureka.cluster.PeerEurekaNodes) The replica size seems to be empty. Check the route 53 DNS Registry
This message refers to Amazon Route 53, a highly available and scalable cloud Domain Name System (DNS) service from AWS.

ANSWER:
From https://blog.asarkar.org/technical/netflix-eureka/:
Netflix code (com.netflix.eureka.cluster.PeerEurekaNodes.isThisMyUrl) filters out the peer URLs that are on the same host. This may have been done to prevent the server registering as its own peer (I’m guessing here) but because they don’t check for the port, peer awareness doesn’t work unless the Eureka hostnames in the eureka.client.serviceUrl.defaultZone are different. The hacky workaround for this is to define unique hostnames and then map them to 127.0.0.1 in the /etc/hosts file (or its Windows equivalent). Spring Cloud doc talks about this workaround but fails to mention why it’s needed.
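
The effect described in the quoted post can be sketched roughly as follows. This is a simplified illustration under the assumption that the filter compares hostnames only; it is not the actual Netflix implementation, and `PeerUrlFilterSketch`, `looksLikeMyUrl` and `replicaUrls` are made-up names.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.List;

// Simplified illustration only; not the real PeerEurekaNodes code.
class PeerUrlFilterSketch {

    // Extracts the host part of a URL, or null if the URL has none.
    static String hostOf(String url) {
        return URI.create(url).getHost();
    }

    // Host-only comparison: the port is ignored, as the quoted post describes.
    static boolean looksLikeMyUrl(String myUrl, String peerUrl) {
        String myHost = hostOf(myUrl);
        return myHost != null && myHost.equalsIgnoreCase(hostOf(peerUrl));
    }

    // Returns the configured peer URLs that survive the host-only filter.
    static List<String> replicaUrls(String myUrl, List<String> configuredUrls) {
        List<String> replicas = new ArrayList<>();
        for (String url : configuredUrls) {
            if (!looksLikeMyUrl(myUrl, url)) {
                replicas.add(url);
            }
        }
        return replicas;
    }
}
```

With such a filter, two instances on `localhost` that differ only by port filter each other out, so the replica list ends up empty, which would be consistent with the "replica size seems to be empty" warning.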

PROBLEM2:

2019-10-07 08:21:58.078 <ZWEADS1:Thread-14:37164> abdil01 WARN  (com.netflix.eureka.registry.AbstractInstanceRegistry) There is an existing lease and the existing lease's dirty timestamp 1570436488118 is greater than the one that is being registered 1570436427966
2019-10-07 08:21:58.078 <ZWEADS1:Thread-14:37164> abdil01 WARN  (com.netflix.eureka.registry.AbstractInstanceRegistry) Using the existing instanceInfo instead of the new instanceInfo as the registrant

Netflix/eureka#802

ANSWER:
The problem occurs because Eureka tries to register itself twice. Other clients behave as expected: when they start, they register with Eureka once. Since our Eureka server is also a Eureka client, it attempts registration twice, and the second registration carries a timestamp that is older than the one in the previous request.
We should probably avoid configuring our Eureka server as a client.
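
The rule that the two warnings describe can be sketched as below. This is a minimal illustration assuming the registry simply keeps whichever instance info has the newer dirty timestamp; `DirtyTimestampSketch`, `Instance` and `chooseRegistrant` are hypothetical names, not Eureka classes.

```java
// Hypothetical sketch of the comparison the warnings describe.
class DirtyTimestampSketch {

    // Stand-in for instance info: an id plus its last-dirty timestamp (ms).
    static class Instance {
        final String id;
        final long lastDirtyTimestamp;

        Instance(String id, long lastDirtyTimestamp) {
            this.id = id;
            this.lastDirtyTimestamp = lastDirtyTimestamp;
        }
    }

    // Returns the instance info the registry would keep: the existing one if
    // its dirty timestamp is greater than the incoming one, else the incoming.
    static Instance chooseRegistrant(Instance existing, Instance incoming) {
        if (existing != null && existing.lastDirtyTimestamp > incoming.lastDirtyTimestamp) {
            return existing;
        }
        return incoming;
    }
}
```

With the values from the log above (existing 1570436488118, incoming 1570436427966), the existing instance info is kept, which is the "Using the existing instanceInfo" case.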

PROBLEM3:

2019-10-07 11:18:11.022 <MAS2ADS1:TaskBatchingWorker-target_usilca32.lvn.broadcom.net-0:17106370> MASSERV ERROR (c.n.e.c.ReplicationTaskProcessor) It seems to be a socket read timeout exception, it will retry later. if it continues to happen and some eureka node occupied all the cpu time, you should set property 'eureka.server.peer-node-read-timeout-ms' to a bigger value
com.sun.jersey.api.client.ClientHandlerException: java.net.SocketTimeoutException: Read timed out
at com.sun.jersey.client.apache4.ApacheHttpClient4Handler.handle(ApacheHttpClient4Handler.java:187)
at com.netflix.eureka.cluster.DynamicGZIPContentEncodingFilter.handle(DynamicGZIPContentEncodingFilter.java:48)
at com.netflix.discovery.EurekaIdentityHeaderFilter.handle(EurekaIdentityHeaderFilter.java:27)
at com.sun.jersey.api.client.Client.handle(Client.java:652)
at com.sun.jersey.api.client.WebResource.handle(WebResource.java:682)
at com.sun.jersey.api.client.WebResource.access$200(WebResource.java:74)
at com.sun.jersey.api.client.WebResource$Builder.post(WebResource.java:570)
at com.netflix.eureka.transport.JerseyReplicationClient.submitBatchUpdates(JerseyReplicationClient.java:116)
at com.netflix.eureka.cluster.ReplicationTaskProcessor.process(ReplicationTaskProcessor.java:80)
at com.netflix.eureka.util.batcher.TaskExecutors$BatchWorkerRunnable.run(TaskExecutors.java:187)
at java.lang.Thread.run(Thread.java:812)

The default value of eureka.server.peer-node-read-timeout-ms is 200 ms.

PROBLEM4:

2019-10-07 11:19:15.315 <MAS2ADS1:TaskBatchingWorker-target_usilca32.lvn.broadcom.net-17:17106370> MASSERV WARN (c.n.e.c.ReplicationTask) The replication of task APICATALOG/usilca31.lvn.broadcom.net:apicatalog:10012:Heartbeat@usilca32.lvn.broadcom.net failed with response code 409
2019-10-07 11:19:15.316 <MAS2ADS1:TaskBatchingWorker-target_usilca32.lvn.broadcom.net-17:17106370> MASSERV WARN (c.n.e.c.PeerEurekaNode) Peer wants us to take the instance information from it, since the timestamp differs,Id : usilca31.lvn.broadcom.net:apicatalog:10012 My Timestamp : 1570447124929, Peer's timestamp: 1570447155048

GitHub issue: #360

PROBLEM5:

2019-10-07 11:18:04.759 <MAS2ADS2:main:396066> MASSERV WARN (o.s.j.s.JmxUtils) Found more than one MBeanServer instance. Returning first from list.

A possible answer, old but perhaps still relevant: https://jira.sakaiproject.org/browse/SAK-38490?focusedCommentId=113671&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-113671
That looks right: the '-Dcom.sun.management.jmxremote' parameter is not in the VM options on the mainframe, but it is present when the services are launched from IDEA's Dashboard.
UPD: I tried adding this parameter on the mainframe; it did not help.
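
One way to investigate further could be to list the MBeanServers that exist in the JVM. The sketch below uses the standard `javax.management.MBeanServerFactory.findMBeanServer` call; that Spring's JmxUtils performs its lookup the same way is my assumption, and `MBeanServerSketch` is a made-up helper name.

```java
import java.lang.management.ManagementFactory;
import java.util.List;
import javax.management.MBeanServer;
import javax.management.MBeanServerFactory;

// Diagnostic sketch: shows which MBeanServers are registered in this JVM.
class MBeanServerSketch {

    // All MBeanServers registered in this JVM; null means "any agent id".
    static List<MBeanServer> registeredServers() {
        return MBeanServerFactory.findMBeanServer(null);
    }

    public static void main(String[] args) {
        // Asking for the platform server creates it if it does not exist yet.
        MBeanServer platform = ManagementFactory.getPlatformMBeanServer();
        for (MBeanServer server : registeredServers()) {
            String marker = (server == platform) ? " (platform)" : "";
            System.out.println(server.getDefaultDomain() + marker);
        }
    }
}
```

If more than one server is printed when this runs inside the service, the additional entries may hint at which component creates the extra MBeanServer.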

ilkinabdullayev changed the title from "(WP)cleanup discovery dependency logs" to "(WIP)cleanup discovery dependency logs" on Oct 7, 2019
ilkinabdullayev force-pushed the us/cleanup_discovery_dependency_logs branch from 191ecd9 to 5334fde on Oct 7, 2019
codecov bot commented Oct 7, 2019

Codecov Report

No coverage uploaded for pull request base (master@4e308da).
The diff coverage is 84.61%.

@@            Coverage Diff            @@
##             master     #403   +/-   ##
=========================================
  Coverage          ?   71.01%           
  Complexity        ?       12           
=========================================
  Files             ?      244           
  Lines             ?     4433           
  Branches          ?      541           
=========================================
  Hits              ?     3148           
  Misses            ?     1154           
  Partials          ?      131
| Impacted Files | Coverage Δ | Complexity Δ |
|---|---|---|
| ...aas/gateway/error/MessageServiceConfiguration.java | 0% <ø> (ø) | 0 <0> (?) |
| ...com/ca/mfaas/product/logging/ApimlLogInjector.java | 89.47% <83.33%> (ø) | 0 <0> (?) |
| ...mfaas/product/logging/ApimlDependencyLogHider.java | 85.71% <85.71%> (ø) | 0 <0> (?) |

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 4e308da...372ec82.

vsev0lod commented Oct 14, 2019

Just a reminder about:
ApiMessage is a raw type. References to generic type ApiMessage should be parameterized

ilkinabdullayev force-pushed the us/cleanup_discovery_dependency_logs branch from 4f60cb5 to 6057fe6 on Oct 15, 2019
ilkinabdullayev changed the title from "(WIP)cleanup discovery dependency logs" to "cleanup discovery dependency logs" on Oct 15, 2019
ilkinabdullayev and others added 15 commits Oct 7, 2019
Signed-off-by: janda06 <david.janda@broadcom.com>
Signed-off-by: Elena Kubantseva <elena.kubantseva@broadcom.com>
…to even more
jandadav left a comment

Great work, thanks!

ilkinabdullayev merged commit f0f3c00 into master on Oct 21, 2019
5 checks passed
DCO: DCO
WIP: Ready for review
codecov/patch: 84.61% of diff hit (target 80%)
codecov/project: 71.01% (target 70%)
continuous-integration/jenkins/pr-merge: This commit looks good
@delete-merged-branch delete-merged-branch bot deleted the us/cleanup_discovery_dependency_logs branch Oct 21, 2019
taban03 added a commit that referenced this pull request Nov 21, 2019
cleanup discovery dependency logs