[worker-checker] ERROR:pika.adapters.base_connection:Socket Error: 104 #65

Open
graphicore opened this Issue Apr 19, 2018 · 0 comments

graphicore commented Apr 19, 2018

The checkers did not start checking a drag-and-drop job, and this seems to be the relevant error.

 {
 insertId: "i1e6rbg2t68h7d"  
 labels: {
  compute.googleapis.com/resource_name: "fluentd-gcp-v2.0.9-dk6pn"   
  container.googleapis.com/namespace_name: "default"   
  container.googleapis.com/pod_name: "fontbakery-worker-checker-3952278727-3z9j1"   
  container.googleapis.com/stream: "stderr"   
 }
 logName: "projects/fontbakery-168509/logs/fontbakery-worker-checker"  
 receiveTimestamp: "2018-04-19T11:42:44.875777889Z"  
 resource: {
  labels: {
   cluster_name: "fontbakery-dashboard-1"    
   container_name: "fontbakery-worker-checker"    
   instance_id: "357943439958451433"    
   namespace_id: "default"    
   pod_id: "fontbakery-worker-checker-3952278727-3z9j1"    
   project_id: "fontbakery-168509"    
   zone: "us-central1-a"    
  }
  type: "container"   
 }
 severity: "ERROR"  
 textPayload: "ERROR:pika.adapters.base_connection:Socket Error: 104
"  
 timestamp: "2018-04-19T11:42:42Z"  
}

AND:

 {
 insertId: "i1e6rbg2t68h8e"  
 labels: {
  compute.googleapis.com/resource_name: "fluentd-gcp-v2.0.9-dk6pn"   
  container.googleapis.com/namespace_name: "default"   
  container.googleapis.com/pod_name: "fontbakery-worker-checker-3952278727-3z9j1"   
  container.googleapis.com/stream: "stderr"   
 }
 logName: "projects/fontbakery-168509/logs/fontbakery-worker-checker"  
 receiveTimestamp: "2018-04-19T11:42:44.875777889Z"  
 resource: {
  labels: {
   cluster_name: "fontbakery-dashboard-1"    
   container_name: "fontbakery-worker-checker"    
   instance_id: "357943439958451433"    
   namespace_id: "default"    
   pod_id: "fontbakery-worker-checker-3952278727-3z9j1"    
   project_id: "fontbakery-168509"    
   zone: "us-central1-a"    
  }
  type: "container"   
 }
 severity: "ERROR"  
 textPayload: "ERROR:pika.adapters.blocking_connection:Connection close detected; result=BlockingConnection__OnClosedArgs(connection=<SelectConnection CLOSED socket=None params=<ConnectionParameters host=10.51.247.140 port=5672 virtual_host=/ ssl=False>>, reason_code=-1, reason_text="error(104, 'Connection reset by peer')")
"  
 timestamp: "2018-04-19T11:42:42Z"  
}

Restarting the pods (the fontbakery-worker-checker replica set) helped.

This should self-heal: had the workers crashed, Kubernetes would have restarted them.

Thus, the handling of pika errors should let the pod crash rather than try to recover internally.
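A minimal sketch of that "crash instead of recover" idea, not the actual worker code. The names `run_or_crash`, `consume` and `fatal_errors` are hypothetical. To keep it runnable without pika installed it only catches the built-in `ConnectionError`; in the real worker one would presumably also pass pika's connection exceptions (e.g. `pika.exceptions.AMQPConnectionError`).

```python
import logging
import sys


def run_or_crash(consume, fatal_errors=(ConnectionError,), exit_code=1):
    """Run consume(); on a connection-level error, log it and exit.

    Hypothetical helper: `consume` stands in for whatever blocking call
    the worker uses to process messages. Exiting with a non-zero code
    ends the container process, so Kubernetes can restart the pod
    according to its restart policy.
    """
    try:
        consume()
    except fatal_errors:
        # Log the traceback to stderr, then give up instead of
        # trying to reconnect inside the process.
        logging.exception("Connection lost; exiting so the pod is restarted")
        sys.exit(exit_code)
```

Usage would be something like `run_or_crash(start_worker)`, where `start_worker` is the (assumed) function that opens the connection and starts consuming. Whether a clean exit or an unhandled exception is preferable is a design choice; either way the process ends and the replica set takes over.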

Sadly, there's not much more information available in, e.g., https://console.cloud.google.com/logs/viewer?resource=container%2Fcluster_name%2Ffontbakery-dashboard-1%2Fnamespace_id%2Fdefault&logName=projects%2Ffontbakery-168509%2Flogs%2Ffontbakery-worker-checker&expandAll=false&timestamp=2018-04-19T11:43:12.000000000Z&project=fontbakery-168509&minLogLevel=0&interval=JUMP_TO_TIME&scrollTimestamp=2018-04-17T04:44:09.000000000Z&filters=text:ERROR:
