stacktrace in development environment #31

Closed
mkudlej opened this issue Dec 1, 2016 · 11 comments

mkudlej commented Dec 1, 2016

I've tried to install all components from sources using the Ansible playbooks from https://github.com/Tendrl/usmqe-setup.

I see this stacktrace when I call curl localhost:9393/1.0/Flows against the development server started with shotgun:

$ shotgun
== Shotgun/Puma on http://127.0.0.1:9393/
Puma starting in single mode...
* Version 3.6.0 (ruby 2.0.0-p598), codename: Sleepy Sunday Serenity
* Min threads: 0, max threads: 16
* Environment: development
* Listening on tcp://127.0.0.1:9393
Use Ctrl-C to stop
Etcd::KeyNotFound - Key not found:
        /usr/local/share/gems/gems/etcd-0.3.0/lib/etcd/client.rb:141:in `process_http_request'
        /usr/local/share/gems/gems/etcd-0.3.0/lib/etcd/client.rb:114:in `api_execute'
        /usr/local/share/gems/gems/etcd-0.3.0/lib/etcd/keys.rb:22:in `get'
        /opt/tendrl/tendrl-api/node.rb:6:in `block in <class:Node>'
        /usr/local/share/gems/gems/sinatra-1.4.5/lib/sinatra/base.rb:1603:in `call'
        /usr/local/share/gems/gems/sinatra-1.4.5/lib/sinatra/base.rb:1603:in `block in compile!'
        /usr/local/share/gems/gems/sinatra-1.4.5/lib/sinatra/base.rb:1006:in `[]'
        /usr/local/share/gems/gems/sinatra-1.4.5/lib/sinatra/base.rb:1006:in `block in process_route'
        /usr/local/share/gems/gems/sinatra-1.4.5/lib/sinatra/base.rb:1004:in `catch'
        /usr/local/share/gems/gems/sinatra-1.4.5/lib/sinatra/base.rb:1004:in `process_route'
        /usr/local/share/gems/gems/sinatra-1.4.5/lib/sinatra/base.rb:957:in `block in filter!'
        /usr/local/share/gems/gems/sinatra-1.4.5/lib/sinatra/base.rb:957:in `each'
        /usr/local/share/gems/gems/sinatra-1.4.5/lib/sinatra/base.rb:957:in `filter!'
        /usr/local/share/gems/gems/sinatra-1.4.5/lib/sinatra/base.rb:1075:in `block in dispatch!'
        /usr/local/share/gems/gems/sinatra-1.4.5/lib/sinatra/base.rb:1058:in `block in invoke'
        /usr/local/share/gems/gems/sinatra-1.4.5/lib/sinatra/base.rb:1058:in `catch'
        /usr/local/share/gems/gems/sinatra-1.4.5/lib/sinatra/base.rb:1058:in `invoke'
        /usr/local/share/gems/gems/sinatra-1.4.5/lib/sinatra/base.rb:1073:in `dispatch!'
        /usr/local/share/gems/gems/sinatra-1.4.5/lib/sinatra/base.rb:898:in `block in call!'
        /usr/local/share/gems/gems/sinatra-1.4.5/lib/sinatra/base.rb:1058:in `block in invoke'
        /usr/local/share/gems/gems/sinatra-1.4.5/lib/sinatra/base.rb:1058:in `catch'
        /usr/local/share/gems/gems/sinatra-1.4.5/lib/sinatra/base.rb:1058:in `invoke'
        /usr/local/share/gems/gems/sinatra-1.4.5/lib/sinatra/base.rb:898:in `call!'
        /usr/local/share/gems/gems/sinatra-1.4.5/lib/sinatra/base.rb:886:in `call'
        /usr/local/share/gems/gems/rack-protection-1.5.3/lib/rack/protection/xss_header.rb:18:in `call'
        /usr/local/share/gems/gems/rack-protection-1.5.3/lib/rack/protection/path_traversal.rb:16:in `call'
        /usr/local/share/gems/gems/rack-protection-1.5.3/lib/rack/protection/json_csrf.rb:18:in `call'
        /usr/local/share/gems/gems/rack-protection-1.5.3/lib/rack/protection/base.rb:49:in `call'
        /usr/local/share/gems/gems/rack-protection-1.5.3/lib/rack/protection/base.rb:49:in `call'
        /usr/local/share/gems/gems/rack-protection-1.5.3/lib/rack/protection/frame_options.rb:31:in `call'
        /usr/local/share/gems/gems/rack-1.6.4/lib/rack/nulllogger.rb:9:in `call'
        /usr/local/share/gems/gems/rack-1.6.4/lib/rack/head.rb:13:in `call'
        /usr/local/share/gems/gems/sinatra-1.4.5/lib/sinatra/base.rb:180:in `call'
        /usr/local/share/gems/gems/sinatra-1.4.5/lib/sinatra/base.rb:2014:in `call'
        /usr/local/share/gems/gems/rack-1.6.4/lib/rack/urlmap.rb:66:in `block in call'
        /usr/local/share/gems/gems/rack-1.6.4/lib/rack/urlmap.rb:50:in `each'
        /usr/local/share/gems/gems/rack-1.6.4/lib/rack/urlmap.rb:50:in `call'
        /usr/local/share/gems/gems/rack-1.6.4/lib/rack/lint.rb:49:in `_call'
        /usr/local/share/gems/gems/rack-1.6.4/lib/rack/lint.rb:37:in `call'
        /usr/local/share/gems/gems/rack-1.6.4/lib/rack/showexceptions.rb:24:in `call'
        /usr/local/share/gems/gems/rack-1.6.4/lib/rack/commonlogger.rb:33:in `call'
        /usr/local/share/gems/gems/sinatra-1.4.5/lib/sinatra/base.rb:217:in `call'
        /usr/local/share/gems/gems/shotgun-0.9.2/lib/shotgun/loader.rb:86:in `proceed_as_child'
        /usr/local/share/gems/gems/shotgun-0.9.2/lib/shotgun/loader.rb:31:in `call!'
        /usr/local/share/gems/gems/shotgun-0.9.2/lib/shotgun/loader.rb:18:in `call'
        /usr/local/share/gems/gems/shotgun-0.9.2/lib/shotgun/favicon.rb:12:in `call'
        /usr/local/share/gems/gems/shotgun-0.9.2/lib/shotgun/static.rb:14:in `call'
        /usr/local/share/gems/gems/rack-1.6.4/lib/rack/urlmap.rb:66:in `block in call'
        /usr/local/share/gems/gems/rack-1.6.4/lib/rack/urlmap.rb:50:in `each'
        /usr/local/share/gems/gems/rack-1.6.4/lib/rack/urlmap.rb:50:in `call'
        /usr/local/share/gems/gems/rack-1.6.4/lib/rack/builder.rb:153:in `call'
        /usr/local/share/gems/gems/puma-3.6.0/lib/puma/configuration.rb:225:in `call'
        /usr/local/share/gems/gems/puma-3.6.0/lib/puma/server.rb:578:in `handle_request'
        /usr/local/share/gems/gems/puma-3.6.0/lib/puma/server.rb:415:in `process_client'
        /usr/local/share/gems/gems/puma-3.6.0/lib/puma/server.rb:275:in `block in run'
        /usr/local/share/gems/gems/puma-3.6.0/lib/puma/thread_pool.rb:116:in `call'
        /usr/local/share/gems/gems/puma-3.6.0/lib/puma/thread_pool.rb:116:in `block in spawn_thread'
127.0.0.1 - - [01/Dec/2016:16:10:06 +0000] "GET /1.0/Flows HTTP/1.1" 500 30 0.0073
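
For context: the first non-framework frame above is /opt/tendrl/tendrl-api/node.rb:6, which calls get from the etcd gem; with etcd 0.3.0 that call raises Etcd::KeyNotFound whenever the requested key does not exist, and Sinatra turns the unhandled exception into the 500 logged on the last line. A minimal sketch of that mechanism (the key name below is made up, since the real key is not shown in the trace):

require 'etcd'

# Connect to a local etcd instance (development defaults assumed).
client = Etcd.client(host: '127.0.0.1', port: 2379)

begin
  # Hypothetical key, purely for illustration.
  puts client.get('/no/such/key').value
rescue Etcd::KeyNotFound => e
  # This is the exception at the top of the stacktrace above; left
  # unrescued, Sinatra reports it as a 500 Internal Server Error.
  warn "etcd key missing: #{e.message}"
end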

TimothyAsir commented Dec 6, 2016

It looks like an issue with the etcd configuration.
Could you please check the configuration in etcd.yml in the /etc/tendrl directory, under the production tag?
Once etcd is configured, please restart the httpd service and then the tendrl-apid service.
Is it resolved now, or are you still facing the issue?

Could you please share the following details:

  1. cat /etc/tendrl/etcd.yml
  2. cat config/etcd.yml from the place where you deploy
  3. service httpd status
  4. cat /etc/etcd/etcd.conf and the hostname of the machine where you are running etcd
  5. service etcd status from the machine where you are running etcd
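
A quick way to check that the values in /etc/tendrl/etcd.yml (item 1) actually point at a reachable etcd is to feed them to the same etcd gem the API uses. This is only a rough sketch and assumes the :production section layout shown later in this thread:

require 'etcd'
require 'yaml'

# Read the same config the API reads; the :production section layout
# matches the etcd.yml pasted further down in this thread.
conf   = YAML.load_file('/etc/tendrl/etcd.yml')[:production]
client = Etcd.client(host: conf[:host], port: conf[:port])

begin
  # Fetching the root key only proves etcd is reachable with these
  # settings; it says nothing about which Tendrl keys are populated.
  client.get('/')
  puts "etcd reachable at #{conf[:host]}:#{conf[:port]}"
rescue StandardError => e
  warn "cannot talk to etcd: #{e.class}: #{e.message}"
end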

@sankarshanmukhopadhyay

@mkudlej were you able to provide the details sought by @TimothyAsir?

@mbukatov

I see the same issue with tendrl-api installed from sources at current master (commit 74db59b). I noticed this while checking our setup playbooks (see usmqe/usmqe-setup#6 for details).

Reproducer

I install both etcd and tendrl-api on the same machine, as described in the tendrl_gluster.yml playbook:

---
- hosts: usm_server
  remote_user: root
  vars:
    install_from: source
  roles:
    - etcd
    - tendrl-repo
    - { role: epel, epel_enabled: 1 } # based on https://github.com/Tendrl/tendrl-api/issues/25
    - tendrl-api

Then, just after the installation (when the playbook listed above finishes), I can't get any answer from tendrl-api. Both the httpd and the puma server return the same response (so it's not related to the httpd configuration):

# curl http://localhost/api/1.0/Flows
<h1>Internal Server Error</h1>
# curl http://localhost:9292/1.0/Flows
<h1>Internal Server Error</h1>

Further details

All details below are taken from the server machine, where both etcd and tendrl-api are installed.

Restarting tendrl-apid doesn't solve the issue:

# systemctl restart tendrl-apid
# curl http://localhost:9292/1.0/Flows
<h1>Internal Server Error</h1>

Configuration of tendrl-api:

# cat /etc/tendrl/etcd.yml 
---
:development:
  :base_key: ''
  :host: '10.34.108.69'
  :port: 2379
  :user_name: 'username'
  :password: 'password'

:test:
  :base_key: ''
  :host: '10.34.108.69'
  :port: 2379
  :user_name: 'username'
  :password: 'password'

:production:
  :base_key: ''
  :host: '10.34.108.69'
  :port: 2379
  :user_name: 'username'
  :password: 'password'

The address listed there matches the global IPv4 address of the server machine:

# ip a s | grep global
    inet 10.34.108.69/24 brd 10.34.108.255 scope global dynamic eth0
    inet6 2620:52:0:226c:5054:ff:fe12:3824/64 scope global mngtmpaddr dynamic

Status of tendrl-apid service:

# systemctl status tendrl-apid
● tendrl-apid.service - Tendrl Api Daemon
   Loaded: loaded (/etc/systemd/system/tendrl-apid.service; enabled; vendor preset: disabled)
   Active: active (running) since Tue 2016-12-20 13:02:55 UTC; 49min ago
 Main PID: 22173 (ruby)
   CGroup: /system.slice/tendrl-apid.service
           └─22173 puma 3.6.0 (tcp://0.0.0.0:9292) [tendrl-api]

Dec 20 13:48:29 mbukatov-usm5-server.example.com puma[22173]: /usr/local/share/gems/gems/sinatra-1.4.5/lib/sinatra/base.rb:2014:in `call'
Dec 20 13:48:29 mbukatov-usm5-server.example.com puma[22173]: /usr/local/share/gems/gems/rack-1.6.4/lib/rack/urlmap.rb:66:in `block in call'
Dec 20 13:48:29 mbukatov-usm5-server.example.com puma[22173]: /usr/local/share/gems/gems/rack-1.6.4/lib/rack/urlmap.rb:50:in `each'
Dec 20 13:48:29 mbukatov-usm5-server.example.com puma[22173]: /usr/local/share/gems/gems/rack-1.6.4/lib/rack/urlmap.rb:50:in `call'
Dec 20 13:48:29 mbukatov-usm5-server.example.com puma[22173]: /usr/local/share/gems/gems/puma-3.6.0/lib/puma/configuration.rb:225:in `call'
Dec 20 13:48:29 mbukatov-usm5-server.example.com puma[22173]: /usr/local/share/gems/gems/puma-3.6.0/lib/puma/server.rb:578:in `handle_request'
Dec 20 13:48:29 mbukatov-usm5-server.example.com puma[22173]: /usr/local/share/gems/gems/puma-3.6.0/lib/puma/server.rb:415:in `process_client'
Dec 20 13:48:29 mbukatov-usm5-server.example.com puma[22173]: /usr/local/share/gems/gems/puma-3.6.0/lib/puma/server.rb:275:in `block in run'
Dec 20 13:48:29 mbukatov-usm5-server.example.com puma[22173]: /usr/local/share/gems/gems/puma-3.6.0/lib/puma/thread_pool.rb:116:in `call'
Dec 20 13:48:29 mbukatov-usm5-server.example.com puma[22173]: /usr/local/share/gems/gems/puma-3.6.0/lib/puma/thread_pool.rb:116:in `block in spawn_thread'

Then there is etcd, which seems to run fine:

# systemctl status etcd
● etcd.service - Etcd Server
   Loaded: loaded (/usr/lib/systemd/system/etcd.service; enabled; vendor preset: disabled)
   Active: active (running) since Tue 2016-12-20 12:54:46 UTC; 1h 9min ago
 Main PID: 9284 (etcd)
   CGroup: /system.slice/etcd.service
           └─9284 /usr/bin/etcd --name=default --data-dir=/var/lib/etcd/default.etcd --listen-client-urls=http://mbukatov-usm5-server.example.com:2379

Dec 20 12:54:46 mbukatov-usm5-server.example.com etcd[9284]: added local member ce2a822cea30bfca [http://localhost:2380 http://localhost:7001] to cluster 7e27652122e8b2ae
Dec 20 12:54:46 mbukatov-usm5-server.example.com systemd[1]: Started Etcd Server.
Dec 20 12:54:46 mbukatov-usm5-server.example.com etcd[9284]: ce2a822cea30bfca is starting a new election at term 1
Dec 20 12:54:46 mbukatov-usm5-server.example.com etcd[9284]: ce2a822cea30bfca became candidate at term 2
Dec 20 12:54:46 mbukatov-usm5-server.example.com etcd[9284]: ce2a822cea30bfca received vote from ce2a822cea30bfca at term 2
Dec 20 12:54:46 mbukatov-usm5-server.example.com etcd[9284]: ce2a822cea30bfca became leader at term 2
Dec 20 12:54:46 mbukatov-usm5-server.example.com etcd[9284]: raft.node: ce2a822cea30bfca elected leader ce2a822cea30bfca at term 2
Dec 20 12:54:46 mbukatov-usm5-server.example.com etcd[9284]: setting up the initial cluster version to 2.3
Dec 20 12:54:46 mbukatov-usm5-server.example.com etcd[9284]: set the initial cluster version to 2.3
Dec 20 12:54:46 mbukatov-usm5-server.example.com etcd[9284]: published {Name:default ClientURLs:[http://mbukatov-usm5-server.example.com:2379]} to cluster 7e27652122e8b2ae

And its configuration:

# cat /etc/etcd/etcd.conf 
# [member]
ETCD_NAME=default
ETCD_DATA_DIR="/var/lib/etcd/default.etcd"
#ETCD_WAL_DIR=""
#ETCD_SNAPSHOT_COUNT="10000"
#ETCD_HEARTBEAT_INTERVAL="100"
#ETCD_ELECTION_TIMEOUT="1000"
ETCD_LISTEN_PEER_URLS="http://mbukatov-usm5-server.example.com:2380"
ETCD_LISTEN_CLIENT_URLS="http://mbukatov-usm5-server.example.com:2379"
#ETCD_MAX_SNAPSHOTS="5"
#ETCD_MAX_WALS="5"
#ETCD_CORS=""
#
#[cluster]
#ETCD_INITIAL_ADVERTISE_PEER_URLS="http://localhost:2380"
# if you use different ETCD_NAME (e.g. test), set ETCD_INITIAL_CLUSTER value for this name, i.e. "test=http://..."
#ETCD_INITIAL_CLUSTER="default=http://localhost:2380"
#ETCD_INITIAL_CLUSTER_STATE="new"
#ETCD_INITIAL_CLUSTER_TOKEN="etcd-cluster"
ETCD_ADVERTISE_CLIENT_URLS="http://mbukatov-usm5-server.example.com:2379"
#ETCD_DISCOVERY=""
#ETCD_DISCOVERY_SRV=""
#ETCD_DISCOVERY_FALLBACK="proxy"
#ETCD_DISCOVERY_PROXY=""
#ETCD_STRICT_RECONFIG_CHECK="false"
#
#[proxy]
#ETCD_PROXY="off"
#ETCD_PROXY_FAILURE_WAIT="5000"
#ETCD_PROXY_REFRESH_INTERVAL="30000"
#ETCD_PROXY_DIAL_TIMEOUT="1000"
#ETCD_PROXY_WRITE_TIMEOUT="5000"
#ETCD_PROXY_READ_TIMEOUT="0"
#
#[security]
#ETCD_CERT_FILE=""
#ETCD_KEY_FILE=""
#ETCD_CLIENT_CERT_AUTH="false"
#ETCD_TRUSTED_CA_FILE=""
#ETCD_PEER_CERT_FILE=""
#ETCD_PEER_KEY_FILE=""
#ETCD_PEER_CLIENT_CERT_AUTH="false"
#ETCD_PEER_TRUSTED_CA_FILE=""
#
#[logging]
#ETCD_DEBUG="false"
# examples for -log-package-levels etcdserver=WARNING,security=DEBUG
#ETCD_LOG_PACKAGE_LEVELS=""
#
#[profiling]
#ETCD_ENABLE_PPROF="false

Note that the etcd configuration for Tendrl is not yet properly documented, so the setup we do right now is based on the Tendrl demo sessions; see the tendrl role for details.

Also note that when I use the same etcd configuration with the tendrl-api package, the API works fine.

@mbukatov

@TimothyAsir @sankarshanmukhopadhyay I'm providing the details on behalf of @mkudlej in the previous comment.

@shtripat

@anivargi I remember that even if the tendrl-api and etcd services are running, curl http://localhost/api/1.0/Flows keeps returning 500 until node-agents are started on the storage nodes and some nodes are known to the API layer. Can you confirm/comment?
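
For anyone trying to verify this condition: the Tendrl key layout in etcd isn't documented in this thread, but one rough way to see whether any component (node-agents included) has written anything into etcd at all is to list the top-level keys over etcd's v2 HTTP API. A minimal sketch, using the host and port from the etcd.yml pasted earlier:

require 'net/http'
require 'json'
require 'uri'

# Host and port taken from the /etc/tendrl/etcd.yml shown earlier in this
# thread; adjust to your own setup.
uri  = URI('http://10.34.108.69:2379/v2/keys/?recursive=false')
body = JSON.parse(Net::HTTP.get(uri))

# An empty top level means nothing (node-agent data included) has been
# written into etcd yet.
node = body['node'] || {}
keys = (node['nodes'] || []).map { |n| n['key'] }
puts keys.empty? ? 'etcd key space is empty' : keys.inspect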


mbukatov commented Jan 3, 2017

Any updates on this?

@nthomas-redhat

@mkudlej, can you respond to #31 (comment) please? There is an issue with tendrl-apid: if no nodes are present in etcd, it always throws 500. Could you please confirm this?

@mbukatov

@nthomas-redhat Do I read it right that you are asking us to reproduce this again and check whether there are any nodes in etcd? How could I check that there are no nodes in etcd? I'm not familiar with such implementation details yet.

That said, there are multiple problems we need to fix. When some etcd key/value is missing, tendrl-api should:

  • not crash, but report the issue (see details below) and shut itself down if there is no way to continue (I don't know if this is the case)
  • report exactly where the problem is: which key is missing in etcd, along with some hint (e.g. which component is responsible for it), as sketched below
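
A rough sketch of what such handling could look like in a Sinatra route follows; the route path matches the Flows call above, but the etcd key and the wording of the error message are made up for illustration:

require 'sinatra'
require 'etcd'

client = Etcd.client(host: '127.0.0.1', port: 2379)

get '/1.0/Flows' do
  begin
    # Hypothetical key; the real key queried by the API is not shown in
    # this issue.
    client.get('/flows').value
  rescue Etcd::KeyNotFound => e
    # Report exactly which key is missing and hint at the likely cause,
    # instead of letting the exception surface as a bare 500 stacktrace.
    halt 503, "etcd key missing (#{e.message}); " \
              'has tendrl-node-agent registered any nodes yet?'
  end
end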


mbukatov commented Jan 10, 2017

When I use tendrl_gluster.yml from usmqe-setup to install from current Tendrl sources, I can see that the Flows call works:

# curl http://localhost/api/1.0/Flows; echo
[{"name":"GlusterImportCluster","method":"POST","attributes":[{"name":"Node[]","type":"List","required":true},{"name":"Tendrl_context.sds_name","help":"Name of the Tendrl managed sds, eg: 'gluster'","type":"String","default":null,"required":true},{"name":"Tendrl_context.sds_version","help":"Version of the Tendrl managed sds, eg: '3.2.1'","type":"String","default":null,"required":true}]},{"name":"CephImportCluster","method":"POST","attributes":[{"name":"Node[]","type":"List","required":true},{"name":"Tendrl_context.sds_name","help":"Name of the Tendrl managed sds, eg: 'gluster'","type":"String","default":null,"required":true},{"name":"Tendrl_context.sds_version","help":"Version of the Tendrl managed sds, eg: '3.2.1'","type":"String","default":null,"required":true}]}]

I checked that the tendrl-node-agent service is up and running on all storage nodes.

To verify whether I can reproduce the issue when tendrl-node-agent has no chance to talk to etcd (as asked in #31 (comment)), I tried the setup again without installing and setting up the node-agents (dropping the tendrl-node-agent and tendrl-gluster-integration roles from the playbook). Then I see this:

# curl http://localhost/api/1.0/Flows; echo
<h1>Internal Server Error</h1>

with the same traceback in the logs:

Jan 10 17:21:33 mbukatov-usm5-server.usmqe.lab.eng.brq.redhat.com puma[22231]: Etcd::KeyNotFound - Key not found:
Jan 10 17:21:33 mbukatov-usm5-server.usmqe.lab.eng.brq.redhat.com puma[22231]: /usr/local/share/gems/gems/etcd-0.3.0/lib/etcd/client.rb:141:in `process_http_r
Jan 10 17:21:33 mbukatov-usm5-server.usmqe.lab.eng.brq.redhat.com puma[22231]: /usr/local/share/gems/gems/etcd-0.3.0/lib/etcd/client.rb:114:in `api_execute'
Jan 10 17:21:33 mbukatov-usm5-server.usmqe.lab.eng.brq.redhat.com puma[22231]: /usr/local/share/gems/gems/etcd-0.3.0/lib/etcd/keys.rb:22:in `get'
Jan 10 17:21:33 mbukatov-usm5-server.usmqe.lab.eng.brq.redhat.com puma[22231]: /opt/tendrl/tendrl-api/node.rb:6:in `block in <class:Node>'

So the answer to @nthomas-redhat's question is yes: the issue seems to be caused by the node-agent not being up and running (there must have been some other, now-fixed, issue that caused the node-agent to fail previously).

With this in mind, I think we can agree that:

  • this is no longer blocking the QE team
  • it's still a high-priority issue that needs to be addressed

@japplewhite

This seems to conflict with the "green field" use case where I want to install Ceph or Gluster, since it depends on a node-agent already running for an as-yet non-existent service.

@r0h4n closed this as completed Jul 21, 2017
@mbukatov

@mkudlej do we need to reevaluate this and check whether the problem here is still relevant after the rescope?
