Apollo bootstrap.sh Error #5344

Closed
Triangle001 opened this issue Aug 14, 2018 · 22 comments
Assignees
Labels
Module: Build Indicates build related issues Type: Help wanted Indicates that a maintainer wants help on an issue/pull request from the rest of the community

Comments

@Triangle001

After building Apollo successfully, I ran "bash scripts/bootstrap.sh" to launch Apollo, but the connection was refused. The output is:

============================
[ OK ] Build passed!
[INFO] Took 2415 seconds

apollo@in_dev_docker:/apollo$ bash scripts/bootstrap.sh
Started supervisord with dev conf
Start roscore...
voice_detector: started
unix:///tmp/supervisor.sock refused connection

What should I do? Thanks.

@DongAi
Contributor

DongAi commented Aug 16, 2018

While testing Apollo, I ran into a similar issue and found a way that might help solve your problem.
First case:

  1. Run supervisorctl and make sure you can enter the supervisorctl interactive shell. If an error like "unix:///tmp/supervisor.sock no such file" occurs, the supervisord service has not been started.
  2. Run /usr/bin/python /usr/local/bin/supervisord to start it, then try step 1 again. You can run supervisorctl version to check the version; Apollo uses 3.3.3 or 3.3.4.

Second case:

  1. Check whether the file /etc/supervisord.conf exists. If it does not, that is the root of the issue; go to step 2.
  2. There should be an executable named echo_supervisord_conf in the /usr/bin/ directory. Run:
    echo_supervisord_conf > /etc/supervisord.conf to create the missing file.
  3. Go to step 2 of the first case to start the supervisord service.

Other tips:

  1. Use ps aux | grep supervisord to make sure the supervisord service is running.
  2. If supervisor version 3.0 is installed on your machine, you might need to remove it and install version 3.3.3 or 3.3.4. This is not strictly necessary, though; the goal is simply to get it working.
  3. To install a newer version, see the supervisor website.
  4. Check the [program:dreamview] section of apollo/modules/tools/supervisord/release.conf to see why supervisorctl can start a service named dreamview.
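
The checks described in the two cases above can be sketched as a couple of small shell helpers. This is only an illustration: the function names `conf_status` and `supervisord_running` are made up for this sketch and are not part of Apollo or supervisor.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; the names are illustrative only.

# Report whether a supervisord configuration file exists at the given path.
conf_status() {
    if [ -f "$1" ]; then
        echo "present: $1"
    else
        echo "missing: $1"
    fi
}

# Succeed if a supervisord process shows up in the process list.
# The [s] bracket keeps grep from matching its own command line.
supervisord_running() {
    ps aux | grep '[s]upervisord' > /dev/null
}

# Example usage inside the container:
#   conf_status /etc/supervisord.conf
#   supervisord_running && echo "supervisord is running" || echo "supervisord is not running"
```

If `conf_status` reports the file as missing, the second case above (creating it with echo_supervisord_conf) would be the next thing to try.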

@natashadsouza natashadsouza added Type: Help wanted Indicates that a maintainer wants help on an issue/pull request from the rest of the community Module: Build Indicates build related issues labels Aug 17, 2018
@natashadsouza natashadsouza self-assigned this Aug 17, 2018
@Triangle001
Author

Thanks. The problem has been solved. I changed the config file "apollo/modules/tools/supervisord/dev.conf" like this.
In these lines:
;[inet_http_server] ; inet (TCP) server disabled by default
;port=127.0.0.1:9001 ; ip_address:port specifier, *:port for all iface
remove the leading ";" from each line:
[inet_http_server] ; inet (TCP) server disabled by default
port=127.0.0.1:9001 ; ip_address:port specifier, *:port for all iface

This enables supervisor's inet (TCP) HTTP server, so the client can communicate with the server over HTTP.
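
For anyone who prefers to script this edit, here is a minimal sketch using sed. The function name `enable_inet_http_server` is hypothetical, and the sketch assumes the two lines appear in the config exactly as quoted above, with the ";" in the first column. Back up the file before trying it.

```shell
#!/usr/bin/env bash
# Hypothetical helper: uncomment the [inet_http_server] section header and
# its port=127.0.0.1:9001 line in a supervisord config file.
enable_inet_http_server() {
    local conf="$1"
    local tmp
    tmp="$(mktemp)" || return 1
    sed -e 's/^;\(\[inet_http_server\]\)/\1/' \
        -e 's/^;\(port=127\.0\.0\.1:9001\)/\1/' \
        "$conf" > "$tmp" || { rm -f "$tmp"; return 1; }
    # Write back through the original file so its permissions are kept.
    cat "$tmp" > "$conf"
    rm -f "$tmp"
}

# Example usage (path as given in the comment above):
#   enable_inet_http_server /apollo/modules/tools/supervisord/dev.conf
```

Other commented-out lines, such as ;username=user, are left untouched because the patterns only match the two specific lines.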

@natashadsouza
Contributor

Closing this issue as the problem seems to be solved. Feel free to reopen it if you have additional questions.

@IneverStop

I have the same problem. I tried @Triangle001's solution, but it still does not work.
Then I tried @DongAi's solution, which did not work either.
When I type supervisorctl, it shows:

unix:///tmp/supervisor.sock refused connection
supervisor>

It seems I can enter the supervisor command line, but something is still wrong.

Then I typed /usr/bin/python /usr/local/bin/supervisord, and it printed:

/usr/local/lib/python2.7/dist-packages/supervisor/options.py:298: UserWarning: Supervisord is running as root and it is searching for its configuration file in default locations (including its current working directory); you probably want to specify a "-c" argument specifying an absolute path to a configuration file for improved security.
'Supervisord is running as root and it is searching '
Unlinking stale socket /tmp/supervisor.sock

Besides, my /tmp/supervisor.sock is empty.

Then I typed supervisorctl again, and nothing changed.

I tried supervisorctl version, and it still says:

unix:///tmp/supervisor.sock refused connection

I would really appreciate it if someone could help me.
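
A small sketch for inspecting the socket path mentioned above. The function name `socket_status` is made up for illustration. As far as I understand, a socket file that still exists while no supervisord is listening on it is consistent with the "refused connection" message, but treat that as an assumption rather than a confirmed diagnosis for this case.

```shell
#!/usr/bin/env bash
# Hypothetical helper: classify what is found at a socket path.
socket_status() {
    if [ -S "$1" ]; then
        echo "socket"          # a Unix socket file exists at this path
    elif [ -e "$1" ]; then
        echo "not a socket"    # something else occupies the path
    else
        echo "missing"         # nothing there; supervisord has not created it
    fi
}

# Example usage:
#   socket_status /tmp/supervisor.sock
# The warning quoted above suggests passing the config explicitly, e.g.:
#   supervisord -c /etc/supervisord.conf
```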

@IneverStop

I have solved this problem. My Apollo version is r3.0.0, and more steps are needed to solve the problem there: /etc/supervisord.conf also has to be changed.

@natashadsouza
Contributor

That's great! If possible, please share the additional steps you mentioned to help other developers stuck on the same issue.
Thanks!

@whuzxy

whuzxy commented Aug 30, 2018

@IneverStop Hi, I ran into the same problem. Could you share your steps?

@IneverStop

IneverStop commented Sep 3, 2018

@whuzxy Sorry, I haven't checked GitHub for the past few days. Change both /apollo/modules/tools/supervisord/dev.conf and /etc/supervisord.conf as @Triangle001 describes, and the problem is easily solved.
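
To confirm that both files really have the section enabled after editing, something like the following sketch could be used. The helper names `inet_server_enabled` and `report_inet_server` are hypothetical, and the check only looks for an uncommented [inet_http_server] header.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if the file has an uncommented
# [inet_http_server] section header.
inet_server_enabled() {
    grep -q '^[[:space:]]*\[inet_http_server\]' "$1" 2>/dev/null
}

# Print one status line per config file passed as an argument.
report_inet_server() {
    local f
    for f in "$@"; do
        if inet_server_enabled "$f"; then
            echo "enabled: $f"
        else
            echo "not enabled: $f"
        fi
    done
}

# Example usage with the two files named above:
#   report_inet_server /apollo/modules/tools/supervisord/dev.conf /etc/supervisord.conf
```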

@whuzxy

whuzxy commented Sep 3, 2018

@IneverStop Thank you, I solved the issue just as you said.

@natashadsouza
Contributor

@IneverStop thank you very much for sharing the fix.

@CCodie

CCodie commented Oct 12, 2018

@IneverStop @whuzxy @natashadsouza Hi, I have exactly the same error. Can you help me?
I installed with --branch r3.0.0 and changed both /apollo/modules/tools/supervisord/dev.conf and /etc/supervisord.conf as shown below.

;[unix_http_server]
;file=/tmp/supervisor.sock   ; the path to the socket file
;chmod=0700                 ; socket file mode (default 0700)
;chown=nobody:nogroup       ; socket file uid:gid owner
;username=user              ; default is no username (open server)
;password=123               ; default is no password (open server)

[inet_http_server]         ; inet (TCP) server disabled by default
port=127.0.0.1:9001        ; ip_address:port specifier, *:port for all iface
;username=user              ; default is no username (open server)
;password=123               ; default is no password (open server)

...

[supervisorctl]
;serverurl=unix:///tmp/supervisor.sock ; use a unix:// URL  for a unix socket
serverurl=http://127.0.0.1:9001 ; use an http:// url to specify an inet socket
;username=chris              ; should be same as in [*_http_server] if set
;password=123                ; should be same as in [*_http_server] if set
;prompt=mysupervisor         ; cmd line prompt (default "supervisor")
;history_file=~/.sc_history  ; use readline history if available

But when I run bash scripts/bootstrap.sh, this error occurs:

@in_dev_docker:/apollo$ bash scripts/bootstrap.sh 
Started supervisord with dev conf
Start roscore...
voice_detector: started
dreamview: ERROR (spawn error)

I would really appreciate it if you could help. Thanks !

@DongAi
Contributor

DongAi commented Oct 12, 2018

Hi CCodie, could you please post dreamview's error output from the file data/log/dreamview.ERROR?

@CCodie

CCodie commented Oct 12, 2018

@DongAi Hi, thank you for the reply.
Actually, there is no dreamview.ERROR log file. There is only an ERROR file for monitor.
monitor.ERROR

Log file created at: 2018/10/12 13:09:47
Running on machine: in_dev_docker
Log line format: [IWEF]mmdd hh:mm:ss.uuuuuu threadid file:line] msg
E1012 13:09:47.067443 11173 can_checker_factory.cc:48] Failed to create CAN checker with parameter: brand: ESD_CAN
type: PCI_CARD
channel_id: CHANNEL_ID_ZERO
E1012 13:10:17.393507 11173 can_checker_factory.cc:48] Failed to create CAN checker with parameter: brand: ESD_CAN
type: PCI_CARD
channel_id: CHANNEL_ID_ZERO
E1012 13:10:47.630934 11173 can_checker_factory.cc:48] Failed to create CAN checker with parameter: brand: ESD_CAN
type: PCI_CARD
channel_id: CHANNEL_ID_ZERO
E1012 13:11:17.862692 11173 can_checker_factory.cc:48] Failed to create CAN checker with parameter: brand: ESD_CAN
type: PCI_CARD
channel_id: CHANNEL_ID_ZERO
E1012 13:11:48.021847 11173 can_checker_factory.cc:48] Failed to create CAN checker with parameter: brand: ESD_CAN
type: PCI_CARD
channel_id: CHANNEL_ID_ZERO
E1012 13:12:18.266692 11173 can_checker_factory.cc:48] Failed to create CAN checker with parameter: brand: ESD_CAN
type: PCI_CARD
channel_id: CHANNEL_ID_ZERO

But this log seems to be about the CAN card and unrelated to my error.
Is there another way to solve the problem? Thanks!

@CCodie

CCodie commented Oct 12, 2018

@DongAi, here is an update.
I typed supervisorctl after running bash scripts/bootstrap.sh with the modified files /apollo/modules/tools/supervisord/dev.conf and /etc/supervisord.conf.

@in_dev_docker:/apollo$ supervisorctl
canbus                           STOPPED   Not started
conti_radar                      STOPPED   Not started
control                          STOPPED   Not started
dreamview                        FATAL     Exited too quickly (process log may have details)
gps                              STOPPED   Not started
localization                     STOPPED   Not started
mobileye                         STOPPED   Not started
monitor                          RUNNING   pid 9248, uptime 0:36:09
navigation_control               STOPPED   Not started
navigation_localization          STOPPED   Not started
navigation_perception            STOPPED   Not started
navigation_planning              STOPPED   Not started
navigation_prediction            STOPPED   Not started
navigation_routing               STOPPED   Not started
navigation_server                STOPPED   Not started
open_api                         STOPPED   Not started
perception                       STOPPED   Not started
planning                         STOPPED   Not started
prediction                       STOPPED   Not started
routing                          STOPPED   Not started
third_party_perception           STOPPED   Not started
supervisor> 

Does this help with debugging my error?
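
With a long status listing like the one above, the programs that failed can be picked out with a short filter. This is a sketch only: `failed_programs` is a made-up name, and it assumes the two-column "name STATE ..." layout shown in the listing.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read `supervisorctl status` output on stdin and print
# the names of programs whose state is FATAL, BACKOFF, or EXITED.
failed_programs() {
    awk '$2 == "FATAL" || $2 == "BACKOFF" || $2 == "EXITED" { print $1 }'
}

# Example usage:
#   supervisorctl status | failed_programs
# For a failed program, its captured stderr may hold more detail:
#   supervisorctl tail dreamview stderr
```

Applied to the listing above, only dreamview is in one of those states.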

@DongAi
Contributor

DongAi commented Oct 12, 2018

Hi, CCodie,
in this section,

[unix_http_server]
;file=/tmp/supervisor.sock   ; the path to the socket file

I think this line shouldn't be commented out. But since you have already started supervisord, this may not be the cause.

If dreamview crashed, a core file should have been generated in the data/core/ directory, so we may need to debug it to find out what caused the crash.

Below are the commands to load the core file in gdb and debug it:
$ gdb bazel-bin/modules/dreamview/dreamview data/core/#yourcorefilename#
After gdb has loaded the core file, use the command:
(gdb) bt
to see the call stack. You can post the output here so that we can help analyze it.
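
The same backtrace can be collected non-interactively, which also avoids gdb's pager prompts when copying the output. The helper name `latest_core` is hypothetical, and the core_dreamview prefix is an assumption based on the core file name that appears later in this thread.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the most recently modified core file in a
# directory whose name starts with the given prefix.
#   $1: directory holding core files (e.g. data/core)
#   $2: file name prefix (e.g. core_dreamview)
latest_core() {
    ls -t "$1"/"$2".* 2>/dev/null | head -n 1
}

# Example usage from /apollo:
#   core="$(latest_core data/core core_dreamview)"
#   gdb -batch -ex bt bazel-bin/modules/dreamview/dreamview "$core"
```

Here -batch makes gdb exit after running its commands, and -ex bt runs the bt command once the core file is loaded.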

There is also another way to work around this supervisor error.
We can modify scripts/bootstrap.sh to decide whether or not to use supervisor to manage our processes.

Here is part of my bootstrap.sh:

function start() {
    DEBUG_MODE="yes"
    if [ "$HOSTNAME" == "in_release_docker" ]; then
        DEBUG_MODE="no"
    fi

    # Start roscore.
    bash scripts/roscore.sh start

    if [ "$DEBUG_MODE" == "yes" ]; then
        ./scripts/monitor.sh start
        ./scripts/dreamview.sh start
        supervisord -c /apollo/modules/tools/supervisord/dev.conf >& /tmp/supervisord.start.log
        echo "Started supervisord with dev conf"
    else

@whuzxy

whuzxy commented Oct 12, 2018

@CCodie
I don't know why, but this really worked for me. You can give it a try.
/apollo/modules/tools/supervisord/dev.conf

[unix_http_server]
file=/tmp/supervisor.sock   ; the path to the socket file
chmod=0700                 ; socket file mode (default 0700)
;chown=nobody:nogroup       ; socket file uid:gid owner
;username=user              ; default is no username (open server)
;password=123               ; default is no password (open server)

[inet_http_server]         ; inet (TCP) server disabled by default
port=127.0.0.1:9001        ; ip_address:port specifier, *:port for all iface
;username=user              ; default is no username (open server)
;password=123               ; default is no password (open server)

...

[supervisorctl]
;serverurl=unix:///tmp/supervisor.sock ; use a unix:// URL  for a unix socket
serverurl=http://127.0.0.1:9001 ; use an http:// url to specify an inet socket
;username=chris              ; should be same as in [*_http_server] if set
;password=123                ; should be same as in [*_http_server] if set
;prompt=mysupervisor         ; cmd line prompt (default "supervisor")
;history_file=~/.sc_history  ; use readline history if available

/etc/supervisord.conf

[unix_http_server]
file=/tmp/supervisor.sock   ; the path to the socket file
;chmod=0700                 ; socket file mode (default 0700)
;chown=nobody:nogroup       ; socket file uid:gid owner
;username=user              ; default is no username (open server)
;password=123               ; default is no password (open server)

[inet_http_server]         ; inet (TCP) server disabled by default
port=127.0.0.1:9001        ; ip_address:port specifier, *:port for all iface
;username=user              ; default is no username (open server)
;password=123               ; default is no password (open server)

...

[supervisorctl]
;serverurl=unix:///tmp/supervisor.sock ; use a unix:// URL  for a unix socket
serverurl=http://127.0.0.1:9001 ; use an http:// url to specify an inet socket
;username=chris              ; should be same as in [*_http_server] if set
;password=123                ; should be same as in [*_http_server] if set
;prompt=mysupervisor         ; cmd line prompt (default "supervisor")
;history_file=~/.sc_history  ; use readline history if available

Compared with your configuration, I only added

chmod=0700

in /apollo/modules/tools/supervisord/dev.conf, and it worked for me. Good luck.
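
Because these files are mostly comments, it can be hard to see which settings are actually active. A rough sketch for stripping the comments so two configs can be compared follows. The name `active_settings` is hypothetical, and the sketch assumes no setting value itself contains a ";".

```shell
#!/usr/bin/env bash
# Hypothetical helper: print only the active lines of a supervisord-style
# config by removing ";" comments and then dropping blank lines.
active_settings() {
    sed -e 's/[[:space:]]*;.*$//' -e '/^[[:space:]]*$/d' "$1"
}

# Example usage (bash process substitution):
#   diff <(active_settings /apollo/modules/tools/supervisord/dev.conf) \
#        <(active_settings /etc/supervisord.conf)
```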

@CCodie

CCodie commented Oct 15, 2018

@DongAi Hi, I still get the error when running dreamview. Could you please help me?

$ bash scripts/bootstrap.sh 
Started supervisord with dev conf
Start roscore...
voice_detector: started
dreamview: ERROR (spawn error)

I hope the output below helps with debugging this error...
$ gdb bazel-bin/modules/dreamview/dreamview data/core/core_dreamview.27435

GNU gdb (Ubuntu 7.7.1-0ubuntu5~14.04.3) 7.7.1
Copyright (C) 2014 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from bazel-bin/modules/dreamview/dreamview...done.
[New LWP 27435]
[New LWP 27436]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".

warning: the debug information found in "/home/caros/secure_upgrade/depend_lib/libyaml-cpp.so.0.5.1" does not match "/home/caros/secure_upgrade/depend_lib/libyaml-cpp.so.0.5" (CRC mismatch).

Core was generated by `/apollo/bazel-bin/modules/dreamview/dreamview --flagfile=/apollo/modules/dreamv'.
Program terminated with signal SIGILL, Illegal instruction.
#0  0x00007fe3adf28bec in double boost::math::detail::erf_inv_imp<double, boost::math::policies::policy<boost::math::policies::promote_float<false>, boost::math::policies::promote_double<false>, boost::math::policies::default_policy, boost::math::policies::default_policy, boost::math::policies::default_policy, boost::math::policies::default_policy, boost::math::policies::default_policy, boost::math::policies::default_policy, boost::math::policies::default_policy, boost::math:---Type <return> to continue, or q <return> to quit---bt
:policies::default_policy, boost::math::policies::default_policy, boost::math::policies::default_policy, boost::math::policies::default_policy> >(double const&, double const&, boost::math::policies::policy<boost::math::policies::promote_float<false>, boost::math::policies::promote_double<false>, boost::math::policies::default_policy, boost::math::policies::default_policy, boost::math::policies::default_policy, boost::math::policies::default_policy, boost::math::policies::default_policy, boost::math::policies::default_policy, boost::math::policies::default_policy, boost::math::policies::default_policy, boost::math::policies::default_policy, boost::math::policies::default_policy, boost::math::policies::default_policy> const&, mpl_::int_<64> const*) ()
   from /usr/local/lib/libpcl_sample_consensus.so.1.7
(gdb) bt
#0  0x00007fe3adf28bec in double boost::math::detail::erf_inv_imp<double, boost::math::policies::policy<boost::math::policies::promote_float<false>, boost::math::policies::promote_double<false>, boost::math::policies::default_policy, boost::math::policies::default_policy, boost::math::policies::default_policy, boost::math::policies::default_policy, boost::math::policies::default_policy, boost::math::policies::default_policy, boost::math::policies::default_policy, boost::math::policies::default_policy, boost::math::policies::default_policy, boost::math::policies::default_policy, boost::math::policies::default_policy> >(double const&, double const&, boost::math::policies::policy<boost::math::policies::promote_float<false>, boost::math::policies::promote_double<false>, boost::math::policies::default_policy, boost::math::policies::default_policy, boost::math::policies::default_policy, boost::math::policies::default_policy, boost::math::policies::default_policy, boost::math::policies::default_policy, boost::math::policies::default_policy, boost::math::policies::default_policy, boost::math::policies::default_policy, boost::math::policies::default_policy, boost::math::policies::default_policy> const&, mpl_::int_<64> const*) ()
   from /usr/local/lib/libpcl_sample_consensus.so.1.7
#1  0x00007fe3adeedf1e in _GLOBAL__sub_I_sac.cpp ()
   from /usr/local/lib/libpcl_sample_consensus.so.1.7
#2  0x00007fe3b6afd2da in call_init (l=<optimized out>, argc=argc@entry=2, 
    argv=argv@entry=0x7fff847f2d68, env=env@entry=0x7fff847f2d80)
    at dl-init.c:78
#3  0x00007fe3b6afd3c3 in call_init (env=<optimized out>, 

I don't actually know why this error occurs on my laptop, because I successfully ran Apollo 3.0 on my desktop. Anyway, thanks in advance!

P.S. Could you please post the full contents of the start function in bootstrap.sh? I would also like to try your second suggestion.

@DongAi
Contributor

DongAi commented Oct 15, 2018

Hi, CCodie,
Judging from your gdb output, I think this is an issue one of my colleagues has run into before. It may be caused by an incompatibility between the PCL library and your CPU, so you may need to recompile the PCL library.

You can find more information about similar issues in #3615 and #4135.

Please refer to the PCL documentation for information on how to build PCL.
Note that it is better to build the Release version of PCL, since building the Debug version may take 3-4 hours.

After building, replace the PCL libraries in your docker container with the newly built ones. The directory that contains the PCL libraries is /usr/local/lib.
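
Before rebuilding, it may help to see which instruction-set extensions the CPU reports, since SIGILL generally means the binary used an instruction the CPU does not support. This is only a sketch: `has_cpu_flag` is a made-up helper, and the flags in the usage example (avx, avx2, fma) are guesses, since the thread does not say which instruction was the problem.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if the CPU advertises the given flag.
#   $1: flag name as listed in /proc/cpuinfo (e.g. avx2)
#   $2: cpuinfo file to read (defaults to /proc/cpuinfo)
has_cpu_flag() {
    grep -m1 '^flags' "${2:-/proc/cpuinfo}" \
        | tr ' \t' '\n\n' \
        | grep -x "$1" > /dev/null
}

# Example usage on a Linux machine:
#   for flag in avx avx2 fma; do
#       if has_cpu_flag "$flag"; then echo "$flag: yes"; else echo "$flag: no"; fi
#   done
```

Comparing the result on a machine where Apollo runs with one where it crashes could hint at which extension is missing.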

@DongAi
Contributor

DongAi commented Oct 15, 2018

Hi, CCodie, another tip: the build process must run inside your docker container.

@CCodie

CCodie commented Oct 15, 2018

@DongAi
Hi, I really appreciate your reply.
I'm going to try rebuilding the PCL library, and after that I'll leave a comment here. Thanks!

@DongAi
Contributor

DongAi commented Oct 15, 2018

function start() {
    DEBUG_MODE="yes"
    if [ "$HOSTNAME" == "in_release_docker" ]; then
        DEBUG_MODE="no"
    fi

    # Start roscore.
    bash scripts/roscore.sh start

    if [ "$DEBUG_MODE" == "yes" ]; then
        ./scripts/monitor.sh start
        ./scripts/dreamview.sh start
        supervisord -c /apollo/modules/tools/supervisord/dev.conf >& /tmp/supervisord.start.log
        echo "Started supervisord with dev conf"
    else
        # Use supervisord.
        supervisord -c /apollo/modules/tools/supervisord/release.conf >& /tmp/supervisord.start.log
        echo "Started supervisord with release conf"

        # Start monitor.
        supervisorctl start monitor > /dev/null
        # Start dreamview.
        supervisorctl start dreamview
        supervisorctl status dreamview | grep RUNNING > /dev/null
    fi

    if [ $? -eq 0 ]; then
        echo "Dreamview is running at http://localhost:8888"
    fi
}
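
One detail worth noting in the function above: in the debug branch the last command before the final check is echo, so $? at that point reflects echo rather than dreamview.sh, and the "Dreamview is running" message is printed either way. A minimal sketch of capturing the status explicitly is shown below. `start_dreamview` is a hypothetical stand-in for ./scripts/dreamview.sh start, and whether that script returns a non-zero status on failure is an assumption.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for `./scripts/dreamview.sh start`; it simply returns
# the status stored in DREAMVIEW_RC (0 if unset) so the sketch can run alone.
start_dreamview() {
    return "${DREAMVIEW_RC:-0}"
}

start_debug() {
    local dreamview_rc
    start_dreamview
    dreamview_rc=$?    # capture the status before any other command runs
    echo "Started supervisord with dev conf"
    if [ "$dreamview_rc" -eq 0 ]; then
        echo "Dreamview is running at http://localhost:8888"
    else
        echo "Dreamview failed to start (exit status $dreamview_rc)"
    fi
    return "$dreamview_rc"
}
```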

@natashadsouza natashadsouza reopened this Oct 16, 2018
@CCodie

CCodie commented Oct 19, 2018

@DongAi I really appreciate your detailed instructions.
I found out that the error was caused by an incompatibility between my CPU and the PCL library.
My problem is now completely solved. Again, thank you :)

P.S. We can also close this issue.

This issue was closed.