Restful server started failed #4083

Closed
geng0021 opened this issue Aug 18, 2021 · 11 comments

@geng0021

Describe the issue:
The RESTful server failed to start when I tried to run the MNIST example.

Environment:

  • NNI version: 2.4

  • Training service (local|remote|pai|aml|etc): local

  • Client OS: Ubuntu

  • Python version: 3.8

  • Is conda/virtualenv/venv used?: ?

  • Is running in Docker?: no

Configuration:

  • Experiment config (remember to remove secrets!):

[Screenshot 2021-08-18 2:50:29 PM]

  • Search space:
    INFO: Starting restful server...
    ERROR: Restful server start failed!
    INFO: Stdout:

            Experiment start time 2021-08-18 14:46:26

    INFO: Stderr:

            Experiment start time 2021-08-18 14:46:26

    Failed to create log dir: RangeError: Invalid time value
        at Date.toISOString (<anonymous>)
        at Logger.log (xx/.local/lib/python3.8/site-packages/nni_node/common/log.js:54:72)
        at Logger.error (xx/.local/lib/python3.8/site-packages/nni_node/common/log.js:41:14)
        at xx/.local/lib/python3.8/site-packages/nni_node/main.js:109:33

Log message:

  • nnimanager.log:
  • dispatcher.log:
  • nnictl stdout and stderr:

Could anyone help me look into this issue? I have been stuck on it for a week.

@acured
Contributor

acured commented Aug 19, 2021

Hi @geng0021, thanks for your feedback. There is a very similar issue at #4077. Can we discuss this issue here?

@acured acured self-assigned this Aug 19, 2021
@DavideHe

I got the same error with NNI 2.4. I guessed it was a version problem, so I reinstalled NNI at version 2.3, and the problem was solved.

@Munyasya

I am also getting a similar error on Windows:

ERROR: Restful server start failed!
INFO: Stdout:

            Experiment start time 2021-08-19 16:30:06

INFO: Stderr:

            Experiment start time 2021-08-19 16:30:06

Failed to create log dir: RangeError: Invalid time value
    at Date.toISOString (<anonymous>)
    at Logger.log (c:\users\chmunyas\anaconda3\lib\site-packages\nni_node\common\log.js:54:72)
    at Logger.error (c:\users\chmunyas\anaconda3\lib\site-packages\nni_node\common\log.js:41:14)
    at c:\users\chmunyas\anaconda3\lib\site-packages\nni_node\main.js:109:33

Traceback (most recent call last):
  File "c:\users\chmunyas\anaconda3\lib\site-packages\psutil\_pswindows.py", line 679, in wrapper
    return fun(self, *args, **kwargs)
  File "c:\users\chmunyas\anaconda3\lib\site-packages\psutil\_pswindows.py", line 933, in create_time
    user, system, created = cext.proc_times(self.pid)
ProcessLookupError: [Errno 3] assume no such process (originated from GetExitCodeProcess != STILL_ACTIVE)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "c:\users\chmunyas\anaconda3\lib\site-packages\psutil\__init__.py", line 354, in _init
    self.create_time()
  File "c:\users\chmunyas\anaconda3\lib\site-packages\psutil\__init__.py", line 710, in create_time
    self._create_time = self._proc.create_time()
  File "c:\users\chmunyas\anaconda3\lib\site-packages\psutil\_pswindows.py", line 681, in wrapper
    raise convert_oserror(err, pid=self.pid, name=self._name)
psutil.NoSuchProcess: psutil.NoSuchProcess process no longer exists (pid=25516)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "c:\users\chmunyas\anaconda3\lib\site-packages\nni\tools\nnictl\launcher.py", line 432, in launch_experiment
    kill_command(rest_process.pid)
  File "c:\users\chmunyas\anaconda3\lib\site-packages\nni\tools\nnictl\command_utils.py", line 37, in kill_command
    process = psutil.Process(pid=pid)
  File "c:\users\chmunyas\anaconda3\lib\site-packages\psutil\__init__.py", line 326, in __init__
    self._init(pid)
  File "c:\users\chmunyas\anaconda3\lib\site-packages\psutil\__init__.py", line 367, in _init
    raise NoSuchProcess(pid, None, msg)
psutil.NoSuchProcess: psutil.NoSuchProcess no process found with pid 25516

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "c:\users\chmunyas\anaconda3\lib\site-packages\nni\tools\nnictl\launcher.py", line 525, in create_experiment
    launch_experiment(args, config_v2, 'new', experiment_id, 2)
  File "c:\users\chmunyas\anaconda3\lib\site-packages\nni\tools\nnictl\launcher.py", line 434, in launch_experiment
    raise Exception(ERROR_INFO % 'Rest server stopped!')
TypeError: not all arguments converted during string formatting

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "c:\users\chmunyas\anaconda3\lib\site-packages\psutil\_pswindows.py", line 679, in wrapper
    return fun(self, *args, **kwargs)
  File "c:\users\chmunyas\anaconda3\lib\site-packages\psutil\_pswindows.py", line 933, in create_time
    user, system, created = cext.proc_times(self.pid)
ProcessLookupError: [Errno 3] assume no such process (originated from GetExitCodeProcess != STILL_ACTIVE)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "c:\users\chmunyas\anaconda3\lib\site-packages\psutil\__init__.py", line 354, in _init
    self.create_time()
  File "c:\users\chmunyas\anaconda3\lib\site-packages\psutil\__init__.py", line 710, in create_time
    self._create_time = self._proc.create_time()
  File "c:\users\chmunyas\anaconda3\lib\site-packages\psutil\_pswindows.py", line 681, in wrapper
    raise convert_oserror(err, pid=self.pid, name=self._name)
psutil.NoSuchProcess: psutil.NoSuchProcess process no longer exists (pid=25516)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "c:\users\chmunyas\anaconda3\lib\runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "c:\users\chmunyas\anaconda3\lib\runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "C:\Users\chmunyas\Anaconda3\Scripts\nnictl.exe\__main__.py", line 7, in <module>
  File "c:\users\chmunyas\anaconda3\lib\site-packages\nni\tools\nnictl\nnictl.py", line 290, in parse_args
    args.func(args)
  File "c:\users\chmunyas\anaconda3\lib\site-packages\nni\tools\nnictl\launcher.py", line 529, in create_experiment
    kill_command(restServerPid)
  File "c:\users\chmunyas\anaconda3\lib\site-packages\nni\tools\nnictl\command_utils.py", line 37, in kill_command
    process = psutil.Process(pid=pid)
  File "c:\users\chmunyas\anaconda3\lib\site-packages\psutil\__init__.py", line 326, in __init__
    self._init(pid)
  File "c:\users\chmunyas\anaconda3\lib\site-packages\psutil\__init__.py", line 367, in _init
    raise NoSuchProcess(pid, None, msg)
psutil.NoSuchProcess: psutil.NoSuchProcess no process found with pid 25516

@acured
Contributor

acured commented Aug 20, 2021

Hi @Munyasya, could you help me run a test? I still cannot reproduce the problem.

Add these two lines to c:\users\chmunyas\anaconda3\lib\site-packages\nni_node\common\log.js at line 53:
console.log(new Date().toLocaleString());
console.log(new Date(new Date().toLocaleString() + ' UTC'));

Then launch an experiment.

Does anything appear under "Stdout" or "Stderr"?
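
For anyone who prefers not to edit the installed package, the same check can be run as a standalone Node.js script. This is only a sketch: `roundTripLocaleString` is a hypothetical helper, not part of NNI, and it assumes the failing expression is the `toLocaleString() + ' UTC'` round trip shown in the two lines above. It returns `null` when the string cannot be parsed, which is the condition under which `toISOString()` throws `RangeError: Invalid time value`.

```javascript
// Hypothetical helper, not part of NNI: append ' UTC' to a locale-formatted
// timestamp and re-parse it, as the two diagnostic lines above do.
function roundTripLocaleString(localeString) {
    const reparsed = new Date(localeString + ' UTC');
    // An unparsable string yields an Invalid Date whose time value is NaN;
    // calling toISOString() on it would throw a RangeError.
    if (Number.isNaN(reparsed.getTime())) {
        return null;
    }
    return reparsed.toISOString();
}

// Check the string produced by this machine's default locale.
const current = new Date().toLocaleString();
console.log(current, '->', roundTripLocaleString(current));
```

A `null` result on an affected machine would suggest that the system locale's date format is what the parser rejects.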

@Munyasya

Munyasya commented Aug 20, 2021 via email

@wanggz

wanggz commented Aug 24, 2021

how to do 555

@DavideHe

how to do 555

I found two things:

  1. Don't use NNI on Windows; it works easily on Linux.
  2. Version 2.0 runs well; newer versions may hit small bugs that have no answer yet.

@acured
Contributor

acured commented Aug 25, 2021

There is a fix for this issue in #4108.
You can try it by installing from source, or wait for the next release.

@fsmosca

fsmosca commented Sep 14, 2021

I encountered this issue on Windows 10, though not with the MNIST example; I was running a different experiment. NNI 2.4 does not work, but NNI 2.3 does.

@fsmosca

fsmosca commented Sep 25, 2021

I got 2.4 working so far using the following changes.

Change

const isoTime = new Date(new Date().toLocaleString() + ' UTC').toISOString();

to

const isoTime = new Date(new Date() + ' UTC').toISOString();

Just remove the toLocaleString().

The line can be found at:

...lib\site-packages\nni_node\common\log.js:54:72

This is on Windows 10 with Python 3.9.
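
The original line appears intended to write the local wall-clock time in ISO 8601 layout. Assuming that is the goal, the same result can be computed without parsing any locale-dependent string. The sketch below is an illustration only, not the change made in #4108 or v2.5; `localIsoTime` is a hypothetical name.

```javascript
// Hypothetical alternative, not the actual NNI patch: format the local
// wall-clock time in ISO 8601 layout without parsing a locale string.
function localIsoTime(now = new Date()) {
    // getTimezoneOffset() returns (UTC - local) in minutes, so subtracting
    // it shifts the timestamp so that its UTC fields equal the local fields.
    const offsetMs = now.getTimezoneOffset() * 60 * 1000;
    return new Date(now.getTime() - offsetMs).toISOString();
}

console.log(localIsoTime());
```

Because no string is parsed, the result does not depend on the system locale. The trailing `Z` in the output is then nominal, as the digits represent local time rather than UTC.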

@liuzhe-lz
Contributor

Fixed in v2.5.
