os.uptime returns the wrong value #36244

Closed
davedodo opened this issue Nov 24, 2020 · 23 comments
Labels
libuv Issues and PRs related to the libuv dependency or the uv binding. os Issues and PRs related to the os subsystem.

Comments

@davedodo

The os.uptime function returns wrong values on some machines. I am using Node.js 10 on Ubuntu 16.04 via the NodeSource repository. The system's uptime command reports 07:36:53 up 36 days, 11:51, but require("os").uptime() / 3600 / 24 yields about 77 days. The system is a virtual server; I think the provider's virtualization software is Virtuozzo.

I tried the same on a bare metal machine at home with Ubuntu 18.04, but the returned time is correct there. I get the feeling that the virtualization has something to do with it. Anyway, I believe this should be fixed.

@gireeshpunathil
Member

Node uses the sysinfo call to obtain the uptime:

int uv_uptime(double* uptime) {
  struct sysinfo info;

  if (sysinfo(&info) < 0)
    return UV__ERR(errno);

  *uptime = info.uptime;
  return 0;
}

The system uptime command appears to read from /proc/uptime:

openat(AT_FDCWD, "/proc/uptime", O_RDONLY) = 3

Not sure why these are different.

Can you compile and run the code below to see what it produces?

$ cat foo.cc

#include <stdio.h>
#include <sys/sysinfo.h>
#include <errno.h>

int main() {
  struct sysinfo info;
  if (sysinfo(&info) < 0) {
    fprintf(stderr, "error calling sysinfo: %d\n", errno);
    return -1;
  }
  fprintf(stderr, "system has been up for %ld seconds\n", info.uptime);
  return 0;
}

Compare it with the output of cat /proc/uptime too.
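For a like-for-like comparison in C, the /proc/uptime value can also be read programmatically. The following is a minimal sketch, assuming the usual two-number format of that file (uptime and idle time, both in seconds). The function names parse_uptime_line and read_proc_uptime are hypothetical helpers for illustration, not part of libuv or Node.js.

```c
#include <assert.h>
#include <stdio.h>

/* Parse one line in the /proc/uptime format: "<uptime> <idle>", in seconds.
 * Returns 0 on success, -1 if the line does not hold two numbers.
 * (Hypothetical helper, not a libuv function.) */
int parse_uptime_line(const char *line, double *uptime, double *idle) {
  assert(line != NULL && uptime != NULL && idle != NULL);
  return sscanf(line, "%lf %lf", uptime, idle) == 2 ? 0 : -1;
}

/* Read the uptime in seconds from /proc/uptime (Linux only).
 * Returns 0 on success, -1 if the file cannot be read or parsed. */
int read_proc_uptime(double *uptime) {
  char buf[128];
  double idle;
  FILE *fp;

  assert(uptime != NULL);
  fp = fopen("/proc/uptime", "r");
  if (fp == NULL)
    return -1;
  if (fgets(buf, sizeof buf, fp) == NULL) {
    fclose(fp);
    return -1;
  }
  fclose(fp);
  return parse_uptime_line(buf, uptime, &idle);
}
```

Calling read_proc_uptime(&up) and printing up next to info.uptime from the sysinfo program gives both readings from one process.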

@davedodo
Author

I don't have access to the affected server right now, but I will post the output as soon as I can.

Just out of curiosity: why are you looking at cygwin.c and not linux-core.c? There seems to be a uv_uptime function as well, which uses clock_gettime.

https://github.com/nodejs/node/blob/v10.x/deps/uv/src/unix/linux-core.c#L563-L578

I am by no means versed in Linux C programming, but according to the documentation, this function does not necessarily return the uptime, but the time since "some unspecified point in the past". It does say, however, that on Linux "that point corresponds to the number of seconds that the system has been running since it was booted," but I suspect that on this virtualization platform it returns something else. Maybe the time since the host was booted.

@gireeshpunathil
Member

Sorry, you are right! I looked at the wrong location. When you have access to the system, please test code based on the clock_gettime implementation rather than what I posted. (Please let me know if you need a working code snippet for it.)

@gireeshpunathil
Member

I also tested in a VirtualBox instance, but all the results matched, so I am not sure whether this is specific to certain virtualization methods or OS versions.

#include <time.h>
#include <errno.h>
#include <stdio.h>

int main() {
  struct timespec now;
  int r;

  r = clock_gettime(CLOCK_BOOTTIME, &now);
  if (r != 0) {
    fprintf(stderr, "clock_gettime failed with %d\n", errno);
    return errno;
  }
  fprintf(stderr, "system uptime: %ld\n", now.tv_sec);
  return 0;
}

$ ./a.out
system uptime: 96954

$ node -e 'console.log(require("os").uptime())'
96946

$ cat /proc/uptime
96961.56 764249.30

@davedodo
Author

$ ./a.out
system uptime: 6740810

$ node -e 'console.log(require("os").uptime())'
6740816

$ cat /proc/uptime
3187731.82 10172279.73

This confirms clock_gettime does not return the correct time.

If I run the sysinfo example, I get the correct time:

$ ./a.out
system has been up for 3187964 seconds

I wonder why the Linux version of Node.js doesn't also use sysinfo instead of clock_gettime.
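To put the raw second counts above into days: 6740816 seconds is 78 days and about 27 minutes, while 3187731 seconds is 36 days and about 21.5 hours, so the two sources differ by roughly 41 days. A minimal sketch of that arithmetic follows; the struct uptime_parts and the function split_uptime are hypothetical names used only for illustration.

```c
#include <assert.h>

/* Whole days, hours, minutes and seconds of an uptime reading. */
struct uptime_parts {
  long days;
  int hours;
  int minutes;
  int seconds;
};

/* Split a non-negative number of seconds into days/hours/minutes/seconds.
 * (Hypothetical helper for illustration.) */
struct uptime_parts split_uptime(long total_seconds) {
  struct uptime_parts p;

  assert(total_seconds >= 0);
  p.days = total_seconds / 86400;                 /* 86400 s per day */
  p.hours = (int)(total_seconds % 86400 / 3600);  /* remainder in hours */
  p.minutes = (int)(total_seconds % 3600 / 60);   /* remainder in minutes */
  p.seconds = (int)(total_seconds % 60);
  return p;
}
```

For example, split_uptime(3187731) yields 36 days, 21 hours, 28 minutes and 51 seconds, which is in line with what the uptime command reports on this machine.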

@gireeshpunathil
Member

@davedodo - I assume that you can confirm (outside of the software) that the actual boot time in this case is 36 days?

/cc @nodejs/libuv

@davedodo
Author

> @davedodo - I assume that you can confirm (outside of the software) that the actual boot time in this case is 36 days?

Yes, I can confirm that 36 days is the correct uptime.

@schamberg97
Contributor

schamberg97 commented Nov 29, 2020

I am not very familiar with the internals of the uptime utility or Linux's /proc/ filesystem, but I strongly suspect that this is a case of Virtuozzo and OpenVZ kernel quirks (or perhaps misconfiguration). Virtuozzo/OpenVZ VPS instances ALL share the same kernel as the host system. However, to the best of my knowledge, the /proc/ filesystem is still separate for each client. As I understand it, the GNU coreutils uptime command defaults to using the /proc/ filesystem, while @gireeshpunathil's code and the Node.js code do not, hence the possible difference in readings (the physical machine hosting the VPS can have been running for longer than the VPS itself).

@schamberg97
Contributor

schamberg97 commented Nov 29, 2020

Probably the fastest way to confirm this (besides going through obscure and arcane OpenVZ documentation) is to contact the admin of the physical server and ask for the time of the last reboot (if they are willing to share it).

@benjamingr
Member

I think it's probably a good idea to add a note to the documentation about this?

@gireeshpunathil
Member

@benjamingr - the issue with documenting is - documenting something like:

this API returns the time since reboot of the virtual machine instance or the underlying physical machine instance, depending on the type of virtualization used, depending on whether the virtualization re-uses the underlying OS or not

is going to largely limit the usability of the API - unless the programmer / user is able to externally verify such things and make calculated decisions in their program.

A more practical approach would be to:

  • figure out the actual root cause of the difference,
  • figure out a means to detect this difference, either through software or externally,
  • if it can be detected through software, try to fix it by accommodating the difference,
  • if it can only be detected externally, document the behavior on that basis.

does it look reasonable to you?
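As one possible way to detect the difference through software, a process could take both readings itself and flag a disagreement. This is a minimal sketch, assuming Linux with glibc; the function names and the tolerance parameter are hypothetical, and this is not how libuv handles it.

```c
#define _GNU_SOURCE /* for CLOCK_BOOTTIME under strict -std modes */
#include <assert.h>
#include <sys/sysinfo.h>
#include <time.h>

/* Returns 1 if two uptime readings (in seconds) differ by more than
 * `tolerance` seconds, 0 otherwise. (Hypothetical helper.) */
int uptimes_disagree(long a, long b, long tolerance) {
  long diff = a > b ? a - b : b - a;

  assert(tolerance >= 0);
  return diff > tolerance;
}

/* Compare clock_gettime(CLOCK_BOOTTIME) against sysinfo().uptime.
 * Returns 1 if they disagree, 0 if they agree, -1 if either call fails. */
int boottime_and_sysinfo_disagree(long tolerance) {
  struct timespec now;
  struct sysinfo info;

  if (clock_gettime(CLOCK_BOOTTIME, &now) != 0)
    return -1;
  if (sysinfo(&info) != 0)
    return -1;
  return uptimes_disagree((long)now.tv_sec, info.uptime, tolerance);
}
```

With the numbers posted earlier in this thread, uptimes_disagree(6740810, 3187964, 60) reports a mismatch on the affected server, while uptimes_disagree(96954, 96946, 60) does not on the unaffected one. The 60-second tolerance is an arbitrary choice that absorbs the delay between taking the two readings.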

@gireeshpunathil
Member

@schamberg97 - in my case, both Node.js and uptime return the same result. I can confirm my uptime of the virtual + physical machine as correctly returned.

@davedodo also physically verified and confirmed their physical uptime, and as different between physical and virtual instances.

Your points look promising, but where do we go from here, given the results? I too cannot find any documented behavior changes for /proc file system under virtualized environments!

@davedodo
Author

davedodo commented Dec 7, 2020

Just wanted to reiterate: I did verify that the sysinfo function returns the correct uptime under all tested circumstances. It is already used by the cygwin implementation. Why not use it for the Linux implementation as well, instead of clock_gettime?

int uv_uptime(double* uptime) {
  struct sysinfo info;

  if (sysinfo(&info) < 0)
    return UV__ERR(errno);

  *uptime = info.uptime;
  return 0;
}

@benjamingr
Member

@gireeshpunathil

> does it look reasonable to you?

Yes, though to be clear, I didn't mean "let's not fix it"; I only meant "let's document it in the meantime, until it is fixed".

@benjamingr
Member

@davedodo open a libuv issue?

Also cc @nodejs/libuv

@schamberg97
Contributor

schamberg97 commented Dec 7, 2020

> @schamberg97 - in my case, both Node.js and uptime return the same result. I can confirm my uptime of the virtual + physical machine as correctly returned.
>
> @davedodo also physically verified and confirmed their physical uptime, and as different between physical and virtual instances.
>
> Your points look promising, but where do we go from here, given the results? I too cannot find any documented behavior changes for /proc file system under virtualized environments!

@gireeshpunathil Even under OpenVZ and Virtuozzo? I have just checked, and the issue is real under OpenVZ. The guest instances use the same kernel as the host, but their /proc/ filesystem is not truly tied to the host kernel. There is a real mismatch between os.uptime() and the uptime command. I can even provide access to a virtualized instance I have just created on an OpenVZ provider's host.

@schamberg97
Contributor

@gireeshpunathil Sorry, I misunderstood your comment initially :D. sysinfo works correctly in all cases, but clock_gettime doesn't. We could actually check whether the instance is running under OpenVZ. AFAIK, the reliable way to do so is to check for /proc/vz/veinfo: if it exists, it is an OpenVZ instance; if it doesn't, we are safe.
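The check described above could look roughly like this in C. It is only a sketch: the function names file_exists and looks_like_openvz are hypothetical, and it assumes that the presence of /proc/vz/veinfo is a sufficient signal, which has not been verified beyond the systems mentioned in this thread.

```c
#include <assert.h>
#include <unistd.h>

/* Returns 1 if `path` exists (as seen by this process), 0 otherwise.
 * (Hypothetical helper.) */
int file_exists(const char *path) {
  assert(path != NULL);
  return access(path, F_OK) == 0;
}

/* Returns 1 if the process appears to run inside an OpenVZ container.
 * Assumption: /proc/vz/veinfo exists only on OpenVZ/Virtuozzo instances. */
int looks_like_openvz(void) {
  return file_exists("/proc/vz/veinfo");
}
```

An uptime implementation could then prefer sysinfo or /proc/uptime when looks_like_openvz() returns 1 and keep clock_gettime otherwise, although always using the source that is correct everywhere would avoid the detection step altogether.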

@gireeshpunathil
Member

@schamberg97 - thanks for the clarification.

# cat /proc/vz/veinfo
cat: /proc/vz/veinfo: No such file or directory

This is what I get on my system. So the crux of what you are saying is that we can use this mechanism to reliably detect a mismatch between the two calls, right?

@schamberg97
Contributor

schamberg97 commented Dec 7, 2020

> @schamberg97 - thanks for the clarification.
>
> #cat /proc/vz/veinfo
> cat: /proc/vz/veinfo: No such file or directory
>
> this is what I get in my system. So the crux of what you are saying is that we can use this mechanism to reliably check the presence of mismatch between the two calls, right?

@gireeshpunathil Exactly! :)

@schamberg97
Contributor

Inside an OpenVZ container, it prints the container's ID, IP, and number of running processes:

root@vps1607355603:~# cat /proc/vz/veinfo
     13892     0    51  185.53.129.131
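If the contents are ever needed programmatically, a line like the one above can be split with sscanf. This sketch assumes the field order seen in the sample output (container ID, an unidentified second field, process count, IP address); the struct veinfo and the function parse_veinfo_line are hypothetical names, and the meaning of the second field is not stated in this thread.

```c
#include <assert.h>
#include <stdio.h>

/* Fields of one /proc/vz/veinfo line, in the order they appear in the
 * sample output. The mapping is an assumption based on that sample. */
struct veinfo {
  long container_id;
  long unknown_field; /* second column; meaning not given in the thread */
  long process_count;
  char ip[64];
};

/* Parse one veinfo line. Returns 0 on success, -1 if the line does not
 * contain all four fields. (Hypothetical helper.) */
int parse_veinfo_line(const char *line, struct veinfo *out) {
  int n;

  assert(line != NULL && out != NULL);
  n = sscanf(line, "%ld %ld %ld %63s", &out->container_id,
             &out->unknown_field, &out->process_count, out->ip);
  return n == 4 ? 0 : -1;
}
```

Parsing the sample line gives container ID 13892, process count 51, and IP 185.53.129.131. A container with several addresses might print more columns, which this sketch ignores.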

@schamberg97
Contributor

@gireeshpunathil @benjamingr @davedodo Since there is no open issue on the libuv repo referencing this Node.js issue, shall I open one?

@schamberg97
Contributor

This should hopefully be resolved if/when libuv/libuv#3072 is merged

@targos added the libuv and os labels on Dec 27, 2020
@juanarbol
Member

Hi @schamberg97, I saw that libuv/libuv#3072 has landed, although it is not in Node yet (thanks for the help!). I'll close this issue, but feel free to reopen if needed :)

6 participants