Cleanup process tree #143

gaborcsardi · 2018-07-23T13:43:01Z

Every processx process p is marked with an environment variable that has a random part and a time stamp. Importantly, this variable is not set in the current process, because that would lead to it also being set in other, non-processx child processes, e.g. ones started by RStudio.

This env var is then inherited by all processes in the process tree rooted at p, even if some processes in the tree are orphaned. (In theory processes could opt out and unset the env var, but we don't anticipate this happening very often.)

kill_tree() uses this env var, and the (internal) ps package function ps_kill_tree to clean up the whole tree.
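
For illustration only, here is a minimal C sketch of how a marker name with a random part and a time stamp could be built. The helper name `make_tree_id` and the buffer handling are assumptions made for this sketch; the PR itself builds the id in R (see the `R/utils.R` hunk in the review), and `rand()` is used here unseeded purely to keep the example short.

```c
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

/* Hypothetical helper (not part of processx): write an id of the form
   "PS" + 10 random characters + "_" + Unix time stamp into `out`.
   Returns 0 on success, -1 if the buffer is too small. */
int make_tree_id(char *out, size_t size) {
  static const char alphabet[] = "ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789";
  char rnd[11];
  int i, n;

  /* rand() is not seeded here; a real implementation would want a
     better source of randomness. */
  for (i = 0; i < 10; i++) {
    rnd[i] = alphabet[rand() % (int) (sizeof alphabet - 1)];
  }
  rnd[10] = '\0';

  n = snprintf(out, size, "PS%s_%ld", rnd, (long) time(NULL));
  return (n > 0 && (size_t) n < size) ? 0 : -1;
}
```

The resulting string would then be placed only into the environment passed to the child, never into the environment of the calling process, for the reason given above.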

Closes #139.

EDIT: some more notes:

- Only Windows, macOS and Linux are supported, because these are the systems for which ps can read out environment variables from arbitrary processes. On unsupported platforms kill_tree() fails with "not_implemented".
- ps is currently optional; should we just import it? It has no hard dependencies.
- kill_tree() returns data about the killed processes; I'll add that to the docs in a minute.


codecov-io · 2018-07-23T14:02:12Z

Codecov Report

Merging #143 into master will increase coverage by 0.62%.
The diff coverage is 74.44%.

@@            Coverage Diff             @@
##           master     #143      +/-   ##
==========================================
+ Coverage   69.67%   70.29%   +0.62%     
==========================================
  Files          29       30       +1     
  Lines        2783     2956     +173     
==========================================
+ Hits         1939     2078     +139     
- Misses        844      878      +34

| Impacted Files | Coverage Δ | |
|---|---|---|
| src/init.c | `100% <ø> (ø)` | ⬆️ |
| src/tools/px.c | `59.45% <0%> (-2.23%)` | ⬇️ |
| R/on-load.R | `0% <0%> (ø)` | ⬆️ |
| src/unix/processx.c | `59.83% <40%> (-0.28%)` | ⬇️ |
| R/process.R | `70.86% <53.84%> (-2.18%)` | ⬇️ |
| src/create-time.c | `72.54% <72.54%> (ø)` | |
| R/initialize.R | `93.84% <80%> (-1.32%)` | ⬇️ |
| R/utils.R | `91.89% <88.88%> (-0.27%)` | ⬇️ |
| src/win/processx.c | `83.87% <95.12%> (+2.64%)` | ⬆️ |

... and 2 more

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update f7243ff...00c38dc.

jimhester · 2018-07-23T14:02:29Z

R/utils.R

+  paste0(
+    "PS",
+    paste(sample(c(LETTERS, 0:9), 10, replace = TRUE), collapse = ""),
+    "_", as.integer(.Internal(Sys.time()))


Do we really need to use .Internal(), or could we just unclass(Sys.time())?

Without .Internal(), on some platforms it is ~10-1000x slower. Considering that process$new() currently takes about 1-3 ms, and could be made faster still by avoiding assert_that(), etc., we might want the faster solution.

Wow, yeah structure() is really slow.

struct <- function(x, ...) {
  attributes(x) <- c(attributes(x), list(...))
  x
}
bench::mark(
  Sys.time(),
  .Internal(Sys.time()),
  .POSIXct(.Internal(Sys.time())),
  struct(.Internal(Sys.time()), class = c("POSIXct", "POSIXt")),
  check = FALSE, relative = TRUE)
#> # A tibble: 4 x 10
#>   expression    min  mean median   max `itr/sec` mem_alloc  n_gc n_itr
#>   <chr>       <dbl> <dbl>  <dbl> <dbl>     <dbl>     <dbl> <dbl> <dbl>
#> 1 Sys.time()   63.2  46.9   45.2  23.7      1          NaN   Inf  1
#> 2 .Internal…    1     1      1     1       46.9        NaN   NaN  1.00
#> 3 .POSIXct(…   60.4  38.9   40.4  12.3      1.21       NaN   Inf  1
#> 4 "struct(.…   17.5  13.5   11.0 507.       3.48       NaN   Inf  1.00
#> # ... with 1 more variable: total_time <dbl>

Sounds like what you have is the best option then!

jimhester · 2018-07-23T14:04:48Z

R/utils.R

+  )
+}
+
+format_unix_time <- function(z) {


Could this just be a call to .POSIXct(x, tz = "GMT")?

Isn't .POSIXct an internal function? And as.POSIXct() is very slow.

By convention I guess yes, but because it is a base function, it is effectively exported, like everything else in base. I don't believe you get a NOTE from using them...

Not just by convention:

> Internal objects in the base package most of which are only user-visible because of the special nature of the base namespace.

Yes that is what I mean, if they were in a different package they would not be exported; but because they are in base everything is visible, so they are only internal by convention.

jimhester · 2018-07-23T14:06:14Z

R/utils.R

+  structure(z, class = c("POSIXct", "POSIXt"), tzone = "GMT")
+}
+
+r_version <- function(x) {


Why not base::getRversion() for this?

Because I did not find that.... thanks!

jimhester · 2018-07-23T14:09:10Z

R/process.R

+      class = c("not_implemented", "error", "condition")))
+  }
+
+  get("ps_kill_tree", asNamespace("ps"))(private$tree_id)


Why do you need to call an unexported function? Couldn't you export it in ps if needed?

Because the kill-tree functions are somewhat dangerous if R is multi-threaded and other threads are starting processes; e.g. with_process_cleanup() reliably crashes RStudio. So I don't necessarily want people to call these. We could still export ps_kill_tree(), but I want to test it in the wild a bit before letting people use it in their packages.

jimhester · 2018-07-23T14:12:15Z

src/create-time.c

+  }
+
+  ll = ((LONGLONG) ftCreate.dwHighDateTime) << 32;
+  ll += ftCreate.dwLowDateTime - 116444736000000000LL;


It is probably worth linking to documentation on this magic number. Maybe https://docs.microsoft.com/en-us/windows/desktop/sysinfo/converting-a-time-t-value-to-a-file-time ?

Yeah, why not, it is just the difference between the two origins....
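
To make the origin of the constant concrete: a FILETIME counts 100 ns ticks since 1601-01-01, Unix time counts seconds since 1970-01-01, and the two origins are 11644473600 seconds apart, which is 116444736000000000 ticks. The sketch below is a portable restatement of the conversion, not the code from this PR; the function name `filetime_to_unix` is hypothetical, and it takes the two 32-bit halves as plain integers instead of a Windows `FILETIME` struct so that it compiles anywhere.

```c
#include <stdint.h>

/* 100 ns ticks between 1601-01-01 (FILETIME origin) and 1970-01-01
   (Unix origin): 11644473600 seconds * 10^7 ticks per second. */
#define EPOCH_DIFF_TICKS 116444736000000000LL

/* Hypothetical helper: combine the high and low 32-bit halves of a
   FILETIME value and return seconds since the Unix epoch. */
double filetime_to_unix(uint32_t high, uint32_t low) {
  int64_t ticks = (int64_t) (((uint64_t) high << 32) | (uint64_t) low);
  return (double) (ticks - EPOCH_DIFF_TICKS) / 1e7;
}
```

For example, the halves `0x019DB1DE` and `0xD53E8000` together encode exactly the Unix epoch, so the function returns 0 for them.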

jimhester · 2018-07-23T14:27:13Z

src/create-time.c

+
+  do {
+    if (rem_size == 0) {
+      *buffer = S_realloc(*buffer, buffer_size * 2, buffer_size, 1);


Instead of growing the buffer as you go, why not just allocate the buffer to the full file size up front, using either lseek to seek to the end or fstat? e.g. https://stackoverflow.com/a/13322673/2055486

You can't do that with files in /proc, their size is not known. Based on my quick test, you also cannot seek to the end in them. They essentially behave like character devices AFAICT.
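
A minimal sketch of the grow-as-you-go approach for inputs whose size cannot be queried up front. This is not `processx__read_file()` from the PR: the name `read_all_fd`, its signature, and the use of `malloc`/`realloc` instead of R's allocator are assumptions made for this example.

```c
#include <errno.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: read everything from `fd` into a heap buffer,
   doubling the buffer whenever it fills up. On success stores the
   buffer in *out (caller frees it; it is not NUL-terminated) and
   returns the number of bytes read. Returns -1 on error. */
ssize_t read_all_fd(int fd, char **out, size_t initial) {
  size_t cap = initial > 0 ? initial : 1;
  size_t len = 0;
  char *buf = malloc(cap);
  if (buf == NULL) return -1;

  for (;;) {
    ssize_t n;
    if (len == cap) {
      /* Buffer is full: double it before reading more. */
      char *tmp = realloc(buf, cap * 2);
      if (tmp == NULL) { free(buf); return -1; }
      buf = tmp;
      cap *= 2;
    }
    n = read(fd, buf + len, cap - len);
    if (n < 0) {
      if (errno == EINTR) continue;   /* interrupted, retry */
      free(buf);
      return -1;
    }
    if (n == 0) break;                /* end of input */
    len += (size_t) n;
  }

  *out = buf;
  return (ssize_t) len;
}
```

Because it only relies on `read()` returning 0 at the end of the input, the same loop works for pipes and other descriptors that cannot be seeked or sized.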

jimhester · 2018-07-23T14:33:24Z

src/create-time.c

+    return 0.0;
+  }
+
+  ret = processx__read_file(path, &buf, /* buffer= */ 2048);


If this proc file is always a single line, why can you not use a single call to getline() rather than writing processx__read_file()?

This one is one line, but /proc/stat is not, for example. It is just better to have a generic solution that always works. Although we only read two files here, this is from ps, which reads a bunch of different files in /proc.

jimhester · 2018-07-23T14:37:53Z

src/create-time.c

+  return starttime;
+}
+
+void *processx__memmem(const void *haystack, size_t n1,


Is there a reason you need memmem here and cannot use strstr? Aren't needle and haystack both strings in your case?

strstr is good for this file. Some /proc files have zero-separated records, and this is a generic solution from ps.
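
To illustrate the difference, here is a small length-based search in the spirit of memmem. It is a sketch, not `processx__memmem()` from the PR; the name `find_bytes` is hypothetical. The point is that `strstr` stops at the first zero byte, so it never sees records after the first separator, while a search that takes explicit lengths scans the whole buffer.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: find the first occurrence of `needle` (n2 bytes)
   in `haystack` (n1 bytes). Zero bytes are treated like any other byte.
   Returns a pointer to the match, or NULL if there is none. */
void *find_bytes(const void *haystack, size_t n1,
                 const void *needle, size_t n2) {
  const unsigned char *h = haystack;
  size_t i;

  if (n2 == 0) return (void *) haystack;
  if (n1 < n2) return NULL;

  for (i = 0; i + n2 <= n1; i++) {
    if (memcmp(h + i, needle, n2) == 0) return (void *) (h + i);
  }
  return NULL;
}
```

With a buffer such as `"PATH=/bin\0HOME=/root"`, which mimics the zero-separated layout of a file like `/proc/<pid>/environ` on Linux, `strstr(buf, "HOME=")` returns NULL, while `find_bytes()` called with the full buffer length finds the second record.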

gaborcsardi · 2018-07-23T15:53:25Z

Thanks!

gaborcsardi · 2018-07-24T12:22:20Z

@jimhester What do you think about importing ps in processx? Basically a processx process is a superset of a ps process: it is a child process, so you can do more with it. So it would make sense to define processx methods for the ps operations, I think.

ps only works on macOS, Linux and Windows currently, so on other platforms these methods will just throw "not_implemented".

OTOH, ps could also use processx. processx has an interrupt() method, which is cool; ps could define that for external processes, but on Windows that requires starting an external process. (Yeah, I know.) So it could use processx. Of course they cannot both depend on each other, so the only solutions are 1) merging them or 2) using system2() in ps. 2) is not that bad, because we don't need a background process.


jimhester · 2018-07-24T12:51:03Z

I guess technically they could both depend on each other as long as the dependency was a runtime rather than a build-time dependency.

gaborcsardi added 12 commits July 23, 2018 11:22

Tree ids on Unix

1c510a4

Tree ids on windows if env is given

bc41753

Tree ids on windows if env if omitted

e5fd35d

Test cases for tree ids

7feb7cf

Proper create_time for Linux and macOS

a5b4fec

Proper create_time on windows

de5258e

Fix a create_time() bug on Linux

4828ecb

Update kill_tree test cases for new ps api

56b5683

More accurate start time on windows

103b490

Linux create_time: close fd

a6afddf

ps is on CRAN now, remove Remotes

87a237b

Process tree cleanup

0f986c3

Closes #139

gaborcsardi requested a review from jimhester July 23, 2018 13:43

gaborcsardi added 2 commits July 23, 2018 14:51

Document kill_tree() return value

ccf3eac

Update NEWS, README

10bc977

jimhester reviewed Jul 23, 2018


Refactoring for pull #143

00c38dc

- Use getRversion() - Add comment about FILETIME -> Unix time conversion.

gaborcsardi merged commit a26ff9f into master Jul 24, 2018

gaborcsardi deleted the featue/kill-tree branch August 15, 2018 21:03
