New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Cleanup process tree #143
Cleanup process tree #143
Conversation
Closes #139
Codecov Report
@@ Coverage Diff @@
## master #143 +/- ##
==========================================
+ Coverage 69.67% 70.29% +0.62%
==========================================
Files 29 30 +1
Lines 2783 2956 +173
==========================================
+ Hits 1939 2078 +139
- Misses 844 878 +34
Continue to review full report at Codecov.
|
paste0( | ||
"PS", | ||
paste(sample(c(LETTERS, 0:9), 10, replace = TRUE), collapse = ""), | ||
"_", as.integer(Internal(Sys.time())) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we really need to use .Internal()
, or could we just unclass(Sys.time())
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Without .Internal()
, on some platforms it is ~10-1000x slower. Considering that process$new()
is currently about 1-3 ms, and can be still faster by avoiding assert_that()
, etc., we might want the faster solution.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wow, yeah structure()
is really slow.
struct <- function(x, ...) {
attributes(x) <- c(attributes(x), list(...))
x
}
bench::mark(
Sys.time(),
.Internal(Sys.time()),
.POSIXct(.Internal(Sys.time())),
struct(.Internal(Sys.time()), class = c("POSIXct", "POSIXt")),
check=FALSE, relative = TRUE)
#> # A tibble: 4 x 10
#> expression min mean median max `itr/sec` mem_alloc n_gc n_itr
#> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 Sys.time() 63.2 46.9 45.2 23.7 1 NaN Inf 1
#> 2 .Internal… 1 1 1 1 46.9 NaN NaN 1.00
#> 3 .POSIXct(… 60.4 38.9 40.4 12.3 1.21 NaN Inf 1
#> 4 "struct(.… 17.5 13.5 11.0 507. 3.48 NaN Inf 1.00
#> # ... with 1 more variable: total_time <dbl>
Sounds like we what you have is the best option then!
) | ||
} | ||
|
||
format_unix_time <- function(z) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could this just be a call to .POSIXct(x, tz = "GMT")
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't .POSIXct
an internal function? And as.POSIXct()
is very slow.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
By convention I guess yes, but because it is a base function they are all the equivalent of exported. I don't believe you get a NOTE from using them...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not just by convention:
Internal objects in the base package most of which are only user-visible because of the special nature of the base namespace.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes that is what I mean, if they were in a different package they would not be exported; but because they are in base everything is visible, so they are only internal by convention.
R/utils.R
Outdated
structure(z, class = c("POSIXct", "POSIXt"), tzone = "GMT") | ||
} | ||
|
||
r_version <- function(x) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not base::getRversion()
for this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because I did not find that.... thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
class = c("not_implemented", "error", "condition"))) | ||
} | ||
|
||
get("ps_kill_tree", asNamespace("ps"))(private$tree_id) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do you need to call an unexported function? Couldn't you export it in ps if needed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because the kill-tree functions are somewhat dangerous if R is multi-threaded and the other threads are starting processes, e.g. with_process_cleanup()
reliably crashes RStudio. So I don't want people to call these, necessarily. We could still export ps_kill_tree()
, but I just want to test it a bit in the wild, before letting people use it in their packages.
} | ||
|
||
ll = ((LONGLONG) ftCreate.dwHighDateTime) << 32; | ||
ll += ftCreate.dwLowDateTime - 116444736000000000LL; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is probably worth linking to documentation on this magic number. maybe https://docs.microsoft.com/en-us/windows/desktop/sysinfo/converting-a-time-t-value-to-a-file-time ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, why not, it is just the difference between the two origins....
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
|
||
do { | ||
if (rem_size == 0) { | ||
*buffer = S_realloc(*buffer, buffer_size * 2, buffer_size, 1); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of growing the buffer as you go why not just allocate the buffer to the full file size up front, using either lseek to seek to the end or with fstat? e.g. https://stackoverflow.com/a/13322673/2055486
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can't do that with files in /proc
, their size is not known. Based on my quick test, you also cannot seek to the end in them. They essentially behave like character devices AFAICT.
return 0.0; | ||
} | ||
|
||
ret = processx__read_file(path, &buf, /* buffer= */ 2048); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this proc file is always a single line why can you not use a single call to getline()
rather than writing processx__read_file()
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This one is one line, but /proc/stat
is not, for example. It is just better to have a generic solution that always works. Although we only read two files here, this is from ps, which reads a bunch of different files in /proc
.
return starttime; | ||
} | ||
|
||
void *processx__memmem(const void *haystack, size_t n1, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there are reason you need memmem here and cannot use strstr? Aren't needle and haystack both strings in your case?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
strstr
is good for this file. Some \proc
files have zero separated records, and this is a generic solution from ps.
Thanks! |
@jimhester What do you think about importing ps in processx? Basically a processx process is a superset of a ps process: it is a child process, so you can do more with it. So it would make sense to define processx methods for the ps operations, I think. ps only works on macOS, Linux and Windows currently, so on other platforms these methods will just throw OTOH, ps could also use processx. processx has an |
- Use getRversion() - Add comment about FILETIME -> Unix time conversion.
I guess technically they could both depend on each other as long as the dependency was a runtime, rather than build-time dependency. |
Every processx process
p
is marked with an environment variable that has a random part and a time stamp. Importantly, this variable is not set in the current process, that would lead to having it set in other, non-processx child processes, eg. started by RStudio.This env var is then inherited by all processes in the process tree rooted at
p
, even if some processes in the tree are orphaned. (In theory processes could opt out and unset the env var, but we don't anticipate this happening very often.)kill_tree()
uses this env var, and the (internal) ps package functionps_kill_tree
to clean up the whole tree.Closes #139.
EDIT: some more notes:
kill_tree()
fails with"not_implemented"
.kill_tree()
returns data about the killed processes, I'll add that to the docs in a minute.