New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve registry value
return types
#10806
Conversation
Specifically: - REG_MULTI_SZ returns list<string> instead of | str join (char nl) - REG_EXPAND_SZ automatically expands %EnvironmentString% placeholders - REG_DWORD_BIG_ENDIAN returns the endianness decoded value - REG_QWORD actually ingests values over 2**32 - REG_LINK produces a decoded string - REG_*RESOURCE* produces binary data
Thanks @CAD97. I appreciate your work here. I'm fine with making your changes the default behavior, because these changes are kind of a QOL improvement. Most people will probably want to use this manipulated output that you've created. I see that value, for sure, but I don't want to lose track of my original intent, which was to get the values, without manipulation, from the registry. With that in mind, would you be interested in adding a This way, we can have the helpful defaults that you've written and then for the people that need it, we can have the raw data.
You should feel free. I know you well enough by your PRs to know that you have good instincts. If there's something else you want to discuss, feel free to ping me on Discord or throw up an issue for discussion. |
The thing is, the current behavior isn't really "without manipulation" either. If we start from the raw bytes retrieved from the registry, the decoding currently done is:
This proposed implementation instead does:
Note that while I do mutate When you look at it like that, it's not that big of a diff... | each { update value (match $in.type {
REG_NONE => {|| null }
REG_BINARY => {|it| $it }
REG_DWORD => {|it| $it | into int }
- REG_DWORD_BIG_ENDIAN => {|it| $it | into int }
+ REG_DWORD_BIG_ENDIAN => {|it| $it | into int --endian big }
- REG_QWORD => {|it| $it | take 4 | into int }
+ REG_QWORD => {|it| $it | into int }
REG_SZ => {|it| $it | decode utf-16 | str trim --right "\u{0}" }
- REG_MULTI_SZ => {|it| $it | decode utf-16 | str trim --right "\u{0}"` | str replace --all "\u{0}" "\n" }
+ REG_MULTI_SZ => {|it| $it | decode utf-16 | str trim --right "\u{0}"` | split row "\u{0}" }
- REG_EXPAND_SZ => {|it| $it | decode utf-16 | str trim --right "\u{0}"}
+ REG_EXPAND_SZ => {|it| $it | decode utf-16 | str trim --right "\u{0}" | str expand-env-strings}
- REG_LINK => {|it| $"($it)" }
+ REG_LINK => {|it| $it | decode utf-16 | str trim --right "\u{0}" }
- REG_RESOURCE_LIST => {|it| $"($it)" }
- REG_FULL_RESOURCE_DESCRIPTOR => {|it| $"($it)" }
- REG_RESOURCE_REQUIREMENTS_LIST => {|it| $"($it)" }
+ REG_RESOURCE_LIST => {|it| $it }
+ REG_FULL_RESOURCE_DESCRIPTOR => {|it| $it }
+ REG_RESOURCE_REQUIREMENTS_LIST => {|it| $it }
_ => (make error "invalid registry type")
})} The biggest reason for the net line add is the very explicit reasoning around calling
In that case.... Give me a couple days to think through it a bit more and I'll have a concrete proposal. Before giving it much more thought I'm considering roughly a shape of...# Commands for interacting with the Windows registry.
#
# This command does nothing by itself.
# You must use one of the subcommands.
export def registry [] { help registry }
def "nu-complete registry hive" [context: string] {
mut opts = []
# note: only completes the short form lowercase,
# but should accept long form, case insensitive
if context !~ "--app " { $opts = $opts | append [
{value: "hkcr", description: "the HKEY_CLASSES_ROOT hive"}
{value: "hkcu", description: "the HKEY_CURRENT_USER hive"}
{value: "hklm", description: "the HKEY_LOCAL_MACHINE hive"}
{value: "hku", description: "the HKEY_USERS hive"}
{value: "hkpd", description: "the HKEY_PERFORMANCE_DATA hive"}
{value: "hkpt", description: "the HKEY_PERFORMANCE_TEXT hive"}
{value: "hknls", description: "the HKEY_PERFORMANCE_NLSTEXT hive"}
{value: "hkcc", description: "the HKEY_CURRENT_CONFIG hive"}
{value: "hkdd", description: "the HKEY_DYN_DATA hive"}
{value: "hkculs", description: "the HKEY_CURRENT_USER_LOCAL_SETTINGS hive"}
]}
if context !~ "--sys " { $opts = $opts | append (
# yes I know this doesn't actually work as-is
nu-complete path $context
)}
$opts
}
# Query raw entries from the Windows registry.
export extern "registry query" [
--sys # restrict query to system registry hives (incompatible with --app)
--app # restrict query to application registry hives (incompatible with --sys)
hive: string@"nu-complete registry hive" # name of a system hive or path to an application hive
key?: string # registry keypath to query; empty string means the hive root
value?: string # registry value to query; empty string means the default value
]: [
<nothing> | registry query <string> <string?> -> <table<name: string, value: bytes, type: string>>
<nothing> | registry query <string> <string> <string> -> <record<name: string, value: bytes, type: string>>
]
# Expands environment variable expansion placeholders.
#
# If neither switch is provided, defaults to expanding the OS native format.
export extern "str expand-env" [
--windows # expand windows-style %VAR% placeholders
--unix # expand unix-style $VAR placeholders
]: [
list<string> -> list<string>
string -> string
]
# and/or a new --env switch for "path expand"
# (e.g. Python provides as os.path.expandvars)
# Decode raw Windows registry entries.
export def "registry decode" []: [
table<name: string, value: bytes, type: string> -> record
record<name: string, value: bytes, type: string> -> <bytes or string or int>
] {
let it = $in
if ($it | describe | str replace -r "<.*" "") in [table list] {
return $in | each { registry decode }
}
let decode = match $it.type {
# as in the previous details panel's diff
}
do $decode $it.value
}
# or: "registry into record" and "registry into value", a la "into values"
# *maybe* teach it to decode registry performance queries
# Retrieve values from the Windows registry.
export def "registry get" [
--sys # restrict query to system registry hives (incompatible with --app)
--app # restrict query to application registry hives (incompatible with --sys)
hive: string@"nu-complete registry hive" # name of a system hive or path to an application hive
key?: string # registry keypath to query; empty string means the hive root
value?: string # registry value to query; empty string means the default value
]: [
<nothing> | registry get <string> <string?> -> <record>
<nothing> | registry get <string> <string> <string> -> <bytes or string or int>
] {
match [$sys $app] {
[true true] => registry query --sys --app $hive $key $value
[true false] => registry query --sys $hive $key $value
[false true] => registry query --app $hive $key $value
[false false] => registry query $hive $key $value
} | registry decode
}
# And I do like the concept of a --depth N switch for recursively opening subkeys,
# but I don't know how that'd fit in, especially as IIUC subkey/value are separate namespaces Need to think more about it but felt the need to write that sketch down, I guess. Footnotes
|
Technically, I agree. You are right. However, I'd still like to have a I look forward to your proposal. |
When provided, we don't expand %ENV% placeholders in REG_EXPAND_SZ registry values, restoring previous behavior.
I added a > registry query Environment TEMP
╭───────┬─────────────────────────────────╮
│ name │ TEMP │
│ value │ C:\Users\CAD\AppData\Local\Temp │
│ type │ REG_EXPAND_SZ │
╰───────┴─────────────────────────────────╯
> registry query Environment TEMP --no-expand
╭───────┬──────────────────────────────────╮
│ name │ TEMP │
│ value │ %USERPROFILE%\AppData\Local\Temp │
│ type │ REG_EXPAND_SZ │
╰───────┴──────────────────────────────────╯ Footnotes
|
I'm good with this now too. Thanks. I like |
@CAD97 I just noticed that there are 2 unwraps() here https://github.com/nushell/nushell/pull/10806/files#diff-b395629258122f55118d05c97fc8449f9b307e4bc4f4f7558d7c9ac42b8d3db3R310-R322 I'm not sure how the CI didn't catch these. Do these just need to be unwrap_or(0) or what? |
The unwraps are effectively infallible; the precondition (the vtype is correct) is checked just before. This is where If we do eventually decide to factor out some sort of |
I guess nushell is in the religious "don't unwrap" camp since the CI is supposed to fail with any unwrap, except for tests. I'm not really sure how this PR squeaked by with unwraps. So, we have to change them. For now, I'll just make them expects and reference your points. Thanks for the input. |
r? @fdncred Last one, I hope. At least short of completely redesigning `registry query`'s interface. (Which I wouldn't implement without asking around first.) # Description User-Facing Changes has the general overview. Inline comments provide a lot of justification on specific choices. Most of the type conversions should be reasonably noncontroversial, but expanding `REG_EXPAND_SZ` needs some justification. First, an example of the behavior there: ```shell > # release nushell: > version | select version commit_hash | to md --pretty | version | commit_hash | | ------- | ---------------------------------------- | | 0.85.0 | a6f62e0 | > registry query --hkcu Environment TEMP | get value %USERPROFILE%\AppData\Local\Temp > # with this patch: > version | select version commit_hash | to md --pretty | version | commit_hash | | ------- | ---------------------------------------- | | 0.86.1 | 0c5a4c9 | > registry query --hkcu Environment TEMP | get value C:\Users\CAD\AppData\Local\Temp > # Microsoft CLI tooling behavior: > ^pwsh -c `(Get-ItemProperty HKCU:\Environment).TEMP` C:\Users\CAD\AppData\Local\Temp > ^reg query HKCU\Environment /v TEMP HKEY_CURRENT_USER\Environment TEMP REG_EXPAND_SZ %USERPROFILE%\AppData\Local\Temp ``` As noted in the inline comments, I'm arguing that it makes more sense to eagerly expand the %EnvironmentString% placeholders, as none of Nushell's path functionality will interpret these placeholders. This makes the behavior of `registry query` match the behavior of pwsh's `Get-ItemProperty` registry access, and means that paths (the most common use of `REG_EXPAND_SZ`) are actually usable. This does *not* break nu_script's [`update-path`](https://github.com/nushell/nu_scripts/blob/main/sourced/update-path.nu); it will just be slightly inefficient as it will not find any `%Placeholder%`s to manually expand anymore. But also, note that `update-path` is currently *wrong*, as a path including `%LocalAppData%Low` is perfectly valid and sometimes used (to go to `Appdata\LocalLow`); expansion isn't done solely on a path segment basis, as is implemented by `update-path`. I believe that the type conversions implemented by this patch are essentially always desired. But if we want to keep `registry query` "pure", we could easily introduce a `registry get`[^get] which does the more complete interpretation of registry types, and leave `registry query` alone as doing the bare minimum. Or we could teach `path expand` to do `ExpandEnvironmentStringsW`. But REG_EXPAND_SZ being the odd one out of not getting its registry type semantics decoded by `registry query` seems wrong. [^get]: This is the potential redesign I alluded to at the top. One potential change could be to make `registry get Environment` produce `record<Path: string, TEMP: string, TMP: string>` instead of `registry query`'s `table<name: string, value: string, type: string>`, the idea being to make it feel as native as possible. We could even translate between Nu's cell-path and registry paths -- cell paths with spaces do actually work, if a bit awkwardly -- or even introduce lazy records so the registry can be traversed with normal data manipulation ... but that all seems a bit much. # User-Facing Changes - `registry query`'s produced `value` has changed. Specifically: - ❗ Rows `where type == REG_EXPAND_SZ` now expand `%EnvironmentVarable%` placeholders for you. For example, `registry query --hkcu Environment TEMP | get value` returns `C:\Users\CAD\AppData\Local\Temp` instead of `%USERPROFILE%\AppData\Local\Temp`. - You can restore the old behavior and preserve the placeholders by passing a new `--no-expand` switch. - Rows `where type == REG_MULTI_SZ` now provide a `list<string>` value. They previously had that same list, but `| str join "\n"`. - Rows `where type == REG_DWORD_BIG_ENDIAN` now provide the correct numeric value instead of a byte-swapped value. - Rows `where type == REG_QWORD` now provide the correct numeric value[^sign] instead of the value modulo 2<sup>32</sup>. - Rows `where type == REG_LINK` now provide a string value of the link target registry path instead of an internal debug string representation. (This should never be visible, as links should be transparently followed.) - Rows `where type =~ RESOURCE` now provide a binary value instead of an internal debug string representation. [^sign]: Nu's `int` is a signed 64-bit integer. As such, values >= 2<sup>63</sup> will be reported as their negative two's compliment value. This might sometimes be the correct interpretation -- the registry does not distinguish between signed and unsigned integer values -- but regedit and pwsh display all values as unsigned.
r? @fdncred
Last one, I hope. At least short of completely redesigning
registry query
's interface. (Which I wouldn't implement without asking around first.)Description
User-Facing Changes has the general overview. Inline comments provide a lot of justification on specific choices. Most of the type conversions should be reasonably noncontroversial, but expanding
REG_EXPAND_SZ
needs some justification. First, an example of the behavior there:As noted in the inline comments, I'm arguing that it makes more sense to eagerly expand the %EnvironmentString% placeholders, as none of Nushell's path functionality will interpret these placeholders. This makes the behavior of
registry query
match the behavior of pwsh'sGet-ItemProperty
registry access, and means that paths (the most common use ofREG_EXPAND_SZ
) are actually usable.This does not break nu_script's
update-path
; it will just be slightly inefficient as it will not find any%Placeholder%
s to manually expand anymore. But also, note thatupdate-path
is currently wrong, as a path including%LocalAppData%Low
is perfectly valid and sometimes used (to go toAppdata\LocalLow
); expansion isn't done solely on a path segment basis, as is implemented byupdate-path
.I believe that the type conversions implemented by this patch are essentially always desired. But if we want to keep
registry query
"pure", we could easily introduce aregistry get
1 which does the more complete interpretation of registry types, and leaveregistry query
alone as doing the bare minimum. Or we could teachpath expand
to doExpandEnvironmentStringsW
. But REG_EXPAND_SZ being the odd one out of not getting its registry type semantics decoded byregistry query
seems wrong.User-Facing Changes
registry query
's producedvalue
has changed. Specifically:where type == REG_EXPAND_SZ
now expand%EnvironmentVarable%
placeholders for you. For example,registry query --hkcu Environment TEMP | get value
returnsC:\Users\CAD\AppData\Local\Temp
instead of%USERPROFILE%\AppData\Local\Temp
.--no-expand
switch.where type == REG_MULTI_SZ
now provide alist<string>
value. They previously had that same list, but| str join "\n"
.where type == REG_DWORD_BIG_ENDIAN
now provide the correct numeric value instead of a byte-swapped value.where type == REG_QWORD
now provide the correct numeric value2 instead of the value modulo 232.where type == REG_LINK
now provide a string value of the link target registry path instead of an internal debug string representation. (This should never be visible, as links should be transparently followed.)where type =~ RESOURCE
now provide a binary value instead of an internal debug string representation.Footnotes
This is the potential redesign I alluded to at the top. One potential change could be to make
registry get Environment
producerecord<Path: string, TEMP: string, TMP: string>
instead ofregistry query
'stable<name: string, value: string, type: string>
, the idea being to make it feel as native as possible. We could even translate between Nu's cell-path and registry paths -- cell paths with spaces do actually work, if a bit awkwardly -- or even introduce lazy records so the registry can be traversed with normal data manipulation ... but that all seems a bit much. ↩Nu's
int
is a signed 64-bit integer. As such, values >= 263 will be reported as their negative two's compliment value. This might sometimes be the correct interpretation -- the registry does not distinguish between signed and unsigned integer values -- but regedit and pwsh display all values as unsigned. ↩