New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
wordlists: init #260240
wordlists: init #260240
Conversation
https://github.com/NixOS/nixpkgs/pull/104712/files#diff-2686bfedb20ad54b4cd53e6a12dd787b4b62aa4b7c6d93d2311702706105e24a added documentation to the manual. It would be nice to have documentation added here to know how to use the wordlists at a high level. For example now, it's unclear to me how to use the |
I don't think the manual is a good place for the documentation of this package, but I can add some docs to the met long description which would probably be easiest to find for users. To use the package with a specific set of word lists you would overwrite the arg parameter with --arg which is somewhat common in nix, but I can write some docs for that. |
Why do you think that? |
Because its hard to find that for users compared to the description or longDescription attribute because both of them are shown in search.nixos.org but I can add a section there as well and link it. |
@Pamplemousse I added some docs to the long description, can you check them? |
cc @Tochiaha |
dc21835
to
27681dc
Compare
This pull request has been mentioned on NixOS Discourse. There might be relevant details there: https://discourse.nixos.org/t/prs-ready-for-review/3032/2813 |
, tree | ||
, wfuzz | ||
, lists ? [ | ||
nmap |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wouldn't this put nmap
in $out/bin
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes it would. I don't really care what we add to the default lists we can also opt for just lists that add text files, if you prefer we do that, or do a extra derivation that inherits version and source from the nmap package but only outputs the nmap wordlist.
|
||
symlinkJoin rec { | ||
pname = "wordlists"; | ||
version = "unstable-2023-10-10"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there any reason for this to be unstable if we are the one's maintaining it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know, I just went with the version scheme I know which is either use the upstream version which we don't have here or use unstable-date-of-last-edit
Feel free to make suggestions on how we should version it.
@h7x4 thank you for your review, I implemented most of the low effort stuff, and will try to get to the other stuff as soon as I can but my backlog of stuff is currently way too long so it will probably take me a few more days to come back to this. |
15cd588
to
e8383f7
Compare
] | ||
}: | ||
|
||
symlinkJoin rec { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wrote this in a response in one of the outdated threads, but I think it regards more than the initial discussion, so I thought I'd rather make a new thread.
I've had another look at this. Introducing binaries in the middle of a symlinkJoin
feels like a hack that circumvents the purpose of symlinkJoin
.
I think the canonical way would've been to let the original binaries be standalone tools that takes their input via normal methods like args, envvars or stdin, and then rather wrap the program to fix its inputs. Or at least somehow build the tools in standalone derivations that depend on the lists as input.
How about a design that looks more like this?
flowchart TD
wordlists["wordlists package (symlinkJoin)"]
binaries["binaries (wrapping wordlist collection dir)"]
symlinkedWordlists["wordlist collection (symlinkJoin)"]
wordlist1["wordlist 1"]
wordlist2["wordlist 2"]
wordlist3["wordlist 3"]
wordlists --> binaries
wordlists --> symlinkedWordlists
binaries --> symlinkedWordlists
symlinkedWordlists --> wordlist1
symlinkedWordlists --> wordlist2
symlinkedWordlists --> wordlist3
This would resolve several of the comments that are currently "unfixable" due to the weird use of symlinkJoin
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So wordlists package (symlinkJoin) would be a package including wordlists or would that be the current wordlists package we have?
This would be the equivalent of the current wordlists package we already have, consisting of the symlinkJoin
of the binaries derivation and the "wordlist collection" derivation
binaries (wrapping wordlist collection dir) this would be binaries interacting with wordlists like the current wordlists and wordlists_path ones?
Correct. And this derivation will use the "wordlist collection" as its input, so it can refer to it without needing to refer to the $out
of its own derivation.
What is wordlist collection (symlinkJoin) supposed to be?
This is a derivation created with symlinkJoin
which only contains the combination of the wordlists, no binaries and nothing else. The binaries derivation will be able to refer to this derivation as its input.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is wordlist collection (symlinkJoin) supposed to be?
This is a derivation created with symlinkJoin which only contains the combination of the wordlists, no binaries and nothing else. The binaries derivation will be able to refer to this derivation as its input.
what would you do about packages like nmap that themselfs have a binary as a output? would you go and do a extra derivation that only includes the wordlist or would you be fine with the way we are currently doing it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for elaborating, I'll start working on a implantation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what would you do about packages like nmap that themselfs have a binary as a output? would you go and do a extra derivation that only includes the wordlist or would you be fine with the way we are currently doing it?
For cleanliness sake, I think I'd create the derivation not using symlinkJoin
, but rather using either runCommand
or stdenvNoCC.mkDerivation
and then symlinking the contents inside every packages $out/share/wordlists
.
However, using symlinkJoin
shouldn't break anything as long as there's no colliding paths AFAIK.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm okay, this feels like it's adding more complexity then it's worth because with the way you suggested if we still want to allow the user to override the lists used we have to give the list of wordlists provided in wordlists package
to every single wordlists collection
which seems like a bit of ugly wiring and we also have to provide every single binaries (wrapping wordlist collection dir)
with the current collection. which this both wouldn't be hard to do it's just adding a lot of boilerplate and seems to over complicate stuff. I don't really see the problem this would fix/the value we would get.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm a bit confused.
every single wordlists collection
The wordlists collection is a single derivation consisting of the list of wordlists the user provides. wordlist1, wordlist2, and wordlist3 are just example substitutes for these. The suggestion about cleaning the list of wordlist derivations is separate from the architectural change.
The only real change here is that we split up the large symlinkJoin
step into 3 pieces. One for joining the wordlists, one for creating the binaries, and lastly one for the combining these two. We don't need to change how the user should be able to override the list of wordlists.
This is the corresponding diagram for the earlier implementation for reference:
flowchart TD
wordlists["wordlists package (symlinkJoin)\ncontains binaries injected in postBuild"]
wordlist1["wordlist 1"]
wordlist2["wordlist 2"]
wordlist3["wordlist 3"]
wordlists --> wordlist1
wordlists --> wordlist2
wordlists --> wordlist3
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the current code is good enough and we can refactor/make it better at a later point. The binaries could be extracted to a let statement, but that would just be internal refactoring in the derivation. There would be no point in putting the scripts into the global package namespace anyway since their usecase would be very limited.
Result of 29 packages built:
|
This pull request has been mentioned on NixOS Discourse. There might be relevant details there: |
Description of changes
This is a simplified take on #104712 and was inspired by it a lot
Things done
sandbox = true
set innix.conf
? (See Nix manual)nix-shell -p nixpkgs-review --run "nixpkgs-review rev HEAD"
. Note: all changes have to be committed, also see nixpkgs-review usage./result/bin/
)