Add string builtin #2296
- string builtin source, tests, & docs
- changes to configure.ac & Makefile.in
The pcre2 build is failing due to: |
It's obviously choking on If anyone wants to have a look at what this looks like in real usage, see faho/fish-shell@34a7832 on my personal "string" branch where I've replaced the My questions from that:
Are you trying to improve those? Otherwise, that's some fantastic work, and I look forward to using it in earnest since I've been uncomfortable with adding user input to sed's expressions for a while (e.g. |
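The worry about feeding user input into a regular expression can be illustrated outside fish. The following is a minimal Python sketch, not code from this PR: `replace_literal` is a hypothetical helper, and it only shows why unescaped input is risky and how escaping it makes regex metacharacters match literally.

```python
import re


def replace_literal(text, user_input, replacement):
    """Replace every occurrence of user_input in text, treating it as plain text.

    Hypothetical helper for illustration only; it is not part of fish.
    """
    # re.escape() neutralises metacharacters such as "." or "+" in the input.
    pattern = re.escape(user_input)
    # A callable replacement keeps backslashes in `replacement` literal as well.
    return re.sub(pattern, lambda _match: replacement, text)


if __name__ == "__main__":
    # Without escaping, "." would match any character, including the "x".
    print(replace_literal("a.b axb", ".", "-"))
    print(replace_literal("a+b", "+", r"\1"))
```

Without the `re.escape()` call, the first example would also rewrite the `x` in `axb`, which is the kind of surprise that makes building sed expressions from user input uncomfortable.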
I have no objection to this but the command line syntax would have to change. What do you suggest?
The regex string replace functionality is exactly what To get a tab or other escape sequence, you can do the usual fish quote-unquote thing:
Thanks for the reminder; pcre2 does provide human-friendly error strings so I will replace the error codes with those. |
I'd add a flag to specify that the next two arguments are pattern/replacement, something like
where "-r" applies to all pairs, regardless of where it is specified, and a "-p" for the first pair is optional. (This means, of course, that arguments on the command line need to be protected via "--", but that's already an issue.) (It also wouldn't easily work with our current completion syntax, but there's currently not much we could complete anyway.)
All pcre documentation I've found (and all applicable programs I know) use
Could you add it? I believe that the current situation is non-intuitive.
Ah, okay. |
Okay, I like that. Will do.
FYI here is the description of If we automatically honor backslash escapes within replacement strings, I think this will be the only place within fish where strings are treated this way. Is that the right move? Another alternative to playing with quotes is substitution with
|
The downside with Then this would conflict with variables. What would happen in this case? I am assuming it would use the variable $1.
|
It's also already the only place "within fish" that we honor "\w". I believe, especially considering that "sed" already does it, that it's quite intuitive that the "string" "command" accepts different arguments from other commands. Whether it's a built-in or not isn't too visible to users (of course they could check, but they have to explicitly do that).
Yeah, that's.... not great. Would it be possible to just change every "$([0-9])" to "(\1)" or is that prohibitively expensive or error-prone? I'm also probably massively over-thinking this - it's not the end of the world if I have to get used to "$1" instead of "\1".
Yes, it would. Of course using single quotes as much as possible is probably better in general and especially for things like this where you use unusual characters and don't want to constantly escape (which is why it's a blessing that fish usually doesn't do that), but if you were to use double quotes (and I'm guilty of using them too much), then you'd need to escape any "$". |
Okay, I can see that. Here is the list of escapes handled by the (undocumented)
If we go ahead with your suggestion I think it makes sense to handle the same set of escapes. Thoughts? On a related note, |
(I added that documentation in #2290 - though it was also previously added and removed again)
Yeah, it's best to stay consistent - though I can't say I've ever seen the need for the bell char, and I don't think "\c" applies here. The unicode escapes are nice, though, as is \n and \t.
You know what? The more I look at it, the more I see the issues that changing around pcre2 would cause for that one single piece of consistency with something else. It would have been nice if pcre2 didn't choose to use incompatible syntax, but as it stands now I'm beginning to convince myself that we should just keep that as is, as long as it's documented - which you've already done. There's another thing I've found, though, and that's the behavior without the "-a" option:
i.e. without the "-a" option, produces this output:
It only operates on the first line! Now, I'd have expected this tool to be line-based, to operate on every line (when given input via stdin), and the "-a" option to be analogous to sed's "g" (as in Is this by design? |
Honestly I simply wasn't thinking of the behavior of line-oriented tools, but that behavior makes more sense. I'll make it so the absence of I appreciate your feedback. I came at this with a few simple use cases in mind and it helps to hear from someone wanting to solve different problems. |
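The line-oriented behavior agreed on here (process every input line, replacing only the first match on each unless "-a" is given) can be sketched in Python. This is an illustration of the semantics under discussion, not the builtin's implementation; `replace_lines` and its `replace_all` parameter are hypothetical names standing in for the replace subcommand and its "-a" flag.

```python
import re


def replace_lines(pattern, replacement, lines, replace_all=False):
    """Apply a regex replacement to every line.

    With replace_all=False only the first match on each line is replaced,
    mirroring sed's "s/a/x/"; with replace_all=True every match is replaced,
    mirroring sed's "s/a/x/g".
    """
    regex = re.compile(pattern)
    count = 0 if replace_all else 1  # for re.sub, count=0 means "no limit"
    return [regex.sub(replacement, line, count=count) for line in lines]


if __name__ == "__main__":
    lines = ["aaa", "banana"]
    print(replace_lines("a", "x", lines))                    # ['xaa', 'bxnana']
    print(replace_lines("a", "x", lines, replace_all=True))  # ['xxx', 'bxnxnx']
```

The key point is that the second line is still processed when the flag is absent; only the number of replacements per line changes.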
Clean up completions for Fossil
Match the whole real home directory in prompt_pwd.
Emit a warning but keep building
Avoids differences in widths of wchar_t, hopefully addressing issue fish-shell#2284
This is already done by fish before calling the completion. It breaks completion with combiners (fish-shell#2025) and also with wrappers. (This does not include git because that's better solved in fish-shell#2145)
Closes fish-shell/fish-site#25. Signed-off-by: David Adam <zanchey@ucc.gu.uwa.edu.au> [skip ci]
It is duplicative of the fish_mode_prompt function Fixes fish-shell#2228
Make it simpler, and use wcstring instead of wcsdup
- make match/replace without -a operate on the first match on each argument
- use different exit codes for "no operation performed" and errors, as grep does
- refactor regex compile code
- use human-friendly error messages from pcre2
- improve error handling & reporting elsewhere
- add a few tests
- make some doc fixes
- some simplification & cleanup
- fix ci build failure (I hope)
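The grep convention referred to in this commit is: exit status 0 when something matched, 1 when nothing matched, and 2 when an error occurred. A minimal Python sketch of that three-way distinction follows. The constant names and `match_status` are hypothetical, and the exact values the string builtin returns are assumed here to follow grep rather than taken from the source.

```python
import re

# grep's convention; assumed (not confirmed here) to be what the builtin mirrors.
EXIT_MATCH = 0
EXIT_NO_MATCH = 1
EXIT_ERROR = 2


def match_status(pattern, lines):
    """Return a grep-style status for matching pattern against lines."""
    try:
        regex = re.compile(pattern)
    except re.error:
        # An invalid pattern is an error, distinct from "nothing matched".
        return EXIT_ERROR
    if any(regex.search(line) for line in lines):
        return EXIT_MATCH
    return EXIT_NO_MATCH


if __name__ == "__main__":
    print(match_status("a+", ["bab"]))  # 0
    print(match_status("z", ["bab"]))   # 1
    print(match_status("(", ["bab"]))   # 2
```

Keeping "no match" and "error" on separate codes lets a script test for a failed match without also swallowing a malformed pattern.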
Okay, finally got the CI build of pcre2 squared away.

@faho: I took your suggestions except for accepting multiple pattern/replacement pairs. After working on it for a bit, I concluded that the getopt hackery required to have an option take two arguments, combined with its usual argument permutation behavior, made a reliable implementation more trouble than it was worth. That could be added in the future though.

@ridiculousfish: the problems noted in #2296 (comment) still exist. Otherwise I think this is ready to go. |
Squashed commit of the following:

commit 4c3eaeb6e57d76463e9683c327142b0aeafb92b8
Author: ridiculousfish <corydoras@ridiculousfish.com>
Date: Sat Sep 12 12:51:30 2015 -0700

    Remove testdata and doc dirs from pcre2 source

commit b2a8b4b50f2398b204fb72cfe4b5ba77ece2e1ab
Merge: 11c8a47 7974aab
Author: ridiculousfish <corydoras@ridiculousfish.com>
Date: Sat Sep 12 12:32:40 2015 -0700

    Merge branch 'string' of git://github.com/msteed/fish-shell into string-test

commit 7974aab
Author: Michael Steed <msteed@saltstack.com>
Date: Fri Sep 11 13:00:02 2015 -0600

    build pcre2 lib only, no docs

commit eb20b43
Merge: 1a09e70 5f519cb
Author: Michael Steed <msteed68@gmail.com>
Date: Thu Sep 10 20:00:47 2015 -0600

    Merge branch 'string' of github.com:msteed/fish-shell into string

commit 1a09e70
Author: Michael Steed <msteed68@gmail.com>
Date: Thu Sep 10 19:58:24 2015 -0600

    rebase on master & address the fallout

commit a0ec977
Author: Michael Steed <msteed68@gmail.com>
Date: Thu Sep 10 19:26:45 2015 -0600

    use fish's wildcard_match() for glob matching

commit 64c25a0
Author: Michael Steed <msteed68@gmail.com>
Date: Thu Aug 27 08:19:23 2015 -0600

    some fixes from review

    - string_get_arg_stdin(): simplify and don't discard the argument when the trailing newline is absent
    - fix calls to pcre2 for e.g. string match -r -a 'a*' 'b'
    - correct test for args coming from stdin

commit ece7f35
Author: Michael Steed <msteed68@gmail.com>
Date: Sat Aug 22 19:35:56 2015 -0600

    fixes from review

    - Makefile.in: restore iwyu target
    - regex_replacer_t::replace_matches(): correct size passed to realloc()

commit 9ff7477
Author: Michael Steed <msteed68@gmail.com>
Date: Thu Aug 20 13:08:33 2015 -0600

    Minor doc improvements

commit baf4e09
Author: Michael Steed <msteed68@gmail.com>
Date: Wed Aug 19 18:29:02 2015 -0600

    another attempt to fix the ci build

commit 896a2c2
Author: Michael Steed <msteed68@gmail.com>
Date: Wed Aug 19 18:03:49 2015 -0600

    Updates after review comments

    - make match/replace without -a operate on the first match on each argument
    - use different exit codes for "no operation performed" and errors, as grep does
    - refactor regex compile code
    - use human-friendly error messages from pcre2
    - improve error handling & reporting elsewhere
    - add a few tests
    - make some doc fixes
    - some simplification & cleanup
    - fix ci build failure (I hope)

commit efd47dc
Author: Michael Steed <msteed68@gmail.com>
Date: Wed Aug 12 00:26:07 2015 -0600

    fix dependencies for parallel make

commit ed0850e
Author: Michael Steed <msteed68@gmail.com>
Date: Tue Aug 11 23:37:22 2015 -0600

    Add missing pcre2 files + .gitignore

commit 9492e7a
Author: Michael Steed <msteed68@gmail.com>
Date: Tue Aug 11 22:44:05 2015 -0600

    add pcre2-10.20 and update license.hdr

commit 1a60b93
Author: Michael Steed <msteed68@gmail.com>
Date: Tue Aug 11 22:41:19 2015 -0600

    add string builtin files

    - string builtin source, tests, & docs
    - changes to configure.ac & Makefile.in

commit 5f519cb
Author: Michael Steed <msteed68@gmail.com>
Date: Thu Sep 10 19:26:45 2015 -0600

    use fish's wildcard_match() for glob matching

commit 2ecd24f
Author: Michael Steed <msteed68@gmail.com>
Date: Thu Aug 27 08:19:23 2015 -0600

    some fixes from review

    - string_get_arg_stdin(): simplify and don't discard the argument when the trailing newline is absent
    - fix calls to pcre2 for e.g. string match -r -a 'a*' 'b'
    - correct test for args coming from stdin

commit 45b777e
Author: Michael Steed <msteed68@gmail.com>
Date: Sat Aug 22 19:35:56 2015 -0600

    fixes from review

    - Makefile.in: restore iwyu target
    - regex_replacer_t::replace_matches(): correct size passed to realloc()

commit 981cbb6
Author: Michael Steed <msteed68@gmail.com>
Date: Thu Aug 20 13:08:33 2015 -0600

    Minor doc improvements

commit ddb6a2a
Author: Michael Steed <msteed68@gmail.com>
Date: Wed Aug 19 18:29:02 2015 -0600

    another attempt to fix the ci build

commit 1e34e31
Author: Michael Steed <msteed68@gmail.com>
Date: Wed Aug 19 18:03:49 2015 -0600

    Updates after review comments

    - make match/replace without -a operate on the first match on each argument
    - use different exit codes for "no operation performed" and errors, as grep does
    - refactor regex compile code
    - use human-friendly error messages from pcre2
    - improve error handling & reporting elsewhere
    - add a few tests
    - make some doc fixes
    - some simplification & cleanup
    - fix ci build failure (I hope)

commit 34232e1
Author: Michael Steed <msteed68@gmail.com>
Date: Wed Aug 12 00:26:07 2015 -0600

    fix dependencies for parallel make

commit 00d7e78
Author: Michael Steed <msteed68@gmail.com>
Date: Tue Aug 11 23:37:22 2015 -0600

    Add missing pcre2 files + .gitignore

commit 4498aa5
Author: Michael Steed <msteed68@gmail.com>
Date: Tue Aug 11 22:44:05 2015 -0600

    add pcre2-10.20 and update license.hdr

commit 290c58c
Author: Michael Steed <msteed68@gmail.com>
Date: Tue Aug 11 22:41:19 2015 -0600

    add string builtin files

    - string builtin source, tests, & docs
    - changes to configure.ac & Makefile.in
I've squash-merged to a new branch string-staging. I'll use that to add support in the Xcode build. zanchey, you can take a closer look at how it's plugged into the build too on this branch if you like. I also hope to add support for using OS X's libpcre here. |
I think OS X (and most Linux distributions) use the old PCRE API rather than PCRE2, so I'm not sure how straightforward that will be. Almost no distributions are shipping the PCRE2 libraries yet. |
It should be straightforward to conditionally use PCRE1 or PCRE2 depending on what's available at build time. But OS X doesn't ship any PCRE headers, and they hate it when developers reverse engineer headers. So the OS X build will build the pcre2 library, and link against it statically. This is already working in Xcode on the branch. |
🎉 Wonderful news. |
Excellent! Thanks to @ridiculousfish and @faho for the careful reviews, helpful feedback, and many improvements. Thanks to @kballard for the original interface design. And thanks to everyone who contributed to the discussion on #156. |
Awesome! Some more work needs to be done on the integration with the autotools build - requiring autoreconf prevents building on RHEL 5 due to too-old autoconf, and as our tarballs aren't marked as depending on aclocal the build also fails on openSUSE. I'll try and take a look in the next few days. |
The change c1bd3b5 fixes issue 5 in my list, closing the loop on that. |
What version of fish is this in? It's really difficult to tell when the same milestone (next-2.x) keeps being reused. Any reason we're doing it that way? |
Since it's in the next-2.x milestone, that means it's not released yet, so it's in whatever comes after 2.2 (which AFAIK hasn't been decided yet, though I'd quite like it if it were 2.3 since I have a thing for bad movies). |
It looks like the next-2.x milestone bucket is being reused for 2.0, 2.1 and 2.2. That means I can't tell which release this was in: this ticket goes all the way back to September and I have no idea when 2.2 was released. IOW, was this ticket filed before 2.2 was released or not? Am I missing something? I think my concern here is legit. |
@danielb2: All our releases are git tags. If you wish to know which tag contains a given commit (i.e. "came after" it), use AFAIK, on the last release the "next-2.x" milestone was |
faho is right: milestones are never reused across releases. |
|
@danielb2: You need doxygen to build the docs. You can also try |
thank you |
there. named tmux handling based on dir name. Thanks @msteed! I was using ruby to do this before. All |
@danielb2: Cool! Note that you can also replace |
oh. sweet! thanks :) |
@faho btw man string and string --help didn't work. Does it mean I have to have doxygen installed before I compile? |
Effectively yes. You need doxygen to transform the documentation from our input format to the output formats (the website, the man pages and the "--help" output). This is done as part of the build process. You do not need it just to view the documentation. Doxygen is a build-time dependency. It also says so in README.md under the heading "Building":
|
Refs #156.
I think this is ready, with the following exceptions:
lexicon_filter or Doxygen seems to be choking on some of the examples in doc_src/string.txt. I'm not sure how to fix it.