Update Regular Expression HOWTO #55084
Comments
The history paragraph "The re module was added in Python 1.5, and provides Perl-style regular expression patterns. Earlier versions of Python came with the regex module, which provided Emacs-style patterns. The regex module was removed completely in Python 2.5." might be eliminated in 3.x, or at least the irrelevant-for-py3 reference to regex. This is a policy decision.
"If you have Tkinter available, you may also want to look at Tools/scripts/redemo.py," Change 'Tkinter' to 'tkinter' and make it a module reference. "Phil Schwartz’s Kodos is also an interactive tool for developing and testing RE patterns." Add the url '(http://kodos.sourceforge.net/)' to the text so that Windows help users can copy and paste it into a browser. (This should be a general policy.) "Python 2.2.2 (#1, Feb 10 2003, 12:57:01)" <_sre.SRE_Match object at 80c4f68> This is correctly updated (for late 2.x and 3.x) "<re.MatchObject instance at 80c9650>" (7 like this) Globally replace 're.MatchObject instance' with '_sre.SRE_Match object'
"[1] Introduced in Python 2.2.2." remove for 3.x here and wherever footnote reference is in the text.
This section is about *using* re.VERBOSE and the benefit thereof, not about not using it. I recommend deleting 'Not' as it gives the impression that the section is a warning about not using, the opposite of the intent.
I ran doctest.testfile("C:/programs/PyDev/py32/Doc/howto/regex.rst", module_relative = False) After the 're...' to '_sre...' substitution above, all 11 failures would be due to 'at 0x#######' address mismatches. I believe changing all 11 addresses to '0x...' (I took this from the doctest doc) would both fix the failures and remove irrelevant detail for human readers. The other 87 examples all passed ;-!. Is there any current doctest-related markup that should be added? |
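As context for the re.VERBOSE point above, here is a minimal sketch of the benefit that section of the HOWTO describes: the same pattern written compactly and with re.VERBOSE. The sample input line is an assumption chosen for illustration, not text taken from the issue.

```python
import re

# Compact form: correct, but hard to read at a glance.
compact = re.compile(r"\s*(?P<header>[^:]+)\s*:(?P<value>.*?)\s*$")

# Same pattern with re.VERBOSE: whitespace in the pattern is ignored
# and each piece can carry its own comment.
verbose = re.compile(r"""
    \s*                 # Skip leading whitespace
    (?P<header>[^:]+)   # Header name (everything up to the colon)
    \s* :               # Optional whitespace, then the colon
    (?P<value>.*?)      # The header's value, matched lazily
    \s*$                # Trailing whitespace up to end of line
""", re.VERBOSE)

# Hypothetical input line, used only to compare the two patterns.
line = "  Subject : hello world  "
compact_groups = compact.match(line).groupdict()
verbose_groups = verbose.match(line).groupdict()
```

Both compiled patterns should behave identically; only the readability of the source differs.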
Your points 1-5 all sound valid to me. Would you like to make a patch? I don't know what to do about the release number. It probably doesn't hurt anyone to keep it. |
Good points overall. The only subpoint I disagree with is this one: “Add the url '(http://kodos.sourceforge.net/)' to the text so that Windows help users can copy and paste it into a browser. (This should be a general policy.)” IMO, it’s the job of the Sphinx builder to add URIs in plaintext if the format does not have hyperlinks. -1 on cluttering the source and HTML output with duplicated links. |
Oh right, I misread that one. Can't Windows help users right-click and select "Copy URL"? |
Here is the patch implementing all but the url suggestion. Doctest still has 11 failures (changing to '0x...' didn't help). |
A few bits and pieces fixed compared to the previous patch.
>>> doctest.testfile("/home/mischa/pydev/Doc/howto/regex.rst", module_relative = False, optionflags=doctest.ELLIPSIS)
TestResults(failed=0, attempted=98) |
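A minimal sketch of why the '0x...' substitution only helps once doctest.ELLIPSIS is enabled. The sample doctest text and the run_sample helper are hypothetical; a plain object() stands in for the match objects, since its repr also contains a run-dependent address.

```python
import doctest

# Hypothetical doctest text: the expected output uses "..." in place
# of the memory address, as proposed for the HOWTO examples.
SAMPLE = """
>>> object()
<object object at 0x...>
"""

def run_sample(optionflags=0):
    """Run SAMPLE with the given doctest option flags and return the results."""
    parser = doctest.DocTestParser()
    test = parser.get_doctest(SAMPLE, {}, "sample", None, 0)
    runner = doctest.DocTestRunner(verbose=False, optionflags=optionflags)
    # Discard the failure report; only the counts matter here.
    return runner.run(test, out=lambda text: None)

without_flag = run_sample()                # "..." is compared literally
with_flag = run_sample(doctest.ELLIPSIS)   # "..." matches any substring
```

Without the flag the literal "..." does not equal the real address, so the example fails; with doctest.ELLIPSIS it passes, which is consistent with the two results reported in the thread.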
It seems that the special sequences description in the Matching Characters section needs to be updated to incorporate information on Unicode and bytes. I don't think, however, that it's a good idea just to copy that information from Doc/library/re.rst. Maybe the section could be shortened and linked to that RE Syntax section? There aren't any deeper links available, unfortunately. |
I agree that the .rst should not have two copies and that any windows.chm specific fixup should be in the tool. Right now, right clicking gives a context menu with one item: Properties. Clicking that brings up a dialog box with a url that can be copied. Good enough for me at the moment but not terribly obvious. A possible separate issue. Unless A Kuchling says different, I would like to remove the version number. It implies to me that this doc is in pre-alpha condition and it is far beyond that. I see that the patch already does so. -:file:`Tools/scripts/redemo.py`, a demonstration program included with the should (currently) be Other than that, the patch looks good. Thanks. I am still thinking about Matching Characters. Once the patch is fixed with possible addition, a 2.7 version can easily be made by deleting the 3.x-specific deletions. |
I don't know whether it would be easy to strip down the py3k version to a 2.7 version. Seeing how it's just a basic introduction, I would think that a single statement re Unicode support might be sufficient. For an exhaustive description of special sequences, refer to the docs and carry on with ASCII strings. Attached patch fixes path issue. |
Since I think I know how to do it, easily, I will try to derive the 2.7 patch. In Matching Characters, I think should be expanded to "The following predefined special sequences are a subset of those available. The equivalent classes are for bytes patterns. For a complete list of sequences and expanded class definitions for Unicode string patterns, see the end of Regular Expression Syntax." Note to myself. /bytes/byte string/ for 2.7. While the changes all look innocuous to me with respect to building the docs, I am curious if you have tried to rebuild the HOWTO (if you have the tool chain, which I do not). |
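A minimal sketch of the str-versus-bytes difference the proposed sentence points at, assuming Python 3. The sample text is an assumption; \w is used as a representative special sequence.

```python
import re

# Hypothetical sample text containing one non-ASCII letter.
text = "caf\u00e9"

# In a str pattern, \w matches Unicode word characters by default.
str_matches = re.findall(r"\w", text)

# In a bytes pattern, \w is equivalent to [a-zA-Z0-9_], so the two
# UTF-8 bytes that encode the accented letter are not matched.
bytes_matches = re.findall(rb"\w", text.encode("utf-8"))

# re.ASCII gives a str pattern the same ASCII-only classes.
ascii_matches = re.findall(r"\w", text, re.ASCII)
```

This is why the simple class equivalences in the HOWTO hold for bytes patterns, while str patterns need the fuller definitions in the library reference.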
I would argue that this is a bug in the CHM viewers, not Python :) |
I did rebuild the docs with 'make html'. Build was clean every time. If you meant something else please let me know. |
I applied patch to 3.2, 3.1 in r87904, r87905. Thanks. I made a separate small patch for my suggested addition to Matching Characters. Could someone check that it is correct, given that re.rst contains the target directive (or whatever it is called): |
Looks good, builds without warnings. Note that you can use :ref:`re-syntax` and Sphinx will substitute the heading for you. The :role:`some special text <real-target>` form is used when you want to control the text of the link. (That thing is called a hyperlink target: http://docutils.sourceforge.net/docs/user/rst/quickref.html#hyperlink-targets) |
and r87918 for 2.7, with bytes -> byte string |
Correction: r87912 and r87913 for 3.x |