New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
test: Add test cases for utiltextmunge
#24
Conversation
Aside from removing the `FIXME`, changes to `utiltextmunge.py` are swapping the docstring to triple quotation marks. Signed-off-by: Heliotrop3 <43817374+Heliotrop3@users.noreply.github.com>
Signed-off-by: Heliotrop3 <43817374+Heliotrop3@users.noreply.github.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thor above, but you are phenomenal. You're so phenomenal, you're a coding phenomenon that deserves his own weather system and ZIP code.
Thank you sooo much for tackling this! I only belatedly just realized that the critical replace_str_substrs()
function we slaved over like the dwarves of Durin in the eldritch barrows of Moria, Gandalf rest their buried souls, was completely untested. I think I need a hanky now. 🤧
Abject apologies on the delay, too. Precipitous matters such as an unplayed video game backlog that grows ever more ponderously unbalanced (like a Jenga tower you've been meticulously erecting for hours until that one roommate topples drunkenly onto the boardgame table at 4:12AM) consumed the whole weekend, which I'll shamefully own up to.
# ....................{ TESTS }.................... | ||
def test_uppercase_char_first(): | ||
# Defer heavyweight imports. | ||
from pytest import raises |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
...Of course, there's also no harm in unconditionally isolating all imports to tests. It's mostly a matter of style and I have none.
Definitely thought it was a style thing. Is there any objections to pulling the pytest import into the module scope across all tests?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nope. Do what they wilt shall be the whole of the law under beartype
. So... that's fine. Bears rule, salmon drool! 🐻 🐟
text = "bears, beats, battlestar galactica\n" * 20 | ||
expected_result = "\n".join( | ||
'(line {:0>4d}) {}'.format(line_no, line_text) | ||
for line_no, line_text in enumerate(text.splitlines(), start=1)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hooyah! The classic x == x
unit test. Quality assurance professionals who get paid lavish sums of untaxed bitcoin tumbled through St. Petersburg probably have a term for this sort of thing. x == x
is a better synopsis than that term.
As you'd expect, x == x
is bad – but possibly not for the reasons you'd expect. x == x
is bad because it couples tests to the code being tested. In software architecture, coupling is usually bad because it increases the fragility of the two components being coupled. This is why semantic versioning and RESTful APIs are trending topics that I pretend to know things about.
In this case, x == x
couples this test to the number_lines()
function. That's bad, because every time we change the number_lines()
function (e.g., by changing the line numbering format to '[line {:0>4d}]'
) we also break the test.
The solution is to decouple the test from the code its testing. Before we do that, let's take a step back from the ugly precipice of QA despair and contemplate the existential point of this whole exercise. What exactly are we trying to achieve here beyond making us both even balder?
We're trying to test that the number_lines()
function vaguely behaves like we expect it to. We are not trying to test that this function exactly behaves as its implementation defines it to. We don't particularly care how this (or any other test-worthy) function exactly behaves – only that this function vaguely obeys expectations, including such lofty expectations befitting our collective genius as:
- This function should return a new string containing the same number of newlines as in the original string.
- Each line of this new string should be prefixed somewhere, somehow, in some format that we really don't care about with that line's line number.
With respect to strings, the easiest, sanest, and most robust way to test vague expectations is with regular expressions. What could be vaguer than an inscrutable symbolic metalanguage we all wish would just be replaced by Parser Expression Grammars (PEGs) already?
Here's how we might assert those expectations, if you were my twisted doppelganger Nega Leycec (also known as Dark Leycec):
from re import search
NEWLINES_COUNT = 20
CATCHY_JINGLE = 'bears, beats, battlestar galactica'
mad_fat_rhymes = f'{CATCHY_JINGLE}\n' * NEWLINES_COUNT
mad_fat_rhymes_numbered = number_lines(mad_fat_rhymes)
# Assert this function preserves newlines as is.
assert mad_fat_rhymes_numbered.count('\n') == NEWLINES_COUNT
# Assert this function prefixes lines by line numbers.
mad_fat_rhymes_numbered_lines = mad_fat_rhymes_numbered.splitlines()
for line_number in range(NEWLINES_COUNT):
assert search(
fr'\b{line_number + 1}\b.*\b{CATCHY_JINGLE}\b',
mad_fat_rhymes_numbered_lines[line_number]
) is not None
Everything above is thoroughly untested. Expect catastrophic explosions in the sky, but don't worry! You have iodine tablets for just such an emergency... don't you? Godzilla roars.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With respect to strings, the easiest, sanest, and most robust way to test vague expectations is with regular expressions. What could be vaguer than an inscrutable symbolic metalanguage we all wish would just be replaced by Parser Expression Grammars (PEGs) already?
Interestingly enough, PEP 617 is about transitioning the C-Python parser to PEG
Here's how we might assert those expectations, if you were my twisted doppelganger Nega Leycec (also known as Dark Leycec):
from re import search NEWLINES_COUNT = 20 CATCHY_JINGLE = 'bears, beats, battlestar galactica' mad_fat_rhymes = f'{CATCHY_JINGLE}\n' * NEWLINES_COUNT mad_fat_rhymes_numbered = number_lines(mad_fat_rhymes) # Assert this function preserves newlines as is. assert mad_fat_rhymes_numbered.count('\n') == NEWLINES_COUNT # Assert this function prefixes lines by line numbers. mad_fat_rhymes_numbered_lines = mad_fat_rhymes_numbered.splitlines() for line_number in range(NEWLINES_COUNT): assert search( fr'\b{line_number + 1}\b.*\b{CATCHY_JINGLE}\b', mad_fat_rhymes_numbered_lines[line_number] ) is not None
Everything above is thoroughly untested. Expect catastrophic explosions in the sky, but don't worry! You have iodine tablets for just such an emergency... don't you? Godzilla roars.
I'll play around with this within the next couple of days - your insight was/is very much appreciated!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interestingly enough, PEP 617 is about transitioning the C-Python parser to PEG
So much excitement. I think they've already successfully transitioned CPython from LL(1)
to PEG for the most recent 3.9.0 release. I also think Guido himself made a surprise comeback for that one. Go, go, CPython! Welcome to the 21st century, fellows.
I'll play around with this within the next couple of days - your insight was/is very much appreciated!
No worries whatsoever. Your dedication to the cause of code justice continues to amaze, Tyler.
I humbly appreciate whatever scraps of after-work time you can occasionally throw our way. Meanwhile, I'll be banging on our newly reported non-compliance with PEP 561. Woopsie. 💢
Derp. I feel just awful for letting this linger. This is already more than sufficient as is. You've done a tremendous job here despite a high-paying day job that squanders your vast creative talents and predilection for explosive exponential growth. I meant to merge this as is tonight and patch in my assorted sordid commentary as a subsequent commit, because we're now in the danger zone of accidentally breaking this PR with unrelated changes, which would upset my fragile emotional balance and make me feel even awfuller. Instead, I finalized PEP 561 compliance for Haren. "So it goes," as a gutsy guy named Vonnegut said. So, I promise you this: I will merge this on Monday evening or die trying. I hope for the former. |
Specifically, this commit: - Moves the pytest import to the module scope - Separates multiple assertion checks occurring under singular `with raises` - Decouples the test for number lines from its code
I think the last commit should make these test cases sufficient.
I'm glad you didn't! The work on the test cases was trivial, I just needed to find the time to sit down and be productive with respect to Beartype.
Oh no! 😱
Here here. I don't like taking a week to resolve PR requests - especially when they are arguably trivial - however weeks can get crazy I would draw your attention to how I handled the decoupling the test for line numbers. Specifically, it's not testing for the actual line number - which could be in a variety of formats - rather it's testing the base string isn't at the start of the line but is at the end of the line. In other words, testing to see if there is SOMETHING before but nothing after the base string. |
Bravissimo! Fortissimo! And other untranslatable Italian exclamations ending in -issimo! Janky English translation: It's in. Thanks everything and then some for this immaculate reconception of your original changeset. I mean, that one was great too – but this one is for the revisionist history books that make you question everything you previously thought to be true. |
This release brings explicit support for ``None``, subscripted generics, and PEP 561 compliance. This release resolves **10 issues** and merges **8 pull requests.** Specific changes include: ## Compatibility Improved * **[PEP 484-compliant `None` singleton](https://www.python.org/dev/peps/pep-0484/#id16).** As a return type hint, `None` is typically used to annotate callables containing *no* explicit `return` statement and thus implicitly returning `None`. `@beartype` now implicitly reduces `None` at all nesting levels of type hints to that singleton's type per [PEP 484](https://www.python.org/dev/peps/pep-0484/#id16). * **[PEP 561 compliance](https://www.python.org/dev/peps/pep-0561).** `beartype` now fully conforms to [PEP 561](https://www.python.org/dev/peps/pep-0561), resolving issue #25 kindly submitted by best macOS package manager ever @harens. In useful terms, this means that: * **`beartype` now complies with [mypy](http://mypy-lang.org),** Python's popular third-party static type checker. If your package had no [mypy](http://mypy-lang.org) errors or warnings *before* adding `beartype` as a mandatory dependency, your package will still have no [mypy](http://mypy-lang.org) errors or warnings *after* adding `beartype` as a mandatory dependency. * **`beartype` preserves [PEP 561](https://www.python.org/dev/peps/pep-0561) compliance.** If your package was [PEP 561-compliant](https://www.python.org/dev/peps/pep-0561) *before* adding `beartype` as a mandatory dependency, your package will still be [PEP 561-compliant](https://www.python.org/dev/peps/pep-0561) *after* adding `beartype` as a mandatory dependency. Of course, if your package currently is *not* [PEP 561-compliant](https://www.python.org/dev/peps/pep-0561), `beartype` can't help you there. We'd love to, really. [It's us. Not you.](https://www.youtube.com/watch?v=2uAj4wBIU-8) * **The `beartype` codebase is now mostly statically rather than dynamically typed,** much to our public shame. Thus begins the eternal struggle to preserve duck typing in a world that hates bugs. * **The `beartype` package now contains a top-level `py.typed` file,** publicly declaring this package to be [PEP 561-compliant](https://www.python.org/dev/peps/pep-0561). * **Subscripted generics** (i.e., [user-defined generics](https://www.python.org/dev/peps/pep-0484) subscripted by one or more type hints), resolving issue #29 kindly submitted by indefatigable test engineer and anthropomorphic Siberian Husky @eehusky. Since it's best not to ask too many questions about subscripted generics, we instead refer you to [the issue report that nearly broke a Canadian man](#29). ## Compatibility Broken * **None.** This release preserves backward compatibility with the prior stable release. ## Packaging Improved * **New optional installation-time extras,** enabling both `beartype` developers and automation tooling to trivially install recommended (but technically optional) dependencies. These include: * `pip install -e .[dev]`, installing `beartype` in editable mode as well as all dependencies required to both locally test `beartype` *and* build documentation for `beartype` from the command line. * `pip install beartype[doc-rtd]`, installing `beartype` as well as all dependencies required to build documentation from the external third-party Read The Docs (RTD) host. * **Homebrew- and MacPorts-based macOS installation.** Our front-facing `README.rst` file now documents `beartype` installation with both Homebrew and MacPorts on macOS, entirely courtesy the third-party Homebrew tap and Portfile maintained by build automation specialist and mild-mannered student @harens. Thanks a London pound, Haren! ## Features Added * Public `beartype.cave` types and type tuples, including: * `beartype.cave.CallableCTypes`, a tuple of all **C-based callable types** (i.e., types whose instances are callable objects implemented in low-level C rather than high-level Python). * `beartype.cave.HintGenericSubscriptedType`, the C-based type of all subscripted generics if the active Python interpreter targets Python >= 3.9 *or* `beartype.cave.UnavailableType` otherwise. This type was previously named `beartype.cave.HintPep585Type` before we belatedly realized this type broadly applies to numerous categories of PEP-compliant type hints, including PEP 484-compliant subscripted generics. ## Features Optimized * **`O(n)` → `O(1)` exception handling.** `@beartype` now internally raises human-readable exceptions in the event of type-checking violations with an `O(1)` rather than `O(n)` algorithm, significantly reducing time complexity for the edge case of invalid large sequences either passed to or returned from `@beartype`-decorated callables. For forward compatibility with a future version of `beartype` enabling users to explicitly switch between constant- and linear-time checking, the prior `O(n)` exception-handling algorithm has been preserved in a presently disabled form. * **`O(n)` → `O(1)` callable introspection during internal memoization.** `@beartype` now avoids calling the inefficient stdlib `inspect` module from our private `@beartype._util.cache.utilcachecall.callable_cached` decorator memoizing functions throughout the `beartype` codebase. The prior `O(n)` logic performed by that call has been replaced by equivalent `O(1) logic performed by a call to our newly defined `beartype._util.func.utilfuncarg` submodule, optimizing function argument introspection without the unnecessary overhead of `inspect`. * **Code object caching.** `@beartype` now temporarily caches the code object for the currently decorated callable to support efficient introspection of that callable throughout the decoration process. Relatedly, this also has the beneficial side effect of explicitly raising human-readable exceptions from the `@beartype` decorator on attempting to decorate C-based callables, which `@beartype` now explicitly does *not* support, because C-based callables have *no* code objects and thus *no* efficient means of introspection. Fortunately, sane code only ever applies `@beartype` to pure-Python callables anyway. ...right, sane code? *Right!?!?* ## Features Deprecated * The ambiguously named `beartype.cave.HintPep585Type` type, to be officially removed in `beartype` 0.1.0. ## Issues Resolved * **Unsafe `str.replace()` calls.** `@beartype` now wraps all unsafe internal calls to the low-level `str.replace()` method with calls to the considerably safer high-level `beartype._util.text.utiltextmunge.replace_str_substrs()` function, guaranteeing that memoized placeholder strings are properly unmemoized during decoration-time code generation. Thanks to temperate perennial flowering plant @Heliotrop3 for this astute observation and resolution to long-standing background issue #11. * **`KeyPool` release validation.** `@beartype` now validates that objects passed to the `release()` method of the private `beartype._util.cache.pool.utilcachepool.KeyPool` class have been previously returned from the `acquire()` method of that class. Thanks to @Heliotrop3, the formidable bug assassin, for their unswerving dedication to the cause of justice with this resolution to issue #13. * **Least Recently Used (LRU) cache.** ``@beartype`` now internally provides a highly microoptimized Least Recently Used (LRU) cache for subsequent use throughout the codebase, particularly with respect to caching iterators over dictionaries, sets, and other non-sequence containers. This resolves issue #17, again graciously submitted by open-source bug mercenary @Heliotrop3. * **Callable labelling.** `@beartype` now internally provides a private `beartype._util.func.utilfuncorigin.get_callable_origin_label` getter synthesizing human-readable labels for the files declaring arbitrary callables, a contribution by master code-mangler @Heliotrop3 resolving issue #18. Thanks again for all the insidious improvements, Tyler! You are the master of everyone's code domain. * **Release automation.** Our release workflow has now been migrated from the unmaintained `create-release` GitHub Action to @ncipollo's actively maintained `release-action`, resolving issue #22 kindly submitted by human-AI-hybrid @Heliotrop3. ## Tests Improved * **Microsoft Windows and macOS exercised under CI**, resolving issue #21. Since doing so increases our consumption of Microsoft resources that we care deeply about, care has been taken to reduce the cost of our CI workflow. This includes: * Replacing our prior use of the external third-party `tox-gh-actions` GitHub Action streamlining `tox` usage with our own ad-hoc build matrix that appears to be simpler and faster despite offering basically identical functionality. * Removing our prior installation of optional dependencies, especially including NumPy. *Yeah.* Let's not do that anymore. Thanks to dedicated issue reporter @Heliotrop3 for his unsustainable deep-code trawling of the `beartype` codebase for unresolved `FIXME:` comments. * **PyPy 3.7 exercised under CI.** Our `tox` and GitHub Actions-based continuous integration (CI) configurations now both correctly exercise themselves against both PyPy 3.6 and 3.7, resolving the upstream actions/setup-python#171 issue for `beartype`. * **CI thresholded.** Our CI configuration now caps tests to a sane maximum duration of time to avoid a repeat of the pull request we do *not* talk about here. Okay, it was #23. I blame only myself. * **New functional tests,** including: * A **CPython-specific [mypy](http://mypy-lang.org) functional test,** optionally exercising our conformance to static type-checking standards when the third-party `mypy` package is installed under CPython. This test is sufficiently critical that we perform it under our CI workflow, guaranteeing test failures on any push or PR violating mypy expectations. * A **`README.rst` functional test,** optionally exercising the syntactic validity of our front-facing `README.rst` documentation when the third-party `docutils` package (i.e., the reference reST parser) is installed. This test is sufficiently expensive that we currently avoid performing it under our CI workflow. * **New unit tests,** including: * **Text munging unit tests,** exercising the private `beartype._util.text.utiltextmunge` submodule with lavish attention to regex-based fuzzy testing of the critical `number_lines()` function. Humble `git log` shout outs go out to @Heliotrop3 for this mythic changeset that warps the fragile fabric of the GitHub cloud to its own pellucid yet paradoxically impenetrable intentions, resolving issue #24. ## Documentation Revised * **Sphinx skeleton.** The `beartype` repository now defines a largely unpopulated skeleton for Sphinx-generated documentation formatted as reST and typically converted to HTML to be hosted at [Read The Docs (RTD)](https://beartype.readthedocs.io/en/latest/?badge=latest), generously contributed by @felix-hilden, Finnish computer vision expert and our first contributor! This skeleton enables: * An HTTP 404 redirect page on missing page hits. * The standard retinue of builtin Sphinx extensions (e.g., `autodoc`, `viewcode`). * MathJax configured for remote implicit downloads. * Napolean configured for NumPy-formatted docstrings. * An optional dependency on `sphinx_rtd_theme`, a third-party Sphinx extension providing RTD's official Sphinx HTML. * A **badge** (i.e., shield, status icon) on our front-facing `README.rst` documentation signifying the success of the most recent attempt to build and host this skeleton at RTD. * A **top-level `sphinx` script**, building Sphinx-based package documentation when manually run from the command line by interactive developers. * A **[beautiful banner graphic that makes grown adults openly weep](https://github.com/beartype/beartype-assets/tree/main/banner),** featuring the official `beartype` mascot "Mr. Nectar Palm" – again courtesy @felix-hilden, because sleep is for the weak and Felix has never known the word. * A **new prefacing "tl;dr" section** that's redundant with numerous other sections, but we're surprisingly okay with that. * A **new "Usage" section** that accidentally became a new tutorial and division of the existing "Overview" section into various subsections highlighting tradeoffs between `beartype` and existing type checkers, resolving clarity concerns raised by @kevinjacobs-progenity at issue #7. Thanks for the invaluable commentary, Kevin! * A **new "Frequently Asked Questions (FAQ)" section,** inspired by the admission from several prospective users that they have utterly no idea what @leycec is talking about. Fair play, users. You win this round. * A **new "Workflow" subsection of the "Developer" section,** listing developer-specific instructions for forking, cloning, installing, modifying, and submitting PRs for `beartype` in a live manner. * **Properly rendered code blocks,** kindly reported by humane human extraordinaire @harens in discussion topic #28. Thanks and may the little-seen English sun eternally shine upon ye, Haren! ## API Changed * Added: * `beartype.cave.CallableCTypes`. * `beartype.cave.HintGenericSubscriptedType`. * Deprecated: * `beartype.cave.HintPep585Type`. (*Exogenous exhaustion!*)
This PR looks to resolve the issue of not having unit tests for
utiltextmunge.py
. I ended up adding basic unit tests for all functions inutiltextmunge.py
.I'd specifically like to draw your attention to this section. As per the comment, I am unsure whether this is a worth-while test. Given it's a 1:1 copy of the function's return statement it's given to fail should the function ever change. That being said, it's unsurprisingly also the most compact way of producing the expected result...
Slipped into this PR is also the resolution of a smaller refactoring item for
utiltextrepr.py