PoC of implementation to show large output by wrapping up long lines #42

AntonioL · 2022-02-06T07:12:08Z

Created this PR just to start a conversation. I will create an issue and bring this PR as a proof of concept.

for more information, see https://pre-commit.ci

max-sixty · 2022-02-06T08:47:25Z

Thanks for the contribution!

Am I right that this makes the max line length 80, wrapping where necessary? The text of the issue is cut off for me.

Would you have a case where this makes the output more readable?

Are the line continuation characters handled well? I've had issues with those in the past. IIRC making the docstring into a raw string helps, but we don't do this automatically atm.

AntonioL · 2022-02-06T18:45:12Z

I was intending to have this conversation here (#43) so we can capture the requirements .

Am I right that this makes the max line length 80, wrapping where necessary? The text of the issue is cut off for me.

Title changed, hope that is more clear.

Would you have a case where this makes the output more readable?

Regarding your request of an example where this improves clarity: we have a bunch of scrapers taking data from some feeds.
I want to colocate an example of this feed output with the same code that processes it.

Some of those feeds are single-line json. As of today the library would shorten the output with ellipsis .... I do not want that.

Are the line continuation characters handled well? I've had issues with those in the past. IIRC making the docstring into a raw string helps, but we don't do this automatically atm.

At the moment I did not give much thought to the above issue, this PR is a quite hacky one I created just to start this conversation.

max-sixty · 2022-02-06T19:57:56Z

I'm open to the idea! We'd probably need to make it configurable.

Another approach would be for the user to format the output with a modified print call.

It's worth testing the line-continuations and seeing whether they make it easier for you — I added something in the readme around how they're handled by doctests — let me know your thoughts.

AntonioL · 2022-02-07T00:47:58Z

On the other hand I was having trouble running pytest on the cloned repository. I used a virtualenv-inspired workflow and I found out that the pytest setup hooks to install the plugin were not installed in my case and this makes the tests fail.
Can you expand on how I should operate locally? I never used poetry.

Regarding the \ character, I found that to work with my code changes. I will contribute a test.

I propose a --wrap-long-lines option, this will make the plugin wrap long lines up to MAX_COUNT charaters ( I used 80 in my case), and the output will not be shortened.

max-sixty · 2022-02-07T01:32:34Z

Yes, if you install poetry and then run poetry install, the whole env should be set up — it's nice for things like this. You'll need to be in a poetry shell; the poetry docs are fairly good.

Yes, I think a --accept-wrap would be good (or some variant of that) to make it clear that it's impacting this library.

Thanks for checking out the \ characters, that sounds good. I'll have a double-check of your test case too locally.

Cheers @AntonioL

AntonioL · 2022-02-13T03:56:45Z

I am investigating this issue right now. Turns out it is a bit more complicated than it looks as the amount of edge cases to handle is a lot. pytest should do more when it comes to reporting the gotten output.

Consider this case:

def scraper(n : int):
    """
    Simple scraper testing.

    >>> scraper(1)
    """

    # Payload coming from https://httpbin.org/json
    return """{
  "slideshow": {
    "author": "Yours Truly", 
    "date": "date of publication", 
    "slides": [
      {
        "title": "Wake up to WonderWidgets!", 
        "type": "all"
      }, 
      {
        "items": [
          "Why <em>WonderWidgets</em> are great", 
          "Who <em>buys</em> WonderWidgets"
        ], 
        "title": "Overview", 
        "type": "all"
      }
    ], 
    "title": "Sample Slide Show"
  }
}"""

The above test case fails. pytest-accept rewrites it as:

def scraper(n : int):
    """
    Simple scraper testing.

    >>> scraper(1)
    '{\n  "slideshow": {\n    "author": "Yours Truly", \n    "date": "date of publication", \n    "slides": [\n      {\n        "title": "Wake up to WonderWidgets!", \n        "type": "all"\n      }, \n      {\n        "items": [\n          "Why <em>WonderWidgets</em> are great", \n          "Who <em>buys</em> WonderWidgets"\n        ], \n        "title": "Overview", \n        "type": "all"\n      }\n    ], \n    "title": "Sample Slide Show"\n  }\n}'
    """

    # Payload coming from https://httpbin.org/json
    return """{
  "slideshow": {
    "author": "Yours Truly", 
    "date": "date of publication", 
    "slides": [
      {
        "title": "Wake up to WonderWidgets!", 
        "type": "all"
      }, 
      {
        "items": [
          "Why <em>WonderWidgets</em> are great", 
          "Who <em>buys</em> WonderWidgets"
        ], 
        "title": "Overview", 
        "type": "all"
      }
    ], 
    "title": "Sample Slide Show"
  }
}"""

This rewriting is not correct as if I try to run doctest-modules I will get

/usr/lib/python3.8/doctest.py:939: in find
    self._find(tests, obj, name, module, source_lines, globs, {})
../../.cache/pypoetry/virtualenvs/pytest-accept-OGiL4A3W-py3.8/lib/python3.8/site-packages/_pytest/doctest.py:522: in _find
    doctest.DocTestFinder._find(  # type: ignore
/usr/lib/python3.8/doctest.py:1001: in _find
    self._find(tests, val, valname, module, source_lines,
../../.cache/pypoetry/virtualenvs/pytest-accept-OGiL4A3W-py3.8/lib/python3.8/site-packages/_pytest/doctest.py:522: in _find
    doctest.DocTestFinder._find(  # type: ignore
/usr/lib/python3.8/doctest.py:989: in _find
    test = self._get_test(obj, name, module, globs, source_lines)
/usr/lib/python3.8/doctest.py:1073: in _get_test
    return self._parser.get_doctest(docstring, globs, name,
/usr/lib/python3.8/doctest.py:675: in get_doctest
    return DocTest(self.get_examples(string, name), globs,
/usr/lib/python3.8/doctest.py:689: in get_examples
    return [x for x in self.parse(string, name)
/usr/lib/python3.8/doctest.py:651: in parse
    self._parse_example(m, name, lineno)
/usr/lib/python3.8/doctest.py:720: in _parse_example
    self._check_prefix(want_lines, ' '*indent, name,
/usr/lib/python3.8/doctest.py:805: in _check_prefix
    raise ValueError('line %r of the docstring for %s has '
E   ValueError: line 6 of the docstring for scraper_test_example.scraper has inconsistent leading whitespace: '  "slideshow": {'
====================================================================== short test summary info =======================================================================
ERROR examples/scraper_test_example.py - ValueError: line 6 of the docstring for scraper_test_example.scraper has inconsistent leading whitespace: '  "slideshow": {'
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! Interrupted: 1 error during collection !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
========================================================================== 1 error in 0.34s ==========================================================================

The issue is because of the \n character pytest which inside the pytest-accept rewriting. It should be \\n.
Indeed if I replace \n with \\n the test passes:

def scraper(n : int):
    """
    Simple scraper testing.

    >>> scraper(1)
    '{\\n  "slideshow": {\\n    "author": "Yours Truly", \\n    "date": "date of publication", \\n    "slides": [\\n      {\\n        "title": "Wake up to WonderWidgets!", \\n        "type": "all"\\n      }, \\n      {\\n        "items": [\\n          "Why <em>WonderWidgets</em> are great", \\n          "Who <em>buys</em> WonderWidgets"\\n        ], \\n        "title": "Overview", \\n        "type": "all"\\n      }\\n    ], \\n    "title": "Sample Slide Show"\\n  }\\n}'
    """

    # Payload coming from https://httpbin.org/json
    return """{
  "slideshow": {
    "author": "Yours Truly", 
    "date": "date of publication", 
    "slides": [
      {
        "title": "Wake up to WonderWidgets!", 
        "type": "all"
      }, 
      {
        "items": [
          "Why <em>WonderWidgets</em> are great", 
          "Who <em>buys</em> WonderWidgets"
        ], 
        "title": "Overview", 
        "type": "all"
      }
    ], 
    "title": "Sample Slide Show"
  }
}"""

Handling those corner cases adds some complexity. We should do this escaping only in case of strings. This would require some kind of parsing of the pytest result. I am not sure if this complexity is worth to be added in pytest-accept. You mention this corner case in the README, your suggested fix is to use rawstrings for the docstring.

Why is this related to my idea of newlines patch? Because my approach relied on adding a final "" at the end of each line, this will need the handling of corner cases like the above.

max-sixty · 2022-02-13T04:23:08Z

Great, thanks for looking into this @AntonioL.

To respond quickly:

Does this work if you make the docstring a "raw" docstring, with r:

def scraper(n : int):
    r"""
    Simple scraper testing.

For your specific case, have you considered print(scraper(1))? That should make it show in multiple lines (though still not wrapped).

I'd be up for making this better in pytest-accept; I'm asking because I also want to help you in your case.

AntonioL · 2022-02-13T04:24:19Z

Great, thanks for looking into this @AntonioL.

To respond quickly:
* Does this work if you make the docstring a "raw" docstring, with `r`:
def scraper(n : int):
    r"""
    Simple scraper testing.
* For your specific case, have you considered `print(scraper(1))`? That should make it show in multiple lines (though still not wrapped).
I'd be up for making this better in pytest-accept; I'm asking because I also want to help you in your case.

Yes it works if that is raw docstring.

def scraper(n : int):
    r"""
    Simple scraper testing.

    >>> scraper(1)
    '{\n  "slideshow": {\n    "author": "Yours Truly", \n    "date": "date of publication", \n    "slides": [\n      {\n        "title": "Wake up to WonderWidgets!", \n        "type": "all"\n      }, \n      {\n        "items": [\n          "Why <em>WonderWidgets</em> are great", \n          "Who <em>buys</em> WonderWidgets"\n        ], \n        "title": "Overview", \n        "type": "all"\n      }\n    ], \n    "title": "Sample Slide Show"\n  }\n}'
    """

    # Payload coming from https://httpbin.org/json
    return """{
  "slideshow": {
    "author": "Yours Truly", 
    "date": "date of publication", 
    "slides": [
      {
        "title": "Wake up to WonderWidgets!", 
        "type": "all"
      }, 
      {
        "items": [
          "Why <em>WonderWidgets</em> are great", 
          "Who <em>buys</em> WonderWidgets"
        ], 
        "title": "Overview", 
        "type": "all"
      }
    ], 
    "title": "Sample Slide Show"
  }
}"""

We would need another property like got in pytest's DocTestFailure object. That should give the proper string representation to use in the tests. I am not sure pytest-accept should do more. I am not clear what is the impact of rewriting it to raw docstrings in those cases.

For the time being I will be best served by using the raw string modified whenever needed.

max-sixty · 2022-02-13T04:30:20Z

We would need another property like got in pytest's DocTestFailure object. That should give the proper string representation to use in the tests. I am not sure pytest-accept should do more. I am not clear what is the impact of rewriting it to raw docstrings in those cases.

OK, yes, I think you're right. Thank you very much @AntonioL .

AntonioL · 2022-02-13T04:32:46Z

Thanks for the library and apologies for the noise in creating this PR, should have done a bit more of research upfront!

max-sixty · 2022-02-13T05:17:42Z

Not at all, thank you @AntonioL . Please continue with any feedback you have on the library!

AntonioL and others added 2 commits February 6, 2022 02:10

PoC of implementation allowing long lines.

53cf06d

[pre-commit.ci] auto fixes from pre-commit.com hooks

fc31001

for more information, see https://pre-commit.ci

AntonioL mentioned this pull request Feb 6, 2022

Long lines support #43

Closed

AntonioL changed the title ~~PoC of implementation allowing long lines output~~ PoC of implementation to show large output as a whole, by wrapping up new lines Feb 6, 2022

AntonioL changed the title ~~PoC of implementation to show large output as a whole, by wrapping up new lines~~ PoC of implementation to show large output by wrapping up long lines Feb 6, 2022

AntonioL closed this Feb 13, 2022

AntonioL deleted the long_lines branch February 13, 2022 04:33

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

PoC of implementation to show large output by wrapping up long lines #42

PoC of implementation to show large output by wrapping up long lines #42

AntonioL commented Feb 6, 2022

max-sixty commented Feb 6, 2022

AntonioL commented Feb 6, 2022 •

edited

Loading

max-sixty commented Feb 6, 2022

AntonioL commented Feb 7, 2022 •

edited

Loading

max-sixty commented Feb 7, 2022

AntonioL commented Feb 13, 2022 •

edited

Loading

max-sixty commented Feb 13, 2022

AntonioL commented Feb 13, 2022 •

edited

Loading

max-sixty commented Feb 13, 2022

AntonioL commented Feb 13, 2022

max-sixty commented Feb 13, 2022

PoC of implementation to show large output by wrapping up long lines #42

PoC of implementation to show large output by wrapping up long lines #42

Conversation

AntonioL commented Feb 6, 2022

max-sixty commented Feb 6, 2022

AntonioL commented Feb 6, 2022 • edited Loading

max-sixty commented Feb 6, 2022

AntonioL commented Feb 7, 2022 • edited Loading

max-sixty commented Feb 7, 2022

AntonioL commented Feb 13, 2022 • edited Loading

max-sixty commented Feb 13, 2022

AntonioL commented Feb 13, 2022 • edited Loading

max-sixty commented Feb 13, 2022

AntonioL commented Feb 13, 2022

max-sixty commented Feb 13, 2022

AntonioL commented Feb 6, 2022 •

edited

Loading

AntonioL commented Feb 7, 2022 •

edited

Loading

AntonioL commented Feb 13, 2022 •

edited

Loading

AntonioL commented Feb 13, 2022 •

edited

Loading