Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PoC of implementation to show large output by wrapping up long lines #42

Closed
wants to merge 2 commits into from

Conversation

AntonioL
Copy link

@AntonioL AntonioL commented Feb 6, 2022

Created this PR just to start a conversation. I will create an issue and bring this PR as a proof of concept.

@AntonioL AntonioL mentioned this pull request Feb 6, 2022
@max-sixty
Copy link
Owner

Thanks for the contribution!

Am I right that this makes the max line length 80, wrapping where necessary? The text of the issue is cut off for me.

Would you have a case where this makes the output more readable?

Are the line continuation characters handled well? I've had issues with those in the past. IIRC making the docstring into a raw string helps, but we don't do this automatically atm.

@AntonioL AntonioL changed the title PoC of implementation allowing long lines output PoC of implementation to show large output as a whole, by wrapping up new lines Feb 6, 2022
@AntonioL AntonioL changed the title PoC of implementation to show large output as a whole, by wrapping up new lines PoC of implementation to show large output by wrapping up long lines Feb 6, 2022
@AntonioL
Copy link
Author

AntonioL commented Feb 6, 2022

I was intending to have this conversation here (#43) so we can capture the requirements .

Am I right that this makes the max line length 80, wrapping where necessary? The text of the issue is cut off for me.

Title changed, hope that is more clear.

Would you have a case where this makes the output more readable?

Regarding your request of an example where this improves clarity: we have a bunch of scrapers taking data from some feeds.
I want to colocate an example of this feed output with the same code that processes it.

Some of those feeds are single-line json. As of today the library would shorten the output with ellipsis .... I do not want that.

Are the line continuation characters handled well? I've had issues with those in the past. IIRC making the docstring into a raw string helps, but we don't do this automatically atm.

At the moment I did not give much thought to the above issue, this PR is a quite hacky one I created just to start this conversation.

@max-sixty
Copy link
Owner

I'm open to the idea! We'd probably need to make it configurable.

Another approach would be for the user to format the output with a modified print call.

It's worth testing the line-continuations and seeing whether they make it easier for you — I added something in the readme around how they're handled by doctests — let me know your thoughts.

@AntonioL
Copy link
Author

AntonioL commented Feb 7, 2022

On the other hand I was having trouble running pytest on the cloned repository. I used a virtualenv-inspired workflow and I found out that the pytest setup hooks to install the plugin were not installed in my case and this makes the tests fail.
Can you expand on how I should operate locally? I never used poetry.

Regarding the \ character, I found that to work with my code changes. I will contribute a test.

I propose a --wrap-long-lines option, this will make the plugin wrap long lines up to MAX_COUNT charaters ( I used 80 in my case), and the output will not be shortened.

@max-sixty
Copy link
Owner

Yes, if you install poetry and then run poetry install, the whole env should be set up — it's nice for things like this. You'll need to be in a poetry shell; the poetry docs are fairly good.

Yes, I think a --accept-wrap would be good (or some variant of that) to make it clear that it's impacting this library.

Thanks for checking out the \ characters, that sounds good. I'll have a double-check of your test case too locally.

Cheers @AntonioL

@AntonioL
Copy link
Author

AntonioL commented Feb 13, 2022

I am investigating this issue right now. Turns out it is a bit more complicated than it looks as the amount of edge cases to handle is a lot. pytest should do more when it comes to reporting the gotten output.

Consider this case:

def scraper(n : int):
    """
    Simple scraper testing.

    >>> scraper(1)
    """

    # Payload coming from https://httpbin.org/json
    return """{
  "slideshow": {
    "author": "Yours Truly", 
    "date": "date of publication", 
    "slides": [
      {
        "title": "Wake up to WonderWidgets!", 
        "type": "all"
      }, 
      {
        "items": [
          "Why <em>WonderWidgets</em> are great", 
          "Who <em>buys</em> WonderWidgets"
        ], 
        "title": "Overview", 
        "type": "all"
      }
    ], 
    "title": "Sample Slide Show"
  }
}"""

The above test case fails. pytest-accept rewrites it as:

def scraper(n : int):
    """
    Simple scraper testing.

    >>> scraper(1)
    '{\n  "slideshow": {\n    "author": "Yours Truly", \n    "date": "date of publication", \n    "slides": [\n      {\n        "title": "Wake up to WonderWidgets!", \n        "type": "all"\n      }, \n      {\n        "items": [\n          "Why <em>WonderWidgets</em> are great", \n          "Who <em>buys</em> WonderWidgets"\n        ], \n        "title": "Overview", \n        "type": "all"\n      }\n    ], \n    "title": "Sample Slide Show"\n  }\n}'
    """

    # Payload coming from https://httpbin.org/json
    return """{
  "slideshow": {
    "author": "Yours Truly", 
    "date": "date of publication", 
    "slides": [
      {
        "title": "Wake up to WonderWidgets!", 
        "type": "all"
      }, 
      {
        "items": [
          "Why <em>WonderWidgets</em> are great", 
          "Who <em>buys</em> WonderWidgets"
        ], 
        "title": "Overview", 
        "type": "all"
      }
    ], 
    "title": "Sample Slide Show"
  }
}"""

This rewriting is not correct as if I try to run doctest-modules I will get

/usr/lib/python3.8/doctest.py:939: in find
    self._find(tests, obj, name, module, source_lines, globs, {})
../../.cache/pypoetry/virtualenvs/pytest-accept-OGiL4A3W-py3.8/lib/python3.8/site-packages/_pytest/doctest.py:522: in _find
    doctest.DocTestFinder._find(  # type: ignore
/usr/lib/python3.8/doctest.py:1001: in _find
    self._find(tests, val, valname, module, source_lines,
../../.cache/pypoetry/virtualenvs/pytest-accept-OGiL4A3W-py3.8/lib/python3.8/site-packages/_pytest/doctest.py:522: in _find
    doctest.DocTestFinder._find(  # type: ignore
/usr/lib/python3.8/doctest.py:989: in _find
    test = self._get_test(obj, name, module, globs, source_lines)
/usr/lib/python3.8/doctest.py:1073: in _get_test
    return self._parser.get_doctest(docstring, globs, name,
/usr/lib/python3.8/doctest.py:675: in get_doctest
    return DocTest(self.get_examples(string, name), globs,
/usr/lib/python3.8/doctest.py:689: in get_examples
    return [x for x in self.parse(string, name)
/usr/lib/python3.8/doctest.py:651: in parse
    self._parse_example(m, name, lineno)
/usr/lib/python3.8/doctest.py:720: in _parse_example
    self._check_prefix(want_lines, ' '*indent, name,
/usr/lib/python3.8/doctest.py:805: in _check_prefix
    raise ValueError('line %r of the docstring for %s has '
E   ValueError: line 6 of the docstring for scraper_test_example.scraper has inconsistent leading whitespace: '  "slideshow": {'
====================================================================== short test summary info =======================================================================
ERROR examples/scraper_test_example.py - ValueError: line 6 of the docstring for scraper_test_example.scraper has inconsistent leading whitespace: '  "slideshow": {'
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! Interrupted: 1 error during collection !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
========================================================================== 1 error in 0.34s ==========================================================================

The issue is because of the \n character pytest which inside the pytest-accept rewriting. It should be \\n.
Indeed if I replace \n with \\n the test passes:

def scraper(n : int):
    """
    Simple scraper testing.

    >>> scraper(1)
    '{\\n  "slideshow": {\\n    "author": "Yours Truly", \\n    "date": "date of publication", \\n    "slides": [\\n      {\\n        "title": "Wake up to WonderWidgets!", \\n        "type": "all"\\n      }, \\n      {\\n        "items": [\\n          "Why <em>WonderWidgets</em> are great", \\n          "Who <em>buys</em> WonderWidgets"\\n        ], \\n        "title": "Overview", \\n        "type": "all"\\n      }\\n    ], \\n    "title": "Sample Slide Show"\\n  }\\n}'
    """

    # Payload coming from https://httpbin.org/json
    return """{
  "slideshow": {
    "author": "Yours Truly", 
    "date": "date of publication", 
    "slides": [
      {
        "title": "Wake up to WonderWidgets!", 
        "type": "all"
      }, 
      {
        "items": [
          "Why <em>WonderWidgets</em> are great", 
          "Who <em>buys</em> WonderWidgets"
        ], 
        "title": "Overview", 
        "type": "all"
      }
    ], 
    "title": "Sample Slide Show"
  }
}"""

Handling those corner cases adds some complexity. We should do this escaping only in case of strings. This would require some kind of parsing of the pytest result. I am not sure if this complexity is worth to be added in pytest-accept. You mention this corner case in the README, your suggested fix is to use rawstrings for the docstring.

Why is this related to my idea of newlines patch? Because my approach relied on adding a final "" at the end of each line, this will need the handling of corner cases like the above.

@max-sixty
Copy link
Owner

Great, thanks for looking into this @AntonioL.

To respond quickly:

  • Does this work if you make the docstring a "raw" docstring, with r:
def scraper(n : int):
    r"""
    Simple scraper testing.
  • For your specific case, have you considered print(scraper(1))? That should make it show in multiple lines (though still not wrapped).

I'd be up for making this better in pytest-accept; I'm asking because I also want to help you in your case.

@AntonioL
Copy link
Author

AntonioL commented Feb 13, 2022

Great, thanks for looking into this @AntonioL.

To respond quickly:

* Does this work if you make the docstring a "raw" docstring, with `r`:
def scraper(n : int):
    r"""
    Simple scraper testing.
* For your specific case, have you considered `print(scraper(1))`? That should make it show in multiple lines (though still not wrapped).

I'd be up for making this better in pytest-accept; I'm asking because I also want to help you in your case.

Yes it works if that is raw docstring.

def scraper(n : int):
    r"""
    Simple scraper testing.

    >>> scraper(1)
    '{\n  "slideshow": {\n    "author": "Yours Truly", \n    "date": "date of publication", \n    "slides": [\n      {\n        "title": "Wake up to WonderWidgets!", \n        "type": "all"\n      }, \n      {\n        "items": [\n          "Why <em>WonderWidgets</em> are great", \n          "Who <em>buys</em> WonderWidgets"\n        ], \n        "title": "Overview", \n        "type": "all"\n      }\n    ], \n    "title": "Sample Slide Show"\n  }\n}'
    """

    # Payload coming from https://httpbin.org/json
    return """{
  "slideshow": {
    "author": "Yours Truly", 
    "date": "date of publication", 
    "slides": [
      {
        "title": "Wake up to WonderWidgets!", 
        "type": "all"
      }, 
      {
        "items": [
          "Why <em>WonderWidgets</em> are great", 
          "Who <em>buys</em> WonderWidgets"
        ], 
        "title": "Overview", 
        "type": "all"
      }
    ], 
    "title": "Sample Slide Show"
  }
}"""

We would need another property like got in pytest's DocTestFailure object. That should give the proper string representation to use in the tests. I am not sure pytest-accept should do more. I am not clear what is the impact of rewriting it to raw docstrings in those cases.

For the time being I will be best served by using the raw string modified whenever needed.

@max-sixty
Copy link
Owner

We would need another property like got in pytest's DocTestFailure object. That should give the proper string representation to use in the tests. I am not sure pytest-accept should do more. I am not clear what is the impact of rewriting it to raw docstrings in those cases.

OK, yes, I think you're right. Thank you very much @AntonioL .

@AntonioL
Copy link
Author

Thanks for the library and apologies for the noise in creating this PR, should have done a bit more of research upfront!

@AntonioL AntonioL closed this Feb 13, 2022
@AntonioL AntonioL deleted the long_lines branch February 13, 2022 04:33
@max-sixty
Copy link
Owner

Not at all, thank you @AntonioL . Please continue with any feedback you have on the library!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants