Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Python 3.8 #142

Closed
wants to merge 31 commits into from
Closed

Python 3.8 #142

wants to merge 31 commits into from

Conversation

normanlorrain
Copy link
Contributor

This PR brings Python 3.8 compatibility. Many changes here, so to make things easier on myself I created virtual environments for Python 3.5, 3.6, 3.7, 3.8, and identified what need to be changed at each step. This PR contains the results. Tested under Windows and Windows Subsystem for Linux (Debian).

Also added some unit tests, for two reasons:

  • help in understanding how certain modules worked
  • confirm no breakage between Python versions.

I'd like to see more tests in the future. To make this easier, begun to refactor util.py to a package. Feel free to omit that change. This is a work in progress.

The changes to Python 3.7 affecting the project are:

In 3.7.1 the tokenize module now implicitly emits a NEWLINE token when provided with input that does not have a trailing new line. This behavior now matches what the C tokenizer does internally. (Contributed by Ammar Askar in bpo-33899.)

The changes to Python 3.8 affecting the project are:

Removed str implementations from builtin types bool, int, float, complex and few classes from the standard library. They now inherit str() from object. As result, defining the repr() method in the subclass of these classes will affect their string representation. (Contributed by Serhiy Storchaka in bpo-36793.)

I also addressed code that emitted warnings (e.g. DeprecationWarning: invalid escape sequence \)

See pygments/lexers/python.py  in Github,
commit a8a1a3a06e364bc8f5e777bf663e9da4710cb055
In 3.6.7 the tokenize module now implicitly emits a NEWLINE token when provided with input that does not have a trailing new line. This behavior now matches what the C tokenizer does internally. (Contributed by Ammar Askar in bpo-33899.)

"rinoh --help" command now works.
DeprecationWarning: invalid escape sequence \
    DeprecationWarning: invalid escape sequence \
From https://docs.python.org/3/whatsnew/3.7.html :
PEP 479 is enabled for all code in Python 3.7, meaning that
StopIteration exceptions raised directly or indirectly in
coroutines and generators are transformed into RuntimeError exceptions.
(Contributed by Yury Selivanov in bpo-32670.)
RefKeyDictionary moved to its own file.
This will be done for other classes, functions, etc., and
unit tests will be done for each.
"Removed __str__ implementations from builtin types bool, int, float, complex and few classes from the standard library. They now inherit __str__() from object. As result, defining the __repr__() method in the subclass of these classes will affect their string representation. (Contributed by Serhiy Storchaka in bpo-36793.)"

 https://bugs.python.org/issue36793
@claassistantio
Copy link

claassistantio commented Jan 22, 2020

CLA assistant check
All committers have signed the CLA.

@brechtm
Copy link
Owner

brechtm commented Jan 23, 2020

Thank you. This is much appreciated! Please accept my apologies for not responding to your offer for help in #133.

I still had a bunch of local commits sitting on my hard drive, some of which already fixed a few of these issues. This week I finally reserved some time for rinohtype and I have now pushed these to GitHub. I will go through your commits and merge them into the master branch on Friday.

@brechtm
Copy link
Owner

brechtm commented Jan 24, 2020

A small update. I didn't get too far yet, as I first wanted to fix the Travis CI configuration and make sure all builds succeed.

@normanlorrain
Copy link
Contributor Author

normanlorrain commented Jan 25, 2020 via email

@@ -145,7 +145,7 @@ def __new__(cls, value, base=10, indirect=False):
def __init__(self, value, base=10, indirect=False):
Object.__init__(self, indirect)

def _repr(self):
def __repr__(self):

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe just __repr__ = int.__repr__?

@@ -561,7 +561,7 @@ def parse_string(cls, string):

@classmethod
def doc_repr(cls, value):
space = '\ ``\N{NO-BREAK SPACE}``\ '
space = r'\ ``\N{NO-BREAK SPACE}``\ '

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It does not work because of \N.

Suggested change
space = r'\ ``\N{NO-BREAK SPACE}``\ '
space = '\\ ``\N{NO-BREAK SPACE}``\\ '

inserted between each two consecutive elements"""
iterable = iter(iterable)
yield next(iterable)
while True:

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It can be simpler:

for item in iterable:
    yield element
    yield item


def __init__(self, *args, **kwargs):
self.store = dict()
self.update(dict(*args, **kwargs))

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not just self.update(*args, **kwargs)?

Actually, it does not work with kwargs at all, because you create a weak reference to a key, but kwargs keys are strings, which do not support weak references.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because I copied this code from StackOverflow without thinking :-)

del self.store[id(obj)]

def __iter__(self):
return iter(self.store)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that it emits ids of keys, not keys. So many code (for example values() and items()) will not work.

return value

def __setitem__(self, obj, value):
self.store[id(obj)] = ref(obj), value

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You do not keep a hard reference to obj. It can be removed, and after that its id is no longer valid. New created object can take the same id.

else:
char = chr(code)
if char in ('\\', '(', ')'):
char = '\\' + char
if char in (r'\\', r'(', r')'):

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

r'\\' is a two-character string. char is a single-character string, it never can be equal to r'\\'.

I think the existing code (without r) is correct.

if char in ('\\', '(', ')'):
char = '\\' + char
if char in (r'\\', r'(', r')'):
char = r'\\' + char

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have doubts about this change.

@normanlorrain
Copy link
Contributor Author

Thanks for pulling in some of these changes. Feel free to delete this PR. Don't worry about the unit tests and other changes; these were more for my curiosity. Master seems to work fine now on 3.8.

@brechtm
Copy link
Owner

brechtm commented Jan 29, 2020

Thanks for pulling in some of these changes. Feel free to delete this PR. Don't worry about the unit tests and other changes; these were more for my curiosity. Master seems to work fine now on 3.8.

For now, I've cherry picked only those that were needed to get the the build on Travis and AppVeyor in a more or less OK state. Fixing the CI configuration took up a lot of time, unfortunately. I'm currently looking into fixing the long-running tests. Once that is done, I'll get back to cherry-picking your commits. I definitely welcome more unit tests!

Some notes so far:

  • I've opted not to include the VSCode-related files. I don't think we should store these in the repository.
  • I'm using pyenv for managing different versions of CPython. I see that there's also a Windows version pyenv-win nowadays. This works well together with tox. I'll try to describe this setup.

@normanlorrain
Copy link
Contributor Author

normanlorrain commented Jan 29, 2020 via email

@brechtm
Copy link
Owner

brechtm commented Feb 10, 2020

  • Hadn't hear of pyenv. I'll check it out, thanks. So many tools!

Too many tools, indeed! I wrote some documentation on the tox environments and pyenv in DEVELOPING.rst in the root directory.

Also... - Any changes left over from the PR I'll resubmit as individual changes, simpler to deal with.

There's no need for that. I'm happy cherry-picking the commits, which I will now get back to. After that, I hope to make a long-overdue new release.

  • Btw, do you have any notes or documentation on the architecture of rinohtype? Your code is excellent, but sometimes I find it difficult to follow what the code is doing.

Yes, the program flow can get very complex. I tend to get lost myself too! One cause of this is the gratuitous use of exceptions during the rendering. This is something I'd like to address (#130).

Unfortunately, there is no documentation on the architecture. I'm sure writing it would uncover several more opportunities for simplification. Alas, I don't think I will find the time to do that any time soon. But if you have any questions (specific or general) with regard to the architecture, don't hesitate to ask me!

@brechtm
Copy link
Owner

brechtm commented Mar 4, 2020

I went over the remaining commits in this PR and cherry-picked some of them onto master.

Some remarks:

  • Some of your fixes I had already done on my local clone, so I didn't cherry-pick those.
  • "invalid escape sequence" fixes: I prefer not to use raw docstrings, so I just fixed the escapes.
  • RefKeyDictionary: keys (Element instances) will not be deleted during the rendering process, so the case simulated in your unit test does not occur in rinohtype (as far as I know).
    • That means the refkeydict is overkill the typical use case, unless when when rendering.multiple documents in series in the same Python session (I haven't explicitly tested this)
  • I'm not sure why you started to split out util.py to multiple files. It's only 300 lines and in line with the Zen of Python's Flat is better than nested.
  • The PeekIterator, util.timed and cos.Integer tests don't assert anything, so I've not retained those.

Thanks for your contributions. Sorry this took so long; I again ended up wrestling with Travis CI configuration which is a major time drain (required dependencies for PDF diffing). Next up: a new release.

Also, thank you @serhiy-storchaka for your input! I fixed the issue in RefKeyDictionary which you spotted.

@brechtm brechtm closed this Mar 4, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants