Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[MRG] Working towards python3 compatible codebase #109

Merged
merged 52 commits into from
Jun 28, 2017

Conversation

manu-chroma
Copy link
Member

@manu-chroma manu-chroma commented May 22, 2017

Related issue checklist: #108

I'd prefer if the commits from this branch are preserved when merging into the master. This will ensure that future potential issues are easier to bisect. That's why I've ensured that the commit history is clean and commit messages are informative.

@manu-chroma manu-chroma changed the title Working towards python3 compatible codebase [WIP] Working towards python3 compatible codebase May 22, 2017
@tetron
Copy link
Member

tetron commented May 22, 2017

What happens if you add from __future__ import unicode_literalso that it uses unicode strings consistently?

@mr-c
Copy link
Member

mr-c commented May 22, 2017

looking good!

@manu-chroma
Copy link
Member Author

manu-chroma commented May 22, 2017

@tetron It allows in a Python 2 program to have the default interpretation of string literals be Unicode (UTF8). That is,

from __future__ import unicode_literals
 
s = "my surname is Häggholm"

is equivalent to

s = u"my surname is Häggholm"

except that I don’t have to prefix every single string with u to treat it as a Unicode literal.

Looks good to me, but it seems to have drawbacks too as highlighted in this article: http://python-future.org/unicode_literals.html
I am not very familiar with the codebase so I'd be interested in your opinion on this. Currently reading more sources and having look at other projects which are py2+3 compatible to get more idea.

@manu-chroma
Copy link
Member Author

@tetron @mr-c feel free to drop comments on individual commits. I'll do some cleanup and testing for corner cases now.

@tetron
Copy link
Member

tetron commented May 22, 2017

Thanks for the link.

If possible, I'd prefer to use Unicode strings consistently. Some string literals are already marked with u"" but doing that everywhere is tedious. The most likely problem is calling Python2 APIs which expect native strings and break if passed unicode. I think it is at least worth a try to find out what breaks.

The main client of schema-salad is cwltool. Current cwltool has a version restriction of < 3 so we should be able to safely release a 3.0 schema-salad package without breaking existing cwltool installations.

@manu-chroma
Copy link
Member Author

Thanks for the heads up.

The most likely problem is calling Python2 APIs which expect native strings and break if passed unicode.

Yes, this is my main concern too.

I think it is at least worth a try to find out what breaks.

Sure, I will try this approach. Interested to see how this works out.

I think it would be better to get this PR merged before getting on with it. Would keep things simpler

@tetron
Copy link
Member

tetron commented May 23, 2017

Here's how I would like to do this. The goal is minimize the number of merges & schema-salad releases in the event there are issues in schema salad that turn up while porting cwltool.

  1. in this branch setup.py, change version="3.0"
  2. in cwltool, create a new python3 branch and change the dependency to schema-salad >= 3.0
  3. for development, create a virtualenv and clone/install python3 branches for both schema-salad and cwltool
  4. fix up cwltool for python3 support & pass conformance tests using both python 2.7 and 3.4
  5. merge & release schema-salad 3.0
  6. this should make it possible for the cwltool PR tests to pass
  7. once those are green, merge & release cwltool

@manu-chroma
Copy link
Member Author

I like the overall workflow. This would give me a chance to see how changes in schema_salad are breaking conformance tests early on.
One thing I will have to take care about is the merge conflicts that will arise since @kapilkd13 will also be working on both the repos for MS Windows feature.

@kapilkd13
Copy link
Contributor

Nice 😃

@AleksandrSl
Copy link

About typesheds for python3, there are already many of them in python typeshed repo https://github.com/AleksandrSl/schema_salad/blob/python3-support/typeshed.md. I have raised issues about typeshed addition about two months ago in mistune in cachecontrol repos, but got no answer so far, so I think I will try to write them myself. Is there any modules that I've not mentioned or haven't find ready typesheds for?

@manu-chroma manu-chroma force-pushed the modernize branch 2 times, most recently from aa13415 to 8c17480 Compare May 27, 2017 20:24
@manu-chroma
Copy link
Member Author

manu-chroma commented May 27, 2017

Updates:

  • Enabled Python 3.x (x = 3,4,5,6) on Travis for unit and lint tests.
  • Tested this branch of schema_salad with current master of cwltool locally. Unit tests and conformance tests passing. Thus, changes made so far don't break backward compatibility with Python 2.7. 😅

@manu-chroma
Copy link
Member Author

@AleksandrSl Thanks so much for compiling this. Before running mypy in Python 3 mode I've plans to bump mypy to latest version and then try to make it run in Python 3 and specifically aim for --strict flag. 😄 Switching on --strict flag definately requires better annotated stubs files/codebase AFAIK.

I will go through your typshed list and try to update it for the missing fields. Also, we've used stubgen for some of the 3rd party dependencies so far. So, need to decide if we will continue with that or write our own stubs for some of them.

@@ -299,7 +301,7 @@ def test_scoped_id(self):
}, ra)

g = makerdf(None, ra, ctx)
print(g.serialize(format="n3"))
print((g.serialize(format="n3")))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we wrap these expressions in additional parentheses? Just wondering

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this change was done by modernize fixture. I'll fix it.

tox.ini Outdated
deps =
mypy==0.470
typed-ast==0.6.3
-rrequirements.txt
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

six requirement missing in requirements.txt

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Earlier the tests were passing because six was being satisfied as a requirement of some other package. I will add it as an explicit dependency now.
I believe I should add in setup.py rather than requirements.txt because it is required beyond testing.

@@ -227,7 +234,7 @@ def typefmt(self,
nbsp=False, # type: bool
jsonldPredicate=None # type: Optional[Dict[str, str]]
):
# type: (...) -> Union[str, unicode]
# type: (...) -> Union[str, Text]

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't Text an alias for str in python3 and unicode in python2, why mention str separately?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Separately mentioning is redundant in python3 indeed, but in case of python2 union states that this function can return ASCII strings as well as unicode strings.
# type: (...) -> Union[str, unicode] annotation was already present, I just made sure Text type makes it compatible with python3.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@manu-chroma It seems, though, that simple Text is enough.
"When checking Python 2.7 code, type checkers should treat the int and long types as equivalent. For parameters typed as unicode or Text , arguments of type str should be acceptable." from PEP484
I checked this with mypy, str and unicode pass for Text in python2 and str for Text in python3

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@AleksandrSl Thanks for the clarifying this. 😄

I will make changes wherever required in PR #111

Main changes:
- use of __future__
- using six lib to:
	- deal with checking unicode
	- dict operations
	- other API changes
@manu-chroma
Copy link
Member Author

manu-chroma commented Jun 27, 2017

In the process of rebasing and resolving merge conflicts, I ended up having some commits messages appearing twice. Just messed up commit history a little. 😓

edit: around 3-4 commits. shouldn't be a problem

_add_lc_filename(d, source)

def relname(source): # type: (AnyStr) -> AnyStr
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any particular reason this was marked as AnyStr? I don't expect this to take binary strings.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No reason I can think of

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was before Text was a thing

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

alright. 👍

_add_lc_filename(r, relname(source))

def reflow(text, maxline, shift=""): # type: (AnyStr, int, AnyStr) -> AnyStr
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

query same as above.

@manu-chroma manu-chroma changed the title [WIP] Working towards python3 compatible codebase [MRG] Working towards python3 compatible codebase Jun 27, 2017
@manu-chroma
Copy link
Member Author

manu-chroma commented Jun 27, 2017

This branch is pretty mature now. I would like to get it reviewed and merged now.
cc: @mr-c @tetron @anton-khodak

setup.py Outdated

install_requires.append("avro") # TODO: remove me once cwltool is
# install_requires.append("avro") # TODO: remove me once cwltool is
# available in Debian Stable, Ubuntu 12.04 LTS
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

these lines can now be removed

setup.py Outdated

setup(name='schema-salad',
version='2.5',
version='3.0',
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the API really different? Does the current released version ofcwltool still pass its tests and the CWL conformance tests which this branch? If not, then 2.6 or 2.5.1 is fine.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This version bump was motivated by @tetron's comment.
#109 (comment)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in the event there are issues in schema salad that turn up while porting cwltool.

So let's test that as I mentioned in my previous comment

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does the current released version ofcwltool still pass its tests and the CWL conformance tests which this branch?

conformace and unit tests pass

mypy fails. https://travis-ci.org/manu-chroma/cwltool/jobs/247893307#L194

this is due to changes made when upgrading mypy to 0.511 for schema_salad. But cwltool master is still uses mypy==0.470.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we can selectively silence errors due to schema_salad using mypy config file until mypy==0.511 changes are not merged in cwltool.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's fine, as long as we fix it this week.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah. I'm preparing cwltool PR to get reviewed soon.

setup.py Outdated
@@ -67,7 +69,7 @@
'console_scripts': ["schema-salad-tool=schema_salad.main:main", "schema-salad-doc=schema_salad.makedoc:main"]
},
zip_safe=True,
cmdclass={'egg_info': tagger},
# cmdclass={'egg_info': tagger},
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i've fixed it.
when testing with cwltool earlier, it was giving me errors related to requirement parsing. thought commenting this out might help. forgot to uncomment.

manu-chroma added a commit to manu-chroma/cwltool that referenced this pull request Jun 28, 2017
manu-chroma added a commit to manu-chroma/cwltool that referenced this pull request Jun 28, 2017
manu-chroma added a commit to manu-chroma/cwltool that referenced this pull request Jun 28, 2017
@mr-c mr-c self-requested a review June 28, 2017 11:50
setup.py Outdated

setup(name='schema-salad',
version='2.5',
version='3.0',
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Change this to 2.5.1 to 2.6 and I'll merge it!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2.5.1 or 2.6?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think 2.5.1 is fine.

@mr-c mr-c merged commit fa93acc into common-workflow-language:master Jun 28, 2017
@mr-c
Copy link
Member

mr-c commented Jun 28, 2017

Huzzah! Thank you @manu-chroma

@manu-chroma
Copy link
Member Author

thank you @mr-c 😅

@manu-chroma
Copy link
Member Author

next PR for schema_salad will be doing unicode_literals import everywhere. will get cwltool PR merged before proceeding with this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants