Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Modify so that flake8 and mypy are happy #300

Merged
merged 28 commits into from
Dec 14, 2018

Conversation

LukeShu
Copy link
Contributor

@LukeShu LukeShu commented Dec 2, 2018

This is all small changes (naturally), but since there are so many of them it can be hard to review them all at once, so this is a whole bunch of small commits, for easy reviewing.

Come back with no complaints from the following (sometimes using # NOQA or # type: ignore):

  • flake8 3.6.0 with flake8-isort 2.5: This turns out to be entirely stylistic changes
  • mypy --strict 0.641: Mostly mucking with and adding type hints, but the last 6 commits do make actual changes.

The individual commit messages should be sufficiently detailed.

And add `setup.cfg` to configure isort.
 - Set the maximum line length to 119, same as systemd.git
 - If the line is a function definition that is long because of many args,
   put one arg per line
 - Don't split user-facing strings over multiple lines; use NOQA to silence
   those complaints
 - In a few cases, re-structure the code to be easier to follow

This doesn't blindly modify the code to flake8's complaints, it looks at
long lines, and sees if we can make them easier to read.
Copy link
Member

@lucaswerkmeister lucaswerkmeister left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This mostly looks good to me (and thank you for breaking this up across so many commits with beautiful messages! ❤️), but I feel like this has the potential to move a lot of people’s cheese around, so I’d prefer some more people to look at this before merging.

GitHub won’t let me add comments to commit messages, so I’m putting them here:


This commit message:

2 blank lines before/after class/function definitions

could be simplified to

2 blank lines around class/function definitions

or even

2 blank lines around definitions

but that’s just a suggestion. (In the last case, the “blank line between method definitions” commit could be squashed into it as well.)


The commit message for

Remove individual spaces around things

has the following block twice:

  • E251: Don't put spaces around the = when passing a keyword
    argument to a function:
      fn(kw = val)
           ^ ^
  • E251: Don't put spaces around the = when passing a keyword
    argument to a function:
      fn(kw = val)
           ^ ^

lambda line: "HOOKS=\"systemd modconf block sd-encrypt filesystems keyboard fsck\"\n" if line.startswith("HOOKS=") and args.encrypt == "all" else
"HOOKS=\"systemd modconf block filesystems fsck\"\n" if line.startswith("HOOKS=") else
line)
def jj(line: str) -> str:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does the “jj” stand for anything?

(Edit after looking at the rest of the commit: or is that the usual way for tiny helper functions in Python?)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I used jj because the lambdas in set_root_password() were already named jj, so I just kept that name for those, and used it in the other places too, for consistency. If I'd made up the name "from scratch", I'd probably have named it patchfn

It looks like jj was originally introduced by @keszybz, you'd have to ask him.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

jj stands for ... argh, let's not go there. I think those should be made top-level functions now, no need to make them nested.

l.append(x)
setattr(namespace, self.dest, l)
ary.append(x)
setattr(namespace, self.dest, ary)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There’s another occurrence of the “ambiguous” name l, down in line_join_list. Shouldn’t that be renamed as well?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Huh. I'm not sure why flake8 doesn't catch that.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, E741 does't trigger for argument names.

def install_distribution(args: CommandLineArguments,
workspace: str,
*,
run_build_script: bool,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this parameter new? I don’t see it used below.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Whoops, it was previously part of **kwargs. A later commit fixes this when it drops kwargs.

@complete_step('Detaching namespace')
# https://github.com/python/mypy/issues/1317
C = TypeVar('C', bound=Callable)
completestep = cast(Callable[[str], Callable[[C], C]], complete_step)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alternative suggestion: rename the def to completestep, and make complete_step the cast version of it, so all the uses of the decorator wouldn’t need to be adjusted. I feel like that would be better, but I’m not sure, so for now that’s just a suggestion :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's also used as a generator inside of with statements. I'm not sure in which case it's used more, but I figured the decorator usage was easier to search/replace for.

if not args.bmap:
return None

if not args.output_format.is_disk_rw():
return None
assert raw is not None
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if there should be a blank line before this addition. Usually, the assert … is not None is right after the single if …: return check, but here we have two such checks separated by a blank line, and the assert doesn’t really belong any closer to the second one than the first one, in my opinion.

Copy link
Contributor Author

@LukeShu LukeShu Dec 2, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know that is_disk_rw() implies is_disk() implies raw is not None.

I'm not sure what the relationship between bmap and raw is.

That is: In my understanding, it is related the the second check, and not to the first.

 - E225: Put a space on each side of the `+` operator
 - E227: Put a space on each side of the `|` operator
 - E227: Put a space on each side of the `<<` operator
 - E231: Put a space after `,` in a list/set/tuple.
 - E231: Put a space after the `:` when defining a lambda
 - E231: Put a space after the `:` in dict literals
 - E231: When type-hinting an argument, put a space between the `:`
   and the type:
       def fn(argname: type) -> type:
                      ^
 - E252: When type-hinting an argument with a default value, put
   spaces around the `=`:
       def fn(argname: type = default) -> type:
                           ^ ^
   This is in contrast with non-type-hinted arguments with defaults,
   which do NOT get spaces around the `=` (E251).
 - E261: Put two spaces before the `#` of an inline comment
@LukeShu
Copy link
Contributor Author

LukeShu commented Dec 2, 2018

but that’s just a suggestion. (In the last case, the “blank line between method definitions” commit could be squashed into it as well.)

class & function definitions get 2 blank lines between them; methods only get 1

…d things

 - E201: Don't put a space after the `[` in a list:
       ary = [ item,
              ^
 - E201: Don't put a space after the `{` in a dict:
       d = { key: val,
            ^
 - E202: Don't put a space before the `]` in a list:
       ary = [item1, item2 ]
                          ^
 - E202: Don't put a space before the `}` in a dict:
       d = [key: val }
                    ^
 - E203: Don't put a space before the `:` in a dict:
       d = {key : val}
               ^
 - E221: Write `# NOQA: E221` if we're lining up the `=` in a list of
   assignments
 - E222: Write `# NOQA: E222` if we're lining numbers on the right
   side of `=` in a list of assignments
 - E251: When defining an argument a default value, and the argument
   isn't type-hinted, don't put spaces around the `=`:
       def fn(argname = default) -> type:
                     ^ ^
   This is in contrast with type-hinted arguments with defaults, which
   DO get spaces around the `=` (E252).
 - E251: Don't put spaces around the `=` when passing a keyword
   argument to a function:
       fn(kw = val)
            ^ ^
 - E111: Normal indents should be multiples of 4
 - E128: Fixup visual indents
PEP 8 says to avoid `l` as a variable name, because it can be hard to
distinguish from `1` and `I`.  This means that flake8 complains about
the variable inside of ListAction.

However, flake8 doesn't trigger E741 for *argument* names.  Even
though flake8 doesn't complain about it, also rename the argument of
line_join_list() from `l` to `ary`.
Commit a415718 bumped the required
Python version to 3.6; which means that we can now use PEP 526
type-hints, instead of doing it in magical `# type:` comments.
Unfortunately, mypy 0.641 isn't yet smart enough to figure out the
appropriate subtype of IO to return from tempfile.NamedTemporaryFile()
based on whether the mode= argument contains "b", so we'll have to cast()
its return value for now.

If this makes a line wider than 119 characters, move the dir= argument to
the next line.
 - A @contextlib.contextmanager should return a
   typing.Generator[yield, send, return]
 - Because of <python/mypy#1317>, mypy 0.641
   doesn't think complete_step() has the correct type to be a decorator
   that works the way it does.  So, set

       completestep = complete_step

   But do typecasting to make mypy think it's the correct type.  And use
   `@completestep` instead of `@complete_step`.  That's an easy
   search/replace.
When we have a "new" with a new type, don't reuse the name of another
variable still in scope.
Common cases:

 - An `if foo is None` check clearly indicates that foo should be
   Optional[]:
   * prepare_root(): dev
   * prepare_home(): dev
   * prepare_srv(): dev
   * mount_image(): loopdev, home_dev, srv_dev
   * save_cache(): raw, cache_path
   * link_output_nspawn_settings(): path
   * link_output_checksum(): checksum
   * link_output_root_hash(): root_hash_file
   * link_output_signature(): signature
   * link_output_bmap(): bmap
   * process_settings(): key
   * run_build_script(): raw
 - In general, replace IO[str] with TextIO, except that in many cases
   BinaryIO is correct:
   * create_image() returns binary, not text
   * reuse_cache_image() returns binary, not text
   * attach_image_loopback() takes binary, not text
   * make_verity() returns binary, not text
   * calculate_sha256sum():
     . argument: raw is binary, not text
     . argument: tar is binary, not text
   * make_build_dir():
     . return: (raw|squashfs) is binary, not text
     . return: tar is binary, not text
   * run_build_script(): raw is binary, not text
   * remove_artifacts():
     . argument: raw is binary, not text
     . argument: tar is binary, not text
 - In general, replace IO[bytes] with BinaryIO, except that in many cases
   TextIO is correct:
    * calculate_sha256sum() returns text, not binary
    * calculate_bmap() returns text, not binary
 - Avoid ambiguous IO[Any]:
    * hash_file() writes to a TextIO
    * calculate_bmap() reads a BinaryIO

Special cases:

 - run(): Either returns a CompletedProcess, or doesn't return.  Drop
   the Optional[].
 - mount_image(): ???
 - calculate_sha256sum():
   . argument: root_hash_file is a BinaryIO, not a str
   . argument: nspawn_settings is a BinaryIO, not a str
     Wait, BinaryIO, not TextIO!?  Yes, even though the .nspawn file is
     plaintext.  The IO object we have here is a NamedTemporaryFile copy
     of the user's .nspawn file.  We made that copy in binary mode, to
     avoid pointlessly decoding then encoding the text.
 - remove_glob(): When providing a type hint for '*args', you provide the
   hint for the individual 'arg's, not the 'args' list.
There are several places where we unconditionally call a function, then
that function checks `args` to decide if it immediately returns, or if it
actually gets to do work.  In the cases where it immediately returns, some
of the arguments can be None, so they should be hinted as Optional[].
However, in the case were it does get to do work, they cannot be None.

So, hint them as Optional[], and the add `assert ARGNAME is not None`
after the check that implies it's not None, to let mypy know what we know.

For instance: args.output_format.is_disk() implies that raw is not None.
So it's safe to make the change:

         if not args.output_format.is_disk():
             return None
    +    assert raw is not None
We must declare 'values' as Union[str, Sequence[Any], None], to keep the
__call__ definition compatible with the parent class.  However, we know
(and rely on) that how we call .add_argument() means that it will always
be a str.  Add an 'assert' to let mypy know what we know.
But this shouldn't actually change anything; this is just making
stylistic changes to the code to clue mypy in to some stuff.
Mostly issues with not handling things that may be None.
@LukeShu
Copy link
Contributor Author

LukeShu commented Dec 3, 2018

v2:

  • Reword the E302,E305 commit message
  • Reword the E301 commit message
  • Remove a duplicate bullet point in the E251 commit message
  • Also modify line_join_list() in the E741 commit, even though E741
    doesn't care
  • Don't add the 'run_build_script' kwarg to install_distibution() in
    the "add missing type-hints" commit; only hint existing args in
    that commit. Instead, do that in a later commit:
  • In the install_distribution()/install_{DISTRIBUTION}() type
    unification commit, switch install_distribution() from using
    **kwargs to listing each the kwargs explicitly.

That is almost entirely metadata/history change, not content change.
The only content change is as follows:

$ git diff to-upstream/2018-12-02-flake8+mypy-v1 to-upstream/2018-12-03-flake8+mypy-v2 
diff --git a/mkosi b/mkosi
index abeda33..0026560 100755
--- a/mkosi
+++ b/mkosi
@@ -3870,12 +3870,12 @@ def none_to_none(s: Optional[str]) -> str:
     return "none" if s is None else s
 
 
-def line_join_list(l: List[str]) -> str:
+def line_join_list(ary: List[str]) -> str:
 
-    if not l:
+    if not ary:
         return "none"
 
-    return "\n                        ".join(l)
+    return "\n                        ".join(ary)
 
 
 def print_summary(args: CommandLineArguments) -> None:

@poettering
Copy link
Member

Hmm, this will regress quickly again if not verified all the time. Any chance you can include a patch that updates /ci/semaphore.sh to validate this on each PR?

@LukeShu
Copy link
Contributor Author

LukeShu commented Dec 3, 2018

Why does semaphore.sh manually add an apt repo for python3.6, and install python from there? There's an option in Semaphore to just start with 3.6.

@poettering
Copy link
Member

Why does semaphore.sh manually add an apt repo for python3.6, and install python from there? There's an option in Semaphore to just start with 3.6.

iirc then the code runs in a docker container, but we need more than that for losetup and stuff to work

@zyga
Copy link
Contributor

zyga commented Dec 4, 2018

Wow, this is super nice. What is the baseline python version for mkosi? The attribute type syntax is quite new and will not parse on older python versions.

mkosi Outdated
@@ -61,7 +61,7 @@ if sys.version_info < (3, 6):
arg_debug = ()


def run(cmdline: List[str], execvp: bool = False, **kwargs) -> Optional[subprocess.CompletedProcess]:
def run(cmdline: List[str], execvp: bool = False, **kwargs) -> subprocess.CompletedProcess:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It feels like the case that uses execvp=True should be a separate method with clearly different return type. There's a special no-return type in mypy for this kind of interaction.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't realize that there's a special no return type. Good suggestion!

@LukeShu
Copy link
Contributor Author

LukeShu commented Dec 4, 2018

@poettering That's definitely not the case on Semaphore 2.0, but apparently the systemd org is on Semaphore Classic, so IDK.

@zyga The attribute type syntax is introduced in Python 3.6, which a recent commit (e965e58) has already bumped the required version to (the appropriate commit message, 62d0e73, notes this).

@poettering
Copy link
Member

needs a rebase.

regarding the contents of the patch... I am not really a guru when it comes to python, and this looks like something only a true pythonista can appreciate, hence maybe @keszybz could you have a look if this makes sense to you?

@poettering
Copy link
Member

poettering commented Dec 6, 2018

@poettering That's definitely not the case on Semaphore 2.0, but apparently the systemd org is on Semaphore Classic, so IDK.

That sounds like you are volunteering to move the CI to semaphore 2.0? ;-)

@@ -401,9 +441,11 @@ def btrfs_subvol_delete(path: str) -> None:
# Delete the subvolume now that all its descendants have been deleted
run(["btrfs", "subvol", "delete", path], stdout=DEVNULL, stderr=DEVNULL, check=True)

def btrfs_subvol_make_ro(path: str, b:bool=True) -> None:

def btrfs_subvol_make_ro(path: str, b: bool = True) -> None:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I may be wrong but last time I looked at this the assignment of default values was supposed to not use whitespace around it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Whitespace is supposed to be there iff the parameter has a type specified, see 554a455.

if not args.output_format.is_disk():
return None

with complete_step('Creating partition table',
'Created partition table as {.name}') as output:

f = tempfile.NamedTemporaryFile(dir=os.path.dirname(args.output), prefix='.mkosi-', delete=not for_cache)
f: BinaryIO = cast(BinaryIO, tempfile.NamedTemporaryFile(prefix='.mkosi-', delete=not for_cache,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't the cast tell the inference engine enough about f that you can drop the type annotation entirely?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes.

My mindset is "every call to cast() is a workaround for a bug/deficiency in mypy, and should be droppable as mypy improves."

I think the "correct" way to write the line is

f: BinaryIO = tempfile.NamedTemporaryFile(prefix='.mkosi-', delete=not for_cache,

Since I'd like the : BinaryIO to be there after the cast() is removed, I included it now, even though it isn't necessary as long as cast() is there.

@@ -541,7 +590,8 @@ def reuse_cache_image(args: CommandLineArguments, workspace: str, run_build_scri
return None, False

with source:
f = tempfile.NamedTemporaryFile(dir = os.path.dirname(args.output), prefix='.mkosi-')
f: BinaryIO = cast(BinaryIO, tempfile.NamedTemporaryFile(prefix='.mkosi-',
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same question about the inference engine as above.

@LukeShu
Copy link
Contributor Author

LukeShu commented Dec 10, 2018

That sounds like you are volunteering to move the CI to semaphore 2.0? ;-)

@poettering Sure, if I got the appropriate credentials :)

@systemd systemd deleted a comment from LukeShu Dec 14, 2018
Copy link
Member

@keszybz keszybz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great. mypy still has some complaints, but they are all clearly bogus. Getting this so clean went much faster than I expected.

I'll merge this now, since rebasing is so unpleasant. Further fixes can be done separately.

@@ -1,3 +1,5 @@
[flake8]
max-line-length = 119
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the meantime we moved to 109 in systemd, but we can catch up here later.

lambda line: "HOOKS=\"systemd modconf block sd-encrypt filesystems keyboard fsck\"\n" if line.startswith("HOOKS=") and args.encrypt == "all" else
"HOOKS=\"systemd modconf block filesystems fsck\"\n" if line.startswith("HOOKS=") else
line)
def jj(line: str) -> str:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

jj stands for ... argh, let's not go there. I think those should be made top-level functions now, no need to make them nested.

GPT_ROOT_ARM_VERITY = uuid.UUID("7386cdf2203c47a9a498f2ecce45a2d6")
GPT_ROOT_ARM_64_VERITY = uuid.UUID("df3300ced69f4c92978c9bfb0f38d820")
GPT_ROOT_IA64_VERITY = uuid.UUID("86ed10d5b60745bb8957d350f23d0571")
GPT_ROOT_X86 = uuid.UUID("44479540f29741b29af7d131d5f0458a") # NOQA: E221
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wish flake8 and other tools could be made smart enough to figure out stuff like this one their own. Or maybe if there could be an external config file that'd tell flake8 to shut up about some of the checks.

l = getattr(namespace, self.dest)
if l is None:
l = []
ary = getattr(namespace, self.dest)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ary is a strange name...

passphrase = (passphrase['content'] + "\n").encode("utf-8")
run(["cryptsetup", "open", "--type", "luks", dev, name], input=passphrase, check=True)
passphrase_content = (passphrase['content'] + "\n").encode("utf-8")
run(["cryptsetup", "open", "--type", "luks", dev, name], input=passphrase_content, check=True)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

universal_newlines= aka text= would be an alternative. But this is OK too.

@keszybz keszybz merged commit 624a2a4 into systemd:master Dec 14, 2018
keszybz added a commit that referenced this pull request Dec 14, 2018
Merging locally to resolve the few conflicts.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

Successfully merging this pull request may close these issues.

None yet

5 participants