Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Format code with Yapf #997

Closed
wants to merge 2 commits into from
Closed

Conversation

svartkanin
Copy link
Collaborator

I raised this as part of #872.

I'd like to start this as a discussion point here and to showcase Yapf.
I would like to make it clear that I'm not trying to enforce anything here but rather suggest something :)

Looking at the current code base there's quite a few code styles flying around, which is nothing bad per se, but it might be useful to have a more unified formatting at least. I generally don't like to have a very strict formatting applied (I"m looking at you Black but rather configurable rules that allows you to tweak things to ones liking, which I believe Yapf provides.

Yapf is a formatter that allows you to configure A LOT of settings, please check out the README in the yapf repo which lists all(?) possible configuration options. There are pre-configured profiles that one can use and override certain things manually.

The configuration is done in the pyproject.toml file. I've included an example now that uses the pep8 profile as a base and then overrides some configurations.

To run the formatting, install

pip install yapf toml

and then

yapf archinstall/ --recursive --in-place

NOTE: These installations would only be for development so we're not adding any real dependencies to archinstall

@svartkanin
Copy link
Collaborator Author

And of course I'll fix flake8 which I just completely blew up :D

user_configuration_json = json.dumps(
{
'config_version':
storage['__version__'], # Tells us what version was used to generate the config
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why the new line and indent here?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be updated now with a new re-formatting

if save:
dest_path = pathlib.Path(storage.get('LOG_PATH', '.'))
if (not dest_path.exists()) or not (dest_path.is_dir()):
log(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is silly to split into 3 lines because of word wrapping

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you mean the arguments in the log function, I believe it tries to separate them into a new line for readability.
There might be a setting for this, but it will most likely affect other places that will get less readable by it

"intel-media-driver",
"vulkan-intel",
],
"Nvidia (open-source)": ["mesa", "xf86-video-nouveau", "libva-mesa-driver"],
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Odd that it put these all on one line for no good reason.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you referring to the entries in the list?
There's multiple settings for this to define how the entries should be split up, so this can be tweaked

@svartkanin
Copy link
Collaborator Author

@Torxed any thoughts, feedback, concerns on this one :)

@Torxed
Copy link
Member

Torxed commented Feb 28, 2022

This introduces spaces everywhere. And I'm pretty firm on that stance sadly.
I can't have spaces when working with editors like nano and notepad.exe every now and then, kills my spirit and my fingers :)

@svartkanin
Copy link
Collaborator Author

@Torxed that's a very easy fix, there's a setting for this in yapf :)
Any other concerns with this otherwise?
I'll update it to tabs asap

@Torxed
Copy link
Member

Torxed commented Feb 28, 2022

Humm, I might like yapf if it's this configurable :) I'll have a look in a bit.

@svartkanin
Copy link
Collaborator Author

@Torxed I've updated everything with tabs now :)

The configuration is set in the pyproject.toml file

@Torxed
Copy link
Member

Torxed commented Mar 28, 2022

I'll have a look on this one. I'll look it over pretty slowly because I'd like to see all aspects of changes that yapf does in order to understand what I'm committing to. So far I like how flexible it is and some of the changes I've touched, I think there was really only one change that I made manually to override it that stood out.

@Torxed
Copy link
Member

Torxed commented Mar 29, 2022

Overall, I like what Yapf is doing.
And It's generally doing a good job with lists, dicts and functions.
But there were some cases where it contradicted itself in the way it formatted dictionaries and lists etc, one example was archinstall/lib/storage.py where it really just bungled everything up.

I've tried to be as pragmatic as possible and ignore some of my personal preferences.
So the thoughts below are the things that really stood out to me:

  • It's going to take some getting used to that's for sure, for instance two spaces before a comment.
    Color highlighting kinda solves the need to distinguish comments by two spaces for me. But might be useful.
  • I agree that:
    parser.add_argument("--disk_layouts",
    			"--disk_layout",
    			"--disk-layouts",
    			"--disk-layout",
    			nargs="?",
    			help="JSON disk layout file")
    Is better than
    parser.add_argument("--disk_layouts","--disk_layout","--disk-layouts","--disk-layout",nargs="?",
    			help="JSON disk layout file")
    But I would prefer:
    parser.add_argument(
    	"--disk_layouts",
    	"--disk_layout",
    	"--disk-layouts",
    	"--disk-layout",
    	nargs="?",
    	help="JSON disk layout file"
    )
    And here's where we come in to taste and differences.
    I'm pretty sure (without the need for comments stating it) people will disagree.
    But I like that it's predictable in width, and follows the semantics of using tuples as line continuums on imports as well as list continuations.
    So if possible, I'd prefer the last format or none at all.
    And it must be consistent, which it isn't here:
    parser.add_argument(
    	"--silent",
    	action="store_true",
    	help="WARNING: Disables all prompts for input and confirmation. If no configuration is provided, this is ignored"
    )
    parser.add_argument("--dry-run",
    			"--dry_run",
    			action="store_true",
    			help="Generates a configuration file and then exits instead of performing an installation")
  • I would require Yapf to not mess up dictionaries:
    {
    	'config_version':
    	storage['__version__'],  # Tells us what version was used to generate the config
    	**self._user_config,  # __version__ will be overwritten by old version definition found in config
    	'version':
    	storage['__version__']
    },
    Here, it should have been 'version': storage['__version__'] but it looks like it's treating it as a set or list?
  • I know this isn't very PEP8 of me, but the following is true for me:
    def _is_valid_path(self, dest_path :Path) -> bool:  # Correct
    def _is_valid_path(self, dest_path: Path) -> bool:  # Wrong
    The reason being, it looks way to much like a dictionary definition, and less as "something else".
    It's a personal preference for sure, and again, people will disagree. But I get properly confused and annoyed reading code that looks like a dictionary, but doesn't behave like a dictionary :) Functionality should look differently, not just be in a different place. So I'd require either to not touch these definitions or make it do the wrong thing on purpose I guess.
  • I know it's not Yapf's fault and it will be hard to configure, but minituousus changes like these makes no sense to me:
    raise ValueError(f"Missing fs-type to format on newly created encrypted partition {partition['device_instance']}")
    raise ValueError(
    	f"Missing fs-type to format on newly created encrypted partition {partition['device_instance']}"
    )
    The reason for the change is obvious, the line was too long so it did what it could to shorten it.
    But in that case, just leaving it would make sense. So if possible - only do this if the statement itself is long (not just the line).
    Of let it feel good about itself and I'll have to live with the fact that it's almost impossible for it to handle the edge case.
  • This is less readable:
    if not partition['filesystem']['format'] or valid_fs_type(
    	partition['filesystem']['format']) is False:
     	print(
     		_("You need to enter a valid fs-type in order to continue. See `man parted` for valid fs-type's."))
    Than this:
    if not partition['filesystem']['format'] or valid_fs_type(partition['filesystem']['format']) is False:
    	print(_("You need to enter a valid fs-type in order to continue. See `man parted` for valid fs-type's."))
    I struggle with finding what starts where and what ends where.
    And it didn't really shorten anything down so this one bothered me : )
  • This became less readable:
    {
    	f"/dev/{dev_name}": {
    		**device_information, 'path': f'/dev/{dev_name}',
    		'PATH': f'/dev/{dev_name}',
    		'PTTYPE': None
    	}
    }
    Than what it previously was:
    {
    	f"/dev/{dev_name}" : {
    		**device_information,
    		'path' : f'/dev/{dev_name}',
    		'PATH' : f'/dev/{dev_name}',
    		'PTTYPE' : None
    	}
    }
  • This is nitpicking for sure, but I consider del to be a function not a statement, just like print was considered a statement before but turned out to be a function. Therefor I think del(x) is more correct than del x and for that reason del (x) is wrong I think:
    del(self.monitoring[fileno])   # It was
    del (self.monitoring[fileno])  # It became
  • Nitpicking again, ignore this one if it's not easy to solve without affecting complex lists which SHOULD be like this:
    # Better:
    def multisplit(s :str, splitters :List[str]) -> str:
    	s = [s, ]
    
    # Turn for the worse, but I get what it tired to do (being consistent).
    def multisplit(s: str, splitters: List[str]) -> str:
    	s = [
    		s,
    	]
  • Nitpicking, and I doubt we can do anything about it?
    '--hash', hash_type,
    Rather than
    '--hash',
    hash_type,
    I get what it's doing, but in the case of parameters to a command it would be nice if they followed each other instead of being on separate lines? : )

One thing I was impressed by, was this change:
screenshot

The only thing I'd like to say, is that it took some time wrapping my head around why it did what it did.
I get that it simplified the code, but it also changed it in a pretty dramatic way until you understood the change.
So that would take some time from me in every merger, but ultimately I think it looked better and that time is well spent.
Good job Yapf - even tho you gave me a bit of an Ai-level scare here re-writing code..

Found another one where it did some voodoo, and it comes up with it's own variable names?

# It was:
new_uuid_set = (previous_partition_uuids ^ {partition.uuid for partition in self.blockdevice.partitions.values()})
# Became:
uuids = {partition.uuid for partition in self.blockdevice.partitions.values()}
new_uuid_set = (previous_partition_uuids ^ uuids)

I hope it know to check for uuids further down the code so it didn't override or create possible bugs somewhere or conflicts with flake8/mypy : )

Interesting that it didn't like:

while SysCommand(f"/usr/bin/umount -R {temporary_mountpoint}").exit_code != 0 and (iterations := iterations + 1) < 10:

But instead prefered to do:

while SysCommand(f"/usr/bin/umount -R {temporary_mountpoint}").exit_code != 0 and iterations < 10:
    iterations += 1

I guess it's more readable, but I wonder if that's because people are just not used to the walrus operator yet or because it genuinely will never be readable hehe.

@svartkanin
Copy link
Collaborator Author

@Torxed thanks for the great review of this one!
To address some of them

for instance two spaces before a comment.

This can be set with the SPACES_BEFORE_COMMENT setting to any number of spaces. If updaed the code by setting it to 1 :)

So if possible, I'd prefer the last format or none at all.
And it must be consistent, which it isn't here:

The reason why this isn't consistent is that Yapf is merely reformatting lines that are too long and don't fit into the defined column space. I don't think there's a way to force it to everything regardless of length. Because in that case we'd have things like these

call_function(
     single_arg=42
)

I would require Yapf to not mess up dictionaries

I've added a new setting to indent dictionary values which will end up with

{
	'config_version':
		storage['__version__'], # Tells us what version was used to generate the config
	**self._user_config, # __version__ will be overwritten by old version definition found in config
	'version':
		storage['__version__']
},

The reason the values are put on a new line is because the comments are also considered as part of the code and it's trying to fit everything into the given column space.

def _is_valid_path(self, dest_path :Path) -> bool: # Correct
def _is_valid_path(self, dest_path: Path) -> bool: # Wrong

I don't think this is doable. I understand your reasoning but I personally have never seen a project using var :str :) not that it bothers me too much since an auto formatter would do whatever ;)

The reason for the change is obvious, the line was too long so it did what it could to shorten it.
But in that case, just leaving it would make sense. So if possible - only do this if the statement itself is long (not just the line).
Of let it feel good about itself and I'll have to live with the fact that it's almost impossible for it to handle the edge case.

I don't think this can be tweaked in that detail unfortunately :(

I struggle with finding what starts where and what ends where.
And it didn't really shorten anything down so this one bothered me : )

I totally agree on this one, there will most definitely by weird cases that have to be rewritten manually

This is nitpicking for sure, but I consider del to be a function not a statement, just like print was considered a statement before but turned out to be a function. Therefor I think del(x) is more correct than del x and for that reason del (x) is wrong I think:

This has been formatted like this because there are surounding parenthesis. Normally on a del you wouldn't need those and therefore it would end up being del entry[x] kinda like Python2.7 print entry[x] :)

itpicking again, ignore this one if it's not easy to solve without affecting complex lists which SHOULD be like this:

This is a very simple fix but just removing the , from it ;) . I understand that it might end up like this and why Yapf is doing it, but sometimes to get a better formatting it might be necessary to rewrite the code and simplify things. IMHO if yapf produces weird formatting that means that the code is already too complex and can be simplified in a way.

Nitpicking, and I doubt we can do anything about it?

In this case there's no way to tell yapf about this, this is internal logic and it would be quite magical of a formatter to determine based on the logic what format to use :)

Found another one where it did some voodoo, and it comes up with it's own variable names?

haha this was actually me doing some manual refactoring in this case. I think this is rather an illustration of how to refactor code to get better formatting results ;)

So overrall Yapf is configurable to align more with personal preferences but it does have it's limitations. It cannot format dictionaries or lists individually but based on a rule set that applies to all of them equally. Which means that it will end up doing some things that one might not like as much.

From that I have to say though that if a formatting turns out to be bad, then it's usually quite easy to refactor the code itself to gain simplicity and also the formatting will then work.

I have pushed now an update with some setting tweaks, if you get some time have another look and see if it's getting better or worse now :)

I like formatters because they unify the code and give it a global set of rules that is systematically applied. Depending on code complexity this will of course not always work perfectly, especially in a language that operates with spaces/tabs as part of it's language :D

@svartkanin
Copy link
Collaborator Author

FYI just ignore the conflicts for now, if we should find a compromise on this and will actually introduce the formatter into the code then i'll start from a new master anyways :)

@Torxed
Copy link
Member

Torxed commented Mar 30, 2022

I'm leaning towards maybe disabling the line-length as that appears to be the root of many "oddities".
I would prefer consistency over following strict guidelines.
A long line is always a long line, but a dictionary should also look and be a dictionary in all scenarios : )

Thank you for taking the time to do this and to respond to my rants.
I'll have a proper look on your comments in a bit :)

@Torxed
Copy link
Member

Torxed commented Apr 20, 2022

I've asked for a "knob" (I think they call it) to configure type hints colon position. It's such a small thing but I would love for it to happen. Tracking it here: google/yapf#1003

I've also started experimenting with yapf in a smaller project of mine, to see what tweaks we can do to the configuration.
Among those things I found that increasing the line length and have yapf not bother with long lines yields pretty good results.
Downside is we don't get automatic formatting on long lines - but perhaps that's not the most pressing issue with the style of code at the moment? :)

@Torxed
Copy link
Member

Torxed commented Apr 20, 2022

Added a second upstream suggestion in order to be able to affect all sublevels of dictionaries the same way the top level is formatted. Tracking it here: google/yapf#1004

That one is a blocker for me as it makes dictionaries harder to read.
Probably a configuration I missed but I'll see what comes out of the above :)

Starting to like Yapf.

@svartkanin
Copy link
Collaborator Author

@Torxed thanks for having another look ❤️
I think github messed up the actual links to the issues raised but I found them in the Yapf repo ;) lets see what they say about the dictionary one, I'm curious as well on it!

@Torxed
Copy link
Member

Torxed commented Apr 22, 2022

Pleasure is all mine!
Hehe yea I think github filtered away my own shortening []() for some reason. Added a direct URL ^^
I don't suspect much feedback on them, it feels like most issues are quite silent.

@svartkanin
Copy link
Collaborator Author

@Torxed I'll close this for now, maybe we can come back in the future to it :)
I haven't used black in a while, might have a look at that one if it got more flexible and configurable

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants