Formatting of pandas expressions suboptimal for readability #662

Closed
clstaudt opened this issue Jan 8, 2019 · 4 comments

Comments

@clstaudt

clstaudt commented Jan 8, 2019

Arguing with the formatting decisions of black may be beside the point, and I am pretty happy with it, with the exception of one type of expression:

        trend = matching_items.groupby(pandas.Grouper(key="ERDAT", freq=timeframe))[
            "KWMENG"
        ].sum()

This is a typical pandas expression where some method calls are chained and fields of the dataframe are selected. Black seems to treat the field selection as a list and indents it on a new line. I think this is counterintuitive and does not improve readability. Is this intentional, or would treating it differently be a potential enhancement?

Operating system: macOS
Python version: 3.7.0
Black version: '18.9b0'

@tuchandra
Contributor

This seems related to #571 - I'll +1 this as something that seems kind of bizarre.

@swenzel

swenzel commented Jan 29, 2019

black won't reformat this unless your line is > line-length.
On one line, your statement would be 91 chars; the default line-length is 88.
There are two solutions for that: either increase line-length or refactor your code so you get multiple shorter lines. For example:

grouper = pandas.Grouper(key="ERDAT", freq=timeframe)
grouped_items = matching_items.groupby(grouper)
trend = grouped_items["KWMENG"].sum()

88 isn't much; I prefer 120. Nonetheless, I'd also do the refactoring. This way you can provide more descriptive code. I'm certain you can find better names for grouper and grouped_items, but you see what I mean.

Long lines with lots of in-line code and logic really aren't that great for readability...
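The length figure quoted above can be checked with a short sketch. It only counts characters in the statement from the issue, joined onto one line without indentation. The variable names `statement`, `DEFAULT_LINE_LENGTH`, `statement_length` and `exceeds_limit` are made up for this illustration, and nothing here calls Black itself.

```python
# The statement from the issue, joined onto a single line without indentation.
statement = 'trend = matching_items.groupby(pandas.Grouper(key="ERDAT", freq=timeframe))["KWMENG"].sum()'

# Default line length as stated in the comment above.
DEFAULT_LINE_LENGTH = 88

statement_length = len(statement)
exceeds_limit = statement_length > DEFAULT_LINE_LENGTH

print(statement_length, exceeds_limit)  # prints: 91 True
```

With the eight-space indentation shown in the original snippet, the line would be 99 characters, so it exceeds the limit either way.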

@clstaudt
Author

clstaudt commented Jan 29, 2019

"One of the hardest things in programming is naming things", and that applies especially to intermediate results in pandas transform chains. That's why you are going to have a hard time convincing pandas power users to adopt the unusual style you propose. I am going to disagree and say that chained method calls can be great for readability if formatted correctly.

Changing line length would be a workaround for this specific example, but I don't believe it solves the underlying issue.
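For comparison, here is a minimal sketch of the two styles discussed in this thread, run on made-up data. The sample values and the weekly `timeframe` are assumptions for illustration; only the column names come from the issue. The parenthesized layout is written by hand as one possible reading of "formatted correctly" and is not claimed to be what any Black version produces.

```python
import pandas

# Hypothetical sample data; only the column names are taken from the issue.
matching_items = pandas.DataFrame(
    {
        "ERDAT": pandas.to_datetime(
            ["2019-01-01", "2019-01-02", "2019-01-08", "2019-01-09"]
        ),
        "KWMENG": [1, 2, 3, 4],
    }
)
timeframe = "W"  # assumed weekly grouping

# Chained style, hand-formatted inside parentheses with one step per line.
trend_chained = (
    matching_items
    .groupby(pandas.Grouper(key="ERDAT", freq=timeframe))["KWMENG"]
    .sum()
)

# Refactored style with named intermediate results, as suggested earlier.
grouper = pandas.Grouper(key="ERDAT", freq=timeframe)
grouped_items = matching_items.groupby(grouper)
trend_refactored = grouped_items["KWMENG"].sum()

# Both styles compute the same series; only the layout differs.
same_result = trend_chained.equals(trend_refactored)
print(same_result)
```

Both variants stay under 88 characters per line, so the disagreement is about layout and naming rather than about line length.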

@zsol
Collaborator

zsol commented Feb 14, 2019

Let's keep the discussion in #571

zsol closed this as completed Feb 14, 2019