Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor: use division operator to join paths. #11413

Merged

Conversation

WarrenTheRabbit
Copy link
Contributor

@WarrenTheRabbit WarrenTheRabbit commented Sep 8, 2023

Starting with resolve_package_path() and its associated tests, this refactoring seeks to make path concatenation more readable and consistent.

To be consistent, all uses of .joinpath() need to be replaced with the / operator or the other way around. I have chosen to standardise the use of the / operator across resolve_package_path() and its tests because

  1. it is already used in the tests
  2. it is more succinct and readable
  3. casual profiling with %timeit didn't reveal a performance loss
    / operator: 273 µs ± 86.6 µs
    .joinpath(): 269 µs ± 99.4 µs

Eventual consistency within and across all files is the ultimate goal. This is achievable. There are currently 428 uses of .joinpath() across 45 files. Counting assignment expressions only, there are 111 uses of the / operator across 23 files. However, even if there is a usage pattern of 4:1 against the / operator, I do not think that makes standardising it an unwise choice. There
are gains in readability.

Profiling

/ operator
#operator_profiling.ipynb

from pathlib import Path
from tempfile import TemporaryDirectory

with TemporaryDirectory() as tmp_dir:
    tmp_dir = Path(tmp_dir)
    (tmp_dir / '__init__.py').touch()
    print("operator approach:")
    %timeit (tmp_dir / "something/__init__.py").is_file()
joinpath()
#joinpath_profiling.ipynb

from pathlib import Path
from tempfile import TemporaryDirectory

with TemporaryDirectory() as tmp_dir:
    tmp_dir = Path(tmp_dir)
    (tmp_dir / '__init__.py').touch()
    print(".joinpath approach:")
    %timeit tmp_dir.joinpath("something/__init__.py").is_file()

Housekeeping

  • Allow maintainers to push and squash when merging my commits. Please uncheck this if you prefer to squash the commits yourself.

Starting with `resolve_package_path` and its associated tests,
this refactoring seeks to make path concatenation more
readable and consistent.

To be consistent, all uses of `.joinpath()` need to be replaced
with the division operator or the other way around. I have chosen
to standardise the use of the division operator because
   1) it is already used in the tests
   2) it is more succinct and readable
   3) casual profiling with `%timeit` didn't reveal a performance loss
           division operator: 273 µs ± 86.6 µs
                    `.joinpath()`: 269 µs ± 99.4 µs

Eventual consistency within and across all files is the ultimate goal.
This is not unachievable. There are currently 428 uses of `.joinpath()`
across 45 files. Counting assignment expressions only, there are 111
uses of the division operator across 23 files. This is fewer. However,
even if there is a usage pattern of 4:1 against the division operator,
I do not think that makes standardising it an unwise choice.
Copy link
Member

@RonnyPfannschmidt RonnyPfannschmidt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When its not part of the attribute/variable name, using operator's instead of the method name actually makes it Harder to read

@WarrenTheRabbit
Copy link
Contributor Author

WarrenTheRabbit commented Sep 8, 2023

@RonnyPfannschmidt: When its not part of the attribute/variable name, using operator's instead of the method name actually makes it Harder to read

Clarification

Are you saying that this

ambiguous_variable_name.joinpath(another_ambiguous_variable_name)

is better than

ambiguous_variable_name / another_ambiguous_variable_name

Observation

There is also

path.joinpath(*['this', 'does', 'work')
path / (*['this', 'does', 'not', 'work'])

so it would be impossible to replace all instances of .joinpath() with the / operator.

Question

Is there any legs to the idea of introducing some degree of standardisation when and where appropriate?

@RonnyPfannschmidt
Copy link
Member

There was no deep discussion so far

Personally id gravitate towards joinpath in internal code that uses many segments and any usage where the variable name isn't indicative

@WarrenTheRabbit
Copy link
Contributor Author

WarrenTheRabbit commented Sep 8, 2023

I understand your comment about variables and attributes now.

When the variable and attribute names are uninformative, using .joinpath as an internal default is reasonable.
I'd say the main advantage of joinpath is that it assumes no familiarity with an initially confusing overload.

Let's compare some examples of when semantically expressive names or attributes have been involved and joinpath() has been used:

# Simple single segment
1a.  root.joinpath(target)
1b.  root / target

# Complex single segment
2a. pytester.path.joinpath(os.path.basename(failure_demo))
2b. pytester.path / os.path.basename(failure_demo)


# Static multi-segment
3a. Path(__file__).parent.joinpath("../doc/en/announce/index.rst")
3b. Path(__file__).parent / "../doc/en/announce/index.rst"

# Dynamic multi-segment
4a. self._cachedir.joinpath(self._CACHE_PREFIX_DIRS, path)
4b. self._cachedir / self._CACHE_PREFIX_DIRS / path

# Complex Logic
5a. path.joinpath("Scripts" if sys.platform.startswith("win") else "bin")
5b. path / ("Scripts" if sys.platform.startswith("win") else "bin")

# Pathlib method chaining
6a. example_path.joinpath("__init__.py").is_file()
6b. (example_path / "__init__.py").is_file()

7a. sub1.joinpath("conftest.py").write_text("assert 0", encoding="utf-8")
7b. (sub1 / "conftest.py").write_text("assert 0", encoding="utf-8")

# Non-assignment expression
8a. new_args.append("--basetemp=%s" % self.path.parent.joinpath("basetemp"))
8b. new_args.append("--basetemp=%s" % self.path.parent / "basetemp")

# F-string
9a. temproot.joinpath(f"pytest-of-{user}")
9b. temproot / f"pytest-of-{user}"

# Breaking expressions over multiple lines
0       pytester.path.joinpath("t.py").write_text("def test(): pass", encoding="utf-8")

A1      (pytester.path
                .joinpath("t.py")
                .write_text("def test(): pass", encoding="utf-8"))
A2        ((pytester.path
                / "t.py")
                .write_text("def test(): pass", encoding="utf-8"))

B1      file_path = pytester.path.joinpath("t.py")
          file_path.write_text("def test(): pass", encoding="utf-8")
B2      file_path = pytester  / "t.py"
          file_path.write_text("def test(): pass", encoding="utf-8")

For me, I have a clear preference for joinpath() only when an iterable is passed in

path.joinpath(*iterable)

or when a fluent-like expression is broken over multiple lines

(pytester.path
             .joinpath("t.py")
             .write_text("def test(): pass", encoding="utf-8"))

In general, I don't have very strong feelings about joinpath() compared to /.
The strongest feeling I have may be heretical: I strongly prefer A1 over 0.
I often find / clearer. But mainly I would like to be consistent in application
for each case.

@nicoddemus
Copy link
Member

Hi folks,

My take on this is that we don't need to standardize this across the code base, as it really does not matter much IMO. I think one should use whatever is more readable for that particular code, without the need for us to set some kind of rule/standardization of using one over the other.

But mainly I would like to be consistent in application for each case.

I really would like to avoid having some kind of rule/enforcement here, as it would be one more thing to worry about when reviewing/writing code, and IMHO it is really not something important enough to worry about.

Having said that feel free to go over the code and replacing joinpath by / if you feel this improves that particular code.

@WarrenTheRabbit
Copy link
Contributor Author

WarrenTheRabbit commented Sep 8, 2023

Thank you for the input, @nicoddemus, and your comment about additional worry for something not that important is totally fair! I can certainly be guilty of thinking about things that aren't important.

@nicoddemus: Having said that feel free to go over the code and replacing joinpath by / if you feel this improves that particular code.

I will do that as I come across them. Especially if within a line of each other / is used but then joinpath is used for an otherwise identical structure; such as parent / child but then parent.joinpath(child). That is the main consistency I care the most about in truth. Flipping between the two approaches - / or joinpath - was adding extra mental overhead and challenged the eye to read rapidly.

@WarrenTheRabbit
Copy link
Contributor Author

WarrenTheRabbit commented Sep 8, 2023

@RonnyPfannschmidt: any more input? GitHub is telling me you've requested a change. I think the purpose of your change request was to open up the idea that "/ should be used across the entire codebase" to other viewpoints. But let me know if a material change to the current code is also needed.

To summarise the outcomes of your intervention so far, these are the conclusions I'll be using in future decisions around / and joinpath:

  • code is free to use either / and joinpath
  • consistency within a function is more important than consistency across the codebase
  • it is nice to use / when it is more readable
  • it is nice to use joinpath when there is little context
  • be mindful that joinpath may be clearer when joining multiple segments

@nicoddemus nicoddemus merged commit 71f265f into pytest-dev:main Sep 9, 2023
24 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants