BUG: validate_docstring uses main python executable instead of sys.executable #57219

@jrmylow

Pandas version checks

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of pandas.

  • I have confirmed this bug exists on the main branch of pandas.

Reproducible Example

Check:
Run pip freeze > t.out and confirm that flake8 is not installed in the main Python environment.

Run:
python scripts/validate_docstrings.py pandas.set_option()

Error:
Traceback (most recent call last):
  File "...\scripts\validate_docstrings.py", line 474, in <module>
    main(
  File "...\scripts\validate_docstrings.py", line 417, in main
    print_validate_one_results(func_name)
  File "...\scripts\validate_docstrings.py", line 386, in print_validate_one_results
    result = pandas_validate(func_name)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "...\scripts\validate_docstrings.py", line 262, in pandas_validate
    for error_code, error_message, line_number, col_number in doc.validate_pep8():
  File "...\scripts\validate_docstrings.py", line 211, in validate_pep8
    line_number, col_number, error_code, message = error_message.split(

Issue Description

When running the docstring checks in scripts/validate_docstrings.py, the script fails inside validate_pep8. The subprocess output says that flake8 is not installed, and because that message is not in the format the function expects, unpacking the result of .split() fails.
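To illustrate the parsing failure, here is a minimal sketch of a defensive parser. The function name `parse_lint_line` and the tab separator are assumptions made for this example; they are not taken from the pandas script, whose exact output format is not shown in the traceback above.

```python
def parse_lint_line(line: str, sep: str = "\t") -> tuple[int, int, str, str]:
    """Parse one linter output line into (line, column, code, message).

    Hypothetical helper: the separator and field order are assumptions
    chosen to mirror the four names unpacked in the traceback.
    """
    # Split into at most four fields so the message may contain the separator.
    parts = line.split(sep, maxsplit=3)
    if len(parts) != 4:
        # Output such as "No module named flake8" ends up here and produces
        # a readable error instead of a bare unpacking failure.
        raise ValueError(f"unexpected linter output: {line!r}")
    line_number, col_number, error_code, message = parts
    return int(line_number), int(col_number), error_code, message
```

With a check like this, a missing flake8 would surface as a message that quotes the unexpected output, which makes the underlying cause easier to spot.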

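The root cause is that validate_pep8 passes "python" instead of sys.executable to subprocess.run, so flake8 is looked up in whichever interpreter the name python resolves to rather than in the one running the script. My main Python environment is barebones and does not include flake8.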

Note that in other instances where subprocess.run is used in pandas, sys.executable is passed in.
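A minimal sketch of the difference, not the actual pandas code: the helper names `build_flake8_command` and `run_in_current_interpreter` are hypothetical, and the flake8 arguments are reduced to a single path for illustration.

```python
import subprocess
import sys


def build_flake8_command(path: str) -> list[str]:
    """Build a flake8 command bound to the interpreter running this script.

    Hypothetical helper. Using sys.executable instead of the bare name
    "python" means the subprocess sees the same installed packages
    (including flake8) as the calling script.
    """
    return [sys.executable, "-m", "flake8", path]


def run_in_current_interpreter(code: str) -> subprocess.CompletedProcess:
    """Run a snippet of Python code under sys.executable and capture output."""
    return subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
        check=False,  # leave return-code handling to the caller
    )
```

For example, `run_in_current_interpreter("import sys; print(sys.prefix)")` reports the environment of the child process, which can be compared with `sys.prefix` in the parent to confirm both use the same environment.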

Expected Behavior

The script should output the docstring and any validation errors as it normally does.

Installed Versions

INSTALLED VERSIONS

commit : 1d1672d
python : 3.12.1.final.0
python-bits : 64
OS : Windows
OS-release : 10
Version : 10.0.19045
machine : AMD64
processor : Intel64 Family 6 Model 140 Stepping 2, GenuineIntel
byteorder : little
LC_ALL : None
LANG : en_US.UTF-8
LOCALE : English_Australia.1252

pandas : 2.2.0.dev0+1177.g1d1672d1ec
numpy : 1.26.3
pytz : 2024.1
dateutil : 2.8.2
setuptools : 69.0.3
pip : 23.3.2
Cython : 3.0.8
pytest : 8.0.0
hypothesis : 6.97.4
sphinx : 7.2.6
blosc : None
feather : None
xlsxwriter : 3.1.9
lxml.etree : 5.1.0
html5lib : 1.1
pymysql : 1.4.6
psycopg2 : 2.9.9
jinja2 : 3.1.3
IPython : 8.21.0
pandas_datareader : None
adbc-driver-postgresql: None
adbc-driver-sqlite : None
bs4 : 4.12.3
bottleneck : 1.3.7
dataframe-api-compat : None
fastparquet : 2023.10.1
fsspec : 2023.12.2
gcsfs : 2023.12.2post1
matplotlib : 3.8.2
numba : 0.59.0
numexpr : 2.9.0
odfpy : None
openpyxl : 3.1.2
pandas_gbq : None
pyarrow : 15.0.0
pyreadstat : 1.2.6
python-calamine : None
pyxlsb : 1.0.10
s3fs : 2023.12.2
scipy : 1.12.0
sqlalchemy : 2.0.25
tables : 3.9.2
tabulate : 0.9.0
xarray : 2024.1.1
xlrd : 2.0.1
zstandard : 0.22.0
tzdata : 2023.4
qtpy : None
pyqt5 : None

Labels: Bug, CI (Continuous Integration)