BUILD: 2.1.0 not installable via installer because of duplicate files in wheel #54888

Closed
1 task done
mgorny opened this issue Aug 31, 2023 · 13 comments · Fixed by #55206
@mgorny
Contributor

mgorny commented Aug 31, 2023

Installation check

Platform

Linux-6.4.7-gentoo-dist-x86_64-AMD_Ryzen_5_3600_6-Core_Processor-with-glibc2.38

Installation Method

Built from source

pandas Version

2.1.0

Python Version

3.11.5

Installation Logs

Build log
$ python3.11 -m build -w
* Creating virtualenv isolated environment...
* Installing packages in isolated environment... (Cython>=0.29.33,<3, meson-python==0.13.1, meson==1.0.1, numpy>=1.22.4; python_version>='3.12', oldest-supported-numpy>=2022.8.16; python_version<'3.12', versioneer[toml], wheel)
* Getting build dependencies for wheel...
* Building wheel...
+ meson setup /tmp/pandas-2.1.0 /tmp/pandas-2.1.0/.mesonpy-vs0fc5ea/build -Dbuildtype=release -Db_ndebug=if-release -Db_vscrt=md --vsenv --native-file=/tmp/pandas-2.1.0/.mesonpy-vs0fc5ea/build/meson-python-native-file.ini
The Meson build system
Version: 1.0.1
Source dir: /tmp/pandas-2.1.0
Build dir: /tmp/pandas-2.1.0/.mesonpy-vs0fc5ea/build
Build type: native build
Project name: pandas
Project version: 2.1.0
C compiler for the host machine: ccache cc (gcc 13.2.1 "cc (Gentoo 13.2.1_p20230826 p7) 13.2.1 20230826")
C linker for the host machine: cc ld.bfd 2.41
C++ compiler for the host machine: ccache c++ (gcc 13.2.1 "c++ (Gentoo 13.2.1_p20230826 p7) 13.2.1 20230826")
C++ linker for the host machine: c++ ld.bfd 2.41
Cython compiler for the host machine: cython (cython 0.29.36)
Host machine cpu family: x86_64
Host machine cpu: x86_64
Program python found: YES (/tmp/build-env-qqfvpycz/bin/python)
Found pkg-config: /usr/bin/pkg-config (1.8.1)
Build targets in project: 53

pandas 2.1.0

  User defined options
    Native files: /tmp/pandas-2.1.0/.mesonpy-vs0fc5ea/build/meson-python-native-file.ini
    buildtype   : release
    b_ndebug    : if-release
    b_vscrt     : md

Found ninja-1.11.1 at /usr/bin/ninja
+ /usr/bin/ninja
[151/151] Linking target pandas/_libs/join.cpython-311-x86_64-linux-gnu.so
+ meson install --quiet --no-rebuild --destdir /tmp/pandas-2.1.0/.mesonpy-vs0fc5ea/install
[54/65] pandas/_libs
meson-python: warning: Duplicate name: 'pandas/_libs/__init__.py'
meson-python: warning: Duplicate name: 'pandas/_libs/tslibs/__init__.py'
[65/65] pandas/util
Successfully built pandas-2.1.0-cp311-cp311-linux_x86_64.whl
$ python3.11 -m installer --destdir /tmp/z dist/pandas-2.1.0-cp311-cp311-linux_x86_64.whl 
Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "/usr/lib/python3.11/site-packages/installer/__main__.py", line 98, in <module>
    _main(sys.argv[1:], "python -m installer")
  File "/usr/lib/python3.11/site-packages/installer/__main__.py", line 94, in _main
    installer.install(source, destination, {})
  File "/usr/lib/python3.11/site-packages/installer/_core.py", line 109, in install
    record = destination.write_file(
             ^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/site-packages/installer/destinations.py", line 207, in write_file
    return self.write_to_fs(scheme, path_, stream, is_executable)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/site-packages/installer/destinations.py", line 167, in write_to_fs
    raise FileExistsError(message)
FileExistsError: File already exists: /tmp/z/usr/lib/python3.11/site-packages/pandas/_libs/__init__.py
$ unzip -l dist/pandas-2.1.0-cp311-cp311-linux_x86_64.whl  | grep _libs/__init
      691  08-31-2023 04:57   pandas/_libs/__init__.py
      691  08-31-2023 04:57   pandas/_libs/__init__.py
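The duplicate entries shown by unzip can also be detected programmatically. A minimal sketch using only the standard-library zipfile module; the function name find_duplicate_members is hypothetical and not part of any tool mentioned in this report:

```python
import zipfile
from collections import Counter


def find_duplicate_members(wheel_path):
    """Return archive member names that occur more than once, sorted."""
    with zipfile.ZipFile(wheel_path) as zf:
        # namelist() has one entry per record in the archive,
        # so a file stored twice is listed twice.
        counts = Counter(zf.namelist())
    return sorted(name for name, n in counts.items() if n > 1)
```

Calling find_duplicate_members on the wheel path returns an empty list for a clean wheel, so the check is easy to wire into a build script.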
@mgorny added the Build and Needs Triage labels Aug 31, 2023
@thesamesam
Contributor

Note that pip will likely work as it seems to ignore duplicates, while installer rejects them.
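The difference in behavior can be illustrated with a small sketch. This is not the actual code of pip or installer; it only contrasts an overwrite-on-write policy with the refuse-if-exists policy that the FileExistsError in the traceback above suggests. Both function names are hypothetical.

```python
import os


def write_lenient(dest_dir, rel_path, data):
    """Write a file, silently replacing any earlier copy."""
    path = os.path.join(dest_dir, rel_path)
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "wb") as f:  # "w" truncates an existing file
        f.write(data)


def write_strict(dest_dir, rel_path, data):
    """Write a file, raising FileExistsError if it is already there."""
    path = os.path.join(dest_dir, rel_path)
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "xb") as f:  # "x" refuses to overwrite
        f.write(data)
```

With a wheel that stores pandas/_libs/__init__.py twice, the lenient writer succeeds on both copies, while the strict writer fails on the second one.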

@lithomas1
Member

Do you know if this issue is present on the uploaded wheels to PyPI?

Inspecting this wheel (the manylinux wheel for Python 3.11) https://files.pythonhosted.org/packages/d9/26/895a49ebddb4211f2d777150f38ef9e538deff6df7e179a3624c663efc98/pandas-2.1.0-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl,

I only see 1 __init__.py in _libs.

@lithomas1 added the Needs Info label and removed the Needs Triage label Aug 31, 2023
@mgorny
Contributor Author

mgorny commented Aug 31, 2023

Indeed the wheel from PyPI looks and works fine. I'll try digging in.

@mgorny
Contributor Author

mgorny commented Aug 31, 2023

Apparently the auditwheel repair step, run as part of the cibuildwheel pipeline, repacks the wheel, effectively hiding the problem.
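One way a repack can hide duplicates, sketched under the assumption that the wheel is unpacked to a directory and zipped again (nothing here says how auditwheel does this internally). A filesystem holds each path only once, so the second copy overwrites the first and the new archive lists it a single time. The helper name repack is hypothetical.

```python
import os
import tempfile
import zipfile


def repack(src_wheel, dst_wheel):
    """Unpack an archive to a temporary directory and zip it up again."""
    with tempfile.TemporaryDirectory() as tmp:
        with zipfile.ZipFile(src_wheel) as zf:
            zf.extractall(tmp)  # duplicate members land on the same path
        with zipfile.ZipFile(dst_wheel, "w", zipfile.ZIP_DEFLATED) as out:
            for root, _dirs, files in os.walk(tmp):
                for name in sorted(files):
                    full = os.path.join(root, name)
                    # Store paths relative to the unpack root, with "/" separators.
                    arcname = os.path.relpath(full, tmp).replace(os.sep, "/")
                    out.write(full, arcname)
```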

@mgorny
Contributor Author

mgorny commented Aug 31, 2023

Guessing by the warnings from meson install, I've removed the explicit install calls for __init__.py, i.e.:

diff --git a/pandas/_libs/meson.build b/pandas/_libs/meson.build
index f302c649bc..ddce9ea2d6 100644
--- a/pandas/_libs/meson.build
+++ b/pandas/_libs/meson.build
@@ -113,8 +113,4 @@ foreach ext_name, ext_dict : libs_sources
     )
 endforeach
 
-py.install_sources('__init__.py',
-                    pure: false,
-                    subdir: 'pandas/_libs')
-
 subdir('window')
diff --git a/pandas/_libs/tslibs/meson.build b/pandas/_libs/tslibs/meson.build
index 14d2eef46d..a862345c3a 100644
--- a/pandas/_libs/tslibs/meson.build
+++ b/pandas/_libs/tslibs/meson.build
@@ -30,7 +30,3 @@ foreach ext_name, ext_dict : tslibs_sources
         install: true
     )
 endforeach
-
-py.install_sources('__init__.py',
-                    pure: false,
-                    subdir: 'pandas/_libs/tslibs')

and got a correct wheel as a result.

@eli-schwartz
Contributor

subdirs_list = [
    '_config',
    '_libs',
    '_testing',
    'api',
    'arrays',
    'compat',
    'core',
    'errors',
    'io',
    'plotting',
    'tests',
    'tseries',
    'util'
]
foreach subdir: subdirs_list
    install_subdir(subdir, install_dir: py.get_install_dir(pure: false) / 'pandas')
endforeach

They're installed in bulk, and then manually installed again.

@lithomas1
Member

No, this is for _libs and _libs/tslibs, which aren't part of the subdirs_list, I think.

@eli-schwartz
Contributor

I see _libs in that list, though.

@lithomas1
Member

My bad, I must have typo'ed there.

Thanks for the catch!

@lithomas1 removed the Needs Info label Aug 31, 2023
@lithomas1 added this to the 2.1.1 milestone Aug 31, 2023
@mgorny
Contributor Author

mgorny commented Aug 31, 2023

OK, removing _libs from subdirs_list does indeed seem to be a viable solution as well. From a quick glance, though, it means installing fewer files.

Full list of missing files
-pandas/_libs/algos_common_helper.pxi.in
-pandas/_libs/algos.pxd
-pandas/_libs/algos.pyi
-pandas/_libs/algos.pyx
-pandas/_libs/algos_take_helper.pxi.in
-pandas/_libs/arrays.pxd
-pandas/_libs/arrays.pyi
-pandas/_libs/arrays.pyx
-pandas/_libs/byteswap.pyi
-pandas/_libs/byteswap.pyx
-pandas/_libs/dtypes.pxd
-pandas/_libs/groupby.pyi
-pandas/_libs/groupby.pyx
-pandas/_libs/hashing.pyi
-pandas/_libs/hashing.pyx
-pandas/_libs/hashtable_class_helper.pxi.in
-pandas/_libs/hashtable_func_helper.pxi.in
-pandas/_libs/hashtable.pxd
-pandas/_libs/hashtable.pyi
-pandas/_libs/hashtable.pyx
-pandas/_libs/include/pandas/datetime/date_conversions.h
-pandas/_libs/include/pandas/datetime/pd_datetime.h
-pandas/_libs/include/pandas/inline_helper.h
-pandas/_libs/include/pandas/parser/io.h
-pandas/_libs/include/pandas/parser/pd_parser.h
-pandas/_libs/include/pandas/parser/tokenizer.h
-pandas/_libs/include/pandas/portable.h
-pandas/_libs/include/pandas/skiplist.h
-pandas/_libs/include/pandas/vendored/klib/khash.h
-pandas/_libs/include/pandas/vendored/klib/khash_python.h
-pandas/_libs/include/pandas/vendored/numpy/datetime/np_datetime.h
-pandas/_libs/include/pandas/vendored/numpy/datetime/np_datetime_strings.h
-pandas/_libs/include/pandas/vendored/ujson/lib/ultrajson.h
-pandas/_libs/include/pandas/vendored/ujson/python/version.h
-pandas/_libs/index_class_helper.pxi.in
-pandas/_libs/indexing.pyi
-pandas/_libs/indexing.pyx
-pandas/_libs/index.pyi
-pandas/_libs/index.pyx
-pandas/_libs/internals.pyi
-pandas/_libs/internals.pyx
-pandas/_libs/interval.pyi
-pandas/_libs/interval.pyx
-pandas/_libs/intervaltree.pxi.in
-pandas/_libs/join.pyi
-pandas/_libs/join.pyx
-pandas/_libs/json.pyi
-pandas/_libs/khash_for_primitive_helper.pxi.in
-pandas/_libs/khash.pxd
-pandas/_libs/lib.pxd
-pandas/_libs/lib.pyi
-pandas/_libs/lib.pyx
-pandas/_libs/meson.build
-pandas/_libs/missing.pxd
-pandas/_libs/missing.pyi
-pandas/_libs/missing.pyx
-pandas/_libs/ops_dispatch.pyi
-pandas/_libs/ops_dispatch.pyx
-pandas/_libs/ops.pyi
-pandas/_libs/ops.pyx
-pandas/_libs/parsers.pyi
-pandas/_libs/parsers.pyx
-pandas/_libs/properties.pyi
-pandas/_libs/properties.pyx
-pandas/_libs/reshape.pyi
-pandas/_libs/reshape.pyx
-pandas/_libs/sas.pyi
-pandas/_libs/sas.pyx
-pandas/_libs/sparse_op_helper.pxi.in
-pandas/_libs/sparse.pyi
-pandas/_libs/sparse.pyx
-pandas/_libs/src/datetime/date_conversions.c
-pandas/_libs/src/datetime/pd_datetime.c
-pandas/_libs/src/parser/io.c
-pandas/_libs/src/parser/pd_parser.c
-pandas/_libs/src/parser/tokenizer.c
-pandas/_libs/src/vendored/numpy/datetime/np_datetime.c
-pandas/_libs/src/vendored/numpy/datetime/np_datetime_strings.c
-pandas/_libs/src/vendored/ujson/lib/ultrajsondec.c
-pandas/_libs/src/vendored/ujson/lib/ultrajsonenc.c
-pandas/_libs/src/vendored/ujson/python/JSONtoObj.c
-pandas/_libs/src/vendored/ujson/python/objToJSON.c
-pandas/_libs/src/vendored/ujson/python/ujson.c
-pandas/_libs/testing.pyi
-pandas/_libs/testing.pyx
-pandas/_libs/tslib.pyi
-pandas/_libs/tslib.pyx
-pandas/_libs/tslibs/base.pxd
-pandas/_libs/tslibs/base.pyx
-pandas/_libs/tslibs/ccalendar.pxd
-pandas/_libs/tslibs/ccalendar.pyi
-pandas/_libs/tslibs/ccalendar.pyx
-pandas/_libs/tslibs/conversion.pxd
-pandas/_libs/tslibs/conversion.pyi
-pandas/_libs/tslibs/conversion.pyx
-pandas/_libs/tslibs/dtypes.pxd
-pandas/_libs/tslibs/dtypes.pyi
-pandas/_libs/tslibs/dtypes.pyx
-pandas/_libs/tslibs/fields.pyi
-pandas/_libs/tslibs/fields.pyx
-pandas/_libs/tslibs/meson.build
-pandas/_libs/tslibs/nattype.pxd
-pandas/_libs/tslibs/nattype.pyi
-pandas/_libs/tslibs/nattype.pyx
-pandas/_libs/tslibs/np_datetime.pxd
-pandas/_libs/tslibs/np_datetime.pyi
-pandas/_libs/tslibs/np_datetime.pyx
-pandas/_libs/tslibs/offsets.pxd
-pandas/_libs/tslibs/offsets.pyi
-pandas/_libs/tslibs/offsets.pyx
-pandas/_libs/tslibs/parsing.pxd
-pandas/_libs/tslibs/parsing.pyi
-pandas/_libs/tslibs/parsing.pyx
-pandas/_libs/tslibs/period.pxd
-pandas/_libs/tslibs/period.pyi
-pandas/_libs/tslibs/period.pyx
-pandas/_libs/tslibs/strptime.pxd
-pandas/_libs/tslibs/strptime.pyi
-pandas/_libs/tslibs/strptime.pyx
-pandas/_libs/tslibs/timedeltas.pxd
-pandas/_libs/tslibs/timedeltas.pyi
-pandas/_libs/tslibs/timedeltas.pyx
-pandas/_libs/tslibs/timestamps.pxd
-pandas/_libs/tslibs/timestamps.pyi
-pandas/_libs/tslibs/timestamps.pyx
-pandas/_libs/tslibs/timezones.pxd
-pandas/_libs/tslibs/timezones.pyi
-pandas/_libs/tslibs/timezones.pyx
-pandas/_libs/tslibs/tzconversion.pxd
-pandas/_libs/tslibs/tzconversion.pyi
-pandas/_libs/tslibs/tzconversion.pyx
-pandas/_libs/tslibs/util.pxd
-pandas/_libs/tslibs/vectorized.pyi
-pandas/_libs/tslibs/vectorized.pyx
-pandas/_libs/util.pxd
-pandas/_libs/window/aggregations.pyi
-pandas/_libs/window/aggregations.pyx
-pandas/_libs/window/indexers.pyi
-pandas/_libs/window/indexers.pyx
-pandas/_libs/window/__init__.py
-pandas/_libs/window/meson.build
-pandas/_libs/writers.pyi
-pandas/_libs/writers.pyx

Most of them seem fine to skip but:

  • .pyi files are missing
  • pandas/_libs/window/__init__.py is missing too
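
A comparison like the one above can be reproduced by diffing the member lists of the two wheels and keeping only the file types that matter after installation. A minimal sketch with the standard-library zipfile module; the function names and the choice of suffixes (.py and .pyi) are assumptions made for illustration:

```python
import zipfile


def member_names(wheel_path):
    """Return the set of file names stored in a wheel."""
    with zipfile.ZipFile(wheel_path) as zf:
        return set(zf.namelist())


def missing_installed_files(old_wheel, new_wheel, suffixes=(".py", ".pyi")):
    """List files present in old_wheel but absent from new_wheel,
    restricted to names ending in one of the given suffixes."""
    missing = member_names(old_wheel) - member_names(new_wheel)
    return sorted(name for name in missing if name.endswith(suffixes))
```

Filtering by suffix drops sources such as .pyx, .pxd, .c and .h files, which are among the entries described above as fine to skip.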

@lithomas1 self-assigned this Sep 1, 2023
@mgorny
Contributor Author

mgorny commented Sep 17, 2023

Ping.

@lithomas1
Member

Sorry for the wait. I'll try and get something up tomorrow.

@mgorny
Contributor Author

mgorny commented Sep 20, 2023

Thanks. I can confirm that current git seems to install fine.
