Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

test and build wheels for Py3.{7,8,9,10} #3298

Merged
merged 50 commits into from
Mar 18, 2022
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
50 commits
Select commit Hold shift + click to select a range
a1b5672
test and build wheels for Py3.{7,8,9,10}
mpenkov Feb 26, 2022
e715d4a
fix yaml
mpenkov Feb 26, 2022
707042c
upgrade pip inside the wheel building environment
mpenkov Feb 26, 2022
4ed3da4
use xfail with a broader filter instead of skip
mpenkov Feb 26, 2022
0b2847b
debugging pip issues with Py3.10
mpenkov Feb 26, 2022
2ea6638
upgrade setuptools as well as pip
mpenkov Feb 26, 2022
d10b0cd
fix xfail
mpenkov Feb 26, 2022
f57d1d6
more multibuild wizardry
mpenkov Feb 26, 2022
8b7691a
override install_run
mpenkov Feb 26, 2022
ca45cb8
Revert "override install_run"
mpenkov Feb 26, 2022
ce47cb8
try a different way to upgrade pip install the docker container
mpenkov Feb 26, 2022
6fb2f1e
try to upgrade pip inside the docker container
mpenkov Feb 26, 2022
b77a435
Revert "try to upgrade pip inside the docker container"
mpenkov Feb 26, 2022
eedd8ea
yet another attempt to upgrade pip
mpenkov Feb 26, 2022
9efd72b
replace curl with python call
mpenkov Feb 26, 2022
c369822
remove 3.6 from wheel build matrix
mpenkov Feb 26, 2022
fe5625b
adjust config.sh
mpenkov Feb 26, 2022
3eeac90
urlretrieve script
mpenkov Feb 26, 2022
18dce86
improve urlretrieve.py hack
mpenkov Feb 27, 2022
b16fd0c
argh, no f-strings
mpenkov Feb 27, 2022
5dd1d33
upgrade setuptools as well
mpenkov Feb 27, 2022
947aef5
remove .egg files before install_run
mpenkov Feb 27, 2022
d3130df
try upgrading importlib_metadata
mpenkov Feb 27, 2022
3084e96
more .egg file removal
mpenkov Feb 27, 2022
e539ec3
disable some builds for faster turn-around while hacking
mpenkov Feb 27, 2022
0ed8566
disable some builds for faster turn-around while hacking
mpenkov Feb 27, 2022
9be3caf
update build-wheels.yml
mpenkov Feb 27, 2022
31ec423
update numpy for py3.10
mpenkov Feb 27, 2022
fdd2f1b
get rid of morfessor in wheel build
mpenkov Feb 27, 2022
09c4549
trim TEST_DEPENDS
mpenkov Feb 27, 2022
299fd7d
clean up, lock scipy version for py310 wheel builds
mpenkov Feb 27, 2022
21b5fd9
upgrade manylinux version for Py310 linux build
mpenkov Feb 27, 2022
acfa0f1
bump Cython version to 0.29.28
mpenkov Feb 27, 2022
5fbc7be
adjust build_wheels workflow
mpenkov Mar 11, 2022
1c8966b
double quotes bad, single quotes good
mpenkov Mar 11, 2022
ecbf41c
re-enable all wheel builds
mpenkov Mar 11, 2022
367765f
re-enable all wheel builds
mpenkov Mar 11, 2022
bd2c276
separate test step for wheels
mpenkov Mar 11, 2022
06fe0d9
fix yaml
mpenkov Mar 11, 2022
69b8eba
argh tabs dammit
mpenkov Mar 11, 2022
3cc0365
delete old test step
mpenkov Mar 11, 2022
3b6b71f
fixup
mpenkov Mar 11, 2022
0a2ca7c
actually install wheel prior to test
mpenkov Mar 11, 2022
526b37a
copy pytest command from config.sh
mpenkov Mar 11, 2022
9cef296
do the tests really fail on Linux Py3.10?
mpenkov Mar 11, 2022
74cb490
fix testing under Windows
mpenkov Mar 12, 2022
2ca0964
mark blinking test as always xfail
mpenkov Mar 12, 2022
7571885
remove unused import
mpenkov Mar 12, 2022
2ca00ae
make wheel tests simpler
mpenkov Mar 12, 2022
a3c909f
re-enable tests, increase tolerance
mpenkov Mar 18, 2022
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
85 changes: 64 additions & 21 deletions .github/workflows/build-wheels.yml
Original file line number Diff line number Diff line change
Expand Up @@ -17,7 +17,7 @@ jobs:
strategy:
fail-fast: false
matrix:
python-version: [3.6, 3.7, 3.8, 3.9]
python-version: ['3.7', '3.8', '3.9', '3.10']
os: [ubuntu-latest, macos-latest, windows-latest]
platform: [x64]
include:
Expand All @@ -43,11 +43,6 @@ jobs:
# https://github.com/scipy/oldest-supported-numpy/blob/master/setup.cfg
# with the exception that we enforce the minimum version to be 1.17.0.
#
- os: ubuntu-latest
manylinux-version: 2010
python-version: 3.6
build-depends: numpy==1.17.0

- os: ubuntu-latest
manylinux-version: 2010
python-version: 3.7
Expand All @@ -63,11 +58,10 @@ jobs:
python-version: 3.9
build-depends: numpy==1.19.3

- os: macos-latest
travis-os-name: osx
manylinux-version: 1
python-version: 3.6
build-depends: numpy==1.17.0
- os: ubuntu-latest
manylinux-version: 2014
python-version: "3.10"
build-depends: numpy==1.22.2 scipy==1.8.0

- os: macos-latest
travis-os-name: osx
Expand All @@ -87,10 +81,11 @@ jobs:
python-version: 3.9
build-depends: numpy==1.19.3

- os: windows-latest
manylinux-version: 2010
python-version: 3.6
build-depends: numpy==1.17.0
- os: macos-latest
travis-os-name: osx
manylinux-version: 1
python-version: "3.10"
build-depends: numpy==1.22.2 scipy==1.8.0

- os: windows-latest
manylinux-version: 2010
Expand All @@ -107,14 +102,19 @@ jobs:
python-version: 3.9
build-depends: numpy==1.19.3

- os: windows-latest
manylinux-version: 2010
python-version: "3.10"
build-depends: numpy==1.22.2 scipy==1.8.0

env:
PKG_NAME: gensim
REPO_DIR: gensim
BUILD_COMMIT: HEAD
PLAT: x86_64
UNICODE_WIDTH: 32
MB_PYTHON_VERSION: ${{ matrix.python-version }} # MB_PYTHON_VERSION is needed by Multibuild
TEST_DEPENDS: Morfessor==2.0.2a4 python-levenshtein==0.12.0 visdom==0.1.8.9 pytest pytest-cov mock cython nmslib pyemd testfixtures scikit-learn pyemd
TEST_DEPENDS: pytest mock testfixtures
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will removing pyemd (& perhaps others, as well) mean that certain tests we want run are skipped because of missing-libraries?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's possible, but this is for the purpose of testing the wheel, so running tests that require those dependencies isn't strictly necessary.

DOCKER_TEST_IMAGE: multibuild/xenial_x86_64
TRAVIS_OS_NAME: ${{ matrix.travis-os-name }}
SKIP_NETWORK_TESTS: 1
Expand Down Expand Up @@ -144,7 +144,7 @@ jobs:
run: |
python -m pip install --upgrade pip
pip install virtualenv
- name: Build and Install Wheels (Multibuild)
- name: Build Wheel (Multibuild)
if: matrix.os != 'windows-latest'
run: |
echo ::group::Set up Multibuild
Expand All @@ -156,17 +156,16 @@ jobs:
before_install
echo ::endgroup::
echo ::group::Build wheel
find . -type f -name "*.egg" -exec rm -v {} \;
build_wheel $REPO_DIR ${{ matrix.PLAT }}
echo ::endgroup::
echo ::group::Install run
install_run ${{ matrix.PLAT }}
echo ::endgroup::

#
# We can't use multibuild on Windows, so we have to roll our own build script.
# Adapted from
# https://github.com/RaRe-Technologies/gensim-wheels/commit/084b863390edee05bbe15d4ec05d1ab726e52202
#
- name: Build and Install Wheels (Windows)
- name: Build Wheel (Windows)
if: matrix.os == 'windows-latest'
run: |
echo ::group::Set up dependencies
Expand All @@ -190,6 +189,50 @@ jobs:
#
mv dist wheelhouse

- name: Prepare for testing
run: |
#
# FIXME: Why are these eggs here?
#
# These eggs prevent the wheel from building and running on Py3.10
#
find . -type f -name "*.egg" -exec rm -v {} \;
python -m venv test_environment

#
# Multibuild has a test step but it essentially just installs the wheel
# and runs the test, and requires a lot of magic to get it working.
# It also does not work under Windows.
# So, we create our own simple test step here.
#
- name: Install and Test Wheel (Linux, MacOS)
if: matrix.os != 'windows-latest'
run: |
. test_environment/bin/activate
pip install pytest testfixtures mock
pip install wheelhouse/*.whl
cd test_environment
python -c 'import gensim;print(gensim.__version__)'
#
# This part relies on the wheel containing tests and required data.
# If we remove that from the wheel, we'll need to rewrite this step.
#
pytest -rfxEXs --durations=20 --disable-warnings --showlocals --pyargs gensim

#
# We need a separate testing step for windows because the command for
# activating the virtual environment is slightly different
#
- name: Install and Test Wheel (Windows)
if: matrix.os == 'windows-latest'
run: |
test_environment/Scripts/activate.bat
pip install pytest testfixtures mock
pip install wheelhouse/*.whl
cd test_environment
python -c 'import gensim;print(gensim.__version__)'
pytest -rfxEXs --durations=20 --disable-warnings --showlocals --pyargs gensim

- name: Upload wheels to s3://gensim-wheels
#
# Only do this if the credentials are set.
Expand Down
4 changes: 4 additions & 0 deletions .github/workflows/tests.yml
Original file line number Diff line number Diff line change
Expand Up @@ -19,8 +19,12 @@ jobs:
- {name: Linux, python: 3.7, os: ubuntu-20.04, tox: 'flake8,flake8-docs'}
- {name: Linux, python: 3.7, os: ubuntu-20.04, tox: 'py37-linux'}
- {name: Linux, python: 3.8, os: ubuntu-20.04, tox: 'py38-linux-cov'}
- {name: Linux, python: 3.9, os: ubuntu-20.04, tox: 'py39-linux'}
- {name: Linux, python: '3.10', os: ubuntu-20.04, tox: 'py310-linux'}
- {name: Windows, python: 3.7, os: windows-2019, tox: 'py37-win'}
- {name: Windows, python: 3.8, os: windows-2019, tox: 'py38-win'}
- {name: Windows, python: 3.9, os: windows-2019, tox: 'py39-win'}
- {name: Windows, python: '3.10', os: windows-2019, tox: 'py310-win'}
env:
TOX_PARALLEL_NO_SPINNER: 1

Expand Down
14 changes: 13 additions & 1 deletion config.sh
Original file line number Diff line number Diff line change
Expand Up @@ -31,8 +31,20 @@ function build_wheel_cmd {
function run_tests {
# Runs tests on installed distribution from an empty directory
set -x
python --version
pip freeze
pytest -rfxEXs --durations=20 --disable-warnings --showlocals --pyargs gensim
set +x
}

#
# We do this here because we want to upgrade pip before the wheel gets installed.
# docker_test_wrap.sh sources this file before the wheel install. The sourcing
# happens from multiple places, and some of the Python versions can be really
# ancient (e.g. when working outside a virtual environment, using the default
# Python install).
#
# We don't use pip to do the actual upgrade because something appears broken
# with the default pip on the Python 3.10 multibuild image. This is really
# dodgy, but I couldn't work out a better way to get this done.
#
python continuous_integration/upgrade_pip_py310.py
10 changes: 10 additions & 0 deletions continuous_integration/upgrade_pip_py310.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,10 @@
# This script needs to be able run under both Python 2 and 3 without crashing
# It only achieves the desired effect under Py3.10 on Linux and MacOS.
import subprocess
import sys
import tempfile
if sys.platform in ('linux', 'darwin') and sys.version_info[:2] == (3, 10):
import urllib.request
with tempfile.NamedTemporaryFile(suffix='.py') as fout:
urllib.request.urlretrieve("https://bootstrap.pypa.io/get-pip.py", fout.name)
subprocess.call([sys.executable, fout.name])
Copy link
Collaborator

@gojomo gojomo Mar 14, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From a security-in-depth perspective, fetching & immediately running unversioned raw code off the web spooks me, & I'd avoid it if at all possible. Sure, this appears to be from the right pypa.io domain, & thus reflecting the software of a team/project (Python Packaging Authority) which we're already implicitly trusting via things like pip/PyPI installs that the same people/processes control.

But I know the PyPI repository is both a vulnerability point, & a major focus of attackers, and thus also a place where (I hope & expect) code-provenance & system-security matters are taken very seriously. Maybe pypa.io is as rigorous – they should be! – but the naked https://bootstrap.pypa.io/ directory isn't quite offering the signals-of-rigor I'd most like to see.

Maybe the added risk is considered acceptable, maybe we have to use this as a hacky workaround for a while – but we should try to retire it the moment it's not necessary, and we could take extra steps to ensure we're getting a consistent version of get-pip.py. For example: we could grab it from the github project by version-hash at https://github.com/pypa/get-pip/blob/main/public/get-pip.py, to be sure we only get a new one when we choose to roll forward – so our builds always use the exact version we verified OK during testing of this PR, nothing later (either improved or compromised). If this in fact misses a useful true update, causing something to break someday – maybe that's OK, because that's enough time passed to double-check if this kludge is still required.

7 changes: 4 additions & 3 deletions gensim/test/test_translation_matrix.py
Original file line number Diff line number Diff line change
Expand Up @@ -6,6 +6,7 @@
import logging

import numpy as np
import pytest

from scipy.spatial.distance import cosine
from gensim.models.doc2vec import Doc2Vec
Expand Down Expand Up @@ -61,9 +62,9 @@ def test_translate_nn(self):
for idx, item in enumerate(self.test_word_pairs):
self.assertTrue(item[1] in translated_words[item[0]])

@unittest.skipIf(
(sys.version_info.major == 3) and (sys.version_info.minor == 9) and (sys.platform == 'darwin'),
'blinking test, can be related to <https://github.com/RaRe-Technologies/gensim/issues/2977>'
@pytest.mark.xfail(
sys.platform == 'darwin',
reason='blinking test, can be related to <https://github.com/RaRe-Technologies/gensim/issues/2977>'
)
def test_translate_gc(self):
# Test globally corrected neighbour retrieval method
Expand Down
2 changes: 1 addition & 1 deletion gensim/test/test_word2vec.py
Original file line number Diff line number Diff line change
Expand Up @@ -834,7 +834,7 @@ def test_parallel(self):
# the exact vectors and therefore similarities may differ, due to different thread collisions/randomization
# so let's test only for top10
neighbor_rank = [word for word, sim in sims].index(expected_neighbor)
self.assertLess(neighbor_rank, 2)
self.assertLess(neighbor_rank, 3)

def test_r_n_g(self):
"""Test word2vec results identical with identical RNG seed."""
Expand Down
4 changes: 1 addition & 3 deletions setup.py
Original file line number Diff line number Diff line change
Expand Up @@ -270,7 +270,6 @@ def run(self):
'mock',
'cython',
'testfixtures',
'Morfessor>=2.0.2a4',
]

if not (sys.platform.lower().startswith("win") and sys.version_info[:2] >= (3, 9)):
Expand Down Expand Up @@ -319,13 +318,12 @@ def run(self):
# to build with any sane version of Cython, so we should update this pin
# periodically.
#
CYTHON_STR = 'Cython==0.29.23'
CYTHON_STR = 'Cython==0.29.28'

install_requires = [
NUMPY_STR,
'scipy >= 0.18.1',
'smart_open >= 1.8.1',
"dataclasses; python_version < '3.7'", # pre-py3.7 needs `dataclasses` backport for use of `dataclass` in doc2vec.py
]

setup_requires = [NUMPY_STR]
Expand Down
2 changes: 1 addition & 1 deletion tox.ini
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
[tox]
minversion = 2.0
envlist = {py36,py37,py38, py39}-{win,linux}, py38-linux-cov, flake8, docs, docs-upload, download-wheels, upload-wheels, test-pypi
envlist = {py37,py38,py39,py310}-{win,linux}, py38-linux-cov, flake8, docs, docs-upload, download-wheels, upload-wheels, test-pypi
skipsdist = True
platform = linux: linux
win: win64
Expand Down