Update pip, & prep for customization #70

Open
wants to merge 12 commits into
base: master

aalexanderr (Contributor) commented Aug 26, 2021

Benefits:

  • makes it easier to change the parallelization so that downloads return futures, letting fetchcode users check whether a download succeeded
  • makes it easier to move from interactively prompting the user for input to class attributes or method arguments
  • the first commit message on this branch has a few additional notes on the reasoning behind updating pip
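
As a rough illustration of the first benefit, downloads could hand back futures along these lines. This is only a sketch: `submit_downloads` and `fake_download` are hypothetical names that do not exist in fetchcode or pip, and the real download callable would be whatever the vcs module ends up exposing.

```python
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Callable, Dict, Iterable


def submit_downloads(
    urls: Iterable[str],
    download: Callable[[str], str],
    max_workers: int = 4,
) -> Dict[str, Future]:
    """Start one download per URL and return a future for each (hypothetical helper)."""
    executor = ThreadPoolExecutor(max_workers=max_workers)
    futures = {url: executor.submit(download, url) for url in urls}
    # Stop accepting new work without blocking; already submitted tasks still run.
    executor.shutdown(wait=False)
    return futures


def fake_download(url: str) -> str:
    """Stand-in for a real VCS checkout, used only for this example."""
    if not url.startswith("git+"):
        raise ValueError(f"unsupported URL: {url}")
    return "/tmp/checkout/" + url.rsplit("/", 1)[-1]


futures = submit_downloads(
    ["git+https://example.com/a.git", "svn://example.com/b"], fake_download
)
for url, future in futures.items():
    error = future.exception()  # blocks until this download has finished
    print(url, "failed" if error else "ok")
```

The caller decides when to block on each future, so a failed download surfaces as an exception on that future instead of being lost inside the worker.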

Questions:
What do the licenses in https://github.com/nexB/fetchcode/tree/master/src/fetchcode/vcs pertain to? I mean the ones named just {license-identifier}.LICENSE. Should they be removed once _vendor is removed?

I've checked previous PRs and saw that there were concerns about downloading things in tests. I hope pip's setup is OK:

  • the test package has not changed since 2013 (https://github.com/pypa/pip-test-package)
  • for CI purposes, all tests requiring network access are marked as flaky so that they are rerun on failure (when the CI env is set)
  • some vcs fetchers are run only when the CI env is set
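
The rerun-on-failure behaviour can be pictured with a plain-Python decorator. This is a sketch of the idea only, not the mechanism used on this branch (which is not shown here); `rerun_on_ci` is a hypothetical name, and reading the `CI` environment variable is an assumption based on the wording above.

```python
import functools
import os


def rerun_on_ci(reruns: int = 2):
    """Retry a failing test, but only when the CI environment variable is set."""

    def decorator(test_func):
        @functools.wraps(test_func)
        def wrapper(*args, **kwargs):
            # Outside CI there is exactly one attempt, so failures show up at once.
            attempts = 1 + (reruns if os.environ.get("CI") else 0)
            for attempt in range(attempts):
                try:
                    return test_func(*args, **kwargs)
                except Exception:
                    if attempt == attempts - 1:
                        raise  # out of attempts: report the last failure

        return wrapper

    return decorator


@rerun_on_ci(reruns=2)
def test_needs_network():
    # A real test would talk to a remote repository here.
    assert True
```

Locally a network failure is reported on the first attempt, while on CI a transient failure gets up to `reruns` extra attempts before the test is marked as failed.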

Please note that fetchcode's vcs might not work on this branch.
Initially pip was committed without its history and with a few changes
applied. This update takes a different approach: pip's code is committed
in a single commit and changes are applied on top of it in separate commits.
While much of pip's code will be stripped from this repository,
the goal is to make it easier to take changes from upstream,
even after the code has been modified.
While git-subtree could be used, it brings its own set of issues.

The update should also make it easier to track issues that may be
replicated from pip when taking its vcs package.

It also made cleaning up easier, due to some maintenance activities done
in pip:
- dropping Python 2 & 3.5 support
  pypa/pip#9189
- modernized code after above - partially done, tracked in:
  pypa/pip#8802
- added py3.9 support
- updated vendored libraries (e.g. fixing CVE-2021-28363)
  multiple PRs

pip._internal.vcs (and related code) changes between
20.1.1 and 21.2.4
- Fetch resources that are missing locally:
  pypa/pip#8817
- Improve SVN version parser (Windows)
  pypa/pip#8665
- Always close stderr after subprocess completion:
  pypa/pip#9156
- Remove vcs export feature:
  pypa/pip#9713
- Remove support for git+ssh@ scheme in favour of git+ssh://
  pypa/pip#9436
- Security fix in git tags parsing (CVE-2021-3572):
  pypa/pip#9827
- Reimplement Git version parsing:
  pypa/pip#10117

In the next commits, most of pip's internals will be removed from fetchcode,
leaving only the vcs module with its supporting code (such as utility
functions and tests, which will be submitted alongside this change).

This will allow for changes such as the ability to add return codes
(probably via futures) for long-running downloads, among other features.

Switching to our own vcs module might also be a good call because
pip._internal.vcs is integrated with pip's CLI (some pip code has been
commented out in the commit mentioned below).

While copy-pasting code (rather than using submodules, subtrees, etc.)
generally makes it harder to track, my git-fu is not good enough for me
to attempt regrafting the subset of pip's history that matters from
fetchcode's perspective.
It has been agreed with @pombredanne & @TG1999 that history from pip
will be regrafted on fetchcode by @pombredanne (thanks!). It will be done
only for the files that are of concern for fetchcode to limit noise in
git history.

The code submitted in this commit is the work of pip's many authors,
who are listed here:
https://github.com/pypa/pip/blob/21.2.4/AUTHORS.txt

Pip is licensed under MIT (https://pypi.org/project/pip/)

Signed-off-by: Alexander Mazuruk <a.mazuruk@samsung.com>
Signed-off-by: Alexander Mazuruk <a.mazuruk@samsung.com>
To reproduce:
find src/fetchcode/vcs/pip -type f -name '*.py' \
    -exec sed -i 's/from\ pip._/from\ fetchcode.vcs.pip._/g' {} + \
    -exec sed -i 's/import\ pip._/import\ fetchcode.vcs.pip._/g' {} + \
    -exec sed -i 's/"pip._/"fetchcode.vcs.pip._/g' {} +
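
For readers who prefer to see the rewrite as code, the same three substitutions can be sketched in Python. `rewrite_imports` is a hypothetical helper, not part of fetchcode; unlike the sed expressions, it escapes the dot in `pip._`, so it matches a literal `.` only.

```python
import re

# (pattern, replacement) pairs mirroring the three sed expressions above.
_REWRITES = [
    (re.compile(r"from pip\._"), "from fetchcode.vcs.pip._"),
    (re.compile(r"import pip\._"), "import fetchcode.vcs.pip._"),
    (re.compile(r'"pip\._'), '"fetchcode.vcs.pip._'),
]


def rewrite_imports(source: str) -> str:
    """Point pip-internal imports and module strings at fetchcode.vcs.pip."""
    for pattern, replacement in _REWRITES:
        source = pattern.sub(replacement, source)
    return source


print(rewrite_imports("from pip._internal.vcs import vcs"))
```

None of the replacement strings match any of the patterns again, so running the rewrite twice over the same file leaves it unchanged.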

This is very similar to what was done in:
8046215

Signed-off-by: TG1999 <tushar.goel.dav@gmail.com>
Signed-off-by: Alexander Mazuruk <a.mazuruk@samsung.com>
There is still some pip-specific code left, but it should be a lot easier
to further clean up and adjust pip's vcs module for fetchcode's needs.

Additionally, init the logger in pip's root pkg.

Signed-off-by: Alexander Mazuruk <a.mazuruk@samsung.com>
Steps to reproduce:
mkdir -p /tmp/pipdiff && cd /tmp/pipdiff && \
git clone --depth 1 --branch 21.1.4 https://github.com/pypa/pip.git && \
cd pip && \
find . -type f -name '*.py' \
        -exec sed -i 's/from\ pip._/from\ fetchcode.vcs.pip._/g' {} + \
        -exec sed -i 's/import\ pip._/import\ fetchcode.vcs.pip._/g' {} + \
        -exec sed -i 's/"pip._/"fetchcode.vcs.pip._/g' {} + && \
cd /tmp/pipdiff && \
git clone https://github.com/nexB/fetchcode.git && \
cd fetchcode && \
git checkout ab65b2e && \
cd /tmp/pipdiff && \
diff -Naur fetchcode/src/fetchcode/vcs/pip pip/src/pip

This commit contains changes reproduced from the difference between a copy
of pip's source and 8046215, thus its author's SoB was added.

Signed-off-by: TG1999 <tushar.goel.dav@gmail.com>
Signed-off-by: Alexander Mazuruk <a.mazuruk@samsung.com>
Obtain no longer supports passing just the vcs type.

Git has its own protocol, so git:// was changed to git+git://.
bzr & hg do not have their own protocols, so the file protocol was
used.
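
The URL shape this implies is `<vcs>+<transport>://...`, as in pip's VCS URLs. A minimal sketch of building such URLs follows; `to_vcs_url` is a hypothetical helper (not fetchcode or pip API), and falling back to `file://` for bare local paths is an assumption based on the note above.

```python
def to_vcs_url(vcs: str, location: str) -> str:
    """Build a `<vcs>+<transport>://` URL (hypothetical helper)."""
    prefix = f"{vcs}+"
    if location.startswith(prefix):
        return location  # already carries the VCS prefix
    if "://" in location:
        return prefix + location  # e.g. git:// becomes git+git://
    # Bare local path: assume the file transport.
    return f"{prefix}file://{location}"


print(to_vcs_url("git", "git://example.com/repo.git"))
print(to_vcs_url("hg", "/srv/repos/project"))
```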

Signed-off-by: Alexander Mazuruk <a.mazuruk@samsung.com>
As the rest of fetchcode does not use vendoring, it has been dropped
altogether.

Signed-off-by: Alexander Mazuruk <a.mazuruk@samsung.com>
WARNING: tests will fail on this commit.
The next commit will fix import paths and remove unnecessary code so that
the tests work.

The file structure has been modified from pip's; below is the mapping of
fetchcode paths to pip paths. Files were copied without any modifications
in this commit. Paths are relative to the respective repository roots:
{
"tests/conftest.py":"tests/conftest.py",
"tests/test_vcs_pip_bazaar.py":"tests/functional/test_vcs_bazaar.py",
"tests/test_vcs_pip_git.py":"tests/functional/test_vcs_git.py",
"tests/test_vcs_pip_mercurial.py":"tests/functional/test_vcs_mercurial.py",
"tests/test_vcs_pip_subversion.py":"tests/functional/test_vcs_subversion.py",
"tests/test_vcs_pip.py":"tests/unit/test_vcs.py",
"tests/test_vcs_pip_mercurial_unit.py":"tests/unit/test_vcs.py",
"tests/lib/__init__.py":"tests/lib/__init__.py",
"tests/lib/git_submodule_helpers.py":"tests/lib/git_submodule_helpers.py",
"tests/lib/local_repos.py":"tests/lib/local_repos.py",
"tests/lib/path.py":"tests/lib/path.py",
}
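
A copy driven by such a mapping could look roughly like this. It is a sketch under assumptions: `copy_mapped_files` is a hypothetical helper, and the two root directories are whatever local checkouts of fetchcode and pip are at hand.

```python
import shutil
from pathlib import Path
from typing import Dict


def copy_mapped_files(
    mapping: Dict[str, str], pip_root: Path, fetchcode_root: Path
) -> None:
    """Copy each pip file to its fetchcode location without modifying it."""
    for fetchcode_rel, pip_rel in mapping.items():
        destination = fetchcode_root / fetchcode_rel
        # Create tests/lib and similar directories on demand.
        destination.parent.mkdir(parents=True, exist_ok=True)
        shutil.copyfile(pip_root / pip_rel, destination)
```

Because the mapping is keyed by the fetchcode path, one pip file (such as tests/unit/test_vcs.py) can be the source for more than one fetchcode file.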

It has been agreed with @pombredanne & @TG1999 that history from pip
will be rebased on fetchcode by @pombredanne (thanks!). It will be done
only for the files that are of concern to fetchcode.

I'm intentionally leaving this commit without an SoB, as this is not my
work but that of pip's many authors:
https://github.com/pypa/pip/blob/21.2.4/AUTHORS.txt
License of pip: MIT (https://pypi.org/project/pip/)

add conftest

Signed-off-by: Alexander Mazuruk <a.mazuruk@samsung.com>

Add dependencies for pip's vcs tests

Signed-off-by: Alexander Mazuruk <a.mazuruk@samsung.com>

Fix imports in tests

find tests -type f -name '*.py' \
    -exec sed -i 's/from\ pip._/from\ fetchcode.vcs.pip._/g' {} + \
    -exec sed -i 's/import\ pip._/import\ fetchcode.vcs.pip._/g' {} + \
    -exec sed -i 's/"pip._/"fetchcode.vcs.pip._/g' {} + \
    -exec sed -i "s/'pip._/'fetchcode.vcs.pip._/g" {} +

Signed-off-by: Alexander Mazuruk <a.mazuruk@samsung.com>

Fix pip tests to run in fetchcode

Fix pytest opts
Register pytest.markers used by pip's tests
Remove unused markers
Fix tests to run in CI

Signed-off-by: Alexander Mazuruk <a.mazuruk@samsung.com>
Signed-off-by: Alexander Mazuruk <a.mazuruk@samsung.com>
Copyright notice is based on original pip copyright:
	Copyright @ 2008-2021 The pip developers (see AUTHORS.txt file). All
	rights reserved.
The year was removed as unnecessary, and the AUTHORS file name was
adjusted.

Signed-off-by: Alexander Mazuruk <a.mazuruk@samsung.com>
Signed-off-by: Alexander Mazuruk <a.mazuruk@samsung.com>
Signed-off-by: Alexander Mazuruk <a.mazuruk@samsung.com>
TG1999 (Member) commented Nov 21, 2021

Hey @aalexanderr, it's a big diff. Can you please point out the files that need special attention here? It will help us review this PR.
