# Pipstry - a little helper tool for non-developers to solve the pip mystery

If you are not a developer (e.g. a data scientist or analyst), you may never have used virtual environments, and one day you want to know which packages you actively installed and which ones are dependencies that were installed automatically. Good luck! Here is a simple Jupyter notebook to help you solve this pip mystery. And please, start using a virtual environment tool such as:

* [pipenv](https://github.com/pypa/pipenv)
* [venv](https://docs.python.org/3/library/venv.html)
* [virtualenv](https://virtualenv.pypa.io/en/stable/)
* [conda](https://docs.conda.io/projects/conda/en/latest/user-guide/tasks/manage-environments.html)

*Thanks to my friend [Alexander Hendorf](https://github.com/alanderex) for the inspiration*

You probably know that you can check which packages you have installed using `pip list`:

In [1]:
!pip list

Package            Version
------------------ -------
appnope            0.1.0  
attrs              19.1.0 
backcall           0.1.0  
bleach             3.1.0  
decorator          4.4.0  
defusedxml         0.6.0  
entrypoints        0.3    
ipykernel          5.1.1  
ipython            7.6.1  
ipython-genutils   0.2.0  
ipywidgets         7.5.0  
jedi               0.14.1 
Jinja2             2.10.1 
jsonschema         3.0.1  
jupyter            1.0.0  
jupyter-client     5.3.1  
jupyter-console    6.0.0  
jupyter-core       4.5.0  
MarkupSafe         1.1.1  
mistune            0.8.4  
nbconvert          5.5.0  
nbformat           4.4.0  
notebook           5.7.8  
pandocfilters      1.4.2  
parso              0.5.1  
pexpect            4.7.0  
pickleshare        0.7.5  
pip                19.1.1 
prometheus-client  0.7.1  
prompt-toolkit     2.0.9  
ptyprocess         0.6.0  
Pygments           2.4.2  
pyrsistent         0.15.3 
python-dateutil    2

Let's use this to capture the names of all the packages that you have installed:

In [2]:
import subprocess

In [3]:
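# Skip the two header lines of `pip list`, then keep the first column (the package name).
all_packages = subprocess.check_output(["pip", "list"]).decode().splitlines()[2:]
all_packages = [line.split()[0] for line in all_packages if line.strip()]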

In [4]:
all_packages

['appnope',
 'attrs',
 'backcall',
 'bleach',
 'decorator',
 'defusedxml',
 'entrypoints',
 'ipykernel',
 'ipython',
 'ipython-genutils',
 'ipywidgets',
 'jedi',
 'Jinja2',
 'jsonschema',
 'jupyter',
 'jupyter-client',
 'jupyter-console',
 'jupyter-core',
 'MarkupSafe',
 'mistune',
 'nbconvert',
 'nbformat',
 'notebook',
 'pandocfilters',
 'parso',
 'pexpect',
 'pickleshare',
 'pip',
 'prometheus-client',
 'prompt-toolkit',
 'ptyprocess',
 'Pygments',
 'pyrsistent',
 'python-dateutil',
 'pyzmq',
 'qtconsole',
 'Send2Trash',
 'setuptools',
 'six',
 'terminado',
 'testpath',
 'tornado',
 'traitlets',
 'wcwidth',
 'webencodings',
 'widgetsnbextension']
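
Slicing off the header lines works, but it depends on the exact column layout. As an alternative sketch, `pip list --format=json` prints the same information as JSON, which is easier to parse. The helper name `package_names` and the sample string below are made up for illustration.

```python
import json


def package_names(pip_list_json):
    """Return the package names from `pip list --format=json` output."""
    return [entry["name"] for entry in json.loads(pip_list_json)]


# Hypothetical sample of what `pip list --format=json` prints.
sample_json = '[{"name": "Jinja2", "version": "2.10.1"}, {"name": "six", "version": "1.12.0"}]'
print(package_names(sample_json))  # ['Jinja2', 'six']
```

In the notebook, the real input could come from `subprocess.check_output(["pip", "list", "--format=json"]).decode()`.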

Also, you can use `pip show <package-name>` to see what a package requires and what it is required by (the last two lines of the output):

In [5]:
!pip show six

Name: six
Version: 1.12.0
Summary: Python 2 and 3 compatibility utilities
Home-page: https://github.com/benjaminp/six
Author: Benjamin Peterson
Author-email: benjamin@python.org
License: MIT
Location: /Users/rentaluser1/.pyenv/versions/3.7.3/envs/pydataldn/lib/python3.7/site-packages
Requires: 
Required-by: traitlets, python-dateutil, pyrsistent, prompt-toolkit, jsonschema, bleach
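
Rather than counting lines from the end, the output can be read by field name. The following is a minimal sketch, assuming every field sits on a single `Key: value` line as in the output above. The helper names `parse_pip_show` and `split_names` are hypothetical, and the sample is a shortened copy of the `six` output.

```python
def parse_pip_show(text):
    """Turn `pip show` output into a dict keyed by field name."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # ignore lines that contain no colon
            fields[key.strip()] = value.strip()
    return fields


def split_names(value):
    """Split a comma-separated field such as `Requires` into a list of names."""
    return [name.strip() for name in value.split(",") if name.strip()]


# Shortened sample of the `pip show six` output.
sample = """Name: six
Version: 1.12.0
Requires:
Required-by: traitlets, python-dateutil, pyrsistent"""

info = parse_pip_show(sample)
print(split_names(info["Requires"]))     # []
print(split_names(info["Required-by"]))  # ['traitlets', 'python-dateutil', 'pyrsistent']
```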


Let's save the `pip show` output for every package (it takes a while, or a long time if you have many packages):

In [6]:
packages_cache = {}
for package in all_packages:
    packages_cache[package] = subprocess.check_output(["pip", "show", package]).decode()
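
Most of the waiting comes from starting pip once per package. `pip show` also accepts several package names in a single call, so one possible speed-up is to ask for everything at once and split the result afterwards. This sketch assumes your pip version separates the individual reports with a line containing only `---`. The function name `split_pip_show` and the sample text are made up for illustration.

```python
def split_pip_show(text):
    """Map each package name to its own block of `pip show` output."""
    blocks, current = [], []
    for line in text.splitlines():
        if line.strip() == "---":  # assumed separator between packages
            blocks.append(current)
            current = []
        else:
            current.append(line)
    blocks.append(current)

    cache = {}
    for block in blocks:
        for line in block:
            if line.startswith("Name:"):
                cache[line.partition(":")[2].strip()] = "\n".join(block).strip()
                break
    return cache


# Hypothetical, shortened output of `pip show six pip`.
sample = "Name: six\nRequires:\nRequired-by: bleach\n---\nName: pip\nRequires:\nRequired-by:"
cache = split_pip_show(sample)
print(sorted(cache))  # ['pip', 'six']
```

With real data the input could be `subprocess.check_output(["pip", "show", *all_packages]).decode()`, and the returned dict would play the role of `packages_cache`.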

Find all packages that are not required by anything else:

In [7]:
top_packages = []
for package in all_packages:
    # The last line is "Required-by: ..."; a single token means the list is empty.
    if len(packages_cache[package].splitlines()[-1].split()) == 1:
        top_packages.append(package)

In [8]:
top_packages

['Jinja2', 'jupyter', 'pip', 'Pygments']

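Oops, there seems to be a problem: `notebook` requires `jinja2`, but it is not shown under `Required-by` for `Jinja2`. Note the different capitalisation of the two names, which is the likely reason the lookup misses it.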

In [9]:
!pip show Jinja2

Name: Jinja2
Version: 2.10.1
Summary: A small but fast and easy to use stand-alone template engine written in pure python.
Home-page: http://jinja.pocoo.org/
Author: Armin Ronacher
Author-email: armin.ronacher@active-4.com
License: BSD
Location: /Users/rentaluser1/.pyenv/versions/3.7.3/envs/pydataldn/lib/python3.7/site-packages
Requires: MarkupSafe
Required-by: 


In [10]:
!pip show notebook

Name: notebook
Version: 5.7.8
Summary: A web-based notebook environment for interactive computing
Home-page: http://jupyter.org
Author: Jupyter Development Team
Author-email: jupyter@googlegroups.com
License: BSD
Location: /Users/rentaluser1/.pyenv/versions/3.7.3/envs/pydataldn/lib/python3.7/site-packages
Requires: terminado, pyzmq, traitlets, jupyter-core, ipykernel, ipython-genutils, jinja2, Send2Trash, tornado, nbconvert, jupyter-client, nbformat, prometheus-client
Required-by: widgetsnbextension, jupyter
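
The two outputs spell the same project as `Jinja2` and `jinja2`. One way to compare such names safely is to normalise them first. The sketch below uses the normalisation rule from PEP 503 (lower-case, with runs of `-`, `_` and `.` collapsed to a single `-`); the function name `normalize` is just an illustrative choice.

```python
import re


def normalize(name):
    """Normalise a project name as described in PEP 503."""
    return re.sub(r"[-_.]+", "-", name).lower()


print(normalize("Jinja2") == normalize("jinja2"))  # True
print(normalize("Send2Trash"))                     # send2trash
print(normalize("prompt_toolkit"))                 # prompt-toolkit
```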


For each of the `top_packages` we found before, double-check that it is not required by any other package by scanning the `Requires` line of everything in `all_packages`, ignoring case:

In [11]:
real_top_packages = list(top_packages)
for the_package in top_packages:
    for package in all_packages:
        # The second-to-last line is "Requires: a, b, c"; lower-case it and drop the label.
        all_dependencies = packages_cache[package].splitlines()[-2].lower().replace(',',' ').split()[1:]
        if (the_package.lower() in all_dependencies) and (the_package in real_top_packages):
            real_top_packages.remove(the_package)

In [12]:
real_top_packages

['jupyter', 'pip']
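
The same result can be reached in one pass by ignoring `Required-by` entirely: collect every name that appears in some `Requires` list, then keep the packages that are not in that set. This is a minimal sketch under the assumption that you already have a dict mapping each package to the names it requires. The `requires` dict below is a trimmed, hand-written excerpt of the outputs above, and `top_level` is a hypothetical helper name.

```python
import re


def normalize(name):
    """Normalise a project name as described in PEP 503."""
    return re.sub(r"[-_.]+", "-", name).lower()


def top_level(requires):
    """Return the packages that no other package lists as a requirement."""
    needed = {normalize(dep) for deps in requires.values() for dep in deps}
    return [name for name in requires if normalize(name) not in needed]


# Trimmed sample: package name -> names from its `Requires` line.
requires = {
    "Jinja2": ["MarkupSafe"],
    "MarkupSafe": [],
    "notebook": ["jinja2"],
    "jupyter": ["notebook"],
    "pip": [],
}
print(top_level(requires))  # ['jupyter', 'pip']
```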

After all this hard work, we know which packages are the only ones we need to install to recreate the environment.
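
To make use of the result, the top-level names can be written to a requirements file that `pip install -r` understands. The sketch below is only one way to do it: the helper name `write_requirements` is hypothetical, no versions are pinned, and leaving out `pip`, `setuptools` and `wheel` is an assumption based on them usually being present in a fresh environment already.

```python
import tempfile
from pathlib import Path


def write_requirements(packages, path):
    """Write one package name per line, skipping the packaging tools themselves."""
    skip = {"pip", "setuptools", "wheel"}  # assumed to exist in a new environment
    names = [name for name in packages if name.lower() not in skip]
    Path(path).write_text("\n".join(names) + "\n")
    return names


# Demonstrate with the result from above, writing into a temporary folder.
with tempfile.TemporaryDirectory() as folder:
    target = Path(folder) / "requirements.txt"
    written = write_requirements(["jupyter", "pip"], target)
    content = target.read_text()

print(written)  # ['jupyter']
```

In a new virtual environment, `pip install -r requirements.txt` would then pull in `jupyter` together with all of its dependencies.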