This is a test visualization to map PyPI package dependencies using graphistry.
Run 1.get_pypi.py to produce pypi.tsv (included in the repo, so this step is optional). This is a tab-separated file with package names, descriptions, and versions.
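As a rough sketch of what reading that file could look like: the snippet below parses a tab-separated listing with Python's csv module. The column names (name, description, version), the sample rows, and the helper read_packages are assumptions for illustration and may not match the real header in pypi.tsv.

```python
import csv
import io


def read_packages(handle):
    """Yield one dict per row of a tab-separated package listing."""
    reader = csv.DictReader(handle, delimiter="\t")
    for row in reader:
        yield row


# Tiny in-memory stand-in for pypi.tsv. The column names and rows here are
# made up for illustration, not taken from the real file.
sample = io.StringIO(
    "name\tdescription\tversion\n"
    "pkg_a\tFirst example package\t1.0.0\n"
    "pkg_b\tSecond example package\t0.2.1\n"
)
packages = list(read_packages(sample))
```

To try it on the real file, open pypi.tsv with open("pypi.tsv", newline="") and pass the handle to read_packages instead of the sample.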
Run 2.get_meta.py to download metadata (a JSON file) for each package. Optionally, you can filter the list down to only the packages that have metadata. This should be all of them, but I haven't finished parsing at the time of writing this README, so some packages may be missing data because of bugs in the PyPI API. Note that this was originally part of my repofish project (and will continue to be :O) )
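For a feel of the download-then-filter step, here is a minimal sketch and not the code in 2.get_meta.py. It assumes the per-package JSON endpoint at https://pypi.org/pypi/NAME/json (the script may use a different URL), and the helper names metadata_url, fetch_metadata, and keep_with_metadata are hypothetical.

```python
import json
from urllib.request import urlopen


def metadata_url(name):
    """Build the JSON metadata URL for a package (assumed endpoint)."""
    return "https://pypi.org/pypi/{}/json".format(name)


def fetch_metadata(name, timeout=10):
    """Return the parsed JSON metadata, or None if the request or parse fails."""
    try:
        with urlopen(metadata_url(name), timeout=timeout) as response:
            return json.loads(response.read().decode("utf-8"))
    except (OSError, ValueError):
        # Network and HTTP errors are OSError subclasses; bad JSON is a ValueError.
        return None


def keep_with_metadata(names, fetch=fetch_metadata):
    """Map each name to its metadata, dropping packages where none was found."""
    found = {}
    for name in names:
        meta = fetch(name)
        if meta is not None:
            found[name] = meta
    return found
```

Passing the fetch function in as an argument makes it easy to swap in a cached or offline lookup while the API is misbehaving.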
The input to the visualization is a simple CSV file with target, source, and value column headers. For value, I'll first try using the package's monthly downloads. This file is generated with 3.map_dependencies.py.
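The edge list format can be sketched as follows. This is not the code in 3.map_dependencies.py: the helper write_edges, the package names, and the download counts are invented for illustration, and treating source as the dependent package and target as the dependency is an assumption.

```python
import csv
import io


def write_edges(edges, handle):
    """Write (target, source, value) rows under the expected column headers."""
    writer = csv.writer(handle, lineterminator="\n")
    writer.writerow(["target", "source", "value"])
    for target, source, value in edges:
        writer.writerow([target, source, value])


# Hypothetical edges: app_x depends on lib_a and lib_b. The value column
# holds a made-up monthly download count.
edges = [
    ("lib_a", "app_x", 100),
    ("lib_b", "app_x", 250),
]

buffer = io.StringIO()
write_edges(edges, buffer)
csv_text = buffer.getvalue()
```

Writing to a real file works the same way: open it with open(path, "w", newline="") and pass the handle in place of the buffer.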
The code to generate the graph is in the IPython notebook pypi.ipynb. It's so ridiculously easy I just-can't-even!
You can see the visualization here
Note that this only includes packages with links, which reduces the subset quite a bit.
I've started to look at comparing journals in PubMed Central based on the functions they use. Early analysis can be seen in the python-in-pubmed notebook and a direct link to a simple clustering of journals can be seen here.