This online demo lets you search npm packages by keyword and sorts the results by their PageRank.
PageRank is computed from the npm dependency graph (not to be confused with Google's PageRank of web pages).
Let's start with a brief introduction to PageRank. PageRank is a node ranking
algorithm that assigns a number to each node in a graph. Since this algorithm
was created by Google, this number answers a simple question: "What is
the probability of a user visiting page A?" If node A has PageRank 0.1, then
the probability of a user visiting page A is 10%.
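To make the idea concrete, here is a minimal sketch of PageRank computed by power iteration. It is only an illustration and not the code this project uses; the function name `pageRank`, the damping factor of 0.85 (a conventional default) and the fixed iteration count are my assumptions. It also assumes every link target appears as a key in the input.

```javascript
// Minimal PageRank by power iteration (illustrative sketch only).
// `links` maps each node to the list of nodes it points to.
// Assumption: every target also appears as a key of `links`.
function pageRank(links, damping = 0.85, iterations = 50) {
  const nodes = Object.keys(links);
  const n = nodes.length;

  // Start from a uniform distribution.
  let rank = {};
  for (const node of nodes) rank[node] = 1 / n;

  for (let i = 0; i < iterations; i++) {
    // Every node gets the "random jump" share first.
    const next = {};
    for (const node of nodes) next[node] = (1 - damping) / n;

    for (const node of nodes) {
      const targets = links[node];
      if (targets.length === 0) {
        // Dangling node: spread its rank evenly over all nodes.
        for (const other of nodes) next[other] += (damping * rank[node]) / n;
      } else {
        // Otherwise split its rank equally among outgoing links.
        for (const target of targets) {
          next[target] += (damping * rank[node]) / targets.length;
        }
      }
    }
    rank = next;
  }
  return rank;
}

// B and C both link to A, so A ends up with the largest score.
const demoRanks = pageRank({ A: [], B: ['A'], C: ['A'] });
console.log(demoRanks); // the scores sum to (approximately) 1
```

Because the scores sum to 1, each one can be read as a probability, which is what the "10%" example above relies on.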
npm packages and their dependencies can be viewed as a graph: packages are nodes,
and dependencies are edges. Since PageRank is a very generic algorithm, we can
compute a score for each package. In the npm world, PageRank answers the
question: "What is the probability that your package depends on package A?"
This has an interesting implication: you can easily find popular or important packages.
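As a rough sketch of the "packages are nodes, dependencies are edges" idea, the snippet below builds an adjacency map from package metadata. The sample packages and the `buildDependencyGraph` helper are hypothetical; the only thing taken from npm itself is that `dependencies` is an object keyed by package name. Edges point from a package to the packages it depends on, so rank flows towards heavily used packages.

```javascript
// Hypothetical package metadata; real registry documents carry many more fields.
const samplePackages = [
  { name: 'app', dependencies: { 'web-framework': '^1.0.0', 'util-lib': '^2.0.0' } },
  { name: 'web-framework', dependencies: { 'util-lib': '^2.0.0' } },
  { name: 'util-lib' },
];

// Build an adjacency map: package name -> names of packages it depends on.
function buildDependencyGraph(packages) {
  const graph = new Map();
  for (const pkg of packages) {
    if (!graph.has(pkg.name)) graph.set(pkg.name, []);
    for (const dep of Object.keys(pkg.dependencies || {})) {
      graph.get(pkg.name).push(dep);
      // Keep the dependency as a node even if its own metadata is missing.
      if (!graph.has(dep)) graph.set(dep, []);
    }
  }
  return graph;
}

const dependencyGraph = buildDependencyGraph(samplePackages);
console.log(dependencyGraph.get('app')); // [ 'web-framework', 'util-lib' ]
```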
The npm keywords database and the graph's PageRank are computed offline. First, I
download the entire catalogue of npm packages from skimdb.npmjs.com. This is
~410MB of data. Then I convert the raw response into an ngraph.graph instance.
Finally, these 26 lines of code collect all tags, compute PageRank,
and dump the results into a JSON file.
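The tag-collection step could look roughly like the sketch below. This is not the 26-line script mentioned above: the sample packages, the `precomputedRank` scores and the output shape (`tags` plus `rank`) are all assumptions made for illustration.

```javascript
// Hypothetical inputs: package metadata with a `keywords` array,
// plus PageRank scores that were computed in an earlier step.
const taggedPackages = [
  { name: 'express-like', keywords: ['web', 'server'] },
  { name: 'tiny-router', keywords: ['Web', 'router'] },
  { name: 'no-tags' },
];
const precomputedRank = { 'express-like': 0.6, 'tiny-router': 0.3, 'no-tags': 0.1 };

// Build a keyword -> [package names] index. A Map avoids clashes with
// inherited object properties such as "constructor".
function collectTags(packages) {
  const tags = new Map();
  for (const pkg of packages) {
    // Skip packages with missing or malformed keywords.
    const keywords = Array.isArray(pkg.keywords) ? pkg.keywords : [];
    for (const keyword of keywords) {
      const key = String(keyword).toLowerCase();
      if (!tags.has(key)) tags.set(key, []);
      tags.get(key).push(pkg.name);
    }
  }
  return tags;
}

// Serialize both pieces; writing the string to disk is left out here.
const offlineData = JSON.stringify({
  tags: Object.fromEntries(collectTags(taggedPackages)),
  rank: precomputedRank,
});
console.log(offlineData);
```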
When you search for something on the website, I look it up in the tags database and then sort the matching packages by their PageRank score.
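A minimal sketch of that lookup, assuming the offline data has the shape `{ tags, rank }`; the `searchIndex` sample and the `searchByKeyword` helper are hypothetical names, not the site's actual code:

```javascript
// Hypothetical offline data: keyword -> package names, and name -> PageRank score.
const searchIndex = {
  tags: { web: ['tiny-router', 'express-like'], router: ['tiny-router'] },
  rank: { 'express-like': 0.6, 'tiny-router': 0.3 },
};

function searchByKeyword(index, keyword) {
  const key = keyword.toLowerCase();
  // Only accept own properties so queries like "constructor" return nothing.
  const hasTag = Object.prototype.hasOwnProperty.call(index.tags, key);
  const matches = hasTag ? index.tags[key] : [];
  // Copy before sorting so the index itself is not reordered;
  // packages without a score are treated as 0.
  return [...matches].sort((a, b) => (index.rank[b] || 0) - (index.rank[a] || 0));
}

console.log(searchByKeyword(searchIndex, 'web')); // [ 'express-like', 'tiny-router' ]
```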
Initial results are not bad, but far from perfect. There are several ways to improve them, so your contributions are very welcome!
- Instead of searching by keywords only, it's worth looking at the `readme` and `description` fields as well.
- Explore topic-sensitive PageRank for each keyword.
- This is my first React application and I know it is ugly. Feel free to send a PR with improvements.