PyDigger implemented in Rust
See also the source code of the front-end
See DEVELOPMENT
- Get the details in the JSON file
- Create reports in JSON format and build a front-end that can show the data requiring only front-end code.
- From all the project JSON files
  - Total number of projects
  - Number of projects that have no `home_page` field
  - Check if the `home_page` field is one of the well-known VCS systems.
  - Download the source code from PyPI
  - Run `mypy` on the code and report the errors
  - List the packages that have VCS but have problems with `mypy`
- The report folder will contain a file called `report.json` that contains
  - `total`
  - `stats`
- There will be a file called `projects.json` that lists all the project names.
- For each project there will be a file called `projects/<PROJECT>.json` with the collected meta-data
  - `version`
  - `vcs`
- The reports folder will be the `docs` folder of a separate repository
- There will also be a repository with the front-end code, which is an HTML page and some JavaScript that will load the `report.json` when the page is loaded.
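The first report items (total number of projects, projects without a `home_page`, and projects whose `home_page` points at a well-known VCS host) could be computed roughly as follows. This is a minimal sketch, not the project's actual code: the list of hosts and the names `KNOWN_VCS_HOSTS`, `detect_vcs`, `Summary` and `summarize` are all assumptions made for illustration.

```rust
/// Hypothetical list of well-known VCS hosts (short name, URL prefix).
/// The real project may recognize a different set.
static KNOWN_VCS_HOSTS: [(&str, &str); 3] = [
    ("github", "https://github.com/"),
    ("gitlab", "https://gitlab.com/"),
    ("bitbucket", "https://bitbucket.org/"),
];

/// Return the short name of the VCS host if `home_page` starts with a known prefix.
fn detect_vcs(home_page: &str) -> Option<&'static str> {
    KNOWN_VCS_HOSTS
        .iter()
        .find(|(_, prefix)| home_page.starts_with(*prefix))
        .map(|(name, _)| *name)
}

/// Counters for the first three report items.
#[derive(Debug, Default, PartialEq)]
struct Summary {
    total: usize,
    no_home_page: usize,
    known_vcs: usize,
}

/// Each element is the optional `home_page` value of one project.
fn summarize(home_pages: &[Option<&str>]) -> Summary {
    let mut summary = Summary::default();
    for home_page in home_pages {
        summary.total += 1;
        match home_page {
            None => summary.no_home_page += 1,
            Some(url) if detect_vcs(url).is_some() => summary.known_vcs += 1,
            // A home page that is not on a known VCS host is only counted in `total`.
            Some(_) => {}
        }
    }
    summary
}
```

Calling `summarize` with one `None`, one GitHub URL and one unrelated URL would give a total of 3, with one project missing `home_page` and one on a known VCS host.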
- Download RSS feed of recent uploads, in memory
- Download meta-data from each recently uploaded project, but only if we have not processed it yet. (We can check this using the file we save in the next step.)
- Extract some data from the meta-data that we would like to display and save it in a hashed folder that will be in a GitHub repo.
- Download the package from PyPI, analyze it and save some more data in the per-project file.
- Download the git repository of the project and run some further analysis on it and save the result in the per-project file.
- Collect some stats from the saved files and save that too in the "data-repo"
- Copy the files from the data repo to the web site
- Copy the HTML file and other static files to the web site.
- Copy the front-end code to the web site
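One possible reading of the "hashed folder" in the steps above is a directory layout that spreads the per-project files over subdirectories keyed on the first characters of the project name. The sketch below assumes that layout (`<root>/<first two characters>/<name>.json`); the layout and the names `project_path` and `already_processed` are hypothetical. It also shows how the saved file could double as the "have we processed it yet" check.

```rust
use std::path::PathBuf;

/// Build `<root>/<first two chars of the lowercased name>/<name>.json`.
/// This layout is an assumption, not necessarily the one the project uses.
fn project_path(root: &str, name: &str) -> PathBuf {
    let lower = name.to_lowercase();
    let prefix: String = lower.chars().take(2).collect();
    PathBuf::from(root)
        .join(prefix)
        .join(format!("{}.json", lower))
}

/// A project counts as processed if its per-project file already exists.
fn already_processed(root: &str, name: &str) -> bool {
    project_path(root, name).exists()
}
```

A check like this ignores the version of the upload, so a real implementation would probably also compare the version stored in the file with the one in the feed.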
Collect
- Save how many were in the RSS feed in the most recent run.
- Save the start time of data collection.
- Save elapsed time of data collection.
- Collect more fields.
- Save how many projects were downloaded in the most recent run
- In the current run cache the names of the projects we see and only download each one once - this will be more important when we move to async.
- Save how many unique names were in the most recent RSS feed
- Switch data collection to be async
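The per-run cache of project names mentioned in the list above can be as simple as a `HashSet`. The following is a minimal sketch under the assumption that the feed has already been reduced to a list of project names; `unique_names` is a hypothetical helper name.

```rust
use std::collections::HashSet;

/// Return the project names in first-seen order, each only once.
fn unique_names(feed: &[&str]) -> Vec<String> {
    let mut seen = HashSet::new();
    let mut unique = Vec::new();
    for name in feed {
        // `insert` returns false when the name was already in the set,
        // so repeated names are skipped and would not be downloaded again.
        if seen.insert(*name) {
            unique.push((*name).to_string());
        }
    }
    unique
}
```

The length of the input slice is the number of entries in the RSS feed, and the length of the returned vector is the number of unique names, which covers two of the counters listed above.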
Report
- Show the total number of packages.
- List the N most recent packages.
- Create a stats page for licenses: show a pre-defined list of licenses, a list of projects with other licenses, and a list of projects without a license.
- Use the `home_page` to create stats page with list of VCS hosts, how many projects use each host. List the projects that have an unrecognized VCS. List projects without VCS.
- Use sources other than `home_page` for the VCS report
- For each package create its own page
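The "how many projects use each host" part of the VCS stats page could be tallied along these lines. This is only a sketch: it assumes the host is taken from the `home_page` URL by simple string splitting, and the names `host_of` and `count_hosts` as well as the `(no home_page)` bucket label are made up for illustration.

```rust
use std::collections::BTreeMap;

/// Extract the host part of a URL such as `https://github.com/user/project`.
/// Returns `None` if there is no `://` separator or the host is empty.
fn host_of(url: &str) -> Option<&str> {
    let rest = url.split_once("://")?.1;
    let host = rest.split('/').next()?;
    if host.is_empty() {
        None
    } else {
        Some(host)
    }
}

/// Count projects per host. Projects with a missing or unparsable
/// `home_page` are collected under a separate key.
fn count_hosts(home_pages: &[Option<&str>]) -> BTreeMap<String, usize> {
    let mut counts = BTreeMap::new();
    for home_page in home_pages {
        let key = home_page.and_then(host_of).unwrap_or("(no home_page)");
        *counts.entry(key.to_string()).or_insert(0) += 1;
    }
    counts
}
```

A `BTreeMap` keeps the hosts sorted by name, which makes the generated report stable between runs. Separating recognized from unrecognized hosts would need an additional list of known VCS hosts.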
Other:
- Create a favicon
- Improve the design.
- Start using the production version of Vue.