by User:GreenC (en.wikipedia.org)
June 2020-2024
MIT License
Pgcount generates Wikipedia:List of Wikipedians by article count
- Designed for unlimited scalability; Wikipedia database size does not matter.
- Low memory and CPU use.
- Designed to fail and recover mid-process; state information is preserved.
- No SQL or queries, API driven.
- Caches between runs.
- Flexible for use with multiple wiki languages.
- GNU Awk 4.1+
- BotWikiAwk (version Jan 2019 +)
- A bot User account with bot permissions for your target wiki.
- Install BotWikiAwk following its setup instructions. Add OAuth credentials to wikiget; see the EDITSETUP instructions.
- Clone Pgcount. For example: git clone https://github.com/greencardamom/Pgcount
- Set ~/Pgcount/pgcount.awk to mode 750, and change the shebang on the first line to the location of awk on your system.
- Edit pgcount.awk: the "BEGIN{}" section has a place for your email address, where error reports are sent, and a few hard-coded paths for common Unix utilities.
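
The clone, mode, and shebang steps can be sketched in shell as follows. This is a minimal sketch, not part of Pgcount: set_shebang is a hypothetical helper name, and the ~/Pgcount clone location and the use of "command -v gawk" to locate awk are assumptions. The helper replaces only the interpreter path on the first line, so any options already present after it are kept.

```sh
#!/bin/sh
# set_shebang SCRIPT INTERPRETER
# Hypothetical helper (not shipped with Pgcount): rewrite the interpreter
# path on the first line of SCRIPT, leaving any options after it untouched.
# Assumes INTERPRETER contains no "|" or "&" characters.
set_shebang() {
    script=$1
    interp=$2
    tmp=$(mktemp) || return 1
    # Write back with cat so the file keeps its existing mode (e.g. 750).
    sed "1s|^#![^ ]*|#!$interp|" "$script" > "$tmp" && cat "$tmp" > "$script"
    status=$?
    rm -f "$tmp"
    return $status
}

# Typical use, assuming a clone into ~/Pgcount (left commented out so that
# sourcing this sketch has no side effects):
#
#   git clone https://github.com/greencardamom/Pgcount ~/Pgcount
#   chmod 750 ~/Pgcount/pgcount.awk
#   set_shebang ~/Pgcount/pgcount.awk "$(command -v gawk)"
#   head -n 1 ~/Pgcount/pgcount.awk    # check the result
```

The email address and utility paths in the "BEGIN{}" section still need to be edited by hand, since their variable names are not listed here.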
Example crontab entry
4 3 1 * * /home/greenc/toolforge/pgcount/pgcount.awk -h en -d wikipedia.org
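
For reference, the five schedule fields of the entry above read as follows in standard crontab syntax. The reading of -h and -d as the language code and the wiki domain is an assumption based on the example values, not something stated here.

```
# minute  hour  day-of-month  month  day-of-week  command
# 4       3     1             *      *            runs at 03:04 on the 1st of every month
4 3 1 * * /home/greenc/toolforge/pgcount/pgcount.awk -h en -d wikipedia.org
```

Adjust the path to wherever pgcount.awk is installed on your system.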