IRS Form 990-PF Fetch :: Node.js Edition
Lightweight but powerful Node.js scripts to fetch all machine-readable IRS Form 990-PFs and insert them into a MongoDB database.
Dataset: Public IRS dataset hosted on Amazon Web Services (AWS)
What it Does
- Fetch each year's index of Form 990s
- Parse JSON and limit to Form 990-PFs
- Fetch XML tax filings
- Insert into MongoDB
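The "parse JSON and limit to Form 990-PFs" step can be sketched roughly as below. This is a minimal illustration, not the code used in this repo: it assumes each yearly index is a JSON object with a single top-level array of filing records, and that each record carries `EIN`, `FormType` and `URL` fields with `'990PF'` marking private foundation returns. The function name `selectPrivateFoundationFilings` and the sample data are hypothetical.

```javascript
// Assumed index shape (not confirmed by this README):
//   { "Filings2016": [ { EIN, FormType, URL, ... }, ... ] }
function selectPrivateFoundationFilings(indexJson) {
  // Take the first top-level value that is an array of filing records.
  const filings = Object.values(indexJson).find(Array.isArray) || [];
  // Keep only records whose (assumed) FormType field marks a 990-PF.
  return filings.filter((filing) => filing.FormType === '990PF');
}

// Made-up sample data, for illustration only.
const sampleIndex = {
  Filings2016: [
    { EIN: '000000001', FormType: '990', URL: 'https://example.org/a.xml' },
    { EIN: '000000002', FormType: '990PF', URL: 'https://example.org/b.xml' },
    { EIN: '000000003', FormType: '990EZ', URL: 'https://example.org/c.xml' },
  ],
};

const pfFilings = selectPrivateFoundationFilings(sampleIndex);
console.log(pfFilings.map((filing) => filing.URL));
```

The URLs that survive the filter would then be fetched as XML and inserted into MongoDB.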
indexes.js - Fetches index listings relating to IRS Form 990-PF filings for the specified tax year and inserts into MongoDB
filings.js - Fetches all IRS Form 990-PF filings for the specified tax year and inserts into MongoDB
Scripts Used by Grantmakers.io
fetch.js - Inserts individual index and filing info into MongoDB
aggregate.js - Combines info by EIN
normalize.js - Pulls specific information across tax year
utilities/ - Various scripts and Mongo queries.
experiments/ - In-progress experiments. These are often one-off items built on rainy Saturday mornings to scratch an itch, so tread carefully.
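To illustrate what "combines info by EIN" means in practice, here is a small in-memory sketch. It is not how aggregate.js is implemented (that script may well do the grouping inside MongoDB, for example with a `$group` aggregation stage). The helper name `groupByEin` and the `EIN` and `TaxPeriod` field names are assumptions made for the example.

```javascript
// Hypothetical helper: collect all filing records that share an EIN.
function groupByEin(filings) {
  const byEin = new Map();
  for (const filing of filings) {
    if (!byEin.has(filing.EIN)) {
      byEin.set(filing.EIN, []);
    }
    byEin.get(filing.EIN).push(filing);
  }
  return byEin;
}

// Made-up records: one organization filed in two tax periods.
const grouped = groupByEin([
  { EIN: '000000001', TaxPeriod: '201512' },
  { EIN: '000000002', TaxPeriod: '201512' },
  { EIN: '000000001', TaxPeriod: '201612' },
]);

for (const [ein, filings] of grouped) {
  console.log(ein, filings.map((filing) => filing.TaxPeriod));
}
```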
iMac with 16GB RAM
ulimit -n 4096 && mongod --dbpath ./data/db/
ulimit -n 4096 && node combined
MacBook Air with 8GB RAM (struggles)
ulimit -n 2048 && mongod --dbpath ./data/db/
ulimit -n 2048 && node combined
Note: The IRS no longer offers a single index for all filings. Thus, each script must be run once for each year (i.e., change the year between runs).
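If editing the year by hand between runs gets tedious, one option is to read it from the command line. The sketch below is a hypothetical approach, not something the scripts in this repo currently do. `parseTaxYear` is an invented helper, and in a real script the argument list would come from `process.argv.slice(2)`.

```javascript
// Hypothetical helper: validate and return the tax year from an argument list.
function parseTaxYear(args) {
  const raw = args[0];
  // Accept exactly four digits; anything else (including a missing value) is rejected.
  if (!/^\d{4}$/.test(String(raw))) {
    throw new Error(`Expected a four-digit tax year, got: ${raw}`);
  }
  return Number(raw);
}

// In a script this would be parseTaxYear(process.argv.slice(2)).
const taxYear = parseTaxYear(['2016']);
console.log(`Tax year: ${taxYear}`);
```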
Testing out Google Cloud Functions (see gcf_http folder) - first up is a simple script to check for updates.
A huge thank you to Joseph Lepis for the architectural guidance and mentorship. If you find these scripts useful and appreciate hard-boiled fiction, check out Joe's debut novel, On the Edge.
Copyright (c) 2016 Chad Kruse, SmarterGiving
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.