IRS 990 2019 index file discrepancy #651
Comments
Hi Brian, thank you for reporting! The majority of datasets here are managed by individual third parties (in this case the IRS). For most datasets there is a contact listed. In this case, however, it may be a little tricky for you to report to the IRS. I have contact information, and will make sure that your report gets to them. I'm not sure if the data provider is set up to interact here on GitHub, but I'll follow up in this thread with what I hear back.
Right on, thank you Peter! We will likely do some more analysis of the other 2013-2020 index files to see if this issue arises in other years. Would it be helpful at all to share a summary of those findings here?
I'll happily pass along any other issues that you find!
Quick update - I ran some scripts against the index files. It looks like the ~20K extra returns are found in the 2019 CSV index file and were submitted between 12/11/2019 and 12/30/2019. However, the JSON index file stops at 12/10/2019. This does not seem to be an issue for prior years, though I did notice that the 2020 CSV index file was updated today (8/31/2020) but the date on the JSON file is from 8/11/2020.
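For anyone who wants to repeat this kind of check, here is a rough sketch of one way to list the filings that appear in a CSV index but not in the matching JSON index, along with the submission-date range of those filings. It is not the script used above. The field names (`OBJECT_ID`, `SUB_DATE`, `ObjectId`), the ISO date format, and the sample data are all assumptions for illustration and may need adjusting to the real files.

```python
import csv
import io
import json
from datetime import date

# Assumed column/key names (hypothetical); adjust to match the real index files.
CSV_ID_FIELD = "OBJECT_ID"
CSV_DATE_FIELD = "SUB_DATE"
JSON_ID_FIELD = "ObjectId"


def csv_submission_dates(csv_text):
    """Map each filing ID in the CSV index text to its submission date."""
    reader = csv.DictReader(io.StringIO(csv_text))
    # Assumes ISO dates (YYYY-MM-DD); use datetime.strptime for other formats.
    return {row[CSV_ID_FIELD]: date.fromisoformat(row[CSV_DATE_FIELD]) for row in reader}


def json_filing_ids(json_text):
    """Collect filing IDs from JSON shaped like {"<key>": [{...}, ...]}."""
    data = json.loads(json_text)
    return {filing[JSON_ID_FIELD] for filings in data.values() for filing in filings}


def missing_from_json(csv_text, json_text):
    """Return the CSV-only filings and their (earliest, latest) submission dates."""
    dates = csv_submission_dates(csv_text)
    present = json_filing_ids(json_text)
    missing = {fid: d for fid, d in dates.items() if fid not in present}
    span = (min(missing.values()), max(missing.values())) if missing else None
    return missing, span


# Made-up sample data, not real index contents.
SAMPLE_CSV = "OBJECT_ID,SUB_DATE\nA1,2019-12-09\nA2,2019-12-11\nA3,2019-12-30\n"
SAMPLE_JSON = '{"Filings2019": [{"ObjectId": "A1"}]}'

missing, span = missing_from_json(SAMPLE_CSV, SAMPLE_JSON)
print(sorted(missing), span)
```

For the real files, the text would come from reading each index from disk. The comparison is keyed on an ID rather than on row position, so ordering differences between the two files do not matter.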
Much obliged! I'm sure that will be useful for them. Sounds like maybe the two processes just run at slightly different times and the end of 2019 just didn't get updated. Will pass this along.
Looks like the IRS finally caught the 2019 JSON index file up with the 2019 CSV index file. Our weekly import process picked up 20,693 new filings from the 2019 JSON index this week.
While doing some analysis on 990 filings, I noticed a discrepancy between the number of filings in the 2019 CSV and JSON index files. It appears that the CSV index file has 416,880 filings while the JSON index file has 396,217. The CSV file also looks to have been updated much more recently than the JSON file (4/2020 vs 12/2019). I have not checked the index files for other years, though there may be conflicting counts there as well.
Wasn't sure if this is the best place to report it, but the ~20K difference seemed pretty significant. I haven't done any additional analysis yet to rule out something like duplicate records - figured I'd start here. Happy to lend a hand if I can help in any way.
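The raw gap between the two counts is 416,880 - 396,217 = 20,663. One way to rule out duplicate records as the cause could look like the sketch below: it counts total rows, unique IDs, and repeated IDs for each index, then reports the gap both ways. The function names and the sample IDs are hypothetical, and it assumes each filing carries some unique identifier that has already been extracted from each file.

```python
from collections import Counter


def summarize_ids(ids):
    """Count rows, unique IDs, and any IDs that occur more than once."""
    ids = list(ids)
    counts = Counter(ids)
    return {
        "rows": len(ids),
        "unique": len(counts),
        "duplicates": {fid: n for fid, n in counts.items() if n > 1},
    }


def compare_indexes(csv_ids, json_ids):
    """Compare two ID lists by raw row count and by unique-ID count."""
    csv_summary = summarize_ids(csv_ids)
    json_summary = summarize_ids(json_ids)
    return {
        "csv": csv_summary,
        "json": json_summary,
        "row_gap": csv_summary["rows"] - json_summary["rows"],
        "unique_gap": csv_summary["unique"] - json_summary["unique"],
    }


# Made-up IDs for illustration only.
result = compare_indexes(["A1", "A2", "A2", "A3"], ["A1", "A2"])
print(result["row_gap"], result["unique_gap"], result["csv"]["duplicates"])
```

If the unique gap is much smaller than the row gap, duplicates explain most of the difference. If the two gaps are about the same, the extra rows are distinct filings.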